We presented a state-of-the-art method for automated Facial Action Coding System for neuropsychiatric research. By measuring the movements of facial action units, our method can objectively describe subtle and ambiguous facial expressions such as in , which is difficult for previous methods that use only prototypical emotions to describe facial expressions. Therefore the proposed system, which uses a combination of responses from different AUs, is more suitable for studying neuropsychiatric patients whose facial expressions are often subtle or ambiguous. While there are other automated AU detectors, they are trained on extreme expressions and hence unsuitable for use in a pathology that manifests as subtle deficits in facial affect expression.
We piloted the applicability of our system in neuropsychiatric research by analyzing videos of four healthy controls and four schizophrenia patients balanced in gender and race. We expect that the temporal profiles of AUs computed from videos of evoked emotions ( – ) can provide clinicians an informative visual summary of the dynamics of facial action. They show which AU or combination of AUs is present in the expressions of an intended emotion from a subject, and quantifies both intensity and duration.
– visualize dynamical characteristics of facial actions of each subject for five emotions at a glance. Overall, these figures revealed that the facial actions of the patients were more flattened compared to the controls. In a healthy control (), there was a gradual buildup of emotions that manifests as a relatively smooth increase of multiple AUs. Such a change in profile is expected from the experimental design: the contents of the vignettes progressed from mild to moderate to extreme intensity of emotions. For another control (), there were several bursts and underlying activities of multiple AUs. In contrast, patients showed fewer facial actions ( and ), and the lack of gradual increase of AU intensities ( and ). Also the AU peaks were isolated in time and across AUs (). Such sudden movements of facial muscles may be symptomatic of the emotional impairment. These findings lay the basis for a future study to verify the different facial action dynamics patterns in schizophrenia patients and other neuropsychiatric populations.
With the prior knowledge of qualifying and disqualifying AUs for each emotion, we can use the temporal profiles to further aid the diagnosis of affective impairment. and show the profiles of two subjects in happiness. In the first subject, Cheek Raiser (AU6) and Lip Corner Puller (AU12), which constitute the qualifying AUs of happiness (), are gradually increasing toward the end of the video, along with Lid Tightener (AU7), which is neither qualifying nor disqualifying. No disqualifying AU is engaged in producing the happiness expression. In contrast, the profiles of the second subject were almost flat. and show the profiles of two subjects in sadness. With the first subject, the qualifying AU of Chin Raiser (AU17) weakly activated from 30 (s). Although Lip Corner Depressor (AU15) is not a uniquely qualifying AU for sadness, its presence seems to help deliver the sad expression. The profiles of the second subject were flat and showed the weak presence of several AUs (AU6, AU7, AU18, AU23) which are neither qualifying nor disqualifying. and show the profiles of two subjects in anger. The first subject showed moderate activation of Lid Tightener (AU7) and Brow Lowerer (AU4). These two are neither qualifying nor disqualifying AUs, but they indicate an emotion of a negative valence. The second subject demonstrated an inappropriate expression throughout the duration of the video, which looks closer to happiness than anger. The presence of Cheek Raiser (AU6) and Lip Corner Puller (AU12) in time 50(s) – 100 (s), which constitute the qualifying AUs of happiness, strongly indicates inappropriateness of the expressions, as is Chin Raiser (AU17) at 100 (s) and 200 (s). and show the profiles of two subjects in fear. The first subject exhibited underlying activity of Cheek Raiser (AU6) and Lid Tightener (AU7) which are inappropriate for fear. The second subject displayed flat expressions except for Cheek Raiser (AU6), Lid Tightener (AU7), Lip Corner Puller (AU12), and Lip Stretcher (AU20) at around 40 (s), which constitute an inappropriate expression for fear. Lastly, and show the profiles of two subjects in disgust. The first subject was very expressive and showed multiple peaks of Inner Brow Raiser (AU1), Brow Lowerer (AU4), Cheek Raiser (AU6), Lid Tightener (AU7), Nose Wrinkler (AU9), Chin Raiser (AU17), and Lip Tightener (AU23). In contrast, the second subject showed flat profiles except for a brief period of Chin Raiser (AU17) at around 70 (s). We conclude from these findings that the proposed system automatically provides informative summary of the videos to study the affective impairment in schizophrenia, which is much more efficient than manually examining the whole videos by an observer.
The AU profiles from the pilot study were also analyzed quantitatively. From the temporal profiles, we computed the frequency of AUs in each emotion and subject, independently for each AU (), and jointly for AU combinations (). AU combinations measure the simultaneous activation of multiple facial muscles, which will shed light on the role of synchronized facial muscle movement in facial expressions of healthy controls and patients. This synchrony cannot be answered by studying single AU frequencies alone. The quantitative measures will allow us to statistically study the differences in facial action patterns between emotions and in multiple demographic or diagnostic groups. Such measures have been used in previous clinical studies (
Kohler et al. (2004,
2008)) but were limited to a small number of still images instead of videos due to the impractically large amount of effort in rating all individual frames manually.
Lastly, we derived the automated measures of flatness and inappropriateness for each subject and emotion. shows that the healthy control has both low flatness and low inappropriateness measures, whereas patients exhibited higher flatness and higher inappropriateness in general. However, there are intersubject variations, for example, control 1 showed a lower inappropriateness but a slightly higher flatnesss than the patient 3. These automated measures of flatness and inappropriateness also agreed with the flatness and inappropriateness from visual examination of the videos (). The correlation between the automated and the observer-based measurements will be further verified in a future study with a larger sample size. Compared to qualitative analysis, the flatness and inappropriateness measures provide detailed automated numerical information without an intervention of human observers. This highlights the potential of the proposed method to automatically and objectively measure clinical variables of interest, such as flatness and inappropriateness, which can aid in diagnosis of affect expression deficits in neuropsychiatric disorders. It should be noted that while we have acquired the videos with a specific experimental paradigm adopted from former studies, the method we have developed is general and applicable to any experimental paradigm.
The fully automated nature of our method allows us to perform facial expression analysis in large scale clinical study in psychiatric populations. We are currently acquiring a large number of videos of healthy controls and schizophrenia patients for a full clinical analysis. We will apply the method to the data and present the results in a clinical study with detailed clinical interpretation.