Human adults automatically mimic others' emotional expressions, which is believed to contribute to sharing emotions with others. Although this behaviour appears fundamental to social reciprocity, little is known about its developmental process. Therefore, we examined whether infants show automatic facial mimicry in response to others' emotional expressions. Facial electromyographic activity over the corrugator supercilii (brow) and zygomaticus major (cheek) of four- to five-month-old infants was measured while they viewed dynamic clips presenting audiovisual, visual and auditory emotions. The audiovisual bimodal emotion stimuli were a display of a laughing/crying facial expression with an emotionally congruent vocalization, whereas the visual/auditory unimodal emotion stimuli displayed those emotional faces/vocalizations paired with a neutral vocalization/face, respectively. Increased activation of the corrugator supercilii muscle in response to audiovisual cries and of the zygomaticus major in response to audiovisual laughter was observed between 500 and 1000 ms after stimulus onset, which clearly suggests rapid facial mimicry. By contrast, neither visual nor auditory unimodal emotion stimuli activated the infants' corresponding muscles. These results revealed that automatic facial mimicry is present as early as five months of age, when multimodal emotional information is present.
Humans often spontaneously and unconsciously match their behaviours to those of others. In particular, matching facial expressions, often termed facial mimicry, serves various social functions that support smooth social interaction. Facial mimicry has often been assessed by measuring facial electromyographic (EMG) activity during observation of facial expressions, which enables the detection of subtle, visually imperceptible reactions [1,2]. For example, observation of negative facial expressions (anger, sadness or crying) activates the observer's corrugator supercilii (brow), whereas observation of positive facial expressions (happiness, smiling, laughter) activates the observer's zygomaticus major (cheek). This distinct pattern of facial reactions to different emotions can be observed as soon as 500 ms after exposure to facial stimuli [3–5]. Interestingly, such responses can be evoked across modalities. For instance, people activate the facial muscles involved in expressing a certain emotion even when they see others' emotional bodily gestures or hear their voices [6–9]. These findings suggest that facial mimicry reflects a multimodal re-enactment of one's own sensory, motor and affective experiences, which occurs in response to signals in any modality (i.e. embodied emotion theory). Moreover, a similar response can, surprisingly, be observed even when participants are unaware of the stimulus owing to a short presentation time [11,12] or when participants are affected by cortical blindness. These findings further suggest that facial mimicry of emotional stimuli involves subconsciously controlled processes.
The automatic processes of mimicry raise important questions, such as whether the system for facial mimicry is innate or acquired later. Some researchers have argued that there is an inborn connection between ‘seeing’ and ‘doing’ that is known as a sensory–motor coupling system [14–16]. This view is supported by the findings that new-born infants mimic the orofacial actions of others, a phenomenon referred to as neonatal imitation (see a review by Simpson et al.). However, evidence for this appears very limited, because only a few actions (e.g. tongue protrusion; mouth opening) were reliably mimicked and several studies failed to replicate the results [18–22]. In particular, few reports exist of new-born infants mimicking emotional facial expressions (but see ). Neonatal imitation is thus likely to be a specific reaction to a particular condition, which should be differentiated from general facial mimicry; facial mimicry in reaction to emotional expressions may not be present from birth and may instead be acquired postnatally.
This raises an important question: when and how does facial mimicry emerge during postnatal development? Previous studies reported that mothers' emotional expressions induce similar categories of facial expressions in three-month-old infants [24–26]. However, this behaviour seems to be present only when emotional expressions are displayed by the infant's mother, and not by strangers. Therefore, it is still unclear to what extent these responses can be considered equivalent to the automatic and rapid facial reactions that have been investigated in adults. To address this question, EMG measurement would be useful, as it allows even subtle responses of facial muscles to be analysed with high temporal resolution in a more objective manner. Nevertheless, no study has examined EMG reactions to facial emotions in children younger than 3 years. This study therefore examined young infants' facial EMG activities in response to others' emotional expressions. We focused particularly on four- to five-month-olds because it has been suggested that infants start to discriminate several emotions around this age [28,29], and we speculated that automatic motor responses to others' emotions would become differentiated following a developmental sequence similar to that of emotion recognition.
We also investigated how infants react to different modalities of emotional display. Past studies on emotion recognition have proposed that infants first become able to discriminate emotions from audiovisual bimodal stimuli, and that this ability later expands to unimodal auditory and visual stimuli [28,29]. We speculated that a consistent developmental sequence might be present in the mimicry domain, in which four- to five-month-old infants would demonstrate facial mimicry to audiovisual emotional displays, but perhaps not to unimodal stimuli. We therefore measured infants' EMG activities over the corrugator supercilii and zygomaticus major muscles while they viewed adults dynamically expressing crying, laughing and neutral emotions in audiovisual, visual and auditory modalities. To capture the infant's attention equally across conditions, stimuli in every modality condition involved both visual and auditory information, but the conditions were differentiated by the modalities that conveyed ‘emotional’ information. That is, while the audiovisual bimodal emotion stimuli were a display of a laughing/crying facial expression with an emotionally congruent vocalization, the visual/auditory unimodal emotion stimuli displayed emotional faces or vocalizations paired with a neutral vocalization or face, respectively. Audiovisual physical synchrony, regardless of emotional content, is known to affect the speed or threshold of perception [30–33]. We therefore employed audiovisual asynchronous stimuli (that were emotionally congruent) in the AV condition, so that the three conditions differed in the modality conveying emotional information, but not in their physical and temporal synchrony.
The final sample consisted of 15 full-term four- to five-month-old infants (nine males and six females, mean age = 154.6 days; standard deviation (s.d.) = 10.0 days; range 140 to 169 days). An additional 25 infants were recruited, but were excluded from the analysis for the following reasons. Sixteen did not provide the minimum dataset, which required at least one trial from each of the seven experimental conditions, because they cried before (mostly during electrode attachment) or while viewing the stimulus clips; three infants had frequent body movements during the EMG recordings, which made data processing difficult; the other six infants were excluded owing to experimental mistakes.
Video clips were presented on a 23-inch monitor with a resolution of 1920 × 1080 pixels (ColorEdge CS230, Eizo, Japan), which was placed at a distance of approximately 40 cm from the infant sitting on an experimenter's lap. Sounds were presented by a pair of speakers placed behind both sides of the monitor. A video camera that was masked from the infants was mounted on the top edge of the monitor. EMG activity was recorded by a bioamplifier, PolymateII AP2516 (Teac), which was also blocked out from the infant's view.
Infants viewed a series of video clips while their facial EMG activities over the left corrugator supercilii and zygomaticus major were recorded. The video consisted of three stimulus sets. A schema of the presentation during a stimulus set is shown in figure 1. Each stimulus set contained seven trials, which presented all stimulus conditions: 2 emotion-types (laughing and crying) × 3 emotion-modalities (audiovisual, AV; visual, V; auditory, A) and one additional control condition. Modalities refer to those related to emotion; that is, all stimuli contained both visual and auditory information, but in the unimodal (V and A) emotion conditions, an emotional facial expression or vocalization was paired with a neutral vocalization or facial expression, respectively. The AV condition had an emotionally consistent facial expression and vocalization, and the control condition involved a neutral facial expression with a neutral vocalization in which models made vocalizations such as ‘ah-ah’, ‘uh-uh’ or ‘oh-oh’. During production of the clips, facial expressions and vocalizations were recorded separately (i.e. they were not extracted from a common source) and integrated afterwards, so that visual and auditory information were not completely synchronized with each other, even for the stimuli in the AV condition. This manipulation allowed us to compare the AV, V and A conditions fairly in terms of their modality regarding emotional information, but not in terms of their physical and temporal synchrony. Each trial included three continuous presentations of the stimuli of the same condition represented by three different female models, each of which lasted 3 s. The presentation order of the three models was determined randomly across trials. Before each trial, a 6 s fixation clip was presented to capture the infant's attention. 
The fixation clips were selected from a commercially released DVD, Baby Mozart (The Walt Disney Co.), which presented various images of brightly coloured toys or visually captivating objects with classical music. Thus, each stimulus set lasted 111 s, and about 5.5 min were required to present the three stimulus sets. However, the experiment was terminated immediately if the infant became distracted or fussy. Individual data that did not have at least one trial from all seven conditions were excluded from the analysis.
Infants were seated on an experimenter's lap; their arms were held by the experimenter. During stimulus presentation, the experimenter looked not at the monitor but at the wall just above it, and she did not react to any of the infant's behaviour. Parents were seated at some distance behind the infant and were not visible to the infant.
To measure facial surface muscle activity, gold active electrodes were attached over the left corrugator supercilii and zygomaticus major muscles, following the guidelines by Fridlund & Cacioppo. Ground electrodes were placed on the forehead. Activity over each muscle was continuously recorded at a sampling rate of 1000 Hz with a 60 Hz notch filter.
Raw signals were filtered offline (bandpass: 10–400 Hz) and rectified. The signals were then screened in the following manner. First, the recorded videotapes were checked, and the trials during which infants were not attending to the stimuli were removed. Next, to remove extreme changes in activity caused by body movements, blinks or overt smiles, samples more than 3 s.d. from the mean of the bandpass-filtered signal were detected and removed, together with the 100 points (100 ms) before and after each detected sample. This procedure allowed us to analyse only very reliable data, and the data thus reflected relatively subtle muscle activation. The signal from 1250 to 250 ms before stimulus onset (during the presentation of the fixation stimulus) was averaged for each fixation clip, and the mean of these values was used as the baseline activity. Activities during stimulus presentation were expressed as percentage changes with respect to this baseline (see electronic supplementary material, figure S1 for the waveforms represented by the percentage of baseline activity during stimulus presentation). To focus on the time course of the muscle activation after stimulus onset, reactions to the clips with three different models and those in different stimulus sets (if available) were combined, and the averaged activity during the presentation of the 3 s clip was obtained for each condition. Finally, for the purpose of analysis, the activity percentages in each condition were epoched by averaging the data for each 500 ms chunk after stimulus onset, which provided six time-windows (0–500, 500–1000, … 2500–3000 ms post-stimulus), in accordance with the earlier notion that any distinct muscle response to the stimuli was expected to be detectable after 500 ms of exposure.
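The screening, baseline and epoching steps described above can be illustrated with a minimal sketch. It is not the authors' code. It assumes that the signal has already been bandpass-filtered and rectified, that one sample equals 1 ms (1000 Hz), that the 3 s.d. criterion is applied to the signal passed in, and that "percentage change" means 100 × (activity − baseline) / baseline. The function names and the use of `None` to mark removed samples are hypothetical.

```python
from statistics import mean, pstdev

PAD = 100      # samples (100 ms at 1000 Hz) removed on each side of an artefact
BIN_MS = 500   # width of each analysis window in ms (= samples at 1000 Hz)


def reject_artifacts(signal, n_sd=3.0, pad=PAD):
    """Mark samples beyond n_sd standard deviations, plus `pad` neighbours, as None."""
    mu = mean(signal)
    sd = pstdev(signal)
    cleaned = list(signal)
    for i, value in enumerate(signal):
        if abs(value - mu) > n_sd * sd:
            lo = max(0, i - pad)
            hi = min(len(signal), i + pad + 1)
            for j in range(lo, hi):
                cleaned[j] = None
    return cleaned


def mean_valid(samples):
    """Mean of the samples that survived rejection, or None if none survived."""
    kept = [s for s in samples if s is not None]
    return mean(kept) if kept else None


def percent_change_by_window(trial, onset, n_windows=6):
    """Percentage change from baseline in consecutive 500 ms windows after onset.

    trial: cleaned, rectified samples (1 sample = 1 ms); onset: index of stimulus onset.
    The baseline is the mean of the samples from 1250 to 250 ms before onset.
    """
    if onset < 1250:
        raise ValueError("need at least 1250 ms of signal before stimulus onset")
    baseline = mean_valid(trial[onset - 1250:onset - 250])
    result = []
    for k in range(n_windows):
        start = onset + k * BIN_MS
        window_mean = mean_valid(trial[start:start + BIN_MS])
        if baseline is None or window_mean is None:
            result.append(None)
        else:
            result.append(100.0 * (window_mean - baseline) / baseline)
    return result
```

In this sketch a window that loses all of its samples to artefact rejection yields `None` rather than a number, so that it can be excluded from later averaging across models and stimulus sets.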
The Kolmogorov–Smirnov normality test revealed that the data were not normally distributed. Therefore, we applied a non-parametric statistical method (Wilcoxon signed-rank test) for each paired comparison of the emotion types (cry, laughter versus control) at each of the six time-windows. The problem of multiple comparisons (increased probability of type I error) was corrected for by using the Bonferroni correction (significance level: p < 0.05/18).
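As an illustration of the test and the correction named above, the sketch below computes the Wilcoxon signed-rank statistic for paired data and the Bonferroni-adjusted significance level. It is a simplified version under stated assumptions: zero differences are dropped, tied ranks are averaged, and Z comes from the normal approximation without tie or continuity correction. The source does not say which variant (exact or approximate) the authors used, and the function names are hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (W_plus, z, n) for paired samples x and y.

    W_plus is the sum of ranks of the positive differences; z is its
    normal approximation; n is the number of non-zero differences.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # extend j over a run of tied absolute differences
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    expected = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return w_plus, (w_plus - expected) / sd, n


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


# 0.05 / 18 is the level stated in the text, roughly 0.0028
threshold = bonferroni_threshold(0.05, 18)
```

A comparison counts as significant only if its p-value falls below `threshold`, which is why the correction is conservative when many windows and comparisons are tested.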
To examine the facial mimicry responses of infants to presented emotions, we analysed whether the specific activation of the corrugator supercilii (brow) in response to crying and the zygomaticus major (cheek) in response to laughter were observed in each audiovisual, visual or auditory emotion-modality condition. Muscle activities in each 500 ms time-window after stimulus onset are shown in figure 2.
In the AV condition, the corrugator supercilii muscle showed remarkably increased activity at 500–1000 ms after stimulus onset in response to crying (left panel in figure 2a). This increased activity of the corrugator supercilii muscle was not clearly observed in either the V or the A condition. Wilcoxon signed-rank tests confirmed that responses of the corrugator supercilii to crying were significantly larger than those to control stimuli (Z = 3.2, p = 0.0004, r = 0.82) in the 500–1000 ms time-window. In the V and A conditions, no significant differences in EMG activity across emotions were found in any time-windows (see table 1 for statistical results).
The zygomaticus major muscle also showed greater activity in response to laughter, compared with crying and control stimuli at 500–1000 ms after stimulus onset in the AV condition (left panel in figure 2b). Wilcoxon signed-rank tests revealed a greater activation of the zygomaticus major in response to laughter than to control stimuli in the 500–1000 ms time-window (Z = 3.0, p = 0.001, r = 0.78). In the V and A conditions, no significant differences in EMG activity across emotions were found in any time-windows (table 1).
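The text does not state how the effect size r was computed. One common convention for the Wilcoxon signed-rank test is r = Z / √N, and the sketch below assumes that convention, with N taken as the 15 infants in the final sample. Both the formula and the function name are assumptions, not details from the source.

```python
import math


def effect_size_r(z, n):
    """Effect size r = |Z| / sqrt(N), an assumed convention for Wilcoxon tests."""
    return abs(z) / math.sqrt(n)


N_INFANTS = 15  # final sample size reported in the text

r_corrugator = effect_size_r(3.2, N_INFANTS)   # AV crying versus control
r_zygomaticus = effect_size_r(3.0, N_INFANTS)  # AV laughter versus control
```

Under this assumption the two values come out at roughly 0.83 and 0.77, close to the reported 0.82 and 0.78. Gaps of this size are what would be expected if Z was rounded to one decimal place before reporting.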
Finally, we observed that activity changes with repeated presentation of stimulus clips were clearly different between the two muscles: the activity of the corrugator supercilii decreased with repeated presentation of the stimulus clip, whereas the activity of the zygomaticus major increased with repetition (figure 3; electronic supplementary material, figure S1). Wilcoxon signed-rank tests with the Bonferroni correction (significance level: p < 0.05/9) revealed a significant decrease in the activity of the corrugator supercilii with repeated presentation in the auditory-cry condition (first clip > second clip: Z = 2.7, p = 0.004, r = 0.70) and the control (neutral) condition (first clip > second clip: Z = 3.1, p = 0.001, r = 0.810; first clip > third clip: Z = 2.9, p = 0.002, r = 0.75). This tendency was not statistically significant for other conditions (see table 2 for details). A significant increase in the activity of the zygomaticus major was found only for the control condition (first clip < third clip: Z = 2.8, p = 0.003, r = 0.72). Because this tendency was observed consistently for all emotions within each muscle, though in many cases it did not reach statistical significance, the activity pattern related to habituation is more likely to depend on the characteristics of the muscle than on the emotions to which the muscle reacts.
This study investigated whether four- to five-month-old infants show facial mimicry in response to audiovisual, visual or auditory emotions in others. Recordings were made of the infants' facial EMG activities over the corrugator supercilii (brow; a muscle involved in crying) and the zygomaticus major (cheek; a muscle involved in laughter) while the infants viewed the stimuli. The results showed increased activation of the corrugator supercilii in response to audiovisual crying, and of the zygomaticus major in response to audiovisual laughing, between 500 and 1000 ms after stimulus onset. Consistent with the findings of previous studies with adults [4,5] and children [27,35–37], infants' muscle activity in response to audiovisual emotional displays emerged and peaked between 500 and 1000 ms after stimulus onset. Therefore, the current results clearly demonstrate that automatic facial mimicry is present as early as five months of age when multimodal emotional information is present.
One might argue that infants merely showed appropriate emotional responses to the stimuli, rather than mimicking the facial movements they saw. In adults, previous studies have suggested that facial mimicry involves both sensory–motor matching processes and emotional processes [38–40]; thus, mimicry may be in part a consequence of emotional responses. In this study, we observed mimicry in infants, but the underlying mechanisms remain unclear (i.e. to what extent those two processes are recruited in the infant's mimicry responses). Nevertheless, it is likely that both processes are interactively linked and play an important role in the development of higher social cognition.
While clear evidence of facial mimicry of audiovisual emotional stimuli was found, four- to five-month-old infants did not produce automatic facial reactions to unimodal visual and auditory emotional signals. Adults show facial EMG reactions to both unimodal facial expressions (visual only) [4–6,41,42] and emotional vocalization (auditory only). A recent study showed that even three-year-old children show mimicking responses to unimodal facial expressions. In contrast to those previous findings in adults and children, our results revealed that four- to five-month-old infants do not show mimicking responses towards unimodal emotional faces and vocalizations. These findings, together with the clear mimicking responses to audiovisual bimodal emotions, suggest that four- to five-month-old infants have started to construct a system that elicits mimicry, but that at this stage the system has matured only to the point at which motor responses are triggered when multimodal emotional information is provided.
Previous developmental studies of emotion recognition reported that emotion recognition is initially promoted in naturalistic, multimodal conditions and that this ability is later extended to auditory and visual unimodal conditions [28,29]. For example, a study by Flom et al. revealed that four-month-old infants discriminate happy, sad and angry emotions through audiovisual bimodal stimulations, but that sensitivity to auditory stimuli emerges at five months and that to visual stimuli emerges at seven months. This notion is also supported by event-related potential studies. In this study, we found that four- to five-month-old infants show a clear mimicry response only to bimodally presented emotions, but not to unimodal emotional signals, which is consistent with the initial stage of the development of emotion recognition in infants. These findings support the possibility that facial mimicry and emotion recognition develop in tandem in early infancy.
When facial mimicry emerges in postnatal development, reciprocal mimicry in our social interactions and Hebbian associative learning may explain the mechanisms that generate it [44–50]. Infants spontaneously and involuntarily produce several facial expressions during their early development. For example, they begin to smile in social contexts at around two to three months. Facial expressions in infants often induce adults around them to produce similar expressions [52,53]; this, in turn, provides infants with visual input that links their motor output to personal emotional experiences. Co-occurrence of perception, action and emotional experiences forms a network across these channels. This loop shapes the system that enables us to automatically produce the facial action and emotion congruent with those expressed by others, which enables infants to develop the ability to recognize others' emotional expressions. Recent studies have revealed that infants' motor resonance is recruited during observation of others' goal-directed actions, depending on the infant's capacity to produce the same action [54,55]. This suggests that infants' own simultaneous sensory and motor experience gradually shapes the sensory–motor coupling network during early development and facilitates the understanding of others' actions. Similar developmental processes might exist in the domain of facial expressions, in which infants express, observe and understand facial expressions, though faces are radically different from other actions in that we usually do not see our own facial actions, and in that they are deeply linked with emotional processing. In future studies, it would be important to clarify the details of the emergence and the development of facial emotional mimicry by focusing on multiple developmental stages and to directly investigate the links between facial mimicry and emotion recognition in early development. Furthermore, clarifying the neural mechanisms underlying mimicry development (e.g. Hebbian learning) will provide important insights into how humans construct basic mechanisms that further facilitate higher social cognition, such as empathy or theory of mind, during early development.
Finally, several limitations of this study should be acknowledged. First, despite our best efforts to collect more data, given the difficulty of testing infants, the analysis was conducted on a relatively small dataset. Partly related to that, owing to the non-normal distribution of the collected data, non-parametric methods with a relatively conservative adjustment were employed for the statistical analyses. While this provided strong evidence for the positive result (i.e. presence of mimicry in the AV condition), it may have increased the risk of type II errors for the null results (i.e. absence of mimicry in the V and A conditions). Further investigations with larger samples are needed for a more conclusive answer regarding the modality-dependent response differences. Second, the stimuli used in this study had several limitations. Most notably, they lacked some ecological validity. For the bimodal emotion condition, we used auditory–visual asynchronous stimuli that were emotionally congruent. Although previous studies have shown that temporal synchrony between face and voice is not imperative for infants to detect common emotions across modalities at five months of age [28,29], it seems important for four-month-old infants. Thus, the use of audiovisual asynchronous stimuli might have weakened the infants' responses in this study, though we still observed clear mimicking responses in this condition. On the other hand, in the unimodal emotion conditions, we presented an emotional signal in the target modality with neutral cues in the other modality. Although many studies testing infants' sensitivity to auditory emotions present auditory stimuli paired with neutral facial expressions [29,56,57], there remains the possibility that these unnatural stimuli prevented infants from producing natural responses.
In addition, this manipulation made it difficult to compare the results fairly with previous findings of mimicry for unimodal stimuli, which have mostly focused on adults, as those studies usually do not present neutral signals in the untargeted modality. Furthermore, we had only female models in our stimuli, which may have prevented us from determining whether the facial mimicry response is a general mechanism regardless of the person who presents the emotions, or whether there are particular effects induced by the characteristics of models, such as sex or degree of familiarity. Moreover, we used a different video as the fixation clip in each trial in order to keep infants engaged. This procedure introduced variance in muscle activity during the fixation clip across trials and reduced the validity of the baseline. In future studies, it will be important to confirm whether the same results can be obtained by using more natural stimuli, such as non-manipulated, completely synchronous audiovisual stimuli and auditory-only or visual-only unimodal stimuli, as well as by using a wider variety of models, including males or persons familiar to the infants, and by employing better-controlled procedures that are also compatible with an infant-friendly task. Finally and most importantly, because this study did not test infants younger than four months of age, we could not determine the earliest age at which mimicry first emerges. Although one unpublished study has reported an absence of mimicry measured by EMG among three-month-old infants, testing younger infants with the same paradigm would allow us to assess the developmental course of mimicry in a direct manner. Thus, further investigations are needed to understand the whole developmental trajectory of facial mimicry in infants and its links to various social cognitive abilities.
Overall, this study investigated automatic facial mimicry of infants in response to others' emotional displays (laughter and crying) by measuring facial EMG activities. We found that four- to five-month-old infants showed clear facial EMG reactions to dynamic presentation of audiovisual emotions. However, they did not show similar reactions towards unimodal auditory or visual emotional stimuli. These results suggest that automatic facial mimicry is present in infants as young as five months when multimodal emotional information is provided, but that responses to unimodal emotions probably develop later.
We thank Kazuko Nakatani for her support in recruiting participants and collecting data. We are grateful to the models who cooperated in creating the stimuli and to all of the families that kindly and generously gave their time to participate in this study.
This study received approval from the institutional ethics committee, and we adhered to the Declaration of Helsinki. The parents of all participants gave written informed consent for their child's participation in the study.
Raw data are available from the Dryad Digital Repository.
T.I. and T.N. designed the study; T.I. performed experiments; T.I. and T.N. analysed the data and drafted the manuscript. Both authors gave final approval for publication.
We have no competing interests.
This work was supported by KAKENHI grants no. A15J000670 to T.I. and no. 25700014 to T.N. from the Japan Society for the Promotion of Science.