This paper presents follow-up longitudinal data to research that previously suggested the possibility of abnormal gaze behavior marked by decreased eye contact in a subgroup of 6-month-old infants at risk for autism (Merin et al., 2007). Using eye-tracking data and behavioral data recorded during a live mother-infant interaction involving the still-face procedure, the predictive utility of gaze behavior and affective behaviors at 6 months was examined using diagnostic outcome data obtained longitudinally over the following 18 months. Results revealed that none of the infants previously identified as showing lower rates of eye-contact had any signs of autism at outcome. In contrast, three infants who were diagnosed with autism demonstrated consistent gaze to the eye region and typical affective responses at 6 months. Individual differences in face scanning and affective responsivity during the live interaction were not related to any continuous measures of symptom frequency or symptom severity. In contrast, results of growth curve models for language development revealed significant relationships between face scanning and expressive language. Greater amounts of fixation to the mother’s mouth during live interaction predicted higher levels of expressive language at outcome and greater rates of growth. These findings suggest that although gaze behavior at 6 months may not provide early markers for autism as initially conceived, gaze to the mouth in particular may be useful in predicting individual differences in language development.
Although autism is typically diagnosed in the preschool years (Mandell, Novak, & Zuritsky, 2005), the dramatic social, emotional, and financial costs of the disorder (Ganz, 2007), coupled with the recent reports of increased prevalence rates (e.g., Baird et al., 2006; CDC, 2007), have given a particular sense of urgency to the goal of identifying autism at much earlier ages. The identification of early markers for autism will be crucial for developing and implementing more effective treatments at earlier ages, and may also help to identify specific causal developmental pathways that could ultimately focus efforts on prevention. To date, research using a variety of methodological designs – including retrospective review of home movies, early screening measures, and prospective follow-along designs of infants at risk – has identified a number of potential early signs of autism. For instance, Osterling, Dawson, and Munson (2002) reviewed home movies of first birthday parties and demonstrated that children later diagnosed with autism showed significantly less orienting to name and less looking at people at 12 months of age than comparison children matched on IQ. Similarly, in a two-stage early screening study of over 30,000 14-month-old infants, Dietz, Swinkels, van Daalen, van Engeland, & Buitelaar (2006) found a number of parent-reported behaviors that were strong predictors of later diagnosis, including “interest in other children or adults” and “reacts when spoken to”.
A number of more recent prospective studies using samples of at-risk infants (where at-risk is defined as having an older sibling with autism) have examined similar behavioral markers at even earlier ages. For instance, Cassel et al. (2007) reported significantly less smiling during face-to-face interactions in high-risk infants at 6 months of age compared to low-risk infants. Similarly, Yirmiya et al. (2006) reported that high-risk infants at 4 months of age tended to be less upset by the still-face procedure than comparison infants. More recently, Merin, Young, Ozonoff, & Rogers (2007) reported abnormal visual attention patterns to the eyes during face-to-face interactions in a subgroup of high-risk infants at 6 months of age. These findings are particularly compelling because they have been found at such early ages in a group of infants at risk for autism and because abnormal gaze to the eye region and decreased shared affect are such prominent symptoms of autism at older ages. Importantly, however, these studies did not report on any outcome diagnostic data for these infant siblings and thus could not yet speak to the clinical utility of such behavioral markers as predictors of actual formal diagnoses. Indeed, other recent longitudinal studies examining diagnostic outcome of at-risk infants have not found any measurable differences at 6 months of age between those infants who were later diagnosed with autism and those who were not (e.g., Landa & Garrett-Mayer, 2006; Zwaigenbaum et al., 2005). It is possible that behaviors such as a relative lack of visual attention to the eyes or a lack of shared affect during the first year of life are related more to risk-status (e.g., endophenotypes) than to diagnostic outcome and may ultimately have poor specificity and poor clinical utility. Thus, for studies finding evidence of candidate early markers in at-risk samples, it is crucial to follow up such findings with diagnostic data at later ages.
This paper presents follow-up longitudinal data through 24 months of age for subjects in a study recently reported by Merin et al. (2007) that used eye-tracking methodology and behavioral coding to examine gaze behavior and affective responsiveness of 6-month-old high-risk infants during a face-to-face interaction with their mothers. The purpose of the paper is to evaluate relations between gaze behavior and affect variables at 6 months and discrete clinical outcomes at 24 months as well as developmental trajectories for a number of clinically relevant measures.
The Merin et al. (2007) study hypothesized that infants at risk for autism would exhibit abnormal gaze fixation patterns to the face. This hypothesis derived primarily from two lines of research: a) the findings noted above from home video tape and early screening studies that have suggested observable deficits in social orientation and attention in the first year of life (e.g., Dietz et al., 2006; Osterling et al., 2002), and b) a variety of eye-tracking studies suggesting that older individuals with autism look significantly less at the eyes when viewing static faces or dynamic social stimuli (e.g., Klin, Jones, Schultz, Volkmar, & Cohen, 2002; Speer, Cook, McMahon, & Clark, 2007; Spezio, Adolphs, Hurley, & Piven, 2007). Moreover, this hypothesis was consistent with the proposals that decreased social orientation in general and gaze to eyes in particular may be core behavioral features of autism that cascade throughout early development causing collateral deficits in theory of mind and language development (e.g., Dawson, Webb, & McPartland, 2005; Mundy, 2003). The Merin et al. (2007) study additionally proposed that these early abnormal face scanning patterns would be associated not only with formal diagnoses of autism, but also with the broad autism phenotype – a constellation of sub-clinical and much milder features of autism symptoms such as language difficulties and poor social reciprocity found at greater rates in first-degree relatives of individuals with autism (Bailey, Palferman, Heavey, & Le Couteur, 1998; Stone, McMahon, Yoder, & Walden, 2007; Yirmiya et al., 2006). Thus, to the extent that risk-status was related to the broad autism phenotype, Merin et al. (2007) expected to find that infants at-risk for autism as a group would look to the eyes significantly less during face scanning than low-risk infants (defined as infants who had no siblings with autism).
Merin et al. (2007) used eye-tracking technology during a video-linked face-to-face social interaction between 6-month-old infants and their mothers. The procedure allowed for the recording of each infant’s point of gaze to his or her mother’s face in real time and for the measurement of how the behavior of the mother – systematically manipulated to be contingent or non-contingent – affected the infant’s point of gaze. Behavioral coding was also employed to measure the degree to which the mother-infant interaction affected the infant’s gaze aversion, smiling behavior, and negativity (crying and fussiness). The mother-infant interaction was patterned on the still-face paradigm (Tronick, Als, Adamson, Wise, & Brazelton, 1978). The study sample consisted of 31 at-risk infants (infants with an older sibling who had a confirmed diagnosis of autism), and 24 low-risk infants (infants who did not have any older siblings diagnosed with autism). Consistent with prior research using the still-face paradigm (e.g., Gusella, Muir, & Tronick, 1988; Tarabulsy et al., 2003; Toda & Fogel, 1993; Tronick et al., 1978), all infants, regardless of risk-status, showed a significant decrease in smiling behavior and a significant increase in gaze aversion and negative affect in response to the still-face portion of the interaction. When examining the eye-tracking data for gaze fixations during the interaction, Merin et al. also found that the infants as a group exhibited a significant relative increase in looking at their mother’s eyes as opposed to other areas on the mother’s face when she became unresponsive. 
Although most infants reacted to the mother’s sudden unresponsiveness by repeatedly averting their gaze, when the infants did look to their mother during this still-face condition, their gaze was directed significantly more toward the eyes – a finding consistent with the interpretation that the infant’s visual attention to the mother’s eyes is a strategy for extracting meaning about an ambiguous situation. Interestingly, however, close inspection of individual differences in gaze behavior patterns across the interaction conditions suggested that not all infants displayed this pattern of gaze behavior to the same degree over the course of the still-face paradigm.
In examining gaze behavior patterns across conditions, Merin et al. used a ratio score based on gaze fixations to the eyes relative to the eyes and mouth combined. This metric – dubbed the “eye-mouth index” – was used as a way to represent the relative importance of the eye region for each infant when looking to the inner portions of the face. This data, from each condition separately, was then entered into an agglomerative clustering algorithm in order to identify subgroups of gaze behavior. Three clusters of gaze fixation patterns were identified: 1) infants who tended to look primarily at the mother’s eyes throughout the task (during interactive and unresponsive conditions alike); 2) infants who looked primarily at the mother’s mouth during the interactive portions but primarily at the mother’s eyes only during the still-face portion; and 3) infants who looked primarily at the mother’s mouth throughout the task and never showed a preference for looking at the mother’s eyes. As a test of the original hypothesis – that high-risk infants would show less gaze to the eyes than the low-risk infants – Merin et al. then examined the relationship between risk-status and gaze fixation cluster scores. Although the majority of infants in both the at-risk and the low-risk groups exhibited relatively more looking to the eyes, at least during the unresponsive condition, 10 of the 11 infants who showed patterns of relatively more looking to the mouth throughout all 3 conditions of the still face were in the high-risk status group. Consistent with the original hypothesis, it appeared that an overall decreased amount of gaze to the eye region – both overall relative lack of eye contact during social interaction and a notably lessened response to the sudden ambiguous unresponsiveness of the mother – was related to familial autism risk.
The obvious question raised by these findings is how many of the infants who displayed such a relative deficit in contingent gaze to the eyes would ultimately develop autism or show later signs of the broader autism phenotype – subclinical levels of autistic-like behaviors or other delays? Although Merin et al.’s findings suggested a relationship between risk-status and less looking toward the eyes, risk-status is quite a different thing than actual clinical outcome. Indeed, the majority of infant siblings of children with autism will exhibit typical development (Yirmiya & Ozonoff, 2007). Thus, examining the relationship between gaze patterns to the eyes at 6 months and autism outcome status at 24 months was the primary goal of this follow-up study. Our specific hypotheses were as follows:
Infants were recruited from families who either had an older child already diagnosed with autism, or who had an older child developing typically or showing non-autistic developmental delays. A total of 108 6-month-old infants were recruited and enrolled into the study, 91 of whom comprised the sample reported on previously (Merin et al., 2007). Of these 108 infants, 55 had an older sibling who had previously received a diagnosis of autism spectrum disorder, 43 had typically developing older siblings, and 10 had an older sibling with some developmental delay (e.g., cerebral palsy, Down syndrome, global developmental delays, or speech delay). Twenty-two infants did not participate in the eye-tracking battery because of high degrees of fussiness and crying or because of technical difficulties with lab equipment. An additional 20 infants who did participate in the eye-tracking battery did not have usable or valid data (e.g., poor tracking signal, poor calibration, invalid still-face procedure). No differences in recruitment group, later outcome, or other variables examined below were found between these infants and the final sample. For the purposes of this follow-up study, we further decided not to include any infants who had an older sibling with a developmental delay, which led to an additional 8 subjects being dropped from the sample (2 newly recruited infants and 6 infants from the original sample). The final sample consisted of 58 infants: 33 high-risk infant siblings of children diagnosed with autism, and 25 infant siblings of children with typical development.
Table 1 presents demographic data for the final sample of 58 infants. Chi-square analyses of the data presented in Table 1 revealed no significant differences between risk status and any of the demographic variables.
Following the initial visit at 6 months, infants were seen again at 12, 18, and 24 months of age. Nine infants did not return for outcome testing at 18 or 24 months. The reasons were: declined to continue (n=4), moved away (n=4), and did not respond to contact attempts (n=1). There were no measurable differences between these nine infants who dropped out of the study and those who remained enrolled on any of the variables examined below (e.g., recruitment group, demographics, intellectual and developmental functioning, gaze fixation patterns). One child missed the 18-month visit but was seen at 24 months, and one child missed the 24-month visit but was seen at 18 months. Clinical outcome data was therefore available on 49 infants, 48 of whom were seen through 24 months. Clinical outcome data for measures described below is presented in Table 2.
The eye-tracking procedures during the 6-month visit were conducted in the context of a live video transmitted 3-minute parent-infant interaction. As noted, the interaction was structured using the still-face procedure developed by Tronick et al. (1978). Specifically, the procedure involved a 1-minute spontaneous face-to-face interaction between mother and infant, a 1-minute “still face” episode in which the mother was unresponsive and neutral while making eye contact with the infant, and a 1-minute re-engagement where the mother was again responsive and interactive.
Eye tracking during this mother-infant interaction was accomplished using a Tobii ET-17 bright-pupil corneal-reflection eye-tracker. The Tobii ET-17 employs a fixed wide-angle camera mounted within a video monitor to record gaze position to the screen from a freely moving person seated in front of the monitor. The image and sound of the infant’s mother, who was seated in front of a remote camera and video monitor located in an adjacent room, were presented to the infant on the Tobii eye-tracking monitor. The feed from a second camera and microphone recording the infant was simultaneously sent to the monitor viewed by the mother, thereby completing a video link between the infant and mother that enabled real-time spontaneous interaction. Images of the infant and mother were displayed at approximately life size on the video monitor and eye-tracking monitor, respectively, and sound levels were set at naturalistic levels.
In addition to the eye-tracking and still-face procedure during the initial 6-month visit, data on overall developmental level, demographics, and other behavioral observations were also collected, as described below. A number of these measures were administered again during later visits at 12, 18, and 24 months as part of the larger aims of the study to track the development and onset of clinically significant concerns or diagnoses in infants at risk for autism.
Autism Diagnostic Observation Schedule (ADOS; Lord, Rutter, DiLavore, & Risi, 1999). The ADOS is a standardized play-based behavioral observation measure of autism symptoms consisting of 25 items across 4 domains: social interaction, communication, repetitive and stereotyped behaviors, and play. Items are scored as 0 (developmentally appropriate and not autistic), 1 (mildly atypical), 2 (atypical and autistic in quality), or 3 (severely autistic). The ADOS yields scores summarizing the number and severity of symptoms in each domain and provides clinical cut-off scores for use in diagnosis of autism spectrum disorders and autistic disorder. All examiners were required to meet reliability criteria of greater than 80% exact agreement in scoring and administration as part of initial and ongoing training. All reliability scoring and training was conducted by 2 licensed psychologists with expertise in autism diagnosis and treatment. The ADOS was administered at 18 (n=47) and 24 months (n=48). All ADOS administrations involved Module 1 except for 3 assessments at 24 months which utilized Module 2.
Modified Checklist for Autism in Toddlers (M-CHAT; Robins, Fein, Barton, & Green, 2001). The M-CHAT is a screening measure completed by parents that consists of 23 yes/no items concerning behaviors often indicative of early signs of autism. The measure yields a summary score consisting of the total number of endorsed items as well as a subscale of 6 critical items used to determine autism risk. The M-CHAT was administered at 18 months (n=46).
Mullen Scales of Early Learning (MSEL; Mullen, 1995). The MSEL is a normed standardized developmental measure of language, cognitive, and motor functioning that provides age-equivalent and standard scores from birth to 68 months of age. Four subscales were used: visual reception, fine motor, expressive language, and receptive language. The MSEL was administered at ages 6 (n=57), 12 (n=53), 18 (n=47), and 24 months (n=48).
Vineland Adaptive Behavior Scales (Sparrow, Balla, & Cicchetti, 1984). The Vineland is a parent interview that assesses social, communication, motor, and daily living skills and provides age-equivalent and standard scores for a variety of summary scales and subscales, including expressive and receptive language and social adaptive functioning. The Vineland was administered at ages 12 (n=54), 18 (n=44), and 24 months (n=44).
MacArthur Communicative Development Inventories (CDI; Fenson et al., 1993). The CDI is a parent questionnaire that assesses a variety of aspects of language development, including vocabulary production, grammar, and sentence construction. The total raw word production score was used, consisting of the number of words endorsed by the parent out of 680 words across 22 categories (e.g., clothing, body parts, action words, etc.). The CDI was administered at ages 18 (n=43) and 24 months (n=32).
Diagnostic outcome was formally determined by a licensed clinical psychologist with specific expertise in the early diagnosis of autism and developmental disorders (SO or SR). Using data gathered on the ADOS, as well as supplementary data (e.g., M-CHAT, MSEL, behavioral observations), the clinician completed a formal checklist of DSM-IV criteria for Autistic Disorder and Pervasive Developmental Disorder Not Otherwise Specified (PDDNOS). If a child met criteria for either condition on the autism spectrum, they were assigned to the Autism/ASD outcome group. If not, their outcome was classified as either Speech-Language Delay (defined by a score on the MSEL Expressive Language scale of more than 1.5 standard deviations below the normative mean), Other Concerns (e.g., global developmental delay, marked shyness, behavior problems like oppositionality or hyperactivity), or No Concerns.
As detailed in Merin et al. (2007), a number of infant behaviors during the mother-infant interaction were reliably coded from video and shown to be sensitive to the mother-infant interaction conditions. The behaviors coded during the mother-infant interaction that were used in the current study as predictors of clinical outcome consisted of gaze aversion (looking away from the screen), negative affect (fussiness, grimacing, brief bouts of crying), and smiling. Behavior codes were expressed as proportion scores – the duration of each behavior per episode divided by the total duration of the respective episode. As reported by Merin et al., coder reliabilities for each of these codes were high (ICCs ranging from .86 to .95).
Gaze fixation data was measured and recorded using the Tobii ET-17 infrared bright-pupil corneal-reflection eye-tracker. Data was recorded at 30 Hz (i.e., 1 sample per video frame) and each infant was calibrated using a five-point attention-getter prior to beginning the experiment in order to ensure positional validity of gaze measurements. All five calibration points were required prior to data collection. Fixations were calculated as the mean x–y position of all consecutive gaze data that fell within a 30-pixel radius (~1.5 cm) for at least 100 milliseconds (~3 video frames). Data was not used if calibration was unsuccessful, if too little eye-tracking data was recorded (i.e., less than 15% of time when infant was actually looking toward the screen in any given episode¹), or if the infant became too upset during the task to track.
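The fixation rule described above (consecutive samples within a 30-pixel radius lasting at least 100 ms at 30 Hz) can be illustrated with a minimal sketch. This is not the eye-tracker's own algorithm, which the text does not specify: measuring the radius from the running centroid of the current window is an assumption, and the function name `detect_fixations` and its tuple output are hypothetical.

```python
from math import hypot

SAMPLE_RATE_HZ = 30    # sampling rate reported in the text
RADIUS_PX = 30         # spatial threshold reported in the text
MIN_DURATION_MS = 100  # temporal threshold reported in the text


def detect_fixations(samples, radius=RADIUS_PX, min_ms=MIN_DURATION_MS,
                     rate=SAMPLE_RATE_HZ):
    """Group consecutive (x, y) gaze samples into fixations.

    Assumption: a sample joins the current window if it lies within
    `radius` pixels of the window's running centroid.
    Returns (start_index, end_index, mean_x, mean_y) tuples, end inclusive.
    """
    # Smallest number of samples spanning min_ms (ceiling division): 3 at 30 Hz.
    min_samples = -(-min_ms * rate // 1000)
    fixations = []
    window = []
    start = 0

    def flush(end):
        # Keep the window only if it lasted long enough to count as a fixation.
        if len(window) >= min_samples:
            cx = sum(x for x, _ in window) / len(window)
            cy = sum(y for _, y in window) / len(window)
            fixations.append((start, end, cx, cy))

    for i, (x, y) in enumerate(samples):
        if window:
            cx = sum(px for px, _ in window) / len(window)
            cy = sum(py for _, py in window) / len(window)
            if hypot(x - cx, y - cy) > radius:
                flush(i - 1)   # close the window that just ended
                window = []
        if not window:
            start = i
        window.append((x, y))
    flush(len(samples) - 1)
    return fixations
```

Under these assumptions, a run of only two stable samples is discarded, while a run of three or more yields one fixation whose position is the mean of its samples.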
Raw gaze position was exported as a video overlay on the video of the mother as seen by the infant and as recorded at the time of the experiment. Fixation start and stop times, synchronized to the video overlay, were simultaneously exported as a text file and used as the coding template for each infant. As detailed in Merin et al. (2007), coders then coded each fixation listed on the template as they watched the video with gaze overlay, indicating the area-of-interest (AOI) for each fixation. Coder reliabilities for each of 10 different AOIs (e.g., left eye, right eye, cheek, hair, etc.) were calculated on approximately 20% of the files, with intra-class correlations for each AOI all above .90. Coded eye-tracking data yielded duration and frequency scores for each AOI, which were further collapsed to yield eye, mouth, and other face fixation frequencies and durations.
From these data, for each of the conditions in the mother-infant interaction, an “eye-mouth index” score was calculated as the amount of gaze to the eye region divided by the combined amount of gaze to both the eye and the mouth region. As reported in Merin et al. (2007), a hierarchical agglomerative cluster analysis was then employed to classify infants into one of 3 distinct groups: 1) infants who exhibited preferential gaze to the eye region across all phases of the still-face paradigm (the high-high-high or “HHH” cluster; n=32), 2) infants who exhibited preferential gaze to the mouth region (the low-low-low or “LLL” cluster; n=9), or 3) infants who exhibited preferential gaze to the eyes only during the unresponsive condition of the still-face paradigm (the low-high-low or “LHL” cluster; n=17). This technique allowed us to include the 9 new infants in our sample into the same classification scheme used by Merin et al. Consistent with Merin et al., the majority (78%, n=7) of infants in the LLL cluster were in the at-risk group, although this proportional difference was no longer statistically significant (χ² = 2.25, df = 2, p = .33).
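As a small illustration of the eye-mouth index defined above, the following sketch computes the ratio for one episode. The function name, the choice to return `None` when no gaze fell on either region, and the example durations are assumptions made for illustration; the clustering step is not reproduced here.

```python
def eye_mouth_index(eye_gaze, mouth_gaze):
    """Return gaze to the eyes divided by gaze to eyes plus mouth.

    Inputs are amounts of gaze (e.g., fixation durations) for one episode.
    Assumption: the index is undefined (None) when both amounts are zero.
    """
    total = eye_gaze + mouth_gaze
    if total == 0:
        return None
    return eye_gaze / total


# Hypothetical durations in seconds for the three still-face episodes.
episodes = {
    "interaction": (12.0, 4.0),
    "still_face": (9.0, 1.0),
    "re_engagement": (6.0, 6.0),
}
profile = {name: eye_mouth_index(eye, mouth) for name, (eye, mouth) in episodes.items()}
```

Values near 1 indicate a preference for the eye region and values near 0 a preference for the mouth, so each infant's three-episode profile is the kind of input a cluster analysis would group.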
The decision to examine data from both eye and mouth regions in the same metric rather than eye region separately was made for a number of reasons. In addition to the obvious reason that it allowed us to follow up the findings reported by Merin et al., which used the same index for cluster analyses, it reflected the measurement emphasis in prior literature, which has focused primarily on these particular regions as being the most revealing of group differences. The majority of studies have reported less gaze to the eye region and more gaze to the mouth region in autism than in comparison subjects (e.g., Hobson, Ouston, & Lee, 1988; Joseph & Tanaka, 2003; Klin et al., 2002; Langdell, 1978; Spezio et al., 2007).² A third reason for using the eye-mouth index was the fact that gaze to the inner region of the face (i.e., left and right eyes, bridge of nose, nose, and mouth) accounted for approximately 74% of all gaze recorded in the present study across all episodes of the still-face, with limited variability in other regions (e.g., hair, chin, non-face). Moreover, overall gaze to the eye region was highly and negatively correlated with gaze to the mouth region (r = −.75, p < .001), suggesting that individual preferences for looking to the eyes were directly offset by less looking to the mouth, a relationship found for both the high-risk (r = −.79) and low-risk samples (r = −.69). Given such high correlations, including gaze to both the eyes and to the mouth in the same metric was a way to avoid the statistical redundancy and increased family-wise error that would result from separate predictive models examining gaze to eyes and gaze to mouth separately.
It is important to note, however, that using gaze data only from the eye and mouth regions as a predictor of autism symptoms might miss important differences with respect to actual looking to the inner regions of the face (e.g., eyes and mouth) relative to other regions (e.g., hair, forehead, chin). As such, a second type of gaze index was calculated as the amount of time spent looking to both the eyes and mouth relative to all face regions. In order to verify the relative independence of this “inner-outer face index” from the eye-mouth index, correlational analyses were conducted and revealed that this second index score was unrelated to the eye-mouth index score for both the at-risk sample (r = − .06, p = .74) and the low-risk sample (r = −.14, p = .51).
Given that both the eye-mouth index and inner-outer face index were proportion scores, and that proportions have, by definition, a non-constant variance, we calculated the inverse sine of the square-root of these proportion scores in order to normalize the distribution and to stabilize the variance (Mosteller & Youtz, 1961). Subsequent inspection of Q-Q plots and formal tests of normality indicated that transformed scores followed relatively normal distributions, with skewness and kurtosis statistics well within ±1. The eye-mouth index scores for each condition – initial interaction, unresponsive still-face, and re-engagement – were all highly correlated, ranging from r = .68 (p < .001) for unresponsive still-face and re-engagement, to r = .75 (p < .001) for initial interaction and still-face. A similar pattern was observed for the inner-outer face index scores, ranging from r = .49 (p < .001) for initial interaction and unresponsive still-face, to r = .58 (p < .001) for initial interaction and re-engagement.
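The variance-stabilizing transformation described above, the inverse sine of the square root of a proportion, can be sketched as follows. The function name and the input check are illustrative assumptions rather than details from the study.

```python
import math


def arcsine_sqrt(p):
    """Return asin(sqrt(p)) for a proportion p in [0, 1], in radians."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(p))


# Hypothetical eye-mouth index scores for illustration only.
raw_scores = [0.10, 0.50, 0.90]
transformed = [arcsine_sqrt(p) for p in raw_scores]
```

The transformed values range from 0 to π/2, and the transformation stretches the scale near 0 and 1, where raw proportions have the smallest variance.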
In order to provide a direct follow-up to our previously reported results, initial analyses focused on examining gaze fixation pattern clusters of the originally included and the newly recruited subjects in relation to categorical outcome data obtained at 24 months of age. We then extended these initial categorical analyses by examining the degree to which individual differences in gaze and affect predicted differences in primary symptoms (e.g., ADOS scores) and secondary symptoms (e.g., language, adaptive behavior) associated with autism. To this end, multi-level modeling was employed to assess the degree to which gaze behavior and affect at 6 months (as indexed by arcsine transformed eye-mouth ratio scores) predicted the developmental course (i.e., slope) and outcome (i.e., intercept, centered at 24 months) from 6 to 24 months on a variety of behavioral and parent report measures. We also employed generalized estimating equations to examine predictive longitudinal models for count data (e.g., CDI, ADOS symptom frequency) using negative binomial distributions.
Given the inclusion of additional subjects for the current study relative to Merin et al. (2007), preliminary analyses were conducted to verify the methodology (i.e., the “still-face effect”) and to replicate our previously reported findings of the gaze behavior and affect variables: gaze aversion, inner-outer face index, eye-mouth index, smiling, and negative affect. Consistent with our previous report (Merin et al., 2007), repeated measures analyses for each variable revealed significant main effects for episode with significant behavioral changes in response to the mother becoming unresponsive in the “still-face” episode: more overall gaze aversion, more negative affect, less smiling, more relative gaze to the eyes, and a slightly decreased amount of gaze to the inner portion of the face. Also consistent with our prior report, no significant effects were found for risk-status or for risk-status by episode interactions for any of the 5 variables.
Of the sample of 58 infants, 49 had clinical outcome data. The top portion of Table 3 presents risk group by 4 separate outcome diagnostic categories: No Concerns, Other Concerns, Speech-Language Delays, and Autism/ASD. As can be seen in Table 3, only three infants were eventually diagnosed with autism; one was from the low-risk sibling group, and two infants were from the high-risk sibling group. In order to formally test for relationships between risk-group and outcome, clinical outcome groups were collapsed into a dichotomous variable consisting of “no concerns” versus “concerns of any type” (i.e., Speech-Language Delay, Other Concerns, Autism/ASD). A chi-square analysis of risk-group by dichotomous outcome revealed a marginally significant relationship between risk-group and outcome (χ² = 3.57, df = 1, p = .06), with 11 infants in the at-risk group showing some sort of clinical concerns or delay vs. only 4 in the low-risk group.
The bottom portion of Table 3 presents the eye-mouth index cluster by clinical outcome group. Using the same dichotomous outcome categories employed above, a chi-square analysis of eye-mouth cluster by outcome group revealed no significant relationship (χ² = 1.51, df = 2, p = .47). Of the 7 infants in the sample with outcome data who were classified as looking primarily at the mouth region throughout, only 1 infant showed any clinical concerns (severe hyperactivity), and this infant had an ADOS algorithm score of 3 (where the diagnostic cutoff for autism is 12 and ASD is 7) and MSEL scores all well within the normal range.
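For readers who want to see how the chi-square statistics reported for the outcome analyses map onto their p-values, the sketch below uses the closed-form upper-tail probabilities of the chi-square distribution for 1 and 2 degrees of freedom. It starts from the reported statistics only; the underlying cell counts are in Table 3 and are not reproduced here. The function name is hypothetical.

```python
import math


def chi2_p_value(stat, df):
    """Upper-tail p-value of a chi-square statistic for df = 1 or df = 2.

    Closed forms: erfc(sqrt(x / 2)) for df = 1 and exp(-x / 2) for df = 2.
    """
    if stat < 0:
        raise ValueError("chi-square statistic cannot be negative")
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only covers df = 1 and df = 2")


# Statistics as reported in the text.
p_risk_by_outcome = chi2_p_value(3.57, df=1)     # risk group by outcome
p_cluster_by_outcome = chi2_p_value(1.51, df=2)  # gaze cluster by outcome
```

Rounded to two decimals, both values agree with the p-values given in the text (.06 and .47).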
A second set of analyses was performed that again examined each of the 5 gaze behavior and affect variables in a repeated measures ANOVA, with episode, dichotomous clinical outcome (concerns or no-concerns), and the episode by outcome interaction term. Results revealed no significant main effect or interaction effect for clinical outcome on any of the still-face variables. Both the concerns group and the no-concerns group showed similar responses to the mother-infant interaction on each of the gaze and affect variables.
Next, we considered the data for the 3 infants diagnosed with Autism/ASD, as shown in Table 4. As can be seen, all three infants clearly showed relatively severe symptoms as measured by ADOS scores and MSEL language scores at the time of diagnosis. Surprisingly, however, all 3 infants also showed a high proportion of gaze to the eye region, especially during the unresponsive still-face portion of the experiment (two in the HHH cluster and one in the LHL cluster). As a point of comparison, and as Table 3 indicates, an additional 19 infants in the No Concerns group at outcome also displayed this pattern of looking behavior. Table 4 also shows that the total amount of looking time to the mother’s face (eyes and mouth) exhibited by each of the 3 infants diagnosed with autism or ASD was very close to the mean for the entire group (Mean = 52.14, SD = 24.53, n=58), as evidenced by their individual z-scores for total looking time. This offered further evidence that these 3 infants did not differ in any meaningful way from the sample as a whole. With respect to behavioral observations of these 3 infants, Table 4 also indicates that each infant exhibited a normative pattern of increased gaze aversion and decreased smiling during the still face episode of the mother-infant interaction. Closer examination of these individual data revealed that none of the 3 infants’ scores were more than one standard deviation above or below the sample mean for any of the variables except for infant 23032 who exhibited somewhat more smiling behavior than typical infants (+1.7 to +1.8 SDs for the interactive and re-engage episodes respectively).
As an independent verification that the 3 infants who developed autism exhibited relatively typical behavior at 6 months, inspection of clinical notes taken during each visit for these 3 infants revealed that none of the parents reported any concerns about their infant at the 6 month visit. The examiner did note some subtle motor delays for infant 23032 at 6 months, but no other examiner concerns were specified for any of these 3 infants, particularly with regard to social and communication development.
The next set of analyses focused on examining gaze behavior and affect variables as predictors of autism symptoms using continuous measures of each, rather than categorical variables which collapse across potentially meaningful individual variability. To this end, we examined 3 separate measures of autism symptoms: a) symptom severity as measured by the communication + social interaction algorithm score on the ADOS, b) the number of ADOS items (out of 25) scored 2 or 3, and c) the total number of items endorsed on the M-CHAT. Although these 3 dependent variables were highly correlated with each other (between .64 and .85), these three scores represent a rough progression from greater diagnostic specificity (ADOS expert ratings of symptom severity) to lower diagnostic specificity (M-CHAT parent screening items) and provide several options for assessing any resultant predictive utility of eye-mouth index scores.
Examination of the data for these three autism outcome variables revealed variances many times greater than the means, indicating significant over-dispersion in the distributions. We therefore analyzed autism symptom data with generalized estimating equations using negative binomial distributions with log link functions. This strategy also allowed us to examine more subtle change in autism symptoms at different outcome ages by estimating within-subject error variances in a repeated measures model (Hardin & Hilbe, 2003). For each of these analyses, we first examined the main effects for risk status and time, and the interaction between risk status and time. We then examined the extent to which gaze behavior scores and smiling during the still-face procedure at 6 months explained variance in the dependent variable above and beyond any significant effects of group and time point included in the model. Variables were entered into models using a hierarchical approach, in the following order: 1) total duration of gaze aversion from mother’s face; 2) proportion of gaze to inner face versus total face (i.e., the inner-outer face index); 3) proportion of gaze to eyes versus both eyes and mouth (i.e., the eye-mouth index); and 4) amount of smiling and negativity.3 Intercorrelations between these variables were all quite low, ranging from −.02 for eye-mouth index and gaze avert scores to .25 for negative affect and gaze avert scores. Unique contributions of model effects were examined by evaluating changes in corrected quasi-likelihood values and Wald chi-square tests of parameter effects. Only significant main effects and higher-order interaction effects involving gaze behavior scores or smiles were retained in the final models.
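The over-dispersion check that motivates the negative binomial models can be illustrated with a minimal sketch. The counts below are hypothetical values invented for illustration (they are not data from this study), and the variable names are assumptions. A Poisson-distributed count would have a variance roughly equal to its mean, so a variance-to-mean ratio well above 1 points toward an over-dispersed model such as the negative binomial.

```python
import statistics

# Hypothetical symptom counts for 16 children (illustrative only, not study data).
counts = [0, 0, 1, 0, 2, 0, 0, 7, 1, 0, 12, 0, 3, 0, 0, 1]

mean_count = statistics.mean(counts)
var_count = statistics.variance(counts)  # sample variance (n - 1 denominator)

# Under a Poisson model the variance is approximately equal to the mean,
# so a ratio far above 1 signals over-dispersion.
dispersion_ratio = var_count / mean_count

print(f"mean = {mean_count:.2f}, variance = {var_count:.2f}, "
      f"variance/mean = {dispersion_ratio:.2f}")
```

This is only a descriptive screen; it does not reproduce the generalized estimating equations themselves.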
Analysis of symptom severity scores on the ADOS at 18 and 24 months revealed a significant main effect for time (Wald χ² = 3.81, df = 1, p = .05) with an expected log count decrease in ADOS severity scores of .21 from 18 to 24 months, a decrease corresponding to .78 points in the ADOS algorithm score for the entire sample. There was no main effect for group and no interaction effect for group by time. None of the 6 month gaze or affect behaviors predicted overall severity scores or change in severity from 18 to 24 months (ranging from p=.22 for gaze averts to p=.80 for negative affect).
Analysis of symptom frequency, as measured by the total number of symptoms coded either a 2 or 3 on the ADOS at 18 and 24 months, similarly revealed a significant main effect for time (Wald χ² = 6.14, df = 1, p < .05) with an expected log count decrease in number of symptoms of .43 from 18 to 24 months, a decrease corresponding to approximately 1 symptom count. The effects for group and group by time were not significant. None of the gaze or affect behaviors predicted overall autism symptoms or change in number of symptoms over time (ranging from p=.24 for eye-mouth index to p=.59 for negative affect).
Analysis of the total number of items endorsed by the parent on the M-CHAT at 18 months revealed a significant group effect (Wald χ² = 5.40, df = 1, p < .05), with an expected log count difference between groups of 1.28, corresponding to approximately 1.55 more symptoms endorsed by parents of high-risk infants (Mean = 2.15, SE = .78) compared to low-risk infants (Mean = .60, SE = .24). None of the gaze or affect behaviors predicted individual differences in M-CHAT scores, and none interacted with group (ranging from p=.67 for eye-mouth index to p=.97 for inner-outer face index).
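The correspondence between the log count difference and the group means reported for the M-CHAT can be checked with a few lines of arithmetic. The sketch below uses only the two means given in the text; the variable names are assumptions. Under a log link, a coefficient is a difference in log expected counts, so it equals the log of the ratio of the two means, while the raw difference in means gives the "more symptoms" figure.

```python
import math

# Estimated marginal means reported in the text for the M-CHAT.
mean_high_risk = 2.15
mean_low_risk = 0.60

# Difference on the log (link) scale: log(2.15) - log(0.60) = log(2.15 / 0.60).
log_count_difference = math.log(mean_high_risk / mean_low_risk)

# Exponentiating the coefficient recovers the multiplicative rate ratio.
rate_ratio = math.exp(log_count_difference)

# Difference on the original count scale.
raw_difference = mean_high_risk - mean_low_risk

print(f"log count difference = {log_count_difference:.2f}")  # 1.28
print(f"rate ratio           = {rate_ratio:.2f}")            # 3.58
print(f"raw difference       = {raw_difference:.2f}")        # 1.55
```

The same conversion applies to the other log count effects reported in this section: a coefficient b multiplies the expected count by exp(b).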
The next set of analyses focused on the relationship between gaze and affect variables and standardized measures of language using the MSEL. For these analyses, we used multi-level modeling to predict growth trajectories from 6 to 24 months (i.e., slopes) and differences at outcome (i.e., intercept, centered at 24 months) as a function of risk status and gaze behavior and affect variables. For each model, we first examined unconditional growth models for both overall linear and non-linear (e.g., quadratic) growth trajectories, retaining higher-order growth trajectories when significant. Risk status and the risk status by time interaction term were entered next, followed by gaze behavior and affect variables in a step-wise fashion from total gaze aversion to eye-mouth index and, lastly, smiling and negative affect. None of the higher-order effects involving risk status were significant in any of the models tested; they were therefore not retained in the final solutions and are not reported for any of the following analyses. Error-covariance matrices were estimated using restricted maximum-likelihood methods, and inspection of information criteria for the same models using a variety of error-covariance structures suggested that the best fitting and most parsimonious structure was a heterogeneous first-order autoregressive matrix. Effect parameters were then estimated using full maximum likelihood methods. Normality and homoscedasticity assumptions were evaluated using q-q plots of both level-1 and level-2 residuals and by examining scatterplots of level-1 and level-2 residuals against each of the model’s predictors. These diagnostic methods suggested the model assumptions were tenable.
Results of the growth-curve analysis for MSEL expressive language age equivalents revealed a significant linear effect (γ = 1.09, p < .001) for age from 6 to 24 months. There were no significant effects for group or group by time interactions. There were no significant effects for gaze averts (p=.55), for inner-outer face index scores (p=.26), for smiling (p=.28), or for negative affect (p=.32). However, for eye-mouth index scores, there was a significant effect in predicting expressive language age at 24 months and a significant interaction between eye-mouth index scores and time. Specifically, eye-mouth index scores were negatively related to expressive language outcome at 24 months of age (γ = −5.50, p = .001), and were negatively related to expressive language growth trajectories (γ = −0.27, p < .01). These effects are shown in Figure 1, which plots prototypical trajectories for an infant with an eye-mouth index score one standard deviation below the mean (solid line) and for an infant with an eye-mouth index score one standard deviation above the mean (dotted line). The trajectories reveal that higher amounts of gaze to the eyes at 6 months are associated with significantly slower rates of expressive language development and, by 24 months of age, with a significant 4-month delay in expressive language age compared to higher amounts of gaze to the mouth.
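To make the shape of these prototypical trajectories concrete, the following sketch evaluates a simple linear growth equation using the three fixed effects reported above. It is an illustration under stated assumptions rather than the fitted model: the 24-month intercept (`INTERCEPT_24`) and the standard deviation of the eye-mouth index (`EMI_SD`) are hypothetical values not reported in this section, with `EMI_SD` chosen only so that the 24-month gap comes out near the reported 4 months. The sketch also assumes a mean-centered eye-mouth index and ignores random effects.

```python
# Fixed effects reported in the text (MSEL expressive language, age-equivalent months).
GAMMA_AGE = 1.09          # linear growth per month of age
GAMMA_EMI = -5.50         # eye-mouth index effect on the 24-month intercept
GAMMA_EMI_X_AGE = -0.27   # eye-mouth index effect on the monthly slope

# Hypothetical values, NOT reported in this section (assumptions for illustration).
INTERCEPT_24 = 22.0       # assumed expected age equivalent at 24 months
EMI_SD = 0.36             # assumed SD of the (mean-centered) eye-mouth index


def predicted_language_age(age_months: float, emi_centered: float) -> float:
    """Predicted expressive language age, with time centered at 24 months."""
    t = age_months - 24
    return (INTERCEPT_24
            + GAMMA_AGE * t
            + GAMMA_EMI * emi_centered
            + GAMMA_EMI_X_AGE * emi_centered * t)


def mouth_minus_eyes_gap(age_months: float) -> float:
    """Gap between a -1 SD (mouth-preferring) and a +1 SD (eye-preferring) infant."""
    return (predicted_language_age(age_months, -EMI_SD)
            - predicted_language_age(age_months, +EMI_SD))


for age in (6, 12, 18, 24):
    print(f"{age:2d} months: gap = {mouth_minus_eyes_gap(age):.2f} months")
```

Under these assumptions the two trajectories are nearly indistinguishable at 6 months and diverge steadily afterward, which is the pattern the slope-by-index interaction describes.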
The finding that greater overall looking at the mother’s eyes predicts slower expressive language development may have been driven in part by the inclusion of 3 infants who developed autism. As noted earlier in Table 3, each of these 3 infants had relatively high eye preference gaze fixations at 6 months and also had lower language scores. Thus, in order to determine whether the relationship between face scanning and language was simply an artifact of the inclusion of these infants, the data were reanalyzed, after removing these 3 participants, on the remaining sample of 55 infants. Indeed, we adopted the conservative strategy of removing these 3 children with autism from all remaining analyses examining gaze and affect variables and developmental measures. In none of the subsequent analyses did gaze aversion, inner-outer face index scores, smiling, or negative affect significantly predict 24-month intercepts or rates of change for any of the dependent variables examined. As such, only results for models with eye-mouth index scores as a predictor are presented.
Results of the analyses conducted on the sample without the 3 individuals with autism are presented in Table 5. As can be seen, removing these 3 individuals did not alter the results for the eye-mouth index scores and MSEL expressive language scores; the eye-mouth index score was still significantly negatively related to outcome at 24 months (γ = −4.93, p < .01) and rate of change (γ = −0.24, p < .01). Table 5 also displays the results of additional growth curve models using eye-mouth index scores fit to the MSEL receptive language age equivalent scores, the Vineland expressive language age equivalent scores, and the visual reception and fine motor age equivalent scores from the MSEL. The analysis for the Vineland expressive language scores revealed similar relationships between eye-mouth index scores and expressive language age equivalent scores for both 24-month outcome (p < .05) and, at a marginal significance level (p = .06), rate of change. In contrast, analyses of the other 3 MSEL subscales – visual reception, fine motor, and receptive language – did not reveal the same predictive patterns as those between eye-mouth index scores and expressive language age equivalent scores. Indeed, only the relationship between eye-mouth index scores and the MSEL receptive language intercept at 24 months was significant (p = .05). The relationship between eye-mouth index scores and rate of growth in MSEL receptive language was not significant. A similar analysis of the Vineland receptive language age equivalent scores at outcome or growth over time did not show any significant relationships with eye-mouth index scores.
A reanalysis of MSEL expressive language age equivalent scores, using the same growth model as above but with the addition of receptive language scores as a time-varying covariate, revealed that eye-mouth index scores were still significantly related to expressive language scores at 24 months (γ = −3.49, p < .01) and rate of growth (γ = −0.16, p < .05). Thus, the relationship between eye-mouth index scores and expressive language development was relatively independent of any general language ability as represented by the shared variance between receptive and expressive language also included in the model.
To further explore the relationship between gaze behavior and expressive language, we next examined parent reported expressive vocabulary on the MacArthur CDI, administered at both 18 and 24 months. Given that the CDI yields a raw count of words produced by the child, we modeled vocabulary production by using a negative binomial distribution within a repeated measures framework using generalized estimating equations. Results of the analysis revealed a significant main effect for time (Wald χ² = 110.50, df = 1, p < .001), with an expected log count increase of 1.52 over time, an increase of approximately 220 words between 18 and 24 months. There was no difference in vocabulary production by group, and no group by time interaction. The main effect for eye-mouth index scores was significant (Wald χ² = 9.16, df = 1, p < .01), with an expected log count decrease of 0.76 in relation to respective increases in eye-mouth index scores, an effect corresponding to approximately 41 fewer vocabulary words per standard deviation increase in eye-mouth index scores. Figure 2 displays the estimated marginal means for vocabulary production for representative groups of infants with high eye-mouth index scores (i.e., 1 SD above the group mean) and low eye-mouth index scores (i.e., 1 SD below the group mean). Consistent with findings for the MSEL and Vineland expressive language scores, and as depicted in Figure 2, infants who fixated predominantly on the mother’s eyes evidenced significantly smaller vocabularies at both 18 and 24 months of age compared to infants who fixated predominantly on the mother’s mouth. As with the results for the MSEL expressive language scores, this overall main effect for eye-mouth index scores remained significant even when covarying receptive language scores from the MSEL (Wald χ² = 3.89, df = 1, p < .05).
In order to examine whether the relationship between eye-mouth index scores and language was a function of only the interactive aspects of the still-face paradigm (when the mother was speaking), or was related to face processing in general (regardless of whether the mother was speaking or not), we repeated the analyses of the MacArthur vocabulary production scores using the eye-mouth index scores derived from each of the 3 still-face conditions separately: the interactive phase, the unresponsive still-face phase, and the re-engagement phase. Results of these analyses revealed that the relationship between eye-mouth index scores and expressive language, as measured by the MacArthur, was only significant for the interactive (Wald χ² = 5.90, df = 1, p < .05) and re-engagement phases (Wald χ² = 5.74, df = 1, p < .05). The amount of gaze fixation to the eyes during the unresponsive still-face condition was not related to vocabulary development. Re-analyses of the MSEL and Vineland using eye-mouth index scores from each condition separately suggested a similar overall pattern of results with much stronger, significant relationships between eye-mouth index scores and expressive language during the interactive conditions as opposed to the unresponsive condition.
The last analysis involved examining a growth model for the Vineland socialization domain age equivalent scores, again using gaze behavior and affect scores as predictors of both outcome at 24 months and of developmental rate. The results revealed a significant main effect for eye-mouth index scores as a predictor of socialization age equivalent scores at 24 months (γ = −4.81, SE = 1.80, p < .05) and a significant interaction effect between eye-mouth index scores and time (γ = −0.35, SE = 0.16, p < .05). A significant unique effect was also found for overall duration of smiling as a predictor of rate of growth in socialization age equivalent scores from 12 to 24 months (γ = 0.49, SE = 0.23, p < .05). In order to account for the possibility that expressive language may mediate the relationship between socialization scores and eye-mouth index scores (see Baron & Kenny, 1986), especially given the previous findings for expressive language and face scanning, a follow-up analysis was conducted using the same growth model but with the addition of expressive language age equivalent scores as a time-varying covariate. Results for this analysis revealed a significant main effect for expressive language as a predictor of socialization scores (γ = 0.44, SE = 0.08, p < .001). The effect for eye-mouth index scores as a predictor of socialization scores at 24 months and rate of growth was no longer significant after controlling for expressive language. The effect of smiling behavior on the rate of growth was marginally significant (γ = 0.36, SE = 0.21, p = .09) after inclusion of expressive language scores as a mediator variable.
The primary goal of this paper was to examine the degree to which individual differences in gaze and affect behavior at 6 months of age might serve as an early marker of autism diagnosis and related symptoms at outcome. Based on prior literature suggesting that deficits in affect sharing, social orienting in general and gaze to the eye region in particular are features of autism at later ages, we hypothesized that similar affect sharing and visual attention abnormalities might be present much earlier than ages at which formal diagnoses can be made. Indeed, previous cross-sectional research with this same group of infants at 6 months of age (Merin et al., 2007) suggested that risk-status, with high risk defined as having an older sibling with autism, was associated with less gaze fixation to the eye region in a subgroup of infants during a live interaction between the infant and mother. Although longitudinal follow-up data for the same sample presented here yielded only 3 infants who developed autism by 24 months of age, it was relatively clear that these 3 infants did not exhibit any abnormal face scanning patterns at 6 months. Indeed, these 3 infants exhibited a pattern of face scanning that was very similar to the sample mean and was characterized by increased overall gaze aversion during the unresponsive portion of the still face, high overall gaze to the inner regions of the face, and a majority of time spent looking to the eye region of the mother’s face during the structured interaction task. Moreover, the smiling behavior of all 3 infants was likewise very close to the sample means and showed a typical and normative pattern of significant decrease in response to the mothers’ unresponsivity. 
The fact that these 3 infants did not differ in any quantitatively meaningful way from typical infants at 6 months was also consistent with clinical impressions and parental reports of behavior at 6 months; none of the parents or examiners noted any evidence of autism related behaviors in any of these 3 infants at 6 months of age, despite later diagnoses of autism. Thus, the possibility suggested by our previous findings that decreased gaze to the eyes at 6 months might predict a later diagnosis of autism was not supported by the profiles of the 3 subjects in the sample who did develop autism. Moreover, when examining gaze behavior and affect as predictors of the severity or frequency of autism symptoms in the sample as a whole, analyses revealed no relationships between individual differences in gaze behavior or affect and autism symptoms at 24 months, further suggesting that gaze behavior and affect at 6 months are not predictive of later autism symptoms even at sub-clinical levels.
Although we did not have data from enough 6 month old infants who later developed autism to comprise a representative sample of autism, these results do help put into perspective our original cross-sectional findings: having an older sibling with autism may have been associated with relatively less gaze to the eyes in a sub-sample of 6-month-olds as originally reported in Merin et al. (2007), but any such differences in gaze behavior at 6 months had nothing to do with the later development of autism symptoms. Moreover, the original finding reported by Merin et al. of a significant relationship between high-risk status and decreased gaze to the eyes was not replicated in the current study after the inclusion of additional infants. These findings underscore the fact that “at-risk” status is not synonymous with later delays, concerns, or autism symptomatology at outcome. Indeed, we found only a marginally significant relationship between risk status and clinical outcome (with a number of low-risk infants showing concerns other than autism or language specific symptoms).4 As such, the current results do provide an important corrective to our previous findings by illustrating the importance of obtaining outcome measures in place of risk-status alone when utilizing a prospective study design to identify early markers of autism.5
Although we failed to find evidence for abnormal gaze behavior or affect at 6 months in the 3 children in our sample who developed autism, it may be that such abnormalities might yet develop in these infants by 12 or 18 months of age. Thus, our findings may say more about the timing of symptom onset than about the absolute importance of gaze behavior and affect as early signs of autism. Research on the early developmental course of autism, using a variety of methodologies such as parent report, analysis of home movies, and prospective case studies, has documented significant heterogeneity in symptom onset patterns and course over the first two years of life, ranging from early onset prior to 6 months of age (e.g., Dawson, Osterling, Meltzoff, & Kuhl, 2000; Werner, Dawson, Munson, & Osterling, 2005) to developmental regression as late as 18 to 21 months (e.g., Goldberg, Osann, Filipek, Laulhere, Jarvis, Modahl, et al., 2003). Therefore, it may be that we failed to see abnormal gaze and affect at 6 months in the 3 infants from our sample who developed autism not because gaze and affect are unimportant to the early identification of autism, but because the process of symptom onset had simply not yet begun for any of these 3 infants at 6 months.
In contrast to the lack of relationships between gaze behavior and affect at 6 months and primary autism symptoms at outcome, parallel analyses predicting secondary symptoms such as language functioning and socialization measures revealed a number of significant predictive relationships. Our hypotheses regarding secondary symptoms such as language were similar to those for autism symptoms: less gaze to the eyes would also predict slower developmental rates and poorer outcomes. Although we found significant predictive effects, these effects were in the opposite direction of our initial hypotheses. Growth curve analyses revealed that infants who fixated more on their mother’s mouth during the 6 month mother-infant interaction developed language at significantly higher rates and had significantly higher expressive language scores at 24 months. In terms of age equivalent scores on the MSEL, this effect amounted to a difference of more than 4 months in developmental age by 24 months between infants who looked preferentially at the mouth (−1 SD eye-mouth index) vs. those who looked preferentially at the eyes (+1 SD eye-mouth index).
With respect to other subscales of the MSEL – namely, the receptive language, fine motor, and visual reception subscales – we did not find the same relationships between face scanning and developmental levels except for a weaker relationship between face scanning and receptive language levels at 24 months (with a difference of 2 months at outcome in age equivalent scores between preferential mouth vs. preferential eye gaze). As such, within the MSEL itself, the relationship with face scanning appeared to be fairly specific to expressive language. Indeed, after covarying receptive language, the relationship between face scanning and expressive language remained, suggesting that the effect was specific to an aspect of expressive language separate from the shared variance between expressive and receptive language.
Analysis of the Vineland expressive language scale likewise revealed a similar relationship between gaze to the mouth and rate of development and developmental levels for expressive language at 24 months of age. Moreover, within the Vineland communication domain, this effect was again constrained only to the expressive language subscale and not found with the receptive language subscale. The fact that gaze to the mouth showed a similar relationship to the MacArthur CDI, a parent report measure of expressive vocabulary size at both 18 and 24 months of age, provides additional evidence of the specific relationship between face scanning and expressive language even across two very different measurement methods. Again, this effect remained significant even after covarying the MSEL receptive language scores.
A final growth curve analysis conducted with the Vineland socialization domain age equivalent scores showed the same predictive relationships with face scanning at 6 months: increased relative gaze to the mouth at 6 months was related to increased rates of growth and 24 month outcome socialization scores. Moreover, the amount of smiling behavior displayed at 6 months was also uniquely and significantly related to the rate of change in socialization scores, with greater smiling at 6 months predicting greater rates of growth in socialization scores between 12 and 24 months of age.
The relationship between less eye gaze and socialization scores is reminiscent of results reported by Klin et al. (2002). Using similar eye-tracking technology with adolescents and adults diagnosed with high-functioning autism, Klin et al. instructed participants to watch a dynamic movie clip of a complex and emotionally charged social interaction. The analyses of visual scanning patterns indicated that decreased gaze fixation to the eye regions of faces was related to higher Vineland socialization domain age equivalent scores for subjects with high-functioning autism. The interpretation offered by Klin et al. (2002) was that decreased gaze to the eyes, and the concomitant increased gaze to the mouth, was a compensatory strategy for increasing one’s understanding of social situations by focusing on language, a strategy revealed in higher socialization domain scores.
In our own sample of infants, the relationship between mouth gaze and socialization more likely reflects a normative developmental process rather than a compensatory strategy. Nevertheless, an important component of Klin et al.’s interpretation is the proposition that language processing is a mediator of the relationship between face scanning and social adaptation. Indeed, considering that language is one of many mediums by which an individual might develop and exhibit socially adaptive behavior, this mediation hypothesis is particularly plausible, whether as a compensatory mechanism in adults with a disorder or as a normative developmental process in 6 month old infants who are first beginning to learn language. The role of language in social adaptation is apparent even in the developmentally early items on the Vineland Socialization scale (e.g., “Plays very simple interaction games with others”, “Imitates adult phrases heard on previous occasions,” “Addresses at least two familiar people by name”).
Statistically, such a mediation effect would be detectable by the disappearance or at least dramatic reduction of any previously significant relationship between eye-mouth index scores and socialization scores after co-varying for significant language effects (Baron & Kenny, 1986; Judd & Kenny, 1981; MacKinnon, Warsi, & Dwyer, 1995). In fact, our own reanalysis of the growth curve model examining Vineland socialization scores revealed that the inclusion of expressive language scores as a covariate resulted in a highly significant relationship between expressive language and socialization domain scores while the relationship between eye-mouth index scores and socialization domain scores became non-significant. Moreover, the significant relationship between smiling behavior at 6 months and rates of growth in socialization scores was not affected to the same degree by the inclusion of expressive language as a mediator – a finding that suggests that smiling behavior at 6 months is related to the development of social adaptation either more directly as a measure of the same construct, or indirectly through an altogether different developmental mechanism.
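The statistical signature of mediation described above can be illustrated with a minimal simulation. This sketch uses ordinary least squares on simulated cross-sectional data, which is a deliberate simplification and not the multilevel growth model used in the present analyses; all variable names, effect sizes, and the sample size are hypothetical. The predictor `x` stands in for the eye-mouth index, `m` for expressive language, and `y` for socialization, with `x` affecting `y` only through `m`.

```python
import random


def cov(a, b):
    """Sample covariance (n - 1 denominator); cov(a, a) is the sample variance."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


def total_effect(x, y):
    """Slope from regressing y on x alone."""
    return cov(x, y) / cov(x, x)


def direct_effect(x, m, y):
    """Slope for x from regressing y on both x and the mediator m."""
    denominator = cov(x, x) * cov(m, m) - cov(x, m) ** 2
    return (cov(x, y) * cov(m, m) - cov(m, y) * cov(x, m)) / denominator


# Simulated data (hypothetical): x lowers m, and m raises y; x has no direct path to y.
rng = random.Random(0)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
m = [-1.0 * xi + rng.gauss(0, 0.5) for xi in x]
y = [0.8 * mi + rng.gauss(0, 0.5) for mi in m]

c_total = total_effect(x, y)        # clearly negative before adding the mediator
c_direct = direct_effect(x, m, y)   # shrinks toward zero once m is included

print(f"total effect of x on y:        {c_total:.2f}")
print(f"direct effect controlling m:   {c_direct:.2f}")
```

The drop from a sizable total effect to a near-zero direct effect is the pattern that the covariate reanalysis looks for.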
Despite the fact that we found a clear relationship between gaze to the mouth at 6 months and language development over the first two years, it is important to note that this finding was entirely independent of autism diagnoses at outcome or initial risk status at enrollment. Although we did find one indication for a risk-status effect in receptive language at 24 months, there were no group differences in gaze behaviors as predictors of language and, indeed, no group differences in gaze behavior itself aside from a trend for more high-risk infants in the lowest eye-preference group (the “LLL” group). Even this latter finding is tempered by the fact that the relationship between gaze to mouth and language was significant only for the interactive phases of the still face paradigm, a finding which encompasses both the “LLL” cluster and “LHL” cluster – or 44% of the entire sample, both low-risk (n=9) and high-risk (n=13) infants combined. Nevertheless, the findings for a relationship between gaze to mouth at 6 months and language development, though unexpected, are important in their own right for what they tell us about possible normative mechanisms of language development.
One area of early language development that has received intensive research focus is the mechanism by which infants are able to perceive meaningful components within the streams of speech to which they are constantly exposed (e.g., see Aslin, Jusczyk, & Pisoni, 1998). The task of learning language – of word comprehension and articulation and expression – is perhaps initially facilitated by an infant’s ability to perceive the phonemic building blocks of spoken language (Aldridge, Stillman, & Bower, 2001; Eimas, Siqueland, Jusczyk, & Vigorito, 1971; Jusczyk & Aslin, 1995; Maye, Werker, & Gerken, 2002). Moreover, individual differences in such early speech perception abilities have been found to predict language development by 24 months of age in several recent longitudinal studies (Fernald, Perfors, & Marchman, 2006; Newman, Ratner, Jusczyk, Jusczyk, & Dow, 2006; Tsao, Liu, & Kuhl, 2004). This suggests that any developmental phenomenon that might facilitate an infant’s early speech perception abilities might likewise facilitate later language development.
Of particular relevance to our own data showing a relationship between gaze to mouth and expressive language is research on infants’ ability to integrate auditory and visual information in the service of speech perception. Several lines of research have demonstrated that visual attention to the mouth can profoundly influence and facilitate speech perception. Perhaps one of the most long-standing lines of research demonstrating this is the McGurk effect (McGurk & MacDonald, 1976), which consists of the alteration of speech perception with the presentation of synchronized but unrelated audio and video input (e.g., the sound [ba] paired with the image of [ga] produces the perception of [da]). Even in infants as young as 2 months of age, visual input of speech is readily integrated with auditory input (Burnham & Dodd, 2004; Kuhl & Meltzoff, 1982; Patterson & Werker, 2003). In paradigms using degraded auditory speech signals coupled with clear corresponding visual signals (e.g., Schwartz, Berthommier, & Savariaux, 2004), research has also demonstrated that access to visual input allows the perceiver to “recover” what is lost in the auditory signal. Thus, although the acoustic properties of speech are undoubtedly important for learning language, visual attention to the mouth can greatly influence and facilitate language perception, and such audiovisual integration appears to occur quite naturally even in 2-month-old infants. It is therefore reasonable to assume that the degree to which an infant focuses on his or her mother’s mouth during vocal face-to-face interactions may likewise facilitate speech perception and, in the long run, language development.
Turning back to our own findings regarding the relationship between face scanning and language, individual differences in looking times to the mouth may index the degree to which infants utilize visual information of a speaker’s mouth to assist in the task of segmenting speech. This interpretation is supported by the fact that we found the relationship between gaze to the mouth and language development to be particularly strong only during the episodes when the mother was actively talking to the infant. The face scanning data recorded during the unresponsive still-face condition did not predict language development as strongly, particularly as measured by vocabulary size on the MacArthur.
Nevertheless, it might still be argued that speech perception is primarily, if not entirely, dependent on acoustic properties of speech (Ohala, 1996) and that individual differences in early speech perception abilities have little to do with visual attention to the mouth. In fact, it could be argued that the association we found between visual attention to the mouth and language development is spurious – that both of these phenomena are related to the mothers’ use of infant-directed speech. To the extent that infant-directed speech involves exaggerated articulation and prosody, it might also involve exaggerated movement – a visual aspect that may encourage greater amounts of looking to the mouth but without any related facilitative effects on speech perception. Future research should include a measure of individual differences in audiovisual integration and cross-modal mapping in relation to visual attention to the mouth and later language development in order to better address these various interpretations.
Although our own data from 6-month-olds do little to help predict the development of autism, they clearly suggest a normative developmental mechanism that might be involved in the development of language skills – a mechanism that was perhaps understandably overlooked in working backwards from prior research on older individuals with autism who look relatively little to the eyes when scanning faces. Armed now instead with a more developmentally appropriate hypothesis of the importance of looking to the mouth when learning language between 6 and 24 months of age, it is interesting to return to the question of early predictors for autism. Clearly, language development is severely affected in autism (see Tager-Flusberg & Caronna, 2007, for a review). Might it be possible that early audiovisual integration of speech perception is derailed in infants who later develop autism, and that this plays some role in their later deficits in language development? Could the poor language skills in the 3 children in our own sample who later developed autism be due in some small part to their failure to attend preferentially to the mouth during speech as an aid to learning language? Recent research examining audiovisual speech integration in adolescents with autism using a speech-in-noise paradigm has indeed demonstrated significant deficits in such integration (Alcántara, Weisblatt, Moore, & Bolton, 2004; Smith & Bennetto, 2007). Moreover, this deficit is consistent with more general accounts of significant deficits in cross-modal integration (e.g., see Iarocci & McDonald, 2006), and in the visual processing of biological motion (e.g., Blake, Turner, Smoski, Pozdol, & Stone, 2003; Gepner & Mestre, 2002).
Research on the links between cortical motor centers and speech perception (e.g., Fadiga, Craighero, Buccino, & Rizzolatti, 2002; Pekkola et al., 2006; Skipper, Nusbaum, & Small, 2005) further suggests that deficits in audiovisual integration during speech perception might be a function of abnormalities in specific neural substrates underlying multisensory processing, such as Broca’s area and the superior temporal sulcus. The degree to which such areas may be specifically impaired in autism compared to other disorders may ultimately provide important clues to the nature of language deficits in autism (cf. Oberman & Ramachandran, 2007).
In conclusion, although we found no evidence for early markers of autism in gaze behavior and affective responsivity at 6 months of age, our findings do raise a number of issues and questions for future research. This lack of relationship may suggest that early behavioral signs emerge later in the first year of life for most children with autism. As such, future research employing longitudinal measurement of variables such as face scanning at multiple timepoints may be better suited to identify whether and when variables such as gaze fixation to the eyes or mouth become abnormal in autism. Our results regarding language development and face scanning will also need to be replicated with a much larger sample, and refined using more sensitive measurements of language development, maternal speech characteristics, and infant face scanning behavior, collected more frequently during development. To the extent that these findings can be replicated and refined for typically developing infants and integrated with the extant developmental literature, it would then be informative to re-examine, in a larger sample of young children already diagnosed with autism, the role of face scanning in autism and the extent to which abnormalities in face scanning are related to comorbid deficits in audiovisual integration, speech perception, and language development. After such groundwork has been done, the next question to be asked may be whether any of these newly detailed phenomena can be used as an early marker for autism.
The work in this manuscript was supported by grant number R01 MH068398 from the National Institute of Mental Health awarded to S. Ozonoff, P.I., grants from the National Association for Autism Research (NAAR) and from Cure Autism Now (CAN) awarded to S. Rogers, and grants from the Medical Investigation of Neurodevelopmental Disorders Institute (MIND Institute) awarded to S. Ozonoff, S. Rogers, and N. Merin.
We extend special thanks to Aparna Nadig for her invaluable feedback on earlier versions of this paper.
1. This criterion was established previously by Merin et al. (2007) to exclude three significant outliers in the sample who proved particularly difficult to track during the still-face episode. The mean percent tracking data out of total gaze directed to the screen was 61% (SD = 19%).
2. The use of the eye-mouth index does not necessarily imply the hypothesis that individuals with autism show greater preference for the mouth region. Indeed, although some research has suggested that there is no autism-specific preference for looking to the mouth (e.g., van der Geest, Kemner, Verbaten, & van Engeland, 2002), the eye-mouth index here allows for either of the two possible scenarios suggested by the literature: a) a true group difference only for less gaze to the eye region in autism and not more gaze to the mouth region, in which case the inclusion of gaze to the mouth region in the denominator essentially becomes a constant term across both groups without affecting the overall results, or b) a true group difference for less gaze to the eye region but greater gaze to the mouth region in autism, in which case the inclusion of gaze to the mouth increases the sensitivity of our measure to detect autism-specific differences.
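The two scenarios in footnote 2 can be illustrated numerically. The sketch below assumes the eye-mouth index has the form of eye fixation time divided by the sum of eye and mouth fixation time. That form is consistent with the mouth term appearing in the denominator, but it is not spelled out in this section. The function name and the fixation times are hypothetical and chosen only for illustration; they are not data from the study.

```python
def eye_mouth_index(eye_time: float, mouth_time: float) -> float:
    """Share of eye-plus-mouth fixation time spent on the eyes.

    Assumed form of the index: eyes / (eyes + mouth).
    """
    total = eye_time + mouth_time
    if total <= 0:
        raise ValueError("no fixation time recorded on the eyes or the mouth")
    return eye_time / total


# Hypothetical fixation times in seconds (illustration only, not study data).
comparison = eye_mouth_index(eye_time=6.0, mouth_time=4.0)

# Scenario (a): less gaze to the eyes, same gaze to the mouth.
scenario_a = eye_mouth_index(eye_time=3.0, mouth_time=4.0)

# Scenario (b): less gaze to the eyes and more gaze to the mouth.
scenario_b = eye_mouth_index(eye_time=3.0, mouth_time=7.0)

print(round(comparison, 2), round(scenario_a, 2), round(scenario_b, 2))
```

With these illustrative values the index is 0.6 for the comparison case, about 0.43 under scenario (a), and 0.3 under scenario (b). Under the assumed form, the index is lower in both scenarios, and the drop is larger when reduced gaze to the eyes is accompanied by increased gaze to the mouth.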
3. For each of these predictor variables, we used mean proportion scores calculated across the 3 conditions of the still-face procedure. This was thought to be the most parsimonious approach given the high intercorrelations between the still-face conditions within each of the 4 measures, and the fact that raw change scores (e.g., difference scores between conditions) tended to confound those infants having high overall behaviors (such as in the HHH group for eye-mouth index scores) with those infants having low overall behaviors (such as in the LLL group). Moreover, even with behavior change as represented by cluster membership, as in the case of the eye-mouth index, or when using outcome variables from 24 months as covariates to predict episode effects in repeated measures analyses, the results did not differ from those reported here using continuous measures collapsed across condition. Thus, to avoid the loss of important variability in continuous measures, as well as the problems involved with using raw difference scores and the complexities of justifying separate cluster analyses for each predictive variable, we decided to employ mean overall proportion scores for each variable.
4. It is possible that this relatively weak relationship between risk status and clinical outcome reflects a recruitment bias that led to a greater number of general clinical concerns in the low-risk group than might be expected (i.e., 4 out of 24). Nevertheless, none of the parents of the low-risk group reported any concerns at enrollment, and none of the 3 low-risk infants with “Other Concerns” had formal clinical diagnoses, which makes it difficult to assess whether this number is abnormally high compared to the general population.
5. Although it could be argued that clinical diagnoses of autism at 24 months are somewhat unstable (e.g., Chawarska, Klin, Paul, & Volkmar, 2007), such diagnostic instability is relatively subtle (typically encompassing changes from a formal autism diagnosis to a milder autism spectrum disorder) and appears to be due more to false positives than to false negatives (Turner & Stone, 2007). Thus, it is unlikely that any additional infants in our sample classified as typical at 24 months would later be diagnosed with autism or that any such diagnostic instability would alter the results presented here. Furthermore, all of the children with autism/ASD outcomes have had their diagnoses confirmed at 36 months, and there were no false positives in this sample.
Gregory S. Young, M.I.N.D. Institute, Department of Psychiatry and Behavioral Sciences, School of Medicine, University of California, Davis, CA, USA.
Noah Merin, Neuroscience Graduate Group, University of California, Davis, CA, USA.
Sally J. Rogers, M.I.N.D. Institute, Department of Psychiatry and Behavioral Sciences, School of Medicine, University of California, Davis, CA, USA.
Sally Ozonoff, M.I.N.D. Institute, Department of Psychiatry and Behavioral Sciences, School of Medicine, University of California, Davis, CA, USA.