Here, we sought to investigate how children with ASD integrate multimodal cues during social communication. In light of the linguistic and socio-communicative impairments that characterize this disorder, we hypothesized that children with ASD would demonstrate abnormal neural responses while viewing co-speech beat gesture. Indeed, our results confirmed that children with ASD recruited different neural networks during the processing of co-speech beat gesture than age- and IQ-matched TD counterparts.
Similar to what has been observed in neurotypical adults (Holle et al. 2008; Hubbard et al. 2009), the TD children in our study showed increased activity in STG/S while viewing co-speech gesture. In contrast, the children with ASD did not show significant increases in activity within these regions specific to the presence of co-speech beat gesture. Further confirming this observation, direct group comparisons showed that STG/S was significantly more active in response to the presence of co-speech beat gesture in TD children than in children with ASD. Instead, the direct group comparisons revealed that children with ASD showed significantly greater activity than TD children within visual areas when processing co-speech beat gesture. Interestingly, activity in these visual areas was found to positively correlate with symptom severity as indexed by both the ADOS-G and SRS. Increased STG/S activity in response to viewing co-speech beat gesture – observed both in neurotypical adults and in TD children – may represent the integration of multimodal speech cues. Thus, for children with ASD, the observation that co-speech beat gesture has a modulatory effect on visual cortices (and that this effect becomes greater as a function of symptom severity) instead of on STG/S suggests that the auditory and visual aspects of the stimuli are being processed somewhat independently. Taken together, these findings suggest that children with ASD are not effectively integrating information from multiple sensory modalities during social communication.
Although there were similarities between the responses we observed in this sample of TD children and those we previously observed in neurotypical adults (Hubbard et al. 2009) for viewing co-speech beat gesture, there were also a number of differences. Neurotypical adults demonstrate greater activity in right anterior STG for the contrast of beat gesture with speech versus nonsense hand movement with speech (Hubbard et al. 2009); in TD children, however, significant differences for this contrast were observed only at liberal thresholds. Additionally, unlike neurotypical adults, TD children did not show increases in motor cortex activity in response to viewing co-speech beat gesture, and STG/S responses to co-speech beat gesture were limited to the right hemisphere (whereas responses were bilateral in neurotypical adults). This decreased sensitivity in TD children may reflect developmental differences in multimodal speech perception. For example, in a seminal study on audiovisual speech perception (McGurk and MacDonald 1976), only 52% of TD children aged 7–8 years were shown to be impacted by the presence of contradictory audiovisual speech cues. Future studies directly comparing children and adults are needed to further characterize developmental changes in the neural basis of multimodal speech perception.
In the case of children with ASD, increases in neural activity over that observed in TD controls are often interpreted as reflecting a compensatory strategy. For example, in Wang et al. (2006), increased activity for children with ASD (within regions recruited by TD controls) was suggested to reflect more effortful processing needed to complete the language processing task. Because there was no overt task in this study, it is unlikely that the additional activity we observed in visual areas reflects an explicit compensatory mechanism on the part of the children with ASD. Further support for this conclusion comes from an examination of areas in the brain where activity was modulated by symptom severity. The visual areas identified in between-group analyses as showing stronger activity in the children with ASD were the only areas in the brain where activity correlated with symptom severity: the more severe the ASD symptoms, the greater the activity in these visual areas. We therefore conclude that the abnormal activity observed in children with ASD in these regions is most likely indicative of a deficit in multisensory integration, observed most substantially (at both the neural and behavioral levels) in children with the greatest symptom severity. The findings of Mongillo et al. (2008) lend further support to this interpretation, as they found that SRS scores were negatively correlated with scores on the McGurk test – a test of auditory and visual speech integration (McGurk and MacDonald 1976). Thus, consistent with our results, greater symptom severity is associated with less evidence of multisensory integration.
The current findings – especially with regard to the positive correlation observed between symptom severity and neural activity in visual areas – are consistent with growing evidence of abnormal cortical connectivity in children with ASD (e.g., Kleinhans et al. 2008). It has been theorized that individuals with ASD exhibit increased local connectivity, to the detriment of long-range connectivity (for review, see Minshew and Williams 2007). For example, several studies have identified decreased connectivity between visual and frontal cortices (Villalobos et al. 2005; Koshino et al. 2008), and other studies have found increases in thalamocortical connectivity, hypothesized to compensate for reduced cortico-cortical connectivity (Mizuno et al. 2006). Also highly relevant to the current findings are studies reporting abnormal low-level visual processing (Bertone et al. 2005), visual hypersensitivity (Ashwin et al. 2009), and/or low-level visual problems (Vandenbroucke et al. 2008) in individuals with ASD. In this study, audiovisual integration – which depends on the synthesis of information from primary visual and auditory cortices – may be disrupted as a result of abnormal cortico-cortical connectivity and/or a specific deficit in visual processing. Future studies are needed to address these competing accounts.
Finally, our findings are in line with considerable evidence suggesting specific deficits in integrating communicative cues in individuals with ASD (Williams et al. 2004; Mongillo et al. 2008; Whitehouse and Bishop 2008; Klin et al. 2009). Recently, Mongillo et al. (2008) found that for a group of children with ASD, deficits in audiovisual integration were more salient when stimuli involved audiovisual elements of human communication (i.e., faces and voices) versus nonhuman visual and auditory stimuli. Similarly, Whitehouse and Bishop (2008) showed that children with ASD responded less to repetitive speech sounds than to repetitive nonspeech sounds, although responses to both types of sounds were the same when children with ASD were explicitly instructed to attend to the sounds. Williams et al. (2004) also reported deficits in audiovisual integration of visual speech (i.e., the movements of lips, mouth, and tongue which produce speech) in children with ASD. Klin et al. (2009) observed that 2-year-olds with ASD were more likely than controls to attend to nonbiological motion rather than to human biological motion. Most recently, Silverman et al. (2010) reported differences in how neurotypical individuals and individuals with ASD utilize iconic co-speech gesture to aid comprehension. Namely, the presence of iconic gesture facilitated comprehension in neurotypical individuals but did not facilitate comprehension in individuals with ASD. There is behavioral and neural evidence of a tight link between gesture and speech integration during speech processing in neurotypical individuals (Özyürek et al. 2007; Willems et al. 2007, 2008; Kelly et al. 2010). The abnormal neural responses we observed in children with ASD while listening to speech accompanied by beat gesture (i.e., audiovisual stimuli which have inherent communicative value) provide additional evidence of disrupted processing of communicative audiovisual cues even in high-functioning individuals with ASD.
Taken together, these findings highlight the importance of further examining how individuals with ASD process information that is directly relevant to social communication. In face-to-face communication, there is continuous information available from multiple sensory modalities (e.g., facial expression, tone of voice, and body posture). This study is the first to investigate how cues conveyed by hand gesture may impact speech perception in individuals with ASD; much remains to be explored with regard to how individuals with ASD process other types of communicative cues in real-world contexts. Further work in this area would not only contribute to our understanding of the communicative impairments seen in ASD but may also inform the design of future diagnostic tools and behavioral interventions.