One of the defining characteristics of autism spectrum disorder (ASD) is difficulty with language and communication.1 The onset of speech in children with ASD is usually delayed, and many children with ASD consistently produce language less frequently, and of lower lexical and grammatical complexity, than their typically developing (TD) peers.6,8,12,23 However, children with ASD also exhibit a significant social deficit, and researchers and clinicians continue to debate the extent to which deficits in social interaction account for or contribute to deficits in language production.5,14,19,25
Standardized assessments of language in children with ASD usually include a comprehension component; however, many such comprehension tasks assess just one aspect of language (e.g., vocabulary),5 include a significant motor component (e.g., pointing, act-out), and/or require children to deliberately choose among a number of alternatives. These last two demands are themselves known to be challenging for children with ASD.7,12,13,16
We present a method that can assess the language comprehension of young typically developing children (9-36 months) and children with autism.2,4,9,11,22 This method, Portable Intermodal Preferential Looking (P-IPL), projects side-by-side video images from a laptop onto a portable screen. The video images are paired first with a 'baseline' (nondirecting) audio, and then presented again paired with a 'test' linguistic audio that matches only one of the video images. Children's eye movements while watching the video are filmed and later coded. Children who understand the linguistic audio will look more quickly to, and longer at, the video that matches it.2,4,11,18,22,26
This paradigm includes a number of components that have recently been miniaturized (projector, camcorder, digitizer) to enable portability and easy setup in children's homes. This is a crucial point for assessing young children with ASD, who are frequently uncomfortable in new (e.g., laboratory) settings. Videos can be created to assess a wide range of specific components of linguistic knowledge, such as Subject-Verb-Object word order, wh-questions, and tense/aspect suffixes on verbs; videos can also assess principles of word learning such as a noun bias, a shape bias, and syntactic bootstrapping.10,14,17,21,24 Videos include characters and speech that are visually and acoustically salient and well tolerated by children with ASD.
We design the videos to be interesting and attractive, but also non-aversive to young children with autism, in a number of ways: When animate characters are needed, we use animals rather than humans to make the scenes less socially/emotionally challenging for children with ASD. We use dynamic scenes with brightly colored objects to capture and hold attention. A red blinking light during the intertrial intervals (ITIs) holds the children's attention when the video screens are blank. We produce the voice audio using speech that is exaggerated in both intonation and duration to capture and hold the children's attention.
Figure 1. (Movie) One block of the Noun Bias video, which tests whether children map a novel word onto an object vs. an action.
Video stimuli are created via commercial nonlinear editing programs such as FinalCut Pro or Avid Media Composer, using 4-8 sec movie clips.
The first 2 clips are arranged sequentially, alternating left and right, paired with the same familiarization or teaching audio.
The baseline trials come next; these present the test stimuli side-by-side, but are paired with nondirecting audios. The baseline trials tell us whether the children have any inherent or visual preference for either stimulus when there is no linguistic or directing audio.
The test trials appear last, paired with the test audios that distinguish the two visuals.
A 1 kHz tone is placed on the second audio channel, coincident with each trial, giving coders a visual indication of the onset and offset of each trial.
Figure 2. The portable IPL components and the spatial arrangement of their setup in the home.
The coding program outputs columns of numbers indicating the timing and type of each code entered. For example, TSC 33499 means that at the onset of the trial, 3.3499 sec from the start of the video, the child looked to the center.
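Decoding one such entry can be sketched as follows. We assume, based only on the example above, that the trailing letter of the code encodes the gaze direction (C = center, with L, R, and A as plausible companions) and that the number is a timestamp in units of 0.1 ms, so 33499 corresponds to 3.3499 sec; the real program's coding scheme may differ.

```python
# Hedged sketch of decoding one coder-output entry, e.g. ("TSC", 33499).
# Assumptions: last letter = gaze direction; integer = time in 0.1 ms ticks.
GAZE_CODES = {"C": "center", "L": "left", "R": "right", "A": "away"}

def decode_entry(code, ticks):
    """Return (seconds_from_video_start, gaze_direction) for one entry."""
    direction = GAZE_CODES[code[-1]]  # trailing letter names where the child looked
    seconds = ticks / 10000.0         # 33499 ticks -> 3.3499 sec
    return seconds, direction
```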
The coding arrays are analyzed by a custom MATLAB analysis program, which accesses each video's specific layout (i.e., which side is the match on each trial).
The analysis programs then calculate the child's duration and direction of looking during each coded trial, the child's latency to first look at the matching stimulus, and the number of times the child switches attention during each coded trial.
We then compare children's looking during the baseline vs. test trials on (a) duration of gaze to the matching image, (b) latency of first look to the matching image, (c) timecourse of looking to both images during the entire trial, and (d) number of switches of attention.
The rationale is that children who understand the linguistic stimulus will look longer at the matching image during the test trial than they had during the baseline trial, that they will look more quickly (once the trial begins) at the matching image than the nonmatching image, and that they will switch attention less during the test trials than during the baseline trials, because the linguistic stimulus is guiding their looking.
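The three trial-level measures described above (duration of looking to the match, latency of the first look to the match, and number of attention switches) can be sketched from a coded gaze record. This is a hypothetical helper in Python, not the lab's MATLAB program; it assumes the trial's gaze events arrive as time-sorted (onset_sec, direction) pairs and that latency is measured from trial onset.

```python
# Sketch: per-trial measures from a coded gaze record.
# events: time-sorted list of (onset_sec, direction) within one trial,
# where direction is "left", "right", "center", or "away".
def trial_measures(events, trial_end, match_side):
    duration_to_match = 0.0
    latency_to_match = None   # stays None if the child never looks at the match
    switches = 0
    prev_dir = None
    for i, (onset, direction) in enumerate(events):
        # each look lasts until the next coded event, or the end of the trial
        offset = events[i + 1][0] if i + 1 < len(events) else trial_end
        if direction == match_side:
            duration_to_match += offset - onset
            if latency_to_match is None:
                latency_to_match = onset
        # count a switch whenever gaze moves directly between the two images
        if (prev_dir in ("left", "right") and direction in ("left", "right")
                and direction != prev_dir):
            switches += 1
        prev_dir = direction
    return duration_to_match, latency_to_match, switches
```

Comparing these measures between baseline and test trials, as described above, is what separates an inherent visual preference from comprehension of the linguistic audio.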
The only difference between baseline and test trials is the linguistic audio: the visual stimuli are the same, but the baseline trials have a nondirecting audio (e.g., "Look here!") while the test trials have a directing audio (e.g., "Where's toopen?"). Therefore, comparing test trials to baseline trials on the duration and attention-switching measures tells us how much the directing audio is guiding the children's looking patterns. Also, if the children understand the directing audio, they should look more quickly at the matching image than at the nonmatching image.
An example of looking patterns from children tested using P-IPL is presented in Figure 3. Viewing the Noun Bias video, TD children 21 months of age looked longer at the matching image during test than during baseline trials, indicating they had mapped the novel words onto objects rather than actions.21 Children with ASD averaging 33 months of age behaved similarly, with similar effect sizes.21,24 Therefore, children with ASD show the same word-learning bias as TD children. Viewing the SVO word order video, children in both groups also looked longer at the matching images during test than baseline trials (i.e., significant comprehension), demonstrating understanding of English word order (e.g., that "The girl pushes the boy" describes a different relationship from "The boy pushes the girl").3,14,21 Developmental differences are seen with the wh-question video: TD children showed significant comprehension at 28 months of age, whereas significant comprehension was first found in children with ASD at 54 months of age.10 Therefore, children with ASD are delayed in learning how to correctly interpret questions such as "What did the apple hit?"
Figure 3. Children's percent of looking time to the match (i.e., object rather than action) while viewing the Noun Bias video.
Figure 4 presents the more detailed timecourse analyses of TD children viewing the ShapeBias video at four successive visits, spanning 20-32 months. Blue lines indicate looking to the match (i.e., same shape object); notice that the height of the blue lines during the test trials (right side of each graph) increases with age, showing increasing shape preferences. Moreover, as the children get older, the blue lines rise to the match earlier in the trial, indicating that they are understanding the directing audio more quickly with age. Red lines indicate looking away; notice that the breadth of the red lines diminishes with age as children spend less time looking away. Pink lines indicate looking to the Center during the ITI; we use these to make sure that children aren't biased before the trial starts. These IPL data confirm those from other methods that the Shape Bias is an increasingly strong word learning principle for TD children. However, three replications have not yet succeeded in demonstrating a significant word-based shape bias in the children with ASD.17,24
Figure 4. TD children's timecourse of looking to the match, nonmatch, center, and away while viewing the Shape Bias video.
Figure 5 presents the timecourse data for the same children with ASD, at an average age of 41 months, while viewing the Syntactic Bootstrapping video, in which they are asked to learn novel verbs from the surrounding sentence frames (e.g., to determine that gorping in "The duck is gorping the bunny" refers to a causative action rather than a noncausative action). Syntactic bootstrapping is another word-learning principle replicated in many studies of TD children. Notice that the children with ASD look longer at the nonmatching image (green lines) during the baseline trials, but their looking is directed more at the matching images (blue lines) during the second half of the test trials, when they are asked to find the referent of the novel verb (e.g., "Find gorping!"). As a group, these children with ASD were thus able to correctly map the verb onto the causative action. Cross-visit regressions reveal that the children with ASD who looked longer at the match during the Syntactic Bootstrapping video had shown shorter latencies to the match when they viewed the Word Order video eight months earlier.14
Figure 5. Children with ASD's timecourse of looking to the match, nonmatch, center, and away while viewing the Syntactic Bootstrapping video.
Specific attention is needed to ensure that coders remain 'blind' to the specific stimuli the child is experiencing on a given trial, so that they do not know which side is the match and cannot inadvertently bias their coding. We ensure this by including the 1 kHz tone on the second audio channel of the video, which is then copied onto the film of the child's eye movements. The waveform of this tone provides a visual indication of the onset and offset of each trial, delimiting the trials without the coder knowing their video or audio content.
It is very important to ensure that the analysis layout takes account of the left-right switch that occurs when children are coded (i.e., if the matching scene is on the left side while the child is watching, it will appear to be on the right side to the eye-movement coder, and vice versa).
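Because the camera films the child face-on, the left-right flip described above can be handled with a trivial transformation applied to the coder's side labels before scoring. A minimal sketch, using a hypothetical helper name:

```python
# Sketch: convert the coder's view of gaze side into the child's view.
# The coder faces the child, so the child's left appears on the coder's right.
def flip_side(coded_side):
    """Mirror left/right; center and away are unaffected by the flip."""
    return {"left": "right", "right": "left"}.get(coded_side, coded_side)
```

Applying this once, in a single place in the analysis pipeline, avoids the easy-to-make error of flipping some measures but not others.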
Intercoder reliability assessments yield correlations of about 0.98 for pairs of experienced coders on random selections of 10% of the data.3 With less experienced coders, or when there is high turnover in the coders used across a study's participants, we recommend having every child's eye movements coded by at least two people and requiring that their codes (usually duration to match) be within 0.3 sec of each other for each trial. If they are not, a third, fourth, or fifth person should code the child until reliability is achieved.17
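The per-trial agreement criterion above can be expressed as a short check over two coders' duration-to-match values. This is a hypothetical helper illustrating the 0.3 sec rule, not the lab's actual reliability script:

```python
# Sketch: the per-trial intercoder agreement criterion described above.
# durations_a, durations_b: duration-to-match (sec) per trial, one list per coder.
def coders_agree(durations_a, durations_b, tolerance=0.3):
    """True only if every trial's durations differ by at most `tolerance` sec."""
    if len(durations_a) != len(durations_b):
        raise ValueError("coders must have coded the same trials")
    return all(abs(a - b) <= tolerance for a, b in zip(durations_a, durations_b))
```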
With participants of toddler age and/or children with developmental delays, some will inevitably fail to look at either scene for part of a trial, and on some trials, for the entire trial. The following conventions are generally applied to these lapses of attention: (a) Children must look at at least one scene for a minimum of 0.3 sec for a trial to be counted; otherwise, it is a missing trial. (b) For a given video, children must provide data on more than half of the test trials to be included in the final dataset. (c) Missing trials are replaced with the mean for that item across children in that age group/condition.
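Conventions (a)-(c) above can be sketched as three small steps. The helper names and the use of None to mark missing trials are illustrative assumptions, not the study's actual pipeline:

```python
# Sketch of the attention-lapse conventions (a)-(c).
# Each value is total looking time (sec) on one test trial; None = missing.
MIN_LOOK_SEC = 0.3

def clean_trials(raw_durations):
    """(a) Trials with under 0.3 sec of looking count as missing."""
    return [d if d is not None and d >= MIN_LOOK_SEC else None
            for d in raw_durations]

def include_child(trials):
    """(b) Keep a child only if more than half the test trials have data."""
    usable = sum(1 for d in trials if d is not None)
    return usable > len(trials) / 2

def impute(trials, group_means):
    """(c) Replace each missing trial with the group mean for that item."""
    return [mean if d is None else d for d, mean in zip(trials, group_means)]
```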
IPL is a method that taps children's earliest mappings of linguistic forms (i.e., words and sentences) onto referential (i.e., objects and actions) or propositional (i.e., relations and events) meanings. It can be used to assess the processes by which children of different ages learn new words, as well as the ages at which they understand different types of grammatical forms. By requiring only eye movements as overt indicators of understanding (or not), IPL can be used with toddlers whose behavioral compliance is generally low, as well as with some special populations such as children with ASD. Moreover, because this technology can be brought to children's homes, it may also be used with populations that are hard to reach in conventional laboratories, such as children from low socioeconomic groups, children in rural communities, and bilingual children of monolingual non-native speaking parents. Newer ways of analyzing these eye movements are revealing interesting and important effects of learning different types of languages, and of learning words at different ages. Future plans include using IPL concurrently with neuroimaging methodologies, so that children's brain activity during language comprehension can be compared with their behavioral indicators.
No conflicts of interest declared.
This research was funded by the National Institutes of Health-Deafness and Communication Disorders (R01 2DC007428).