J Child Psychol Psychiatry. Author manuscript; available in PMC 2013 February 1.
Published in final edited form as:
PMCID: PMC3235227

Combining Information from Multiple Sources for the Diagnosis of Autism Spectrum Disorders for Toddlers and Young Preschoolers from 12 to 47 Months of Age

So Hyun Kim, M.A. and Catherine Lord, Ph.D.



The purpose of this study was to systematically examine the combined use of the Autism Diagnostic Interview-Revised (ADI-R) and Autism Diagnostic Observation Schedule (ADOS) for children under age 4 using newly developed and revised diagnostic algorithms.


Single and combined use of the ADI-R and ADOS algorithms were compared to clinical best estimate diagnoses for 435 children with Autism Spectrum Disorders (ASD), 113 children with non-spectrum disorders, and 47 children with typical development from 12 to 47 months of age. Sequential strategies to reach a diagnostic decision by prioritizing administrations of instruments were also evaluated.


Well-balanced sensitivities and specificities above 80% were obtained for ASD diagnoses using both instruments. Specificities significantly improved when both instruments were used compared to one. Scores that can be used to systematically prioritize administrations of instruments were identified.


The ADI-R and ADOS make independent, additive contributions to more accurate diagnostic decisions for clinicians evaluating toddlers and young preschoolers with ASD. Sequential assessment strategies using the scores identified may be appropriate for some children.

Keywords: Autism Spectrum Disorders (ASD), Autism Diagnostic Interview-Revised (ADI-R), Autism Diagnostic Observation Schedule (ADOS), Early Diagnosis

In the past few years, research has flourished concerning detection of ASD symptoms in the first 3 years of life due to the belief that earlier provision of services and treatments is associated with better outcomes. With the increasing need for early detection of ASD, new diagnostic algorithms for toddlers and young preschoolers from 12 to 47 months of age have recently been developed for the Autism Diagnostic Interview-Revised (ADI-R; Rutter, Le Couteur, & Lord, 2003), a standardized, semi-structured, investigator-based interview for caregivers (Kim & Lord, in press). In addition, algorithms for the Autism Diagnostic Observation Schedule (ADOS; Lord, Rutter, DiLavore, & Risi, 2001) have been revised to improve the diagnostic validity of the instrument, a standardized, semi-structured, clinician-administered observation of communication, social interaction, and play (Gotham et al., 2007), including two revised algorithms for children who use 5 or more single words or even less language and for children under age 5 using phrase speech. New algorithms have also been developed for the ADOS-Toddler module for children under 30 months of age (Luyster et al., 2009). Past studies showed enhanced diagnostic validity when information from both the ADI-R and ADOS is used together. However, there has not yet been a systematic attempt to examine the combined use of these instruments using the newly developed and revised algorithms. Thus, the present study focuses on the validity of the combined use of the ADI-R and ADOS using the new and revised algorithms with children under age 4.

With a best estimate clinical diagnosis treated as the gold standard, previous work primarily with older children and adolescents found that using data from multiple sources (i.e. clinicians, caregivers, and teachers) enhances accuracy for the diagnosis of ASD. For example, the Social Responsiveness Scale resulted in high diagnostic specificity for children and adolescents with ASD when information from both parent and teacher reports was combined (Constantino et al., 2007). Bishop and Baird (2001) reported improved validity of the Children’s Communication Checklist when information from both parents and professionals was used for 151 children with pervasive developmental disorders (PDD) or other developmental disorders between 5 and 17 years of age. Corsello et al. (2007) reported enhanced diagnostic validity by combining information across instruments, either the Social Communication Questionnaire (SCQ; Rutter, Bailey, & Lord, 2003) or the ADI-R with the ADOS, for the diagnosis of children with ASD between 2 and 16 years of age.

Risi et al. (2006) found a better balance of sensitivity and specificity when the ADI-R and ADOS were used in combination compared to when each instrument was used alone. For example, the combined use of these instruments resulted in sensitivity and specificity of 82% and 86% for children with autism compared to children with non-spectrum disorders over age 3 years. For younger children, sensitivity and specificity for the same diagnostic comparison using both instruments were 81% and 87%, respectively. In contrast, when each instrument was used alone, specificities ranged from 59% to 72%, with sensitivities remaining above 80%. Most of these results were obtained from older children, although a study by Le Couteur et al. (2008) also found that combining information from both instruments using preexisting algorithms provided improved diagnostic accuracy for preschoolers with ASD compared to either instrument in isolation. The present study is the first to examine the validity of the combined use of the ADI-R and ADOS, using newly developed and revised algorithms, for toddlers as young as 12 months of age.

In very young children, diagnostic differentiation between non-autism ASD (e.g. PDD-NOS) and autism is less stable than for older children and adolescents (Lord et al., 1999; Szatmari et al., 2002; Wiggins, Robins, Adamson, Bakeman, & Henrich, in press). Consequently, as the newest ADI-R and ADOS-T algorithms for toddlers and young preschoolers have been developed (Luyster et al., 2009; Kim & Lord, in press), a shift has occurred from having separate autism and other ASD cutoffs to using only a single classification of ASD. In addition, in order to formally acknowledge the less clear stability of diagnoses in younger children, these algorithms provide ranges of concern (little-to-no, mild-to-moderate, or moderate-to-severe concern), to be used in clinical monitoring and follow-up. However, groupings are necessary for several different purposes. Thus, the new ADI-R algorithms also provide two cutoffs, one for research (more restrictive; higher specificity with lower sensitivity) and one for clinical purposes (more inclusive; higher sensitivity with lower specificity).

Past studies examining validity of the ADI-R and ADOS have found that parent reports and clinician observations do not always agree. Agreement between these instruments has varied across samples and analytic techniques. In a sample of 797 ASD and 163 non-spectrum cases over 36 months of age, Risi et al. (2006) found that the Pearson r correlation between ADI-R and ADOS algorithm totals was 0.57. Correlations differed by domains in the study by Le Couteur et al. (2008), ranging from 0.51 to 0.71 for a sample of 77 preschoolers with ASD and 24 with other developmental disorders. Agreement between the instruments using Kappa ranged from 0.48 to 0.62. In another study (de Bildt et al., 2004), correlations ranged from 0.52 to 0.54 between the ADI-R and ADOS algorithm totals for 123 children aged 5 to 20 years with ASD and intellectual disability and 62 with intellectual disability only. In contrast, Ventola et al. (2006) compared the application of full ADI-R and ADOS diagnostic criteria to each other and clinical diagnosis in a sample of 36 ASD and 9 non-spectrum cases aged 16 to 31 months. Significant levels of agreement were found between the ADOS and clinical judgment (κ=0.59, p<.001) but agreement between the ADI-R and clinical judgment (κ=0.15, ns) and between the ADI-R and the ADOS (κ=0.07, ns) was poor.

Because the combined use of the ADI-R and ADOS has shown better diagnostic validity than either individual instrument, it is recommended that clinicians and researchers use information from both instruments when making diagnoses. However, due to constraints in time, cost, or expertise, often only one of the instruments is actually used. Relatively little is known about ways to maximize validity in this case. One approach would be to determine scores on the instruments associated with a very high (or low) probability of receiving the classification of ASD on the “alternative instrument” (referred to as “positive (or negative) screening estimate” hereafter). For instance, if a child’s score reaches a positive screening estimate on the ADI-R, a clinician could presumably omit the ADOS assuming that the probability of the child receiving an ASD classification on the ADOS would be very high. The same strategy could be used with a negative screening estimate.

Another approach is to conduct similar analyses using best estimate clinical (BEC) diagnoses based on all available information as the gold standard and then to determine if there are scores on each instrument that result in 100% specificity for ASD. That is, we can examine what score on each instrument successfully excludes all cases determined to not have ASD (henceforth referred to as “high specificity case scores”) and then describe the sensitivities of these scores. For example, if a child meets or exceeds a high specificity case score on the ADOS, a clinician evaluating the child could assume that the chance of the child receiving a BEC diagnosis of ASD would be very high and choose to omit the ADI-R.

In sum, the purpose of this study is to examine the combined use of the ADI-R and ADOS for children under age 4 using the new and revised algorithms. Often, a misdiagnosis that results in a child failing to receive necessary services is the greatest concern. On the other hand, over-diagnosis has negative consequences for individual children, public health strategies and research. Consequently, we present data supporting alternative methods for using both research and clinical cutoffs from the new ADI-R algorithms. Agreement between the two instruments is also evaluated by examining the overlap between the ADI-R and ADOS-T ranges of concern and correlations between algorithm totals. A final goal was to evaluate sequential assessment strategies using positive/negative screening estimates and high specificity case scores that could allow the empirically-validated use of a single instrument when information from a second instrument would very likely be redundant.



All 604 children with complete data from a contemporaneous ADOS, ADI-R, nonverbal IQ, and BEC diagnosis were included from two projects, Early Diagnosis of Autism (EDX) and First Words and Toddlers (FW/T), and from clinic patients at the University of Michigan Autism and Communication Disorders Center (UMACC).

Children in the FW/T projects entered the study between 12 and 18 months of age and were administered the ADI-R and ADOS-T. The remaining children were administered the ADI-R and either the Pre-Linguistic ADOS (PL-ADOS; DiLavore, Rutter & Lord, 1995) or ADOS Module 1 to 3, depending on their age and language level. Of the 604 children, 195 who were nonverbal or had single words only received the PL-ADOS, which was re-coded to the ADOS Module 1.

All participants, aged 12 to 47 months, were walking at the time of assessment. Mean age was 31.8 months (SD=9.6), and 435 children had ASD (345 males), 113 children non-spectrum disorders (NS; 81 males), and 47 children typical development (TD; all younger than 21 months; 31 males). NS participants had a range of diagnoses, including language disorders (53%), intellectual disability of unknown etiology (18%), Down syndrome (6.4%), externalizing disorders (5.5%), internalizing disorders (2.7%), and general, mild developmental delays (14.4%). Ethnicity was not associated with diagnosis; 74% of participants were Caucasian, 15% African American, 3% Asian American, 3% biracial, and 5% Native American or other races. The sample in the present study was a subset of children (about 30%) from the sample used to develop the new ADI-R algorithms for toddlers and young preschoolers (Kim & Lord, in press). In addition, at least approximately 30% and 15% of the sample were also used for the development of the revised ADOS algorithms and the new ADOS-T algorithms, respectively (Gotham et al., 2007; Luyster et al., 2009).

Participants were divided into three developmental cells by the child’s age and language level following the structure of the developmental groupings of the new ADI-R algorithms: (1) all children between 12 and 20 months, 31 days of age and nonverbal children between 21 and 47 months, 31 days of age (“12–20/NV21–47”); (2) children between 21 and 47 months, 31 days of age with single words (“SW21–47”); and (3) children between 21 and 47 months, 31 days of age with phrase speech (“PH21–47”).
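The grouping rule above can be written as a small function. This is only an illustrative sketch: the function name, the language-level labels ("nonverbal", "single_words", "phrase_speech"), and the use of age in completed months are assumptions made for the example, not part of the published algorithms.

```python
# Cell labels as used in the text.
CELL_YOUNG_OR_NV = "12–20/NV21–47"
CELL_SINGLE_WORDS = "SW21–47"
CELL_PHRASE_SPEECH = "PH21–47"

# Hypothetical labels for language level, chosen for this sketch.
LANGUAGE_LEVELS = {"nonverbal", "single_words", "phrase_speech"}


def developmental_cell(age_months: int, language: str) -> str:
    """Assign a child to one of the three developmental cells.

    age_months is the age in completed months (an assumption of this sketch).
    """
    if not 12 <= age_months <= 47:
        raise ValueError("age is outside the 12 to 47 month range studied")
    if language not in LANGUAGE_LEVELS:
        raise ValueError(f"unknown language level: {language!r}")
    # All children aged 12 to 20 months share one cell regardless of language;
    # older nonverbal children join the same cell.
    if age_months <= 20 or language == "nonverbal":
        return CELL_YOUNG_OR_NV
    if language == "single_words":
        return CELL_SINGLE_WORDS
    return CELL_PHRASE_SPEECH
```

For example, an 18-month-old with phrase speech and a 30-month-old nonverbal child would both fall in the first cell under this rule.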

As shown in Table 1, children with TD and NS were significantly younger and had significantly higher NVIQ and Vineland Adaptive Behavior Composite scores (Sparrow, Balla, & Cicchetti, 1984) than children with ASD for the “12–20/NV21–47” group (p<.001). For both “SW21–47” and “PH21–47” groups, Vineland composite scores were significantly higher for children with NS than ASD (p<.001). A significant age difference emerged for the “SW21–47” group (children with ASD were older than children with NS, p<.05). For the 12–20/NV21–47 group, 156 children received the PL-ADOS, 122 children module 1, and 60 children the ADOS-T. For the SW21–47 group, 105 children received module 1, 39 children the PL-ADOS, and 7 children the ADOS-T. For the PH21–47 group, 106 children received module 2 and 4 children module 3. All of these children received appropriate modules based on their level of language.

Table 1
Description of sample


In the new ADI-R algorithms for toddlers and young preschoolers, item scores in Social Affect (SA) and Restricted and Repetitive Behaviors (RRBs) for the “12–20/NV21–47” and “SW21–47” groups and Social Communication (SC), RRBs, and Reciprocal and Peer Interaction (RPI) for the “PH21–47” group are combined to generate cutoffs for the classification of ASD. Thirteen to 20 items comprise the new ADI-R algorithms depending on children’s ages and language levels. For the revised ADOS and new ADOS-T algorithms, the total number of items in the algorithms is 14, with the composition of items in each algorithm differing by children’s ages and language levels.


Each caregiver was administered the ADI-R and the Vineland. The ADOS and cognitive testing were then completed by the same or by a different clinical psychologist or a trainee within a few days’ time. A standard hierarchy of cognitive measures, most frequently the Mullen Scales of Early Learning (n=438; Mullen, 1995) or the Differential Ability Scales (n=61; Elliott, 1990) was used to determine IQ scores. Examiners in the study had completed research training and met standard requirements for research reliability for the ADI-R and ADOS. Inter-rater reliability was monitored through periodic observations and scoring by two examiners and scoring of videotapes. Caregivers signed an Institutional Review Board approved informed consent to participate in research before participation.

Consensus Best Estimate Clinical Diagnosis

For children in the EDX study, an experienced clinical researcher used the videotaped ADOS and ADI-R scores and observations made during the testing to generate an independent BEC diagnosis of autism, PDD-NOS, or non-spectrum disorders (APA, 1994). For children in the FW/T project, scores on the ADI-R, ADOS, and clinical observations were used by two clinicians to make a BEC diagnosis operationalizing DSM-IV criteria (APA, 1994; See Luyster et al., 2009). For clinic cases, a diagnosis was made by a psychologist and/or psychiatrist after review of all information.


Sensitivities and specificities for single and combined use of the ADI-R and ADOS algorithms were compared with BEC diagnoses. Sensitivities and specificities (Siegel, Vukicevic, Elliott, & Kraemer, 1989) were considered in each of these conditions: 1) Meeting ADI-R criteria; 2) Meeting ADOS criteria; 3) Meeting either ADI-R or ADOS criteria when both were administered; 4) Meeting criteria on both the ADI-R and ADOS. For the sensitivities and specificities, 95% confidence intervals were also calculated using the Wilson score method (Newcombe, 1998). Characteristics of children correctly or incorrectly classified were examined. Correlations were used to assess the agreement between the ADI-R and ADOS algorithm totals as well as between domain totals for three different developmental cells (“12–20/NV21–47,” “SW21–47,” “PH21–47”). Correlation coefficients were compared using Fisher’s Z transformations (Steiger, 1980). Seventy children who received both new ADI-R and ADOS-T algorithms were selected to examine the overlap between the ranges of concern from both instruments. Odds ratios were calculated to assess the likelihood of receiving a diagnosis of ASD when a child was classified by the ADI-R and/or ADOS in these ranges.
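As a rough illustration of the four conditions and the Wilson score interval, the sketch below computes sensitivity and specificity for each condition on a tiny made-up data set. The eight cases, the function names, and the choice of the Wilson interval without continuity correction are assumptions for the example; they are not the study data or necessarily the exact variant the authors used.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (no continuity correction)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def sensitivity_specificity(test_positive, has_asd):
    """Return (sensitivity, specificity) against the reference diagnosis."""
    pairs = list(zip(test_positive, has_asd))
    tp = sum(t and d for t, d in pairs)
    tn = sum((not t) and (not d) for t, d in pairs)
    n_asd = sum(has_asd)
    n_non = len(has_asd) - n_asd
    return tp / n_asd, tn / n_non


# Hypothetical toy data: four ASD and four non-spectrum cases.
has_asd = [True, True, True, True, False, False, False, False]
adi_r = [True, True, True, False, True, False, False, False]
ados = [True, True, False, True, False, True, False, False]

conditions = {
    "ADI-R only": adi_r,
    "ADOS only": ados,
    "either": [a or b for a, b in zip(adi_r, ados)],
    "both": [a and b for a, b in zip(adi_r, ados)],
}
results = {name: sensitivity_specificity(flags, has_asd)
           for name, flags in conditions.items()}

for name, (sens, spec) in results.items():
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example the "either" rule maximizes sensitivity at the cost of specificity, and the "both" rule does the reverse, which is the trade-off the four conditions are designed to expose.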

Positive/negative screening estimates were identified for each instrument by selecting scores associated with very high/low percentages of cases that received a classification of ASD on the other instrument. Sensitivities and specificities for these scores were then evaluated. In addition, high specificity case scores were selected for each instrument by examining total scores that resulted in high specificities (100%, 90%, and 80%) of the BEC diagnoses for the comparison of ASD vs. NS cases. Sensitivities for these scores were also examined.
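The selection of a high specificity case score can be sketched as a search for the lowest cutoff that reaches a target specificity among NS cases, followed by the sensitivity at that cutoff. The function name, the integer-score assumption, the rule that a score at or above the cutoff counts as positive, and the example scores are all hypothetical.

```python
def high_specificity_cutoff(ns_scores, asd_scores, target_specificity=1.0):
    """Find the lowest integer cutoff whose specificity meets the target.

    A case is treated as test-positive when its score is >= cutoff
    (an assumption of this sketch). Returns (cutoff, sensitivity, specificity).
    """
    if not 0.0 <= target_specificity <= 1.0:
        raise ValueError("target_specificity must be between 0 and 1")
    highest = max(max(ns_scores), max(asd_scores))
    # Scanning one point past the highest score guarantees 100% specificity
    # is reachable, so the loop always returns.
    for cutoff in range(0, highest + 2):
        spec = sum(s < cutoff for s in ns_scores) / len(ns_scores)
        if spec >= target_specificity:
            sens = sum(s >= cutoff for s in asd_scores) / len(asd_scores)
            return cutoff, sens, spec


# Hypothetical algorithm totals, not study data.
ns_totals = [2, 5, 9, 11]
asd_totals = [8, 10, 15, 20]

print(high_specificity_cutoff(ns_totals, asd_totals, 1.0))
print(high_specificity_cutoff(ns_totals, asd_totals, 0.75))
```

Relaxing the target specificity lowers the cutoff and raises sensitivity, which is why scores at several specificity levels are worth tabulating side by side.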


Sensitivities and specificities for the comparison of ASD vs. NS

Not surprisingly, as shown in Table 2, the most satisfactory results were obtained when the most stringent condition, requiring a child to meet criteria on both the ADI-R (using clinical cutoffs) and the ADOS, was used. In these cases, sensitivity and specificity for ASD vs. NS were consistently above 80%. For example, using both instruments yielded comparable sensitivities and significant improvements in specificities (4–22%) beyond when only ADI-R algorithms were used. Compared to when ADOS algorithms were used alone, using both instruments resulted in significant gains in specificities (10–31%) and slightly lower sensitivities, though they were still above 80% when ADI-R clinical cutoffs were used. As noted in previous papers (Gotham et al., 2007; Risi et al., 2006), when children with nonverbal mental ages below 15 months were included, specificities were slightly lower. Because evaluating children with low nonverbal mental age is a reality in clinical practice, these specificities are reported in parentheses (See Table 2). However, we also present separate results from data without children whose nonverbal mental ages fell below 15 months for researchers who wish to restrict their samples for better diagnostic accuracy.

Table 2
Validity of all conditions tested

As expected, the least restrictive condition, requiring a child to meet either the ADI-R or ADOS criteria, resulted in excellent sensitivities for ASD cases (97–99%), but poor specificities (45–85%). As in past studies, for all developmental cells, sensitivities improved when children whose BEC diagnoses were PDD-NOS were excluded. Although comparisons between the ASD and TD cases are not very informative clinically, much research with younger children contrasts ASD with mixed TD and NS samples (as in studies with baby siblings); it is therefore useful to know that, not surprisingly, specificities also improved when TD cases were included. Likelihood ratios for the comparison of ASD vs. NS were most satisfactory when both instruments were used in combination using the conventional criteria (a likelihood ratio above 5 is considered satisfactory; Jaeschke, Guyatt, & Lijmer, 2002).
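For readers less familiar with likelihood ratios, the standard definitions are LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short sketch below applies them to the combined-instrument figures quoted in the introduction from Risi et al. (2006) (82% sensitivity, 86% specificity); the function names are made up for the example, and the values in Table 2 of this study are not reproduced here.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ : how much a positive result raises the odds of the diagnosis."""
    if specificity >= 1.0:
        return float("inf")  # no false positives observed
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """LR- : how much a negative result lowers the odds of the diagnosis."""
    if specificity <= 0.0:
        raise ValueError("specificity must be positive")
    return (1.0 - sensitivity) / specificity


# Sensitivity and specificity quoted earlier from Risi et al. (2006).
lr_pos = positive_likelihood_ratio(0.82, 0.86)
lr_neg = negative_likelihood_ratio(0.82, 0.86)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

With these inputs LR+ is a little under 6, which clears the threshold of 5 mentioned above.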

Characteristics of misclassified children

We then compared the characteristics of true positives (TPs) and false negatives (FNs) for each instrument as well as false positives (FPs) and true negatives (TNs). The most common trend was that FPs (NS cases misclassified as ASD) were significantly older and had significantly lower NVIQ and Vineland scores than TNs (correctly classified NS cases). On the other hand, FNs (ASD cases misclassified as NS) were younger and showed higher NVIQ and Vineland scores than TPs (correctly classified ASD cases). See eTable 1 in Electronic Appendix.

Overlap between the ADI-R and ADOS-T ranges of concern

Most children (71%) whose scores were in the little-to-no range of concern in the ADI-R fell in the same range in the ADOS-T. Similarly, 64% of children whose scores fell in the moderate-to-severe range in the ADI-R fell in the same range in the ADOS-T. If a child was classified as at risk (mild-to-moderate and moderate-to-severe ranges) by only one instrument (23%), the odds ratio for the child to be placed in a risk group by the other instrument was 12.69 (χ2=19.2, p<.001, See Figure 1). When children were placed in a risk group by both instruments (50%), the odds ratio of having a BEC ASD diagnosis was 56.19 (χ2=19.2, p<.001).

Figure 1
Overlap between the ADI-R and ADOS Ranges of Concern. ADI-R Autism Diagnostic Interview-Revised, ADOS Autism Diagnostic Observation Schedule

Agreement across the instruments

The correlation between the ADI-R and ADOS algorithm totals for the “12–20/NV21–47” group (r=0.75) was significantly greater than those of the “SW21–47” and “PH21–47” groups (r=0.47, Z=4.7; r=0.59, Z=2.7, both p<.01). The correlation between the ADI-R and ADOS SA domains for the “12–20/NV21–47” group (r=0.69) was significantly greater than that of the “SW21–47” group (r=0.49, Z=3.1, p<.01). The correlation between the ADI-R and ADOS RRB domains for the “12–20/NV21–47” group (r=0.62) was significantly greater than those of the “SW21–47” and “PH21–47” groups (r=0.44, Z=2.5, p<.05; r=0.29, Z=3.9, p<.01, respectively).
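A comparison of this kind can be sketched with the usual test for two independent correlations based on Fisher's r-to-z transformation. Two assumptions are made here and should be read as such: that the independent-samples form of the test applies (the three cells contain different children), and that the group sizes are the sums of the module counts given in the sample description (338, 151, and 110). Neither is stated explicitly in the text.

```python
from math import atanh, sqrt


def compare_independent_correlations(r1, n1, r2, n2):
    """Z statistic for the difference between two independent correlations."""
    if min(n1, n2) <= 3:
        raise ValueError("each group needs more than 3 cases")
    z1, z2 = atanh(r1), atanh(r2)  # Fisher r-to-z transformation
    standard_error = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error


# Group sizes assumed from the module counts in the sample description.
N_YOUNG_OR_NV, N_SINGLE_WORDS, N_PHRASE = 338, 151, 110

z_vs_sw = compare_independent_correlations(0.75, N_YOUNG_OR_NV, 0.47, N_SINGLE_WORDS)
z_vs_ph = compare_independent_correlations(0.75, N_YOUNG_OR_NV, 0.59, N_PHRASE)
print(f"vs SW21–47: Z = {z_vs_sw:.2f}; vs PH21–47: Z = {z_vs_ph:.2f}")
```

Under these assumptions the statistics for the algorithm totals come out close to the reported values of 4.7 and 2.7.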

Positive and negative screening estimates

Total scores on the ADI-R and ADOS algorithms which resulted in very high probabilities (100%) of receiving an ASD classification on the other instrument (positive screening estimate) for all ASD cases, ranged from 18 to 25 and 18 to 22 respectively. Total algorithm scores resulting in very low probabilities (less than 5%) of receiving an ASD classification on the other instrument (negative screening estimate) ranged from 4 to 5 and 8 to 11 respectively. See eTable 2 in Electronic Appendix for sensitivities and specificities for these scores.

High specificity case scores

We then identified scores on the ADI-R and ADOS that resulted in high specificities for BEC diagnoses of ASD. The lowest scores on both instruments that resulted in 100% specificities were first identified for each developmental cell. For the ADOS, when scores were selected by 100% specificity, sensitivities ranged from 17 to 80% depending on developmental cells. Clinical cutoffs on the ADI-R are reported. As expected, when the ADI-R research cutoffs were used, a similar pattern emerged but with lower sensitivities and higher specificities. For high specificity case scores (100%) on the ADI-R, sensitivities ranged from 14 to 41%. Scores that resulted in specificities around 90% and 80% were also identified. An example of sequential assessment strategies using the PSE, NSE, and high specificity case scores is described in Figure 2 and the Discussion.

Figure 2
Sequential assessment strategies using Positive/Negative Screening Estimates (PSE/NSE) and High Specificity Case Scores. ADI-R Autism Diagnostic Interview-Revised, ADOS Autism Diagnostic Observation Schedule. *In general developmental disorders clinics, ...


Consistent with findings from older children (Risi et al., 2006), use of information from both the new ADI-R algorithms for toddlers and young preschoolers and the revised ADOS and new ADOS-T algorithms together better reflected clinical best estimate diagnoses of ASD than when either single instrument was used. The ADI-R includes a developmental history and a detailed description of an individual’s functioning in a variety of social contexts as well as caregivers’ perceptions of the level of impairment and/or frequency of different behaviors. The ADOS provides a summary of an experienced clinician’s standardized observations of an individual’s behaviors within contexts that elicit social initiations and responses as well as communication interchanges. As suggested by low to moderate correlations between the ADI-R and ADOS in this study and in previous research (de Bildt et al., 2004; Ventola et al., 2006), the instruments provide overlapping but not identical information. Though the lack of high agreement between the instruments is frustrating in terms of each instrument’s diagnostic validity, it increases their additive value. In fact, the combination of new and revised algorithms revealed even higher validity for toddlers and preschoolers than expected from studies using the original algorithms (Risi et al., 2006).

These newly developed and revised algorithms were created in a way that the influence of age and IQ scores on the algorithm scores was minimized. Nevertheless, we found differences in age, IQ, and adaptive functioning between children who were correctly identified and those misclassified by the instruments. For example, ASD cases misclassified as NS tended to be younger toddlers who had higher nonverbal intellectual and adaptive functioning. On the other hand, NS cases misclassified as ASD were older preschoolers with lower intellectual and adaptive functioning. These results are consistent with past studies showing that differentiating children with ASD from other developmental disorders is more difficult for very young children, children with severe delays (with lower IQ scores and/or who are nonverbal), and the most able toddlers and young preschoolers (with very high IQ scores and/or phrase speech; Gotham et al., 2007; Lord, Storoschuk, Rutter, & Pickles, 1993).

More able children, in this case, primarily older preschoolers, showed lower correlations between the ADI-R and ADOS than the younger and/or nonverbal group. In addition, mean ADI-R domain and algorithm total scores were lower for the “PH21–47” group than the “SW21–47” and “12–20/NV21–47” groups whereas mean ADOS scores were similar across all three groups. This may indicate that parents of preschoolers with more advanced levels of language, children who almost always also have stronger nonverbal skills, perceive their children’s symptoms as less severe than clinicians evaluating the same children based on direct observations. This supports the usefulness of integrating perspectives from both caregivers and experienced clinicians especially when evaluating more complex cases.

Different sequential strategies could be used to determine when use of a single instrument might be sufficient. Each strategy has a distinct process in terms of obtaining a diagnostic classification. For example, as in Figure 2, if a clinician first administered the ADOS and the child’s score on the ADOS was above the PSE (or the high specificity case score), unless other information suggested otherwise, the clinician could reasonably assume that the child would be likely to receive an ASD diagnosis without administering the ADI-R. Based on the clinic referrals from the dataset used in the present study, such an approach was appropriate for about 72% of the clinic referrals (with 52% very likely ASD and 20% likely not ASD). However, about 28% of the referrals obtained less decisive scores, showing that such an approach may not be appropriate for all children (See Table 3 and eTable 2 in Electronic Appendix). It is also important to note that UMACC is an autism clinic; in general developmental disorders clinics, autism cases would comprise a smaller proportion of likely diagnoses, so that the percent of cases with scores below or equal to NSE and possibly those in the less decisive range would increase. Studies of baby siblings of children with autism also suggest that the proportion of less decisive cases may be higher when children are not specifically referred for an autism assessment (Landa & Garrett-Mayer, 2006; Zwaigenbaum et al., 2005).
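The decision logic of such a sequential strategy can be sketched as a three-way triage on the first instrument's total score. Everything specific in this example is an assumption: the function and enum names, the treatment of a score equal to the PSE as positive, and the thresholds (8 and 18, picked from within the reported ADOS ranges purely for illustration). The actual scores differ by instrument and developmental cell and are given in the Electronic Appendix.

```python
from enum import Enum


class Decision(Enum):
    LIKELY_ASD = "second instrument may be omitted; ASD classification very likely"
    LIKELY_NOT_ASD = "second instrument may be omitted; ASD classification very unlikely"
    ADMINISTER_SECOND = "less decisive score; administer the second instrument"


def triage_first_instrument(score, nse, pse):
    """Triage a child on the first instrument's algorithm total.

    nse: negative screening estimate (scores at or below it are treated as negative)
    pse: positive screening estimate (scores at or above it are treated as positive;
         the inclusive boundary is an assumption of this sketch)
    """
    if nse >= pse:
        raise ValueError("nse must be lower than pse")
    if score >= pse:
        return Decision.LIKELY_ASD
    if score <= nse:
        return Decision.LIKELY_NOT_ASD
    return Decision.ADMINISTER_SECOND


# Hypothetical thresholds for illustration only.
EXAMPLE_NSE, EXAMPLE_PSE = 8, 18

for total in (5, 12, 20):
    print(total, triage_first_instrument(total, EXAMPLE_NSE, EXAMPLE_PSE).name)
```

As the paragraph above stresses, a result from this kind of rule would still be overridden whenever other clinical information pointed the other way.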

Table 3
High specificity (100%, 90%, and 80%) case scores and sensitivities

Although distributions of children by ranges of concern did not overlap perfectly between the ADI-R and ADOS-T, the majority of children classified as those needing follow-up evaluations and treatments by one instrument were also classified as at risk by the other instrument. In addition, the high likelihood ratio of receiving a BEC diagnosis of ASD for children classified into the risk groups by both instruments supports the validity of ASD risk categories even in very young children.


Compared to the samples in past studies, the sample used in the present study was smaller because we selected only children with a contemporaneous ADI-R and ADOS for each case. Thus, restricted size and possible recruitment biases of more complex cases in the control groups (children with NS and TD) may have resulted in lower specificities in the present study compared to the original studies (Gotham et al., 2007; Kim & Lord, in press; Luyster et al., 2009). In addition, because samples included a subset of children from previous studies mentioned above, replications from different sites will be critical.

The ADI-R and ADOS were administered by the same clinician for 75% of children; for 66% of these children, the ADI-R was administered before the ADOS. Thus, in about half of the cases, clinicians were not blind to developmental history and the caregiver’s descriptions, which might have affected their ADOS administration and coding. However, the correlation between the algorithm total scores for the two instruments was slightly higher when different clinicians versus the same clinician administered the instruments (r of 0.66 vs. 0.59).

Even though about 280 children (46% of the entire sample) were clinic referrals at UMACC, the results were not entirely based on clinic patients but also derived from research participants. In general developmental disorders clinics, characteristics of referrals might differ from the children in the present study. Thus, inferences about the generalization of results from the present study should be carefully considered.


The ADI-R and ADOS provide both unique and overlapping information important for clinicians and researchers making diagnostic decisions. When both instruments were used in combination, well-balanced sensitivities and specificities were obtained. Using the newly developed ADI-R algorithms and the revised ADOS and new ADOS-T algorithms, even with young children, validity for combined use of the instruments in the present study was comparable to or higher than that in past research (Risi et al., 2006). Taking into account information from both a skilled clinician and a caregiver contributes to diagnostic differentiation, especially for more complex cases. Alternative combinations with other instruments besides or in addition to the ADI-R and/or ADOS, such as the SCQ (Rutter et al., 2003), SRS (Constantino & Gruber, 2005), CCC-2 (Bishop, 2003), and the Screening Tool for Autism in Two-Year-Olds (STAT; Stone, Coonrod, & Ousley, 2000), may be equally effective. In addition, sequential assessment strategies may be appropriate for some children, allowing cost- and time-effective research and clinical practice.

Key Points

  • The ADI-R and ADOS make independent, additive contributions to more consistent and accurate diagnostic decisions for clinicians evaluating toddlers and young preschoolers with ASD.
  • Well-balanced sensitivities and specificities are obtained for ASD diagnoses using information from both parent interview and clinician observation.
  • Sequential assessment strategies using high specificity case scores and negative/positive screening estimates may allow clinicians to use a single measure for the diagnosis of ASD in some cases.
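The sequential strategy in the last point can be sketched as a simple decision rule: a score on the first instrument that is extreme enough settles the classification, and only intermediate scores trigger the second instrument. The Python sketch below is illustrative only; the function `sequential_decision`, its parameter names, and all threshold values are hypothetical placeholders, not the scores identified in this study.

```python
from typing import Optional


def sequential_decision(first_score: int,
                        rule_out_below: int,
                        rule_in_at_or_above: int,
                        second_meets_cutoff: Optional[bool] = None) -> str:
    """Decide from one instrument when its score is extreme; otherwise use a second.

    All thresholds are hypothetical placeholders, not values from the study.
    """
    if first_score >= rule_in_at_or_above:
        # High-specificity range: few children without ASD score this high.
        return "ASD (single measure)"
    if first_score < rule_out_below:
        # Negative-screening range: few children with ASD score this low.
        return "non-ASD (single measure)"
    if second_meets_cutoff is None:
        # Intermediate score and no second result yet.
        return "administer second instrument"
    return "ASD (two measures)" if second_meets_cutoff else "non-ASD (two measures)"


# Usage with made-up scores and thresholds.
print(sequential_decision(20, rule_out_below=5, rule_in_at_or_above=18))
print(sequential_decision(3, rule_out_below=5, rule_in_at_or_above=18))
print(sequential_decision(10, rule_out_below=5, rule_in_at_or_above=18))
print(sequential_decision(10, 5, 18, second_meets_cutoff=True))
```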

Supplementary Material

Supp App S1


We gratefully acknowledge the help of Susan Risi, Pamela Dixon Thomas, Suzi Naguib, Fiona Miller, Rhiannon Luyster, Whitney Guthrie, Kaite Gotham, Kathryn Larson, Kathy Hatfield, and Shanping Qiu, as well as the families that participated in this research. This study was funded by NIMH R01 MH066469, MH57167, and HD 35482-01.

One of the authors, Catherine Lord, receives royalties for the ADI-R and ADOS; profits related to this study were donated to charity.


Conflict of Interest Statement: C.L. receives royalties for the ADI-R; profits from this study were donated to charity.


  • American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 4th ed. Washington, DC: Author; 1994.
  • Bishop D. The Children’s Communication Checklist-2. London: Psychological Corporation; 2003.
  • Bishop D, Baird G. Parent and teacher report of pragmatic aspects of communication: use of the Children’s Communication Checklist in a clinical setting. Developmental Medicine and Child Neurology. 2001;43:809–818.
  • Constantino J, Gruber C. Social Responsiveness Scale. Los Angeles, CA: Western Psychological Services; 2005.
  • Constantino J, LaVesser P, Zhang Y, Abbacchi A, Gray T, Todd R. Rapid quantitative assessment of autistic social impairment by classroom teachers. Journal of the American Academy of Child and Adolescent Psychiatry. 2007;46(12):1668–1676.
  • Corsello C, Hus V, Pickles A, Risi S, Cook E, Leventhal B, et al. Between a ROC and a hard place: decision making and making decisions about using the SCQ. Journal of Child Psychology and Psychiatry. 2007;48(9):932–940.
  • de Bildt A, Sytema S, Ketelaars C, Kraijer D, Mulder E, Volkmar F, et al. Interrelationship between Autism Diagnostic Observation Schedule-Generic (ADOS-G), Autism Diagnostic Interview-Revised (ADI-R), and the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR) classification in children and adolescents with mental retardation. Journal of Autism and Developmental Disorders. 2004;34(2):129–137.
  • DiLavore P, Lord C, Rutter M. The Pre-Linguistic Autism Diagnostic Observation Schedule (PL-ADOS). Journal of Autism and Developmental Disorders. 1995;25:355–379.
  • Elliott CD. Differential Abilities Scale (DAS). San Antonio, TX: Psychological Corporation; 1990.
  • Gotham K, Risi S, Pickles A, Lord C. The Autism Diagnostic Observation Schedule (ADOS): Revised algorithms for improved diagnostic validity. Journal of Autism and Developmental Disorders. 2007;37(4):613–627.
  • Jaeschke R, Guyatt G, Lijmer J. Diagnostic tests. In: Guyatt G, Rennie D, editors. User’s Guide to the Medical Literature. Chicago: AMA Press; 2002. pp. 121–140.
  • Kim S, Lord C. New Autism Diagnostic Interview-Revised (ADI-R) algorithms for toddlers and young preschoolers from 12 to 47 months of age. Journal of Autism and Developmental Disorders. In press.
  • Kim S, Lord C. Restricted and repetitive behaviors in toddlers and preschoolers with autism spectrum disorders based on the Autism Diagnostic Observation Schedule (ADOS). Autism Research. 2010;3(4):162–173.
  • Landa R, Garrett-Mayer E. Development in infants with autism spectrum disorders: a prospective study. Journal of Child Psychology and Psychiatry. 2006;47(6):629–638.
  • Le Couteur A, Haden G, Hammal D, McConachie H. Diagnosing autism spectrum disorders in preschoolers using two standardised assessment instruments: The ADI-R and the ADOS. Journal of Autism and Developmental Disorders. 2007;38(2):362–372.
  • Lord C, Rutter M, DiLavore P, Risi S. Autism Diagnostic Observation Schedule: Manual. Los Angeles, CA: Western Psychological Services; 1999.
  • Lord C, Storoschuk S, Rutter M, Pickles A. Using the ADI-R to diagnose autism in preschoolers. Infant Mental Health Journal. 1993;14(3):234–252.
  • Luyster R, Gotham K, Guthrie W, Coffing M, Petrak R, Pierce K, et al. The Autism Diagnostic Observation Schedule—Toddler module: A new module of a standardized diagnostic measure for autism spectrum disorders. Journal of Autism and Developmental Disorders. 2009;39:1305–1320.
  • Mullen E. Mullen Scales of Early Learning: AGS Edition. Circle Pines, MN: American Guidance Service; 1995.
  • Newcombe RG. Interval estimation for the difference between independent proportions: Comparison of eleven methods. Statistics in Medicine. 1998;17:873–890.
  • Risi S, Lord C, Gotham K, Corsello C, Chrysler C, Szatmari P, et al. Combining information from multiple sources in the diagnosis of autism spectrum disorders. Journal of the American Academy of Child and Adolescent Psychiatry. 2006;45(9):1094.
  • Rutter M, Bailey A, Lord C. The Social Communication Questionnaire. Los Angeles, CA: Western Psychological Services; 2003.
  • Rutter M, Le Couteur A, Lord C. Autism Diagnostic Interview-Revised. Los Angeles, CA: Western Psychological Services; 2003.
  • Siegel B, Vukicevic J, Elliott G, Kraemer H. The use of signal detection theory to assess DSM-III-R criteria for autistic disorder. Journal of the American Academy of Child and Adolescent Psychiatry. 1989;28:542–548.
  • Sparrow S, Balla D, Cicchetti D. Vineland Adaptive Behavior Scales. Circle Pines, MN: American Guidance Service; 1984.
  • Steiger J. Test for comparing elements of a correlation matrix. Psychological Bulletin. 1980;87(2):245–251.
  • Stone W, Coonrod E, Ousley O. Screening Tool for Autism in Two-Year-Olds (STAT): Development and preliminary data. Journal of Autism and Developmental Disorders. 2000;30:607–612.
  • Szatmari P, Merette C, Bryson S, Thivierge J, Roy M, Cayer M, et al. Quantifying dimensions in autism: A factor-analytic study. Journal of the American Academy of Child and Adolescent Psychiatry. 2002;41:467–474.
  • Ventola PE, Kleinman J, Pandey J, Barton M, Allen S, Green J, et al. Agreement among four diagnostic instruments for autism spectrum disorders in toddlers. Journal of Autism and Developmental Disorders. 2006;36:839–847.
  • Wiggins L, Robins D, Adamson L, Bakeman R, Henrich C. Support for a dimensional view of autism spectrum disorders in toddlers. Journal of Autism and Developmental Disorders. In press.
  • Zwaigenbaum L, Bryson S, Rogers T, Roberts W, Brian J, Szatmari P. Behavioral manifestation of autism in the first year of life. International Journal of Developmental Neuroscience. 2005;23:143–152.