The primary aim of the present study was to evaluate the validity of proposed DSM-5 criteria for Autism Spectrum Disorder (ASD).
We analyzed symptoms from 14,774 siblings (8,911 ASD; 5,863 non-ASD) included in a national registry, the Interactive Autism Network. Youth aged 2–18 were included if at least one child in the family was diagnosed with ASD. Caregivers reported symptoms using the Social Responsiveness Scale and the Social Communication Questionnaire. The structure of autism symptoms was examined using latent variable models that included categories, dimensions, or hybrid models specifying categories and sub-dimensions. Diagnostic efficiency statistics evaluated the proposed DSM-5 algorithm in identifying ASD.
A hybrid model that included both a category (ASD vs. non-ASD) and two symptom dimensions (social communication/interaction and restricted/repetitive behaviors) was more parsimonious than all other models and replicated across measures and sub-samples. Empirical classifications from this hybrid model closely mirrored clinical ASD diagnoses (90% overlap), implying a broad ASD category distinct from non-ASD. DSM-5 criteria had superior specificity relative to DSM-IV-TR criteria (.97 vs. .86); however, sensitivity was lower (.81 vs. .95). Relaxing DSM-5 criteria by requiring one fewer symptom criterion increased sensitivity (.93 vs. .81), with minimal reduction in specificity (.95 vs. .97).
Results supported the validity of proposed DSM-5 criteria for ASD as provided in Phase I field trials criteria. Increased specificity of DSM-5 relative to DSM-IV-TR may reduce false positive diagnoses, a particularly relevant consideration for low base rate clinical settings. Phase II testing of DSM-5 should consider a relaxed algorithm, without which as many as 12% of ASD-affected individuals, particularly females, will be missed. Relaxed DSM-5 criteria may improve identification of ASD, decreasing societal costs through appropriate early diagnosis and maximizing intervention resources.
Autisms are a highly heterogeneous set of disorders with wide variations in symptom severity, intellectual level, and functional disability.1 The importance of accurately identifying individuals with autism has never been greater, particularly given the growing prevalence,2 considerable family and societal costs,3 and recognized importance of early diagnosis and intervention. DSM-5 Field Trials have recently begun evaluating new diagnostic criteria that contain several important modifications relative to DSM-IV-TR.4, 5 The most remarkable change was combining specific DSM-IV-TR diagnoses into a single broad Autism Spectrum Disorder (ASD). This proposed change has generated considerable apprehension from patients and their families, who are concerned that individuals diagnosed with Asperger’s disorder will be orphaned or receive inappropriate service provision.6 Yet, to date, there is little evidence that Asperger’s disorder is qualitatively distinct from other autism diagnoses at the symptom level or that the new criteria will under-identify high functioning ASD.7 The present study evaluates an important aspect of this controversy - symptom continuity between individuals with Asperger’s disorder and other ASD cases - by explicitly testing conflicting views of the nature of autism symptom structure.
The first viewpoint proposes that autism symptoms are best represented dimensionally, with differences between typical and ASD symptom levels being a matter of degree (i.e., no distinct ASD category is present).8 This perspective is supported by the observations from population and family studies of a broad distribution of observed autism symptoms in the population,9 elevated levels of autism traits in siblings and other family members of affected cases,10, 11 and shifting toward greater autism symptoms in children whose parents both show sub-threshold autism traits.12 Also compelling are possible genetic heterogeneity of symptom domains13, 14 and the substantial variability of autism symptoms among identical twins.15, 16 This evidence points toward the need to include a dimensional conceptualization of autism symptoms in its diagnosis.
The second viewpoint is that autism symptoms represent a category, with qualitative differences in symptom levels between ASD-affected and unaffected individuals.17–19 Two recent studies supported the categorical conceptualization of autism, identifying a latent symptom category parsing ASD and non-ASD cases.20, 21 Longitudinal studies of early (age 2) and later (ages 3–9) diagnoses also support the notion of a single, distinct ASD category.22, 23 These studies found strong stability of the broad ASD distinction, while substantial shifting occurred within specific DSM-IV diagnoses. Similarly, most studies of ASD symptoms and cognitive processes have identified only quantitative distinctions among DSM-IV disorders.7, 24, 25
It is possible that both viewpoints are correct and that categorical and dimensional aspects of autism symptoms should be considered in the conceptualization of ASD. In fact, in addition to an ASD category, DSM-5 also identifies two symptom dimensions - Social Communication and Interaction (SCI) and Restricted, Repetitive Behavior (RRB) - that collapse the three DSM-IV-TR domains. Thus, DSM-5 presents a complex model with a single ASD category superimposed on two primary symptom dimensions. The present study investigated the plausibility of the DSM-5 conceptualization of autism symptoms by comparing categorical, dimensional, and hybrid latent variable models (Aim 1).26–28 Based on the DSM-5 model of ASD, we hypothesized that autism symptoms would be most parsimoniously represented by hybrid models specifying a categorical distinction (ASD vs. non-ASD) and two dimensions (SCI and RRB).
The distinction between dimensional, categorical, and hybrid models has relevance beyond diagnostic conceptualization - impacting the clinical assessment approach. If autism symptoms are best represented by a latent category, the prevalence of ASD, sibling recurrence rates, and the most useful diagnostic criteria are defined. In this case, an evidence-based medicine approach to clinical assessment that includes the generation of post-test probabilities of diagnosis is indicated.29 Under this scenario, instruments should be used to optimize classification rather than only grading autism symptom severity. In contrast, if only a latent continuum is needed to describe autism symptoms, additional research is needed to link the continuum to clinically relevant outcomes, such as functional deficits, prior to setting a diagnostic threshold. If hybrid models are supported, integration of the above approaches will be needed to most accurately characterize the presence vs. absence of ASD and to grade symptom severity.
Identifying the latent structure of autism also facilitates the design and analysis of future research.30, 31 If the latent structure of autism is categorical, then traditional group comparison research is optimized if individuals are accurately sorted into ASD and non-ASD groups. Alternatively, if a continuum is identified, research designs would be more efficient if autism symptoms are measured as quantitative traits and regression or latent factor approaches are implemented rather than group comparisons. A latent ASD category might also facilitate future neurobiological and genomic research. For example, a latent category may support developmental models postulating distinct divergence in brain structure and function.32, 33 An ASD category could also favor molecular models emphasizing the primacy of strong individual genomic effects,34–37 such as private mutations or copy number variations involving genes crucial for early brain maturation,38 although this is less certain because of the potential for genetic threshold effects. In contrast, strictly dimensional representations of autism would suggest graded alterations in brain development, overlapping typical brain trajectories, and a primary role for polygenic common variation and other additive genomic and environmental effects.37, 39, 40 Finally, hybrid models of ASD symptoms may imply a more complex pattern, such as combinations of rare and common variation as well as environmental effects contributing to the ASD phenotype.27, 41 Comparing models of autism symptom structure will be crucial for advancing diagnostic criteria, clinical assessment, and future molecular and neurobiological research.
The present study also evaluated specific changes in the proposed DSM-5 algorithm examined in Phase I trials (Aim 2). These changes include: 1) collapsing three domains into two - Social Communication and Interaction (SCI) and Restricted, Repetitive Behavior (RRB), 2) requiring all SCI criteria and at least 2 of 4 RRB criteria to be met, 3) necessitating at least 2 specific symptoms present per criterion, 4) providing a separate RRB criterion evaluating sensory sensitivities and unusual sensory interests, and 5) adding subtle behavioral manifestations of high functioning ASD. Thus, as a secondary aim, we explored the relative classification accuracy of DSM-IV-TR and proposed DSM-5 criteria to inform future diagnostic revisions in Phase II trials and to provide guidance to evaluating clinicians.
Data were obtained from the Interactive Autism Network (IAN; www.ianproject.org), a US, internet-based registry for families with one or more ASD-affected children (IAN Data Export ID: IAN_DATA_2010-07-06). Families were eligible for enrollment in IAN if the parent or legal guardian who provided information was English speaking, the family lived in the US, and their child was diagnosed with an ASD by a professional. To be included in the present study, caregivers must have reported ASD symptom data for at least one ASD-affected child. Caregivers also reported whether each registered child was clinically diagnosed with ASD and, if ASD-affected, the specific diagnosis obtained from a prior clinical evaluation. The vast majority of clinical ASD diagnoses (93%) were provided by a doctoral level professional or team. A substantial proportion of youth (74.9%) were diagnosed using the Autism Diagnostic Interview-Revised, the Autism Diagnostic Observation Schedule, or both. Of these, 98.7% scored in the ASD-affected range for one or both instruments. A recent study of youth from the IAN registry provided additional support for the validity of clinical ASD diagnoses.42 In this study, randomly selected verbal youth with a score >12 on the SCQ were evaluated using the Autism Diagnostic Interview-Revised and by an expert clinician’s observation. All but a handful of youth (98%) were confirmed to have a DSM-IV-TR clinical diagnosis of ASD.
Autism symptoms, ages of developmental milestones, and other developmental concerns were also reported for siblings without a clinical ASD diagnosis. This non-ASD sibling sub-sample includes substantial proportions of youth who have caregiver-reported diagnoses of ADHD (11.8%), anxiety disorder (5.4%), mood disorder (4.2%), intellectual disability (0.5%), motor delay (6.6%), and/or language delay (20.2%). A non-trivial fraction of these siblings also show characteristics of the broad autism phenotype.43 For these reasons, we refer to these youth as parent-designated non-ASD siblings. The heterogeneity of these siblings provides a reasonable comparison for examining both the latent structure and the relative efficiency of diagnostic criteria.
Informed consent was obtained from parents/guardians for all participants prior to entry into the IAN data collection. The procedures of IAN were reviewed and approved by the institutional review board of Kennedy Krieger Institute. The procedures of the present study were reviewed and approved by the institutional review board of the Cleveland Clinic.
Autism symptom data were provided using the Social Responsiveness Scale (SRS)44 and/or the Social Communication Questionnaire (SCQ).45 The SRS is a 65-item, ordinally-scaled (1= “not true” to 4= “almost always true”) questionnaire that provides a quantitative assessment of the severity of autism traits. The SCQ is a dichotomously-keyed (yes/no) rating scale that consists of 40 questions, many of which tap DSM-IV-TR symptom domains. Lifetime ratings referenced the child’s behavior throughout their developmental history, increasing diagnostic validity.46 Both the SRS and SCQ have been extensively validated and distinguish youth with autism from other psychiatric conditions.44, 47–49 SRS sub-scales provide a quantitative assessment of ASD symptoms and decrease the likelihood of falsely identifying a latent category due to limited scaling. Because the SRS includes only one scale evaluating RRB behaviors, we randomly divided the 12 items on the autism mannerisms scale into four packets to improve identification of the RRB domain in latent variable models. Packets or parcels have important psychometric advantages compared to directly analyzing item scores, including increasing the reliability and variance of the scales and improving model stability and interpretability.50
The hypothesis that autism symptoms would be most parsimoniously explained by a category (ASD vs. non-ASD) and two sub-dimensions (SCI and RRB) was evaluated by estimating a series of latent class (LCA; 2–6 classes) and exploratory factor analyses (EFA; 1–3 factors) (Aim 1). Latent class analysis assumes the existence of discrete categories, while exploratory factor analysis assumes latent dimensions or continua of symptoms. Recent developments in latent variable modeling now permit the simultaneous integration of categories and dimensions.26, 27 These hybrid factor mixture (FM) models integrate both EFA and LCA, with latent factors modeled across classes. Comparing the fit of EFA, LCA, and FM models addresses the question of whether categories (classes), dimensions (factors), or both are needed to represent autism symptoms. When applied to autism symptom data, hybrid models examine whether autism symptoms measure one or more dimensions and whether individual differences in those dimensions result from one or more groups of individuals. If a single group provides optimal fit, then dimensions capture graded increases in autism symptom levels. If two or more groups provide superior fit, then symptom dimensions result from two or more distinct groups or distributions (e.g., ASD vs. non-ASD or Autism vs. Asperger’s vs. non-ASD). Based on empirical work, the results of LCA and EFA models guided the selection of plausible FM models.26, 51 For example, if a two-factor EFA fit better than three-factor models, FM models emphasized one- and two-factor solutions since it is unlikely three-factor FM models would improve fit when both factors and classes are included.
Models were fit in the total available samples (including ASD and non-ASD siblings) and the non-ASD sibling sub-samples. Sibling comparators were examined separately because the full sample may be biased toward identification of a pseudo-category due to the polarizing effects of social comparison processes or rater contrast effects (i.e., rating presumed non-ASD sibs as very healthy and ASD sibs as very severely affected). Estimating models in non-ASD siblings circumvents this potential bias. We expected to observe a low base rate class with significant ASD symptoms in the sibling comparison sub-sample. This expectation was based on previous analyses,20 the observation of high SCQ and SRS scores in a non-trivial minority of caregiver-designated non-ASD siblings in the IAN registry (>3% show SCQ total score ≥15 and/or SRS total T-score ≥70), a similar observation of high scoring youth with autism or neurobehavioral disorders in population samples,47 and recent findings of language delay with autistic qualities of speech in a minority of these siblings.43 Primary analyses focused on the SRS because this measure evaluates autism symptoms as a quantitative trait and is less biased toward identifying categorical latent structure. Sensitivity analyses were also conducted with the SCQ. These analyses were conducted to internally replicate latent variable model findings using this very different type of measure and to determine the sensitivity of model findings to variations in sampling (see Table S1, available online, which presents the sample/sub-sample, item/scale set, and purpose of each latent variable model analysis).
All models were estimated accounting for the clustered nature of the sibling data. LCA models assumed conditional independence of variables within classes. FM models were estimated specifying strict or strong measurement invariance across classes.52 The pattern of model comparisons was similar, and therefore results were presented with strong measurement invariance. Best fitting models were determined by comparing the Bayesian Information Criteria (BIC),26, 28 where lower values indicate better model fit. Results were highly similar for the Akaike and sample-adjusted Bayesian criteria. Class proportions were also evaluated for FM models to ensure that better fitting models did not over-fit the observed score distribution. Empirically-derived class assignments (ASD vs. non-ASD) were computed using the best fitting model.52 These assignments served as an additional proxy of ASD diagnosis,53 complementing clinical ASD diagnoses. The terms “ASD class” and “non-ASD class” described these empirical assignments even when they were derived from the caregiver-designated non-ASD sibling sub-sample. Independent samples t-tests and chi-square evaluated differences in developmental milestones across empirical classifications in the non-ASD siblings. Differences in the ages of developmental milestones were anticipated to be small in magnitude across ASD and non-ASD empirical classifications. Thus, they were not expected to be clinically meaningful, but rather to provide theoretical validation of the notion of a small ASD class in these caregiver-designated non-ASD siblings.
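As a minimal sketch of the information-criterion comparison described above, the snippet below computes the BIC from a model's log-likelihood and ranks candidate models. The formula BIC = -2 log L + k ln(n) is the standard definition. The model labels, log-likelihoods, parameter counts, and sample size are hypothetical placeholders rather than values from this study, and the study's models were estimated in Mplus, not Python.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate better fit."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def rank_by_bic(bic_by_model):
    """Return model labels ordered from lowest (best) to highest BIC."""
    return sorted(bic_by_model, key=bic_by_model.get)

# Hypothetical fit results: label -> (log-likelihood, number of free parameters).
fits = {
    "LCA, 4 classes": (-21400.0, 39),
    "EFA, 2 factors": (-21150.0, 26),
    "FM, 2 factors / 2 classes": (-20900.0, 33),
}
n_obs = 2000  # hypothetical sample size

scores = {label: bic(ll, k, n_obs) for label, (ll, k) in fits.items()}
for label in rank_by_bic(scores):
    print(f"{label}: BIC = {scores[label]:.1f}")
```

A ranking of this kind is only a starting point: as noted above, class proportions were also inspected so that a marginally lower BIC was not accepted when the extra class merely over-fit the score distribution.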
To explore the relative classification accuracy of DSM-IV-TR and DSM-5, autism symptoms from the SCQ and SRS were mapped onto DSM-IV-TR and proposed DSM-5 diagnostic criteria (See Table S2, available online). SCQ items were mapped first with a limited number of SRS items (8 total) chosen only to fill gaps where SCQ items did not evaluate a DSM-5 symptom. Because SCQ items are based on DSM-IV-TR criteria, only 6 SRS items were included to ensure a balance of the total number of items mapped to DSM-IV-TR and DSM-5 criteria. SCQ items were predominant to ensure that subsequent analyses favored DSM-IV-TR rather than DSM-5 criteria. This provides a stronger test of the relative efficiency of DSM-5 criteria than would a mapping based predominantly on SRS items, which were not developed in relation to diagnostic criteria. Mapped DSM-IV-TR criteria were considered met if any of the algorithms for Autistic Disorder, PDD NOS, or Asperger’s disorder were endorsed. Algorithms were followed as specified in the Phase I Field Trials criteria. Iterations of these criteria were also examined, including the most recent iteration as of this writing that requires only 1 specific symptom present per criterion.5 SCQ and SRS item mapping was done in a sub-sample with complete data on both measures. Because most individuals enrolling in IAN receive the SCQ, this sub-sample has highly similar composition to the SRS sample (see results for comparison of SCQ and SRS samples).
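The decision rule described in the text (all SCI criteria met, at least 2 of 4 RRB criteria met, and a minimum number of endorsed symptoms per criterion) can be sketched as follows. This is an illustration under stated assumptions, not the scoring code used in the study: the criterion labels and item endorsements are hypothetical, the actual item-to-criterion mapping is given in Table S2, and the relaxed rule is interpreted here as tolerating one unmet criterion in total across SCI and RRB.

```python
from typing import Dict, List

def criterion_met(endorsements: List[bool], min_symptoms: int = 2) -> bool:
    """A criterion counts as met when enough of its mapped items are endorsed."""
    return sum(endorsements) >= min_symptoms

def meets_dsm5(sci: Dict[str, List[bool]],
               rrb: Dict[str, List[bool]],
               min_symptoms: int = 2,
               relaxed: bool = False) -> bool:
    """Apply the mapped DSM-5 rule: all SCI criteria and at least 2 RRB criteria.

    With relaxed=True, one unmet criterion is tolerated in total
    (an assumed reading of "one less symptom criterion").
    """
    sci_met = sum(criterion_met(items, min_symptoms) for items in sci.values())
    rrb_met = sum(criterion_met(items, min_symptoms) for items in rrb.values())
    sci_shortfall = len(sci) - sci_met      # SCI criteria still unmet
    rrb_shortfall = max(0, 2 - rrb_met)     # RRB criteria missing from the 2 required
    allowed_shortfall = 1 if relaxed else 0
    return sci_shortfall + rrb_shortfall <= allowed_shortfall

# Hypothetical child: third SCI criterion has only one endorsed item.
sci_items = {"A1": [True, True, False], "A2": [True, True], "A3": [True, False, False]}
rrb_items = {"B1": [True, True], "B2": [True, True, False],
             "B3": [False, False], "B4": [False]}

print(meets_dsm5(sci_items, rrb_items))                  # strict rule, 2 symptoms per criterion
print(meets_dsm5(sci_items, rrb_items, relaxed=True))    # relaxed rule
print(meets_dsm5(sci_items, rrb_items, min_symptoms=1))  # 1 symptom per criterion
```

The `min_symptoms` parameter covers the iteration that requires only one specific symptom per criterion, so the same function can compare the versions of the algorithm examined in the text.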
Analyses compared the sensitivity and specificity of DSM-IV-TR, proposed DSM-5, and alterations of DSM-5 to proxy ASD diagnoses. Greater weight was given to specificity to limit false positive ASD diagnoses. Analyses of the DSM-5 algorithm are exploratory because criteria were not perfectly mapped, the non-ASD sibling sample is not equivalent to a clinical comparison sample, clinical observation was not included, and proxies for ASD diagnosis are only estimates of true ASD. However, because of the important role of parent report, particularly for symptom history in older individuals and for behaviors that are difficult to observe in an office visit,54 these analyses provide a reasonable evaluation of the relative efficiency of different versions of diagnostic criteria.
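For readers less familiar with diagnostic efficiency statistics, the sketch below shows how sensitivity and specificity follow from cross-tabulating a criteria-based classification against a proxy ASD diagnosis. The function name and the example vectors are hypothetical and serve only to illustrate the definitions.

```python
def diagnostic_efficiency(criteria_met, proxy_asd):
    """Compute sensitivity and specificity of a criteria set against a proxy diagnosis.

    Both arguments are equal-length sequences of booleans, one entry per child.
    """
    pairs = list(zip(criteria_met, proxy_asd))
    tp = sum(1 for c, d in pairs if c and d)          # criteria met, proxy ASD
    fn = sum(1 for c, d in pairs if not c and d)      # criteria not met, proxy ASD
    fp = sum(1 for c, d in pairs if c and not d)      # criteria met, proxy non-ASD
    tn = sum(1 for c, d in pairs if not c and not d)  # criteria not met, proxy non-ASD
    return {
        "sensitivity": tp / (tp + fn),  # proportion of proxy ASD cases identified
        "specificity": tn / (tn + fp),  # proportion of proxy non-ASD cases excluded
    }

# Hypothetical data for eight children.
criteria = [True, True, True, False, False, False, True, False]
proxy = [True, True, True, True, False, False, False, False]
print(diagnostic_efficiency(criteria, proxy))
```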
Latent variable models were estimated using MPlus v5.2. All other analyses used SPSS v18. Effect sizes are presented as Cohen’s d with conventions of very small (d<.10), small (d=.20), medium (d=.50), and large (d=.80).55
The total IAN sample included symptom data from 14,774 youth aged 2–18 (8,911 ASD; 5,863 caregiver-designated non-ASD siblings) provided by 10,038 caregivers. There was a predominance of males among youth with clinical ASD diagnoses (82.3%) and of females in the sibling comparison group (53.9%), consistent with the sex ratio of autism. SRS data were available from 6,949 youth (4,248 ASD, 2,701 non-ASD siblings) and SCQ data were available from 14,200 youth (8,606 ASD, 5,594 non-ASD siblings). The smaller SRS sample had highly similar demographic and clinical characteristics to the larger SCQ sample with three notable exceptions. First, the SRS sample was slightly older (SRS M=8.40, SD=3.99, SCQ M=7.53, SD=4.37; t(14772)=12.69, p<.001, d=.21) due to the fact that the instrument is completed only for individuals ages 4 and older, while the SCQ is completed for ages 2 and older. Additionally, the total SRS sample included fewer non-verbal individuals even when only youth ages 4 and older were considered (SRS 8.4%, SCQ 12.8%; Χ2(1)=61.10, p<.001; d=.14) and a slightly higher proportion of white/non-Hispanic individuals (SRS 86.3%, SCQ 85.0%; Χ2(1)=5.42, p=.020; d=.04). The above differences, while statistically significant, ranged from very small to small in magnitude. The SRS and SCQ samples had similar distributions across sex and specific DSM-IV-TR ASD diagnoses and showed equivalent proportions of individuals receiving the ADOS or ADI-R as part of their clinical diagnostic evaluation (largest Χ2(1)=3.65, p>.05). Additional characteristics of the IAN sample have been described elsewhere.20, 43
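The age effect size above can be approximately reproduced from the reported means and standard deviations. The sketch assumes an equal-weight pooled standard deviation; the text does not state which pooling formula was used, so this is an illustration rather than the authors' exact computation.

```python
import math

def cohens_d(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d using an equal-weight pooled standard deviation (an assumption)."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2.0)
    return (mean_1 - mean_2) / pooled_sd

# Reported ages: SRS sample M=8.40, SD=3.99; SCQ sample M=7.53, SD=4.37.
d_age = cohens_d(8.40, 3.99, 7.53, 4.37)
print(f"d = {d_age:.2f}")  # prints "d = 0.21"
```

Under this assumption the result rounds to the reported d=.21, which falls at the "small" convention described in the analysis section.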
Table 1 displays comparisons for models using SRS sub-scales. LCA models showed steady improvements in fit from 2- to 6-class solutions, but no minimum was reached for any information criterion and fit was generally weaker than EFA models. The two-factor EFA model divided SRS social and autism mannerisms variables consistent with proposed DSM-5 SCI and RRB domains. The three-factor EFA model fit better, but had only one minor loading on the third factor. Therefore, FM models were estimated with two factors separating social and autism mannerisms variables.28
The two-factor/three-class FM model fit slightly better than all other models. However, the third class appears to overfit the symptom distribution by splitting ASD-affected youth according to extreme and less extreme groups across all SRS scales. In spite of the large sample size, there were no significant differences in the proportion of individuals diagnosed with Autistic Disorder, PDD NOS, or Asperger’s disorder between these classes (Χ2(1)=2.45, p=.118), indicating no additional information was added by retaining the third class. Furthermore, the third class tended to fit more poorly and was not replicated in other analyses. Thus, the two-factor/two-class FM model represented the most parsimonious solution (Figure 1).
Supporting our prediction, similar results were obtained in the non-ASD sibling sub-sample. EFA and LCA models fit more poorly (lowest BIC=42420) than FM models. The two-factor/three-class FM model fit best (BIC=41480) but the third class (5%) overfit extreme scores with no qualitative differences in symptom pattern. The two-factor/two-class FM model was most parsimonious (BIC=41981).
Figure 1 displays results of the two-factor/two-class hybrid model. Base rates of the ASD and non-ASD empirically derived classes are presented for both the total sample and non-ASD sibling sub-sample. In the total sample, the ASD class had a high base rate (63%) that was similar to the base rate of caregiver-reported clinical ASD diagnosis (60%). In the non-ASD sibling sub-sample, a low base rate of the ASD class (9%) was observed. This was consistent with the prediction of only a small proportion of non-ASD siblings having ASD and should be viewed in light of suggestions from the empirical literature that small base rates such as this may be over-estimated.52
Figure 1 also presents ASD and non-ASD class means and standard deviations for SRS social and autism mannerism T-scores, separately for the total and non-ASD sibling sub-samples. The means and standard deviations show strong separation of scores across ASD and non-ASD classes in both samples. Figure 2 plots SRS social and autism mannerism T-scores for youth in ASD and non-ASD classes, separately for the total sample (left panel) and the non-ASD sibling sub-sample (right panel). Inspection of the total sample (left panel) reveals clustering of class values into ASD (orange) and non-ASD classes (blue). However, broad dimensions of symptom severity were also present within each cluster, with higher variance and a weaker correlation between social and autism mannerisms scales in the ASD class (correlations are presented in Figure 1). For the total sample (left panel), social and autism mannerism T-scores in the ASD class (orange) ranged from 45 to >100. Social and autism mannerism T-scores were lower in the non-ASD class, ranging from 35 to 65. A similar pattern was observed for the non-ASD sibling sub-sample (right panel). The only exception was that the ASD class derived from the non-ASD sibling sub-sample is considerably smaller and has few very high scores relative to the ASD class from the total sample. Importantly, and in spite of overlapping score ranges, ASD and non-ASD classes showed very large (d>2.0), highly significant, separation on SRS social and autism mannerism T-scores (smallest t(2699)=57.3, p<.001, d=2.21). Large separation on these scores suggested that low and high scores might have clinical utility for separating ASD and non-ASD classifications. To examine this possibility, we computed likelihood ratios for low (T≤50) and high score (T≥65) ranges on SRS scales in the total sample. Low scores on both the SRS social and autism mannerism scales were very useful for decreasing the probability of ASD (Likelihood ratio=.007). High scores on both scales greatly increased the probability of ASD (Likelihood ratio=14.3).
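To illustrate how these likelihood ratios translate into post-test probabilities, the sketch below applies the standard odds form of Bayes' theorem. The pre-test probability of .60 is used only for illustration (it matches the base rate of clinical ASD diagnosis in the total sample); in practice it would be chosen for the clinical setting at hand.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)

pre_test = 0.60  # illustrative pre-test probability (an assumption)

high_scores = post_test_probability(pre_test, 14.3)   # both SRS scales T>=65
low_scores = post_test_probability(pre_test, 0.007)   # both SRS scales T<=50
print(f"High scores: {high_scores:.3f}")
print(f"Low scores:  {low_scores:.3f}")
```

With this starting point, high scores on both scales raise the probability of ASD to roughly .96, whereas low scores on both scales reduce it to roughly .01.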
To examine the sensitivity of model comparisons to the measure used (SRS vs. SCQ) and sampling characteristics, the same set of models was estimated using SCQ items across the total available sample and several demographic sub-samples (non-ASD siblings, verbal youth, multiple incidence families, females, ages<7, and ages ≥7). The pattern of results for each set of model comparisons was highly similar to the primary models using SRS indicators. EFA supported the presence of two primary factors. LCA tended to require a large number of latent classes to approximate FM model fit. The two-factor/two-class FM model was typically the best fitting and always the most parsimonious model in the SCQ total and sub-samples. The SCQ total sample closely matches the complete IAN registry,56 ensuring that results were not a function of a biased subset of IAN. Results also generalized across changes in FM model specifications.26, 28, 52
Total sample FM classifications closely mirrored clinical ASD diagnosis (90% of youth classified identically; Kappa=.78). Empirical classifications from the non-ASD sub-sample included a higher proportion of females identified as belonging to the ASD class (50.9%) relative to empirical classifications from the total sample (20.8%), reinforcing the notion that female siblings may be under-identified by caregivers as having ASD.10, 57 Figure 3 presents the ages at which developmental milestones were achieved in caregiver-designated non-ASD siblings, separately for empirically identified ASD and non-ASD classes. Individuals in the ASD class were reported to have later ages of first words (ASD class M=12.6 months, SD=9.0; non-ASD class M=10.9 months, SD=4.9), walking (ASD class M=12.8 months, SD=3.2; non-ASD class M=12.1 months, SD=2.5), meaningful phrase speech (ASD class M=22.6 months, SD=11.4; non-ASD class M=18.7 months, SD=5.9), and toilet training (ASD class M=35.6 months, SD=12.7; non-ASD class M=31.0 months, SD=9.3) relative to youth in the non-ASD class. These differences were all highly significant (all p<.001) and remained highly significant when excluding youth with intellectual disability. Thus, caregiver-designated non-ASD siblings empirically classified into an ASD category showed a reliable pattern of developmental delay across a range of early milestones. However, these delays tended to be subtle and highly variable across cases. As a result, the above differences tended to be small (largest d=.33) and are not large enough to be clinically useful for distinguishing ASD from non-ASD cases.
Table 2 presents sensitivity and specificity of DSM-IV-TR, proposed DSM-5, and modified DSM-5 criteria to proxy ASD diagnoses. DSM-IV-TR and DSM-5 criteria showed substantial, but far from complete, overlap (Kappa=.74, agreement=87.1%). DSM-5 criteria had superior specificity, but lower sensitivity relative to DSM-IV-TR criteria (p<.001). The increase in specificity is remarkable in that it could reduce the false positive rate to less than one quarter of the estimated DSM-IV-TR rate (DSM-5=3% vs. DSM-IV-TR=14%). Interestingly, relaxing the DSM-5 algorithm to require one fewer symptom from either SCI or RRB provided the best balance of sensitivity and specificity. With the relaxed algorithm an additional 11–12% of ASD cases would be captured (Table 2 – comparing sensitivity values in row 2 vs. row 3). The superior sensitivity of a relaxed DSM-5 algorithm was at least partly due to decreased endorsement of 2 SCI criteria and 1 RRB criterion in older youth. Specifically, we observed a weak, but highly significant, negative relationship between age of the child and endorsement of the DSM-5 non-verbal communication, relationships, and repetitive behavior criteria (age with SCI criterion A2:non-verbal communication r=−.10; age with SCI criterion A3:relationships r=−.09; age with RRB criterion B1:repetitive behaviors r=−.09; all p<.001).
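The practical meaning of this trade-off depends on the base rate of ASD in the setting where the criteria are applied. As a hedged illustration, the sketch below converts the sensitivity and specificity values reported in this study into expected counts of false positives and missed cases. The cohort size of 1,000 and the base rate of 10% are hypothetical and chosen only to represent a low base rate setting.

```python
def expected_errors(sensitivity, specificity, base_rate, n_evaluated):
    """Return expected (false positives, missed cases) for a screened cohort."""
    n_asd = n_evaluated * base_rate
    n_non_asd = n_evaluated - n_asd
    false_positives = n_non_asd * (1.0 - specificity)
    missed_cases = n_asd * (1.0 - sensitivity)
    return false_positives, missed_cases

# Sensitivity and specificity as reported for each criteria set.
criteria_sets = {
    "DSM-IV-TR": (0.95, 0.86),
    "DSM-5": (0.81, 0.97),
    "Relaxed DSM-5": (0.93, 0.95),
}

# Hypothetical low base rate setting: 1,000 children, 10% with ASD.
for label, (sens, spec) in criteria_sets.items():
    fp, missed = expected_errors(sens, spec, base_rate=0.10, n_evaluated=1000)
    print(f"{label}: {fp:.0f} false positives, {missed:.0f} missed cases")
```

Under these assumptions, DSM-IV-TR would yield about 126 false positives and 5 missed cases, DSM-5 about 27 and 19, and the relaxed DSM-5 algorithm about 45 and 7.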
Including symptoms tapping subtler behavioral manifestations of high functioning ASD improved sensitivity (.81 vs. .64). Symptoms evaluating social difficulties in high functioning ASD included: impaired social understanding or awareness (SRS 4), literal or pedantic use of language (SRS 10), difficulties in adjusting behavior to various contexts (SRS 52), unusual prosody (SRS 53), and problems with body orientation or social distance (SRS 55). The sizable improvement in sensitivity was attributable to a higher proportion of individuals with Asperger’s disorder being correctly identified when these symptoms were included (78.3% vs. 58.7%). Detection of Asperger’s disorder was further enhanced using the relaxed algorithm described above (90.9%). Adding sensory sensitivities and unusual interests as an RRB criterion also improved sensitivity (.81 vs. .78) without substantially altering specificity. Requiring two specific symptom endorsements per DSM-5 criterion enhanced specificity (.97 vs. .90), although it is important to note that sensitivity is highest (.96) when only one symptom is required per criterion. This version of criteria may be preferred in high base rate settings where specificity is less crucial. The same pattern of findings was observed across both ASD diagnostic proxies. Results were also consistent across clinically relevant groups (age<7, age≥7, females, verbal youth, and multiple incidence families) with the notable exception that sensitivity and specificity tended to be slightly weaker in youth age<7 and multiple incidence families (See Table S3, available online).
In this investigation, a hybrid model, closely matching proposed DSM-5 criteria for ASD, yielded the most parsimonious representation of autism symptoms. The model included both a categorical distinction between youth with and without ASD and dimensional representations of social communication and interaction difficulties and restricted, repetitive behavior. Sensitivity analyses supported this hybrid conceptualization of ASD with generalization across measurement scales and demographic sub-samples. Furthermore, results were consistent with taxometric analyses of autism symptom data from this sample [20]. If the present results are replicated, large sample neurobiological studies will be useful for determining whether biological markers of the categorical ASD distinction can be identified [58] or if the DSM-5 ASD diagnosis can be empirically parsed into biologically meaningful sub-phenotypes.
The hybrid model merges two previously competing views of ASD symptoms: dimensional and categorical perspectives [9, 20]. Figure 2 provides a visualization of how an ASD diagnostic category can be superimposed on quantitative symptom distributions, clarifying the underlying nature of autism symptoms. Merging categorical and dimensional models presents the opportunity to enhance clinical assessment through a two-pronged approach. In this approach, assessment tools will both provide quantitative evaluation of symptom severity and generate post-test probabilities of ASD diagnosis following evidence-based medicine procedures [29]. A key next step will be identifying or adapting gold-standard measures that map onto DSM-5 criteria and can be used in an evidence-based medicine fashion to generate post-test probabilities of ASD [59]. Dissemination of these measures to clinicians is essential to meet the growing demand for early diagnosis of ASD.
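As an illustration of how post-test probabilities are generated in evidence-based medicine, a pretest probability can be converted to odds, multiplied by a diagnostic likelihood ratio, and converted back to a probability. The sketch below is hedged: it treats "meets proposed DSM-5 criteria" as a dichotomous test with the sensitivity (.81) and specificity (.97) reported here, and the 20% pretest probability is a hypothetical value, not an estimate from this study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous test result."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pretest_probability, likelihood_ratio):
    """Apply Bayes' theorem in odds form and convert back to a probability."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Reported values for proposed DSM-5 criteria against proxy ASD diagnoses.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.81, specificity=0.97)

# Hypothetical pretest probability of ASD in a referral setting.
pretest = 0.20

p_if_criteria_met = post_test_probability(pretest, lr_pos)
p_if_criteria_not_met = post_test_probability(pretest, lr_neg)

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
print(f"Post-test probability if criteria met: {p_if_criteria_met:.2f}")
print(f"Post-test probability if criteria not met: {p_if_criteria_not_met:.2f}")
```

A quantitative measure would typically supply likelihood ratios for several score ranges rather than a single cut point; the dichotomous case is shown only because it can be computed from the values reported here.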
The presence of an ASD vs. non-ASD distinction coheres with data identifying a divergent trajectory of brain development in ASD [32, 33]. The hybrid model of ASD also meshes with an emerging understanding of the complexity of autism genetics. Clearly, the present findings do not specify a genomic mechanism. However, they do cohere with recent findings from genomic studies identifying roles for highly penetrant mono- or oligogenic effects and modifying effects from common variants and epigenetic processes [41, 60–62]. It is important to note that the presence of a categorical distinction (ASD vs. no ASD) does not rule out a role for quantitative biological variation. These constitute different levels of analysis: For example, combinations of quantitative trait loci can produce threshold effects that might manifest as categorical distributions of symptoms. With this important caveat in mind, future genomic and neurobiological studies will benefit from designs and analytic methods that incorporate both the categorical and dimensional nature of ASD symptoms. Doing so will facilitate capturing the full nature and intricacy of ASD.
Findings from exploratory analyses supported the diagnostic efficiency of the proposed DSM-5 Phase I Field Trials algorithm in identifying ASD. The one possible improvement was relaxing the algorithm by requiring one fewer SCI or RRB criterion, which may increase capture of ASD-affected individuals by as much as 11–12%. Relaxing the number of criteria required may be particularly important for improving sensitivity to individuals previously classified as having Asperger’s disorder or to other under-identified sub-groups of ASD-affected youth [57, 63], and for enhancing detection in settings where reports of early-life symptoms are not readily obtained (e.g., adopted children or youth in foster or juvenile justice settings). Accounting for these individuals would be consistent with the DSM task force efforts to improve the applicability of the diagnostic system across sub-groups. Additional studies of the influence of age on DSM-5 ASD criteria will also be vital, particularly given the superior validity of retrospective reports of earlier symptoms and the primary role of caregiver report in evaluations of older individuals [46]. The present results suggest that caregiver report may be less helpful in identifying ASD in younger children (age<7) and in distinguishing affected from unaffected youth in multiple incidence families. Careful clinical observation may be crucial for accurate diagnosis in these groups.
The superior specificity of DSM-5, especially the criteria used in the Phase I Field Trials, relative to DSM-IV-TR criteria was striking. High specificity is a desirable trait for the new DSM-5 ASD diagnosis, reducing the likelihood that individuals without ASD are inappropriately given the diagnosis. However, the improved specificity of DSM-5 criteria for ASD raises the possibility that, if adopted, some individuals meeting DSM-IV-TR criteria may not meet DSM-5 criteria. Studies examining youth with autism symptoms who do not meet full DSM-5 Phase I Field Trials and currently proposed DSM-5 criteria are needed to determine whether the newly planned Social Communication Disorder captures all of these youth or whether some youth with high functioning ASD might be misclassified. Revision and Phase II testing of DSM-5 criteria should continue to include sensory sensitivities/unusual sensory interests and subtler SCI symptoms to maximize sensitivity to high functioning ASD. The possibility of requiring two or more specific symptoms per criterion (as specified in Phase I Field Trials criteria) should also be re-examined, as this appeared to improve specificity substantially over the currently proposed criteria.
The primary limitations of this study are: use of a self-selected internet registry, comparison with caregiver-designated non-ASD siblings, mapping of caregiver-reports of autism symptoms to diagnostic criteria, and reliance on clinical ASD diagnoses. The IAN internet registry is not representative of all families affected by ASD. However, the national scope and large size of this registry make it preferable to smaller community samples that limit generalizability and preclude examining hybrid FM models. Replication in population samples will be important for estimating the base rate and demographic characteristics of ASD. Very large population samples (N>10,000) will be needed to ensure a sufficient number of ASD cases are captured and to limit the influence of ascertainment biases on the detection of low base rate latent classes.
Non-ASD siblings are not equivalent to a clinical comparison sample. For this reason, the specificity values obtained may over-estimate values seen in typical clinical assessment situations where many cases mimicking ASD may present. However, the sibling comparison group may provide a reasonable approximation to the heterogeneous presentations often seen in clinical practice. Specifically, this group included a substantial proportion of individuals with other developmental concerns, including language delay, characteristics of the broad autism phenotype [43], ADHD, anxiety, and intellectual disability. Furthermore, analyses of this sub-sample identified a small proportion of caregiver-designated non-ASD siblings as falling in the ASD category. Thus, it is possible that diagnostic efficiency may actually be under-estimated. Regardless, the aim of the present study was to compare the relative efficiency of different iterations of diagnostic criteria rather than absolute values that would mirror those obtained in clinical settings. Phase II Field Trials and additional studies using clinical comparison samples will be needed to provide clinically-realistic absolute values.
Clinical ASD diagnoses are not as reliable and valid as diagnoses based on gold-standard semi-structured research interviews or observational instruments. However, these diagnoses were often based on gold-standard measures, and available data suggest that clinical diagnoses in IAN are quite accurate [42]. To circumvent the limitations of this diagnostic proxy, we also included FM-based empirical classifications. The consistency of findings across these surrogates is reassuring and implies that many of the present findings may also apply to gold-standard ASD diagnoses.
Mapping caregiver reports of symptoms on the SRS and SCQ to diagnostic criteria is also a limitation. The SCQ and SRS items may not have been mapped optimally and/or the items themselves may not adequately capture the targeted diagnostic criteria. It is possible that inclusion of clinical observation and judgment or other clinical information such as adaptive impairment would have improved sensitivity to ASD without the need to alter the proposed DSM-5 algorithm. However, caregiver report is often a crucial aspect of a thorough diagnostic evaluation and it is sometimes difficult to elicit accurate historical information for early-life ASD symptoms [54]. Thus, assessment approaches that lean heavily on caregiver report in determining ASD may benefit from relaxing the DSM-5 Phase I Field Trials criteria. Relaxed criteria should also be considered in conjunction with adaptive impairment to determine the simplest, most efficient criteria. Finally, it should also be noted that the present findings do not prima facie imply that DSM-IV-TR diagnoses should be lumped. It is possible that retaining a separate diagnosis or a specific diagnostic modifier for particular cases (such as individuals previously diagnosed with Asperger’s disorder) might be clinically useful if these cases were found to have a different course, prognosis, or treatment response.
In spite of these limitations, the present findings provided a comprehensive validation of DSM-5 criteria and generated several promising leads for future examinations of the DSM-5 algorithm. Adopting relaxed DSM-5 Phase I Field Trials criteria may substantially decrease false positive diagnoses and simultaneously maintain or improve identification of ASD cases. The importance of enhanced detection is magnified by increasing recognition of the prevalence of ASD. Many ASD cases will have their first, and possibly only, point of contact in non-ASD specialty settings [64]. In these settings, clinicians without expertise in ASD are, and increasingly will be, called upon to make an appraisal of the diagnosis, to initiate additional referral for specialty evaluation, and to provide preliminary treatment recommendations while diagnostic confirmation is pending. Thus, improvements in ASD detection will be meaningful for all consumers and providers, facilitating earlier and more appropriate intervention, diminishing inappropriate resource allocation, and reducing family and societal costs.
This publication was made possible by the Case Western Reserve University / Cleveland Clinic Clinical and Translational Grant Number UL1 RR024989 from the National Center for Research Resources. Autism Speaks provided support for the Interactive Autism Network project and Dr. Law. Funding from the National Institute of Child Health and Human Development Grant Number HD42541 supported Dr. Constantino’s involvement. Dr. Charis Eng is a Doris Duke Distinguished Clinical Scientist and holds the Sondra J. and Stephen R. Hardis Chair of Cancer Genomic Medicine at the Cleveland Clinic.
We gratefully acknowledge and appreciate the efforts of families contributing to the Interactive Autism Network. Drs. Frazier and Youngstrom served as the statistical experts.
Dr. Frazier has received federal funding or research support from, acted as a consultant to, or received travel support from Shire Development, Inc., Bristol-Myers Squibb, the National Institutes of Health (NIH), and the Brain and Behavior Research Foundation. Dr. Youngstrom has received travel support from Bristol-Myers Squibb. Dr. Findling receives or has received research support, acted as a consultant to and/or served on a speaker's bureau for Abbott, Addrenex, AstraZeneca, Biovail, Bristol-Myers Squibb, Forest, GlaxoSmithKline, Johnson and Johnson, KemPharm, Eli Lilly and Co., Lundbeck, Merck, Neuropharm, Novartis, Noven, Organon, Otsuka, Pfizer, Rhodes Pharmaceuticals, Sanofi-Aventis, Schering-Plough, Seaside Therapeutics, Sepracor, Shire, Solvay, Sunovion, Supernus Pharmaceuticals, Transcept, Validus, and Wyeth. Dr. Constantino receives royalties from Western Psychological Services for the commercial distribution of one of the metrics used in this study (Social Responsiveness Scale); no royalties were generated by any of the assessments performed in the present research.
Dr. Law designed and maintains the Interactive Autism Network registry. Drs. Frazier, Youngstrom, and Eng designed the present study. Dr. Frazier obtained funding to support analyses. Drs. Frazier, Youngstrom, Speer, Constantino, Findling, and Hardan, and Ms. Embacher supervised data interpretation of the study. Dr. Frazier and Ms. Embacher conducted data management and data analyses. All authors contributed to writing and revision.
Disclosure: Drs. Law, Speer, Eng, and Hardan, and Ms. Embacher report no biomedical financial interests or potential conflicts of interest.