

HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Int J Methods Psychiatr Res. Author manuscript; available in PMC 2010 September 14.
Published in final edited form as:
PMCID: PMC2938790

Attention deficit hyperactivity disorder: concordance of the adolescent version of the Composite International Diagnostic Interview Version 3.0 (CIDI) with the K-SADS in the US National Comorbidity Survey Replication Adolescent (NCS-A) supplement


This paper evaluates the internal consistency reliability and concurrent validity of the assessment of Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) attention deficit hyperactivity disorder (ADHD) in the adolescent version of the World Health Organization (WHO) Composite International Diagnostic Interview Version 3.0 (CIDI). The CIDI is a lay-administered diagnostic interview that was carried out in conjunction with the US National Comorbidity Survey Adolescent Supplement, a US nationally representative survey of 10,148 adolescents and their parents. Internal consistency reliability was evaluated using factor and item response theory analyses. Concurrent validity was evaluated against diagnoses based on blinded clinician-administered interviews. Inattention and hyperactivity-impulsivity items loaded on separate but correlated factors, with hyperactivity and impulsivity items forming a single factor in parent reports but separate factors in youth reports. We were able to differentiate hyperactivity and impulsivity factors for parents as well by eliminating a subset who endorsed zero ADHD items from the factor analysis. Although concurrent validity was relatively weak, decomposition showed that this was due to low validity of adolescent reports. A modified CIDI diagnosis based exclusively on parent reports generated a diagnosis that had good concordance with clinical diagnoses [area under the curve (AUC) = 0.78]. Implications for assessing ADHD using the CIDI and the effect of different informants on measurement are discussed.

Keywords: attention deficit hyperactivity disorder, WHO Composite International Diagnostic Interview (CIDI), validity, National Comorbidity Survey Replication Adolescent Supplement (NCS-A)


The accuracy of structured assessments of attention deficit hyperactivity disorder (ADHD) has been a topic of considerable interest because of concerns about the potential for both over-diagnosis and misdiagnosis of the disorder (American Academy of Pediatrics, 2000; LeFever et al., 1999; Sciutto and Eisenberg, 2007). In the Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSM-IV; American Psychiatric Association, 2000), ADHD is defined as a syndrome characterized by ongoing and atypical inattention or hyperactivity-impulsivity in at least two settings that begins prior to age seven. The DSM-IV presents three subtypes of ADHD: predominantly inattentive type (six or more symptoms of inattention), predominantly hyperactive-impulsive type (six or more symptoms of hyperactivity or impulsivity), and combined type (six or more symptoms of inattention and six or more symptoms of hyperactivity or impulsivity).
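The symptom-count rule behind the three subtypes can be sketched as follows. This is a minimal illustration of the Criterion A counts only: the function name is hypothetical, and the remaining DSM-IV requirements (onset before age seven, presence in at least two settings, impairment) are deliberately left out.

```python
def dsm4_adhd_subtype(n_inattention: int, n_hyperactive_impulsive: int,
                      threshold: int = 6) -> str:
    """Return the DSM-IV subtype implied by Criterion A symptom counts alone.

    Each count is the number of endorsed symptoms out of nine.
    Onset, setting, and impairment criteria are NOT checked here.
    """
    for count in (n_inattention, n_hyperactive_impulsive):
        if not 0 <= count <= 9:
            raise ValueError("each symptom count must be between 0 and 9")

    inattentive = n_inattention >= threshold
    hyperactive = n_hyperactive_impulsive >= threshold

    if inattentive and hyperactive:
        return "combined"
    if inattentive:
        return "predominantly inattentive"
    if hyperactive:
        return "predominantly hyperactive-impulsive"
    return "no subtype"


if __name__ == "__main__":
    print(dsm4_adhd_subtype(7, 2))  # predominantly inattentive
    print(dsm4_adhd_subtype(6, 6))  # combined
```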

Recent debates about ADHD diagnosis have focused on the accuracy of current measurement tools (Furman, 2005) and the interpretation of discrepancies between multiple informants (Hartung et al., 2005). Using a nationally-representative dataset of adolescents and their parents, we contribute to these discussions by evaluating the internal consistency reliability and validity of a new multi-informant measure of ADHD, the ADHD module of the adolescent version of the World Health Organization (WHO) Composite International Diagnostic Interview Version 3.0 (CIDI 3.0; Merikangas et al., 2009). Our analysis of the internal consistency of the CIDI symptom assessment addresses questions about the structure of ADHD that have been reflected in modifications of criteria with each revision of the DSM, where ADHD has been presented as consisting of between one and three dimensions (DuPaul et al., 1998; Glutting et al., 2005). Although the most recent revision of the DSM presents an empirically based two-factor model (Glutting et al., 2005; Gomez et al., 2003; Willcutt et al., 1999), there remain questions about whether ADHD may be better conceptualized as a three-factor model that separates impulsivity from hyperactivity (Amador-Campos et al., 2006).

We next turn to evaluating the concurrent validity of the CIDI ADHD module by comparing diagnoses based on the CIDI with independent diagnoses based on blinded clinical reappraisal interviews with the Schedule for Affective Disorders and Schizophrenia for School-Age Children – Present and Lifetime Version (K-SADS-PL; Kaufman et al., 1997). Several other clinical reappraisal studies of lay-administered ADHD interviews have found weak to moderate concordance of diagnoses with clinical re-interviews. A review of these studies found a wide range in the level of agreement between parent and/or child reports and clinician-assigned diagnoses (κ = 0.09 – 0.60) (Ezpeleta et al., 1997). One reason for this wide variation is that some structured assessments are based only on reports by a single informant while others are based on reports by both youth and parents or teachers. Adolescent and informant reports of ADHD are often discrepant (Rubio-Stipec et al., 1994). Methodological analysis generally finds that parent reports are more strongly associated with blinded clinical assessments of ADHD than are youth reports (Schwab-Stone et al., 1996), a pattern that is often attributed to underreporting by youth (Rohde et al., 1999). Based on this prior evidence, we consider validity separately for youth and parent reports.



The National Comorbidity Survey Replication Adolescent Supplement (NCS-A) is a US nationally-representative face-to-face survey of 10,148 adolescents ages 13–17 (Merikangas et al., 2009). The NCS-A interviews were carried out by the professional interview staff of the Institute for Social Research at the University of Michigan between April 2001 and April 2003. A dual-frame sample was used that included a household sub-sample and a school sub-sample. The household sub-sample consisted of 904 adolescent residents of the households that participated in the National Comorbidity Survey Replication (NCS-R), a nationally representative household survey of adults (Kessler and Merikangas, 2004). The school sub-sample consisted of 9244 adolescents who were students in a probability sample of schools (public and private, day and boarding, selected with probabilities proportional to size) in the same counties as the NCS-R sample.

We began respondent recruitment by sending an informational letter and Study Fact Brochure to the parents of each target respondent. The letter contained an 800 phone number for parents to call if they had questions not covered in the Study Fact Brochure or if they wanted to opt out. A study interviewer then visited the household a few days later to talk with a parent and answer remaining questions before obtaining parental written informed consent. Only after obtaining this consent did the interviewer approach the adolescent to obtain written informed assent. Each target respondent and parent was offered a $50 incentive for participation. These recruitment and consent procedures were approved by the Human Subjects Committees of both the University of Michigan and Harvard Medical School.

The conditional (on adult participation in the NCS-R) response rate of adolescent respondents completing a face-to-face NCS-A interview in the household sub-sample was 86.8%. The corresponding response rate of adolescent respondents in the school sub-sample was 82.6%. The overall weighted (by sample size) NCS-A adolescent response rate across the two sub-samples was 82.9%. In addition to the adolescent face-to-face interviews, one parent of each adolescent was asked to complete an informant self-administered questionnaire (SAQ). An effort was made to have the parent complete the SAQ while the interviewer was in the household interviewing the adolescent. If this was not possible, the interviewer left the SAQ and a self-addressed pre-stamped envelope with the parent for later self-administration and mail return. Postcard and telephone reminders were used when the parent did not return the SAQ. An attempt was made to administer a truncated version of the SAQ, which included the ADHD section, to parents who never returned the paper-and-pencil version. Parents of 8470 NCS-A adolescent respondents completed the SAQ either in paper-and-pencil self-administration (n = 6483) or in telephone administration (n = 1987) by the end of the study. Extensive efforts were made to obtain as much parent report data as possible on ADHD symptoms in adolescents. The data were weighted for within-household probability of selection (only in the household sub-sample) and for residual discrepancies on the basis of socio-demographic and geographic variables between the samples and the population distributions of US residents in the 13–17 age range from the 2000 Census. More details on NCS-A weighting are reported elsewhere (Kessler et al., 2009a; Kessler et al., 2009b).

The NCS-A clinical reappraisal study was carried out in a quota sample of 321 adolescent–parent pairs (described in Kessler et al., 2009c). The sample was confined to adolescents residing in households with telephones because the K-SADS clinical reappraisal interviews were administered by telephone. Telephone administration is now widely accepted in clinical reappraisal studies based on evidence of comparable validity to in-person administration (Kendler et al., 1992; Rohde et al., 1997; Sobin et al., 1993). A great advantage of telephone administration is that a centralized and closely supervised clinical interview staff can carry out the interviews throughout the entire sample area without the geographic restriction that is typically required for face-to-face clinical assessment. A disadvantage is that the small part of the population without telephones cannot be included in clinical calibration studies when interviews are done by telephone.

Respondents who met DSM-IV/CIDI criteria for one or more relatively uncommon disorders (e.g. agoraphobia, bipolar disorder, panic disorder, substance dependence with abuse) were sampled at a higher rate than respondents in a second sampling stratum who met criteria only for more common disorders. The lowest sampling fraction was for a third stratum made up of respondents who did not meet criteria for any lifetime DSM-IV/CIDI disorder. Each respondent in the clinical reappraisal study was given a $50 incentive for participation (over and above the $50 incentive for participation in the main survey).
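Because the three strata were sampled at different rates, unweighted estimates from the reappraisal sample would over-represent respondents with uncommon disorders. A minimal sketch of the standard inverse-probability correction is given below; the sampling fractions in the example are hypothetical (the actual fractions are not reported here), and the sketch omits the post-stratification step.

```python
def weighted_prevalence(records):
    """Estimate prevalence from a sample drawn with unequal sampling fractions.

    `records` is an iterable of (is_case, sampling_fraction) pairs, where
    sampling_fraction is the probability that a respondent in that stratum
    was selected. Each respondent is weighted by 1 / sampling_fraction.
    """
    weighted_cases = 0.0
    weighted_total = 0.0
    for is_case, fraction in records:
        if not 0.0 < fraction <= 1.0:
            raise ValueError("sampling fraction must be in (0, 1]")
        weight = 1.0 / fraction
        weighted_cases += weight * bool(is_case)
        weighted_total += weight
    if weighted_total == 0.0:
        raise ValueError("no records supplied")
    return weighted_cases / weighted_total


if __name__ == "__main__":
    # Hypothetical fractions: 0.5 for the oversampled stratum, 0.1 for the
    # stratum with no disorder.
    sample = [(True, 0.5), (False, 0.5), (False, 0.1)]
    print(round(weighted_prevalence(sample), 4))  # 0.1429
```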


The CIDI is a fully-structured diagnostic interview administered by trained lay interviewers to assess a wide range of DSM-IV and International Classification of Diseases (ICD-10) disorders. The adult version of the CIDI was developed for use in community epidemiological surveys (Kessler and Ustun, 2004). The adolescent version introduced some modifications in wording aimed at increasing the relevance of questions to adolescents. After a series of warm-up questions, the CIDI administers screening questions made up of diagnostic stem questions for a wide range of disorders. Positive responses are then probed in subsequent CIDI sections. The ADHD screening questions ask about a history of concentration problems prior to the age of seven that lasted a minimum of six months and that, in retrospect, seemed excessive compared to peers. A second screening question assesses a history of hyperactivity-impulsivity present before the age of seven that lasted a minimum of six months.

Adolescents who endorse the first screening question are subsequently entered into the ADHD section of the CIDI and asked retrospective questions about inattentiveness in childhood that correspond to each of the nine DSM-IV TR Criterion A symptoms of the inattentive type of ADHD. A similar set of nine questions that correspond to the nine Criterion A symptoms of the hyperactive-impulsive type of ADHD are administered to respondents who endorse the CIDI screening question about childhood hyperactivity-impulsivity. Respondents are skipped out of each question sequence when they either endorse six questions or would not have a total of six even if they endorsed all remaining questions. Adolescents who endorse four or more of the nine questions in a given section are asked follow-up questions about role impairment associated with these symptoms. Subsequent questions in each series ask about age at onset and persistence of symptoms. Parents are administered an informant version of the same nine CIDI questions about Criterion A symptoms of inattentiveness and nine questions about hyperactivity-impulsivity. Parents who endorse any of these questions are asked about role impairment.
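The skip-out rule for each nine-question series can be sketched as below. This is an illustration of the logic as described, not the CIDI's actual program; the function name and the list-of-booleans input are assumptions made for the example.

```python
def administer_symptom_series(answers, needed=6):
    """Apply the skip-out rule to one symptom series.

    `answers` holds the responses (True = endorsed) the respondent would give
    to each question, in order. Questioning stops as soon as `needed`
    questions are endorsed, or as soon as `needed` can no longer be reached
    even if every remaining question were endorsed.

    Returns (number endorsed, number of questions actually asked).
    """
    total = len(answers)
    endorsed = 0
    asked = 0
    for answer in answers:
        remaining = total - asked
        if endorsed >= needed or endorsed + remaining < needed:
            break
        asked += 1
        if answer:
            endorsed += 1
    return endorsed, asked


if __name__ == "__main__":
    print(administer_symptom_series([True] * 9))   # (6, 6)
    print(administer_symptom_series([False] * 9))  # (0, 4)
```

One consequence of this design is that the recorded symptom count is capped at six, so counts from the adolescent series cannot distinguish respondents with six symptoms from those with more.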

In the NCS-A, a dichotomous (yes-no) diagnostic classification of DSM-IV ADHD was generated from the CIDI adolescent and parent reports using an ‘or’ rule to count each symptom as present if it was endorsed by either the adolescent or the parent. In addition, independent diagnoses were generated separately based only on adolescent and parent reports. A series of exploratory analyses was also carried out to investigate the extent to which concordance of diagnoses based on CIDI reports with diagnoses based on blinded clinical reappraisal interviews could be improved by modifying the CIDI diagnostic classification rules in various ways described later in the paper.
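A minimal sketch of the symptom-level 'or' rule follows. The function name is hypothetical, and symptoms that were never asked of an informant (e.g. because of a negative screen) are assumed here to be coded as not endorsed.

```python
def combine_or_rule(adolescent, parent):
    """Count a symptom as present if either informant endorsed it.

    Both arguments are equal-length sequences of booleans, one entry per
    Criterion A symptom, in the same symptom order.
    """
    if len(adolescent) != len(parent):
        raise ValueError("informant reports must cover the same symptoms")
    return [bool(a) or bool(p) for a, p in zip(adolescent, parent)]


if __name__ == "__main__":
    # Each informant endorses three different symptoms out of nine.
    youth = [True, True, True, False, False, False, False, False, False]
    par = [False, False, False, True, True, True, False, False, False]
    combined = combine_or_rule(youth, par)
    print(sum(youth), sum(par), sum(combined))  # 3 3 6
```

The example shows why a composite diagnosis can be positive when neither informant alone reaches the six-symptom threshold, so a composite prevalence can never be lower than either single-informant prevalence under this rule.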

The K-SADS-PL is a semi-structured clinician-administered research diagnostic interview designed to assess a range of child and adolescent DSM-IV disorders. The K-SADS interviewers in the NCS-A were experienced clinical interviewers who received 40 hours of training in the K-SADS from one of the developers of the K-SADS and were required to pass a certification test before beginning production interviewing. Completed clinical interviews were audio-taped and closely monitored for quality control. In addition, clinical interviewers attended bi-weekly quality control monitoring meetings to prevent interviewer drift. The K-SADS interviews were administered an average of two and a half months after the CIDI interviews (with the majority between one and a half and four months after CIDI interviews). This relatively long lag time between CIDI and K-SADS interviews was scheduled purposefully to be longer than the roughly two weeks between interviews that is more typical in clinical reappraisal studies in order to reduce the negative response set that often occurs in re-interviews due to respondent burden. In the case of ADHD, where age of onset is required to be in childhood to meet diagnostic criteria, this long lag time is unlikely to have had a meaningful effect on true lifetime prevalence.

A single clinician interviewed both the adolescent and the parent, in that order. A diagnosis based only on the information provided by the single respondent was generated after each interview. A final diagnosis was then made, taking into consideration the information provided by both respondents. For the purposes of this study, the final clinician-rated diagnoses that included information obtained from both respondents are considered the gold standard. Ninety of the original 321 K-SADS cases were interviewed a second time with the K-SADS following a review of the quality of overall clinical interviews. A small number of clinical diagnoses were changed based on this process. In the very few cases where interviewer quality concerns existed but it was not possible to carry out a second K-SADS interview, diagnoses were changed by random imputation at rates estimated from the 90 re-interviewed cases (Rubin, 1981).

Analysis methods

Analysis of internal consistency of the ADHD symptom reports was carried out in the sample of 8470 adolescent–parent dyads that completed the CIDI. Tetrachoric factor analysis with Promax rotation was used to establish the dimensionality of the symptom reports. Item response theory (IRT) models (Hambleton et al., 1991) were then used to evaluate the implicit assumption in the DSM-IV that each Criterion A symptom of ADHD has the same association as the others with true ADHD.

As there was a high percentage of respondents who did not endorse any ADHD items, in addition to the conventional two-parameter IRT model, we estimated a mixture model that is specifically designed for this situation and as a result tends to yield better fit (Finkelman et al., submitted for publication). The mixture model conceptualized respondents as falling into two mutually exclusive categories, the first consisting of respondents conceptualized as not being in the ADHD spectrum and the second consisting of respondents in the ADHD spectrum whose responses (some of which, like those of respondents not in the ADHD spectrum, consisted entirely of negative responses) could be described adequately by the normally distributed latent liability assumed in the IRT model. The theoretical distinction between respondents not in the ADHD spectrum and respondents in the spectrum who endorsed no items was that the latter were assumed to be people who would have displayed at least some evidence of sub-threshold ADHD symptoms in a more in-depth question series. Although we could not tell which specific respondents were not in the ADHD spectrum, we were able to estimate the percentage of people in that first category derived under the mixture model using the EM algorithm (Dempster et al., 1977), while new IRT slopes and thresholds were calculated in the second class. A second factor analysis was also carried out in the sub-sample of respondents in the second mixture model category.
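One way to write down a two-class model of this kind is sketched below. The notation is ours rather than the authors': π denotes the proportion of respondents outside the spectrum, a_j and b_j the slope and threshold of item j, and φ the standard normal density. The logistic form of the item response function is an assumption, since the text specifies only a two-parameter IRT model.

```latex
% Two-parameter item response function for item j (logistic form assumed)
P_j(\theta) = \frac{1}{1 + \exp\{-a_j(\theta - b_j)\}}

% Marginal probability of a response pattern x = (x_1, ..., x_J), x_j \in \{0, 1\}:
% class 1 (outside the spectrum) can only produce the all-zero pattern;
% class 2 follows the 2PL model with a standard normal latent liability.
\Pr(\mathbf{x}) = \pi \, \mathbb{1}[\mathbf{x} = \mathbf{0}]
  + (1 - \pi) \int_{-\infty}^{\infty}
    \prod_{j=1}^{J} P_j(\theta)^{x_j} \, [1 - P_j(\theta)]^{1 - x_j}
    \, \phi(\theta) \, d\theta
```

Under this formulation an all-negative response pattern contributes to both terms, which is why class membership cannot be determined for individual respondents even though π can be estimated.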

A comparison of lifetime DSM-IV diagnoses of ADHD based on the CIDI and on the K-SADS was made in the full clinical reappraisal sample after weighting this sample to adjust for the over-sampling of CIDI cases and post-stratifying for small residual discrepancies between the weighted clinical reappraisal sample and the full weighted NCS-A sample on a wide range of matching variables. Diagnoses of DSM-IV ADHD based on adolescent reports, parent reports, and combined reports were compared with diagnoses based on the K-SADS at both the aggregate and individual levels. At the aggregate level, we investigated whether prevalence estimates based on the CIDI are similar to those based on the K-SADS using McNemar χ2 tests that take into account unequal sampling weights. At the individual level, we estimated CIDI sensitivity (SN; the percent of K-SADS positives detected by the CIDI), specificity (SP; the percent of K-SADS negatives classified as negative by the CIDI), positive predictive value (PPV; the percent of CIDI positives classified as positive by the K-SADS), and negative predictive value (NPV; the percent of CIDI negatives classified as negative by the K-SADS). We also calculated two measures of overall concordance between CIDI and K-SADS diagnoses: Cohen’s κ (Cohen, 1960) and the area under the receiver operating characteristic curve (AUC; Hanley and McNeil, 1982). Cohen’s κ is reported because it is the most widely used measure of diagnostic concordance, but κ has the undesirable characteristic of being influenced by marginal distributions. AUC is also reported because it is an alternative measure of diagnostic concordance that is not influenced by marginal distributions (Kraemer et al., 2003). All analyses of concordance were conducted using SAS 9.0 and SUDAAN 9.0.1 software programs (SAS Institute, 2002; Research Triangle Institute, 2005).
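The individual-level measures defined above can all be computed from a 2 × 2 cross-classification of CIDI by K-SADS diagnoses. The sketch below is a plain illustration and not the SAS/SUDAAN code used in the study; the cell counts in the example are hypothetical. Passing weighted cell totals instead of raw counts yields weighted point estimates, but the sketch does not produce design-based standard errors. For a dichotomous test, the AUC reduces to the mean of sensitivity and specificity.

```python
def concordance(tp, fp, fn, tn):
    """Concordance of a dichotomous test (CIDI) with a gold standard (K-SADS).

    tp: CIDI positive, K-SADS positive    fp: CIDI positive, K-SADS negative
    fn: CIDI negative, K-SADS positive    tn: CIDI negative, K-SADS negative
    """
    n = tp + fp + fn + tn
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("every row and column total must be non-zero")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)

    # AUC of a single-cut-point (binary) classifier.
    auc = (sensitivity + specificity) / 2

    return {"SN": sensitivity, "SP": specificity, "PPV": ppv,
            "NPV": npv, "kappa": kappa, "AUC": auc}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in concordance(tp=20, fp=10, fn=5, tn=65).items():
        print(f"{name}: {value:.3f}")
```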


Internal consistency reliability

Tetrachoric factor analyses were calculated separately for inattention (AD) and hyperactivity-impulsivity (HD) criteria using the symptom reports of all adolescents and parents who screened into the AD and HD modules. For AD, results indicated a three-factor solution (unrotated eigenvalues: 8.01, 4.18, 1.23, 0.86), with the three rotated (promax) factors corresponding to one factor for parent reports of AD and two factors for youth reports (Table 1). The existence of two adolescent factors rather than one might indicate that distractibility is somewhat distinct from executive function problems. However, it is difficult to interpret the two factors in any clear conceptual way. As a result, we collapsed all items from these two factors into a single youth factor for the subsequent IRT analysis. The factor analysis of HD symptoms also indicated a three-factor solution (unrotated eigenvalues: 8.44, 3.33, 1.55, 0.72). The rotated (promax) factor solution corresponded to a parent HD factor and two factors for adolescents that separated hyperactivity from impulsivity, with the exception of one DSM hyperactivity item (often talks excessively) that loaded on the impulsivity factor.
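The text does not state which rule was used to decide the number of factors. The reported eigenvalues are, however, consistent with the common eigenvalue-greater-than-one heuristic, as the short sketch below shows. Treating that heuristic as the retention rule is our assumption, not a claim about the authors' procedure.

```python
def n_factors_eigenvalue_rule(eigenvalues, cutoff=1.0):
    """Count the eigenvalues that exceed `cutoff` (the 'greater than one' heuristic)."""
    return sum(1 for value in eigenvalues if value > cutoff)


if __name__ == "__main__":
    ad_eigenvalues = [8.01, 4.18, 1.23, 0.86]  # inattention, as reported
    hd_eigenvalues = [8.44, 3.33, 1.55, 0.72]  # hyperactivity-impulsivity, as reported
    print(n_factors_eigenvalue_rule(ad_eigenvalues))  # 3
    print(n_factors_eigenvalue_rule(hd_eigenvalues))  # 3
```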

Table 1
Rotated (promax) tetrachoric factor analysis (standardized regression coefficients) of parent and youth symptom reports separately for symptoms of inattention and hyperactivity-impulsivity (n = 8470)

One-parameter (1PL) and two-parameter (2PL) IRT models were estimated for each of the two informants (adolescent and parent) on each of the two dimensions (AD and HD) (Table 2). Pearson chi-square statistics were calculated for the 1PL and 2PL models, comparing expected and observed outcomes. For both informants on the AD criteria and parents on the HD criteria, the 2PL model was a significantly better fit than the 1PL model. For adolescents on the HD criteria, the 1PL model was a significantly better fit than the 2PL model. Focusing first on the adolescent data, slopes for both the AD and HD factors are moderate (0.80–1.14 for AD and 0.91 for HD), indicating that none of the items is a strong indicator of the underlying dimension. (A slope of at least 1.0 is usually defined as the lower bound for an item that has good precision at its threshold on the underlying scale.) Thresholds were for the most part within one-third (±) of a standard deviation of the mean, indicating that most of the information in the scales is in a part of the severity distribution that is well below the clinical threshold. The conjunction of low slopes and sub-clinical thresholds indicates that the scale is not highly sensitive or specific in discriminating clinical cases from non-cases.

Table 2
IRT model item parameters for adolescent and parent CIDI inattention and hyperactivity-impulsivity items

Slopes were considerably higher in the parent data for both AD and HD factors (1.83–3.33 for AD and 1.34–3.39 for HD), indicating that the items have excellent precision at their thresholds. It is noteworthy that the existence of significant slope differences across items for both AD and HD means that optimal scaling would weight items differentially to arrive at an estimate of underlying scale scores. This is different from the stipulation in the DSM that each Criterion A symptom of AD and of HD contributes equally to a diagnosis. Like the slopes, the thresholds of the parent items were a good deal higher than in the youth data (0.81–1.24 for AD and 0.98–1.41 for HD), indicating that the parent scales have much better precision than the youth scales.
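The link between slope and precision can be made concrete with the two-parameter item response function. The sketch below assumes the logistic form without a scaling constant, which may differ from the metric used for the reported estimates, so the numbers are illustrative only. Under this form, an item's information peaks at its threshold with value a²/4, so precision rises with the square of the slope.

```python
import math


def p_endorse(theta, slope, threshold):
    """Probability of endorsing an item under a logistic 2PL model."""
    return 1.0 / (1.0 + math.exp(-slope * (theta - threshold)))


def item_information(theta, slope, threshold):
    """Fisher information of a 2PL item at latent severity `theta`."""
    p = p_endorse(theta, slope, threshold)
    return slope ** 2 * p * (1.0 - p)


if __name__ == "__main__":
    # Slopes taken from the ranges reported in the text; thresholds are
    # set to 0.0 purely for illustration.
    for label, slope in [("adolescent HD", 0.91), ("parent AD (max)", 3.33)]:
        peak = item_information(theta=0.0, slope=slope, threshold=0.0)
        print(f"{label}: slope={slope}, peak information={peak:.3f}")
```

With these inputs, the peak information of the steepest parent item is more than ten times that of an adolescent HD item.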

The fact that a high proportion of respondents endorsed none of the ADHD symptom questions raises the possibility that the IRT assumption of a normally distributed latent liability might be violated. Based on this concern, we fitted separate two-class IRT mixture models for the adolescent and parent HD and AD data, where one class was stipulated to consist exclusively of respondents outside of the AD or HD spectrum; that is, to have no risk of reporting any ADHD symptoms. The other class was assumed to consist of respondents in the ADHD spectrum. Respondents in the latter class were assumed to have a normally distributed latent liability (including some proportion that would be expected to endorse none of the CIDI symptom questions). Relatively small proportions of adolescent respondents who completed the symptom questions were estimated in this model to be outside the spectrum for AD (8.3%) or HD (3.7%), while much larger proportions were estimated to be outside the spectrum for the parent AD (50.4%) and HD (54.5%) scales. This substantial difference between adolescents and parents is presumably due to the fact, noted earlier, that screening questions were used in the assessment of adolescents but not parents.

The two-class mixture model was a better fit than the standard IRT model for both adolescent and parent AD and HD dimensions (based on Pearson χ2 tests comparing expected and observed values for 2PL and 2PL-mixture models). Eliminating respondents not in the spectrum from the database, we replicated the factor analyses for AD and HD and again identified a three-factor solution for AD (unrotated eigenvalues: 8.44, 3.33, 1.55, 0.72) and a four-factor solution for HD (unrotated eigenvalues 6.67, 3.44, 1.78, 1.16, 0.77). The rotated (promax) factor solution for AD had a factor pattern very similar to the one in the original factor analysis; one factor included all parent reports and two factors included adolescent reports, where we could find no meaningful interpretation of the distinction between items in the two adolescent factors. (Detailed results are not reported, but are available on request.) The rotated (promax) solution for HD, in comparison, was different from the original solution in that it differentiated symptoms of hyperactivity from symptoms of impulsivity both in parent reports and in adolescent reports (Table 3). Adolescent and parent primary factor loadings were very similar, with the exception of the impulsivity item ‘difficulty waiting turn,’ which loaded on the impulsivity factor in the adolescent data, but the hyperactivity factor in the parent data.

Table 3
Rotated (promax) tetrachoric factor analysis (standardized regression coefficients) for hyperactivity-impulsivity symptoms based on CIDI symptom reports in adolescent–parent pairs where both respondents were classified as being in the HD spectrum ...

Concordance of CIDI symptom reports with clinician ratings

Adolescent responses to the CIDI questions about Criterion A symptoms of AD generally underestimated K-SADS prevalence (Table 4). This was due to sensitivity being uniformly low (16.4–35.1%; i.e. a low proportion of adolescents classified by the K-SADS as having a history of the symptom reporting the symptom in the CIDI). Specificity, in comparison, was generally quite good (94.5–97.9%; i.e. a very high proportion of adolescents classified by the K-SADS as not having a history of the symptom denying the symptom in the CIDI). Concordance of adolescent symptom reports with K-SADS estimates was higher, in comparison, for Criterion A symptoms of HD, but this was as much because specificity decreased (89.3–96.5%) as because sensitivity increased (21.4–42.9%) (Table 4). Parent reports generally overestimated the prevalence of K-SADS Criterion A symptoms of both AD and HD (Table 4). This occurred because of both higher sensitivity and lower specificity than in adolescent reports. Similar patterns were found when we generated criterion-level estimates from the symptom reports, including Criterion A1 (six or more symptoms of inattention), Criterion A2 (six or more symptoms of hyperactivity-impulsivity), Criterion B (some symptoms present before the age of seven), and Criterion C (clinically significant impairment) (Table 5). CIDI ratings based on adolescent reports overestimated the prevalence of Criterion A2, but underestimated the other criteria. CIDI ratings based on parent reports, in comparison, overestimated the clinical K-SADS diagnosis for all of these criteria except impairment.

Table 4
Concordance (sensitivity and specificity) of CIDI/DSM-IV ADHD Criterion A inattention and hyperactivity-impulsivity symptoms with blinded K-SADS ratings in the NCS-A clinical reappraisal sample (n = 321)
Table 5
Concordance (sensitivity and specificity) of criterion-level assessments of DSM-IV ADHD based on the CIDI with those based on blinded K-SADS interviews in the NCS-A clinical reappraisal sample (n = 321)

The CIDI diagnostic algorithms

Consistent with the results reported in the last section, diagnoses based on adolescent CIDI reports substantially underestimated the prevalence of DSM-IV/K-SADS ADHD (3.0% versus 7.8%, χ2 = 9.4, p < 0.05). Individual-level concordance was also poor (κ = 0.19, AUC = 0.57) (Table 6). Diagnoses based on parent reports, in comparison, were much more consistent with DSM-IV/K-SADS prevalence (8.0% versus 7.8%, χ2 = 0.2, p = 0.68), and individual-level concordance was much better than for adolescent reports (κ = 0.41, AUC = 0.71). Diagnostic estimates based on composite (i.e. adolescent–parent) CIDI reports were inflated (11.9% versus 7.8%, χ2 = 5.9, p < 0.05), but had better concordance with K-SADS diagnoses than those based on parent report alone (κ = 0.43, AUC = 0.77).
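The aggregate-level χ2 statistics compare the two prevalence estimates through the discordant cells of the CIDI by K-SADS table. The sketch below shows the ordinary unweighted McNemar test; the study used a version that accounts for unequal sampling weights, so this simplified form will not reproduce the reported values. The counts in the example are hypothetical.

```python
import math


def mcnemar_test(cidi_only, ksads_only):
    """Unweighted McNemar test (no continuity correction).

    cidi_only:  respondents positive on the CIDI but negative on the K-SADS
    ksads_only: respondents negative on the CIDI but positive on the K-SADS

    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    discordant = cidi_only + ksads_only
    if discordant == 0:
        return 0.0, 1.0
    chi2 = (cidi_only - ksads_only) ** 2 / discordant
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    chi2, p = mcnemar_test(cidi_only=15, ksads_only=5)
    print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

A non-significant result indicates only that the two prevalence estimates are similar in aggregate; it says nothing about whether the same individuals are classified as cases, which is why κ and AUC are reported alongside it.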

Table 6
Diagnostic concordance of the original and modified DSM-IV ADHD diagnoses based on the CIDI with diagnoses based on the composite K-SADS in the NCS-A clinical reappraisal sample (n = 321)

We explored several options for modifying CIDI ratings to improve concordance between diagnoses based on the CIDI and the K-SADS. First, we considered the possibility of modifying the symptom thresholds in the CIDI Criterion A symptom reports, but this did not improve concordance. Second, we considered the possibility of using predictive logistic regression analysis to improve concordance of CIDI symptom reports with K-SADS diagnoses. These analyses showed clearly that CIDI parent reports were significant predictors of K-SADS diagnoses while CIDI adolescent reports were not, after controlling for parent reports. Third, based on this result, we focused subsequent analyses on parent reports and considered ways in which we might improve concordance with K-SADS diagnoses by cross-classifying Criterion A1 and A2 CIDI reports and selecting a higher diagnostic threshold than the DSM-IV six of nine A1 or A2 symptoms to address the fact that CIDI parent reports over-diagnose ADHD. In addition, we explored the effects of eliminating Criterion B from the diagnostic algorithm, as parents significantly overestimated this criterion, and the effects of modifying the Criterion C measure of impairment because of its low sensitivity.

The scoring rule that best predicted K-SADS diagnoses required 10 or more endorsed symptoms out of the 18 in the A1 and A2 series in conjunction with CIDI Criterion C modified to require ‘a lot’ of interference in at least one area of role functioning. We could not improve concordance by setting separate thresholds for CIDI A1 and A2 symptom counts, including Criterion B, or introducing additional information from the CIDI adolescent reports. The new algorithm yielded CIDI prevalence estimates of DSM-IV ADHD that did not differ significantly from diagnostic estimates based on the K-SADS (χ2 < 0.01, p = 1.00). This is a considerable improvement over the original CIDI diagnoses.
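The modified scoring rule can be sketched as follows. The function and argument names are hypothetical, as are the role-area labels and the rating labels other than 'a lot'. If the instrument includes response categories more severe than 'a lot', those would need to be added to the qualifying set.

```python
def modified_parent_diagnosis(parent_symptoms, interference_by_area,
                              min_symptoms=10, qualifying=("a lot",)):
    """Apply the modified parent-report rule described in the text.

    parent_symptoms:      18 booleans (nine A1 items followed by nine A2 items)
    interference_by_area: mapping of role area -> parent's interference rating
    A diagnosis requires at least `min_symptoms` endorsed symptoms across the
    pooled A1 and A2 series AND a qualifying rating in at least one role area.
    """
    if len(parent_symptoms) != 18:
        raise ValueError("expected 18 Criterion A symptom reports")
    enough_symptoms = sum(bool(s) for s in parent_symptoms) >= min_symptoms
    impaired = any(rating in qualifying
                   for rating in interference_by_area.values())
    return enough_symptoms and impaired


if __name__ == "__main__":
    symptoms = [True] * 10 + [False] * 8
    # Hypothetical role areas and ratings.
    ratings = {"school": "a lot", "home": "some", "friends": "not at all"}
    print(modified_parent_diagnosis(symptoms, ratings))  # True
```

Note that the rule pools the two symptom series, so a parent report with, say, seven inattention symptoms and two hyperactivity-impulsivity symptoms would not qualify even though it meets the DSM-IV six-of-nine threshold for Criterion A1.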


As described elsewhere (Kessler et al., 2009c) there were several limitations in the design of this clinical reappraisal study that may impact results. These limitations include the telephone administration of the K-SADS (in contrast to the face-to-face CIDI administration); however, as mentioned earlier, there is evidence that telephone interviews are a valid method for clinical assessment (Rohde et al., 1997). Second, we did not have any validating information from other sources (e.g. teachers, school records, behavioral observations) against which to compare the youth and parent reports in our sample. Without external validating information, we cannot relate often conflicting reports from adolescents and parents to behaviors observed by others outside the parent–child dyad. In the case of ADHD, the absence of data from school personnel is a particularly noteworthy limitation, as information from teachers is commonly obtained in research diagnoses of ADHD as a way to assess the presence of symptoms in multiple contexts and to corroborate reports from parents and adolescents (Collett et al., 2003; Pliszka, 2007).

Consistent with the DSM-IV, our factor analyses found separate, but correlated, factors for AD and HD symptoms of ADHD, although a more refined analysis that excluded respondents classified as not being in the ADHD spectrum found distinct factors for hyperactivity and impulsivity. Concordance of CIDI reports with blinded K-SADS diagnoses showed that parents were much more accurate reporters than adolescents. This is consistent with previous studies of ADHD measurement validity (Pelham et al., 2005). However, parents overestimated the vast majority of ADHD symptoms compared to blind clinician ratings. Recalibration of CIDI parent reports corrected for high prevalence rates based on parent report alone. The modified CIDI diagnostic classification yielded prevalences that did not significantly differ from K-SADS classifications and had good individual-level concordance with K-SADS diagnoses.

These results underscore the complexity of ADHD diagnostic assignment and illustrate the critical nature of multi-informant assessments for this disorder (Hoff et al., 2002). Future attempts to improve the CIDI should explore the possibility of increasing the severity of parent symptom questions to reduce the current excessive rate of symptom reports. A challenge in doing this will be that Criterion A symptom-level sensitivity was seldom greater than 50% in the NCS-A, which means that false negatives might be increased beyond an acceptable level by increasing the severity of these questions. Clearly, though, the clinical interviewers elicited additional information from parents that allowed the other cases to be detected, so the goal of revising the CIDI parent assessment should be to develop questions that capture as much of this information as possible in a fully-structured format.

It is less clear that improvements can be made in adolescent symptom reports, as positive responses to the CIDI adolescent Criterion A symptom questions were not strongly related to K-SADS ratings. Indeed, the great majority of adolescents judged by clinical interviewers to have a history of a given Criterion A symptom failed to endorse that symptom in the CIDI. When combined with the fact that a small minority of respondents classified by the K-SADS as not having a history of a particular Criterion A symptom endorsed the symptom in the CIDI, we are left with very poor item-level concordance. Pelham et al. (2005) reviewed previous methodological studies that consistently documented a similar pattern of low concordance between youth self-reports and clinician assessments of ADHD due to underreporting of symptoms and severity of impairments. This research showed that youth with ADHD often have limited self-awareness of their symptoms (Zucker et al., 2002), which means that there are fewer opportunities for eliciting more complete data with in-depth probing for adolescent than for parent reports. As a result, future improvements in CIDI adolescent assessment of ADHD might profit from using one of the brief computerized tests that have been found to be useful in assessing ADHD in neuropsychological settings and other epidemiological studies (Conners et al., 2003; Epstein et al., 2003), although this would be more useful in assessing active than remitted cases.


The National Comorbidity Survey Replication Adolescent Supplement (NCS-A) is supported by the National Institute of Mental Health (NIMH; U01-MH60220) with supplemental support from the National Institute on Drug Abuse (NIDA), the Substance Abuse and Mental Health Services Administration (SAMHSA), the Robert Wood Johnson Foundation (RWJF; Grant 044780), and the John W. Alden Trust. The views and opinions expressed in this report are those of the authors and should not be construed to represent the views of any of the sponsoring organizations, agencies, or US Government. A complete list of NCS-A publications can be found at Send correspondence to ncs@hcp.med.harvard.edu. The NCS-A is carried out in conjunction with the World Health Organization World Mental Health (WMH) Survey Initiative. We thank the staff of the WMH Data Collection and Data Analysis Coordination Centres for assistance with instrumentation, fieldwork, and consultation on data analysis. The WMH Data Coordination Centres have received support from NIMH (R01-MH070884, R13-MH066849, R01-MH069864, R01-MH077883), NIDA (R01-DA016558), the Fogarty International Center of the National Institutes of Health (FIRCA R03-TW006481), the John D. and Catherine T. MacArthur Foundation, the Pfizer Foundation, and the Pan American Health Organization. The WMH Data Coordination Centres have also received unrestricted educational grants from Astra Zeneca, BristolMyersSquibb, Eli Lilly and Company, GlaxoSmithKline, Ortho-McNeil, Pfizer, Sanofi-Aventis, and Wyeth. A complete list of WMH publications can be found at


Declaration of interest statement

Dr Kessler has been a consultant for GlaxoSmithKline Inc., Kaiser Permanente, Pfizer Inc., Sanofi-Aventis, Shire Pharmaceuticals, and Wyeth-Ayerst; has served on advisory boards for Eli Lilly & Company and Wyeth-Ayerst; and has had research support for his epidemiological studies from Bristol-Myers Squibb, Eli Lilly & Company, GlaxoSmithKline, Johnson & Johnson Pharmaceuticals, Ortho-McNeil Pharmaceuticals Inc., Pfizer Inc., and Sanofi-Aventis. The remaining authors have no competing interests.


  • Amador-Campos JA, Forns-Santacana M, Guàrdia-Olmos J, Peró-Cebollero M. DSM-IV Attention Deficit Hyperactivity Disorder symptoms: Agreement between informants in prevalence and factor structure at different ages. Journal of Psychopathology and Behavior Assessment. 2006;28:23–32. DOI: 10.1007/s10862-006-4538-x.
  • American Academy of Pediatrics. Clinical practice guideline: Diagnosis and evaluation of the child with attention-deficit/hyperactivity disorder. Pediatrics. 2000;105:1158–1170.
  • American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. Fourth Edition, Text Revision. American Psychiatric Association; 2000.
  • Cohen J. A coefficient of agreement for nominal scales. Educational and Psychological Measurement. 1960;20:37–46.
  • Collett BR, Ohan JL, Myers KM. Ten-year review of rating scales. V: Scales assessing attention-deficit/hyperactivity disorder. Journal of the American Academy of Child and Adolescent Psychiatry. 2003;42:1015–1037. DOI: 10.1097/01.chi.0000081821.25107.e7.
  • Conners CK, Epstein JN, Angold A, Klaric J. Continuous performance test performance in a normative epidemiological sample. Journal of Abnormal Child Psychology. 2003;31:555–562. DOI: 10.1023/A:1025457300409.
  • Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. 1977;39:1–38.
  • DuPaul GJ, Anastopoulos AD, Power TJ, Reid R, Ikeda MJ, McGoey KE. Parent ratings of attention-deficit/hyperactivity disorder symptoms: Factor structure and normative data. Journal of Psychopathology and Behavior Assessment. 1998;20:83–102.
  • Epstein JN, Erkanli A, Conners CK, Klaric J, Costello JE, Angold A. Relations between Continuous Performance Test performance measures and ADHD behaviors. Journal of Abnormal Child Psychology. 2003;31:543–554. DOI: 10.1023/A:1025405216339.
  • Ezpeleta L, de la Osa N, Domenech JM, Navarro JB, Losilla JM. Diagnostic agreement between clinicians and the Diagnostic Interview for Children and Adolescents – DICA-R – in an outpatient sample. Journal of Child Psychology and Psychiatry. 1997;38:431–440. DOI: 10.1111/j.1469-7610.1997.tb01528.x.
  • Finkelman M, Green JG, Gruber M, Zaslavsky AM. A zero- and k-inflated mixture model for health questionnaire data. (Submitted for publication)
  • Furman L. What is attention-deficit hyperactivity disorder (ADHD)? Journal of Child Neurology. 2005;20:994–1002.
  • Glutting JJ, Youngstrom EA, Watkins MW. ADHD and college students: Exploratory and confirmatory factor structures with student and parent data. Psychological Assessment. 2005;17:44–55. DOI: 10.1037/1040-3590.17.1.44.
  • Gomez R, Burns GL, Walsh JA, de Moura MA. A multitrait-multisource confirmatory factor analytic approach to the construct validity of ADHD rating scales. Psychological Assessment. 2003;15:3–16. DOI: 10.1037/1040-3590.15.1.3.
  • Hambleton RK, Swaminathan H, Rogers HJ. Fundamentals of Item Response Theory. Sage Publications; 1991.
  • Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29–36.
  • Hartung CM, McCarthy DM, Milich R, Martin CA. Parent–adolescent agreement on disruptive behavior symptoms: A multitrait-multimethod model. Journal of Psychopathology and Behavior Assessment. 2005;27:159–168.
  • Hoff KE, Doepke K, Landau S. Best practices in the assessment of children with Attention Deficit/Hyperactivity Disorder: Linking assessment to intervention. In: Thomas A, Grimes J, editors. Best Practices in School Psychology IV. Volume 2. National Association of School Psychologists; 2002. pp. 1129–1150.
  • Kaufman J, Birmaher B, Brent D, Rao U, Flynn C, Moreci P, Williamson D, Ryan N. Schedule for Affective Disorders and Schizophrenia for School-Age Children-Present and Lifetime Version (K-SADS-PL): Initial reliability and validity data. Journal of the American Academy of Child and Adolescent Psychiatry. 1997;36:980–988.
  • Kendler KS, Neale MC, Kessler RC, Heath AC, Eaves LJ. A population-based twin study of major depression in women. The impact of varying definitions of illness. Archives of General Psychiatry. 1992;49:257–266.
  • Kessler RC, Avenevoli S, Costello EJ, Green JG, Gruber MJ, Heeringa S, Merikangas KR, Pennell B-E, Sampson NA, Zaslavsky AM. National Comorbidity Survey Replication Adolescent Supplement: II. Overview and design. Journal of the American Academy of Child and Adolescent Psychiatry. 2009a.
  • Kessler RC, Avenevoli S, Costello EJ, Green JG, Gruber MJ, Heeringa S, Merikangas KR, Pennell B-E, Sampson NA, Zaslavsky AM. Design and field procedures in the US National Comorbidity Survey Replication Adolescent Supplement (NCS-A). International Journal of Methods in Psychiatric Research. 2009b;18:69–83.
  • Kessler RC, Avenevoli S, Green J, Gruber MJ, Guyer M, He Y, Jin R, Kaufman J, Sampson NA, Zaslavsky AM, Merikangas KR. National Comorbidity Survey Replication Adolescent Supplement: III. Concordance of DSM-IV/CIDI Diagnoses with clinical reassessments. Journal of the American Academy of Child and Adolescent Psychiatry. 2009c;48:386–399.
  • Kessler RC, Merikangas KR. The National Comorbidity Survey Replication (NCS-R): Background and aims. International Journal of Methods in Psychiatric Research. 2004;13:60–68.
  • Kessler RC, Ustun TB. The World Mental Health (WMH) Survey Initiative Version of the World Health Organization (WHO) Composite International Diagnostic Interview (CIDI). International Journal of Methods in Psychiatric Research. 2004;13:93–121.
  • Kraemer HC, Morgan GA, Leech NL, Gliner JA, Vaske JJ, Harmon RJ. Measures of clinical significance. Journal of the American Academy of Child and Adolescent Psychiatry. 2003;42:1524–1529. DOI: 10.1097/01.chi.0000091507.46853.d1.
  • LeFever GB, Dawson KV, Morrow AL. The extent of drug therapy for attention deficit-hyperactivity disorder among children in public schools. American Journal of Public Health. 1999;89:1359–1364.
  • Merikangas KR, Avenevoli S, Costello EJ, Koretz D, Kessler RC. National Comorbidity Survey Replication Adolescent Supplement: I. Background and measures. Journal of the American Academy of Child and Adolescent Psychiatry. 2009;48:367–369.
  • Pelham WE, Jr, Fabiano GA, Massetti GM. Evidence-based assessment of attention deficit hyperactivity disorder in children and adolescents. Journal of Clinical Child and Adolescent Psychology. 2005;34:449–476.
  • Pliszka S. Practice parameter for the assessment and treatment of children and adolescents with attention-deficit/hyperactivity disorder. Journal of the American Academy of Child and Adolescent Psychiatry. 2007;46:894–921.
  • Research Triangle Institute. SUDAAN: Professional Software for Survey Data Analysis, version 9.0.1. Research Triangle Institute; 2005.
  • Rohde LA, Biederman J, Knijnik MP, Ketzer C, Chachamovich E, Vieria GM, Pinzon V. Exploring different information sources for DSM-IV ADHD diagnoses in Brazilian adolescents. Journal of Attention Disorders. 1999;3:91–96. DOI: 10.1177/108705479900300203.
  • Rohde P, Lewinsohn PM, Seeley JR. Comparability of telephone and face-to-face interviews in assessing axis I and II disorders. American Journal of Psychiatry. 1997;154:1593–1598.
  • Rubin DB. The Bayesian bootstrap. The Annals of Statistics. 1981;9:130–134.
  • Rubio-Stipec M, Canino GJ, Shrout P, Dulcan M, Freeman D, Bravo M. Psychometric properties of parents and children as informants in child psychiatry epidemiology with the Spanish Diagnostic Interview Schedule for Children (DISC.2). Journal of Abnormal Child Psychology. 1994;22:703–720.
  • SAS Institute. SAS 9.1.3 Help and Documentation. SAS Publishing; 2002.
  • Schwab-Stone ME, Shaffer D, Dulcan MK, Jensen PS, Fisher P, Bird HR, Goodman SH, Lahey BB, Lichtman JH, Canino G, Rubio-Stipec M, Rae DS. Criterion validity of the NIMH Diagnostic Interview Schedule for Children Version 2.3 (DISC-2.3). Journal of the American Academy of Child and Adolescent Psychiatry. 1996;35:878–888.
  • Sciutto MJ, Eisenberg M. Evaluating the evidence for and against the overdiagnosis of ADHD. Journal of Attention Disorders. 2007;11:106–113.
  • Sobin E, Weissman MM, Goldstein RB, Adams P. Diagnostic interviewing for family studies: Comparing telephone and face-to-face methods for the diagnosis of lifetime psychiatric disorders. Psychiatric Genetics. 1993;3:227–233.
  • Willcutt EG, Pennington BF, Chhabildas NA, Friedman MC, Alexander J. Psychiatric comorbidity associated with DSM-IV ADHD in a nonreferred sample of twins. Journal of the American Academy of Child and Adolescent Psychiatry. 2002;38:1355–1362.
  • Zucker M, Morris MK, Ingram SM, Morris RD, Bakeman R. Concordance of self- and informant ratings of adults’ current and childhood attention-deficit/hyperactivity disorder symptoms. Psychological Assessment. 2002;14:379–389.