We prospectively followed girls with attention-deficit/hyperactivity disorder (ADHD), along with a matched comparison sample, five years after childhood neuropsychological assessments. Follow-up neuropsychological measures emphasized attentional skills, executive functions, and language abilities. Paralleling childhood findings, the childhood-diagnosed ADHD group displayed moderate to large deficits in executive/attentional performance as well as rapid naming, relative to the comparison group, at follow-up (M age = 14.2 years). ADHD-Inattentive vs. ADHD-Combined contrasts were nonsignificant and of negligible effect size, even when a refined, “sluggish cognitive tempo” subgroup of the Inattentive type was examined. Although ADHD vs. comparison differences largely withstood statistical control of baseline demographics and comorbidities, control of childhood IQ reduced EF differences to nonsignificance. Yet when the subset of girls meeting diagnostic criteria for ADHD in adolescence was compared to the remainder of the participants, neuropsychological deficits emerged even with full statistical control. Overall, childhood ADHD in girls portends neuropsychological and executive deficits that persist for at least 5 years.
Neuropsychological measures tap a range of functions, including attention, inhibition, motor speed, and linguistic abilities. The construct of executive function (EF) refers to those neuropsychological skills, deemed essential for performance of complex human tasks, related to planning, set maintenance, set shifting, interference control, and working memory (e.g., Barkley, 1997; Pennington & Ozonoff, 1996). Considerable evidence exists that, in contrast to non-diagnosed comparison individuals, samples of children and adults with attention-deficit/hyperactivity disorder (ADHD) show significant neuropsychological deficits, particularly those linked with EF (see, for example, Hinshaw, Carte, Sami, Treuting, & Zupan, 2002; Klorman et al., 1999; Nigg, Blaskey, Huang-Pollack, & Rappley, 2002; Seidman, Biederman, Faraone, Weber, & Ouellette, 1997; Seidman et al., 2005; see reviews of Barkley, Grodzinsky, & duPaul, 1992; Hervey, Epstein, & Curry, 2004; Seidman et al., 2004; Sergeant, Guerts, & Oosterlaan, 2002). Such deficits appear on a number of different tests and are largely independent of comorbid conditions that accompany ADHD (Hinshaw et al., 2002; Seidman et al., 1997). Because EF deficits are particularly likely to predict continuing academic failure in youth with ADHD (Biederman et al., 2004) and because they may reveal a distinct subset of youth with this disorder (Coghill et al., 2005), their identification is important both clinically and conceptually.
Several issues are salient regarding the linkage between EF deficits, on the one hand, and ADHD on the other. First, other developmental conditions are characterized by executive dysfunction (e.g., autistic disorder, conduct problems, obsessive-compulsive disorder), meaning that there is no specific, unique linkage between ADHD and executive deficits (see Pennington & Ozonoff, 1996; Weyandt, 2005). Second, effect sizes of neuropsychological and EF deficits in ADHD samples range from small to large, depending on the specific tests used as well as the composition of different samples (e.g., Hinshaw et al., 2002; Klorman et al., 1999; Seidman et al., 2005). Thus, there is no uniformity of executive dysfunction with respect to ADHD. Third, and related, EF dysfunction does not characterize all individuals with this disorder (Biederman et al., 2004; Nigg, Willcutt, Doyle, & Sonuga-Barke, 2005). Thus, predicting ADHD status from tests of executive dysfunction yields both false positive and false negative classifications (Doyle, Seidman, Biederman, Weber, & Faraone, 2000; Hinshaw et al., 2002; see Coghill et al., 2005, for elucidation). Yet given the strong desire for objective assessment tools in the evaluation of ADHD plus the need to understand its underlying cognitive substrates, assessment of executive function remains a priority research area.
Four major gaps in research on EF deficits and ADHD are evident. First, definitions of EF vary across conceptual models and research investigations, making for a lack of comparability. In this report we emphasize classic notions of this construct, with tasks that emphasize planning, inhibitory control, set maintenance, and set shifting.
Second, as with all aspects of research on ADHD, samples are dominated by males, over and above the actual male:female ratio of approximately 3:1 (Gaub & Carlson, 1997; Gershon, 2002). Accordingly, until recently, relatively little has been known about the neuropsychological function of females with ADHD. Two large-sample investigations have yielded evidence for considerable EF deficits in girls with this condition (Hinshaw et al., 2002; Seidman et al., 2006), with such deficits surviving statistical control of comorbidities, demographic characteristics, and overall cognitive functioning (IQ). Indeed, in their review of extant literature on the comparability of executive, inhibitory, and attentional deficits across boys and girls with ADHD, Seidman et al. (2005) conclude that sex differences do not exist regarding EF deficits in ADHD. Seidman et al. (2006) found that neuropsychological deficits were strongest in girls with ADHD and comorbid learning disorders. Overall, because large, diverse samples of girls with ADHD are rare in the field, examination of neuropsychological deficits in female samples is a priority.
Third, understanding the course of neuropsychological deficits (EF deficits in particular) is crucial for those interested in the developmental trajectory of ADHD. Whereas developmental changes exist in core symptomatology, such that hyperactive-impulsive symptoms decline at a greater rate than inattentive symptoms by adolescence (e.g., Hart, Lahey, Loeber, Applegate, & Frick, 1995), relatively few data are available on the stability of EF deficits. Although cross-sectional investigations reveal no significant differences in EF deficits between children and adolescents with ADHD (see Seidman et al., 2005) and although reviews of the adult literature show that such deficits are present in adults with this disorder (Hervey et al., 2004; Woods, Lovejoy, & Ball, 2002), secular trends or cohort effects constrain interpretations of such non-prospective investigations.
Prospective longitudinal findings include those of Hopkins, Perlman, Hechtman, and Weiss (1979), who found evidence for continuing deficits in a hyperactive sample (relative to comparison individuals) through late adolescence, on tests related to impulsivity and set-shifting. By young adulthood, however, such differences had diminished (Hechtman, Weiss, & Perlman, 1984). Similarly, the New York study of Klein and colleagues found limited evidence for attentional and/or executive deficits by age 18 (see review in Mannuzza and Klein, 1999). Recently, Drechsler, Brandeis, Foldenyi, Imhof, & Steinhausen (2005) found mixed evidence for the persistence of alertness and inhibitory deficits from late childhood through early adolescence. Finally, in a 13–14 year follow-up of a well-characterized, large sample (composed of over 90% males), Fischer, Barkley, Smallish, and Fletcher (2005) found that at least some neuropsychological deficits persisted into young adulthood for the childhood-defined ADHD sample, although the most salient problems at follow-up were shown for those individuals whose ADHD-related symptomatology had persisted across the longitudinal interval from childhood through adulthood. For continuous performance test variables of omission and commission errors, the ADHD sample’s scores improved from childhood through adulthood, but so did the comparison group’s scores, meaning that deficits remained in place. Overall, it is still relatively unknown whether neuropsychological deficits in general or executive dysfunctions in particular attenuate over time for youth with ADHD or remain salient, especially in girls.
Fourth, almost no prospective follow-up has occurred of youth with the Inattentive type of ADHD—whether on cognitive/executive measures or behavioral and impairment-related outcomes in general (Mannuzza & Klein, 1999). Given claims that this variant of ADHD represents a qualitatively distinct condition (Milich, Balentine, & Lynam, 2001), such follow-up is a priority, particularly with respect to objective outcomes, which are likely to be less biased than reports from adult informants.
Our research team collected extensive neuropsychological information on a large, well-characterized, and diverse sample of preadolescent girls with ADHD (N = 140) plus a matched comparison sample (N = 88), aged 6–12 years (see Hinshaw et al., 2002). In childhood, (a) the ADHD group showed clear neuropsychological deficits relative to the comparison group, with effect sizes ranging from medium to large; (b) these deficits largely withstood stringent statistical control of demographic variables, comorbid disorders (disruptive behavior disorders, internalizing disorders, reading disorder), and full-scale IQ; (c) the largest ADHD-comparison differences were revealed on tests tapping EF, including the Rey-Osterrieth Complex Figure (see Sami, Carte, Hinshaw, & Zupan, 2003); and (d) ADHD-Combined vs. ADHD-Inattentive type differences were small in effect size and rarely significant. As expected, individual classification of ADHD diagnostic status from the neuropsychological tests was imperfect, with a preponderance of false positive predictions (for related data, see Doyle et al., 2000).
Regarding our prospective, 5-year follow-up of this sample, we ask the following: (1) Do preadolescent girls with ADHD continue to show neuropsychological and executive dysfunction in adolescence, relative to a matched comparison group? (2) Are such deficits robust to statistical control of age, socioeconomic status, comorbidities, and general intelligence, as derived from baseline measures during childhood? (3) Do the girls who meet diagnostic criteria for ADHD at follow-up show particularly strong EF deficits in adolescence? Our main hypotheses are that neuropsychological (attentional, rapid naming, executive) deficits will be maintained (but attenuated) during adolescence; that these will be robust to statistical control of demographics, comorbidities, and IQ (see Mahone et al., 2002, regarding IQ-neuropsychological performance linkages); and that girls with adolescent ADHD symptomatology will show the largest degree of neuropsychological/executive dysfunction at follow-up (Fischer et al., 2005). We also predict that ADHD-Combined vs. ADHD-Inattentive differences will continue to be small at follow-up. Although parallel forms of baseline measures were utilized, the only measure repeated intact was a test of continuous performance.
As described in Hinshaw, Owens, Sami, and Fargeon (2006), during the school years of 2001–2, 2002–3, and 2003–4 we conducted prospective follow-up investigations of the girls initially investigated in Hinshaw (2002) and Hinshaw et al. (2002). These subjects had participated in research summer programs in 1997, 1998, and 1999, respectively. During pre-camp assessments and the summer programs, we had conducted nearly three hours of neuropsychological testing, divided into three separate sessions to prevent fatigue and to assess test-retest reliability for certain measures.
For the follow-up evaluations, we selected key measures for a 30-minute neuropsychological battery. On this assessment day, any girl taking stimulant medication was asked to participate medication-free. Data from caregivers indicated that a small percentage (approximately 10%) of the follow-up ADHD sample may have actually been medicated on the assessment day, but because a number of these girls took short-acting stimulants in the morning and testing was routinely performed during afternoons, any medication benefit may have attenuated. To the extent that stimulants yield improvements in EF, the present results may reflect an underestimate of dysfunction. Testing was conducted by graduate students in clinical psychology or bachelor’s-level research assistants, all of whom had undergone extensive training. These assessors were kept blind to the participant’s diagnostic status; they comprised, almost exclusively, individuals who had not served as staff for the summer programs. Overall, whereas Hinshaw et al. (2006) present data on several domains of psychiatric symptoms, on academic and social impairment, and on service utilization, the current data pertain to the persistence of neuropsychological deficits.
Hinshaw (2002) provides a complete description of the multi-gated recruitment, screening, and diagnostic procedures utilized to ascertain the sample of 140 girls with ADHD and 88 age- and ethnicity-matched comparison girls who participated in the baseline neuropsychological assessments and research summer programs. Girls with ADHD were recruited through pediatricians, mental health centers, schools, and direct advertisement, and comparison girls through pediatricians, community centers, and direct advertisement. Preliminary rating scale criteria were intentionally set with liberal, sex-specific thresholds, in order to prevent premature exclusion of potentially eligible girls, but final study entry depended on the participant’s having met full criteria for ADHD, through the parent-administered Diagnostic Interview Schedule for Children, 4th ed. (DISC-IV; Shaffer et al., 2000). Common comorbidities (oppositional defiant disorder (ODD), conduct disorder (CD), anxiety disorders, depression, learning disorders) were allowed. Comparison girls, who were matched at a group level with the ADHD participants and did not significantly differ from the ADHD sample with respect to age or ethnicity, could not meet diagnostic criteria for ADHD via either adult ratings or structured interview criteria. Exclusion criteria were mental retardation, evidence of psychosis or overt neurological disorder, lack of English spoken in the home, and medical problems prohibiting summer camp participation.
Regarding the types of ADHD sampled (Combined: ADHD-C; Inattentive: ADHD-I), Hinshaw (2002) describes diagnostic procedures, which were based largely on the DISC-IV but also included SNAP ratings (Swanson, 1992; this is a widely used adult informant rating scale of ADHD symptomatology) and staff judgments in borderline cases. Each of the 18 DSM-IV ADHD symptoms was considered present if endorsed on the DISC-IV or if the mother or teacher rated it as a 2 (“pretty much”) or 3 (“very much”) on the SNAP. Girls with at least 6 inattentive and at least 6 hyperactive/impulsive (HI) symptoms (with at least 4 in each domain based on the DISC-IV; see Hinshaw et al., 1997) were designated ADHD-C; girls with at least 6 inattentive symptoms (with at least 4 based on the DISC-IV) but fewer than 6 HI symptoms were designated ADHD-I; and girls with fewer than 6 inattentive and fewer than 6 HI symptoms were designated as not having ADHD. To preserve statistical power for the ADHD-C vs. ADHD-I contrast, we did not include girls with the Predominantly HI type.
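The subtype rules just described can be expressed as a short decision function. The following Python sketch is illustrative only and is not the procedure actually used by the study team: the names `SymptomCounts`, `symptom_present`, and `classify` are hypothetical, and cases the stated rules do not cover (for example, borderline cases resolved by staff judgment, or the excluded Predominantly HI pattern) are simply returned as "unclassified".

```python
from dataclasses import dataclass


@dataclass
class SymptomCounts:
    """Hypothetical container for one girl's symptom tallies."""
    inattentive_total: int  # present via DISC-IV or a SNAP rating of 2 or 3
    inattentive_disc: int   # subset endorsed on the DISC-IV itself
    hi_total: int           # hyperactive/impulsive, same counting rule
    hi_disc: int


def symptom_present(disc_endorsed: bool, mother_snap: int, teacher_snap: int) -> bool:
    """A symptom counts if endorsed on the DISC-IV or rated 2 or 3 by mother or teacher."""
    return disc_endorsed or mother_snap >= 2 or teacher_snap >= 2


def classify(c: SymptomCounts) -> str:
    """Apply the thresholds stated in the text; anything else is left unclassified."""
    inattentive_met = c.inattentive_total >= 6 and c.inattentive_disc >= 4
    hi_met = c.hi_total >= 6 and c.hi_disc >= 4
    if inattentive_met and hi_met:
        return "ADHD-C"
    if inattentive_met and c.hi_total < 6:
        return "ADHD-I"
    if c.inattentive_total < 6 and c.hi_total < 6:
        return "no ADHD"
    # Assumption: borderline and Predominantly HI patterns are outside these rules.
    return "unclassified"
```

For instance, a girl with 6 inattentive symptoms (4 from the DISC-IV) and 3 HI symptoms would be labeled ADHD-I under this sketch.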
At baseline, the girls spanned the ages of 6–12 years. The sample was ethnically diverse (53% White, 27% African-American, 11% Latina, 9% Asian-American). The clinic and summer camp procedures yielded multi-informant, multi-method data on both symptoms and a wide range of domains of functional impairment (Hinshaw, 2002).
The follow-up evaluations were performed on 209 of the 228 participants (92%), who ranged in age from 11.3–18.2 years (M = 14.2 years). Reasons for non-participation included (a) family lost to all tracking efforts (n = 4), (b) refusal to participate (n = 5), and (c) family contacted but scheduling of assessments not possible (n = 10). Comparisons of the retained sample versus those lost to attrition revealed that, for 29 of 31 demographic, diagnostic, and symptom variables from baseline, differences were not statistically significant. The only two variables for which significant differences emerged were single-parent status and teacher-reported internalizing symptomatology (for each, baseline rates were higher in the non-retained sample). In short, the retained sample appears representative of the total sample (Hinshaw et al., 2006). Additionally, some assessments occurred via home visits or telephone interviews (n = 7), precluding full neuropsychological assessment; some measures were missing because of fatigue or refusal, and in other instances (i.e., Conners Continuous Performance Test), computer failures occurred. Hence, the sample size for our present battery ranges from N = 186 to 200. For secondary analyses that involve follow-up diagnostic status as the independent variable, we re-administered the DISC-IV at the follow-up evaluations (Hinshaw et al., 2006). See Table 1 for descriptive information about the 202 girls who constituted the overall sample for the follow-up neuropsychological battery.
The priority was placed on well-established and well-validated neuropsychological tests, most of which were parallel forms to those used at baseline (see Hinshaw et al., 2002) and which could be administered in approximately 30 minutes, given the need to sample multiple domains of functioning in our follow-up battery. The only two measures that were exact replications of the baseline battery were the Conners Continuous Performance Test and the Underlining Test. Furthermore, to avoid problems of multiple statistical tests, we selected a priori only one or two dependent measures from each test, constituting those with the most established psychometric properties and clinical utility.
Digit Span, from the Wechsler Intelligence Scale for Children, Third Edition (WISC-III; Wechsler, 1991), is a widely used measure of auditory working memory, which requires participants to immediately recall digit sequences of increasing length either in their original presentation order (Digits Forward) or in their reverse presentation order (Digits Backward). Because of the conceptual importance of rehearsal and interference control for the latter, we analyze scores separately herein. Working memory is considered to be an important component or correlate of EF (see Scheres et al., 2004; Willcutt et al., 2005) and appears to involve both frontostriatal and cerebellar brain regions (Martinussen et al., 2005). Split-half reliabilities average .85 across the age span of the standardization sample (Wechsler, 1991). We analyzed standard scores (M = 10, s.d. = 3) so that mean levels and effect sizes would be clinically interpretable. Note that Digit Span is a supplemental test of the WISC-III, so that full-scale IQ scores (used as a covariate in our analyses) are independent of Digit Span scores.
We selected the Taylor Complex Figure Test (TCFT) as a parallel form of the Rey-Osterrieth Complex Figure (ROCF), which at baseline had been our most sensitive neuropsychological/executive test for discriminating the ADHD from the comparison sample (Hinshaw, 2002; Sami et al., 2003). The ROCF is used to evaluate planning, perceptual organization, and graphomotor abilities (Lezak, 1983; Spreen & Strauss, 1998). The TCFT is the only major alternative to the ROCF in a test-retest situation (Helmes, 2000). We administered only the copy condition of the TCFT because, at baseline, the delayed recall condition yielded weak evidence for ADHD vs. comparison differences (Sami et al., 2003).
Our primary dependent measure is the error proportion score (EPS), calculated using the number of segments drawn incorrectly (errors) divided by the sum of all segments drawn (correct plus incorrect). The current scoring system was parallel to the one used at baseline, which had been developed from our modifications of the procedures of Bernstein and Waber (1996), as described in Sami et al. (2003). Drawings were scored by a group of three extensively trained, independent raters who were unaware of diagnostic status. For the EPS, the intraclass correlations between pairs of the three scorers were .77, .83, and .94, with a mean of .84 on a subsample of 60 drawings.
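As a worked illustration of the error proportion score defined above, the following Python sketch computes errors divided by all segments drawn. The function name and the example counts are hypothetical and are not taken from the study's scoring materials.

```python
def error_proportion_score(correct_segments: int, incorrect_segments: int) -> float:
    """EPS = incorrect segments / (correct + incorrect segments)."""
    total = correct_segments + incorrect_segments
    if total == 0:
        # Assumption: a drawing with no scorable segments has no defined EPS.
        raise ValueError("no segments drawn; EPS is undefined")
    return incorrect_segments / total


# Hypothetical drawing: 30 segments drawn correctly, 10 drawn incorrectly.
example_eps = error_proportion_score(30, 10)
```

With these made-up counts the score is 10 / 40 = 0.25; lower values indicate a more accurate copy.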
In the Conners Continuous Performance Test (CPT; Conners, 1995), a task of visual attention and set-shifting, letters are presented on a computer screen, with a target defined as any letter except for X. Participants are instructed to press the computer spacebar key upon the presentation of each target letter but to inhibit response to all Xs (i.e., nontargets). This continuous performance test therefore taps the EF of response inhibition by requiring occasional inhibitory responses in the midst of frequent responses to targets. Trials are presented in six blocks, with the interstimulus interval varying among 1 s, 2 s, and 4 s within each block; stimulus display time is 250 ms. Parallel to the baseline analyses, we chose omission errors (percentage of failures to respond to target stimuli out of the total number of targets presented) and commission errors (percentage of key presses for nontargets out of the total number of nontargets presented) as the primary dependent measures. Conners (1995) provides criterion-related validity data for these scores in terms of known-groups differentiation and sensitivity to treatment response.
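The two CPT scores defined above are simple percentages. The sketch below shows the arithmetic in Python; the function name and the trial counts in the example are hypothetical and do not reflect the actual number of trials in the Conners CPT.

```python
def cpt_error_rates(targets_presented: int, target_responses: int,
                    nontargets_presented: int, nontarget_responses: int):
    """Return (omission %, commission %) as defined in the text."""
    # Omissions: targets that received no key press, out of all targets.
    omissions = 100 * (targets_presented - target_responses) / targets_presented
    # Commissions: key presses to nontargets (Xs), out of all nontargets.
    commissions = 100 * nontarget_responses / nontargets_presented
    return omissions, commissions


# Hypothetical session: 300 targets (285 responded to), 30 Xs (12 responded to).
omission_pct, commission_pct = cpt_error_rates(300, 285, 30, 12)
```

In this invented example the participant missed 5% of targets and responded to 40% of nontargets.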
In the Rapid Automatized Naming (RAN) task, children rapidly name repeated items: automatically processed digits (digits), pictured stimuli (objects), and digit-letter pairs (subtest 1 of global-local; see Hinshaw et al., 2002). We administered the digits subtest as an orienting task; objects and digit-letter pairs were the two subtests used in our analyses. The objects subtest requires effortful semantic processing and retrieval; this type of task appears to require left-prefrontal brain regions (Gabrieli et al., 1996). For boys with ADHD, this was the subtest revealing the largest differentiation from comparison boys (Carte, Nigg, Hinshaw, & Treuting, 1996). The digit-letter subtest requires both set shifting and inhibition of a prepotent response in order for participants to switch between numbers and letters. Our score for each test was an accuracy rate, defined for objects as 50 (i.e., the total number of stimuli to be named) minus the sum of incorrect recognitions and omissions, with that quantity divided by the time to complete the task. For digit-letter it was the total number of stimuli named minus the sum of incorrect recognitions and omissions, with that quantity divided by 60 (i.e., the time in seconds to complete the task). Because the correlation between objects and digit-letter scores was substantial (r = .67), we summed these subtests as a single dependent measure.
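The accuracy-rate scoring described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names and example values are hypothetical, time for the objects subtest is assumed to be recorded in seconds, and the composite is taken as the plain sum of the two rates, as the text indicates.

```python
OBJECT_ITEMS = 50          # total stimuli to be named on the objects subtest
DIGIT_LETTER_SECONDS = 60  # fixed time for the digit-letter subtest


def objects_rate(incorrect: int, omissions: int, seconds: float) -> float:
    """(50 - (incorrect + omissions)) / time to complete the task."""
    return (OBJECT_ITEMS - (incorrect + omissions)) / seconds


def digit_letter_rate(named: int, incorrect: int, omissions: int) -> float:
    """(stimuli named - (incorrect + omissions)) / 60 seconds."""
    return (named - (incorrect + omissions)) / DIGIT_LETTER_SECONDS


def ran_composite(obj_rate: float, dl_rate: float) -> float:
    """Single dependent measure: sum of the two subtest rates."""
    return obj_rate + dl_rate


# Hypothetical participant: 3 object errors in 47 s; 66 digit-letter items with 6 errors.
example_ran = ran_composite(objects_rate(2, 1, 47), digit_letter_rate(66, 4, 2))
```

Both made-up subtest rates work out to one correct item per second, so the composite is 2.0.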
The Underlining Test (UL) is a target cancellation task that measures rapid and accurate visual discrimination, requiring the underlining of target stimuli embedded among distractors at a target:distractor ratio of 1:5. The subtest requiring simple detection of the numeral 4 was used to orient subjects. We utilized three additional subtests: (a) sequenced compound gestalt shapes (gestalt figure; original subtest no. 4), (b) the sequenced consonants “fsbm” (original subtest no. 9), and (c) the sequenced letters “spot” (original subtest no. 11). Whereas the “spot” subtest is related to sight word recognition, the gestalt figure and fsbm subtests require controlled processing (Posner, 1988) and are unrelated to reading skill (Rourke & Orr, 1977); thus, we selected these two subtests for our analyses. In prior research, the fsbm subtest optimally discriminated boys with ADHD from comparison boys (Carte et al., 1996; Nigg, Carte, Hinshaw, & Treuting, 1998). Anterior cingulate prefrontal regions are involved in such tasks (Cabeza & Nyberg, 1997). Because errors (rather than slow speed) characterize participants with ADHD (MacLeod & Prior, 1996), our dependent variable is the single score of correctly identified targets minus misidentified targets and nontargets. Because of only a modest correlation (r = .31) between the gestalt and fsbm subtests, we examine these as separate dependent measures.
Overall, on the basis of conceptual and empirical formulations (Tranel et al., 1994), our measures of EF include digits forward and digits backward from Digit Span, omission and commission errors from the CPT (which we conceptualize as indicators of inattention and impulsivity, respectively), the error proportion score from the TCFT, and number correct minus incorrect from the UL gestalt and fsbm subtests. Our language-related measure is the RAN accuracy score. We reverse scored the CPT and TCFT so that higher scores reflect better performance.
To ascertain whether neuropsychological and EF impairments in adolescence are related specifically to the girls’ original ADHD status rather than variables associated with ADHD, we repeated our core analyses with stringent statistical control. The first set of covariates includes demographic information (family income and maternal education; see Hinshaw, 2002, for details); participant age (given the 6–7 year age span across the sample); and additional disorders (which constitute comorbidities for the girls with ADHD), dummy coded as 1 vs. 0 for the presence vs. absence of ODD or CD, 1 vs. 0 for the presence vs. absence of anxiety or mood disorders, and 1 vs. 0 for the presence vs. absence of reading disorder (see Hinshaw, 2002, for details). In a second set of analyses we additionally covaried the participant’s full-scale IQ (FSIQ), measured at baseline with the WISC-III. Note that Barkley (1997) recommends conducting neuropsychological analyses within ADHD samples both with and without statistical control of IQ.
All statistical analyses were performed with SPSS for Windows, Version 12 (SPSS, Inc. 2003). Because cohort effects could complicate interpretation of findings, we performed an initial set of ANOVAs with the independent variable of initial year of participation: 1997, 1998, or 1999. For 6 of the 8 primary dependent measures, there were no effects of year of participation. On the RAN, girls from the 1999 program scored worse than those from 1997 or 1998; on the TCFT, the 1999 sample was worse than the 1997 sample. Because such effects are likely to be of minor importance, and because year of participation interacted significantly with ADHD diagnostic group for only one outcome measure (UL gestalt), we include all 3 cohorts together in our analyses.
Note that correlations among the dependent measures averaged .24 and ranged from .07 to .50, with the largest correlation occurring between Digits Forward and Digits Backward from WISC-III Digit Span. Our first primary analysis was a multivariate analysis of variance (MANOVA) across the 8 dependent measures, with baseline diagnostic status (ADHD-C, ADHD-I, comparison) as the three levels of the independent variable. A significant MANOVA (alpha = .05) afforded the interpretation of follow-up univariate patterns of subgroup differences, examined via analyses of variance (ANOVAs) for each measure along with Tukey post-hoc comparisons of each subgroup contrast. Effect sizes are emphasized, calculated as Cohen’s d, with the difference between subtype means as the numerator and the pooled standard deviation as the denominator (Cohen, 1988, designates d = .2 as a small effect, .5 as medium, and .8 or above as large). Next, we completed two multivariate analyses of covariance (MANCOVAs), first partialling the demographic/comorbidity variables and then adding IQ as a covariate. A significant MANCOVA is interpreted as signifying a specific association between childhood ADHD status and outcomes of interest; that is, ADHD status yields significant prediction of adolescent neuropsychological impairment despite stringent statistical control of important covariates. When significant, we followed these with ANCOVAs and post-hoc pairwise contrasts. For the two measures in our battery that were identical at baseline and follow-up (CPT omissions and commissions), repeated measures analyses afforded examination of the tendencies toward improvement or decrement across time for all subgroups. Finally, primary analyses were repeated using diagnostic status at follow-up (ADHD-C, ADHD-I, and no ADHD) as the independent variable.
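The effect-size computation described above (mean difference over the pooled standard deviation) can be sketched in Python as follows. The analyses reported here were run in SPSS, so this is only an illustration: the function names and the scores are hypothetical, and the treatment of values falling between Cohen's benchmarks (for example, labeling d = .33 as "small") is an assumption of the sketch rather than a rule stated in the text.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b) -> float:
    """Difference between group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def magnitude_label(d: float) -> str:
    """Benchmarks from Cohen (1988): .2 small, .5 medium, .8 or above large."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # assumption: below the 'small' benchmark


# Hypothetical standard scores for a comparison group and an ADHD group.
comparison_scores = [10.0, 12.0, 14.0]
adhd_scores = [8.0, 10.0, 12.0]
example_d = cohens_d(comparison_scores, adhd_scores)
```

In this toy example each group has a variance of 4, so the pooled standard deviation is 2 and the 2-point mean difference gives d = 1.0.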
Overall, we had good statistical power (over .7) to ascertain group differences of small to medium size between participants with ADHD and comparison girls, and adequate power to ascertain group differences of medium size between the ADHD-C and ADHD-I groups (Faul & Erdfelder, 1992).
The overall MANOVA yielded a statistically significant finding for the effect of baseline diagnostic status on neuropsychological performance, F(16, 342) = 2.76, p < .001. Table 2 presents the results of the 8 univariate ANOVAs. All 8 of the outcome measures yielded significant results. For six dependent measures (Digits Forward, Digits Backward, CPT omissions, TCFT, UL gestalt, UL fsbm) both ADHD types scored worse than the comparison group but did not differ from each other (see contrasts in Table 2). For RAN, the comparison girls performed better than girls with ADHD-C, but the ADHD-I group did not differ from either group. For CPT commissions, no post hoc contrasts were significant. Table 3 (left column) reveals that, for these measures of attentional and executive functioning, effect sizes for the contrasts of (a) ADHD-C vs. comparison and (b) ADHD-I vs. comparison were medium in magnitude (d ranging from .33 to .66). Effect sizes for ADHD-C vs. ADHD-I contrasts (which were never significant, as just noted) were not even small in most instances.
When the first set of covariates (demographics, comorbidity) was included, the MANCOVA was just at the traditional level of statistical significance, F(16, 322) = 1.67, p = .05. Univariate ANCOVAs revealed that 4 dependent measures continued to show significance: Digits Forward, F(2, 195) = 3.74, p < .05; TCFT error proportion, F(2, 191) = 3.62, p < .05; UL gestalt, F(2, 195) = 6.15, p < .01; and UL fsbm, F(2, 195) = 5.09, p < .01. Effect sizes for the adjusted means regarding ADHD-C vs. comparison and ADHD-I vs. comparison contrasts decreased slightly for Digits Forward (.35–.52) and TCFT (.45–.50) but were maintained for the UL subtests (.42–.72) (see Table 3). However, when the MANCOVA was reconducted with FSIQ in the set of covariates, the results were now nonsignificant, F(16, 318) = 1.09, p = .36. Thus, control of baseline IQ eliminated the effect of childhood ADHD on follow-up neuropsychological performance.
For the two automated measures that were identical to those administered at baseline, CPT omissions and commissions, we conducted a repeated measures MANOVA, with the independent variables of baseline diagnostic status (ADHD-C, ADHD-I, and comparison) and time (baseline, follow-up). This analysis revealed significant effects of time, diagnostic status, and their interaction. In separate repeated measures ANOVAs examining these effects on omissions and commissions, there was a main effect of time on omissions, F(1, 177) = 12.88, p < .001, signifying that all participants improved, on average, from baseline to follow-up. As expected, the main effect of diagnostic status was significant as well, F(2, 177) = 7.92, p < .01, indicating that the girls with ADHD performed worse than the comparison girls, reflecting the primary analyses. The diagnostic status × time interaction was not significant, F(2, 177) = 2.49, p < .10, indicating that the pattern of change across time was similar across the diagnostic groups.
For commissions, however, not only was the main effect of time significant, F(1, 177) = 41.70, p < .001, again signifying that all groups improved from baseline to follow-up, but the diagnostic status by time interaction was also significant, F(2, 177) = 5.30, p < .01. (The diagnostic status effect was not significant, F(2, 177) = 2.32, p = .102.) A plot of this interaction revealed that whereas the girls with ADHD-I were relatively stable in their performance across time, the ADHD-C and comparison groups both showed notable improvement (see Figure 1).
As noted in the data analytic plan, we reconducted our primary analyses with the independent variable of follow-up diagnostic status. Here, based on diagnoses that were determined in adolescence, we found that 40 girls met criteria for ADHD-C, 47 for ADHD-I, and 115 did not meet criteria for ADHD. In other words, as described in detail in Hinshaw et al. (2006), over half of the girls with ADHD-C at baseline no longer met diagnostic criteria for this subtype at the follow-up evaluations, compared to approximately one-third of the girls with ADHD-I at baseline. Thus, as the girls aged, a significant percentage failed to meet official criteria for ADHD, although clear impairments remained in all functional domains investigated (Hinshaw et al., 2006).
With this follow-up designation as the new independent variable, the MANOVA was significant, F(16, 342) = 3.15, p < .001. Seven of the 8 follow-up ANOVAs were significant: Digits Forward, Digits Backward, CPT omissions, TCFT, UL gestalt, UL fsbm, and RAN (see Table 4). For four of these (Digits Forward, RAN, UL gestalt, UL fsbm), contrasts revealed that each ADHD type was significantly more impaired than the comparison girls. For CPT omissions, the ADHD-I type was intermediate between the ADHD-C and comparison groups but not significantly different from either. For TCFT, the comparison and ADHD-I groups did not differ, but the ADHD-C girls performed worse than both other groups. For Digits Backward, no contrasts were significant. Effect sizes associated with significant contrasts ranged from medium to large.
With inclusion of covariates, the MANCOVAs remained significant, both for the demographic plus comorbidity set, F(16, 322) = 2.70, p < .001, and for the set that also included FSIQ, F(16, 318) = 2.36, p < .01. Four outcome measures (Digits Forward, TCFT, RAN, and UL gestalt) maintained significance with inclusion of all covariates. As shown in Table 5, the comparison vs. ADHD-C contrasts remained medium in effect size. A few of the comparison vs. ADHD-I contrasts shrank to small effect sizes with inclusion of covariates, whereas others actually increased (e.g., UL gestalt).
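To make the "small", "medium", and "large" effect-size labels used above concrete, the following is a minimal sketch of a standardized mean difference (Cohen's d) with a pooled standard deviation, together with Cohen's conventional benchmarks (.2, .5, .8). It is an illustration only: the function names and scores are hypothetical, the sketch works on raw scores rather than the covariate-adjusted means reported in the tables, and it is not necessarily the exact computation used for the effect sizes in this report.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def effect_size_label(d):
    """Map |d| onto Cohen's conventional benchmarks (.2, .5, .8)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Hypothetical scores for illustration; these are NOT data from the study.
comparison_scores = [10, 12, 11, 13, 12, 14]
adhd_scores = [9, 11, 10, 12, 11, 13]

d = cohens_d(comparison_scores, adhd_scores)
print(round(d, 2), effect_size_label(d))
```

With these made-up scores, the groups differ by one raw point against a pooled standard deviation of about 1.41, which falls in the conventional "medium" range.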
Our chief aim was to determine whether girls with carefully ascertained ADHD in childhood would continue to display neuropsychological deficits (and EF deficits in particular) 5 years later. Using well-established measures of working memory, planning, set maintenance, and set shifting—considered as EF in the literature—as well as indicators of attentional processing, impulse control, and rapid naming, we found that the girls diagnosed with ADHD during childhood continued to display significant, medium-effect deficits, in contrast to our matched comparison sample, in our 5-year prospective assessments. ADHD-C vs. ADHD-I differences did not emerge for any measure, however, with small effect sizes for these contrasts. Many EF differences between girls with ADHD and the comparison girls survived stringent statistical control of age, socioeconomic/demographic indicators, and comorbid diagnostic status, measured at baseline.
For CPT omissions and commissions, objective measures that were repeated identically at baseline and follow-up, main effects of time were significant, indicating better performance for all girls, on average, during adolescence. For commissions, the comparison girls and those with ADHD-C showed relatively greater improvement than the girls with ADHD-I. Regarding the overall analyses, controlling IQ scores from childhood eliminated the ADHD-comparison differences; but when we performed a secondary analysis examining the cognitive patterns of those girls who met diagnostic criteria for ADHD during the follow-up assessments, strong EF differences emerged between this subset and those who did not meet criteria, and many differences survived full statistical control, even of IQ. Overall, girls with ADHD continue to display noteworthy neuropsychological deficits in adolescence, 5 years after initial ascertainment, with adolescent ADHD status more specifically related to neuropsychological dysfunction than childhood ADHD status.
The largest prospective investigation including neuropsychological/executive measures is that of Fischer et al. (2005). With a sample of over 90% males, followed for an average of 13 years from childhood through early adulthood, they found evidence for continuing neuropsychological deficits, chiefly restricted to those who maintained their ADHD status across the longitudinal interval. In all, our findings document the continuing neuropsychological deficits of girls with ADHD across a 5-year span, which were independent of comorbidities and demographic factors, even though attentional performance showed improvements for all participants across time. Our effects were particularly robust for the girls meeting criteria for ADHD in adolescence. Whether such deficits will persist into adulthood must await subsequent prospective follow-up.
A perennial question related to investigations of youth with ADHD is whether IQ scores should be covaried when contrasting such samples with clinical or non-diagnosed comparison groups. We found that control of baseline IQ eliminated the EF, attentional, and rapid naming differences found at follow-up when we contrasted our childhood-diagnosed groups. Yet the adolescent-diagnosed group comparisons survived such control. The issue is contentious. On the one hand, it would be important to understand neuropsychological and executive differences between girls with ADHD and comparison girls that occur even when controlling for the 10-point-or-more IQ superiority of the comparison girls at baseline (see Hinshaw et al., 2002, for rationale and presentation of baseline data, which revealed that during childhood, all neuropsychological deficits survived control of IQ). On the other hand, it may well be that controlling for intelligence scores in such designs is tantamount to overcontrol (see, for example, Miller & Chapman, 2001), given that IQ deficits are an inherent part of the ADHD construct. Although Fischer et al. (2005) opted not to covary IQ scores in their recent report for just this reason, we performed our analyses both ways (that is, with and without covarying IQ) because of the potential importance of examining neuropsychological differences that are truly independent of overall cognitive functioning (see, for example, Barkley, 1997).
Although control of baseline IQ (in addition to demographics and comorbidities) eliminated group differences at follow-up when diagnostic status was configured from baseline measures, such control did not eliminate EF and other neuropsychological deficits in the ADHD group when the independent variable was diagnostic status as ascertained at follow-up. Thus, parallel to the report of Fischer et al. (2005), we found that our most robust findings in terms of persisting neuropsychological deficits pertained to ADHD status contemporaneous with the follow-up testing. Longer-term follow-up will be crucial to explore this issue further.
Parallel to our childhood findings, we found little evidence for ADHD-C vs. ADHD-I differences in neuropsychological performance. In fact, in both the childhood-diagnosed and adolescent-diagnosed analyses, these “type” differences were never significant except for one instance for the adolescent-diagnosed girls, even though power calculations reveal that, given our follow-up sample sizes, we had statistical power of nearly .6 to detect effects of medium size (see Faul & Erdfelder, 1992). Despite contentions that the inattentive type of ADHD is a separate, qualitatively distinct variant of this disorder, particularly with respect to cognitive performance (Milich et al., 2001), we found no such evidence, at least with respect to the variables in our battery. Whether other cognitive measures would yield such effects must await additional research.
Two comments on the ADHD-C vs. ADHD-I contrasts are in order. First, during the year prior to the follow-up assessment, 45% of the former group received medication treatment for ADHD, in contrast to 28% of the latter, a difference that just missed statistical significance (see Hinshaw et al., 2006, for details). Thus, it is conceivable that the tendency for more of the girls with the combined type of ADHD to receive medication treatment served to attenuate neuropsychological dysfunction, although we reiterate that nearly all participants performed the follow-up battery off medication. Second, it is possible that the ADHD-I group includes a smaller subset of “truly” inattentive participants who exhibit sluggish cognitive tempo (McBurnett, Pfiffner, & Frick, 2001). Paralleling our article describing baseline neuropsychological performance (Hinshaw et al., 2002), we formed a subset of the ADHD-I participants at baseline who (a) displayed very few hyperactive-impulsive symptoms and (b) revealed high levels of parent- and teacher-rated sluggish cognitive tempo. However, just as in childhood, this “refined” inattentive subgroup did not reveal any different pattern of neuropsychological performance from that of the remainder of the ADHD-I group.
Executive dysfunction may be a continuing problem for youth with ADHD (or a subset of such youth) as they develop. Important in such research will be several issues. First, ascertaining EF deficits through ecologically valid, real-world measures over and above laboratory tasks is a priority (see Lawrence et al., 2004). Second, in causal models of ADHD, it will be important to understand that EF may well serve as one component of a multi-level process (Coghill et al., 2005; Sonuga-Barke, 2002). Third, it would be extremely helpful for different research laboratories to converge with respect to common measures of EF, so that replications of findings could be more clearly established. Finally, an important question arises as to the proportions of youth with ADHD who display dysfunctions in EF. We made a preliminary determination of this important variable by classifying as “executive dysfunctional” those participants who scored at or below the 10th percentile of the comparison group’s scores on at least two executive tests (see Nigg, Willcutt, Doyle, & Sonuga-Barke, 2005). We found that 43% of the baseline-defined ADHD-I group and 50% of the ADHD-C group showed such dysfunction, roughly consistent with the literature revealing that approximately half of youth with ADHD show executive dysfunction (Nigg et al., 2005). In addition, by follow-up, at which time we characterized smaller groups of girls with adolescent ADHD, 55% of the adolescent-diagnosed girls with ADHD-I and 75% with ADHD-C had executive dysfunction. Further analyses of categorically-defined executive dysfunctional subgroups will be the subject of additional papers from our research group.
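The classification rule described above (scoring at or below the 10th percentile of the comparison group on at least two executive tests) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the scoring code used in the study: the test names and scores are hypothetical, every test is assumed to be scored so that higher values mean better performance (error-based measures would need to be reversed first), and the nearest-rank percentile definition is one of several reasonable choices.

```python
from math import ceil


def percentile_cutoff(scores, pct):
    """Nearest-rank percentile of a list of scores (an assumed definition)."""
    ordered = sorted(scores)
    rank = max(1, ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def is_executive_dysfunctional(participant, cutoffs, min_tests=2):
    """True if the participant is at or below the cutoff on >= min_tests tests."""
    impaired = sum(
        1 for test, cutoff in cutoffs.items() if participant[test] <= cutoff
    )
    return impaired >= min_tests


# Hypothetical comparison-group scores; higher = better on every test.
comparison_group = {
    "digits_backward": [3, 4, 5, 5, 6, 6, 7, 7, 8, 9],
    "planning": [10, 12, 13, 14, 15, 15, 16, 17, 18, 20],
    "set_shifting": [40, 45, 50, 52, 55, 57, 60, 62, 65, 70],
}

# Cutoffs come from the comparison group only, as in the rule described above.
cutoffs = {
    test: percentile_cutoff(scores, 10)
    for test, scores in comparison_group.items()
}

# Two hypothetical participants.
participant_a = {"digits_backward": 3, "planning": 9, "set_shifting": 55}
participant_b = {"digits_backward": 2, "planning": 14, "set_shifting": 58}

print(is_executive_dysfunctional(participant_a, cutoffs))
print(is_executive_dysfunctional(participant_b, cutoffs))
```

Deriving the cutoffs from the comparison group alone keeps the threshold independent of how impaired the ADHD groups happen to be; the proportion classified as dysfunctional in each diagnostic group then follows by applying the same rule to every participant.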
Repeated measures analyses of the CPT scores first revealed that all participants improved across time. For omissions, the girls with ADHD remained significantly worse at follow-up. For commissions, however, the pattern was somewhat more complex: The main effect of time (revealing improvement for all participants) was qualified by a time × diagnostic status interaction, revealing greater improvement of the comparison girls and those with ADHD-C. Why the ADHD-I group showed more stability, across time, is unknown; longer-term follow-up is needed to understand developmental trajectories.
Limitations of this investigation include, first, the fact that most of our follow-up measures were not exact replicas of our childhood battery. Thus, except for CPT omissions and commissions, we could not examine precise change over time. Second, the follow-up battery was limited to 30 minutes of testing, for practical reasons related to the need for a multi-domain assessment protocol. It would be important to know whether girls with ADHD reveal problems with other measures of EF and for neuropsychological constructs beyond EF and attentional problems per se. Third, our retention rate of 92% for the overall follow-up (Hinshaw et al., 2006) was reduced by in-home assessments, equipment failure, and missed tests for selected additional participants. Still, the follow-up sample appears representative of the baseline participants. Finally, ours is largely a clinical sample of girls with ADHD, and it is unknown whether similar persistence of neuropsychological and executive dysfunction would emerge from a community sample.
Overall, our findings are consistent with those of Seidman et al. (2005), who found that girls with ADHD show EF deficits during adolescence (at a rate comparable to that of boys with ADHD), and with those of Fischer et al. (2005), who have provided the most convincing prospective evidence to date that cognitive, attentional, and neuropsychological problems among youth with ADHD do not abate, relative to a comparison group, even by early adulthood. Those youth whose ADHD symptomatology persists over time show the strongest evidence for continuing neuropsychological and executive dysfunction, paralleling findings with other samples composed primarily of males with ADHD (Fischer et al., 2005). It may well be that those children with ADHD with the most persistent forms of underlying symptoms are those whose executive dysfunction is most striking by adolescence; further longitudinal work is a priority, with a wider range of measures of EF. Indeed, understanding how EF deficits relate to other vital aspects of the functioning of youth with ADHD across the lifespan is a continuing objective for high-quality research.
Work on this project was supported by National Institute of Mental Health Grant R01 MH45064. We thank the participating girls and their families for their continuing involvement in our longitudinal study. We also thank our assessors and coders, without whom the present neuropsychological data could not have been collected or prepared.
Stephen P. Hinshaw, University of California, Berkeley.
Estol T. Carte, Permanente Medical Group.
Catherine Fan, University of California, Berkeley.
Jonathan S. Jassy, University of California, Berkeley.
Elizabeth B. Owens, University of California, Berkeley.