Attention deficit hyperactivity disorder (ADHD) is associated with deficits in executive functioning (EF). ADHD in adults is also associated with impairments in major life activities, particularly occupational functioning. We investigated the extent to which EF deficits assessed by both tests and self-ratings contributed to the degree of impairment in 11 measures involving self-reported occupational problems, employer reported workplace adjustment, and clinician rated occupational adjustment. Three groups of adults were recruited as a function of their severity of ADHD: ADHD diagnosis (n = 146), clinical controls self-referring for ADHD but not diagnosed with it (n = 97), and community controls (n = 109). Groups were combined and regression analyses revealed that self-ratings of EF were significantly predictive of impairments in all 11 measures of occupational adjustment. Although several tests of EF also did so, they contributed substantially less than did the EF ratings, particularly when analyzed jointly with the ratings. We conclude that EF deficits contribute to the impairments in occupational functioning that occur in conjunction with adult ADHD. Ratings of EF in daily life contribute more to such impairments than do EF tests, perhaps because, as we hypothesize, each assesses a different level in the hierarchical organization of EF as a meta-construct.
A growing body of evidence suggests that attention deficit hyperactivity disorder (ADHD) consists of more than just its primary diagnostic symptoms of inattention, impulsiveness, and often hyperactivity (American Psychiatric Association, 2001). Theorists now argue that symptoms of deficient executive functioning (EF) may also be involved in the disorder (Barkley, 1997, 1997/2001; Castellanos, Sonuga-Barke, Milham, & Tannock, 2006). Meta-analyses of EF tests in ADHD also find significant deficits on most of the tests used to assess traditional EF constructs at the group level of analysis (comparisons of means), particularly in tasks involving response inhibition and working memory, especially in manipulative rather than merely storage and recall aspects of working memory (Boonstra, Oosterlaan, Sergeant, & Buitelaar, 2005; Frazier, Demaree, & Youngstrom, 2004; Hervey, Epstein, & Curry, 2004; Martinussen, Hayden, Hogg-Johnson, & Tannock, 2005). Greater deficits are found in tests of nonverbal than verbal working memory (Martinussen et al., 2005).
Research on EF deficits in ADHD routinely relies upon tests of EF as the primary source for determining the existence of EF deficits (Biederman et al., 2006; Boonstra et al., 2005; Hervey et al., 2004; Jonsdottir, Bouma, Sergeant, & Scherder, 2006; Wilcutt, Doyle, Nigg, Faraone, & Pennington, 2005). In the case of adult ADHD, such research often finds these deficits in group comparisons of mean scores, suggesting that ADHD is in fact a disorder of EF. But research examining these deficits at the individual level of analysis (such as percent impaired or abnormal on the test) reports that EF deficits exist in just a minority (30%–50%) of cases (Biederman et al., 2006; Nigg, Wilcutt, Doyle, & Sonuga-Barke, 2005). Moreover, such deficits on EF tests are usually only weakly related to severity of ADHD symptoms, if at all (Barkley, Murphy, & Fischer, 2008; Jonsdottir et al., 2006). On the basis of such evidence some researchers have concluded that ADHD involves only a minority of cases that may have EF deficits (Nigg et al., 2005; Wilcutt et al., 2005). Some cases or types of ADHD may be associated with non-EF neuropsychological deficits, such as problems with motivation or an energetic pool of arousal (Sergeant, 2005; Wilcutt et al., 2005). Others go further and assert that ADHD is not a disorder of EF at all (Boonstra et al., 2005; Jonsdottir et al., 2006; Marchetta, Hurks, Krabbendam, & Jolles, 2008). Either of these conclusions may be correct. However, both of these assertions hinge on an unquestioned premise—that EF tests are the gold standard or most valid indicators of the presence of EF deficits.
This presumption is problematic for various reasons, not the least of which is that the definition of EF is quite ambiguous with little or no consensus among researchers (Castellanos et al., 2006). Some assert that the essence of the construct of EF is the cross-temporal organization of behavior to achieve future goals (Fuster, 1997) or the maintenance of problem-solving sets toward those goals (Welsh & Pennington, 1988). If that is so, then it is not clear how EF tests are sampling this conceptual domain. Those tests involve an exceptionally limited time window over which testing occurs (5–30 min each) and their content is not directly indicative of cross-temporal organization or maintenance of problem-solving toward goals. A further problem is that most EF tasks are complex, involving multiple cognitive processes, only some of which are supposedly reflecting the EF intended (Anderson, 2002; Castellanos et al., 2006). Such tests are often found to be significantly influenced by the level of intelligence as well (Mahone et al., 2002) making their results difficult to interpret as reflecting pure measures of a particular EF construct.
Most troubling from a clinical perspective is that EF tests have low or no ecological validity when judged against ratings of EF in daily life or direct observations of EF performance in natural settings. Past studies found EF tests to have low-order correlations that are often nonsignificant with patient self-ratings and those of significant others of the dysexecutive symptoms they observe in patients with frontal lobe injuries (Burgess, Alderman, Evans, Emslie, & Wilson, 1998; Chaytor, Schmitter-Edgecombe, & Burr, 2006; Wood & Liossi, 2006). Research using parent and teacher ratings of EF in children with frontal lobe lesions, traumatic brain injuries (TBI), or other neurological or developmental disorders also finds low or no significant relationships between EF ratings and EF tests (Anderson, Anderson, Northam, Jacobs, & Milkiewicz, 2002; Mangeot, Armstrong, Colvin, Yeates, & Taylor, 2002; Vriezen & Pigott, 2002). These studies typically find that the variance shared between any single EF test and EF ratings ranges from just 0% to 10%. Combinations of EF tests do not fare much better, sharing just 12%–20% of the variance with EF ratings. When IQ is covaried out of these results, even those significant relationships between EF tests and EF ratings may become nonsignificant (Mangeot et al., 2002) in keeping with studies suggesting that non-EF cognitive processes permeate EF tests. Likewise, when researchers directly observe the performance of frontal lobe injured patients in tasks in daily life, they find no relationship between impaired EF test performance and impairment in daily life activities presumed to involve EF (Mitchell & Miller, 2008). In contrast, studies of both autistic children and children with TBI have found moderate and significant relationships of EF ratings to measures of daily adaptive functioning (Gilotty, Kenworthy, Sirian, Black, & Wagner, 2002; Mangeot et al., 2002).
The constructs evaluated by EF tests are apparently not those assessed by EF ratings or by direct evaluations of EF in daily life (Alderman, Burgess, Knight, & Henman, 2003; Shallice & Burgess, 1991).
Of all the domains of major life activities expected to be influenced by EF, adult occupational functioning should be among the most affected, given its heavy emphasis on the cross-temporal organization and maintenance of behavior and problem-solving toward goals across days, weeks, or even months. Effective performance at work requires the capacity for self-management relative to time (time management), self-organization, planning and problem-solving, self-activation, and self-motivation to sustain the pursuit of larger delayed consequences over smaller, more immediate ones (deferred gratification), among other activities that are presumed to be facilitated by the EFs.
Increasing evidence indicates that adult ADHD is associated with various problems in occupational functioning. ADHD (hyperactive) children followed to adulthood are found to rank significantly lower than control groups in occupational status, to have significantly worse job performance ratings from employers, and to be more likely fired from employment and to be so more often (Barkley et al., 2008). Clinic-referred adults diagnosed with ADHD have been found to be more likely to be unemployed, to have been fired from employment, to have impulsively quit a job, to use more sick leave, to have applied for or be on disability pensions, to have changed jobs impulsively more often, and to have more chronic employment problems (De Quiros & Kinsbourne, 2001; Halmoy, Fasmer, Gillberg, & Haavik, 2009; Murphy & Barkley, 1996). Adults with ADHD also appear to have more dysfunctional career beliefs, more decision-making confusion, and greater work-related anxiety and external conflict regarding their careers (Painter, Prevatt, & Welles, 2008). Epidemiological studies (de Graaf et al., 2008) likewise found that adults with ADHD had significantly more days “out of role” or under-productive, had more than twice the risk of absence due to sickness, and more than twice the risk of workplace accidents than other working adults. They estimated the increased cost per ADHD worker to be $4,336 per year, not including the cost of the accidents. It is clear that ADHD in adults is associated with myriad problems in occupational history and current functioning. Yet, no research to date to our knowledge has examined the degree to which the EF deficits associated with ADHD in adults might contribute to their occupational problems. This paper attempted to do so using both EF tests and EF ratings in view of the evidence discussed above that each approach may be measuring quite different aspects of EF and so may contribute quite differently to those occupational difficulties.
Our EF tests were selected to assess the most commonly listed constructs involved in EF, as noted above, yet in a reasonable amount of time given the extensive battery of other measures being collected in the project (see Barkley et al., 2008). The battery included measures of response inhibition, maintenance of attention, resistance to distraction, set shifting and rule development, nonverbal working memory, and verbal working memory. This battery does not provide a comprehensive or exhaustive coverage of all constructs that have been attributed to EF, such as planning and problem-solving and emotional-motivational self-regulation. But it does provide some coverage of most of the EF constructs previously found to be deficient in reviews of EF tests in ADHD.
In summary, this paper examined the contribution of EF deficits to occupational functioning in adults who ranged in severity of ADHD symptoms, from those meeting full diagnostic criteria, to those referred for the disorder but falling short of such criteria, and to those in a general community control group. The second aim of this paper was to evaluate the relative merits of EF ratings and EF tests in predicting impairment in occupational functioning. In view of the problems with the ecological validity of EF tests noted above, we hypothesized that EF ratings would be more predictive of occupational problems than EF tests.
Three groups of participants were combined in this paper: (1) ADHD: 146 adults clinically diagnosed with ADHD; (2) clinical control group: 97 adults evaluated at the same clinic but not diagnosed with ADHD; and (3) community control group: 109 adult volunteers from the local community. These samples have been described in considerably greater detail elsewhere (Barkley et al., 2008). Both the ADHD and clinical control adults were obtained from consecutive referrals to an adult ADHD Clinic in a Department of Psychiatry at a major medical school. The community control adults were obtained from advertisements posted throughout the medical school lobbies and from periodic newspaper ads in the regional newspaper.
To be eligible, all subjects had to have an IQ of 80 or higher on the Shipley Institute of Living Scale. They also had to have no evidence of: deafness, blindness, or other significant sensory impairment; significant and obvious brain damage or neurological injury, or epilepsy; significant language disorders that would interfere with comprehension of verbal instructions in the protocol; a chronic and serious medical condition, such as diabetes, thyroid disease, cancer, or heart disease; or a childhood history of mental retardation, autism, or psychosis. To be placed in the ADHD group, clinic-referred participants had to meet the DSM-IV criteria for ADHD, excepting the age of onset criterion, as judged by an experienced clinical psychologist using a structured interview for ADHD (see below). Participants in the clinical control group were those evaluated at this same clinic who did not receive a clinical diagnosis of ADHD.
No precise age of onset of symptoms producing impairment was required for placement within the ADHD group as one purpose of this grant funded project was to examine the value of specifying various age ranges of onset for the diagnosis of ADHD in adults. Also, the results of prior studies do not support the validity of the age of onset of 7 years currently included in the DSM-IV diagnostic criteria for ADHD, especially when applied to adults (Barkley & Biederman, 1997; Barkley et al., 2008). All had the onset of their symptoms prior to age 21 and the mean age of onset was 7 years (median = 6 years). Typically, we require corroboration of ADHD symptoms and impairment from someone else who knows the person well, such as parents, siblings, or spouses/partners, as part of making a clinical diagnosis of ADHD. We did not do so here though we did collect such information for most participants. This permitted us to specifically examine the degree of agreement between those sources and the participants concerning their ADHD symptoms and EF. Of the 146 adults assigned to the ADHD group, 30 were inattentive types (20%), 6 were residual (4%), and 110 were combined types (76%) by clinician diagnosis.
The clinical control group comprised patients referred to this same adult ADHD clinic but not clinically diagnosed as having ADHD. The primary diagnoses given by the clinician to members of this group were varied but comprised the following: 43% anxiety disorders, 15% drug use disorders, 12% mood disorders, 4% learning disorders, 4% partner relationship problems, 4% adjustment disorders, 1% personality disorders, 1% ODD, and 17% no diagnosis.
The community control group consisted of relatively normal adults drawn from the local region via advertisements. To be eligible for this group, they must have met the criteria noted earlier for all participants. In addition, they had to have a score on the Adult ADHD Rating Scale (see below) based on current symptoms (by self-report) “below” the 84th percentile (within +1 SD of mean) for their age (using norms reported in Barkley & Murphy, 2006). They also had to be free of any ongoing medication for treatment of a medical condition or psychiatric disorder that could be judged to interfere with the measures to be collected here.
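The equivalence between the 84th percentile and +1 SD noted above holds under a normal distribution of scores. The following is a minimal sketch of that screening rule, not the project's actual scoring procedure; the function names and the norm values passed to them are hypothetical.

```python
from statistics import NormalDist


def percentile_for_z(z: float) -> float:
    """Percentile rank (0-100) of a z-score, assuming normally distributed scores."""
    return 100.0 * NormalDist().cdf(z)


def below_cutoff(score: float, norm_mean: float, norm_sd: float,
                 z_cutoff: float = 1.0) -> bool:
    """True if a raw score falls below norm_mean + z_cutoff * norm_sd.

    norm_mean and norm_sd stand in for age-based norms; the values used
    in any call are illustrative assumptions, not the published norms.
    """
    return (score - norm_mean) / norm_sd < z_cutoff


if __name__ == "__main__":
    # +1 SD corresponds to roughly the 84th percentile under normality.
    print(round(percentile_for_z(1.0), 1))
    # Hypothetical norm values, for illustration only.
    print(below_cutoff(score=10, norm_mean=8, norm_sd=4))
```

A strict inequality is used here because the text specifies scores "below" the cutoff; how ties at exactly +1 SD were handled is not stated.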
As we previously reported (reference withheld for blind review), the selection criteria were successful in selecting groups of patients that differed significantly from each other in severity of their ADHD symptoms for the interview-based assessment of symptoms, for both self- and other rated current ADHD symptoms and impairments, and for both self- and other rated childhood symptoms of ADHD and impairments (see Rating Scales and ADHD Interview below for these instruments). In all instances, the ADHD group was significantly more severe than the clinical control group, which was significantly more severe than the community control group.
The age of our ADHD group (M = 32.4, SD = 10.9) was significantly younger than the other two groups, by an average of 4 years from the community control group (M = 36.4, SD = 12.0) and 5 years from the clinical control group (M = 37.8, SD = 13.2), F = 6.78; p = .001. The ADHD group had significantly less education (M = 14.2 years, SD = 2.2) than the two control groups (clinical control M = 16.3, SD = 2.8; community M = 15.4, SD = 2.7), F = 15.05, p < .001; the community group also had significantly less education than the clinical control group (p < .05). The lower education of the ADHD group is consistent with prior research on the impact of ADHD on the educational outcomes of adults diagnosed with ADHD and of children with ADHD followed to adulthood (Barkley et al., 2008, chapter 9). The groups did not differ in their IQ scores (Shipley Institute of Living Scale; ADHD: M = 106, SD = 8.8; clinical: M = 109, SD = 8.8; community: M = 108, SD = 7.9), F = 2.79, p = NS. But the clinical group (M = 54.1, SD = 31.3) had a significantly higher occupational index than the other two groups (ADHD: M = 38.2, SD = 26.8; community: M = 42.3, SD = 26.7), F = 64.93, p < .001, who did not differ from each other on the Hollingshead Index of Social Position (Hollingshead, 1975). The groups did not differ in the percent who were currently employed (ADHD = 73%, clinical control = 71%, and community control = 77%). The sex composition of the groups was: ADHD = 68% men and 32% women; clinical control = 56% men and 44% women; and community control = 47% men and 53% women. The groups differed significantly in this composition (χ2 = 11.60, p = .003) with the ADHD group having a significantly higher composition of men than the two control groups. This finding is in keeping with many studies of children and adults with ADHD that demonstrate a greater representation of the disorder in men than women (Barkley, 2006; Kessler et al., 2006). 
As for the ethnic composition of the groups, 94% of each group identified themselves as of European American (White) descent. Such group differences in certain demographic and other characteristics are less important to the present paper given that the groups are being combined in order to study the contribution of EF deficits to impairment.
Concerning current psychiatric medication treatment status, 17% of the ADHD group, 30% of the clinical control group, and none of the community control group were on medication. To evaluate the potential effect that medication status may have had on ADHD severity, we compared those ADHD cases that were on medication to those not on medication in the following measures reflecting severity of their disorder: The frequency of their ADHD symptoms from the interview, their age of onset, the number of domains of impairment from the interview, the number of childhood ADHD symptoms (interview), the total score for ADHD symptoms from self-ratings in adulthood and in childhood, self-rated impairment total scores on these same scales, the total score for ADHD symptoms from ratings provided by others for both current and childhood behavior, and the total impairment scores provided by others for both current and childhood functioning. None of these comparisons were significant (p-values = .13–.91; specific detailed analyses are available upon request from the first author). We conducted the same analyses for those patients in the clinical control group who were currently on medication and who were not. Again, no differences were significant (p-values = .11–.99; specific detailed analyses are available upon request from the first author). We therefore were reasonably confident that the small proportion of patients in these two groups currently taking medication would not bias our results by significantly reducing the severity of their ADHD-related symptoms (the independent variable of interest). If it were to do so, however, the bias would likely make the study a more conservative one by reducing differences between the two clinical groups and the community control group. We therefore combined the medicated and unmedicated patients in each clinical group and combined all of the groups together for all subsequent analyses done in this paper.
After contacting a project staff member, all participants were scheduled for their initial diagnostic interview with the second author and IQ screening test administered by a Master's level psychological assistant. That interview was a structured interview of ADHD diagnostic criteria, including symptoms, onset, and impairment that was designed for our research projects on adult ADHD. These steps were done to determine eligibility for participation in any of the three groups. All eligible participants then signed written statements of informed consent as approved by the medical school institutional review board and were scheduled for their subsequent evaluation in this project on a separate date following this initial screening. This latter evaluation comprised the completion of a structured interview concerning various impairments in multiple domains of major life activities including demographic information, educational and work history, current and prior psychiatric treatment, driving history, money management, drug use and abuse, dating and marital history, and antisocial activities. The participants also completed several behavior rating scales, academic achievement testing, and neuropsychological testing. For some patients, their schedules necessitated that this testing be continued into an additional day. Following the evaluation, all participants were paid $100 for their participation. Significant others were paid $20 each for the forms we requested that they complete.
The vast majority of the results concerning the comparisons of these groups on the various measures collected in this project have been reported previously (Barkley et al., 2008). The present paper focuses on information not previously reported concerning relative utility of EF ratings compared with EF tests to predict impairment in the measures of occupational history and functioning.
All participants initially completed the Adult ADHD Rating Scale as part of the initial determination of ADHD symptoms (Barkley & Murphy, 2006). This rating scale contains the items from the DSM-IV criteria for ADHD with each item answered on a 4-point scale (0–3) using the response format of not at all, sometimes, often, and very often. We obtained these same ratings for current functioning from someone who knew the patient well, usually a parent (44%–67% of each group), spouse or cohabiting partner (30%–53%), or less often, a sibling (≤3% per group). The groups did not differ from each other in these percentages. An earlier DSM-III-R version of the current symptoms scale also correlated significantly with the same scale completed by a parent (r = .75) and completed by a spouse or intimate partner of the ADHD adult (r = .64; Murphy & Barkley, 1996). Agreement in the present project between participants and others who knew them well on this scale was .70 (p < .001) for the total ADHD symptoms score. This compares favorably with the results of another study on adults that found an inter-agreement correlation of .69 (Murphy & Schachar, 2000).
This is a paper-and-pencil interview consisting of the criteria from the DSM-IV for ADHD (Barkley & Murphy, 2006). This interview was employed by an experienced clinician during the initial interview with participants as part of the selection criteria used for identifying the groups as ADHD or not. Symptoms of ADHD were reviewed twice, once for current functioning (past 6 months) and a second time for childhood between with the requirement that the symptom only be endorsed if it occurs often or more frequently. The onset of symptoms was also obtained in this interview. Inter-judge reliability (agreement) on this structured interview for ADHD DSM-IV criteria was established in this project. This interview by this same expert clinician was audio taped. Approximately 11% (41) of these tapes were randomly sampled and received a blinded independent review by another expert to determine if the subjects' responses to this DSM-based interview met DSM criteria (as amended for onset, see above). Agreement between the two judges on whether or not the DSM-IV criteria for ADHD were met was 85.3% (kappa = .712, Approx. Tb = 4.76, p < .001).
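Percent agreement and kappa, as reported above, are related through the agreement expected by chance. The sketch below computes both from a 2 × 2 table of two judges' yes/no decisions. The cell counts in the example are hypothetical and are not the study's data, which are not broken down by cell in the text.

```python
def cohen_kappa(both_yes: int, a_yes_b_no: int, a_no_b_yes: int, both_no: int):
    """Return (observed agreement, Cohen's kappa) for two judges' yes/no decisions."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    if n == 0:
        raise ValueError("no rated cases")
    observed = (both_yes + both_no) / n
    # Marginal proportion of "criteria met" decisions for each judge.
    a_yes = (both_yes + a_yes_b_no) / n
    b_yes = (both_yes + a_no_b_yes) / n
    # Agreement expected if the two judges decided independently.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return observed, (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not the 41 sampled tapes).
    agreement, kappa = cohen_kappa(both_yes=20, a_yes_b_no=5, a_no_b_yes=5, both_no=20)
    print(f"agreement = {agreement:.1%}, kappa = {kappa:.3f}")
```

Kappa is lower than raw agreement whenever some agreement would be expected by chance, which is why both figures are usually reported together.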
The Shipley Institute of Living Scale, a short intelligence test, served as a measure of IQ (Shipley, 1946). It consists of a 40-item vocabulary test and 20 items assessing abstract thinking. The composite IQ score correlates well with other measures of intelligence (Zachary, 1988) and was employed here as a screening criterion for intellectual level as part of the study entry criteria.
For this project, we created an interview consisting of highly specific questions dealing with various domains of major life activities, including educational history, occupational history, antisocial activities, drug use, driving, money management, and dating and marital history. This interview was administered by a psychological technician holding a master's degree in psychology and trained in the evaluation of clinic-referred adults. The questions dealing with occupational history are the focus of this paper. Those questions dealt with: (1) the number of jobs held since leaving high school, (2) a rating of overall current work quality (5-point Likert scale), and (3) the number of jobs from which they had been fired, (4) on which they had trouble getting along with others, (5) on which they had trouble with their own behavior, (6) which they had quit due to boredom, (7) which they had quit due to hostility with their employer, and (8) which they had been formally disciplined due to substandard work. Because participants varied in the number of jobs they had held to date since leaving high school, the measures (3) through (8) above were converted to percentages of the total number of jobs held to date.
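The conversion of measures (3) through (8) to percentages of jobs held can be sketched as follows. This is an illustration only: the function name is hypothetical, and returning `None` for a participant with no jobs is an assumption, since the text does not say how such cases were handled.

```python
from typing import Optional


def percent_of_jobs(count: int, total_jobs: int) -> Optional[float]:
    """Express a count of jobs (e.g., jobs fired from) as a percentage of all jobs held.

    Returns None when no jobs have been held, because the percentage is
    undefined in that case (an assumption made for this sketch).
    """
    if total_jobs <= 0:
        return None
    return 100.0 * count / total_jobs


if __name__ == "__main__":
    # Hypothetical participant: 8 jobs since high school, fired from 2.
    print(percent_of_jobs(count=2, total_jobs=8))
```

Scaling by total jobs keeps a participant with a long work history from appearing more impaired than one with a short history simply because there were more opportunities for problems to occur.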
We also obtained ratings from the employers of our participants who were willing to grant such permission (Barkley & Murphy, 2006). Employers were kept blinded to the diagnosis of the participants. This employer scale contained questions about ADHD and oppositional defiant disorder symptoms and about any impairment in 10 domains of work activities, these being: relations with coworkers, relations with supervisors, relations with clients or customers, completing assigned work, educational activities, punctuality, meeting deadlines, operating equipment, operating vehicles, and managing daily responsibilities. Both the ratings of ADHD symptoms and the impairment ratings were answered using a Likert scale of 0–3 (rare to very often). The employer also provided an overall work performance rating using a 1–5 Likert scale (1 = excellent to 5 = poor). Only the overall rating of impairment (the sum across all the impairment items) and overall work performance rating were used here.
We had such ratings on 39 of the ADHD group (27%), 25 of the clinical control group (26%), and 50 of the community control group (46%). Within each group, we compared those on whom we had employer ratings to those without such ratings on age, sex, ethnic group, education, total current ADHD symptoms, age of onset of ADHD symptoms, the total number of domains they reported as being impaired, their total childhood ADHD symptoms retrospectively recalled, and the clinician Social and Occupational Functioning Assessment Scale (SOFAS) rating (see below). No differences were found in the ADHD or community control group on any of these measures (all ps > .05; detailed analyses are available upon request from the first author). Within the clinical control group, the only difference was that participants with employer ratings included a larger proportion of women than those without such ratings. All other comparisons were not significant. With this one exception, it appears that the subsets of each group on whom we obtained employer ratings are representative of their larger group in demographic factors and ADHD severity.
The SOFAS provides a clinician a means to rate functioning on a scale from 1 (grossly impaired) to 100 (superior or excellent functioning) based on the individual's social, occupational, and educational functioning (Patterson & Lee, 1995). Impairment is to be based on the totality of the patient's current mental functioning and not a result of lack of opportunity or other environmental limitations. Descriptors are provided at each 10-point marker on the scale to guide clinicians in making this rating. For instance, a score of 10 is indicated if the patient has “persistent inability to maintain minimal personal hygiene, or unable to function without harming self or others or without considerable external support (e.g., nursing care and supervision).” In contrast, a score of 70 would be given if there is “some difficulty in social, occupational, or school functioning but generally functioning well; has some meaningful interpersonal relations.”
The Deficits in EF Scale (DEFS) consists of 88 items with each item being answered on a 0–3 Likert scale (0 = rarely or not at all, 1 = sometimes, 2 = often, and 3 = very often). The scale has two versions for collecting self- and other-reports. Creation of the item pool and scale construction are explained in the paper by Barkley and Murphy (2009). An original pool of 91 items was created to capture the nature of EF deficits in the constructs of behavioral inhibition, nonverbal working memory and sense of time, verbal working memory and rule following, emotional, motivational, and arousal self-regulation, and planning and problem-solving. The DSM-IV symptoms of ADHD were specifically excluded from this scale to permit an evaluation of the relationships of the EF dimensions with ADHD symptom dimensions. The eventual set of 88 items was found to form five dimensions of executive dysfunction using factor analysis: Self-Management to Time (23 items), Self-Organization and Problem-Solving (21 items), Self-Discipline (inhibition; 23 items), Self-Motivation (11 items), and Self-Activation and Concentration (10 items).
We previously reported significant differences among these three groups on all five scales of both versions (Barkley & Murphy, 2009). The ADHD group was rated as having more severe executive dysfunction mean scores on all five scales compared with both the clinical and community control groups, with the clinical group also being rated worse than the community group. About 89%–94% of the ADHD group fell in the clinically impaired range (+1.5 SD above the mean of the community group) across these five scales as did 84%–98% of the clinical control group and just 7%–11% of the community group. This indicates very little overlap in the distributions of these two clinic-referred groups with the community control group.
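The impairment threshold described above (1.5 SD above the community group mean) can be sketched as follows. The scores in the example are hypothetical. The use of the sample standard deviation and of a strict "greater than" comparison are assumptions, since the text specifies neither.

```python
from statistics import mean, stdev
from typing import Sequence


def impairment_cutoff(reference_scores: Sequence[float], sd_multiple: float = 1.5) -> float:
    """Cutoff = reference-group mean + sd_multiple * reference-group SD (sample SD assumed)."""
    return mean(reference_scores) + sd_multiple * stdev(reference_scores)


def percent_impaired(scores: Sequence[float], cutoff: float) -> float:
    """Percentage of a group whose scale scores exceed the cutoff."""
    return 100.0 * sum(s > cutoff for s in scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical scale scores, for illustration only.
    community = [10, 12, 14, 16, 18]
    clinic_group = [20, 30, 15, 18]
    cutoff = impairment_cutoff(community)
    print(round(cutoff, 2), percent_impaired(clinic_group, cutoff))
```

Because the cutoff is derived from the community group itself, only a small share of that group is expected to exceed it, whereas the share in a clinical group depends on how far its distribution is shifted.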
The relationship of the self- to other ratings was established based on all participants collapsed across groups. They were: Self-Management to Time = .79 (p < .001), Self-Organization = .66 (p < .001), Inhibition Problems = .74 (p < .001), Self-Motivation = .69 (p < .001), and Self-Activation/Concentration = .75 (p < .001). This illustrates a reasonably satisfactory level of agreement between participants and others who know them well. The correlations of severity of ADHD symptom ratings to deficits in EF scale (DEFS) dimensional scores were: Self-Management to Time = .91 and .71 (p < .001; inattention, hyperactive-impulsive, respectively), Self-Organization = .80 and .68 (p < .001), Inhibition Problems = .85 and .84 (p < .001), Self-Motivation = .83 and .71 (p < .001), and Self-Activation/Concentration = .87 and .80 (p < .001). These relationships are substantial, implying sizeable shared variance between the ADHD and EF constructs, if not collinearity (essentially assessing the same constructs). Four of the five scales were not significantly correlated with IQ in this study (rs ranging from .03 to .10), but IQ was correlated to a small but significant degree with the Self-Organization/Problem-Solving scale (r = .15, p = .007, n = 342).
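The shared variance implied by a correlation is its square. As a brief illustration, the sketch below converts the inattention correlations reported above into percentages of shared variance; the function and dictionary names are hypothetical.

```python
def shared_variance_pct(r: float) -> float:
    """Percentage of variance shared by two measures correlated at r (that is, 100 * r squared)."""
    return 100.0 * r * r


# Correlations of each EF scale with inattention symptom severity, as reported in the text.
INATTENTION_CORRELATIONS = {
    "Self-Management to Time": 0.91,
    "Self-Organization": 0.80,
    "Inhibition Problems": 0.85,
    "Self-Motivation": 0.83,
    "Self-Activation/Concentration": 0.87,
}

if __name__ == "__main__":
    for scale, r in INATTENTION_CORRELATIONS.items():
        print(f"{scale}: r = {r:.2f}, shared variance = {shared_variance_pct(r):.0f}%")
```

By the same arithmetic, the 0% to 10% shared variance cited earlier for single EF tests and EF ratings corresponds to correlations no larger than about .32 in absolute value.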
The Conners Continuous Performance Test (CPT; Conners, 1995) is a standardized computer-administered task in which single letters are shown on a display screen at three different rates: one every second, one every 2 s, or one every 4 s. The task lasts 12 min. The variation in inter-stimulus interval allows examination of the effect of this variable on the participant's performance. The task uses a response format that is the reverse of most CPTs. The participant presses a button in response to every signal shown but then must cease or inhibit their responding when the target signal appears. Norms are available for this CPT from the publisher (Multi-Health Systems). The dependent measures employed here were the total number of omissions (missed targets), total commissions (false hits), reaction time (RT), and RT variability. The scores for omissions and RT variability were chosen to assess sustained attention, whereas the scores for commissions and RT were chosen to assess response inhibition, in view of prior factor analyses of these tests that reflect such factor loadings (Murphy, Barkley, & Bush, 2001). We previously reported the group differences on this test (reference withheld for blind review), where we found the ADHD group to be significantly worse than the community control group on all scores and worse than the clinical control group on Omission errors and RT variability.
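The four dependent measures can be made concrete with a minimal scoring sketch. It assumes a hypothetical trial log in which each trial records whether it was the signal requiring inhibition and the reaction time of any button press. The `Trial` structure and the use of the standard deviation as the RT variability index are assumptions; the published test computes its own normed scores.

```python
from statistics import mean, stdev
from typing import NamedTuple, Optional


class Trial(NamedTuple):
    is_nogo: bool           # True for the signal on which responding must be withheld
    rt_ms: Optional[float]  # reaction time in ms, or None if no button was pressed


def score_cpt(trials):
    """Tally omissions, commissions, mean RT, and RT variability for one session."""
    go = [t for t in trials if not t.is_nogo]
    nogo = [t for t in trials if t.is_nogo]
    # RTs of correct presses on trials that called for a response
    hit_rts = [t.rt_ms for t in go if t.rt_ms is not None]
    return {
        "omissions": sum(t.rt_ms is None for t in go),           # failed to respond
        "commissions": sum(t.rt_ms is not None for t in nogo),   # failed to inhibit
        "mean_rt": mean(hit_rts) if hit_rts else None,
        # Assumption: variability indexed by the sample SD of correct RTs
        "rt_variability": stdev(hit_rts) if len(hit_rts) > 1 else None,
    }


# Hypothetical six-trial log, invented for illustration.
trials = [
    Trial(False, 400.0), Trial(False, None), Trial(True, 350.0),
    Trial(False, 500.0), Trial(True, None), Trial(False, 450.0),
]
scores = score_cpt(trials)
```

In this log there is one omission (the second trial) and one commission (the third trial), and the mean RT is taken over the three correct presses only.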
The Stroop test measures the ability to inhibit competing responses in the presence of salient conflicting information (Stroop, 1935; Trenerry, Crosson, Deboe, & Leber, 1989). We employed it here as our measure of resistance to distraction, a form of inhibition often included in the concept of EF. The version and norms published by Trenerry and colleagues (1989) were used here. The task comprises three parts. In the first part, the participant reads a repeating list of color names (e.g., red, blue, green) printed in black ink. In the second part, the participant names the colors of a repeated series of Xs printed in ink of those same colors. In the last or interference condition, the participant must say the color of ink in which a color word is printed. For some words, the color of ink in which it is printed is the same as that of the word, whereas for others, the color of ink differs from that specified by the word. This portion of the task is believed to reflect problems with the capacity to inhibit habitual or dominant responses (reading the word, in this case). Three scores were derived from this last portion of the test (Interference): the raw scores for the number of items completed and the number of incorrect responses, as well as the percentile score. We previously reported the group comparisons on this test (Barkley et al., 2008), where we found just the ADHD group to differ from the community group. Only the percentile score was used here.
The Wisconsin Card Sorting Test comprises 128 cards, each containing sets of geometric designs that vary according to color, shape, and number (Heaton, 1981). The subject is given four cards and then asked to sort the remaining deck using feedback from the examiner. Following 10 correct sorts on a given category (e.g., color), the examiner switches the category unannounced and the subject must now discover the new sorting rule from feedback given by the examiner. Although many scores can be derived, we used just four: percent errors, percent perseverative errors, percent concepts, and categories achieved. These served as our measures of set shifting and rule development. We previously reported the comparisons among these groups on this test (none were significant; reference withheld for blind review).
Originally developed by Regard, Strauss, and Knapp (1982) as a nonverbal version of more commonly used verbal working memory and fluency tasks, the 5-Point Test involves a sheet of paper with 40 five-dot matrices on it (Lee et al., 1997). Participants are required to produce as many different figures as possible by connecting the dots within each rectangle within a 3-min time limit. Not all dots have to be used and only straight lines between dots are permitted. No figures are to be repeated. Participants are given a single warning on the first violation, but the rules are not repeated after any further infractions. Scores are the number of unique designs created, the number of repeated designs (perseveration), the number of rule infractions, and the percentage of designs that are repeated designs (percent perseveration). Patients with frontal lobe dysfunction have a significantly higher percentage of perseverative errors than do neurological patients without frontal involvement and psychiatric patients (Lee et al., 1997). Using a modified version of this same task, Ruff, Allen, Farrow, Nieman, and Wylie (1994) also found the task to be sensitive to frontal lobe injuries and perhaps more sensitive to right than to left lobe involvement. Here, we used just the number of unique designs generated in the task as our measure of nonverbal working memory. We have previously reported significant group differences on this test score among these three groups (Barkley et al., 2008).
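The design-based scores described above can be sketched as follows. The representation is an assumption: each design is encoded as a set of straight segments between dots numbered 0 to 4, so that two drawings count as the same design when they contain the same segments regardless of drawing order or direction. Rule infractions (e.g., curved lines) are assumed to be tallied separately by the examiner and are not modeled here.

```python
def score_five_point(designs):
    """Score a list of designs given in the order they were drawn.

    Each design is an iterable of (dot_a, dot_b) segments. A design is a
    repeat if an earlier design used exactly the same set of segments.
    """
    seen = set()
    repeats = 0
    for segments in designs:
        # Unordered segments in an unordered collection: (0, 1) equals (1, 0)
        key = frozenset(frozenset(seg) for seg in segments)
        if key in seen:
            repeats += 1
        else:
            seen.add(key)
    total = len(designs)
    return {
        "unique_designs": len(seen),
        "repeated_designs": repeats,
        "percent_perseveration": 100.0 * repeats / total if total else 0.0,
    }


# Hypothetical protocol: the second drawing repeats the first one.
drawn = [[(0, 1)], [(1, 0)], [(0, 1), (1, 2)], [(3, 4)]]
result = score_five_point(drawn)
```

Only `unique_designs` corresponds to the score used in the present analyses; the other two values are shown because the text defines them.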
This battery contains nine separate tests involving verbal learning, verbal memory, and nonverbal memory (Multi-Health Systems, Inc., 1995). A number of test scores are derived from these various tests (at least 3 per test) but most reflect learning and memory retention. We found significant group differences for most of the memory retention scores from this test and have reported the group differences for this entire test battery elsewhere (Barkley et al., 2008). Here, we used just the Digit Span (total forward and backward score) as our index of verbal working memory. Each digit string is presented twice with recall attempted after each presentation. In the first task, participants must recall the digit sequence as presented. In the second task, they must recall the digit sequences read to them in backward order. Again, each sequence is presented twice. In our earlier comparisons of these groups on this specific subtest, we found the ADHD group to perform worse than both the clinical and community groups, which did not differ from each other.
Using the same samples as in the present study, we previously reported specific difficulties in occupational functioning associated with adults with ADHD in comparison to both clinical and community control groups (Barkley et al., 2008). More members of both the ADHD and clinical control groups than of the community control group had problems getting along with others at work (30%, 18%, and 7% for the ADHD, clinical control, and community control groups, respectively) and had difficulties with their behavior or work performance on the job (53%, 50%, and 5%, respectively). Adults with ADHD reported having trouble with others, behavior problems at work, being fired or dismissed from a job, quitting a job out of boredom, and being disciplined by their supervisor on the job in a higher percentage of the jobs they had held than did participants in both the clinical and the community control groups. Those adults with ADHD also had quit more jobs over their own hostility in the workplace than did adults in the community group. Despite being blind to the diagnoses of our participants, employers rated the adults with ADHD as having significantly greater problems with inattention in the workplace than was the case for either control group. Compared with either control group of adults, the adults with ADHD were rated as being more impaired by their symptoms in performing assigned work, pursuing educational activities at work, being punctual, using good time management, and managing their daily responsibilities. As a consequence, the adults with ADHD were rated as having a poorer overall work performance level than were adults in either of the control groups.
Here, we examined the utility of the DEFS subscales in predicting these occupational difficulties using all participants. We used linear multiple regression analyses (SPSS version 17.0) with IQ forced into the equation at Step 1 to remove any potential relationship it may share with the DEFS and occupational measures. The DEFS were then allowed to enter stepwise at Step 2 if their contribution was significant (p < .05). Given the significant agreement noted above between the self-ratings and those provided by others on the DEFS, and to save space, we report here only the analyses using the self-ratings. The results for the self-ratings on the DEFS are displayed in Table 1. IQ was not significantly related to 10 of the 11 occupational measures but was significantly associated with the percentage of jobs participants had quit due to hostility with their employer.
It is clear from Table 1 that the Self-Organization scale was not related to any occupational measures. The Self-Activation/Concentration scale was associated with just the clinician SOFAS rating. In contrast, problems with Self-Management to Time, Inhibition, and Self-Motivation were predictive of five measures each, though not necessarily the same occupational measures. In fact, most occupational measures were predicted by just a single DEFS subscale. The strongest relationships were between the DEFS and the clinician rated SOFAS score, where three of the DEFS accounted for 63% of the variance in that clinician rating. For the remaining occupational measures, the amount of variance accounted for by the DEFS that contributed to any particular measure ranged from 5% to 22% in total.
Linear regression was again used to examine the contribution of the EF tests to these same occupational measures, forcing IQ in at Step 1 and allowing the EF tests to enter at Step 2 in a stepwise fashion if their contribution was significant (p < .05). These results are shown in Table 2. Most of the EF tests made no significant contribution to these occupational measures. No EF test made a significant contribution to the percentage of jobs from which a participant had been fired. Most of the remaining occupational measures were predicted by just one EF test score, Commission Errors on the CPT. Its contributions ranged from 1.6% (percent of jobs on which the participant was disciplined for substandard work) to 12.9% of variance (SOFAS rating). The 5-Point Test score for number of unique designs contributed significantly to three measures: employer rated impairment (7%), employer rated work performance (7.5%), and the clinician SOFAS rating (3.7%). RT from the CPT made a small but significant contribution to two of the occupational measures: the number of jobs held since leaving high school (1.7% of variance) and the clinician SOFAS rating (2.8%). As with the DEFS, the greatest contribution made by the EF tests was to predicting the clinician SOFAS rating (21.2% of variance). For the remaining occupational measures, the EF tests contributed 1.6%–18.6% of the variance.
Given the small and often nonsignificant relationship between the DEFS and EF tests reported previously (Barkley & Murphy, 2009), it is likely that much of the contribution made by one method (rating scales) to these occupational measures is not redundant with that made by the other (tests). We therefore examined the extent to which the EF tests and ratings contributed uniquely to the prediction of these same occupational impairments when examined jointly. We did not use IQ in these analyses given that it did not contribute significantly to the vast majority of the occupational measures in the analyses conducted above. Once more, we used multiple linear regression in which we first entered all five DEFS jointly at Step 1 and then all EF tests jointly at Step 2. We then conducted these same analyses using the reverse order of entry of these two assessment batteries. We employed this battery-wise entry method in order to evaluate the total contribution made by one approach to EF against that made by the totality of the other approach in predicting occupational problems. The results appear in Table 3.
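The battery-wise entry logic can be sketched in a few lines. This is an illustration only, not the SPSS procedure used in the study: it assumes ordinary least squares with an intercept, solved from the normal equations, and it reports only the variance explained at Step 1 and the additional variance at Step 2, without significance tests. All function names and data values are hypothetical.

```python
def r_squared(X, y):
    """R^2 of an OLS fit of y on the columns of X plus an intercept."""
    n = len(y)
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented normal equations: (A'A | A'y)
    M = [[sum(rows[k][i] * rows[k][j] for k in range(n)) for j in range(p)]
         + [sum(rows[k][i] * y[k] for k in range(n))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    beta = [M[i][p] / M[i][i] for i in range(p)]
    y_bar = sum(y) / n
    ss_res = sum((y[k] - sum(b * v for b, v in zip(beta, rows[k]))) ** 2
                 for k in range(n))
    ss_tot = sum((v - y_bar) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def blockwise_delta_r2(block_first, block_second, y):
    """Return (R^2 of the first block alone, additional R^2 from the second block)."""
    both = [list(a) + list(b) for a, b in zip(block_first, block_second)]
    r2_first = r_squared(block_first, y)
    return r2_first, r_squared(both, y) - r2_first


# Hypothetical data: one rating score, one test score, one outcome per person.
ratings = [[1], [2], [3], [4], [5], [6]]
tests = [[2], [1], [4], [3], [6], [5]]
outcome = [1.2, 1.9, 3.4, 3.8, 5.1, 6.3]

ratings_first = blockwise_delta_r2(ratings, tests, outcome)
tests_first = blockwise_delta_r2(tests, ratings, outcome)
```

Both orders of entry arrive at the same total variance explained. What changes is how much of it is credited to each battery, which is why the analyses were run in both directions.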
Entering the DEFS subscales first and then the EF tests, the findings indicated that the DEFS subscales made a significant contribution to 10 of the 11 occupational measures while the EF tests contributed significant and additional unique variance to just 1 of those 11 occupational measures. The DEFS contributed 8%–27% of the variance to these measures of workplace adjustment and 67% of the variance in the clinician SOFAS rating. Although not significant in their contributions to most impairment measures, the EF tests contributed 2%–27% additional variance but just 2% to the clinician SOFAS rating. The EF tests appear to provide little additional significant variance beyond that contained in the EF ratings for most of these occupational measures. Interestingly, both the EF ratings and the EF tests contributed substantially to employer rated impairment (40% of the variance) and work performance (46% of the variance) but neither contribution reached significance due to the small samples on which we had employer ratings. This argues for replication with larger samples, in which contributions of this size may reach significance.
When the order of entry was reversed, with EF tests being entered first followed by EF ratings, the EF tests now contributed to just three occupational measures (self-rated work quality, percent jobs quit due to problems with own behavior, clinician SOFAS rating), which is two more than was found above for the opposite order of entry. In contrast, the DEFS subscales contributed to nine occupational measures, with just one measure now becoming nonsignificant. In sum, although EF tests do contribute to a few of the occupational impairment measures, the EF ratings capture most of that contribution as well, contribute to the majority of such measures, and usually contribute more unique variance to them than do the EF tests.
This study set out to determine the contribution of EF to impairment in occupational functioning. A secondary aim was an examination of the relative utility of the two different methods of assessing EF (ratings vs. tests) in their capacity to predict such impairment. When examined individually and apart from the EF tests, self-ratings on the DEFS subscales were found to contribute significantly to all 11 occupational impairment measures, including self-rated work quality, the percentage of jobs on which these adults had experienced various behavioral and interpersonal problems or had been fired, employer ratings of overall work performance and impairment across a variety of work contexts, and clinician ratings of social and occupational adjustment on the SOFAS. Three DEFS were especially useful in these predictions, these being Self-Discipline (Inhibition), Self-Management to Time, and Self-Motivation with each contributing to five different occupational measures. The amount of variance in the occupational impairment measures accounted for by the DEFS ranged from just 5% to 22% except for the prediction of the clinician SOFAS rating (63%). These results indicate that EF as assessed by ratings of daily life activities makes some contribution to occupational impairments.
In comparison, when examined individually and apart from the EF ratings, just three EF test scores (two tests) made significant contributions to the occupational measures, the two most frequent being CPT Commission Errors and the 5-Point Test Unique Designs score. The first is typically interpreted as a measure of inhibition and contributed to various occupational impairments related to behavior, interactions with others, boredom, and hostility as well as to the number of jobs held since high school. It is understandable why problems with behavioral inhibition might contribute to such occupational difficulties. The CPT hit RT score is also viewed as a measure of response inhibition given its loading on the same factor as CPT Commission Errors in previous factor analyses (Murphy et al., 2001). It likewise contributed to the number of jobs held by our participants. These findings are also consistent with the findings above from the DEFS that problems with inhibition, or self-discipline, in daily life contribute to these same occupational difficulties.
The second EF test of any utility was the 5-Point Test, a measure of nonverbal working memory and fluency. It made separate contributions to several occupational measures, specifically those from the employer rating scale: employer rated workplace impairment and work performance quality. These results are consistent with current views that working memory may contribute to some of the inattention symptoms seen in ADHD (Barkley, 1997/2001) and thus to workplace problems related to inattention. Nonverbal working memory may be more impaired in ADHD than verbal working memory (Martinussen et al., 2005) and thus may contribute more to inattention in ADHD. This may explain the utility here of a nonverbal working memory test in predicting workplace problems possibly associated with inattention.
When the EF ratings and tests were examined jointly to evaluate their unique contributions to the occupational measures, the EF ratings contributed significantly to 10 of the 11 occupational measures when entered first and to 9 of them when entered second in the regression analyses. The greatest of these contributions was to the clinician SOFAS rating where they explained 45%–67% of the variance depending on the order of entry. The contribution of the EF tests when entered second after the EF ratings was not significant on 10 of the 11 measures. But it did contribute to a small degree (10% additional variance) to the percentage of jobs on which the individual had difficulties with their own behavior and work performance. Recall that this was the only occupational measure related to IQ and so it may be the involvement of IQ in the EF tests that led to this contribution. When the EF tests were entered first, they contributed to just two additional occupational measures, whereas the EF ratings continued to contribute unique variance to the majority of those measures. These results suggest that a few EF tests add some unique variance to the prediction of occupational adjustment beyond that contribution predicted by EF ratings, but the contribution is relatively small. EF ratings, in contrast, predict significant variance in the majority of occupational adjustment measures and often do so beyond that variance captured by EF tests. We hypothesized that EF tests would show substantially weaker relationships with our various measures of occupational impairment than would the EF ratings. This hypothesis was confirmed. If predicting occupational impairment is an important aspect of the validity of EF measures, then EF ratings were superior to EF tests in doing so. These findings also agree with the study by Mitchell and Miller (2008) that found that EF tests were only modestly related to both ratings and observations of daily functional activities.
These results provide further support for our earlier contention that EF tests should not be viewed as the only standard of evidence for establishing the presence of EF deficits, particularly in ADHD as prior research has stated or implied (Biederman et al., 2006, 2008; Boonstra et al., 2005; Hervey et al., 2004; Jonsdottir et al., 2006; Nigg et al., 2005; Wilcutt et al., 2005). As reported elsewhere (Barkley & Murphy, 2009), EF deficits are present in the vast majority of adults with ADHD (89%–98%) when ratings of EF in daily life activities are used. Thus, where weak or no EF test deficits are found in those having ADHD relative to control groups or are not significantly related to ADHD severity, this should not be taken to indicate that EF deficits are not part of or related to ADHD, as some have concluded (Boonstra et al., 2005; Jonsdottir et al., 2006; Marchetta et al., 2008; Wilcutt et al., 2005). Deficits may not be apparent in tests evaluating EF while being ubiquitous using ratings. And if predicting impairment in major life activities is evidence of validity, then EF ratings are far superior to EF tests in doing so. Thus, although neither method alone should serve as the gold standard for determining the presence of EF deficits, EF ratings may be taken to have greater validity at predicting impairment in daily life activities, and particularly occupational adjustment.
Our results, however, should not be taken to mean that EF tests may not be valuable for assessing certain features of EF. We assert this despite the numerous concerns we have raised about these tests as the best indices of the construct of EF and despite their limited predictive utility concerning impairments studied here. Although there appears to be uniformity of opinion that the construct of EF is broad and comprises multiple components (Castellanos et al., 2006), we believe that the nature of EF is also multileveled and hierarchical. Regardless of its definitional ambiguity, EF, like the prefrontal cortex that largely facilitates it, is most likely organized in a hierarchical system that allows smaller sequences of behavior to become clustered into more complex, nested sets of larger goal-directed actions that are sustained over longer intervals of time (Badre, 2008). And those complex behavioral sets can be further arranged into even larger nested meta-sets to accomplish even longer term and larger goals spanning days, weeks, months, or even years (Botvinick, 2008).
This hierarchical functional organization of the prefrontal lobes is highly consistent with the view that the EFs largely created by those lobes must likewise be hierarchically organized. The EFs should comprise increasingly higher and larger levels of behavioral organization with each comprised of longer and more varied and complex goal-directed actions. In our view, the nature of EF is likely to be similar to that of driving a motor vehicle that involves not only multiple cognitive processes at a basic level, but also several hierarchically organized levels of abilities (basic cognitive, instrumental, tactical, and strategic). Research on driving, like that on EF, has shown little or no relationship between clinic-based evaluations of basic cognitive abilities known to be necessary but not sufficient for driving (vision, RT, inhibition, motor coordination) or even simulator-based assessments of driving and measures of tactical and strategic driving levels involved in actually driving in real-world situations, such as may be measured by ratings or direct observations of actual driving behavior. And such basic cognitive clinic-based tests have little or no association with measures of adverse driving outcomes (citations, accidents; Barkley, Murphy, DuPaul, & Bush, 2002). These results can arise when measures of the lowest level of a complex and multilevel domain are necessary, yet not sufficient, to represent higher level functions or abilities utilized in daily life situations and done to meet strategic goals.
EF likewise can be conceptualized as not just involving multiple components at a basic instrumental neurocognitive level as is likely assessed by EF tests, but also involving higher levels of more complex behavioral organization at a tactical level (daily activities, immediate social goals). And those tactical actions may be further organized upward into even more complex actions and for larger goals that can be considered a strategic level of EF (longer term social, economic, occupational, and other goals spanning weeks, months, and even years). Above this level may be one of ultimate utility that encompasses the individual's long-term welfare and achievements that is likely reflected in measures of impairment that capture the long-term consequences of EF deficits at the lower levels.
At each new level of this hierarchy, additional abilities and skills come into play that are not represented in lower levels and yet contribute to mastering and effectively performing executively at that and higher levels of EF. Therefore, one should not be surprised that basic measures of more proximal, short-term instrumental EF constructs (EF tests) have little relationship with the tactical or strategic level of EF as represented in ratings of EF ascertained across months. It should also not be puzzling that such basic EF tests make little contribution to the ultimate outcomes of EF as indexed indirectly by impairment in domains of major life activities that span years of EF utilization (occupational, driving, financial, social, marital, educational, etc.) and are three levels or more removed in this hierarchy. All this is to say that the choice of any method for the assessment of EF should be dictated by the purpose of the research or clinical undertaking, rather than by pitting measures of EF against each other as if all assessed the same level.
The limitations of this study should be considered in evaluating the foregoing results and interpretations. First, our study is limited by the method used to create the EF scale. It is possible that, had other items of EF been generated besides the 91 we tested, additional dimensions of EF deficits might have been unearthed. Yet, we believe our initial efforts were sufficiently comprehensive to provide a first broad pass at determining the possible nature of EF deficits as evaluated by rating scales. It was certainly more comprehensive, theory-based, and empirically constructed than the brief 20-item EF scale created in the study by Burgess and colleagues (1998). Earlier findings (Barkley & Murphy, 2009) also suggest that no matter what narrowband subscales of EF deficits particular studies may identify, such dimensions are highly inter-correlated, implying the possibility of a single over-arching meta-construct of EF shared across all these dimensions.
A second limitation was the relatively circumscribed battery of EF tests used here to assess the various constructs believed to characterize EF. Had other tests of these constructs been used or had additional constructs, such as planning and problem-solving, been assessed, some differences in these results might have been evident. However, other studies that have used the DEFS and examined its relationship with an entirely different battery of EF constructs did include such tests of planning and problem-solving as well as different measures of response inhibition and working memory. Those measures were collected at the adult follow-up of children with ADHD followed to adulthood, and the study found results very similar to those discussed earlier, namely little or no relationship between EF ratings and those EF tests (Barkley & Fischer, 2010). Thus, we believe that the relatively poorer showing of EF tests in predicting impairment here is not entirely a consequence of our selection of these EF tests.
A third limitation may have occurred in the occupational impairment measures we derived from self-reports of work history that may be affected by recall biases and other influences that may make them less than accurate in reflecting the actual employment problems experienced by these participants. We tried to overcome this by also obtaining the reports of employers about current workplace problems and performance but could only get consent to do so for a subset of participants. This does not completely eliminate the problems with the historical nature of the self-reported information. We also did not measure other problems in the workplace, such as actual productivity, use of sick leave, and workplace accidents that have been found to be associated with ADHD in adults in earlier studies (de Graaf et al., 2008). Our employer rating scale can only be considered a relatively crude index of actual workplace problems that might have been revealed by more thorough or detailed workplace measures or direct observations of working. But such measures can be highly intrusive into the occupations of research participants and cannot be undertaken easily or lightly in conducting research with psychiatric patients. As an initial attempt to study occupational impairment, we felt that our employer rating scale offered minimal intrusiveness into the important domain of work for our participants while still allowing us a cursory glimpse of their potential workplace problems.
A further limitation may have occurred in the procedure of collapsing the ADHD and clinical control groups in with the community group to study the relationships among the EF measures and impairment measures. Such a heterogeneous grouping would not be expected to resemble the distributions of these measures in a general population sample of adults, and the greater range of scores and over-representation of extreme (clinical) cases may have inflated the size of the relationships obtained. This would be expected to do as much for the EF tests as for the EF ratings and so would not necessarily account for the differences we found between those measures in their predictive utility of impairments. Even so, we concede that these findings may not be readily extrapolated to the general population or even to other clinical disorders not represented in the clinical samples used here.
With these limitations in mind, this study found that EF deficits make a significant contribution to the occupational problems of adults who vary as a function of the severity of their ADHD symptoms. It therefore seems to be the deficits in EF associated with ADHD that are in part contributing to the workplace problems documented here and in past studies of adults with ADHD and in children with ADHD followed to adulthood. In this regard, ratings of EF in daily life activities may be more predictive of impairments in occupational history and current workplace functioning than are EF tests. These results and those of prior studies indicate that these two methods of assessing EF are likely sampling different aspects and even different hierarchically organized levels of EF. Consequently, neither should be taken alone as the sole index of the presence of EF deficits in various clinical populations, and particularly in ADHD.
This research was supported by a grant to the first author from the National Institute of Mental Health (MH54509) while he was at the University of Massachusetts Medical School. The preparation of this paper was also supported by a small grant to the first author from Shire Pharmaceuticals. The opinions expressed here, however, do not necessarily represent those of the funding institute or of Shire Pharmaceuticals.
Dr. Barkley is a consultant/speaker for Eli Lilly, Shire, Novartis, Janssen-Ortho, Janssen-Cilag, and Medice. He has also served as an expert witness for Lilly Canada. He receives product royalties from Guilford Publications, Jones and Bartlett, ContinuingEdCourses.net, and J & K Seminars. Dr. Murphy receives product royalties from Guilford Publications.
We are exceptionally grateful to Tracie Bush for her assistance with the evaluation of the research participants in the study and with their data entry. We also wish to thank Laura Montville for her administrative and data entry assistance and Thomas Babcock, DO, for his comments on an earlier draft of this manuscript.