This investigation examined the reliability and construct validity of self- and observer-ratings on the CAARS using a large sample of adults referred to a university-affiliated ADHD clinic for assessment of attention problems. Our goal was to provide information that would help clinicians integrate data from multiple informants in the assessment of adult ADHD. Measures of concordance between self- and observer-ratings on the CAARS were examined at the item level and at the level of cluster scores. Measures of concordance were also calculated between self- and observer-ratings on the CAARS and clinician-identified symptoms on the CAADID. Using the clinician’s diagnosis, determined through a comprehensive evaluation, as the “gold standard,” we evaluated the sensitivity and specificity of CAARS self- and observer-rated cluster T-scores in detecting ADHD. Finally, divergent validity was assessed by examining the rate at which item-level symptom counts and cluster T-scores in the clinical range for ADHD were also observed in other psychological disorders.
The majority of the participants in this sample (68.3%) were diagnosed with some form of ADHD; the large number of individuals with ADHD in our sample allowed us to examine concordance among raters to see how our findings compare with those previously reported. In addition, 27.1% of participants in our sample were found to have primary diagnoses other than ADHD on Axis I, and 4.6% had no Axis I diagnosis. Having a substantial percentage of our sample who presented for assessment of attention problems, but who were ultimately determined to have disorders other than ADHD, also allowed us to explore the discriminant validity of the CAARS within a sample of high-risk adults in an outpatient setting.
Across reporters, among the most frequently endorsed inattentive symptoms in this sample were “distractible,” “difficulty sustaining attention,” and “doesn’t follow instructions or follow through”; this is consistent with previous findings of self- and observer-reports of adults with ADHD, as well as with symptoms commonly reported in control samples (Downey et al., 1997; Murphy & Barkley, 1996). Also consistent with previous findings, concordance between self- and observer-ratings of individual items ranged from slight to fair, with κ values ranging from 0.11 to 0.37 (Downey et al., 1997; Zucker et al., 2002). Interestingly, those items with the lowest κ values were also those that were most frequently reported: “distractible,” κ = .11; “difficulty sustaining attention,” κ = .15; and “doesn’t follow instructions or follow through,” κ = .20. This may suggest that these characteristics are sufficiently common as to be difficult for reporters to reliably assess across situations (Murphy & Barkley, 1996). On the cluster indices, however, correlations between self- and observer-ratings rose to between r = .39 and r = .59; these concordance rates were similar to those previously reported by Kooij and colleagues (2008).
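For readers less familiar with the item-level statistic reported above, the following is a minimal sketch of how Cohen's κ can be computed for one symptom item rated by two informants. The function name and the ten endorsement values are hypothetical and chosen only for illustration; they are not data from this study, and the sketch is not a description of the software used in the analyses.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of cases on which the two raters gave the same rating.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical endorsements of a single item (1 = endorsed, 0 = not endorsed).
self_report = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
observer = [1, 1, 1, 1, 1, 0, 0, 1, 1, 0]

kappa = cohen_kappa(self_report, observer)
print(f"kappa = {kappa:.2f}")
```

In this illustrative case the two raters agree on 7 of 10 cases, yet κ is only about .21, because both endorse the item so often that chance agreement is already .62. This property of κ may be one reason that very frequently endorsed items show low values.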
Among the different observers (friends, spouses, and parents), there was only one significant difference in agreement with self-ratings on symptom-specific items; this suggests that at least with respect to the core symptoms, various observers are likely to provide equally relevant data. Though some researchers have suggested that supervisors or others in a position to evaluate individuals in a structured setting may provide unique data (Belendiuk, Clarke, Chronis, & Raggi, 2007), others have found that employers actually tended to report lower levels of ADHD symptomatology than either participants or other observers (Barkley et al., 2008). Unfortunately, there were too few supervisors or coaches represented in our sample of observers to address this issue.
Participants generally reported greater symptomatology than did observers; this was reflected in a consistently higher frequency of DSM-IV symptom endorsement at the item level, as well as in higher mean T-scores on all CAARS clusters. On the other hand, the frequency ranking of symptoms was fairly consistent between self and observer. This consistency in frequency ranking was not preserved in clinician ratings of the DSM-IV items, however: clinicians were more likely to conclude that avoidance of tasks requiring sustained mental effort was present than were either participants or observers, and they were less likely to conclude that being “on the go” or acting as if “driven by a motor” was present. This suggests that participants and observers may interpret the behaviors described on the rating scales similarly, and may be relatively close in the degree to which they believe these behaviors are common in the general population. However, for at least some of the items that are central to the diagnosis of ADHD, participants and observers appear to differ from clinicians in their conceptualization of the behavior or in their estimation of how “normal” it is. Given that one of the difficulties in diagnosing ADHD lies in determining how extreme the behavior is in relation to developmental and cultural norms, these findings suggest that it may be critical for clinicians to elicit concrete examples of behaviors and clear ratings of their frequency (e.g., daily, several times a day) in determining the degree to which a putative symptom is actually maladaptive. This conclusion is similar to Murphy and Barkley’s (1996) observation that several of the Diagnostic and Statistical Manual of Mental Disorders (4th ed.) (DSM-III-R) criteria for ADHD were endorsed too frequently on both self- and observer-rating scales to provide an adequate basis for discriminating between those with and without the disorder.
Clinically-elevated cluster T-scores demonstrated a relatively poor balance of sensitivity and specificity in predicting ADHD diagnosis. While self-rated DSM-IV Inattentive Symptoms T-scores were sensitive in detecting the presence of ADHD, they were not specific. On this index, even when both self- and observer-ratings were in the clinical range, the specificity only improved to 0.54, while sensitivity fell to 0.63. Only T-scores in the clinical range on the Conners’ ADHD Index allowed for sensitivity and specificity above 0.60. As such, clinically-elevated T-scores on the CAARS clusters are relatively limited in the information they contribute to differentiating ADHD from other psychiatric disorders that commonly manifest in adulthood.
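The trade-off described above can be illustrated with a short sketch that computes sensitivity and specificity under two decision rules: self-rating alone in the clinical range, and both self- and observer-ratings in the clinical range. The cutoff of T = 65 is the clinical-range threshold used in this study; the function name, variable names, and the ten T-score pairs are hypothetical and are not study data.

```python
CLINICAL_CUTOFF = 65  # T-scores at or above this value count as clinically elevated


def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for boolean predictions vs. diagnoses."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case with and one without the diagnosis")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical T-scores on one cluster and clinician diagnoses (True = ADHD).
self_t = [72, 68, 66, 70, 61, 75, 67, 58, 69, 64]
obs_t = [66, 60, 71, 68, 59, 70, 55, 62, 66, 50]
adhd = [True, True, True, True, True, True, False, False, False, False]

# Rule 1: self-rating alone is in the clinical range.
self_elevated = [t >= CLINICAL_CUTOFF for t in self_t]
# Rule 2: both self- and observer-ratings are in the clinical range.
both_elevated = [s >= CLINICAL_CUTOFF and o >= CLINICAL_CUTOFF
                 for s, o in zip(self_t, obs_t)]

sens_self, spec_self = sensitivity_specificity(self_elevated, adhd)
sens_both, spec_both = sensitivity_specificity(both_elevated, adhd)
print(f"self only: sensitivity={sens_self:.2f}, specificity={spec_self:.2f}")
print(f"both:      sensitivity={sens_both:.2f}, specificity={spec_both:.2f}")
```

Requiring agreement between informants can only remove positive calls, so specificity cannot fall and sensitivity cannot rise relative to the self-only rule. In these illustrative data, sensitivity drops from 5/6 to 4/6 while specificity rises from 0.50 to 0.75.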
This lack of sensitivity and specificity in detecting ADHD in individuals with attention problems was underscored by the high percentage of participants with mood or anxiety disorders who produced cluster T-scores in the clinical range. DSM-IV Inattentive Symptoms were particularly highly elevated across groups, with 94% of those with mood disorders and 87% of those with anxiety disorders scoring at or above T = 65. The Conners’ Hyperactivity/Restlessness Index was least represented across groups: only 24% of those with mood disorders and 13% of those with anxiety disorders scored in the clinical range. While including information from observer-ratings reduced the frequency with which clinically-elevated cluster scores were represented in these groups, 4 of the 8 indices were found to be at or above T = 65 in more than 25% of those with mood or anxiety disorders who did not have ADHD. Finally, an examination of mean cluster scores in those with ADHD, primary mood disorders, and primary anxiety disorders demonstrated that individuals with mood disorders are especially likely to be indistinguishable from those with ADHD on the CAARS.
Taken together, these findings suggest that while the CAARS is appropriate for screening for the presence of attention problems to determine whether or not a more thorough evaluation is necessary (Conners et al., 1999), it may be misleading if given substantial weight in the determination of differential diagnosis. This conclusion is supported by studies in smaller samples of adults with depression, anxiety, or substance use disorders that have reported cluster scores comparable to those found in ADHD (Belendiuk et al., 2007; Cleland, Magura, Foote, Rosenblum, & Kosanke, 2006; Solanto et al., 2004). Diagnosing ADHD in adults may require different clinical skills than diagnosing the disorder in children, both because the symptoms may manifest differently, and because attention problems are common to many disorders that peak in adolescence and adulthood.
On the other hand, it is possible that there is a subset of the items on the CAARS that would more effectively distinguish between ADHD and other adult psychiatric disorders. For example, in their analysis of the DSM-IV ADHD criteria endorsed by adults in clinical interviews, Barkley and colleagues (2008) utilized logistic regression to identify 4 of the current 18 DSM-IV criteria that maximally discriminated participants with ADHD from both community and clinical controls; the optimized subset of symptoms correctly classified 86% of ADHD and 47% of clinical control cases (Barkley et al., 2008). Given that in our sample the Conners’ ADHD Index was more effective at discriminating between ADHD and other Axis I disorders than were DSM-IV cluster scores, it may be possible to use regression in a similar manner to identify a subset of these items that would be maximally sensitive and specific in distinguishing ADHD from other forms of adult psychopathology that impair attention. However, because individuals who are diagnosed with ADHD in adulthood are frequently clinically complex and often present with multiple comorbidities (Barkley et al., 2008), the boundaries and factor structure of ADHD as it manifests in adulthood have yet to be defined. To begin to address this issue, we are currently in the process of conducting factor analyses using a portion of this data set to examine how ADHD in adulthood compares to the child form of the disorder.
Overall these findings point to the need for careful examination of self-reported symptoms of adult ADHD, and particularly of inattentive symptoms, in determining their relevance to a diagnosis. This is especially the case if information about childhood symptoms is unreliable because of lack of access to appropriate reporters, or because of patient difficulty in remembering details about childhood behaviors. Individual symptoms may not be highly concordant across reporters, and the attributions made about inattentive or impulsive behaviors are likely to vary among observers as well as across time and situation. Further, identification of symptoms as “present” or “absent” must be considered carefully in the context of etiology and functional impairment. In sum, the CAARS is an invaluable tool for identifying clinically significant problems with attention, but should be followed by a thorough clinical evaluation to determine differential diagnoses in adults seeking evaluation for ADHD.
Limitations and Future Directions
The findings reported here must be considered in light of several limitations. First, the comprehensive evaluations in our clinic include the CAARS-S and CAARS-O, and thus clinicians’ diagnoses were not determined completely independently of these measures. However, the CAARS manual indicates that this measure is appropriately used as a screening measure (Conners et al., 1999), and the data from the CAARS were considered in the context of a wealth of other clinical and collateral information in determining diagnoses. We therefore believe that our conclusions remain relevant to clinicians using this measure as a tool in clinical practice.
Second, participants were drawn exclusively from referrals to a specialty ADHD clinic, and thus they may have been more likely to identify attention problems among their primary complaints than individuals recruited from a more general outpatient psychology or psychiatry setting. As such, our findings with respect to discriminant validity may be limited. On the other hand, almost a third of our sample was not diagnosed with ADHD; among those who were, just under half met criteria for at least one comorbid Axis I psychiatric diagnosis other than nicotine dependence. Considering that mood and anxiety disorders are characterized by significant attention problems (Castaneda, Tuulio-Henriksson, Marttunen, Suvisaari, & Lonnqvist, 2008), and that among individuals with ADHD rates of comorbid conditions have been reported to be as high as 80% (Barkley et al., 2008), our findings may maintain some relevance to a wider range of mental health settings.
Another issue with respect to the representativeness of our sample is related to our clinic billing practices: As with many specialty outpatient clinics, our services are necessarily limited to those with some resources (e.g., insurance), and it is unclear how our patients compare to those who are likely to seek evaluations in settings serving a broader population. Given that research has demonstrated that adults with ADHD often present with significant functional deficits that affect job stability and performance (Murphy & Barkley, 1996), our sample may be largely composed of individuals who have external resources or who have developed compensatory strategies that mitigate the impact of ADHD on their lives. As such, it will be important for future research to examine the psychometric properties of self- and observer-reports for individuals in settings with a broader clinical base (e.g., public health clinics), and for individuals whose lives may have been particularly affected by their symptoms (e.g., those seen in unemployment offices or prisons). In fact, despite the evidence that ADHD and substance abuse disorders are often highly comorbid (Biederman, Wilens, Mick, Faraone, & Spencer, 1998; Cleland et al., 2006; Levin, Evans, & Kleber, 1998; Murphy & Barkley, 1996), a relatively small percentage of our patients had current comorbid substance abuse disorders; this, again, supports the possibility that our sample may be composed largely of a relatively mild subgroup of the larger population of adults with ADHD.
In sum, this study addresses an important gap in the literature by using a large clinical sample to examine the reliability and validity of the CAARS as a measure of ADHD in adults, and it provides unique and invaluable information to clinicians as to how this tool can best be used to determine the presence and diagnostic relevance of attention problems. Our findings suggest that the CAARS is a useful instrument when used as one source of information in an evaluation for ADHD in adults. Self- and observer-reports contribute unique information, and considering these two sources of information together may be useful in developing hypotheses regarding differential diagnoses. However, our findings also underscore the importance of using the CAARS only as a screening tool or as part of a more comprehensive evaluation; the guidelines for diagnosing ADHD in adults remain ambiguous, and clinical judgment will necessarily continue to play a significant role in determining how best to interpret the presence of attention problems in adulthood. As such, both self- and observer-ratings on the CAARS can be helpful in identifying problems and in forming hypotheses as to possible underlying pathology in adults being evaluated for attention problems, but the instrument does not, by itself, distinguish between ADHD and other adult psychiatric disorders.