In standardizing NP test scores, education level is frequently considered in order to correct for the effects that schooling has on cognitive ability. However, previous studies have found that grade attainment is not the best indicator of educational quality and may result in underestimating NP test performances among African Americans, thereby making them more prone to an erroneous diagnosis of NCI. As a result, reading level has been suggested to be a more accurate reflection of one's true educational quality, especially among African Americans.
Despite similar self-reported years of education among African Americans and Whites, reading grade-levels based on the WRAT-3 proved to be significantly lower for the former group. This is consistent with past studies (Manly et al., 1998; Manly et al., 2002), which demonstrated that African Americans had attained a lower quality of education (operationalized as WRAT-3 reading grade-equivalent) than Whites matched for years of education.
Within each ethnic group, there were significant differences between NP test scores using the two correction methods. Among the African Americans, significantly higher scores were obtained on visual attention and psychomotor tests (e.g., Trail Making Test and Grooved Pegboard) when WRAT-3 correction was used. Conversely, this group achieved higher scores primarily on measures of executive functioning and verbal attention (e.g., Letter/Number Sequencing, COWAT, and PASAT) when scores were corrected using years of education. Among the Whites, scores obtained via WRAT-3 correction were consistently higher than those based on grade attainment. However, examination of the differences in scores obtained from the two methods shows most of them to be quite small, on the order of about 1 T-score unit. Thus, while the differences were statistically significant, it is unclear whether they were of clinical significance. The one exception was the PASAT, a measure of working memory. In addition, a significant interaction was seen between ethnicity and correction method across almost all tests, further indicating that NP test scores differed depending on both correction method and ethnicity.
These findings differ somewhat from those of Ryan et al. (2005), who found that the African Americans within their sample, similar to the Whites, had consistently higher scores across tests when the WRAT-3 grade level was used as a correction factor. One explanation may be the significant difference in the actual grade attainment between their White and African American cohorts (14.3 vs. 11.7, respectively). In our sample the two groups had equivalent grade attainment. Thus, there was a greater discrepancy between grade attainment and reading grade-equivalent among their African American cohort. In addition, Ryan et al. (2005) examined impairment rates (defined as 1.5 SDs below the mean for each measure of interest) only in those individuals who had a significant discrepancy between reported grade attainment and reading grade-equivalent. Therefore, the greater discrepancy between self-reported education and reading grade-level likely resulted in higher scores when the latter was used as a correction method. The disparate findings may also be the result of minor differences among the test batteries used. Ours included primarily measures requiring processing speed, but little in the way of language and reasoning skills, which some might argue are more highly correlated with education. However, previous findings from our group indicate that performance on even simple reaction time measures is predicted by education level (Levine et al., 2004). The findings among our African American sample are perhaps more consistent with those of Johnstone et al. (1997), who found varying rates of NCI depending on whether reading level or years of education was used as a correction factor. Based on a sample of primarily White adults age 40 and under with a history of traumatic brain injury, the authors reported that reading-based scores (derived from the WRAT-R or WRAT-3) were associated with greater impairment on both parts of the Trail Making Test. In addition, this method resulted in a larger discrepancy in scores between cognitive and motor tasks. In contrast, they found that scores based on years of education were associated with greater rates of impairment on motor tasks and nonverbal IQ, but that the discrepancy between cognitive and motor performance was not as remarkable. Thus, the authors suggested that WRAT-based correction is associated with greater variability of impairment across abilities, and that this is perhaps more reflective of the greater sensitivity of this method. Note that Johnstone et al. (1997) did not use the grade-equivalent from the WRAT, but rather calculated z-scores derived from the reading subtest score. Z-scores for each of their cognitive domains of interest were then subtracted from the reading z-scores in estimating rates of impairment. Thus, the difference in methodology between their study and ours makes comparison difficult.
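To make the z-score discrepancy approach described above concrete, the following is a minimal sketch. It is not the procedure of Johnstone et al. (1997) as published; the function names, the assumed standard-score scale (mean 100, SD 15), and the example values are illustrative assumptions only.

```python
from typing import Dict


def z_from_standard(score: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Convert a standard score to a z-score.

    The mean of 100 and SD of 15 are assumed defaults for illustration.
    """
    return (score - mean) / sd


def reading_discrepancy(reading_z: float, domain_z: Dict[str, float]) -> Dict[str, float]:
    """Subtract each domain z-score from the reading z-score.

    A larger positive value means the domain score falls further below
    what the reading score would lead one to expect.
    """
    return {domain: reading_z - z for domain, z in domain_z.items()}


# Hypothetical participant: values are invented for illustration.
reading_z = z_from_standard(115.0)
domains = {"attention": -0.5, "motor": 0.25}
discrepancies = reading_discrepancy(reading_z, domains)
print(discrepancies)
```

Because the discrepancy is expressed relative to each person's own reading score, it is not directly comparable to demographically corrected T-scores of the kind used in the present study.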
Perhaps most striking is the finding that the two correction methods are associated with differential accuracy depending on ethnicity. Although Ryan et al. (2005) found that using reading grade-level (via WRAT-3) as a proxy for years of education lowered rates of impairment (defined as a deviation from the sample mean) across a variety of NP tests, our study is the first to compare the diagnostic accuracy of the two methods using an external criterion (i.e., a neurologist's diagnosis). For our entire sample, there was little difference in accuracy rates between the two correction methods, although WRAT correction led to better specificity while scores based on years of education led to greater sensitivity. More compelling, however, is the finding that the two methods had differential diagnostic accuracy in the two ethnic groups. Consistent with our hypothesis, WRAT-corrected scores increased specificity by over 20 percentage points relative to grade attainment-corrected scores (77.8% vs. 55.6%) among the African American cohort. Thus, these results support the notion that NP scores derived from self-reported years of education may lead to artificially inflated rates of impairment among this group. However, using WRAT-corrected scores may also have drawbacks, as the sensitivity associated with this method was significantly lower than that of the traditional technique (48.4% vs. 61.3%). Among the White cohort, overall accuracy was slightly better when using years of education as the correction factor (72% vs. 68.1% with WRAT correction). Both sensitivity and specificity decreased by approximately 4% when WRAT correction was used. Thus, the traditional method appears to be more accurate for Whites. These findings suggest that different correction methods may be appropriate for these two groups. The decision to employ reading grade-level as a correction factor for African Americans will rest upon the tradeoff between sensitivity and specificity.
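The sensitivity, specificity, and overall accuracy figures discussed above all derive from a 2x2 classification table against the external criterion. The sketch below shows the arithmetic. The cell counts are hypothetical: they were chosen only so that the rounded sensitivity and specificity match the percentages reported for the African American cohort, and they are not the study's actual counts, which are not given in this section.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """NP-based classification versus the external criterion diagnosis."""
    tp: int  # criterion impaired, NP scores impaired
    fn: int  # criterion impaired, NP scores unimpaired
    tn: int  # criterion unimpaired, NP scores unimpaired
    fp: int  # criterion unimpaired, NP scores impaired

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)


# Hypothetical counts (assumption, not study data).
wrat = Confusion(tp=15, fn=16, tn=14, fp=4)
grade = Confusion(tp=19, fn=12, tn=10, fp=8)

for name, c in (("WRAT-3 correction", wrat), ("years of education", grade)):
    print(
        f"{name}: sensitivity={c.sensitivity():.1%}, "
        f"specificity={c.specificity():.1%}, accuracy={c.accuracy():.1%}"
    )
```

Laying the counts out this way makes the tradeoff explicit: moving cases out of the false-positive cell raises specificity, but if it also moves cases out of the true-positive cell, sensitivity falls.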
There were a number of limitations to the current study that should be considered. First, the WRAT-3 reading test scores were skewed such that most participant scores were in the upper part of the range, suggesting a ceiling effect for this test in our sample. As education level progresses to the high school years, the effect of educational quality is no longer as robust as when comparing participants at lower educational levels (Ostrosky-Solis et al., 1998). This lack of variability can also adversely impact the statistical analysis. Thus, analyzing a sample that has more variability with regard to reading ability will be useful. Second, the study consisted predominantly of males, with females comprising 15.9% of our study population. However, it is worth noting that men have higher rates of HIV, with women accounting for 22% of HIV-infected individuals (CDC, 2003). Therefore, this gender disparity generally reflects the demographics of HIV within the Los Angeles area, where the most common risk behavior for HIV remains male-to-male sexual contact. There was a significant difference in gender between racial groups. Although our analyses co-varied for gender to eliminate possible gender differences, a sample with a more similar gender composition across groups may be helpful in seeing the effects these normative methods have on diagnoses. Third, the results are based on the premise that reading ability is fundamentally similar among ethnic groups. However, an alternative explanation may be that the WRAT-3 is not appropriate for estimating reading level among African Americans. For example, the words used on the WRAT-3 may be less commonly used within schools that serve primarily African Americans, or within their homes and social settings. Thus, their poorer scores on the WRAT-3 may have been due to lack of familiarity rather than poor educational quality. This fundamental question will require further investigation. Another issue regarding the WRAT-3 is that there may have been greater variance in the abilities of African Americans grouped in the “high school” reading level as compared to the Whites in that same category in the original WRAT sample. Unfortunately, the WRAT-3 has an inherent weakness in not assigning specific reading grade levels. The norms on which the WRAT-3 is based aggregate all subjects with a reading level from ninth to twelfth grade as “high school” and all subjects with a reading level above the twelfth grade as “post high school.” Since the time of our study, a fourth edition of the WRAT has been published, which has added grade-based norms, thus increasing the utility of the test in differentiating the grade levels within high school. Consequently, employing the WRAT-4 in future studies similar to ours will be of value.
Finally, when determining impairment based on NP tests, we employed a cutoff T-score of 40, which is one standard deviation below the mean of the normative sample to which our cohort was compared. It is possible that this cutoff was not the most appropriate threshold for our sample. Adjusting the cutoff may have resulted in an improvement in overall accuracy rates for both methods examined. In addition, weighting certain tests over others may have increased our accuracy rates. However, while these psychometric issues are highly relevant to the current study, we believe that the current findings are just an initial step towards creating more appropriate normative methods for African Americans, and minorities in general. Future studies will likely shed light on the additional psychometric issues that have arisen here.
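The effect of moving the impairment cutoff can be illustrated with a small sketch. Everything here is an assumption made for illustration: the data are invented, each participant is represented by a single summary T-score, and a score strictly below the cutoff is treated as impaired. The study's actual rule for combining tests is not described in this section.

```python
from typing import Sequence, Tuple


def sens_spec(
    t_scores: Sequence[float],
    criterion_impaired: Sequence[bool],
    cutoff: float,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when T-scores below `cutoff` are flagged."""
    tp = fn = tn = fp = 0
    for t, impaired in zip(t_scores, criterion_impaired):
        flagged = t < cutoff
        if impaired and flagged:
            tp += 1
        elif impaired:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Invented T-scores and criterion diagnoses (True = impaired).
t_scores = [32, 38, 41, 44, 36, 47, 52, 39, 55, 43]
criterion = [True, True, True, False, True, False, False, False, False, True]

for cutoff in (35, 40, 45):
    sens, spec = sens_spec(t_scores, criterion, cutoff)
    print(f"cutoff={cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy sample a more lenient cutoff (higher T-score) flags more people, raising sensitivity at the cost of specificity, while a stricter cutoff does the reverse. The same kind of sweep could be run separately for each correction method to look for a threshold that balances the two.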
These results have important implications for the HIV+ population. With the advent and use of highly active antiretroviral therapy (HAART), HIV-infected individuals are living longer and experiencing lower rates of opportunistic infections. However, we are seeing a rising prevalence of other HIV-associated conditions, including neurocognitive disorders (Fischer-Smith & Rappaport, 2005). Moreover, HIV is affecting a growing number of African Americans and other minority populations. Taking this into consideration, it is necessary to have enhanced diagnostic tools with normative data that are more representative of the typical HIV+ demographic. These preliminary findings suggest that specifying the most appropriate normative method for individuals from particular backgrounds may significantly reduce misdiagnosis.