The NAB List Learning test and its classification algorithm have previously been shown to accurately distinguish among individuals diagnosed as control, amnestic MCI, and AD by a clinical consensus diagnostic conference (Gavett et al., 2009). The current study extends these findings by demonstrating a meaningful association between NAB List Learning algorithm classifications and future clinical course and outcome. The principal findings suggest that the NAB List Learning algorithm possesses clinically useful predictive validity. A NAB List Learning algorithm classification of AD was predictive of more rapid decline in language (animal fluency), attention/processing speed (WAIS-R Digit Symbol), and overall cognitive functioning (MMSE), as well as a significantly reduced time to reach a consensus diagnosis of AD, relative to those classified by the algorithm as controls. A NAB List Learning algorithm classification of MCI was predictive of more rapid decline in language (animal fluency) and episodic memory (CERAD WLR Trial 3), as well as a significantly shorter time to consensus diagnosis of AD, relative to controls. Fewer differences between the groups classified as AD and MCI were apparent; the AD group declined more rapidly in overall cognitive functioning (MMSE) and visuospatial functioning (Hooper VOT) and had a higher proportion convert to AD than the MCI group, but the two groups did not differ significantly in time to future AD diagnosis. Although robust and statistically significant, the hazard ratios from the Cox models for the AD and MCI groups are imprecise estimates with wide confidence intervals. However, even the lower limits of these intervals (HR = 7.5 and 5.6 for AD and MCI, respectively) suggest a clinically meaningful decrease in time to conversion relative to the control group.
Surprisingly, no difference in the trajectory of cognitive decline was observed between the control group and the AD group on the CERAD, a measure of episodic memory. This may be due to floor and/or ceiling effects, with the AD group having little room to decline below the level of their attention span and the control group having little room to improve before reaching the maximum score for the test. In contrast, based on the results presented in , it is surprising that no significant difference in slope was observed on the Hooper VOT between the control and AD groups. The lack of significance can likely be attributed to (a) the fact that the linear trends observed in may be misleading because they are not corrected for age and education (although the tests of significance are corrected) or (b) the smaller sample size for the Hooper VOT (see ), which reduced the power to detect a significant difference in trend.
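The point made above about hazard ratios that are significant yet imprecise can be illustrated with a brief sketch. It computes a Wald 95% confidence interval for a hazard ratio from a Cox regression coefficient and its standard error. The function name `hazard_ratio_ci` and the input values are hypothetical and chosen only for illustration; they are not estimates from the current study.

```python
import math

def hazard_ratio_ci(beta, se, z=1.96):
    """Wald confidence interval for a hazard ratio.

    beta: Cox model coefficient (the log hazard ratio)
    se:   standard error of beta
    z:    normal quantile (1.96 for a 95% interval)
    """
    hr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return hr, lower, upper

# Hypothetical coefficient and standard error (NOT values from this study).
hr, lower, upper = hazard_ratio_ci(beta=2.5, se=0.6)
print(f"HR = {hr:.1f}, 95% CI [{lower:.1f}, {upper:.1f}]")
```

Because the interval is symmetric on the log scale, a moderately large standard error yields an interval that spans roughly an order of magnitude on the hazard ratio scale. In this hypothetical case the lower bound nonetheless remains well above 1, which is the pattern described for the AD and MCI groups.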
In addition to establishing predictive validity, the results also appear to support the face validity and convergent validity of the NAB List Learning test. The clinical manifestations of AD are strongly linked to neurofibrillary tangle (NFT) burden (Wilcock & Esiri, 1982). In the early stages of the disease, NFTs aggregate in the medial temporal lobes; as the disease progresses, neuropathological changes expand considerably beyond the medial temporal lobes into the neocortex (Arriagada, Growdon, Hedley-Whyte, & Hyman, 1992; Guillozet, Weintraub, Mash, & Mesulam, 2003). As expected based on this pattern of neuropathology, individuals with clinically diagnosed AD often show rapid decline in overall cognitive functioning and in several cognitive domains in addition to episodic memory (Yaari & Corey-Bloom, 2007). Individuals with MCI are often believed to be showing the earliest clinical manifestations of AD (Morris, 2006) and are thought to undergo more subtle changes to episodic memory and language, especially verbal category fluency (Murphy, Rich, & Troyer, 2006). The fact that the AD and MCI groups in the current study experienced cognitive changes consistent with these expectations suggests that the NAB List Learning algorithm groupings are clinically meaningful.
One limitation of this study is that participants with missing cognitive data were excluded from the longitudinal regression models of cognitive decline. These data were unlikely to be missing at random; individuals with more severe cognitive impairment were more likely to be excluded because they were unable to participate in cognitive testing. Had data from severely impaired participants been available, the current findings might have been bolstered by increases in both the rate of decline and the rate of conversion to AD in the NAB List Learning algorithm-defined AD group. Given the geographic region in which the study was conducted, our samples may have higher socioeconomic status and more education (M > 15 years) than the general population. Our samples are representative of black and white individuals, but not of individuals from other racial and ethnic backgrounds. The outcome variable in the survival analyses was the clinical diagnosis of probable or possible AD by a consensus diagnosis team. Ideally, the gold standard would be neuropathologically confirmed AD; as such, the accuracy of the current study is limited by the accuracy of clinical consensus diagnosis. Finally, because the results were obtained from the same longitudinal research registry used by Gavett et al. (2009), there is overlap between the current sample and the sample that was used to develop and cross-validate the ordinal regression algorithm. However, the algorithm was developed based on participants' most recent annual visit and, for the current study, was retroactively applied to data from a previous visit. In addition, we repeated our survival analysis using a sample that was independent of the original sample used to create the algorithm. Results revealed a significant, albeit less robust, overall model. We believe that these procedures mitigate the potential for tautological error and criterion contamination. Nevertheless, replication and extension of these findings in independent samples are necessary.
Future attempts should be made to evaluate the ability of the NAB List Learning test to predict neuropathologically confirmed AD. Because the AD and MCI groups in the current study did not differ as greatly as expected, longer follow-up periods, perhaps 10 years or more, may be necessary to determine whether these two groups, as defined by the NAB List Learning algorithm, age differently. Future research should also focus on examining whether a combination of NAB List Learning test results and an AD biomarker, such as cerebrospinal fluid tau/Aβ42 ratio, provides better diagnostic utility and predictive validity than either marker alone.
Although an algorithmic classification of MCI was predictive of future cognitive decline, these results were based on a subset of the entire MCI group due to missing data, and are therefore limited. This speaks to a more general point about the NAB List Learning algorithm as it pertains to a diagnosis of MCI. The algorithm appears to be considerably more accurate when participants are classified as either controls or AD. A classification of MCI should engender less confidence, both for predicting future outcomes and for classification accuracy; for example, Gavett et al. (2009) reported a sensitivity of .47 for correctly identifying MCI. This conclusion appears to be consistent with the view that MCI is better described as a state of diagnostic ambiguity than as a specific disease entity (Royall, 2006). Nevertheless, the results suggest that the NAB List Learning algorithm should not be relied upon as a primary tool for the diagnosis of MCI, and caution is warranted in interpreting an algorithmic MCI classification as a risk factor for future cognitive decline.
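To make the sensitivity figure cited above concrete, the following minimal sketch computes sensitivity as the proportion of true cases that a classifier correctly identifies. The counts are hypothetical and were chosen only to reproduce a value of .47; they are not the counts reported by Gavett et al. (2009).

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of true cases that the classifier correctly identifies."""
    total_cases = true_positives + false_negatives
    if total_cases == 0:
        raise ValueError("sensitivity is undefined when there are no true cases")
    return true_positives / total_cases

# Hypothetical counts (NOT the actual data): of 100 individuals with a
# consensus diagnosis of MCI, the algorithm labels 47 as MCI and 53 otherwise.
mci_sensitivity = sensitivity(true_positives=47, false_negatives=53)
missed_proportion = 1 - mci_sensitivity
print(f"sensitivity = {mci_sensitivity:.2f}, missed = {missed_proportion:.2f}")
```

Under these assumed counts, more than half of the individuals with consensus-diagnosed MCI would receive a different algorithmic classification, which is why an algorithmic MCI classification warrants less confidence than a classification of control or AD.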
It was hypothesized that the three groups (control, MCI, and AD), derived solely on the basis of NAB List Learning performance, would undergo differential rates of cognitive decline and conversion to consensus diagnosis of AD. Because these hypotheses were supported for several outcome measures, the NAB List Learning test algorithm appears to be a valid predictor of changes in cognitive functioning and time to reach a consensus diagnosis of AD. Recently, compelling arguments have been made for eliminating the requirement for “dementia” in the clinical diagnosis of AD, instead focusing on making the diagnosis prior to full-blown dementia using a combination of biomarkers and cognitive indices (Dubois et al., 2007). This approach to early diagnosis is especially important considering the many potential disease-modifying AD drugs currently in clinical trials (Salloway et al., 2008). With the caveats about the algorithmic classification of MCI mentioned above, the NAB List Learning test can be diagnostically useful and predictive of decline in multiple cognitive domains and of time to conversion to AD. Therefore, it fits well within the Dubois et al. (2007) framework for progress in AD research and clinical care.