Researchers interested in the assessment of cognitive impairment in the elderly population face substantial challenges. There is considerable interest in using the most sophisticated methods to investigate subtle neuropsychological impairments associated with the earliest stages of neurodegenerative disease [31]. However, at the same time, the complexity of current research questions and the need to capture many different aspects of disease, from clinical signs [32] to biomarkers [33], have led to pressure to use shorter cognitive batteries to allow for the inclusion of additional types of measures. In practical terms, this has often meant the simple adaptation of many instruments that have long been used in dementia research, without evidence that they have the requisite psychometric properties or provide adequate sensitivity and/or specificity for conditions such as MCI or even more subtle forms of cognitive impairment.
Neuropsychological performance for any individual is determined by a complex array of variables including, but not limited to, cognitive reserve, premorbid strengths and weaknesses, engagement in testing, attentional capacity, sensory acuity, and fatigue. Because any single cognitive measure can be affected by any one or a combination of these factors, neuropsychologists often examine an individual’s pattern of performance across several different indices, comparing that performance with normative data, rather than relying on a single test in isolation [34]. Current trends, however, are to use limited numbers of measures and to apply cut-offs defined by the number of standard deviations that an individual’s score falls below a reference point set for persons of similar age, education, and cultural background [35].
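The standard-deviation cut-off approach described above can be illustrated with a minimal sketch. The normative means and standard deviations, the measure names, and the cut-off of 1.5 standard deviations below the reference mean are all hypothetical values chosen for illustration; none of them is taken from the studies cited here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Norm:
    """Hypothetical normative mean and SD for one demographic stratum."""
    mean: float
    sd: float


def z_score(raw: float, norm: Norm) -> float:
    """Express a raw score in SD units relative to the normative reference."""
    if norm.sd <= 0:
        raise ValueError("normative SD must be positive")
    return (raw - norm.mean) / norm.sd


def impaired_measures(scores: dict, norms: dict, cutoff_sd: float = -1.5) -> list:
    """Return the names of measures whose z-score is at or below the cut-off.

    The default cut-off of -1.5 SD is an assumption made for this sketch.
    """
    return sorted(
        name for name, raw in scores.items()
        if z_score(raw, norms[name]) <= cutoff_sd
    )


# Hypothetical norms for persons of similar age, education, and background.
norms = {
    "delayed_recall": Norm(mean=10.0, sd=2.0),
    "list_learning": Norm(mean=45.0, sd=8.0),
}
# Hypothetical raw scores for one individual.
scores = {"delayed_recall": 6.0, "list_learning": 40.0}

flagged = impaired_measures(scores, norms)
```

In this hypothetical case only the delayed recall score (2 standard deviations below the reference mean) is flagged. Examining the full list of z-scores, rather than a single flagged measure, is closer to the pattern-based interpretation that neuropsychologists favor.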
The most sophisticated clinical and epidemiological studies of cognitive impairment in the elderly population assess different aspects of neuropsychological function at baseline and ensure that their measurements have sufficient range to track changes in different cognitive domains longitudinally. The optimal cognitive battery may, at minimum, include tests of (1) learning and retentive memory, (2) executive function, (3) language, and (4) visuospatial skills [31]. It is also desirable to have measures of both attention and processing speed because these are often impaired by a variety of brain disorders and, thus, serve as a more general marker of impairment.
Because memory dysfunction is a hallmark feature of AD and is present in many neurodegenerative disorders, this section will focus on selecting memory tests for studies of MCI and the trade-offs between efficiency and comprehensiveness of a cognitive battery.
Many neuropsychologists favor verbal list learning tasks for clinical work, longitudinal investigations of aging, clinical trials, and epidemiological studies. The advantages of this type of memory paradigm are that (1) there is more than one presentation of the to-be-remembered stimuli, thus minimizing the effect of poor attention on performance; (2) learning curves over repeated trials can be examined; (3) the processes of encoding, acquisition, and storage of to-be-remembered information as well as specific types of memory can be distinguished; and (4) many measures also have parallel and alternate forms.
In choosing a list learning test, the number of to-be-remembered items should be sufficient to be challenging for high-functioning subjects (12 to 16 items is optimal). In populations where cultural and/or language issues may arise, alternative list learning tasks may be beneficial, for example, tasks that use actual objects and that are made more challenging by requiring the subject to alternate between learning trials and distractor tasks [36].
There are many other memory tasks that are also well justified in the cognitive assessment of older adults including, but not limited to, paired associates learning, immediate and delayed recall of simple and increasingly complex geometric designs, spatial object location, face–name association, and paragraph recall.
For neuropsychologists, the most widely used paragraph recall test is the Logical Memory subtest on the Wechsler Memory Scale [37], which assesses immediate and delayed recall for two passages, each containing 25 elements. In one newer version of the test, the second passage is repeated to assess learning over more than one trial. Although Logical Memory takes no longer than 10 minutes to administer, the pressures to abbreviate testing have been sufficiently powerful that a standard in the field today is to use a single paragraph to assess memory function. In fact, delayed recall for a single story passage is used in several significant national clinical research initiatives as the only memory measure in addition to a Clinical Dementia Rating [39] clinical score to derive a diagnosis of amnestic MCI. However, a potential problem with this approach is that factors such as auditory difficulties or inattention can greatly affect performance on this cognitive test. Conversely, persons with seemingly intact story memory, especially those with above-average premorbid function, might still have underlying difficulties with other aspects of memory. The solution to this problem is to have tests that are not as prone to attentional effects, have repeated trials, and are more sensitive to early, mild impairment. It is likely that these issues will only become more acute as investigators attempt to identify MCI at even earlier stages.
In their proposed revision of research criteria for AD, Dubois et al [10] contend that paradigms that increase encoding specificity at acquisition and that assess failure to benefit from cueing at recall are superior to episodic memory tests using free recall alone. Buschke et al [40], using the Selective Reminding procedure, first observed that probing with the same semantic cues used for learning and retrieval was superior to testing free recall alone and was also superior to paired associate learning and Logical Memory in distinguishing mild dementia from normal aging. More recently, Buschke developed the Memory Capacity Test, which requires an individual to learn 16 items from different semantic categories with cued recall and then to learn 16 new items using the same semantic categories with cued recall. Cued recall for the second list and 30-minute delayed recall of the second list of the Memory Capacity Test were particularly useful in identifying amyloid-positive normal elderly individuals with high cognitive reserve who were at greater risk for AD because of their increased amyloid load [41].
Loewenstein et al have focused on determining the extent to which vulnerability to semantic interference may identify cases of early AD [36]. The modified three-trial Fuld Object Memory Evaluation paradigm was extended by having subjects recall a second list of items that are all semantically similar to the original to-be-remembered targets (e.g., ring versus bracelet, key versus lock) [43]. Reduced recall for the second list as compared with the first list was thought to occur because of competition from the previously presented targets on the first list (proactive interference), whereas reduced recall for the first list after recall of the second list was thought to be related to retroactive interference. The Semantic Interference Test (SIT) [36] evidenced high sensitivity and specificity in distinguishing normal elderly subjects from those with MCI and early dementia. Moreover, vulnerability to proactive interference was most associated with those subjects with MCI who progressed to dementia over a period of 2 to 3 years [44]. Recently, the SIT has been validated for use in epidemiological investigations [45].
Studies by Buschke et al and Loewenstein et al focus on semantic information processing deficits that may be specific for early AD and support the need to develop memory tests that optimize attention to the to-be-remembered stimuli and emphasize lack of encoding specificity.
For epidemiological studies, briefer memory tests may be required. The Florida Brief Memory Screen (FBMS) [46], which takes approximately 3 to 4 minutes to administer, was recently developed to identify those in need of further evaluation. The primary objective is for the subject to attend to and register the to-be-remembered targets. Free recall is then assessed. The FBMS has been found to be highly reliable and to correctly classify 100% of patients with AD, 82.6% of individuals with amnestic MCI, and 87.5% of normal elderly controls without cognitive impairment [46]. Importantly, the FBMS scores were generally independent of age, education, and primary language (English versus Spanish). Although preliminary, these results suggest that a one-trial memory measure may have potential as a screening measure for amnestic MCI and dementia. For the aforementioned reasons, however, the FBMS is not viewed as being sufficiently broad or having an adequate number of trials to be used as a stand-alone memory measure, any more than the three words of the Mini-Mental State Examination [47] or the five words of the Montreal Cognitive Assessment Screen [48] would serve on their own to assess memory.
As tests are developed to detect cognitive impairment in its earliest stages, increased sensitivity may come at the cost of decreased specificity. Because false-positive and false-negative classification errors are related to underlying base rates of true impairment in a particular population, this trade-off needs to be evaluated with regard to the research questions being posed and the costs associated with misidentification.
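The dependence of classification errors on base rates can be made concrete with a short sketch based on Bayes' rule. The sensitivity (82.6%) and specificity (87.5%) used below are the FBMS classification rates for amnestic MCI and for normal controls reported above. The two base rates (10% and 50%) are hypothetical and are chosen only to contrast a community screening setting with a specialty clinic. Applying rates from a validation sample to other populations is itself an assumption made for illustration.

```python
def predictive_values(sensitivity: float, specificity: float, base_rate: float):
    """Return (PPV, NPV) for a test applied to a population with the given base rate.

    All arguments are proportions. base_rate must lie strictly between 0 and 1;
    sensitivity and specificity must be greater than 0 and at most 1.
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")
    if not (0.0 < sensitivity <= 1.0 and 0.0 < specificity <= 1.0):
        raise ValueError("sensitivity and specificity must be in (0, 1]")

    # Expected proportions of the whole population in each cell of the 2x2 table.
    true_pos = sensitivity * base_rate
    false_neg = (1.0 - sensitivity) * base_rate
    true_neg = specificity * (1.0 - base_rate)
    false_pos = (1.0 - specificity) * (1.0 - base_rate)

    ppv = true_pos / (true_pos + false_pos)   # P(impaired | positive screen)
    npv = true_neg / (true_neg + false_neg)   # P(unimpaired | negative screen)
    return ppv, npv


# Rates reported in the text for amnestic MCI versus normal controls.
SENSITIVITY = 0.826
SPECIFICITY = 0.875

# Hypothetical base rates of true impairment (assumptions for illustration).
ppv_community, npv_community = predictive_values(SENSITIVITY, SPECIFICITY, 0.10)
ppv_clinic, npv_clinic = predictive_values(SENSITIVITY, SPECIFICITY, 0.50)
```

Under these assumptions, the positive predictive value is roughly 0.42 at a 10% base rate and roughly 0.87 at a 50% base rate. The same instrument would therefore generate a majority of false positives among positive screens in the low-prevalence setting, which is why the cost of misidentification has to be weighed against the research question.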
Although there are many pressures to abbreviate memory measures, recent work with cued recall and semantic interference paradigms suggests that assessments may have to become even more detailed and comprehensive to capture subtle cognitive changes. A new era has commenced, where it is possible to obtain imaging and other biomarkers that point to cerebral compromise years before clinical symptoms are manifested [41]. Instead of a focus on how to reduce existing tests, there needs to be continued development of measures that (1) can use the person as his/her own control in evaluating decrements in performance, (2) reveal vulnerabilities in the cognitive system that cannot be masked by high education and cognitive reserve, and (3) have acceptable levels of specificity. The dementia field deserves no less.