The aims of this paper were to identify and evaluate available screening instruments for cognitive impairment. Screens have been presented in tables according to different purposes, forming a quick reference resource to assist clinicians and researchers in making choices. We have also evaluated screens according to the three main purposes outlined, and have drawn attention to criticisms in the literature.
The first purpose for which we considered the screens was as brief assessment tools in the clinical setting, particularly in primary care. This is probably the most common way in which screens are used, and is the focus of policy consensus statements,6,7,8
which have highlighted the dearth of evidence in favour of routine screening. It is clear from the present review that few of the 39 tests identified have been validated in the types of unselected primary care or community‐based samples which would be representative of target populations for screening efforts. It is interesting that the screens which rate highest with regard to validation methods and statistics, as well as coverage of key cognitive abilities, are those which expand on the content of the MMSE and from which an MMSE score can be derived (3MS, CASI, SASSI). The ACE‐R also expands on the MMSE but has yet to be validated in non‐specialist settings. It is likely that these screens will prove easily acceptable to clinicians already familiar with the MMSE. There does not appear to be a direct relationship between the number of key cognitive abilities covered and the validity statistics; however, the usefulness of broader coverage lies more in the qualitative information it adds to the basic score. Despite an understandable drive towards ultra‐brief tests which can be used in a typically time‐constrained GP consultation, an administration time of more than 10 min appears to be an unavoidable cost of achieving sufficiently robust statistical performance while covering key domains.
The second purpose considered was large scale community screening programmes. Informant rated scales, or assessments of patients which can be carried out by telephone or post, formed the main focus of this section. However, some community screening initiatives (eg, memory awareness days in clinics or community centres) could be conducted face to face using the shorter of the instruments detailed in table 2. Of the informant scales, the IQCODE (in its original and abbreviated versions) is the most widely used, although it has variable performance across reported studies. The SMQ shows promise as a brief and accurate screen, meriting further study.
Coverage of various cognitive, psychiatric and functional abilities/domains was examined for all 39 screens. Tests varied in coverage from single domain tasks to wide ranging mini‐batteries. Clearly, if the clinician's aim is to elicit useful qualitative and quantitative information about the profile of a patient's presenting symptoms, then wider ranging screens will be of greater value. Secondary or tertiary care clinicians (working in psychiatric, neuropsychological or neurological settings, for example) are likely to be more concerned with differential diagnosis or with further investigation of mild or unusual presentations, situations in which clinical judgement will take precedence over composite scores and cut‐points; scales with broader coverage will therefore be sought in preference to brief assessment tools. It is possible to achieve a balance between these different uses, however, as with scales such as the 3MS, CASI, SASSI and ACE‐R.
An important point to note is that although a number of screens have been validated in particular subtypes of dementia, this does not mean that they are necessarily useful for differential diagnosis. Most sensitivity and specificity statistics for the various subtypes of dementia were calculated against normal controls, rather than against other types of dementia. This means that a screen which is particularly good at picking up AD, for example, will not in fact be useful clinically unless it is also good at picking up non‐AD impairments. An effective screen is one which can firstly identify impairment of any aetiology, and secondly provide an indication as to the most likely aetiology in a particular case. For the former aim, it matters most that a screen has demonstrated good validity in samples of mixed aetiology to detect any type of impairment (ie, the “all dementia” column in tables 2 and 3); for the latter, it matters not that a screen can distinguish AD from normal controls (for example), but that it can distinguish AD from non‐AD aetiologies. The ACE‐R is notable for having been specifically validated with differential diagnosis in mind: the patient's individual profile across cognitive domains can be used to estimate the likelihood that their impairment is due to AD versus frontotemporal dementia, providing a valuable adjunct to the simple overall score. This further underscores the importance of covering a wide range of cognitive abilities when designing a screen (and, as mentioned in the introduction, fits better with the preferred working methods of most clinicians). Until other screens are also examined for their effectiveness in distinguishing between aetiologies, validity statistics other than the “all dementia” calculations are of limited clinical value.
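The arithmetic behind this point can be made concrete. The sketch below uses invented screen scores (not data from any study reviewed here) to show how a cut‐point can yield excellent sensitivity and specificity for AD against healthy controls, yet have no discriminating power at all when the comparison group is other dementias:

```python
# Hypothetical illustration only: scores and cut-point are invented,
# not taken from any validation study discussed in this review.

def sens_spec(cases, non_cases, cutoff):
    """Sensitivity/specificity when 'score <= cutoff' indicates impairment."""
    sens = sum(s <= cutoff for s in cases) / len(cases)
    spec = sum(s > cutoff for s in non_cases) / len(non_cases)
    return sens, spec

ad       = [18, 19, 20, 21, 22]   # hypothetical scores, AD patients
controls = [27, 28, 28, 29, 30]   # hypothetical healthy controls
non_ad   = [19, 20, 21, 22, 23]   # hypothetical non-AD dementias

cutoff = 24
print(sens_spec(ad, controls, cutoff))  # -> (1.0, 1.0): perfect vs controls
print(sens_spec(ad, non_ad, cutoff))    # -> (1.0, 0.0): useless vs non-AD dementia
```

The same cut‐point that separates AD patients from controls flags every non‐AD patient as well, so a published “sensitivity and specificity for AD” of this kind says nothing about differential diagnosis.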
If one considers the commonest application of screening (ie, brief direct assessment of patients, with the aim of firstly identifying any impairment and secondly providing an indication of its cause), then the most useful screens are likely to be those with good sensitivity and specificity for all dementia types in unselected populations, and which elicit information about key cognitive abilities that can then be compared with neuropsychological profiles in different types of dementia. Table 2 shows that the most promising candidates are the 3MS, CASI, MMSE, SASSI, STMS and ACE‐R. The STMS is notably shorter than the others and so may appeal to the most time‐pressed clinicians. The 3MS and CASI are the only screens which have been validated in community samples and which cover all the key cognitive abilities, and so are good candidates for those with more time available (although note the shortcomings mentioned in the text accompanying table 2 above). The ACE‐R has not yet been validated in community samples, but its focus on differential diagnosis profiles may be particularly useful for clinicians in secondary/tertiary practice, to guide further investigations.
The specific criticisms described in the results section regarding some of the screens are indicative of common shortcomings in test validation research. Few screens have been validated in unselected samples, and those that have are frequently subject to differential gold standard procedures for patient and control groups. It is rare for all participants who screen negative in large community samples to undergo the same type of confirmatory assessment as those with positive screens. This leads to verification bias, whereby sensitivity calculations are overestimated and specificity underestimated.110
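The direction of this bias follows directly from the arithmetic. The sketch below works through invented counts (a hypothetical cohort, not figures from any reviewed study) in which every screen‐positive participant receives the gold‐standard assessment but only a fraction of screen‐negatives do, and validity is then calculated from verified participants only:

```python
# Hypothetical illustration of verification bias; all counts are invented.
# tp/fn: truly impaired who screened positive/negative;
# fp/tn: truly unimpaired who screened positive/negative.

def observed_sens_spec(tp, fn, fp, tn, neg_verified_fraction):
    """Validity statistics computed from verified participants only:
    all screen-positives are verified, screen-negatives only partially."""
    fn_v = fn * neg_verified_fraction   # verified false negatives
    tn_v = tn * neg_verified_fraction   # verified true negatives
    sens = tp / (tp + fn_v)
    spec = tn_v / (tn_v + fp)
    return sens, spec

# 1000 people, 200 truly impaired; screen with true sens 0.80, spec 0.90
tp, fn = 160, 40
fp, tn = 80, 720

print(observed_sens_spec(tp, fn, fp, tn, 1.0))  # full verification -> (0.80, 0.90)
print(observed_sens_spec(tp, fn, fp, tn, 0.1))  # 10% of negatives verified:
                                                # sensitivity inflated, specificity deflated
```

With only 10% of screen‐negatives verified, most false negatives escape detection, so observed sensitivity rises well above the true 0.80, while the shrunken pool of verified true negatives drags observed specificity well below the true 0.90.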
Applicability to real life situations is further compromised by restrictive sample recruitment criteria which often exclude those with a history of substance use, neurological or psychiatric disorder, head injury and other common comorbidities. In addition, as table 1 shows, many authors have not published reliability statistics for their screens. Adequate reliability (internal, test–retest and inter‐rater) is a prerequisite for robust validity, and should be evaluated and reported routinely. These factors should be borne in mind when evaluating all of the screens described here.
In our endeavour to present a comprehensive overview of as many screens as possible, it was not feasible to conduct a fully rigorous quality rating of each study from which we extracted the data presented here. We have, however, applied inclusion criteria as described in the methods section, and have noted critical points regarding certain screens and studies. This review is intended to serve as a resource and starting point from which interested readers can further investigate particular screens for their own requirements.
In consideration of the various purposes for which cognitive impairment screens can be used, it is almost certainly futile to attempt to develop screens that fit all needs. Out of 39 screens identified, we have emphasised a small subset that, in our opinion, have particular strengths, but ultimately there is no such thing as the perfect screen for all purposes. Clinicians should move away from the tendency to become over‐reliant on one screen (usually the MMSE), and take advantage of the continually evolving (and dauntingly extensive) range of more specialised tools for different situations. This task would be made easier if researchers were to focus on refining and adapting existing screens, with closer consideration of the theoretical basis of symptom profiles in different diagnoses, and specific examination of differential diagnosis within impaired samples. Regardless of policy positions on the merits or otherwise of routine cognitive screening, there is a wealth of potential benefit in the thoughtful application of existing screens in clinical practice.