A workgroup commissioned by the Alzheimer’s Association (AA) and the National Institute on Aging (NIA) recently published research criteria for preclinical Alzheimer’s disease (AD). We performed a preliminary assessment of these guidelines.
We employed Pittsburgh compound B positron emission tomography (PET) imaging as our biomarker of cerebral amyloidosis and 18F-fluorodeoxyglucose PET imaging and hippocampal volume as biomarkers of neurodegeneration. A group of 42 clinically diagnosed AD subjects was used to create imaging biomarker cut-points. A group of 450 cognitively normal (CN) subjects from a population-based sample was used to develop cognitive cut-points and to assess population frequencies of the different preclinical AD stages using different cut-point criteria.
The new criteria subdivide the preclinical phase of AD into stages 1–3. To classify our CN subjects, two additional categories were needed. Stage 0 denotes subjects with normal AD biomarkers and no evidence of subtle cognitive impairment. Suspected Non-AD Pathophysiology (SNAP) denotes subjects with normal amyloid PET imaging, but abnormal neurodegeneration biomarker studies. At fixed cut-points corresponding to 90% sensitivity for diagnosing AD and the 10th percentile of CN cognitive scores, 43% of our sample was classified as stage 0; 16% stage 1; 12% stage 2; 3% stage 3; and 23% SNAP.
This cross-sectional evaluation of the NIA-AA criteria for preclinical AD indicates that the 1–3 staging criteria coupled with stage 0 and SNAP categories classify 97% of CN subjects from a population-based sample, leaving just 3% unclassified. Future longitudinal validation of the criteria will be important.
Studies of biomarkers for Alzheimer’s disease (AD) have provided clear evidence for the existence of a preclinical phase of the disease. A panel commissioned by the National Institute on Aging and the Alzheimer’s Association (NIA-AA) recently proposed guidelines to define preclinical AD for research purposes. The NIA-AA criteria were based on a conceptual model of the pathophysiology of AD in which the biomarkers of AD become abnormal in an ordered manner [2–6]. Amyloid biomarkers (PET amyloid imaging and CSF Aβ42) become abnormal first, as early as 20 years before significant clinical symptoms appear. Biomarkers of neuronal injury and degeneration (CSF tau, FDG-PET, and anatomic MRI) become abnormal later, closer to the time when individuals become symptomatic. The NIA-AA preclinical criteria are related to this hypothetical biomarker model in Figure 1. The NIA-AA criteria, therefore, identified stage 1 of preclinical AD as one of asymptomatic cerebral amyloidosis, and stage 2 as one in which evidence of synaptic dysfunction and neurodegeneration was also evident. The NIA-AA criteria further assert that subtle cognitive changes will be present in stage 3 and will precede the appearance of overt cognitive impairment.
Although the NIA-AA criteria are based on a conceptual model as well as observational data, they make specific assumptions about relationships among biomarkers and cognitive testing that have not been adequately validated. As a first step in the evaluation process, we examined how cognitively normal persons in a population-based study of aging in Olmsted County, Minnesota, were distributed across the different stages of the NIA-AA criteria for preclinical AD. To do so, however, we first had to develop an operational approach to implement the criteria.
Two groups of subjects were employed in this study. First, 42 clinically diagnosed AD subjects who had undergone MRI, FDG-PET and PIB-PET were used to create imaging biomarker cut-points. These AD subjects were drawn from our Alzheimer’s Disease Research Center (ADRC) or incident cases in the Mayo Clinic Study of Aging (MCSA). Second, all available cognitively normal (CN) subjects from the MCSA who had undergone MRI, FDG-PET, PIB-PET, and complete neuropsychological testing (n = 450) were used to both develop cognitive cut-points and to assess population frequencies of the different preclinical AD stages using different cut-point criteria.
The MCSA is a population-based study of cognitive aging that was established in Olmsted County, MN starting in October 2004. All MCSA subjects undergo a clinical and cognitive assessment every 15 months that includes 9 neuropsychological tests. The evaluations of all subjects were reviewed by a consensus panel consisting of physicians (neurologists and geriatricians), neuropsychologists, and study nurses. Subjects in the present study were diagnosed by the consensus panel as being cognitively normal, based on the clinical assessments including mental status examinations and informant interviews as well as the neuropsychological testing battery described below [7, 8].
The neuropsychological battery was constructed as previously described [7, 8]. Domain-specific measures are formulated from the Wechsler Adult Intelligence Scale-Revised (WAIS-R), Wechsler Memory Scale-Revised (WMS-R), Auditory Verbal Learning Test (AVLT), Trail Making Test (TMT), category fluency test, and Boston Naming Test (BNT). Four cognitive domains are assessed: Executive (TMT: Part B, WAIS-R Digit Symbol); Language (BNT, category fluency); Memory (WMS-R Logical Memory-II (delayed recall), WMS-R Visual Reproduction-II (delayed recall), AVLT delayed recall); and Visuospatial (WAIS-R Picture Completion, WAIS-R Block Design). Individual test scores were first converted to z-scores using the mean and standard deviation from the MCSA 2004 enrollment visit for subjects who were CN (n = 1624). The individual z-scores within each domain were averaged to create 4 domain scores, which were then also converted to z-scores. A global cognitive summary score was formed from the average of the 4 domain z-scores and then converted to a z-score by subtracting the mean and dividing by the standard deviation. This global summary score was used to assess cognitive impairment in our subjects.
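The scoring steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the function names and example scores are hypothetical, only two of the four domains are shown, and each score is standardized against the sample itself rather than against the 2004 MCSA enrollment reference group.

```python
from statistics import mean, stdev

def standardize(values, reference=None):
    """Convert values to z-scores using the mean and SD of a reference sample.

    If no reference is given, the values themselves serve as the reference
    (an assumption made here for brevity).
    """
    ref = values if reference is None else reference
    m, s = mean(ref), stdev(ref)
    return [(v - m) / s for v in values]

def composite(z_columns):
    """Average z-score columns subject by subject, then re-standardize."""
    averaged = [mean(row) for row in zip(*z_columns)]
    return standardize(averaged)

# Hypothetical raw scores for four subjects (not study data).
logical_memory = [10.0, 14.0, 18.0, 22.0]
avlt_delayed = [4.0, 6.0, 8.0, 10.0]
boston_naming = [50.0, 54.0, 52.0, 58.0]
category_fluency = [30.0, 38.0, 42.0, 40.0]

# Test z-scores -> domain z-scores -> global summary z-score.
memory = composite([standardize(logical_memory), standardize(avlt_delayed)])
language = composite([standardize(boston_naming), standardize(category_fluency)])
global_summary = composite([memory, language])
```

Passing an explicit `reference` list to `standardize` would reproduce the use of an external normative sample for the individual tests.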
MRI was performed at 3T with a 3D-MPRAGE sequence. Images were corrected for distortion due to gradient non-linearity and for bias field [10, 11]. Our primary MRI measure was hippocampal volume measured with FreeSurfer software (version 4.5.0). Each subject’s raw hippocampal volume was adjusted by his/her total intracranial volume to form an adjusted hippocampal volume (HVa). We calculated HVa as the residual from a linear regression of hippocampal volume (y) versus total intracranial volume (x).
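The residual-based adjustment can be illustrated with a short sketch. It assumes an ordinary least-squares line fitted to the same subjects whose volumes are being adjusted; the text does not state which sample was used to fit the regression, and the function name and example volumes are hypothetical.

```python
from statistics import mean

def adjusted_hippocampal_volume(hv, tiv):
    """Return residuals of hv from the least-squares line hv = a + b * tiv."""
    mx, my = mean(tiv), mean(hv)
    sxx = sum((x - mx) ** 2 for x in tiv)
    sxy = sum((x - mx) * (y - my) for x, y in zip(tiv, hv))
    slope = sxy / sxx
    intercept = my - slope * mx
    # A negative residual means a smaller hippocampus than expected for head size.
    return [y - (intercept + slope * x) for x, y in zip(tiv, hv)]

# Hypothetical volumes in cm^3 (not study data).
tiv = [1300.0, 1400.0, 1500.0, 1600.0]
hv = [6.8, 7.4, 7.3, 8.1]
hva = adjusted_hippocampal_volume(hv, tiv)
```

By construction the residuals average to zero and are uncorrelated with total intracranial volume, which is what makes HVa comparable across head sizes.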
PET images were acquired using a PET/CT scanner. The 11C PIB-PET scan consisting of four 5-minute dynamic frames was acquired from 40–60 minutes after injection [15, 16]. 18F-fluorodeoxyglucose (18F-FDG) PET images were obtained 1 hour after the PIB scan. Subjects were injected with 18F-FDG and imaged after 30–38 minutes, for an 8-minute image acquisition consisting of four 2-minute dynamic frames.
Quantitative image analysis for both PIB and FDG was done using our in-house fully automated image processing pipeline. A global cortical PIB-PET retention ratio was formed by calculating the median uptake over voxels in the prefrontal, orbitofrontal, parietal, temporal, anterior cingulate, and posterior cingulate/precuneus regions of interest (ROIs) for each subject and dividing this by the median uptake over voxels in the cerebellar gray matter ROI of the atlas. FDG-PET scans were analyzed in a similar manner. We used angular gyrus, posterior cingulate, and inferior temporal cortical ROIs, as described in Landau et al, normalized to pons uptake.
While all biomarkers and cognitive tests are continuous measures, the NIA-AA criteria for preclinical AD require that every biomarker and cognitive test be designated normal or abnormal. This requires that cut-points be created in these continuous distributions. The ideal method for selecting biomarker cut-points would be to use autopsy diagnoses as the reference standard of truth [20–22]. Because we do not yet have an adequately large autopsy sample with antemortem 3T MRI, PIB PET and FDG PET, we created cut-points such that a majority of clinically defined AD dementia patients would be deemed abnormal. While we did not have CSF available in our subjects, we had amyloid (PIB-PET) and neurodegenerative (FDG-PET and MRI) biomarkers in all subjects, and were therefore able to stage all subjects in accordance with the NIA-AA criteria [1]. We had two sources of data in the neurodegenerative biomarker category (FDG PET and MRI), and we considered a subject positive for evidence of neurodegeneration if either or both fell below the cut-point.
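As a rough illustration of the ratio calculation, the sketch below pools voxel values across the target ROIs before taking the median. The function name, ROI keys and uptake values are hypothetical, and the actual pipeline involves atlas registration and other processing steps that are not shown.

```python
from statistics import median

def global_pib_ratio(target_rois, cerebellar_voxels):
    """Median uptake over all target-ROI voxels divided by median cerebellar uptake."""
    pooled = [v for voxels in target_rois.values() for v in voxels]
    return median(pooled) / median(cerebellar_voxels)

# Hypothetical voxel uptake values (not study data).
target_rois = {
    "prefrontal": [1.8, 2.0],
    "parietal": [2.2, 1.6],
    "posterior_cingulate_precuneus": [2.4],
}
cerebellar_gray = [1.0, 1.2, 1.4]

ratio = global_pib_ratio(target_rois, cerebellar_gray)
```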
Cut-points were based on estimated percentiles. For example, where higher biomarker values are worse, the cut-point corresponding to 90% sensitivity was the estimated 10th percentile of the AD distribution. Where lower biomarker values are worse, the cut-point was the estimated 90th percentile of the AD distribution. In this example, approximately 90% of ADs are considered abnormal. Biomarker cut-points corresponding to other levels of sensitivity were determined similarly.
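The percentile rule can be made concrete with a small sketch. The text does not specify the percentile estimator, so linear interpolation between order statistics is assumed here; the function names and AD values are hypothetical, and treating a value exactly at the cut-point as abnormal is also an assumption.

```python
def percentile(values, p):
    """Linear-interpolation percentile of a sample, with p between 0 and 100."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def sensitivity_cut_point(ad_values, sensitivity_pct, higher_is_worse):
    """Cut-point at which about sensitivity_pct percent of AD subjects are abnormal."""
    p = 100 - sensitivity_pct if higher_is_worse else sensitivity_pct
    return percentile(ad_values, p)

def is_abnormal(value, cut, higher_is_worse):
    """Values at the cut-point count as abnormal (an assumption of this sketch)."""
    return value >= cut if higher_is_worse else value <= cut

# Hypothetical AD-group values (not study data).
ad_pib = [1.3, 1.5, 1.7, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6]      # higher is worse
ad_fdg = [0.9, 1.0, 1.05, 1.1, 1.15, 1.2, 1.25, 1.3, 1.35, 1.4, 1.6]  # lower is worse

pib_cut = sensitivity_cut_point(ad_pib, 90, higher_is_worse=True)   # 10th percentile
fdg_cut = sensitivity_cut_point(ad_fdg, 90, higher_is_worse=False)  # 90th percentile
n_pib_abnormal = sum(is_abnormal(v, pib_cut, True) for v in ad_pib)
n_fdg_abnormal = sum(is_abnormal(v, fdg_cut, False) for v in ad_fdg)
```

In this toy sample, 10 of the 11 AD values are abnormal on each biomarker, close to the targeted 90%. Applying the same cut-points to CN values would give the proportion of CN subjects classified as abnormal.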
Because the cognitive impairment seen in AD dementia or even MCI could not be used as an external reference for evidence of subtle cognitive decline in CN subjects, we used percentiles of the global summary score from the 450 CN subjects in this study. These percentiles can also be thought of as corresponding to a specificity level in that the 10th percentile of the distribution corresponds to a specificity level of 90%. Note that this approach, unlike the approach used for biomarkers, guarantees that a fixed proportion of subjects will fall below the subtle “cognitive impairment” cut-point.
Demographic features of the 42 AD subjects used to develop biomarker cut-points and the 450 MCSA CN subjects used to assess the preclinical staging criteria are shown in Table 1. Table 1 also shows the demographic characteristics of all 2399 CN subjects in the MCSA. The 450 CN subjects who underwent the full battery of imaging studies were largely representative of the larger MCSA CN sample but had slightly higher cognitive scores.
Figure 2 illustrates individual values for PIB, FDG, and HVa for the AD and CN subjects used in this study. In each panel arrows indicate the biomarker cut-points for five levels of diagnostic sensitivity for AD: 80%, 85%, 90%, 95% and 99%. Table 2 contains the numeric cut-points corresponding to these different diagnostic sensitivities for each of the three imaging biomarkers and the proportion of CNs that was abnormal at each cut point. For example, using a cut-point of 90% sensitivity for AD, the proportions of CN subjects that fall into the abnormal range are: 32% for PIB; 28% for FDG; and 19% for HVa.
The NIA-AA preclinical AD staging criteria require that the presence or absence of subtle cognitive impairment of insufficient severity to qualify for a diagnosis of MCI be established in each subject. We evaluated cut-points at the 5th, 10th and 15th percentile of the distribution of both the global summary score (Figure 3a) and the memory domain composite score (Figure 3b) in the 450 CN subjects.
Figure 3 shows distributions of subjects that fall into preclinical stages 1–3 in our CN subjects using various combinations of cut-points for biomarkers and cognitive impairment. First, we note that the distribution of subjects among categories was very similar whether the global (Figure 3a) or memory summary scores (Figure 3b) were used. We elected to use the global summary thereafter as our primary measure of subtle cognitive impairment. Next, it is evident that two other categories are necessary to classify all CN subjects in our sample. One we call “stage 0”: cognitively normal subjects with no biomarker evidence of AD pathophysiology and no evidence of subtle cognitive impairment. The other category, which we refer to as “Suspected Non-AD Pathophysiology (SNAP)”, consists of subjects with normal amyloid PET but one or both neuronal injury biomarkers abnormal. Additionally, there are a small number of subjects who do not fit into any group, whom we labeled “unclassified.”
Two overall trends are evident in Figure 3a and b. As the biomarker cut-point becomes more lenient (i.e. moves from 80% to 95% sensitivity for AD) subjects move out of stage 0 and into stages 1–3 and SNAP. This is true for all 3 cognitive cut-points evaluated. In contrast, relaxing the cognitive cut-points from 5% to 15% has a much less pronounced effect.
Table 3 illustrates in more detail the breakdown of subject classification at a fixed biomarker cut-point of 90% sensitivity for all biomarkers, and a fixed global cognitive summary score cut-point at the 10th percentile. While any fixed cut-points could be chosen for illustration, with these cut-points, 90% of our AD sample is labeled abnormal on biomarkers and 90% of our CN sample is labeled normal on cognition. A biomarker sensitivity of 90% gives a PIB-PET cut-point of 1.5 (cortical-to-cerebellar ratio), which is commonly used to denote an abnormal scan. With these cut-points, 97% of our CN sample was classified (43% as stage 0, 16% as stage 1, 12% as stage 2, 3% as stage 3, and 23% as SNAP) and 3% was unclassified. Subjects could arrive at stages 2, 3 and SNAP by various combinations of test results. Subjects classified as SNAP had negative PIB PET plus positive FDG PET, HVa or both, with or without subtle cognitive impairment. Unclassified subjects were those with abnormal cognition alone or abnormal cognition plus positive PIB PET but no neurodegeneration. Demographic information for these groups is in Table 4. Figure 4 shows the distribution of subjects in stages 0–3 and SNAP according to the amyloid biomarker, neurodegeneration biomarker and cognitive status with the same cut-points.
Using these 90% biomarker and 10% cognitive cut-points, the proportion of subjects who were APOE ε4 carriers was: stage 0, 24%; stage 1, 40%; stage 2, 40%; stage 3, 58%; SNAP, 13% (P < 0.001 for the difference across groups).
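The classification rules described in this section can be summarized as a small decision function. This is a sketch of the operational logic as described in the text, with a hypothetical function name and boolean inputs that are assumed to have been derived already from the chosen cut-points.

```python
def preclinical_category(amyloid_pos, fdg_pos, hva_pos, cognition_abnormal):
    """Assign a CN subject to stage 0-3, SNAP, or unclassified."""
    # Neurodegeneration is positive if either FDG PET or HVa is abnormal.
    neurodegeneration = fdg_pos or hva_pos

    if not amyloid_pos:
        if neurodegeneration:
            # SNAP applies with or without subtle cognitive impairment.
            return "SNAP"
        # Abnormal cognition alone does not fit any category.
        return "unclassified" if cognition_abnormal else "stage 0"

    if neurodegeneration:
        return "stage 3" if cognition_abnormal else "stage 2"

    # Amyloid-positive without neurodegeneration.
    return "unclassified" if cognition_abnormal else "stage 1"

# Example: amyloid-positive, abnormal FDG only, cognition above the cut-point.
example = preclinical_category(True, True, False, False)
```

The two "unclassified" branches correspond to the two profiles named in the text: abnormal cognition alone, and abnormal cognition with positive PIB PET but no neurodegeneration.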
Based on our operational approach to the NIA-AA criteria, at 90% biomarker and 10% cognitive cut-points, 31% of our CN subjects met the NIA-AA criteria for preclinical AD (stages 1–3), 43% were in stage 0 and 23% fell into the SNAP category. Only 3% of subjects could not be classified by our approach. The concept of preclinical AD originated with a literature documenting the presence of AD pathology in approximately a third of elderly CN subjects who came to autopsy [24–28]. Many studies have documented the presence of AD pathophysiological processes in living cognitively normal elderly subjects using PIB-PET imaging [17, 23, 29–39], 18FDG-PET imaging [33, 34, 40, 41], CSF assays [20, 42–48], and structural MR [49–56]. We hypothesize that stage 1–3 subjects have entered the AD pathway and, if they live long enough, will progress to incident MCI and then AD dementia. Subjects in stage 0, as defined, neither have subtle cognitive impairment nor abnormal AD biomarkers now. It is possible that some stage 0 subjects could move to stage 1 or beyond in the future.
Approximately a quarter of our CN subjects, 23%, were designated as SNAP. We believe that SNAP does not represent a stage of preclinical AD, but rather a distinct biologically based category where amyloid biomarkers are normal but neuronal injury biomarkers are abnormal. We suspect, but cannot prove at this time, that such subjects represent the preclinical stage of non-AD pathophysiological processes. While most cases of dementia in elderly subjects are found at autopsy to have multiple pathologies that include AD, up to one-third are primarily attributable to pathologies other than AD, chiefly cerebrovascular disease and synucleinopathy [57–62]. It is therefore expected that preclinical forms of the non-AD pathologies must exist in elderly CN subjects recruited from a population-based sample. Subjects with predominantly cerebrovascular disease or synuclein pathologies but little or no AD pathology should present with a biomarker profile of normal amyloid PET and abnormalities on MRI and FDG [63, 64]. The low proportion of APOE ε4 carriers among SNAP subjects, just 13%, provides strong support for the assertion that many SNAP subjects are in a different mechanistic pathway from those in stages 1–3, where the proportion of APOE ε4 carriers is 3 to 4 times greater.
In this preliminary analysis, we used the same sensitivity level for all biomarker cut points. In the future, alternative approaches that reflect the underlying biology and longitudinal evolution may be preferred. For example, sensitivity for AD of 90% corresponds to a PIB-PET cut-point of 1.5. While this cut point is widely used to diagnose AD, a lower amyloid PET cut-point may be better suited to answer the question of amyloid positivity in CN subjects [49, 52, 65].
Alternative methods for selecting regions of interest for the neurodegeneration imaging biomarkers may be superior to those we have used here. We chose hippocampal volume as the structural MRI measure for this preliminary analysis because it is the most studied and validated MRI measure at this point in time. However, more complex multi-ROI measures including isocortex might be superior [21, 49, 51, 66]. Alternative ROI combinations might also be useful in FDG-PET.
The NIA-AA preclinical criteria specify that in stage 3 subtle cognitive “decline” will be present but do not offer specific guidance on how this should be operationalized. “Decline” is described both in longitudinal and cross-sectional terms, i.e., “evidence of subtle change from baseline level of cognition” or “poor performance on more challenging cognitive tests”. While longitudinal measures of decline have the obvious advantage of controlling for inter-subject variation in baseline performance, implementation of a simple cross-sectional measure of low cognitive performance such as the lowest 10th percentile is more straightforward. Prior longitudinal cognitive test data might not be available in many settings, e.g., preventative therapeutic trials. Somewhat surprisingly, the distribution of subjects among the various categories was similar whether the memory domain score or the composite domain score was used. One possible explanation is that although the preclinical criteria emphasize memory, a body of literature exists that points to declines in domains other than memory as the initial cognitive signal of impending AD [68, 69]. A cut-point based solely on a memory battery would miss these subjects whereas the composite might capture them.
Our approach to cross-sectional cut-point definition for subtle cognitive impairment was different from the approach for cut-point definition for AD biomarkers because the subtle cognitive impairment seen in the preclinical stage of the disease is notably different from that seen in demented subjects or subjects with MCI. In contrast, values for biomarkers, especially amyloid biomarkers, can overlap considerably in CN and demented subjects. By using the cohort under study to define a cognitive cut-point, we lacked an independent means of defining abnormality. This meant that a fixed proportion (5%, 10%, 15%) of subjects were designated as cognitively impaired. This has obvious potential for erroneously classifying as abnormal a few subjects at the margin who would not have been so classified under a slightly different definition of abnormal. We suspect that this effect underlies many of our “unclassified” subjects. The problem of using cognitive impairment as both a criterion for preclinical AD and as an outcome in longitudinal observational studies and therapeutic trials will need further exploration. Thus, operationalization of criteria for subtle cognitive impairment is complex, and the definition of the cognitive threshold we have provisionally chosen can likely be improved upon.
Because the MCSA is a population-based study, our cognitively normal subjects differ from those recruited into studies such as the Alzheimer’s Disease Neuroimaging Initiative or other clinic-based samples. Age, education and co-morbidities greatly influence the likelihood and rate of progressing to dementia, and therefore evaluating new diagnostic criteria in samples that approximately reflect these variables as they exist in the general population is essential for generalizability and external validity of results [71, 72]. The high prevalence of SNAP in this preliminary exercise underscores the importance of performing studies in subjects where there are as few implicit inclusion and exclusion criteria as possible. Results might be different in samples drawn from memory clinics, where recruitment biases might reduce the number of non-AD etiologies.
While the new NIA-AA preclinical criteria broke new ground conceptually, many operational issues were not addressed. These include standardization of biomarker measures, defining biomarker cut-points, how to address discrepancies within biomarker class (e.g., abnormal FDG but normal hippocampal volume), the definition of subtle cognitive impairment, and how to address the non-AD pathophysiological processes that are present in elderly populations. Our study has limitations. Our subjects had only MRI and PET imaging biomarkers available, not CSF. The number of cognitive testing sessions, and hence practice effects, varied among subjects in our cohort. Other important options remain to be evaluated, including alternative biomarker and cognitive cut-points and alternative imaging measures. However, with this operational approach to implementation, the NIA-AA preclinical AD guidelines function adequately in a population-based sample of elderly subjects and, therefore, should be useful in planning future observational and therapeutic studies.
This study was supported by the NIH/National Institute on Aging (R01 AG11378, U01 AG006786, P50 AG16574, C06 RR018898; C.R.J.), the Alexander Family Alzheimer’s Disease Research Professorship of the Mayo Foundation, USA, and the Robert H. and Clarice Smith Alzheimer’s Disease Research Program of the Mayo Foundation, USA.