Since the introduction of amyloid imaging nearly ten years ago, this technique has gained widespread use and acceptance. More recently, published reports have begun to appear in which amyloid imaging is used to detect the effects of anti-amyloid therapies. This review will consider the issues involved in the use of amyloid imaging in the development and evaluation of drugs for the treatment of Alzheimer disease. Current evidence regarding the postmortem correlates of in vivo amyloid imaging data is considered. The application of amyloid imaging to screening subjects for trials and use as an outcome measure is discussed in light of longitudinal changes in the in vivo amyloid signal. While the bulk of this review is directed at symptomatic patients with dementia, consideration is given to the use of amyloid imaging in non-demented subjects as well. Similarities and differences of cerebral amyloid assessment by amyloid imaging and CSF measurements are delineated and an agenda for further research to improve the applicability of amyloid PET to clinical trials is proposed.
This review will focus on the application of amyloid imaging as a tool in the development and evaluation of drugs for the treatment of Alzheimer disease. To avoid confusion, we will use the term “pathophysiology of Alzheimer disease” when referring to the full spectrum of underlying biological abnormalities that begin before symptoms and extend into the clinically evident phases. The main focus will be on the use of amyloid imaging in symptomatic Alzheimer disease [which we will refer to as Alzheimer disease dementia (AD dementia)], but attention also will be given to prodromal and pre-clinical manifestations, including Mild Cognitive Impairment (MCI) - particularly in the context of predicting progression to clinically “Probable AD dementia” (McKhann et al., 1984). From the outset we must be careful to stress that amyloid imaging, as the name implies, is intended to detect brain pathophysiology, but not to make a clinical diagnosis. It can be an important adjunct to a clinical evaluation in making more accurate clinical diagnoses - much as postmortem pathology can ultimately confirm the presence of pathologically proven AD dementia if coupled to a typical clinical history. Used in isolation, amyloid imaging cannot diagnose AD dementia or MCI, nor can it detect normal or abnormal aging.
Amyloid imaging agents typically detect beta-sheet rich fibrillar deposits of amyloid β-protein (Aβ) in plaques and cerebrovascular amyloid (CAA). For example, tracer binding to plaques and CAA in postmortem brain tissue can be abolished by destruction of the beta-sheet fibrillar structure by formic acid treatment (Ikonomovic et al., 2008). Fibrillar Aβ is a major component of compact/cored plaques, whether or not they are neuritic. In addition, fibrillar Aβ can be found in varying degrees in plaques that have been loosely characterized as “diffuse”, and these diffuse plaques can be detected to some extent by amyloid imaging agents (Burack et al., 2010; Ikonomovic et al., 2008; Lockhart et al., 2007). However, diffuse plaques that are sometimes described as “amorphous” or “fleecy” such as those found in the cerebellum contain little beta-sheet structure and are not detectable by typical amyloid imaging agents (Ikonomovic et al., 2008). The fibrillar Aβ deposits detected by prototypical amyloid imaging agents may be unique to human brain, as even compact-appearing plaques in squirrel monkey and amyloid precursor protein (APP) transgenic mice do not produce significant binding (Klunk et al., 2005a; Rosen et al., 2009; Toyama et al., 2005).
With regard to the amyloid imaging agents, the focus of this review will be on the most widely evaluated positron emission tomography (PET) tracer, Pittsburgh Compound-B (PiB) (Klunk et al., 2004). At the time of this writing, there have been single, small published studies using each of the F-18-labelled tracers, [F-18]florbetaben (18F-BAY94-9172 or AV-1; (Rowe et al., 2008)), [F-18]florbetapir (AV-45; (Wong et al., 2010)) and [F-18]flutemetamol (3’F-PiB or GE-067; (Nelissen et al., 2009)) in AD dementia patients. Another F-18-labeled agent has been used in preclinical studies, but no human data have been published (Jureus et al., 2010). There is insufficient published evidence available to evaluate any of these F-18 tracers for their potential use in drug trials at present. While the findings discussed below for PiB PET may ultimately be found to extend to these F-18-labeled tracers as well, this cannot be assumed until appropriate studies have been repeated with each individual tracer or until pharmacological equivalency to PiB has been established by direct comparison in the same subjects. This becomes especially true in studies aimed at detecting the first signs of in vivo amyloid deposition in cognitively normal subjects, when the lower signal-to-noise ratio of the F-18-labeled tracers may become important.
Another F-18-labeled tracer, [F-18]FDDNP has fundamentally different properties from all of these tracers and will not be discussed here (Small et al., 2006; Thompson et al., 2009; Tolboom et al., 2010; Tolboom et al., 2009a; Tolboom et al., 2009b; Tolboom et al., 2009d).
In this review, PiB imaging will be discussed in the context of how it might be used in therapeutic clinical trials. For example, it will be assumed that early trials will be conducted in highly specialized referral centers. Although the population of AD dementia subjects in these centers is not necessarily representative of a general AD dementia population as might be captured in an epidemiological study, it is likely to be the target population for AD dementia drug trials in the near future. Essentially all of the PiB PET studies discussed below have been performed in such “specialized center” settings.
Although this review focuses on amyloid imaging in isolation, it is clear that this technique cannot fulfill all of the biomarker needs of any clinical trial and will need to be considered as a part of a broader biomarker arsenal. Other promising biomarkers are discussed separately in this position statement. Furthermore, there is significant overlap between the utility of amyloid imaging and measurement of CSF Aβ42 as a screening tool (but not as a trial outcome measure), and so an attempt will be made to address the areas where these two biomarkers may be equivalent and areas where one technique may hold unique advantages.
This review will be organized around the most likely applications of amyloid imaging to clinical trials, i.e., use as a screening tool or as an outcome measure, and in trials of AD dementia or pre-dementia syndromes associated with the pathophysiology of Alzheimer disease. However, within this structure, discussion of the typical aspects of biomarker characterization/validation will be incorporated, including: 1) cross-sectional association between PiB retention and clinical diagnosis; 2) longitudinal change of PiB retention as a marker of AD dementia progression; 3) prediction of progression of prodromal and preclinical syndromes associated with the pathophysiology of Alzheimer disease; and 4) postmortem correlation/validation studies. Among these four, the most basic and important characterization/validation of most biomarkers is correlation with postmortem assessment of pathology, so this will be discussed first.
Amyloid imaging is first and foremost a method for the detection of Aβ pathology, rather than a surrogate of any other aspect of the pathophysiology of Alzheimer disease or the clinical manifestations of this disorder. However, from the outset, it is recognized that no in vivo measure of pathology is likely to be as sensitive as modern postmortem histological and biochemical measures of pathology detection. For comparison to in vivo amyloid imaging, it is important to choose the method of postmortem analysis that best reflects the in vivo target(s). In the case of in vivo amyloid imaging, the target is Aβ deposition - in all of its forms. This would include all forms of Aβ plaques (e.g., diffuse, cored, neuritic, etc.) as well as cerebral amyloid angiopathy (CAA). Of course, the targets do not include other common forms of pathology associated with AD dementia, in particular, tau pathology in the form of neurofibrillary tangles, dystrophic neurites and neuropil threads. Thus any postmortem grading system or quantitative measure that includes tau pathology should not be a component of a postmortem validation of amyloid imaging. This would include the Braak and Braak staging system of neurofibrillary tangles (Braak and Braak, 1991) and the NIA-Reagan criteria (1997), because the latter incorporates Braak tangle staging in the determination of the “likelihood” of AD dementia. The optimal postmortem correlate for in vivo amyloid imaging may be a specific measure of total Aβ pathology by the use of sensitive and specific anti-Aβ antibodies. These antibodies can be applied in quantitative biochemical (e.g., enzyme-linked immunosorbent assays or ELISA) or immunohistochemical (IHC) analyses of Aβ load. Other biochemical methods to quantify Aβ could apply as well, but these often require specialized equipment and expertise that may be less available.
The literature contains reports of 24 cases that had been studied with PiB PET prior to autopsy (n=14) or after biopsy (n=10). These studies, described in Table 1, meet our aforementioned methodological goals to varying degrees but they do provide important insights about the sensitivity and specificity of in vivo amyloid imaging for detecting the presence of postmortem Aβ pathology (Bacskai et al., 2007; Burack et al., 2010; Cairns et al., 2009a; Ikonomovic et al., 2008; Sojkova et al., 2011; Villemagne et al., 2009). First, in all of the 12 cases with positive PiB PET scans in vivo, postmortem analyses confirmed the presence of significant Aβ deposition (Table 1). Thus, the specificity of PiB PET in this small sample was 100%. This is not surprising, given the discussion above that the postmortem measures are expected to be more sensitive than PET measures. Also consistent with this relative sensitivity is the finding that 3/12 cases that were PiB-negative in vivo showed evidence of some Aβ pathology postmortem (Table 1; highlighted). The Sojkova et al. (Sojkova et al., 2011) Case-B is difficult to interpret: while there were moderate numbers of neuritic plaques by CERAD criteria (Mirra et al., 1991) in the parietal and temporal lobes, there were sparse numbers in the frontal lobe, and Aβ IHC in the precuneus showed no Aβ deposits (and there was no CAA). Thus, it is unclear if this case represents a mismatch between in vivo PiB PET and postmortem pathology. Two cases do appear to be clear mismatches. Cairns et al. (Cairns et al., 2009a) have reported a negative PiB scan in the presence of biochemically and immunohistologically detectable Aβ at levels expected to be detectable in vivo. In a biopsy study, Leinonen et al. reported “Case #6” with high numbers of plaques by IHC (although sparse neuritic plaques) but a negative PiB scan (Leinonen et al., 2008). 
These cases may represent examples of non-fibrillar amyloid deposition (i.e., a type of diffuse plaques) or other alterations in the tertiary structure of Aβ, as has been reported in transgenic mice (Klunk et al., 2005b; Toyama et al., 2005). Thus, as expected, the sensitivity of amyloid imaging for all forms of Aβ deposits will be somewhat less than 100%. However, it should be noted that none of the PiB-negative cases with postmortem Aβ deposits met criteria for definite AD dementia, so the sensitivity of PiB for Aβ deposition in pathologically proven definite AD dementia is likely to be closer to 100% than its sensitivity for any form of Aβ deposition. A similar finding has recently been published for the F-18 agent, florbetapir (Clark et al., 2011).
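The sensitivity and specificity figures implied by these 24 cases can be checked with a short calculation. The sketch below is illustrative only: the counts are taken from the text (12 PiB-positive cases, all with Aβ pathology; 12 PiB-negative cases, 3 with some Aβ pathology), and treating the ambiguous Sojkova et al. Case-B as pathology-negative in the second estimate is an assumption made here, not a classification from the original reports.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of pathology-positive cases that the scan flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of pathology-negative cases that the scan cleared."""
    return true_neg / (true_neg + false_pos)


# Counts as stated in the text: 24 cases, 12 PiB-positive and 12 PiB-negative.
pib_pos_with_abeta = 12      # every PiB-positive case had Abeta postmortem
pib_pos_without_abeta = 0
pib_neg_with_abeta = 3       # includes the ambiguous Sojkova et al. Case-B
pib_neg_without_abeta = 12 - pib_neg_with_abeta

spec = specificity(pib_neg_without_abeta, pib_pos_without_abeta)
sens_all = sensitivity(pib_pos_with_abeta, pib_neg_with_abeta)

# Assumption for illustration: count only the two clear mismatches,
# i.e. treat Case-B as pathology-negative.
sens_excl_case_b = sensitivity(pib_pos_with_abeta, pib_neg_with_abeta - 1)

print(f"specificity: {spec:.0%}")
print(f"sensitivity, 3 mismatches: {sens_all:.0%}")
print(f"sensitivity, 2 clear mismatches: {sens_excl_case_b:.1%}")
```

With so few cases, either sensitivity estimate carries wide uncertainty, which is consistent with the cautious evidence grading that follows.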
Because the numbers are relatively small, the seven primary, peer-reviewed studies discussed above (six using standard neuropathological criteria) in 14 autopsy and 10 biopsy cases currently provide only “sufficient evidence of an association between PiB PET and postmortem assessment of Aβ pathology.”
Two broad uses of most biomarkers in clinical trials are for entry screening and as an outcome measure. Screening into a clinical trial is typically based on cross-sectional (one-time) collection of biomarker data. Outcome measures require acquisition of pre- and post-treatment (i.e., longitudinal) data. The breadth of applicability of amyloid imaging in clinical trials will be different whether it is used as a screening tool or as an outcome measurement of drug efficacy. For example, the use of amyloid-positivity as an inclusion criterion may be applicable to almost all clinical trials directed at AD dementia, MCI that is “prodromal to AD dementia” as well as primary prevention trials directed at cognitively normal individuals in whom the process of cerebral β-amyloidosis has begun. The purpose of this screening is to increase the homogeneity of the clinical trial population by including only those with Aβ deposition. Thus amyloid imaging as a screen could be useful regardless of the mechanism of action of the putative therapeutic. In contrast, amyloid imaging as an outcome measure is likely to be most applicable to therapeutics designed to significantly affect fibrillar Aβ levels over time.
Although a diagnosis of clinically “Probable AD dementia” made using standard criteria (McKhann et al., 1984) in the setting of a specialized center such as the Alzheimer Disease Centers in the US has been confirmed by autopsy in over 95% of cases (Mayeux et al., 1998), this number can drop to near 70% in less specialized settings (Knopman et al., 2001). Inclusion of only amyloid-positive AD dementia subjects in clinical trials is likely to increase the homogeneity of the study population due to exclusion of non-Alzheimer dementias. This would be most important in trials of anti-amyloid therapies, but is likely to be important for any AD dementia trial. Table 2 shows that across 14 specialty centers plus the Alzheimer’s Disease Neuroimaging Initiative (ADNI), and using the NINCDS-ADRDA criteria for the diagnosis of AD, 328 of 341 (96%) clinically diagnosed AD dementia patients were amyloid-positive. This is consistent with the clinical diagnostic accuracy of Alzheimer Disease Centers previously found by autopsy (Mayeux et al., 1998). This finding suggests that the use of amyloid imaging can extend a very high rate of diagnostic accuracy to even less specialized centers when used as a screening tool in conjunction with a clinical diagnosis of dementia.
Although this sensitivity for clinical AD dementia is encouraging, the specificity of amyloid imaging for the clinical diagnosis of AD dementia deserves some attention in the context of using a biomarker for screening entry into AD dementia clinical trials. Table 2 suggests that the specificity of amyloid imaging for the “diagnosis” of AD dementia is ~76% when including only AD dementia and control subjects and would be worse with MCI subjects included. However, it must be kept in mind that amyloid imaging will not be used in isolation to make a diagnosis of AD dementia, MCI or “normal aging”. Amyloid imaging will be used to assess the underlying pathophysiology of subjects who have already been clinically evaluated and given a preliminary diagnosis. In this sense, the 24% of subjects in the cognitively normal group who are PiB-positive are not “false positives”; rather, based on the autopsy studies discussed above, these are more likely to be true positives for the presence of Aβ deposition, and the same will apply to the MCI subjects discussed below.
It is evident from Table 2 that the absolute value (and dynamic range) of PiB retention is dependent on the particular center conducting the study. This partly relates to trivial issues like the use of different units to express the outcome (i.e., DVR, SUVR/tissue ratios, BPND, MCBP, etc.), but also relates to some true differences caused by scanner differences, use of single regions or global means, use of atrophy correction, the size and location of the reference region and other factors. Even so, the values do tend to fall into a relatively narrow range. In addition, the cutoff values used in these studies all seem to identify similar subjects as amyloid-positive and amyloid-negative. This simplifies the task of standardization across studies. Over these 15 peer-reviewed studies of 341 AD dementia and 651 cognitively normal subjects, the difference in PiB retention observed in AD dementia subjects and cognitively normal controls was highly significant (p<0.001), and the effect sizes were very large (3.2 ± 1.4).
The primary, peer-reviewed studies discussed above, using standard clinical diagnostic criteria for AD dementia in 992 cases (combined AD dementia and controls; χ2=469; p<0.0001) provide “sufficient evidence of a direct relationship between PiB PET signal and the clinical diagnosis of AD dementia.”
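As a rough plausibility check on the reported statistic, a 2×2 Pearson chi-square can be recomputed from the pooled counts. This is only a sketch under stated assumptions: the AD dementia counts (328 of 341 amyloid-positive) come from the text, but the control split is reconstructed here from the ~24% amyloid-positive rate among 651 controls, and the text does not state which form of the test the authors used. Under these assumed counts the statistic comes out at about 467, close to the reported value.

```python
def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square, without continuity correction, for [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# From the text: 328 of 341 AD dementia patients were amyloid-positive.
ad_total, ad_positive = 341, 328

# Assumption: about 24% of the 651 controls were amyloid-positive.
control_total = 651
control_positive = round(0.24 * control_total)

chi2 = pearson_chi2_2x2(
    ad_positive,
    ad_total - ad_positive,
    control_positive,
    control_total - control_positive,
)
print(f"assumed control positives: {control_positive}")
print(f"chi-square (1 df): {chi2:.1f}")
```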
There are other, more “fine-grained” approaches that have been employed to provide even stronger support for the link between a biomarker and a clinical diagnosis. One of these is demonstration of statistical differences in group means between control vs. MCI and MCI vs. AD dementia. Although this approach is commonly used, application of it to amyloid imaging tends to obscure the major advantage of this particular biomarker: distinguishing amyloid-positive subtypes of controls, MCI and AD dementia subjects. It makes the least sense to compare mean values of PiB retention in a group of MCI subjects, when most MCI cohorts are composed of similar-sized subgroups of two very different populations: those with detectable amyloid (and presumably prodromal AD dementia; see below) and those without. The strength of amyloid imaging is to distinguish the amyloid-positive subtypes within these diagnostic groups, not to distinguish between the diagnostic groups. That having been recognized, Table 3 shows several studies in which amyloid imaging has shown statistically significant differences in control vs. MCI or MCI vs. AD dementia comparisons (Devanand et al., 2010; Forsberg et al., 2010; Forsberg et al., 2008; Jack et al., 2008; Kemppainen et al., 2007; Koivunen et al., 2008; Li et al., 2008; Lowe et al., 2009; Mormino et al., 2009; Okello et al., 2009b; Pike et al., 2007; Rowe et al., 2010; Tolboom et al., 2009d), although when the control group contains subjects with very high levels of amyloid, the control vs. MCI distinction may not be apparent (Forsberg et al., 2008; Jack et al., 2008; Lowe et al., 2009).
Although these 13 primary, peer-reviewed studies using standard diagnostic criteria and comprising 960 cases (combined controls, MCI and AD) show the ability to discriminate between control vs. MCI in 8/12 studies and MCI vs. AD dementia in 9/10 studies, the significant overlap at the individual subject level between the MCI group and both the control and the AD dementia groups suggests that there is only “sufficient evidence of an association between PiB PET and the distinction between control vs. MCI and MCI vs. AD dementia.”
Another fine-grained approach is the correlation of a continuous measure of the biomarker (e.g., PiB SUVR) with continuous measures of cognition or clinical function [e.g., episodic memory scores or CDR sum of boxes (SOB)]. This sort of correlation is not a strength of amyloid imaging, although several studies do report significant correlations of this type. Five studies have shown a significant correlation between degrees of PiB retention and levels of cognitive performance when combining controls, MCI and AD dementia subjects (Edison et al., 2007; Forsberg et al., 2010; Pike et al., 2007; Rentz et al., 2010; Tolboom et al., 2009b), but one has not (Jagust et al., 2009). Several studies (3 in controls; 1 in MCI and 3 in AD dementia) have found a significant correlation between PiB and cognition in single diagnostic groups (Darreh-Shori et al., 2010; Engler et al., 2006; Grimmer et al., 2009a; Mormino et al., 2009; Pike et al., 2007; Rentz et al., 2010) - although one of these studies suggests the correlation is mediated through hippocampal atrophy (Mormino et al., 2009). At least five studies failed to find any correlation between continuous PiB retention measures and continuous cognition scores in one or more of the diagnostic groups when the latter were considered in isolation (Aizenstein et al., 2008; Forsberg et al., 2010; Furst et al., 2010; Pike et al., 2007; Rowe et al., 2010). Very demanding memory tests may be required to demonstrate this continuous correlation in cognitively normal controls (Rentz et al., 2010). Most likely, the usually weak correlations between continuous measures of amyloid imaging and cognition are due to the fact that Aβ deposition is a very early event in the full spectrum of pathophysiological changes in this disorder and does not necessarily correlate quantitatively with late events like cognition and clinical function [for a review, see (Jack et al., 2010)].
When done side-by-side, other later stage biomarkers (e.g., CSF tau protein levels, hypometabolism and brain atrophy) tend to correlate better quantitatively with degree of cognitive impairment than does PiB (Engler et al., 2006; Jack et al., 2009; Jagust et al., 2009; Mormino et al., 2009; Storandt et al., 2009).
Given the fairly weak correlations and the contradictory finding, these studies can provide only “limited/suggestive evidence of an association between continuous PiB PET levels and continuous measures of cognitive/clinical performance.”
Longitudinal change in a biomarker is often considered a surrogate for biological progression of disease. In the case of AD, it is increasingly believed that the noise in biomarkers over time may be less than that in cognitive or functional measures of disease progression. In turn, it is believed that this decreased variability will facilitate detection of drug-induced changes in disease progression, i.e., a disease-modifying effect. Therefore, change in a biomarker over time (i.e., the natural history of that biomarker) is increasingly being considered as an outcome measure for clinical trials. Driven largely by the evaluation of a variety of analytical approaches to the measurement of MRI volumetry and cerebral metabolism measured by FDG PET, a metric has evolved that is the sample size required for a drug-induced 25% reduction in the rate of change in a biomarker over some specified period of time. This metric has become a staple of biomarker comparisons in the ADNI (Cummings, 2010; Weiner et al., 2010).
However, this 25% rate-reduction metric may not be well-suited for amyloid imaging for two reasons. First, PiB retention increases slowly over time. Aβ deposition is believed to begin 10–15 years prior to the diagnosis of AD dementia and, as will be discussed below, continues to progress slowly during the clinical course of AD dementia. Obviously, an amyloid-free individual destined to develop typical AD dementia must have an increase in PiB retention in order to progress from the PiB retention typical of controls (~1.1 SUVR units) to that typically found in AD dementia (>2.0 SUVR units). However, this accumulation often occurs over 10–20 years, suggesting a rate of increase of 0.05–0.10 SUVR units per year. Given that the test-retest variability is on this same order, detecting a 25% reduction in this rate will be difficult. Second, and more importantly, this slow rate of change becomes moot when considering the fact that achieving a 25% reduction in the rate of Aβ accumulation over time may not be clinically meaningful unless the drug is begun when there is still little or no Aβ deposition. Any clinically relevant anti-amyloid drug will likely need to actually decrease the Aβ fibrillar load (i.e., PiB retention) from baseline (although other forms such as soluble oligomers may be targeted by some Aβ-lowering treatments). This means a >100% reduction in the rate of increase of PiB signal over time. This is not an unreasonable goal, and those contemplating the use of amyloid imaging as an outcome biomarker in trials have the unique advantage of being able to refer to published data showing that a significant reduction in fibrillar Aβ load could be detected over 78 weeks with only 20 mild-moderate AD dementia patients in the treatment arm and 8 in the placebo arm (Rinne et al., 2010). Using the passive immunotherapy, bapineuzumab, in a Phase 2 trial, Rinne et al. reported a decrease of 0.09 PiB SUVR units in the bapineuzumab-treated patients over 78 weeks that was significant both when compared to the patients’ own baseline PiB retention and when compared to the increase of 0.15 SUVR units observed in the placebo group over the same 1.5 year period of time. This outcome is thus equivalent to a 160% decrease in the rate and represents a 25% reduction in the absolute amyloid load (not rate) of the treated group compared to the placebo group. As will be seen below, the increase observed in the bapineuzumab placebo group is typical of that observed in AD dementia natural history studies. It is the characterization of this natural history as a foundation for drug trials that makes longitudinal studies of amyloid imaging measures important.
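The rate arithmetic in this argument can be made explicit. The following sketch uses the approximate values quoted in the text (about 1.1 SUVR in controls, about 2.0 SUVR in AD dementia, accumulation over 10 to 20 years, and a placebo-group increase of 0.15 SUVR units over 78 weeks). The treated-group change is taken here as −0.09 SUVR units, which is the value consistent with the stated 160% decrease in rate; the variable and function names are illustrative only.

```python
# Approximate values quoted in the text.
control_suvr = 1.1
ad_suvr = 2.0


def annual_rate(years: float) -> float:
    """Average SUVR increase per year needed to go from control to AD levels."""
    return (ad_suvr - control_suvr) / years


# Natural accumulation rate for the fast (10 y) and slow (20 y) scenarios.
natural_rates = {years: annual_rate(years) for years in (10, 20)}

# Size of the effect a 25% rate reduction would produce each year.
effect_of_25_percent = {y: 0.25 * r for y, r in natural_rates.items()}


def rate_reduction(placebo_change: float, treated_change: float) -> float:
    """Relative reduction in rate of change versus placebo (1.0 means 100%)."""
    return (placebo_change - treated_change) / placebo_change


placebo_change = 0.15    # SUVR units over 78 weeks, from the text
treated_change = -0.09   # assumed value, consistent with the 160% figure
reduction = rate_reduction(placebo_change, treated_change)

for years, rate in natural_rates.items():
    print(f"{years} y: {rate:.3f} SUVR/y, 25% effect = "
          f"{effect_of_25_percent[years]:.4f} SUVR/y")
print(f"rate reduction versus placebo: {reduction:.0%}")
```

The yearly effect of a 25% rate reduction is only one to two hundredths of an SUVR unit, well below a test-retest variability on the order of the natural rate itself, whereas a net decrease from baseline produces a much larger separation between arms.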
This single study, performed using standard methodology and showing detection of a drug-induced change in the absolute level of amyloid load constitutes “limited/suggestive evidence of an association between PiB PET and drug-induced changes in disease biology.”
Given the newness of amyloid imaging as a biomarker, there is still relatively little longitudinal data in the published literature. The few small studies that have been published suggest that, in individuals who are amyloid-positive at baseline, a relatively slow increase in amyloid deposition occurs across the full spectrum of the illness from preclinical stages to symptomatic AD dementia. Individuals who are amyloid-negative at baseline tend to show little change over time - although some become amyloid-positive and then progress. Amyloid-positive individuals rarely, if ever, revert to amyloid-negative status. While this is apparent at the individual level, group increases are not always observed at the AD dementia stage. The first longitudinal follow-up study of the 16 AD dementia subjects included in the original report of PiB PET imaging (Klunk et al., 2004), showed no significant group change over 2 years of follow-up in any brain area examined (Engler et al., 2006). Similar findings were reported for group-level determinations in 14 AD dementia patients studied at the Turku PET Center over 2 years (Scheinin et al., 2009). However, closer inspection of these data showed that a majority of the AD dementia patients in the Engler et al. study tended to show a combination of increased PiB retention and decreased cerebral metabolism (Klunk et al., 2006). Similarly, while only the medial frontal cortex showed a statistically significant (4.3%) group increase in the Turku study (Scheinin et al., 2009), 10 of the 14 subjects tended to show an increase in PiB retention. Similar results can be seen in the ADNI natural history data, where group changes were not significant but 3 of the 12 AD dementia patients showed a significant increase in PiB retention over 1 year (Jagust et al., 2010). In a group of 21 cognitively normal, 32 amnestic MCI and 8 AD dementia subjects, Jack et al. (2009) found that the annual rate of change in global PiB retention ratio was significantly greater than zero over all subjects (p<0.001), and individually among cognitively normal (p = 0.002), and amnestic MCI subjects (p = 0.008), with a trend in Alzheimer’s disease (p = 0.11). Overall, the rate of change was small (0.03–0.06 SUVR units per year) across these 3 groups but tended to be higher in the subjects who were amyloid (PiB) positive at baseline (Jack et al., 2009). This rate matches the expected rate of change mentioned above. Grimmer et al. followed 24 AD dementia patients and found a significant increase in PiB retention of ~0.14 SUVR units over 24 months (8.7 ± 14.3%; annual rate=3.92%) (Grimmer et al., 2010). Grimmer et al. found that the increase was dependent on apolipoprotein-E (ApoE) ε4 gene dose, with the 5 homozygotes showing a 0.31 ± 0.27 SUVR increase over the 2 years. These latter two studies suggest that the rate of amyloid accumulation is relatively constant over the course of the disease - at least through the phase of mild AD dementia. This result suggests that the course of PiB-detectable Aβ deposition over time is best described by either a linear model or a sigmoid with a very gradual incline that does not level off until the late stages of AD dementia.
Two studies have examined the relationship between the rate of change in PiB retention and the rate of cognitive decline on CDR and MMSE and differ on their results, so that insufficient evidence exists to comment on this specific relationship (Grimmer et al., 2010; Jack et al., 2009).
Given some contradictory evidence about the ability to detect change in PiB retention over time, the reports discussed above constitute only “limited/suggestive evidence of an association between PiB PET signal and disease progression.”
There is growing consensus that it will be necessary to study drugs earlier than the stage of clinical dementia in order to find robust treatment effects for the pathophysiology of Alzheimer disease. The first difficulty in conducting these early-intervention trials is that clinical diagnosis becomes less and less accurate as we move into these prodromal stages and clinical evaluation as currently conducted becomes useless in presymptomatic phases. The second difficulty is that it may take prohibitively long periods of time to determine a drug-effect on clinical measures alone when subjects are enrolled at very early stages. Thus, biomarkers are likely to play their most valuable role in clinical trials of “not-yet-demented” subjects. In the context of such trials, biomarkers can play three important roles. The first two roles are for screening and as an outcome measure, as discussed above for trials in AD dementia. The third role is staging non-demented subjects to select those that are likely to show a significant clinical change over a relatively short period of time (e.g., conversion from a diagnosis of MCI to AD dementia). A question that must be asked about a biomarker in this context is: can the biomarker accurately identify subjects who are destined to progress to the next clinical stage? Therefore, the next two sections will review cross-sectional studies that have measured PiB retention at baseline in MCI or cognitively normal subjects (i.e., relevant for screening purposes) and longitudinal studies that have assessed whether baseline PiB retention can predict future cognitive course (relevant for predicting impending clinical progression). The issues of using amyloid imaging as an outcome measure in pre-dementia subjects are essentially identical to those discussed above for AD dementia and will not be discussed again here.
Many studies have reported the results of PiB PET imaging in MCI subjects (Bourgeat et al., 2010; Butters et al., 2008; Cohen et al., 2009; Devanand et al., 2010; Forsberg et al., 2010; Forsberg et al., 2008; Fripp et al., 2008; Jack et al., 2008; Jack et al., 2009; Jagust et al., 2010; Jagust et al., 2009; Kemppainen et al., 2007; Koivunen et al., 2008; Li et al., 2008; Lopresti et al., 2005; Lowe et al., 2009; Okello et al., 2009a; Okello et al., 2009b; Pike et al., 2007; Price et al., 2005; Raji et al., 2008; Rowe et al., 2010; Rowe et al., 2007; Tolboom et al., 2009b; Tolboom et al., 2009d; Wolk et al., 2009; Zhou et al., 2007). However, as some of these studies represent progressive accumulations of subjects and variations in the analysis method, Table 4 describes only the most recent relevant paper from each group, in order to avoid counting single subjects more than once in this analysis.
One thing that is quickly apparent when working with amyloid imaging is the bimodal nature of the scans. Visually, they tend to be clearly positive or clearly negative, suggesting that people fall into one or the other of two distinct populations (amyloid-positive and amyloid-negative) (Ng et al., 2007; Rabinovici et al., 2007; Suotunen et al., 2010; Tolboom et al., 2010). The use of visual reads as a screening tool could be very useful for clinical trials, as this is relatively easy to standardize. This bimodal character also is apparent in quantitative data. Figure 1A shows a histogram of global cortical PiB retention across more than 300 subjects of all diagnoses (control, MCI, AD, other dementias, etc.) studied at a single site (Pittsburgh) using identical methods of acquisition and analysis across all subjects. The bimodal distribution of global cortical PiB retention is readily apparent and is described well by a relatively tight, amyloid-negative population with a mean SUVR of ~1.4 ± 0.15 and a much more broadly distributed, amyloid-positive population with a mean SUVR of 2.5 ± 0.40. Note that the absolute value of these numbers will vary depending on the analysis method, the use of atrophy correction, the size and location of the reference region and other factors. They are given here simply for comparison to each other, but the exact value should not be taken as universally applicable.
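The degree of overlap between the two populations at a given cutoff can be sketched numerically. The following minimal example assumes that each population is Gaussian and that the "±" values quoted above are standard deviations; neither assumption is stated in the text. It uses the SUVR 1.6 cutoff quoted later in this section for the Pittsburgh data. The function name and constants are illustrative only.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Values quoted in the text; treating "±" as a standard deviation is an assumption.
NEG_MEAN, NEG_SD = 1.4, 0.15   # amyloid-negative population (SUVR)
POS_MEAN, POS_SD = 2.5, 0.40   # amyloid-positive population (SUVR)
CUTOFF = 1.6                   # "PiB-negative" cutoff quoted for the Pittsburgh data

# Fraction of the amyloid-negative population falling above the cutoff
neg_above = 1.0 - normal_cdf(CUTOFF, NEG_MEAN, NEG_SD)
# Fraction of the amyloid-positive population falling below the cutoff
pos_below = normal_cdf(CUTOFF, POS_MEAN, POS_SD)

print(f"amyloid-negative above cutoff: {neg_above:.3f}")
print(f"amyloid-positive below cutoff: {pos_below:.3f}")
```

Under these assumptions a small fraction of each population lands on the "wrong" side of the cutoff, which illustrates why some overlap is unavoidable wherever the line is drawn. The real distributions need not be Gaussian, so the numbers are indicative only.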
The standard PiB-positive cutoff is drawn with a dashed, vertical line, but it is clear that there will be some overlap of the two populations wherever a cutoff is drawn. Figure 1C is a similar representation of 90 cognitively normal control subjects and Figure 1D shows 41 mild-moderate AD dementia patients. Approximately 80% of the controls reside within the amyloid-negative population and only 1 of 90 has reached even the mean of the amyloid-positive population. In contrast, >95% of the AD dementia patients reside within the amyloid-positive population and 80% are above the mean of that population. Figure 1B shows a histogram of 50 MCI subjects. The MCI subjects do not represent a third, separate, intermediate population (as they would on cognitive measures); instead, they consist of subjects that tend to fit into one or the other of the two populations that make up our total cohort. This is a graphical representation of the fact that somewhere near half (this proportion varies with the age and ApoE ε4 frequency of a particular population) of MCI subjects are AD dementia-like as regards PiB signal and the rest are control-like. Of the 272 MCI subjects from the nine studies included in the analysis in Table 4, 59% are PiB-positive. If broken down into amnestic and non-amnestic subtypes, 63% of 242 amnestic MCI subjects were amyloid-positive and 27% of the 30 non-amnestic subjects were amyloid-positive. These data show that PiB imaging is very well-suited to dichotomize MCI patients based on the underlying pathophysiology. This could be extremely useful for screening into a clinical trial. However, amyloid imaging in isolation from clinical and cognitive data is poorly suited for the identification of a subject who is likely to receive a clinical diagnosis of MCI, since a roughly equal portion of MCI cases are PiB-positive and PiB-negative. But this is not how this biomarker would be used in a clinical trial.
The reasons trial designers would want to know the amyloid-status of an MCI subject would be if knowledge of that status could help them: 1) decrease the heterogeneity of their trial population (applies to most trials); 2) identify a cohort that is likely to respond to a drug with a certain mechanism (applies mainly to anti-amyloid trials); and 3) assemble a cohort that is likely to convert to an endpoint of AD dementia in a relatively short period of time (applies to most secondary prevention trials). The first two points relate to the discussion above. The third point (prediction of progression) will be discussed in the following section.
The primary, peer-reviewed studies discussed above, using standard diagnostic criteria for MCI and AD dementia in 272 cases provide “sufficient evidence for the lack of a quantitative association between PiB PET and the clinical diagnosis of MCI.” Nevertheless, PiB PET could be useful for screening MCI subjects into amyloid-positive vs. amyloid-negative subtypes for inclusion in clinical trials.
Very similar issues apply to screening presymptomatic subjects into clinical trials. Table 2 shows that 24% of 651 cognitively normal control subjects studied with PiB PET are amyloid-positive. Thus, when comparing only AD dementia and control subjects, a PiB-negative scan was 76% sensitive and 96% specific for identifying control subjects. Figure 1C shows that most of these are likely to be on the low end of the amyloid-positive spectrum (<2.2 SUVR) and can be distinguished from the vast majority of AD dementia patients (>2.2 SUVR). This suggests that a cutoff higher than that typically used to detect any amyloid deposition (i.e., the “PiB-negative” cutoff) could better differentiate between cognitively normal controls and clinically diagnosed AD dementia patients. Using the typical “PiB-negative” cutoff of SUVR=1.6 for the Pittsburgh data gives very similar sensitivity (80%) and specificity (98%) for a PiB-negative scan to identify controls. However, if we use an SUVR of 2.2 as the “PiB-AD” cutoff for the Pittsburgh data, a PiB-negative scan becomes 96% sensitive and 90% specific for identifying controls. Thus, it may be best to use a “PiB-negative” cutoff when the goal is to identify subjects with any evidence of Aβ deposition and a “PiB-AD” cutoff when the goal is to identify subjects who are AD-like. The point to be made here is simply that, although PiB PET can identify fibrillar Aβ deposition in ~25% of cognitively normal controls, this deposition is typically low and very distinguishable from that seen in AD dementia.
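The trade-off between the two cutoffs can be illustrated with a short sketch. The SUVR lists below are hypothetical values invented purely for illustration and are not data from any study; only the two cutoffs (1.6 and 2.2) come from the text. Here a scan is called "negative" when its SUVR falls below the cutoff, sensitivity is the fraction of controls with a negative scan, and specificity is the fraction of AD dementia patients with a positive scan.

```python
def negative_scan_performance(control_suvr, ad_suvr, cutoff):
    """Sensitivity and specificity of a below-cutoff scan for identifying controls."""
    controls_negative = sum(1 for v in control_suvr if v < cutoff)
    ad_positive = sum(1 for v in ad_suvr if v >= cutoff)
    sensitivity = controls_negative / len(control_suvr)
    specificity = ad_positive / len(ad_suvr)
    return sensitivity, specificity

# Hypothetical SUVR values, for illustration only (not study data).
controls = [1.2, 1.3, 1.35, 1.4, 1.45, 1.5, 1.55, 1.7, 1.9, 2.4]
ad_patients = [1.5, 2.0, 2.3, 2.4, 2.6, 2.7, 2.9, 3.0, 3.1, 3.2]

# Cutoffs taken from the text: "PiB-negative" (1.6) and "PiB-AD" (2.2).
sens_16, spec_16 = negative_scan_performance(controls, ad_patients, 1.6)
sens_22, spec_22 = negative_scan_performance(controls, ad_patients, 2.2)

print(f"cutoff 1.6: sensitivity={sens_16:.2f}, specificity={spec_16:.2f}")
print(f"cutoff 2.2: sensitivity={sens_22:.2f}, specificity={spec_22:.2f}")
```

In this toy data set, raising the cutoff classifies more controls as negative (higher sensitivity) at the cost of classifying more AD dementia patients as negative (lower specificity), which is the same direction of change reported for the Pittsburgh data.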
It also is important to recognize that the percentage of amyloid-positive subjects in a cognitively normal population is highly dependent on both age and the presence of an ApoE ε4 allele. In a study of 241 cognitively normal subjects, Morris et al. showed that the frequency of individuals with elevated PiB retention rose in an age-dependent manner from 0% at ages 45–49 years to 19% at 60–69 years to 30.3% at 80–88 years (Morris et al., 2010). Rowe et al. studied 177 cognitively normal controls and also found an age effect such that elevated PiB retention was seen in 18% at age 60–69 years and 65% over age 80 (Rowe et al., 2010). Morris et al. also showed that there was a gene dose effect for the ApoE ε4 genotype, with greater PiB retention with increased numbers of ε4 alleles such that 8.2% of age 60–69 ε4 non-carriers were PiB-positive while 75% of age 80–89 ε4 carriers were PiB-positive (Morris et al., 2010). In addition to the 177 cognitively normal controls, Rowe et al. also studied 57 MCI and 53 AD dementia subjects and reported that ε4 carriers had higher PiB retention in the control and MCI groups, but not in the AD dementia group (Rowe et al., 2010).
The primary, peer-reviewed studies shown in Table 2, using standard diagnostic criteria for normal cognition and AD, in 992 cases (651 controls and 341 AD; χ2=469; p<0.0001) provide “sufficient evidence of a direct relationship between PiB PET and the clinical diagnosis of cognitively normal.” However, this diagnosis remains a neuropsychological/clinical one and PiB PET is best suited for screening cognitively normal subjects into amyloid-positive and amyloid-negative subtypes for inclusion in clinical trials.
In addition to the clinical categories of control, MCI and AD, one also must consider the issue of differential diagnosis among different dementias when considering the usefulness of a biomarker in clinical trial design. Amyloid imaging will not distinguish mixed dementias when only one component of the pathology is Aβ deposition. That is, amyloid imaging is likely to be good at ruling-in Aβ pathology, but can’t rule-out non-Aβ pathology. Thus, a majority of cases of Dementia with Lewy Bodies (DLB) will show AD dementia-like levels and patterns of Aβ deposition (Edison et al., 2008b; Gomperts et al., 2008; Maetzler et al., 2009; Rowe et al., 2007) and the clinical symptoms or additional imaging with dopamine transporter tracers may help distinguish these cases if necessary (McKeith et al., 2007). However, it is very unlikely that all DLB pathology can be excluded from any AD dementia trial since it is so common even in the absence of DLB symptoms (McKeith et al., 1999). Parkinson’s disease and Parkinson’s dementia are typically well-distinguished from AD dementia with PiB PET (Edison et al., 2008b; Gomperts et al., 2008; Johansson et al., 2008; Maetzler et al., 2008), although mixed cases can occur (Gomperts et al., 2008; Maetzler et al., 2008). Pure cases of CAA may be distinguishable from AD dementia by an occipital-predominant pattern of PiB retention (Greenberg et al., 2008). Semantic dementias rarely show Aβ deposition, so these could be identified as amyloid-negative (Drzezga et al., 2008; Rabinovici et al., 2008). Many cases of logopenic aphasia and posterior cortical atrophy may be atypical presentations of AD dementia (Kambe et al., 2010; Migliaccio et al., 2009; Rabinovici et al., 2008; Tenovuo et al., 2008), so it may not be advisable to exclude such cases from trials of AD dementia therapeutics. Pure vascular dementia should be distinguishable by amyloid imaging, but AD dementia pathology is commonly mixed with vascular pathology. 
Clinical history and MRI findings should be able to exclude these cases if necessary (Mok et al., 2010). Also, it must be kept in mind, that just as approximately one-quarter of normal elderly show low levels of fibrillar Aβ deposition without cognitive impairment, a similar proportion of elderly subjects with dementia from other causes could show these same low amounts of amyloid, even if the amyloid is not contributing to the clinical dementia. The use of a higher “PiB-AD” cutoff as discussed above may help screen-out amyloid-positive subjects with dementia due to causes other than AD dementia who might have incidental/low amyloid deposition.
Only a portion of patients with MCI progress to clinical AD dementia over 5–10 years (Petersen et al., 1999; Ritchie et al., 2001; Visser et al., 2006) and a recent meta-analysis concluded that most people with MCI will not progress to dementia even after 10 years of follow-up (Mitchell and Shiri-Feshki, 2009). In a longitudinal study of 134 MCI cases followed for 4 or more years, Hansson et al. (Hansson et al., 2006) reported that 43% developed clinical AD, 42% remained cognitively stable (but could, of course, develop AD dementia in the future) and 15% developed other dementias (mostly vascular). Two community-based studies have shown over one-third of patients diagnosed with MCI at baseline may eventually return to normal cognition (Ganguli et al., 2004; Larrieu et al., 2002). Obviously, it would be of great value to be able to predict which MCI subjects were destined to progress to a clinical diagnosis of AD dementia. The five studies listed at the bottom of Table 4 describe longitudinal follow-up of 155 MCI subjects (141 amnestic) who were followed between 1 and 3 years after their baseline PiB PET scan (Forsberg et al., 2008; Jagust et al., 2010; Koivunen et al., 2008; Okello et al., 2009b; Wolk et al., 2009). Of these 155 MCI subjects, 57 (37%) progressed to a clinical diagnosis of AD dementia over 1 to 3 years. The distribution of converters was far from random across these amyloid-positive and amyloid-negative groups. Of the 57 converters, 53 came from the 101 amyloid-positive subjects (representing a 53% conversion rate) and only 4 came from the 54 amyloid-negative subjects (7% conversion rate) (χ2 = 30.7; p<0.0001). It remains to be seen whether these latter 4 amyloid-negative converters were mis-diagnosed with AD dementia or represent false-negatives for PiB PET. 
Conversion rates from amyloid-positive subjects in the amnestic and non-amnestic categories could not be determined from the data published, but non-amnestic, amyloid-positive MCI subjects did show conversion to AD dementia in at least two studies (Wolk et al., 2009).
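The conversion figures above can be recomputed from the four counts given in the text. The sketch below assumes an uncorrected Pearson chi-square test on the 2×2 table (the text does not say which variant was used); this form yields a value close to the reported 30.7. The function name is illustrative, and the p-value uses the identity for a chi-square distribution with one degree of freedom.

```python
import math

def pearson_chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Counts taken from the text.
pos_total, pos_converters = 101, 53   # amyloid-positive MCI subjects
neg_total, neg_converters = 54, 4     # amyloid-negative MCI subjects

pos_rate = pos_converters / pos_total
neg_rate = neg_converters / neg_total
overall_rate = (pos_converters + neg_converters) / (pos_total + neg_total)

chi2 = pearson_chi_square_2x2(
    pos_converters, pos_total - pos_converters,
    neg_converters, neg_total - neg_converters,
)
# Upper-tail probability of a chi-square variable with 1 degree of freedom
p_value = math.erfc(math.sqrt(chi2 / 2.0))

print(f"conversion rate, amyloid-positive: {pos_rate:.1%}")
print(f"conversion rate, amyloid-negative: {neg_rate:.1%}")
print(f"overall conversion rate: {overall_rate:.1%}")
print(f"chi-square = {chi2:.1f}, p = {p_value:.2e}")
```

A continuity-corrected test would give a somewhat smaller statistic, so the choice of variant matters when comparing against published values.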
Other studies also have reported conversions from MCI to AD dementia. Jack et al. reported 9 amnestic MCI subjects studied at Mayo Clinic who had PiB PET scans (Jack et al., 2009). Three of these subjects converted to AD dementia within one year and one “reverted” to normal cognition, but it was not reported if these subjects were PiB-positive or PiB-negative in that report. Interestingly, Wolk et al. report 3 reversions to normal cognition and all 3 were PiB-negative.
The primary, peer-reviewed studies discussed above (using standard diagnostic criteria for MCI and AD) included 155 subjects and the data were so overwhelmingly significant that they constitute “sufficient evidence for a direct relationship between PiB PET and the likelihood of conversion from a clinical diagnosis of MCI to a clinical diagnosis of AD dementia over 3 years.”
There is very little similar prospective data on clinical conversions from cognitively normal controls to either MCI or AD dementia. This is not surprising given that the presymptomatic lag phase between initiation of Aβ deposition and emergence of clinical symptoms may be 10–15 years and most subjects have been followed for no more than 5 years. Several studies have looked retrospectively at data gathered in cohorts of subjects who were cognitively normal at baseline and then followed to the time of a PiB PET scan. Villemagne et al. reported a retrospective study of the cognitive course of 34 subjects who started with normal cognition in 1996 and were followed with 7–9 yearly visits prior to agreeing to a PiB PET scan (Villemagne et al., 2008). Ten of these 34 were classified as cognitive “decliners” by raters blinded to the PiB status. Three of these 10 were further diagnosed with MCI and one additional subject with AD; the other 6 decliners remained in the cognitively normal range. Seven of these 10 decliners (including all 3 MCI cases and the AD dementia case) were PiB-positive (70%) compared to 4/24 (17%) stable subjects. Although clinical conversion was not addressed, two other studies have similarly shown that the rate of cognitive decline in subjects who were cognitively normal at baseline is related to PiB retention. Storandt et al. followed the cognitive status of 135 individuals from 1985 to the time of PiB PET in 2004 (Storandt et al., 2009). They found that PiB retention was unrelated to the current cognitive performance in 2004, but was related to decline in working and visuospatial memory over the previous 19 years in the 29 subjects who were amyloid-positive (but not in the amyloid-negative group, as expected). Resnick et al. studied 57 participants for an average of 10.8 years who received a PiB PET scan near the end of that period (Resnick et al., 2010). 
They found greater declines over time in mental status and verbal learning and memory, but not visual memory, that were significantly associated with higher PiB retention. One of the subjects in this study, who was PiB-positive, progressed from normal cognition to MCI (Resnick et al., 2010). Similarly, Reiman et al. reported a single case of an ApoE ε4 homozygote who converted from normal cognition to MCI 7 months before a PiB PET scan and was found to be PiB-positive (Reiman et al., 2009).
Only one study has reported prospective, longitudinal cognitive outcomes in normal subjects imaged with PiB (Morris et al., 2009). In that study, 159 participants, with CDR=0 at the time of their baseline PiB PET scan, were followed for 0.8–5.5 years. Nine of these subjects converted to a diagnosis of “dementia of the Alzheimer’s type (DAT) at the CDR 0.5 stage.” PiB retention was a stronger predictor of time to DAT [hazard ratio (HR) 4.82 (1.22–19.01); p=0.02] than age [HR 1.14 (1.02–1.28); p=0.03]. Education, ApoE ε4 allele status and gender were not significant predictors. Unfortunately, the individual PiB-status of the 9 DAT-converters was not reported in this study. Jack et al. reported that 1 of 10 control subjects from a prospective Mayo study converted to MCI over 1 year, but did not report the PiB-status (Jack et al., 2009).
Although there is one study with a relatively large number of subjects (n=159), there were only 9 conversions to DAT CDR 0.5 in this study. The other studies are retrospective or report on conversions of single subjects that were not the focus of the main study. Therefore, these data constitute only “limited/suggestive evidence of an association between PiB PET positivity and the likelihood of conversion from normal cognition to a clinical diagnosis of MCI.”
As mentioned in the introduction, there is overlap in the utility of CSF and PET measures of amyloid deposition for clinical trials. Both are primarily measures of brain Aβ pathology. This overlap is mainly in the area of selecting amyloid-positive subjects for trials, i.e., screening. Most available data seem to indicate that CSF Aβ42 decreases relatively quickly to its final level very early in the course of the pathophysiological spectrum of AD dementia - probably presymptomatically (Blennow and Hampel, 2003; Fagan et al., 2009; Fagan et al., 2007; Hansson et al., 2006). That is, the change in CSF appears to be almost a step-function, and longitudinal studies have not shown a progressive decrease in CSF Aβ42 over time (Buchhave et al., 2009). This is not surprising given the fact that typical concentrations of Aβ found in insoluble deposits in AD dementia cortex are ~5,000 µg/L (~1 µM) (Klunk et al., 2005b; Naslund et al., 2000), while typical soluble Aβ concentrations in the cortex are on the order of 50 µg/L (Klunk et al., 2005b) and CSF Aβ concentrations are ~0.5 µg/L (Fagan et al., 2006) - or 0.01% of insoluble cortical Aβ. Thus, it is not surprising that relatively little cortical Aβ would need to deposit before a new equilibrium would be established with CSF. This conclusion has two implications in clinical trials: 1) for screening purposes, CSF Aβ42 may drop before fibrillar Aβ is detectable by PET; and 2) as an outcome measure, CSF Aβ42 is not likely to normalize until the vast majority of cortical Aβ deposits are removed. This implies that CSF Aβ42 and PiB PET would be roughly equivalent as screening tools for AD dementia and MCI trials and that CSF Aβ42 may have advantages in detecting amyloid-positive controls - although this has yet to be demonstrated. The more dynamic nature of amyloid signal by PET and the fact that PiB retention correlates directly with fibrillar Aβ load (Ikonomovic et al., 2008) makes this a more suitable outcome measure. 
Indeed, the ability of PiB PET to show an amyloid-lowering effect of passive immunotherapy in humans has now been reported (Rinne et al., 2010). While we often reduce imaging data to a single number (e.g., mean cortical PiB retention), we must remember that a major advantage of any imaging technique is the wealth of regional information that it provides. Whereas amyloid PET can quantify amyloid load throughout the brain, it is not clear what pool of brain Aβ42 is represented by changes in CSF Aβ42. One study has suggested that CSF Aβ42 is most tightly correlated with PiB retention in brain regions immediately adjacent to CSF spaces (Grimmer et al., 2009b). The rich regional information in an amyloid PET scan also allows differentiation not only by quantitation but also by regional specificity. This is especially important because it allows visual reads of amyloid PET scans to be highly accurate in distinguishing normal from abnormal scans (Ng et al., 2007; Rabinovici et al., 2007; Suotunen et al., 2010; Tolboom et al., 2010). Visual reads are relatively easy to standardize because the technical variables in quantifying the amyloid PET signal are not a factor. Of course, visual reads would apply almost exclusively to use in screening and don’t lend themselves to detection of small changes. Therefore, CSF Aβ42 and PiB PET may be equivalent screening measures for entry into clinical trials in AD dementia and MCI. Differences in the costs, practicalities and risks of the two procedures for the application at hand would determine which is better suited to a particular trial. CSF Aβ42 could have an advantage in identifying more amyloid-positive controls than PiB PET. Amyloid PET has the advantage of the easily standardizable visual read, but the greatest advantage of amyloid imaging for clinical trials is as a quantitative outcome measure for drugs expected to decrease fibrillar Aβ load.
If amyloid PET is to be used as an outcome biomarker, it is necessary to obtain a pre-treatment scan for comparison, so it seems logical to use this as the screening tool as well if amyloid PET will be used as the outcome measure. However, it is sometimes inappropriate to use the same measure for screening as is used for an outcome measure. In these cases, it may be appropriate to use CSF Aβ42 as the screening tool and amyloid PET as the outcome measure.
As for any biomarker, standardization of its application across many centers around the world, and across varying degrees of expertise, is critical for utility in clinical trials. As stated above, visual reads for screening into amyloid-positive and amyloid-negative subtypes are relatively easy to standardize (Ng et al., 2007; Rabinovici et al., 2007; Suotunen et al., 2010; Tolboom et al., 2010). However, improved software to analyze and display amyloid PET scans in a standardized manner could aid in widespread applicability. With respect to quantitative assessment of amyloid PET signal, simplified dynamic methods of analysis such as Logan DVR (Lopresti et al., 2005) and SRTM/SRTM2 (Tolboom et al., 2009c; Zhou et al., 2007) may produce the most accurate and reproducible results, but practical considerations have frequently led to the application of tissue ratios and SUVR of short/late scans (Lopresti et al., 2005; McNamee et al., 2009), and these have proven to be good substitutes for the dynamic methods. Standardization of these quantitative ratio methods depends first on proper choice of the reference region (e.g., cerebellum or pons). This decision should be made carefully at the beginning of the trial, and it is important to not only choose the reference region carefully, but also carefully choose the exact method for delineating the region. This involves choices of normalization and automation. Decisions regarding atrophy correction or tissue segmentation (which often produces an equivalent outcome) must be made for each trial. The electronic nature of all PET data allows one to easily send the data to a central processing site, so that any variability in the application of the “analysis pipeline” can be minimized.
Another way to minimize variability is to use the same amyloid PET tracer throughout the trial. While the current discussion has centered on [C-11]PiB because of the lack of sufficient literature on the F-18 tracers, it is clear that widespread applicability of amyloid PET to the 90% of PET scanners that do not have access to a cyclotron could be enhanced by the use of an F-18 amyloid tracer. These F-18 tracers appear to come with somewhat inferior signal-to-noise qualities, and this could add to variability, so the desirability of application to many PET sites must be weighed against the trade-offs. Clearly, we need to see many more published studies with these new tracers, including direct comparisons to PiB PET in the same subjects, before we can fully judge the capabilities of such promising F-18-labelled amyloid imaging agents.
Supported by The National Institutes of Health grants: P50 AG005133, R37 AG025516, P01 AG025204.
Disclosure Statement: GE Healthcare holds a license agreement with the University of Pittsburgh based on the technology described in this manuscript. Dr. Klunk is a co-inventor of PiB and, as such, has a financial interest in this license agreement. GE Healthcare provided no grant support for this study and had no role in the design or interpretation of results or preparation of this manuscript.