In this study, we used FDG PET images from ADNI to demonstrate twelve-month CMRgl declines in probable AD and MCI. In SPM5 analyses of FDG PET data from this unusually large multi-center study, mild probable AD patients and amnestic MCI patients each had twelve-month regional-to-whole brain CMRgl declines bilaterally in posterior cingulate, medial and lateral parietal, medial and lateral temporal, and occipital regions. We also used these data to introduce an empirically pre-defined sROI, in this case the set of voxels consistently associated with CMRgl decline in an independent training data set, to evaluate AD-slowing treatment effects with improved statistical power and freedom from multiple regional comparison concerns. Using SPM5 and the sROI empirically pre-defined in each patient group's training data set to compute the statistical power of FDG PET to detect AD-slowing treatment effects in the respective patient group's test data set, we estimated the need for about 92 probable AD patients or about 226 MCI patients per group (66 and 217, respectively, when excluding data acquired from HRRT and Biograph HiRez scanners) to detect a 25% AD-slowing treatment effect with 80% power and two-tailed α=0.05 in a twelve-month, parallel-group, placebo-controlled, multi-center RCT, a fraction of the estimated numbers needed using different clinical endpoints. While this report describes the use of different significance thresholds to empirically pre-define the sROI in the training set, we have also used bootstrap resampling, along with the percentage of bootstrap analyses showing decline at different significance thresholds, to define the sROI, with quite similar results (Reiman et al., 2008). The comparability of the computationally intensive bootstrap resampling method and the significance threshold method will be described in a separate methodological report.
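To make the form of these power estimates concrete, the following is a minimal sketch of a per-group sample size calculation for a two-arm trial. It assumes the common normal-approximation formula n = 2(z_{1-α/2} + z_{power})² σ² / (effect × μ)², where μ and σ are the mean and standard deviation of the twelve-month change in the endpoint; whether this matches the exact computation used for the estimates above is an assumption. The function name and the input values are hypothetical and are not ADNI results.

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(mean_decline, sd_decline, effect=0.25,
                       power=0.80, alpha=0.05):
    """Patients per arm for a two-arm trial (normal approximation).

    mean_decline, sd_decline: mean and SD of the twelve-month change in the
    endpoint (e.g., sROI CMRgl) in untreated patients.
    effect: fraction of that decline the treatment is assumed to remove.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-tailed critical value
    z_power = z.inv_cdf(power)           # quantile for the desired power
    delta = effect * abs(mean_decline)   # between-arm difference to detect
    n = 2 * (z_alpha + z_power) ** 2 * sd_decline ** 2 / delta ** 2
    return ceil(n)


# Hypothetical inputs (not ADNI values): a mean decline equal to one SD.
print(patients_per_group(mean_decline=-1.0, sd_decline=1.0))  # prints 252
```

Under this formula the required sample size scales with the squared ratio of the standard deviation to the mean decline, which is why an endpoint with a larger and more consistent decline needs fewer patients.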
Based on the publicly available analysis of test set data using different imaging modalities and data analysis techniques on a subset of the individuals included in this analysis (http://www.adni-info.org/index.php?option=com_content&task=view&id=89&Itemid=44), the estimated number of probable AD or MCI patients needed in twelve-month RCTs was smallest, among the candidate FDG PET markers, using our sROI method. As documented on the ADNI website (http://www.adni-info.org/index.php?option=com_content&task=view&id=89&Itemid=44), the estimated statistical power to detect an AD-slowing treatment effect using FDG PET was significantly better using measurements in our sROI than using other methods, including ROIs defined using a meta-analysis of the maximal CMRgl reductions in previous studies of AD. In comparisons with MRI measures, the estimates using FDG PET and our sROI method were comparable to those obtained from some FreeSurfer volumetric ROIs, boundary shift integral techniques, and tensor-based morphometry measures (Ho et al., in press; Hua et al., 2009).
After defining sROIs to characterize the sets of voxels associated with twelve-month CMRgl decline and CMRgl sparing using the training set data from each patient group, we found that twelve-month decline-to-spared CMRgl ratios were significantly associated with categorical measurements of clinical disease severity (i.e., AD>MCI>NC), correlated with twelve-month declines in some but not all clinical ratings (e.g., CDR-SB but not ADAS-Cog), and associated with about one-tenth the number of patients needed to detect AD-slowing treatment effects in twelve-month multi-center RCTs in comparison to these clinical endpoints. Each of these findings supports the value of FDG PET and this image analysis technique in AD and MCI RCTs, and we have reason to believe that they may have even greater value in evaluating pre-symptomatic AD-slowing treatments in cognitively normal subjects at increased risk of AD (Reiman and colleagues, unpublished data), due in part to the unusually large samples and long treatment durations needed to do so using clinical endpoints.
Our proposed sROI strategy has several advantages for assessing AD-slowing treatment effects in RCTs.
1) Statistical power. Since the sROI consists of the voxels most consistently associated with CMRgl decline in the relevant patient group during the relevant between-scan interval, it is likely to be associated with greater statistical power than ROIs defined using anatomical or other landmarks, which may not capture the brain regions associated with this decline due to their size or location.
2) Freedom from the problem of multiple regional comparisons. Since the sROI provides a single endpoint for the imaging modality of interest, and since it is empirically pre-defined using an independent data set prior to the performance of an RCT, this strategy would not require any statistical correction for multiple regional comparisons and, we would argue, is likely to be accepted by regulatory agencies in future pivotal trials. By comparison, the use of multiple ROIs would require statistical correction for multiple comparisons, and the use of a voxel/brain atlas coordinate associated with maximal CMRgl declines in an independent data set would require a trade-off between the size of the search region (which would need to be sufficiently large to capture a treatment effect) and the number of regional comparisons in that search region.
3) Face validity. As our study shows, and just as one would predict, the sROI associated with twelve-month CMRgl declines in both the AD and MCI patient groups corresponds well to the brain regions implicated in both cross-sectional and longitudinal FDG PET studies of AD (Alexander et al., 2002; de Leon et al., 1983; Foster et al., 1983; Haxby et al., 1990; Jagust et al., 1988; Langbaum et al., 2009; Minoshima et al., 1994; Mosconi et al., 2005; Mosconi et al., 2009), as well as to the pattern of synaptic loss (the best predictor of clinical decline) in neuropathological studies (Selkoe, 2002). Thus, sROI CMRgl decline meets one of the regulatory agency requirements for a surrogate endpoint (i.e., the measurement should reflect a process in the disease pathway and be likely to be relevant to a patient's clinical course).
4) Customizability. Since the trajectory of longitudinal changes in regional brain imaging measurements may be non-linear (e.g., relatively early CMRgl declines in posterior cingulate and precuneus, which may level off in the later clinical stages of AD, and relatively late CMRgl declines in frontal cortex, which may be most strongly correlated with clinical disease severity in the more severe stages of dementia), the sROI can be customized to the patient group and treatment duration proposed in the RCT, using data from a comparable subject group (e.g., in the study of more severely affected AD patients) and treatment duration as the training set needed to empirically pre-define the sROI. ADNI provides the opportunity to customize the sROI and power estimates to different patient groups (e.g., more severely affected AD patients or MCI patients with significant fibrillar Aβ burden) and treatment durations. While our preliminary findings suggest that roughly the same number of mild probable AD or MCI patients may be needed for 18- and 24-month clinical trials whether the sROI is defined using the baseline and 12-, 18-, or 24-month follow-up scans, some of the patients in this study have not yet completed their 24-month follow-up scans. To be clear, we would recommend empirically pre-defining the sROI from the most appropriate longitudinal data set (e.g., from ADNI, another longitudinal study, or placebo group data from a prior clinical trial) prior to the design or analysis of data from the clinical trial of interest. We do not recommend using any of the data from the clinical trial of interest to define the sROI itself.
5) Reproducibility. The estimated number of patients needed to evaluate AD-slowing treatment effects using the test data set was only modestly (and not significantly) greater than the estimate using the training data set. It will be important to extend our findings to other patient groups, including the placebo groups assessed using FDG PET in ongoing RCTs.
6) Generalizability to other imaging modalities and voxel-based data analysis techniques. As discussed below, the sROI strategy described here has the potential to be applied to other imaging modalities and voxel-based data analysis techniques.
Researchers from Dr. Paul Thompson's laboratory used the strategy reported here to characterize twelve-month brain shrinkage in an empirically pre-defined sROI using tensor-based morphometry (TBM) (Hua et al., 2009). The sROI was empirically pre-defined in ADNI's training data set to include the set of voxels in temporal cortex associated with the most significant brain shrinkage. When the sROI was applied to the independent test set, they estimated that as few as 48 probable AD patients or 88 MCI patients would be needed to detect a 25% AD-slowing effect with 80% power and two-sided α=0.05, a fraction of the number needed using clinical endpoints. In a follow-up analysis also using the sROI strategy (Ho et al., in press), sample size estimates were found to be comparable for 1.5 Tesla and 3 Tesla MRI in a study of 110 patients scanned longitudinally at both field strengths. Our proposed sROI strategy may be applicable to a wide range of imaging modalities, voxel-based image-analysis techniques, and clinical studies, offering a single imaging endpoint and better statistical power than other regional measurements.
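The train-then-test logic of the sROI strategy can be illustrated with a minimal sketch. It assumes that each subject's data have already been reduced to a vector of per-voxel changes (follow-up minus baseline, so that decline is negative), that voxels are selected with a one-sample t statistic, and that the endpoint is the mean change over the selected voxels. The function names, the threshold, and the toy values are hypothetical; this is not the SPM5 or TBM pipeline used in the studies above.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(values):
    """One-sample t statistic testing the mean change against zero."""
    m, sd = mean(values), stdev(values)
    if sd == 0:  # degenerate case: every subject has the same change
        return 0.0 if m == 0 else (float("inf") if m > 0 else float("-inf"))
    return m / (sd / sqrt(len(values)))


def define_sroi(train, t_threshold):
    """Indices of voxels whose training-set decline passes the threshold."""
    n_voxels = len(train[0])
    return [v for v in range(n_voxels)
            if t_statistic([subj[v] for subj in train]) <= -t_threshold]


def sroi_change(subject, sroi):
    """Single endpoint per subject: mean change over the sROI voxels."""
    return mean(subject[v] for v in sroi)


# Toy data (hypothetical): 4 voxels; voxels 0 and 1 decline, 2 and 3 do not.
train = [
    [-2.0, -1.5, 0.3, -0.2],
    [-1.8, -1.7, -0.4, 0.1],
    [-2.2, -1.2, 0.1, 0.4],
    [-1.9, -1.6, -0.2, -0.3],
]
test = [
    [-2.1, -1.4, 0.2, 0.0],
    [-1.7, -1.8, -0.1, 0.3],
]

sroi = define_sroi(train, t_threshold=3.0)              # training set only
endpoints = [sroi_change(subj, sroi) for subj in test]  # independent test set
print(sroi, endpoints)
```

Because the voxel set is fixed before the test data are examined, each test subject contributes one number, so no correction for multiple regional comparisons is needed at the evaluation step.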
While we found associations between twelve-month declines in sROI/spared ROI CMRgl and certain clinical ratings, the correlations were relatively modest. One possible explanation is the relatively small magnitude and large variance of the measured clinical declines during this time frame. Indeed, we have previously shown 24-month CMRgl declines in cognitively normal late middle-aged APOE ε4 carriers in the absence of any significant clinical or neuropsychological declines. One might ask what implications this dissociation has for the future use of these measurements as “reasonably likely surrogate endpoints” for the accelerated regulatory agency approval of treatments based on biomarkers alone (Reiman and Langbaum, 2009). In order for biomarker endpoints to serve as reasonably likely surrogate endpoints in clinical trials, regulatory agencies may ultimately require evidence from other clinical trials that an AD-slowing treatment's effects on one or more biomarkers predict a clinical benefit. We have proposed strategies to demonstrate the relationship between a treatment's short-term biomarker effects and its longer-term cognitive and clinical effects to help provide this kind of evidence, even in the assessment of presymptomatic AD treatments (Reiman and Langbaum, 2009; Reiman et al., 2010).
Still, the present study has several limitations, which need to be addressed in future studies. First, our findings were derived from relatively small training and test data sets, so the sROIs associated with CMRgl decline and sparing in each subject group should be confirmed in independent data sets, including the placebo data sets acquired in ongoing RCTs, extended to other pre-symptomatic and clinical stages of AD, customized to the selection criteria and treatment duration being considered in an RCT, and evaluated using clinical or population-based samples that optimally address the scientific question at hand. Second, since our training data set analyses were confined to SPM5, selected smoothing settings, and selected sROI significance thresholds, findings from this study could be extended to other voxel-based image-analysis techniques (perhaps using different registration methods) for a more exhaustive comparison of image analysis settings. However, as we will report in a separate article, the sROI can be pre-defined using either significance thresholds or a more time-consuming bootstrap resampling (with replacement) strategy, with roughly comparable results. Since the significance thresholding strategy gave comparable results and is easier for other groups to replicate, only findings using this approach are reported here. Third, it remains to be seen whether pharmaceutical companies and regulatory agencies would accept the idea of customizing the sROI to the selection criteria and treatment duration to be used in an RCT, even though the sROI would again be empirically pre-defined prior to the trial. Finally, power estimates remain to be computed in the sub-groups of probable AD or MCI patients defined using ε4 carrier status, evidence of amyloid-β pathology, or other biomarker measurements that might be used to enrich a clinical trial for those individuals most likely to demonstrate AD-related decline.
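For readers unfamiliar with the bootstrap alternative mentioned above, the following minimal sketch shows one way such a definition could work: training subjects are resampled with replacement, the decline test is repeated in each resample, and a voxel enters the sROI if it shows decline in at least a chosen percentage of resamples. The details of the actual method are deferred to the separate report, so the t statistic, the cutoffs, the number of resamples, and all names and values here are assumptions for illustration only.

```python
import random
from math import sqrt
from statistics import mean, stdev


def shows_decline(values, t_threshold):
    """True if the one-sample t statistic indicates significant decline."""
    m, sd = mean(values), stdev(values)
    if sd == 0:  # degenerate resample: all drawn values are identical
        return m < 0
    return m / (sd / sqrt(len(values))) <= -t_threshold


def bootstrap_sroi(train, t_threshold, min_fraction, n_boot=200, seed=0):
    """Voxels showing decline in at least min_fraction of the resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_subj, n_vox = len(train), len(train[0])
    hits = [0] * n_vox
    for _ in range(n_boot):
        # Resample whole subjects with replacement.
        sample = [train[rng.randrange(n_subj)] for _ in range(n_subj)]
        for v in range(n_vox):
            if shows_decline([s[v] for s in sample], t_threshold):
                hits[v] += 1
    return [v for v in range(n_vox) if hits[v] / n_boot >= min_fraction]


# Toy data (hypothetical): voxels 0 and 1 decline in every subject,
# voxel 2 alternates between +1 and -1 with no net change.
train = [
    [-2.0, -1.5, 1.0],
    [-2.2, -1.3, -1.0],
    [-1.8, -1.7, 1.0],
    [-2.1, -1.4, -1.0],
    [-1.9, -1.6, 1.0],
    [-2.3, -1.2, -1.0],
    [-1.7, -1.8, 1.0],
    [-2.0, -1.5, -1.0],
]

print(bootstrap_sroi(train, t_threshold=2.5, min_fraction=0.8))
```

The extra cost relative to simple thresholding comes from repeating the voxel-wise test once per resample, which is consistent with the description of the bootstrap approach as more time-consuming.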
This study also has limitations that apply to the suitability of any imaging modality and any image-analysis technique as a surrogate endpoint in future RCTs. We do not yet know the extent to which different AD-slowing treatments alter different brain imaging or other biomarker measurements, the extent to which an AD-slowing treatment's effects on one or more of these biomarkers predict a clinical benefit at different clinical and pre-symptomatic stages of AD, or the extent to which a treatment might have a confounding effect on a biomarker measurement (e.g., an effect on brain activity, synaptic activity, or brain swelling) independent of an AD-slowing effect. For these reasons, it will be important to embed the range of imaging and non-imaging biomarkers in RCTs of putative AD-slowing treatments, not just to help evaluate the treatment at hand, but to provide the data needed to find one or more biomarkers that meet regulatory agency criteria as surrogate endpoints for the accelerated approval of treatments in the earliest symptomatic and pre-symptomatic stages of the disease.