Few studies have explored the ability to perform AD prevention trials in populations enrolled with no cognitive complaint. Individuals with no demonstrable cognitive abnormality who meet criteria for AD biomarkers may be defined as having “preclinical AD” [61] or as being “asymptomatic at risk for AD” [19]. Trials in persons who meet these criteria are being planned [2]. Similarly, a variety of working groups have proposed inclusion of “prodromal AD,” MCI patients who meet AD biomarker criteria, in trials [2], and trials implementing these guidelines are now underway (www.clinicaltrials.gov).
How best to design successful predementia clinical trials is controversial and requires guidance from research studies. This project sought to identify optimal clinical outcome measures and biological enrichment strategies for use in AD trials enrolling asymptomatic or mildly symptomatic participants. We chose to examine continuous outcome measures, rather than “conversion” outcomes, because such measures are likely to provide greater sensitivity and therefore smaller sample sizes for predementia trials. We found that for most outcome measures, 12- and 24-month trials of cognitively normal participants are not realistic. Decline in outcome measure scores for this population at 36 months, when present, was small, and trials to demonstrate a reduction in that decline required very large study populations. A relative exception was trials using the RAVLT total score, which required 1,404 participants per study arm. Even so, the CN population did not demonstrate a mean decline from baseline at 12 or 24 months for this outcome measure (data not shown), and longer-term follow-up and confirmation in independent samples of the decline in the RAVLT in CN participants are warranted.
When the CN population was enriched for persons meeting AD biomarker criteria at baseline, decline was observed for outcome measures that did not demonstrate detectable decline for the entire CN population, and the decline observed on the remaining outcome measures was greater. The numeric reductions in the outputs of sample size calculations frequently exceeded 50% in some scenarios. Thus, biomarker enrichment increases the efficiency of performing AD clinical trials in asymptomatic participants. The ideal means of enrichment, however, is not yet clear. Determining the specificity and sensitivity of each biomarker criterion for predicting future cognitive decline remains an important area of study. In the current exercise, for example, each of CSF Aβ, FDG PET hypometabolism, and hippocampal volume successfully reduced the necessary sample sizes of trials using the CDR-sb as an outcome. In contrast, hippocampal volume and FDG PET failed as enrichment strategies for trials using the RAVLT total, while enrichment for CSF Aβ still produced sample size requirements lower than those of the entire CN population. Determining which enrichment strategy is best for which outcome measure may depend on the specifics of the study population.
Not surprisingly, the overall MCI population produced more consistent decline on trial outcome measures, resulting in consistently smaller calculated trial sample sizes relative to the estimates based on the CN population. In the MCI population as a whole, the CDR-sb required substantially fewer participants than did all other clinical measures, in line with the observations of others [4]. Enrichment of the MCI population by any strategy reduced the needed sample sizes for all outcome measures, most likely resulting from the refinement of the total population to those who manifested prodromal AD. For the majority of enrichment strategies, the CDR-sb continued to require the fewest participants. Also consistent were the lower required sample sizes for the ADAS12, relative to the ADAS11 [4]. The single exception to this, and to the substantially lower requirements for the CDR-sb than for every other outcome measure under every other enrichment strategy, was in the setting of enrichment for FDG PET hypometabolism. When the MCI population was enriched for FDG PET hypometabolism, the ADAS11 required fewer participants than did the ADAS12 and the MMSE required fewer participants than did the CDR-sb.
This study has limitations. It is derived entirely from a single data set, which has been used in a large number of studies with similar objectives [38]; see also [8]. Within ADNI, subjects are well-educated, primarily Caucasian, and hold favorable attitudes toward research that may in part result from a high prevalence of a family history of AD. Further work modeling predementia trials based on alternate data sources is necessary. The conduct of ADNI-like studies on other continents may present such an opportunity.
We performed no formal comparisons of sample size outputs. Confidence intervals of the estimates are provided but, as has been seen in other studies [26], are wide. We also did not incorporate slope models into our study, focusing instead on change-from-baseline calculations. This decision was based on the ongoing debate regarding the appropriate means of incorporating slope analyses into sample size estimates [18] and the fact that mean change from baseline is the general practice in AD registration trials. Our methods are also in contrast to the often-used practice of choosing a minimal clinically significant difference (for example, 2 points on the ADAS-cog) and powering a trial to detect such a difference at a given time point. Importantly, in most of the trial scenarios in our results, the 25% drug effect would not achieve a clinically significant difference (for example, 25% of the mean decline for the ADAS11 at 24 months among the overall MCI population was 0.6 points). It is also true that the absolute size of the 25% drug effect would vary among the examined scenarios, as the overall rate of cognitive decline will vary among the different enriched populations. Thus, we do not propose that sample size decisions for predementia trials be based solely on the results of the current study. Rather, we believe that these results may be useful in considering predementia trial design choices, including primary and secondary outcome measures, enrichment strategies, and trial length.
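To make the change-from-baseline reasoning concrete, the following is a minimal sketch of a per-arm sample size calculation for a two-arm trial, assuming a standard normal-approximation formula for comparing two means. It is not necessarily the method used in this study. The function name, the two-sided alpha of 0.05, the 80% power, and the standard deviation of 6.0 points are illustrative assumptions; only the 25% drug effect and the 2.4-point mean decline (implied by 25% of the decline being 0.6 points) come from the text above.

```python
import math
from statistics import NormalDist


def per_arm_sample_size(mean_decline, sd_change, effect=0.25,
                        alpha=0.05, power=0.80):
    """Participants per arm needed to detect a proportional reduction
    in mean change from baseline (normal approximation, equal arms).

    mean_decline: mean change from baseline in the untreated group
    sd_change:    standard deviation of the change score (hypothetical here)
    effect:       proportional reduction in decline attributed to the drug
    """
    # Absolute difference between arms implied by the proportional effect.
    delta = effect * mean_decline
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (sd_change * (z_alpha + z_beta) / delta) ** 2)


# Illustrative call: a 2.4-point mean decline with an assumed SD of 6.0 points.
n_per_arm = per_arm_sample_size(mean_decline=2.4, sd_change=6.0)
```

Under these assumed inputs the requirement is on the order of 1,570 participants per arm. Because the detectable difference is a fixed fraction of the mean decline, a population with a smaller mean decline yields a smaller absolute difference and therefore a larger sample size, which is why the effect of enrichment depends on how much it increases the observed decline.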
Our analyses compared single biomarker modalities as enrichment strategies. Others have performed more integrated methods of enrichment as predictors of cognitive decline in the nondemented ADNI population. For example, McEvoy and colleagues showed that enriching for a composite measure of atrophy based on multiple brain regions resulted in a greater reduction in necessary MCI trial sample sizes (for trials using the CDR-SB or the ADAS11) than did genetic enrichment for ApoE genotype [41]. Similarly, Vemuri and colleagues showed that a composite index of structural abnormalities on MRI better predicted clinical progression on the CDR-SB than did clinical or CSF measures in MCI patients [64]. Using clinically available assays, Heister and colleagues demonstrated that combined use of hippocampal volume measures and psychometric testing better predicted conversion from MCI to dementia than did either volumetrics or cognitive testing alone or in combination with CSF measures [25].
Finally, our results consider only the number of participants that must complete an AD prevention trial. We did not examine the important variables of screen failures, participant recruitment, or patient attrition, all of which have significant impact on trial efficiency, perhaps especially in the setting of asymptomatic AD trials. Each enrichment strategy is associated with a high screen failure rate, ranging from 80% for CSF tTau to 51% for the ratio of pTau/Aβ in the CN population. Thus, in the scenarios that we examined, the number of total CN or MCI participants that would need to be recruited to undergo biomarker testing is frequently quite high. For example, a trial using the RAVLT as a single primary outcome and enrolling CN participants who meet CSF Aβ criteria would need to screen 5,450 participants to achieve the necessary sample sizes per study arm (not factoring in study attrition). Were the same study to use the CDR-sb as a single primary outcome, 8,550 participants would need to be screened. Though CSF criteria were frequently met by the CN ADNI population and have been shown to predict disease progression [16], asymptomatic participants often cite lumbar puncture as a barrier to trial participation and rate it as the diagnostic modality that they are least likely to be willing to endure in the setting of an AD prevention trial (unpublished results), suggesting that achieving recruitment goals in predementia trials may be challenging. It is also unclear how participants will interpret the information of being eligible (or not) for preclinical AD trials. Thus, no matter what enrichment strategy might be chosen for use in preclinical trials, educational campaigns to facilitate recruitment and protect the welfare of participants will be critical to ensure the successful and ethical conduct of these trials.
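The relationship between completers and screened participants described above can be sketched as a short calculation. This is an illustration under stated assumptions, not the study's own procedure: the function name and the per-arm size of 500 are hypothetical, the trial is assumed to have two equal arms, and attrition is ignored, as in the text. The 80% and 51% screen failure rates are the extremes reported above for the CN population.

```python
def participants_to_screen(n_per_arm, screen_failure_pct, arms=2):
    """Number of people who must undergo biomarker screening so that
    enough of them qualify to fill every arm (attrition ignored).

    screen_failure_pct is an integer percentage (e.g. 80 for 80%),
    which keeps the arithmetic exact.
    """
    needed = n_per_arm * arms
    pass_pct = 100 - screen_failure_pct
    # Integer ceiling division: round up to a whole participant.
    return -(-needed * 100 // pass_pct)


# Hypothetical trial needing 500 completers per arm.
print(participants_to_screen(500, 80))  # 5000
print(participants_to_screen(500, 51))  # 2041
```

For the same hypothetical trial, moving from a 51% to an 80% screen failure rate more than doubles the screening burden, so an enrichment strategy that lowers the per-arm requirement can still be the less efficient choice once screening is counted.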
In conclusion, our data fall short of suggesting a specific biomarker enrichment strategy as optimal for the design of preclinical AD trials. The CDR-sb was seemingly the ideal outcome measure for trials of MCI populations, whether or not they are enriched for AD biomarkers. The ideal outcome measures for trials of asymptomatic participants remain open to debate, though these results suggest that the RAVLT total score and CDR-sb may be preferable to the ADAS11, ADAS12, MMSE, or RAVLT delayed recall. Replication of these results should be pursued in independent datasets, and a variety of ongoing studies, including the next phase of ADNI, will contribute to the overall understanding of AD biomarkers and their utility in the setting of AD clinical trials.