ADNI was designed to provide information for future clinical trials and is well suited to evaluating the benefits of using CSF biomarkers [1]. Neither expert-proposed targeted trial designs for AD nor the performance of Aβ1-42 diagnostic or predictive biomarkers under experimental, clinical trial conditions has been assessed previously. The results of this study provide an empirical estimate of the distribution and accuracy of clinical outcomes, and of potential biases, for future AD trials that would use Aβ1-42 biomarkers or a prodromal AD diagnosis as entry criteria [15]. The low Aβ1-42 and the high t-tau/Aβ1-42 criteria, when added to an aMCI diagnosis, did not meaningfully affect the efficiency of the trials as compared with the aMCI diagnosis alone.
In the more plausible trials scenarios of small effect sizes of 0.35 or less, 40% dropouts over 2 years, and 200 to 400 patients per group, the gain in power was typically 4% or less with either clinical outcome. This small gain must be weighed against the additional efforts of obtaining CSF, analyzing it, and excluding a proportion of aMCI patients. At least 26% of the ADNI aMCI patients who had lumbar punctures would not fulfill the biomarker criteria, increasing cost and time for recruitment by about one-third in exchange for very little or no gain in statistical power.
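The one-third figure follows from simple arithmetic: if 26% of candidates fail the biomarker criteria, about 1/0.74 ≈ 1.35 candidates must undergo lumbar puncture for each patient randomized. A minimal Python sketch of that calculation follows; the function names and the trial size of 800 (two groups of 400) are illustrative assumptions, not values taken from the analysis.

```python
import math


def screening_inflation(fail_fraction: float) -> float:
    """Factor by which the pool of tested candidates must grow so that the
    same number of patients still meets the biomarker criteria."""
    if not 0.0 <= fail_fraction < 1.0:
        raise ValueError("fail_fraction must be in [0, 1)")
    return 1.0 / (1.0 - fail_fraction)


def candidates_needed(n_randomized: int, fail_fraction: float) -> int:
    """Number of candidates to test in order to randomize n_randomized patients."""
    return math.ceil(n_randomized * screening_inflation(fail_fraction))


# 26% biomarker-negative is the lower bound reported in the text.
factor = screening_inflation(0.26)
print(f"inflation factor: {factor:.2f}")

# Hypothetical trial: 2 groups x 400 patients (assumed for illustration only).
print("candidates to test:", candidates_needed(800, 0.26))
```

The sketch ignores other sources of screen failure, so it gives a lower bound on the added recruitment burden.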
The considerable heterogeneity among biomarker-positive participants is a likely explanation for our results. The biomarker-positive groups showed greater clinical worsening, about 0.8 ADAS-cog and 0.4 CDR-sb points over 2 years, than the overall aMCI group selected without regard to biomarkers. However, the standard deviations of the outcomes were also larger, which decreased the power to detect treatment differences; that is, the within-group effect sizes were about the same. The use of these biomarker criteria for a targeted clinical trial may select from the extremes of the distribution, where increased within-group variability may offset any increase in mean difference between groups.
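The trade-off between a larger mean change and a larger standard deviation can be sketched numerically. The Python example below uses a normal approximation for a two-sided, two-sample comparison of completers and assumes the treatment slows decline by a fixed fraction, so that the standardized treatment effect equals that fraction times the within-group effect size. All numeric inputs (means, SDs, the 30% slowing) are hypothetical values chosen for illustration; only the 0.8-point difference in mean change, the 40% dropout rate, and the group size of 400 echo the text.

```python
from statistics import NormalDist


def within_group_effect_size(mean_change: float, sd_change: float) -> float:
    """Mean change divided by the SD of the change."""
    return mean_change / sd_change


def approx_power(std_effect: float, n_per_group: int,
                 dropout: float = 0.0, alpha: float = 0.05) -> float:
    """Normal-approximation power for a two-sided, two-sample test,
    counting only completers (an assumed, simplified analysis)."""
    n = n_per_group * (1.0 - dropout)
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    ncp = std_effect * (n / 2.0) ** 0.5  # expected z statistic
    return nd.cdf(ncp - z_crit) + nd.cdf(-ncp - z_crit)


slowing = 0.30  # hypothetical fractional slowing of decline

# Hypothetical aMCI group: mean change 3.0 points, SD 6.0.
es_all = within_group_effect_size(3.0, 6.0)
# Hypothetical biomarker-positive group: 0.8 points more change, larger SD.
es_pos = within_group_effect_size(3.8, 7.6)

power_all = approx_power(slowing * es_all, n_per_group=400, dropout=0.40)
power_pos = approx_power(slowing * es_pos, n_per_group=400, dropout=0.40)
print(f"effect sizes: {es_all:.2f} vs {es_pos:.2f}")
print(f"power: {power_all:.3f} vs {power_pos:.3f}")
```

Under these assumptions the two groups have identical within-group effect sizes, so the approximate power is identical despite the larger mean change in the biomarker-positive group.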
A notable difference between the ADAS-cog and CDR-sb outcomes was that the within-group effect sizes (i.e., mean change/SD of the change) were generally larger in the case of the CDR-sb. However, this did not translate to more efficient trials using the CDR-sb in preference to the ADAS-cog outcome in terms of treatment effects, power, and required sample sizes.
Longitudinal studies [21], including ADNI [1], demonstrate that CSF Aβ1-42 and t-tau concentrations predict clinical progression in MCI patients, and that subgroups defined by CSF biomarkers that progress to dementia at different rates can be identified [1]. Although this is a consistent finding [26], it has also been consistently observed that either the memory impairment or the CSF abnormalities predicted approximately equal clinical declines without differential sensitivity [24]. Similarly, in our analyses the biomarker-positive groups showed only fractional differences in mean baseline and change scores as compared with the aMCI group selected without consideration of CSF biomarkers. Therefore, it appears that positive CSF biomarkers, when obtained in a clinical research environment after an aMCI diagnosis is made (and therefore perhaps in clinical practice as well), may mainly identify more advanced aMCI or prodromal AD; if so, cognitive severity appears to be the more pragmatic predictor of decline [24]. Further evidence for this is that the 148 aMCI patients with low CSF Aβ1-42 concentrations scored significantly worse on screening ADAS-cog, CDR-sb, logical memory-delayed, and functional activities than the 51 patients with high CSF Aβ1-42 (data not shown).
These results have substantial implications for clinical trials planning and interpretation. Assumptions that low CSF Aβ1-42 or high t-tau/Aβ1-42 are more relevant selection criteria for clinical trials are based on views that they aid diagnosis and index greater brain Aβ load and neurodegeneration [27]. However, it is not known whether such biomarker-positive patients would be more likely to respond to an experimental drug or whether a therapeutic effect would be detected more readily. The opposite could be true: targeted-design trials that select only patients who meet Aβ1-42 biomarker criteria may inadvertently select those who are less likely to benefit because they are too advanced. In fact, the use of CSF Aβ1-42 biomarkers after a clinical aMCI diagnosis is made may not achieve the desired goal of identifying prodromal AD patients early enough in their illness course for a disease-modifying drug to show an effect.
Moreover, the efficiency of a targeted clinical trial design, where the premise is that the low CSF Aβ1-42 patients will both deteriorate more and be particularly responsive to treatment, depends on the effectiveness of the drug in both the biomarker-positive and -negative groups, the proportion of biomarker-positive patients in the sample, and the accuracy of the assay [28]. When a small proportion of available patients are biomarker-positive and the drug has little benefit for biomarker-negative patients, choosing only biomarker-positive patients would indeed require fewer patients than a standard clinical trial design [28]. In this study, 70% to 74% of aMCI patients were biomarker-positive, potentially limiting the usefulness of CSF Aβ1-42 for screening, and there was no meaningful effect on statistical power. For trials to be more efficient by preferentially selecting biomarker-positive patients, the treatment in question must be substantially more effective in that group than in the biomarker-negative group. It is important to identify and validate biomarkers for diagnosis and prediction of both disease progression and treatment response when designing targeted clinical trials [4]; however, CSF Aβ1-42 biomarkers may be differentially informative at different stages [4].
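The dependence of targeted-design efficiency on the proportion of biomarker-positive patients and on the relative treatment effect in each group can be illustrated with a standard normal-approximation argument. The Python sketch below is an assumption-laden illustration, not the method of [28]: it assumes a perfectly accurate assay, equal outcome SD in both groups, and an untargeted trial whose average effect is the prevalence-weighted mix of the two group effects. Function names and the example effect values are hypothetical.

```python
def relative_efficiency(prop_positive: float, effect_pos: float,
                        effect_neg: float) -> float:
    """Ratio of randomized patients needed, untargeted / targeted,
    for equal power (equal SD and a perfect assay are assumed)."""
    diluted = prop_positive * effect_pos + (1.0 - prop_positive) * effect_neg
    if diluted <= 0.0:
        raise ValueError("average effect in the untargeted trial must be > 0")
    return (effect_pos / diluted) ** 2


def screened_ratio(prop_positive: float, effect_pos: float,
                   effect_neg: float) -> float:
    """Untargeted sample size divided by the number of candidates the
    targeted design must test (randomized patients / prop_positive)."""
    return prop_positive * relative_efficiency(prop_positive, effect_pos, effect_neg)


# 74% biomarker-positive (upper bound in the text), no benefit in negatives.
print(round(relative_efficiency(0.74, 1.0, 0.0), 2))
# Same prevalence, negatives get half the benefit (hypothetical).
print(round(screened_ratio(0.74, 1.0, 0.5), 2))
# Rare biomarker (hypothetical 20%), no benefit in negatives.
print(round(relative_efficiency(0.20, 1.0, 0.0), 2))
```

A `screened_ratio` below 1 means the targeted design must test more candidates than the untargeted design would enroll. In this sketch that occurs at 74% prevalence once biomarker-negative patients receive half the benefit, whereas a rare biomarker with no benefit in negatives strongly favors targeting.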
The results also demonstrate differences between modeling and simulations in estimating power for clinical trials. Typically, parameters for power calculations are obtained using summary statistics from reference groups and, assuming a range of effect sizes, corresponding sample sizes are calculated. This approach depends on the critical assumptions that the reference group adequately represents the characteristics of the planned trial sample and that the summary statistics capture the heterogeneity among the trial participants. However, heterogeneity in the pattern of outcomes may go unrecognized when summary statistics are used, particularly when the model requires scores to change linearly over time. This could explain why we observed no significant increase in statistical power in biomarker-positive patients, whereas others calculated greater power for the same sample sizes by using summary data. Therefore, the heterogeneity resulting from the sampling process of the simulations better anticipates the heterogeneity that would be observed in a prospective trial.
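The resampling idea can be sketched as follows. This minimal Python example draws trial arms with replacement from a reference pool of observed change scores, applies a treatment effect and random dropout, and estimates power as the fraction of simulated trials that reach significance. It is not the simulation code used in this study: the pool is synthetic, and the fractional-slowing treatment model, the completely-at-random dropout, and the z-test on completers are all simplifying assumptions.

```python
import random
from statistics import NormalDist, mean, variance


def simulate_power(pool, n_per_group, slowing, dropout=0.40,
                   n_sims=400, alpha=0.05, seed=0):
    """Monte Carlo power estimate by resampling change scores from `pool`."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    hits = 0
    for _ in range(n_sims):
        # Sample each arm with replacement, then drop patients at random.
        placebo = [x for x in rng.choices(pool, k=n_per_group)
                   if rng.random() >= dropout]
        treated = [(1.0 - slowing) * x for x in rng.choices(pool, k=n_per_group)
                   if rng.random() >= dropout]
        if len(placebo) < 2 or len(treated) < 2:
            continue
        se = (variance(placebo) / len(placebo)
              + variance(treated) / len(treated)) ** 0.5
        if se > 0 and abs(mean(placebo) - mean(treated)) / se > z_crit:
            hits += 1
    return hits / n_sims


# Synthetic, heterogeneous reference pool (hypothetical values): most
# patients change little, a minority decline substantially.
pool_rng = random.Random(1)
pool = ([pool_rng.gauss(1.0, 4.0) for _ in range(140)]
        + [pool_rng.gauss(8.0, 8.0) for _ in range(60)])

power_null = simulate_power(pool, n_per_group=200, slowing=0.0)
power_half = simulate_power(pool, n_per_group=200, slowing=0.5)
print(f"no effect: {power_null:.2f}, 50% slowing: {power_half:.2f}")
```

Because each simulated trial resamples individual patients, the spread of outcomes in the pool carries over into every arm, which is the property the paragraph above attributes to simulation-based power estimates.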
One limitation to making inferences from these results is that, although ADNI was meant to inform clinical trials methods, it is not itself a randomized trial. Patients volunteered for a study without a planned treatment intervention, in which lumbar puncture was optional and ratings were not done under the double-blinded conditions of a randomized, controlled trial. Investigators could obtain knowledge of APOE ε4 genotype, clinical characteristics, test performance, course, severity, and medication use, which could in turn have influenced their diagnoses, clinical ratings, and decisions to perform lumbar punctures. Another potential limitation is that the substantial majority of those who underwent CSF examinations had low Aβ1-42 concentrations and high t-tau/Aβ1-42 ratios; although consistent with a European MCI sample [24], this group may not represent samples from broader communities or nonacademic clinics. Finally, although the use of cholinesterase inhibitors is allowed in all long-term AD clinical trials [29], the nearly one-half of patients using these drugs were slightly more impaired and declined more than those not using them, and this may have affected illness course [20]. The random treatment allocations and thousands of simulations ensured that results were not biased in this respect; however, future simulation studies and clinical trials might consider the potential effects of marketed medications on both internal and external validity.
In summary, selecting aMCI or prodromal AD patients for a clinical trial on the basis of CSF Aβ1-42 biomarker criteria will most likely identify relatively more severe patients and not enhance the statistical power of the trials. In the absence of a strong scientific rationale, it may be more practical and clinically relevant not to use Aβ1-42 CSF biomarkers as a criterion for trial entry in this setting, and to restrict their use to explanatory or stratification variables when there are reasons to do so.