We set out to determine factors that influence the rate of brain atrophy in 1-year longitudinal MRI data. With tensor-based morphometry (TBM), we mapped the 3D profile of progressive atrophy in 144 subjects with probable AD (age: 76.5±7.4 years), 338 with amnestic mild cognitive impairment (MCI; 76.0±7.2), and 202 healthy controls (77.0±5.1), scanned twice 1 year apart. Statistical maps revealed significant age and sex differences in atrophic rates. Brain atrophic rates were about 1–1.5% faster in women than men. Atrophy was faster in younger than older subjects, most prominently in MCI, with a 1% increase in the rates of atrophy and 2% in ventricular expansion, for every 10-year decrease in age. TBM-derived atrophic rates correlated with reduced beta-amyloid and elevated tau levels (N=363) at baseline, baseline and progressive deterioration in clinical measures, and increasing numbers of risk alleles for the ApoE4 gene. TBM is a sensitive, high-throughput biomarker for tracking disease progression in large imaging studies; sub-analyses focusing on women or younger subjects gave improved sample size requirements for clinical trials.
Alzheimer’s disease (AD) is a neurodegenerative disorder characterized by pathological accumulation of misfolded beta-amyloid (Aβ) peptides in the neuropil, and hyperphosphorylated tau (p-tau) proteins in neurons (Selkoe, 2004, Skovronsky, et al., 2006). The macroscopic effects of neuronal atrophy, cell death and myelin impairment are detectable on high-resolution structural magnetic resonance imaging (MRI), offering an in vivo index of progressive brain deterioration. AD pathology accumulates up to two decades before overt cognitive decline, and minimally symptomatic subjects, with mild cognitive impairment (MCI) (Petersen, et al., 2001, Petersen, 2003), are a key target in clinical trials (Grundman, et al., 2004). Various imaging measures have been proposed as biomarkers of the disease, reflecting different aspects of AD pathology. Efforts are underway to assess their power for diagnosis, predicting future decline, and sensitivity to the effects of potential disease-modifying treatments (Shaw, et al., 2007, Frisoni, et al., 2009, Jagust, et al., 2009).
Longitudinal brain MRI can be used to track disease progression with high precision and statistical power (Leow, et al., 2006, Hua, et al., 2009). Brain MRI scans can be analyzed with automated or semi-automated methods to measure hippocampal atrophy (Jack, et al., 2004, Thompson, et al., 2004, Chetelat, et al., 2008, Morra, et al., 2009a, Morra, et al., 2009b, Schuff, et al., 2009), ventricular enlargement (Jack, et al., 2003, Thompson, et al., 2004, Carmichael, et al., 2006, Chou, et al., 2008, Nestor, et al., 2008, Chou, et al., 2009a, Chou, et al., 2009b), or whole brain atrophy (Fox, et al., 1999, Fox, et al., 2000, Smith, et al., 2002, Smith, et al., 2004, Sluimer, et al., 2008). The trajectory of brain atrophy on structural MRI largely mirrors the anatomical pattern and trajectory of neurofibrillary tangle deposition (Chetelat, et al., 2002, Thompson, et al., 2003, Vemuri, et al., 2008, Whitwell, et al., 2008, Vemuri, et al., 2009), correlates with clinical decline (Fox, et al., 1999, Thompson, et al., 2004, Hua, et al., 2008b, Evans, et al., 2009, Jack, et al., 2009, Leow, et al., 2009), and predicts future conversion from preclinical to symptomatic AD (Jack, et al., 1999, Apostolova, et al., 2006, Chetelat, et al., 2008, Hua, et al., 2008b, Misra, et al., 2009, Risacher, et al., 2009, Vemuri, et al., 2009), suggesting that MRI measures are useful outcome measures for early diagnosis (Chetelat and Baron, 2003) and clinical trials (Mueller, et al., 2005b, Mueller, et al., 2006, Shaw, et al., 2007, Halperin, et al., 2009, Frisoni, et al., 2010, Hill, 2010).
As AD progresses slowly, drug trials are usually under-powered to detect subtle therapeutic effects in a reasonable time interval, given the high cost of scanning large numbers of subjects. Several sample “enrichment” strategies have been proposed to selectively target subjects most likely to decline based on their genotypes (e.g., ApoE4 carriers, those with abnormal Aβ precursor protein genes, presenilin 1 and 2) (Saunders, et al., 1993, 1998), MRI markers of early AD (e.g., hippocampal or entorhinal atrophy) (Frisoni, et al., 1999, Du, et al., 2001, Jack, et al., 2004, Devanand, et al., 2007, Morra, et al., 2009b), or cerebrospinal fluid (CSF) biomarker profiles (e.g. Aβ, tau, p-tau) (Clark, et al., 2003, de Leon, et al., 2006, Hansson, et al., 2006, Ibach, et al., 2006), to reduce patient heterogeneity and improve statistical power in trials (Frank, et al., 2003, Thal, et al., 2006, Shaw, et al., 2007, Clark, et al., 2008). If factors influencing atrophic rates were better understood, they could be used, in principle, to stratify cohorts into subgroups of subjects most likely to decline. Sex and age differences in atrophic rates are still poorly understood: atrophic rates may be faster in young versus older MCI subjects (Jack, et al., 2008c), and greater atrophy is seen in early- versus late-onset AD (Frisoni, et al., 2007). Women may have higher risk of developing AD than men (Gao, et al., 1998) and, relative to men, women with AD may suffer from greater cognitive impairments (Henderson and Buckwalter, 1994, Fleisher, et al., 2005, Moreno-Martinez, et al., 2008, Bai, et al., 2009), greater functional disability (Dodge, et al., 2003), and more frontal metabolic impairment (Herholz, et al., 2002). Even so, MRI evidence of a “sexual dimorphism” in AD is still lacking. Most of the studies to date are underpowered, i.e. do not have a large enough sample size to detect a subtle sex effect on atrophic rates.
Here we assessed how brain atrophic rates depend on age and sex, in one of the largest MRI studies to date, in the hope that adjusting for these factors might enhance the power to track brain atrophy and factors that influence it. We related atrophic rates to other AD biomarkers, including Aβ, tau, and hyperphosphorylated tau (p-tau) levels in the CSF. We correlated atrophic rates with well known and candidate risk genes (ApoE and GRIN2b). We hypothesized that there would be age and sex differences in atrophy rates, in a diffuse pattern through the brain. We also attempted to rank the clinical variables in terms of their strength of association with rates of atrophy. We hypothesized that atrophic rates might correlate more strongly with cognitive scores, both at baseline and their rates of decline, than with changes in CSF biomarkers, which have poorer temporal reproducibility. We also explored some implications of these correlations for boosting power in clinical trials.
Baseline and 1-year follow-up brain MRI scans were downloaded from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) public database (http://www.loni.ucla.edu/ADNI/Data/) on or before June 1, 2009, and reflect the status of the database at that point; as data collection is ongoing, we focused on analyzing all available baseline and 1-year follow-up scans, together with the associated demographic information, ApoE genotypes, CSF biomarker measures (for Aβ, tau, p-tau), and clinical and cognitive data based on functional and behavioral assessments. ADNI is a large five-year study launched in 2004 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies and non-profit organizations, as a $60 million public-private partnership. The primary goal of ADNI has been to test whether serial MRI, PET, other biological markers, and clinical and neuropsychological assessments acquired at multiple sites (as in a typical clinical trial) can replicate results from smaller single-site studies measuring the progression of MCI and early AD. Determination of sensitive and specific markers of very early AD progression is intended to help researchers and clinicians monitor the effectiveness of new treatments, and lessen the time and cost of clinical trials. The Principal Investigator of this initiative is Michael W. Weiner, M.D., VA Medical Center and University of California, San Francisco.
We analyzed 1,368 brain MRI scans, from 144 probable AD patients (age at baseline: 76.5±7.4 years), 338 individuals with amnestic mild cognitive impairment (MCI; 76.0±7.2), and 202 healthy elderly controls (CTL; 77.0±5.1), each scanned twice one year apart. ADNI patients are scanned at other intervals, but here we focused on the one-year follow-up data, as such an interval is common in clinical trials, and we wanted to focus on an interval over which changes would be readily detectable. All AD patients met NINCDS/ADRDA criteria for probable AD (McKhann, et al., 1984). ADNI inclusion and exclusion criteria (Mueller, et al., 2005a, Mueller, et al., 2005b) are detailed online at http://www.alzheimers.org/clinicaltrials/fullrec.asp?PrimaryKey=208.
All subjects (N=684, consisting of 144 ADs, 338 MCIs, and 202 controls) completed thorough clinical and cognitive assessments at the time of the baseline scan. At the one-year follow-up, 660 of them (122 ADs, 336 MCIs, and 202 controls) completed an additional set of clinical and cognitive tests. Cognitive tests examined here included the Alzheimer's Disease Assessment Scale-cognitive subscale (ADAS-Cog), a 70-point scale designed to measure the severity of cognitive impairment; this is currently the most widely used cognitive measure in AD trials (Rosen, et al., 1984, Mohs, 1994). It consists of 11 tasks assessing learning and memory, language production and comprehension, constructional and ideational praxis, and orientation. The Mini-Mental State Examination (MMSE) provides a global measure of mental status, evaluating five cognitive domains: orientation, registration, attention and calculation, recall, and language (Folstein, et al., 1975, Cockrell and Folstein, 1988). The Rey Auditory Verbal Learning Test (AVLT) evaluates learning and memory functions by assessing the ability to recall a list of 15 words, both immediately after each of the five learning trials (AVLT-5), and after a 30-minute delay (AVLT-del) (Rey, 1964). The Logical Memory (LM) test is a modified version of the episodic memory assessment from the Wechsler Memory Scale-Revised (WMS-R) (Wechsler, 1987). Subjects were asked to recall a short story consisting of 25 pieces of information, both immediately after it was read to the subject (LM-im), and after a 30-minute delay (LM-del). Functional and behavioral assessments analyzed here included the sum-of-boxes Clinical Dementia Rating (CDR-SB), ranging from 0 to 18. The CDR-SB measures dementia severity by evaluating patients’ performance in six domains: memory, orientation, judgment and problem solving, community affairs, home and hobbies, and personal care (Hughes, et al., 1982, Berg, 1988, Morris, 1993).
Finally, the Functional Assessment Questionnaire (FAQ) summarizes the functional activities of daily living (Pfeffer, et al., 1982). Medical histories of cardiovascular, endocrine-metabolic, and gastrointestinal disorders, alcohol abuse, drug abuse, and smoking were obtained at the screening visit from the participant and the study partner. Complete details of the ADNI assessments are found in the ADNI Procedures Manual (http://adni-info.org/images/stories/adniproceduresmanual12.pdf) and www.ADNI-info.org.
The study was conducted according to the Good Clinical Practice guidelines, the Declaration of Helsinki and U.S. 21 CFR Part 50-Protection of Human Subjects, and Part 56-Institutional Review Boards. Written informed consent was obtained from all participants.
CSF samples were obtained from a subset of the ADNI subjects through lumbar puncture, after an overnight fast. Samples collected at various sites were transferred, on dry ice, to the ADNI Biomarker Core Laboratory at the University of Pennsylvania Medical Center. Levels of Aβ 1–42 peptide, total tau, and tau phosphorylated at the threonine 181 (p-tau) were measured in 363 subjects at baseline (83 AD, 173 MCI, and 107 CTL), and in 251 subjects at 1-year follow-up (50 AD, 122 MCI, and 79 CTL).
ApoE and genome-wide genotyping were performed on DNA samples obtained from subjects’ blood. Genomic DNA samples were analyzed on the Human610-Quad BeadChip (Illumina, Inc. San Diego, CA) at the University of Pennsylvania (see www.adni-info.org for detailed information on blood sample collection, DNA preparation, and single nucleotide polymorphism (SNP) genotyping methods). We also assessed the effect of a common genetic variant in the GRIN2B gene, a subunit of the NMDA-type glutamate receptor, at SNP rs-10845840, which we previously found was associated with bilateral temporal lobe volume in a genome-wide study of the ADNI data (Stein, et al., 2010) using the Plink software (Purcell, et al., 2007). This SNP encodes a polymorphism in the glutamate receptor, and is over-represented in AD versus controls and is associated with cognitive decline (Stein, et al., 2010).
Scans were acquired on 1.5T MR scanners at 60 sites across the United States and Canada. Although different types of scanners (GE, Siemens, or Philips) and various software platforms were used, a standardized MRI protocol ensured cross-site comparability (Jack, et al., 2008a). A typical 1.5T MR protocol involved a 3D sagittal MP-RAGE scan with repetition time (TR): 2400 ms, minimum full TE, inversion time (TI): 1000 ms, flip angle: 8°, 24 cm field of view, and a 192×192×166 acquisition matrix in the x-, y-, and z- dimensions, yielding a voxel size of 1.25×1.25×1.2 mm³, later reconstructed to 1 mm isotropic voxels.
Image corrections were applied using a processing pipeline at the Mayo Clinic, consisting of: (1) “gradwarp” correction of geometric distortion due to gradient non-linearity (Jovicich, et al., 2006), (2) “B1-correction” to adjust for image intensity inhomogeneity due to B1 non-uniformity (Jack, et al., 2008a), (3) “N3” bias field correction for reducing residual intensity inhomogeneity (Sled, et al., 1998), and (4) geometrical scaling to remove scanner- and session-specific calibration errors using a phantom scan acquired for each subject (Gunter, et al., 2006). All original image files as well as all corrected images are available at http://www.loni.ucla.edu/ADNI/Data/.
First, each subject’s follow-up scan was linearly registered to their baseline scan, with a 9-parameter (9P) transformation driven by a mutual information (MI) cost function (Collins, et al., 1994), to adjust for linear differences in position and scale across time. 9P registration can correct for scanner voxel size variations in large longitudinal studies involving multiple sites, scanners and acquisition sequences (Clarkson, et al., 2009), consistently outperforming 6P registration in terms of statistical power (Paling, et al., 2004, Hua, et al., 2009). Second, to account for global differences in brain scale across subjects, the mutually aligned scan pairs were then linearly registered to the International Consortium for Brain Mapping template (ICBM-53) (Mazziotta, et al., 2001), applying the same 9P transformation to both mutually aligned scans. Globally aligned images were re-sampled in an isotropic space of 220 voxels along x-, y- and z-dimensions with a final voxel size of 1 mm³.
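As an illustration of what a 9-parameter linear transformation contains, the following minimal sketch assembles three translations, three rotations, and three axis-wise scale factors into a single 4×4 homogeneous matrix. The function name, the rotation order (x, then y, then z), and the choice to apply scaling before rotation are assumptions made for illustration; they are not taken from the registration software used in the study, and the sketch does not perform the mutual-information optimization itself.

```python
import numpy as np

def nine_parameter_affine(translation, rotation_rad, scale):
    """Build a 4x4 matrix from 3 translations, 3 rotations (radians), 3 scales.

    Hypothetical convention: scale first, then rotate about x, y, z, then translate.
    """
    rx, ry, rz = rotation_rad
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    rot_x = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    rot_y = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rot_z = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    linear = rot_z @ rot_y @ rot_x @ np.diag(scale)
    affine = np.eye(4)
    affine[:3, :3] = linear          # rotation and anisotropic scaling
    affine[:3, 3] = translation      # shift in mm
    return affine
```

Dropping the three scale factors (fixing them at 1) gives the 6-parameter rigid-body case; the extra scale terms are what allow a 9P fit to absorb voxel-size drift between sessions.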
Individual maps of atrophic rates (also known as “Jacobian maps”) were derived from a TBM analysis of MRI scans acquired one year apart. These maps represent the rates of tissue shrinkage (or CSF space expansion) at each voxel location in the brain. A Jacobian map was created by nonlinearly warping the 1-year follow-up scan to match the baseline scan of the same individual, driven by a mutual information cost function, and a regularizing term called the symmetrized Kullback-Leibler (sKL-MI) distance (Yanovsky, et al., 2009). Registration parameters (sigma=6 and lambda=8) were chosen based on our earlier optimization study (Hua, et al., 2009). A color-coded map of the Jacobian determinants was computed from the gradient of the deformation field to illustrate regions of volume expansion (i.e., with det J (r) >1 ), or contraction (i.e., with det J (r) <1) (Freeborough and Fox, 1998, Toga, 1999, Thompson, et al., 2000, Chung, et al., 2001, Ashburner and Friston, 2003, Riddle, et al., 2004) over the 1-year interval, yielding a map that estimates tissue change rates. Jacobian maps were also spatially normalized across subjects by nonlinearly aligning all individual maps to a minimal deformation template (MDT), for regional comparisons and group statistical analysis. The MDT represented the average shape of 40 healthy elderly controls; the procedure to construct the MDT is detailed in (Hua, et al., 2008a, Hua, et al., 2008b). Average maps were computed by taking the mean at each voxel of the Jacobian maps across subjects.
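The step from a deformation field to a Jacobian map can be sketched as follows. This is a minimal NumPy illustration, assuming the nonlinear registration has already produced a displacement field u(r) sampled on the image grid; the function name and array layout are hypothetical, and finite differences stand in for whatever gradient scheme the actual TBM software uses.

```python
import numpy as np

def jacobian_determinant(displacement, spacing=1.0):
    """Voxelwise det J for the mapping r -> r + u(r).

    displacement: array of shape (3, X, Y, Z) holding the x, y, z components of u.
    spacing: voxel size, assumed isotropic here.
    """
    # grads[i][j] approximates d u_i / d x_j by finite differences
    grads = [np.gradient(displacement[i], spacing) for i in range(3)]
    jac = np.empty(displacement.shape[1:] + (3, 3))
    for i in range(3):
        for j in range(3):
            # J = I + grad(u): add 1 on the diagonal
            jac[..., i, j] = grads[i][j] + (1.0 if i == j else 0.0)
    return np.linalg.det(jac)
```

Values above 1 mark local expansion and values below 1 mark local contraction, matching the convention in the text; a uniform 1% shrinkage along each axis, for instance, gives det J = 0.99³ at every voxel.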
We performed several statistical analyses to assess factors influencing or related to brain atrophic rates in Alzheimer’s disease and normal aging. First, general linear regressions were used to investigate the relations between TBM-derived brain atrophic rates and demographic variables, CSF biomarkers, clinical and neuropsychological measures, known risk genes, imminent conversion to AD, and other risk factors. These correlations were subsequently evaluated by cumulative distribution functions (CDF) to determine if they were significant after controlling for multiple comparisons using conventional criteria, inside the whole brain or within the temporal lobes. The CDFs were also used to rank the strengths of correlations within each category, to find out which factors are most strongly associated with the rates of structural brain atrophy. Second, the 3D map was reduced to a single numerical score, representing the overall atrophic rate for each individual within an ROI. Third, based on these numerical scores, a power analysis was used to estimate the patient recruitment size for a hypothetical clinical trial of a disease-modifying drug, using structural imaging or other biomarkers as surrogate measures of disease progression.
At each voxel within the brain, correlations were assessed, using the general linear model, between atrophy rates and (1) demographic variables: age, sex, and education; (2) baseline and 1-year changes in CSF biomarker levels: Aβ, tau, p-tau, and the ratio of tau to Aβ; (3) baseline and 1-year changes in clinical and behavioral measures: ADAS-cog, MMSE, AVLT, LM, CDR-SB, and FAQ; (4) medical histories of cardiovascular, endocrine-metabolic, and gastrointestinal disorders, as well as information on alcohol abuse, drug abuse, and smoking; (5) body mass index (BMI); (6) AD risk genes: ApoE4, and a newly discovered candidate risk gene, GRIN2b (Stein et al., 2010). Correlations were assessed within each diagnostic group independently, and in the combined group (of all AD, MCI, CTL subjects), where appropriate. Binary categorical (or indicator) variables were used to code sex (female sex as 0; male as 1), medical histories (no medical history as 0; present as 1), and conversion to AD (non-converters as 0; converters as 1). Risk genes were coded as 0, 1, and 2 for zero, one, and two risk alleles, respectively, to represent an additive model assuming an equal contribution of each risk allele to brain atrophy. All other covariates were represented as continuous variables. Multiple regressions allowed the fitting of a number of predictor variables simultaneously. We first examined age and sex effects (independent variables) on atrophic rates (dependent variables), and age and sex were fitted as covariates to adjust the rest of the correlations for these effects.
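The covariate coding and voxelwise regression described above can be illustrated with a small ordinary-least-squares sketch. The function names and the column layout are assumptions for illustration only; the sketch follows the coding stated in the text (female 0, male 1; risk alleles 0, 1, or 2 under an additive model) and fits every voxel at once, but it is not the software used in the study.

```python
import numpy as np

def build_design(age_years, is_male, n_risk_alleles):
    """Design matrix: intercept, centered age, sex (0 = female, 1 = male), allele count."""
    age = np.asarray(age_years, dtype=float)
    return np.column_stack([
        np.ones_like(age),
        age - age.mean(),                        # continuous covariate, centered
        np.asarray(is_male, dtype=float),        # binary indicator
        np.asarray(n_risk_alleles, dtype=float), # additive genetic model
    ])

def voxelwise_glm(jacobians, design):
    """Fit the same linear model at every voxel.

    jacobians: (n_subjects, n_voxels) atrophy-rate values.
    Returns coefficients and t-statistics, each of shape (n_covariates, n_voxels).
    """
    n, p = design.shape
    beta, _, _, _ = np.linalg.lstsq(design, jacobians, rcond=None)
    residuals = jacobians - design @ beta
    sigma2 = (residuals ** 2).sum(axis=0) / (n - p)   # residual variance per voxel
    xtx_inv_diag = np.diag(np.linalg.inv(design.T @ design))
    se = np.sqrt(np.outer(xtx_inv_diag, sigma2))      # standard error per coefficient
    return beta, beta / se
```

Because age and sex sit in the same design matrix as the covariate of interest, the coefficient for that covariate is automatically adjusted for them, which is the sense in which later correlations were "adjusted for these effects".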
CDF plots of the regression p-values were used to determine the significance and compare the strengths of association (effect sizes) for the various factors that correlated with atrophic rates, inside a pre-defined region-of-interest (e.g., the temporal lobes or whole brain). CDF plots are commonly used by false discovery rate methods to assign overall significance values to statistical maps (Benjamini and Hochberg, 1995, Genovese, et al., 2002, Storey, 2002). A significant correlation is declared if the CDF intersects the y = 20x line (other than at the origin), i.e., critical P>0, as this shows that the volume of supra-threshold statistics is more than 20 times that expected under the null hypothesis (Hua et al., 2008a, Chou et al., 2009b, Hua, et al., 2009, Morra, et al., 2009b). The critical P-value refers to the point at which the CDF plot intersects the line y = 20x, and this represents the highest statistical threshold for which at most 5% false positives are expected in the map. This value is generally higher for stronger effect sizes in the maps, but is not defined if no effect is present (i.e. the false discovery rate in the map cannot be controlled). CDFs may also be used to compare effect sizes for different clinical correlations: CDF curves show increasing statistical correlations in rank order from bottom to top, in each graph.
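The critical P-value can be computed directly from the sorted voxelwise p-values. The sketch below is a minimal illustration of that idea, equivalent to the Benjamini-Hochberg step-up rule at q = 0.05; the function name is hypothetical, and returning `None` when the curve never reaches the line is simply one way to represent "not defined".

```python
import numpy as np

def critical_p_value(p_values, q=0.05):
    """Largest p at which the empirical CDF lies on or above y = x / q.

    With q = 0.05 the reference line is y = 20x. Returns None if the CDF
    never reaches the line, i.e. the false discovery rate cannot be controlled.
    """
    p = np.sort(np.asarray(p_values, dtype=float).ravel())
    m = p.size
    ecdf = np.arange(1, m + 1) / m     # empirical CDF at each sorted p-value
    above_line = ecdf >= p / q         # CDF on or above the reference line
    if not above_line.any():
        return None
    return float(p[above_line][-1])
```

Thresholding the statistical map at the returned value keeps the expected proportion of false positives among the displayed voxels at or below q, and a larger returned value corresponds to a CDF that stays above the line for longer, which is why it can serve as a rough ranking of effect sizes.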
A statistically-defined region of interest (stat-ROI), based on voxels with significant atrophic rates over time (p < 0.001) within a pre-defined anatomical ROI, was established in a non-overlapping training set of 20 AD patients (age at baseline: 74.8±6.3 years; 7 men and 13 women) scanned at baseline and 12 months. The anatomical ROIs included the whole brain gray matter and temporal lobes, two of the best search regions giving the highest statistical power in tracking AD progression (Hua, et al., 2010). This procedure is detailed in (Chen, et al., 2009, Hua, et al., 2009, Ho, et al., 2009, Hua, et al., 2010). A numerical summary of the atrophic rate in the whole brain gray matter, or temporal lobe, was computed by taking the arithmetic mean of Jacobian values within the corresponding stat-ROI (Hua, et al., 2009, Ho, et al., 2009, Hua, et al., 2010), giving a single rate-of-atrophy score for each individual.
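The reduction of a Jacobian map to a single score can be sketched in two small steps, assuming a voxelwise p-map from the training set and a binary anatomical mask are already available in the same template space. Both function names are hypothetical, and how the training p-map itself is computed is left outside the sketch.

```python
import numpy as np

def stat_roi_mask(training_p_map, anatomical_mask, threshold=0.001):
    """Voxels inside the anatomical ROI whose training-set p-value is below threshold."""
    inside = np.asarray(anatomical_mask, dtype=bool)
    return inside & (np.asarray(training_p_map) < threshold)

def atrophy_score(jacobian_map, roi_mask):
    """Arithmetic mean of Jacobian values within the stat-ROI for one subject."""
    return float(np.asarray(jacobian_map)[roi_mask].mean())
```

Because the mask is defined on a training set that does not overlap the test subjects, the same fixed mask is applied to every individual, and the resulting scores are the inputs to the power analysis.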
A power analysis was defined by the ADNI Biostatistics Core to estimate the sample size required to detect, with 80% power, a 25% reduction in the mean annual change, as captured by imaging, clinical or CSF biomarker measures, using a two-sided test and standard significance level (α=0.05) for a hypothetical two-arm study (treatment versus placebo). The estimated minimum sample size for each arm was computed with the formula below. Briefly, β denotes the estimated annual change (average of the group) and σD refers to the standard deviation of the rate of atrophy across subjects.
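The displayed formula is not preserved in this text. Under the definitions given here (a two-arm design, a 25% reduction in the mean annual change β, and between-subject standard deviation σD), the standard sample size expression takes the following form; it should be read as a reconstruction from those definitions, with the target power written as a subscript:

```latex
n = \frac{2\,\sigma_D^{2}\,\left(z_{1-\alpha/2} + z_{\mathrm{power}}\right)^{2}}{\left(0.25\,\beta\right)^{2}}
```

For α = 0.05 and 80% power, z_{1-α/2} ≈ 1.96 and z_{0.80} ≈ 0.84, so the required n per arm scales with the squared ratio σD/β: a faster mean change or a smaller spread across subjects both reduce the sample size.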
Here zα is the value of the standard normal distribution for which P[Z < zα]= α (Rosner, 1990). The sample size required to achieve 80% power was computed, denoted by n80. The 95% confidence interval (CI) for the n80 statistic was computed based on 10,000 bootstrapped resamplings, with a bias corrected and accelerated percentile method (Efron and Tibshirani, 1993, Davison and Hinkley, 1997).
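A numerical sketch of the n80 calculation, under the assumption that the per-arm sample size follows the standard two-sample expression n = 2σD²(z(1−α/2) + z(power))² / (0.25β)², is given below. The function names are hypothetical, and the confidence interval uses a plain percentile bootstrap as a simplification; the bias-corrected and accelerated adjustment described in the text is not implemented here.

```python
import numpy as np
from statistics import NormalDist

def n80(changes, reduction=0.25, alpha=0.05, power=0.80):
    """Per-arm sample size to detect a fractional reduction in mean annual change."""
    changes = np.asarray(changes, dtype=float)
    beta = changes.mean()                 # estimated mean annual change
    sigma_d = changes.std(ddof=1)         # between-subject standard deviation
    z = NormalDist().inv_cdf              # standard normal quantile function
    return 2 * sigma_d ** 2 * (z(1 - alpha / 2) + z(power)) ** 2 / (reduction * beta) ** 2

def bootstrap_ci(changes, n_resamples=10_000, seed=0):
    """95% percentile bootstrap interval for n80 (simplified, not BCa)."""
    rng = np.random.default_rng(seed)
    changes = np.asarray(changes, dtype=float)
    idx = rng.integers(0, changes.size, size=(n_resamples, changes.size))
    estimates = np.array([n80(changes[row]) for row in idx])
    lo, hi = np.percentile(estimates, [2.5, 97.5])
    return float(lo), float(hi)
```

For a measure whose mean annual change is twice its standard deviation, this gives roughly 63 subjects per arm; halving the mean change at the same spread quadruples that figure, which is why subgroups with faster atrophy yield smaller n80s.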
The rates of atrophy (Jacobian values) at each location inside the brain were tested for correlations with age and sex in AD, MCI, and CTL groups independently, as well as in the combined group (ALL). The CDF plots (Figure 1a, 1b) show that age and sex correlate with atrophic rates, especially in the MCI group, and when all subjects were combined. There was no systematic age difference between the 3 diagnostic groups (mean age was 76.5, 76.0, and 77.0 for AD/MCI/CTL), so these effects are driven by differences in age within the diagnostic groups, not between them. Comparing CDF curves of the same color, for the whole brain versus the temporal lobes, gives a clear impression of the power gained by restricting analyses to regions that are known to change the most. For example, the black curves show that age and sex effects are detected with greater effect sizes when focusing on the temporal lobes, as the CDF curves have a steeper gradient at the origin. They also cross the reference line y=20x at a higher point, which means that a higher threshold (critical P-value or C.P.) can be applied to the statistical maps while keeping the false discovery rate to 5% of the voxels shown.
The sign of the correlations with age—positive inside tissues and negative in the CSF—indicates faster brain degeneration in younger MCI subjects (Figure 1c), about 1% increase in atrophic rates and 2% increase in ventricular expansion rates for every 10-year decrease in age; AD patients showed a similar but lesser age effect. Healthy controls showed a small but significant age effect in the opposite direction: a few voxels in the CSF and at the boundary of GM/CSF showed positive correlations, i.e. younger age is associated with less ventricular expansion. Atrophic rates were faster in women than men by about 1–1.5%/year, signified by positive correlations between the atrophic rates and sex (female sex was coded arbitrarily as 0; male as 1; Figure 1d). As expected, the regression coefficient maps, using thresholds derived within the temporal lobes or across the entire brain, are generally consistent in their spatial distributions. However, a broader area reaches significance if restricting the search region to the temporal lobes, as the critical P-values are higher within the temporal lobes than those from the whole brain (results not shown).
When we added education and BMI into this regression model, they did not show significant correlations in any group, so they were not pursued further as confounds. To better illustrate the age and sex differences in atrophic rates, the MCI group was divided into six sub-groups (in age brackets: 60–<70, 70–<80, and 80–<90 years; further split by sex into female and male). Figure 2 shows the age and sex effects in a straightforward fashion, as group average maps. The remaining correlations tested in this paper were all statistically adjusted for these effects of age and sex.
As a related question, one might also wonder if age and sex differences were present in the baseline MRI measures. In fact, there were significant age and sex differences in baseline temporal lobe atrophy, within each group independently and in the combined group.
Temporal lobe atrophy rates were correlated with baseline clinical measures (Figure 3) and with their rates of decline (Figure 4). In AD and MCI, atrophic rates were most strongly correlated with the ADAS-cog, LM-im, and AVLT-5 scores at baseline (Figure 3a, 3b). Baseline LM-del, AVLT-del, FAQ, and MMSE also showed significant correlations in MCI (Figure 3b). Anatomical changes over time were also highly correlated with ongoing changes in LM-del, ADAS-cog, and CDR-SB in AD, and in CDR-SB, FAQ, LM-im, ADAS-cog, and LM-del in MCI (Figure 4). The rank order, from highest to lowest effect size, is shown for these correlations, with baseline ADAS-cog showing the highest correlations with future atrophic rates. The highest curves show the covariates that are most strongly correlated with the measured atrophic rate.
Similar but weaker effect sizes (lower CDF curves and critical P-values) were obtained when expanding the search region to the entire brain, relative to restricting to the temporal lobes, comparing curves of the same color on each side of the plot (Figures 3 and 4). Using the whole brain ROI, atrophic rates were only significantly correlated with the ADAS-cog at baseline in AD, and baseline measures of ADAS-cog, AVLT-5, LM-del, LM-im and MMSE in MCI (Figure 3). Likewise, with the whole brain ROI, atrophic rates were only linked to LM-del decline over a year in AD, while the effect sizes were substantially reduced in MCI (Figure 4). These “butterfly plots” show that there is a clear boosting of power for detecting statistical effects on atrophy when focusing on the regions where greatest changes are expected (i.e., the temporal lobes).
Rates of brain atrophy were significantly correlated with CSF biomarker levels (Aβ, tau, p-tau, and tau/Aβ) at baseline in the combined group of all subjects (blue CDF curves in Figure 5). These correlations did not reach statistical significance within each diagnostic group independently, except that the level of CSF Aβ showed weak but significant correlations (critical P=0.004 in the temporal lobes and 0.001 in the whole brain) in MCI (cyan CDF curves in Figure 5). Also, there were no detectable correlations between rates of tissue atrophy and the rates of change in the CSF biomarkers within the individual groups, with the exception of tau/Aβ in the whole brain in AD (critical P=0.003). The ratio of tau to Aβ also showed some weak correlations with atrophic rates in the combined group (critical P=0.0004 in the temporal lobes and 0.001 in the whole brain). In the common sample, clinical correlations were compared with the results from CSF biomarkers. Baseline ADAS-cog and CDR-SB rates of decline were more strongly correlated with structural brain atrophy, as indicated by higher CDF curves and higher critical P values, with significant correlations also found in the separate diagnostic groups. Again, the effect sizes are substantially boosted by focusing on a temporal lobe region of interest, rather than including all the voxels in the brain; this is clearly evident as the curves on the right of each plot tend to rise more steeply at the origin and intersect the FDR reference line (y=20x) at a higher intersection point, whose x-value denotes the highest P-value threshold that can be applied to the statistical maps while preserving the expected false discovery rate at the conventional level of 5%.
Carriers of the E4 allele of the ApoE (apolipoprotein E) gene, a commonly carried risk gene for late-onset AD (Saunders, et al., 1993, Roses and Saunders, 1994), showed faster atrophic rates in the temporal lobes overall. Associations were weak but significant within each diagnostic group individually only inside the temporal lobes, but strong when all groups were combined (Figure 6). The newly discovered risk allele (rs-10845840, which codes for GRIN2b, a glutamate receptor subunit; Stein et al., 2010) was associated with atrophic rates in the combined group, but more weakly than ApoE was (Figure 6; higher curves denote stronger effects). When ApoE4 was added to the statistical model that estimated the age and sex effects on the rates of atrophy, the sex effect turned out to be stronger (AD: critical P=0.001; MCI: 0.02; CTL: n.s.; ALL: 0.02) but the age effect was slightly attenuated (AD: n.s.; MCI: critical P=0.007; CTL: 0.0008; ALL: 0.01) inside the temporal lobes.
When expanding the search region to the whole brain, the presence of the ApoE4 risk allele was no longer associated with higher atrophic rates in individual diagnostic groups, but the effect remained significant in the combined group.
MCI subjects who converted to AD within a year (13% of the total MCI group) showed faster atrophic rates than nonconverters, as seen in the contrast map and the significance map (Figure 7). Converters, on average, displayed 2–3% faster atrophic rates than non-converters in the temporal lobes. A similar test in the whole brain did not reach statistical significance (critical P=n.s.).
We evaluated correlations between atrophic rates and histories of cardiovascular, endocrine-metabolic, gastrointestinal disorders, alcohol abuse, drug abuse, and smoking. A medical history of drug abuse was weakly associated with a faster rate of tissue atrophy (critical P=0.0001) in the AD group only, while the other factors had no detectable effect.
Given the age and sex effects in atrophic rates, we broke down the MCI group into six age- and sex-divided sub-groups. The n80s (sample size estimates) and 95% confidence intervals are shown in Table 1. In this table, lower numbers are considered better as they imply that smaller sample sizes would be required to detect a 25% change in the rate of disease progression, measured by a specific AD biomarker, in response to a potentially disease-modifying drug. Younger men gave smaller n80s than older men, as expected from the age effects in MCI, where younger MCI subjects showed faster atrophy. For the sample size to be smaller, the atrophic rate may be higher and/or its standard deviation smaller. Women aged 60–70 or 70–80 had smaller n80s than men at similar ages. This is also consistent with the earlier finding that women had marginally faster atrophic rates in MCI (by ~0.5–1.5%/year locally). In other words, trials focusing on younger subjects, or with sub-analyses focusing on women versus men, would be better powered with these measures.
To compare structural MRI versus CSF biomarkers, we computed the n80s based on 1-year changes in CSF biomarker levels. Given their poorer reproducibility than MRI, the n80s were much larger than those from neuroimaging measures (Table 2). Although clearly not their intended use, tens of thousands to millions of subjects would need to be recruited to detect a potential drug effect using CSF biomarkers as surrogate markers measuring the rate of disease progression.
In one of the largest ADNI 1-year follow-up studies, we applied TBM to map the rates of atrophy throughout the brain. Atrophic rates were shown to be correlated with some demographic factors (age and sex), but not education or BMI (although BMI has been associated with baseline levels of atrophy in an independent sample of normal subjects; Raji, et al., 2009). Atrophic rates were also associated with CSF biomarker levels (Aβ, tau, p-tau, tau/Aβ), cognitive performance, behavioral assessments, and risk genes (ApoE, GRIN2b).
In this study, greatest atrophy was primarily localized to the temporal lobes and several broadly distributed gray and white matter regions, and was evidenced by ventricular expansions (Figure 2); largely the same regions showed ongoing tissue loss in MCI and AD. This pattern of localization of atrophy agrees with many prior papers using voxel-based morphometry, tensor-based morphometry, and cortical thickness maps (Smith and Jobst, 1996, Baron, et al., 2001, Chetelat et al., 2002, Scahill et al., 2002, Smith, 2002, Karas et al., 2004, Whitwell et al., 2007, Frisoni et al., 2009, Pievani et al., 2009), based on cross-sectional data or smaller longitudinal studies.
This study was preceded by a smaller pilot study (20 AD, 40 MCI, 40 CTL) with a similar design, in which temporal lobe atrophy rates were correlated with clinical measures and biomarkers (Leow et al., 2009). The current study substantially extended the earlier study by expanding the search region to the whole brain, and by investigating age and sex effects as well as correlations with many newly added biomarkers and risk factors in a sample size almost seven times larger. We confirmed earlier findings that temporal lobe atrophy rates were faster in MCI converters than non-converters, and were correlated with baseline CSF biomarker levels (Aβ, tau, p-tau, tau/Aβ) in the combined group, with baseline LM-del in MCI, and with changes of CDR-SB and LM-im in MCI; however, rate of atrophy, in the current study, was not shown to correlate with baseline level of p-tau, change in MMSE, and change in AVLT-del in MCI. The discrepancy might be due to the sample composition (although sample selection was unbiased) but is more likely due to the sample size difference, which is 7 times larger here. Additionally, we identified significant age and sex differences in atrophic rates; temporal lobe atrophy rates correlated with Aβ in MCI, baseline ADAS-cog, LM-im, and AVLT-5 in AD, baseline ADAS-cog, AVLT-5, AVLT-del, LM-im, FAQ, and MMSE in MCI, changes in LM-del, ADAS-cog, and CDR-SB in AD, changes in FAQ, ADAS-cog, and LM-del in MCI. In the current study, we were also able to detect the associations between common variants in the ApoE and GRIN2b genes and brain atrophic rates; we also explored the implications of drug trial enrichment by performing sub-analyses based on this information.
The age effects on atrophic rates in our study are based on comparing atrophic rates across individuals, which is not to be confused with mapping disease acceleration or deceleration within individual subjects scanned more than twice (Sluimer, et al., 2009). A recent non-ADNI study of individuals with 3 or more serial MRI scans (46 amnestic MCI subjects who later converted to AD, 46 healthy controls, and 23 stable MCI subjects) found that the rates of atrophy do tend to accelerate as individuals progress from amnestic MCI to typical late-onset AD; and the rates of atrophy were greater in younger than older MCI subjects (Jack, et al., 2008c). Our study, in a much larger sample of 684 ADNI subjects (144 AD, 338 MCI, and 202 CTL), confirmed the trend for faster degeneration in younger amnestic MCI subjects versus older subjects. The most plausible explanation is that younger MCI subjects have a more biologically aggressive disease course than older subjects (Jack, et al., 2008c). There is substantial clinical and neuroimaging evidence that early-onset AD (onset before age 65 and typically in the 40’s and 50’s) generally represents a more aggressive form of disease than late-onset AD (onset after age 65) (Frisoni, et al., 2007). A second possibility is that younger MCI subjects may have a larger cognitive reserve than older subjects; under this theory, young people may have greater ability to compensate for the brain deficits so that symptoms may not be evident until brain atrophy has progressed to a greater degree, and is proceeding faster (see, e.g., Mortimer et al., (2005); but see also Christensen et al., (2007) for an opposing view). Finally, some very old subjects were assessed (80–90 years of age), so one has to keep in mind the possibility of a selection bias.
Very old people in the study might tend to be healthier (well enough to participate in a neuroimaging study requiring multiple follow-ups), and have lower atrophic rates, even though those same people may have had even slower atrophy rates when they were younger (long before ADNI). In other words, early mortality may prevent people from enrolling in ADNI if they die earlier due to very fast atrophy, so the oldest subjects in ADNI, as a survivor effect, may have slower atrophic rates for this reason. This attrition effect could explain the paradoxical “adverse” effect of young age in a cross-sectional study (faster atrophy in younger people), even when people’s atrophic rates may speed up as the disease progresses (within an individual); this has been demonstrated in early-onset AD (Chan, et al., 2003, Ridha, et al., 2006) and late-onset AD (Jack, et al., 2008c). In a normal aging study, Scahill et al. (2003) found evidence that atrophic rates accelerated with increasing age; our study also showed a small age effect in the control group, with a similar direction of correlation.
We provided the first structural MRI evidence, to our knowledge, of sexual dimorphism in atrophic rates, although several studies have found worse cognitive and behavioral deficits in women versus men with AD. Most early MRI studies failed to detect a sex difference in atrophic rates, but were limited by small sample sizes and limited statistical power. Sex differences in brain structure are found naturally and well-studied (see, e.g., Brun et al. (2009) for a TBM study) but sex differences in the rates of brain change over time are less commonly reported, except in studies of childhood brain development where they occur around puberty (Giedd, et al., 1999). Why atrophy was faster in women is not clear. Numerous demographic studies provide evidence for a “male-female health-survival paradox.” According to this, older men are generally in better health and are less limited in their daily activities than women of the same age, but mortality rates are higher in men than women at all ages (Christensen, 2008). Genetic variation in the sex chromosomes may contribute to sex differences in the incidence of some comorbid disorders. Men may have earlier and higher incidence of hypertension and cardiovascular diseases (high mortality risk diseases) while women suffer more from migraine, arthritis, and musculoskeletal diseases (low mortality risk diseases) (Macintyre, et al., 1996); this may be related to the cohort effect discussed earlier. Sex hormones may also influence the expression of genes that affect lifespan and longevity (Tower, 2006, Tower and Arbeitman, 2009).
We identified significant age and sex differences in baseline measures of brain atrophy, within each group independently and in the combined group. These baseline effects may reflect a combination of (1) the cumulative influence of age and sex throughout life, and (2) naturally occurring sex differences in brain structure, as different structures tend to occupy different proportions of the total brain volume in men versus women (i.e., allometry; Brun, et al., 2009).
Lower education levels are also linked to a higher risk of developing AD and faster rate of progression when compared to more highly educated people (Scarmeas, et al., 2006, Ngandu, et al., 2007). Higher BMI, an index of obesity, is associated with greater brain atrophy in elderly normal subjects (Raji, et al., 2009). We therefore added education and BMI to the statistical models of age and sex, but the conclusions remained the same even after adjusting for these additional factors. BMI was associated with baseline atrophy but not with atrophic rates.
Different biomarkers provide complementary information at different stages of AD (Jack, et al., 2008b, Jagust, et al., 2009). In particular, structural MRI measures tend to correlate better with cognitive test scores than with CSF biomarker levels. This may be because (1) CSF biomarker changes tend to precede the gross anatomical changes on MRI, and (2) because CSF measures are primarily intended to help with diagnosis rather than resolve subtle changes over time within diagnostic categories. We note that CSF measures were not used to assist diagnosis in the ADNI study. However, at least part of the difference in statistical power is due to the different sample sizes of subjects who had available cognitive measures versus CSF biomarker measures. We tested a common set of subjects who had both cognitive and CSF measures (Figure 5). By reducing the full sample (N=684 at baseline and N=660 with 1-year follow-up) to the common set (N=363 at baseline and N=251 with 1-year follow-up), the clinical correlations all became weaker; however, their statistical effects remained higher than those of CSF biomarkers – for example, there were significant correlations even within the separate diagnostic groups, while only a couple of CSF biomarkers (baseline level of Aβ in MCI and rate of tau/Aβ decline in AD) survived statistical testing within the separate diagnostic groups.
ApoE4 is a well known AD risk gene (Corder, et al., 1993, Saunders, et al., 1993, Roses and Saunders, 1994, Roses, et al., 1995, Roses, 1996), and in our earlier cross-sectional study of 676 ADNI subjects, ApoE2 (the “protective” allele) was associated with reduced CSF volume (an index of lesser brain atrophy) and ApoE4 was associated with greater temporal lobe atrophy (Hua, et al., 2008b). In this longitudinal analysis, ApoE4 and GRIN2b were linked to faster rates of temporal lobe atrophy, in a dose-dependent fashion. GRIN2b is a newly identified risk SNP that predicts temporal lobe volumes in ADNI at baseline (Stein, et al., 2010), but its association with AD is not as strong as ApoE, so requires replication. As well as its use for measuring disease progression, structural MRI measures can also be used to identify genes that influence brain volumes in genome-wide association studies (GWAS) (Joyner, et al., 2009, Potkin, et al., 2009, Stein, et al., 2010).
We applied stratified analyses and ran separate regressions independently in each diagnostic group, to ensure that the observed statistical effects were not driven by diagnosis. Alternatively, the analysis could be carried out by pooling all subjects, by applying indicator variables to encode diagnostic groups and interaction terms to quantify inter-group differences on the main effects. However, this increases the computational burden, and each analysis already involves ~2,000,000 correlations. Because of the very large number of possible interactions, and the likelihood of not being able to fit them all stably, we did not test for interactions between diagnostic groups and predictor variables. We also did not attempt to quantify inter-group differences in the main effects, which requires a second order analysis and has still greater statistical power requirements. Instead, we treated the three diagnostic groups independently, merely to ensure that the observed statistical effects were not driven by diagnosis. Correlations were later assessed in the combined group (AD+MCI+CTL), after stratified analyses.
In all analyses, we first ran correlations in the separate clinical groups, and then we ran another correlation in the combined group, where appropriate. This is the most agnostic approach as it allows the correlations to differ, in principle, in the different diagnostic categories, avoiding the risk that the detected correlations may be shadowing diagnosis. We did not perform correlations with clinical measures in the combined group. As clinical measures are used to determine diagnosis, a correlation in the combined group will be significant by construction. The CSF biomarkers, however, were not used to define diagnosis so it is reasonable to correlate them with levels of atrophy across the combined group. Even so, correlations detected in the combined group may not even apply within some of the groups, either indicating a lack of association or, more likely, limited power to track subtle disease progression within the reduced samples of subjects in individual diagnostic groups. This effect likely explains the lack of correlations with CSF biomarkers within groups. The correlations between CSF biomarkers and MRI changes tend to break down as the disease progresses, as changes in CSF biomarker levels may primarily occur prior to the MRI changes. A similar pattern has been noted in studies of amyloid PET (Braskie et al., 2009), where cortical thinning may not correlate with amyloid deposition if the two processes occur or saturate at different times. In a recent study using serial imaging, the rate of neurodegeneration was shown to associate with clinical symptoms but dissociate from amyloid deposition measured by (11)C Pittsburgh compound B (PIB) positron emission tomography (Jack, et al., 2009).
We used categorical (indicator) variables to encode binary predictors such as sex, medical history, and conversion to AD. Each of these variables has only two distinct classes, i.e., male vs. female, those with or without a medical history, and converters vs. non-converters. If a simple linear regression includes only a two-class categorical variable as the independent variable, the regression acts as a two-sample Student’s t test. An added benefit of using regressions over t tests is that regression allowed us to control for effects of several covariates simultaneously. For example, by fitting both age and sex in the regression model, the sex effect was controlled for when assessing any age effects on atrophic rates.
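The equivalence between a regression on a two-class indicator and a two-sample Student’s t test can be checked numerically. The sketch below uses synthetic data (not ADNI measurements) and hypothetical helper names `slope_t` and `student_t`; it fits a simple linear regression of a simulated atrophic rate on a 0/1 indicator and compares the slope’s t statistic with the pooled-variance t statistic.

```python
import random
from math import sqrt
from statistics import mean, variance


def slope_t(x, y):
    """Slope and its t statistic for the simple linear regression y ~ x."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return slope, slope / se


def student_t(a, b):
    """Two-sample Student's t statistic with pooled (equal) variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))


random.seed(0)
group = [random.randint(0, 1) for _ in range(200)]          # synthetic 0/1 indicator
rate = [1.0 + 0.5 * g + random.gauss(0, 1) for g in group]  # synthetic rates

slope, t_regression = slope_t(group, rate)
group1 = [r for r, g in zip(rate, group) if g == 1]
group0 = [r for r, g in zip(rate, group) if g == 0]
t_two_sample = student_t(group1, group0)
print(t_regression, t_two_sample)
```

The fitted slope equals the difference in group means, and the two t statistics agree to floating-point precision. Adding further columns (for example, age) to the regression is what a t test cannot do, and is the reason the regression form is preferred when covariates must be controlled.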
The results in the whole brain ROI are generally consistent with those derived from the temporal lobes, but are weaker in statistical power. This is expected as brain degeneration is not uniformly distributed across the brain, nor does it progress uniformly. The volume loss pattern from mild to moderate AD spreads over time from temporal and limbic cortices into frontal and occipital brain regions, largely sparing primary sensorimotor cortices (Braak and Braak, 1991; Thompson et al., 2003). One advantage of focusing on the temporal lobes is the improved statistical power by restricting the search region to the area most affected in MCI and early AD. In examining genes influencing brain atrophy (Figure 6) and comparing differences between groups of MCI-converters and non-converters (Figure 7), the statistical effects were only significant in the temporal lobes – which makes sense as these are the regions with greatest pathological burden in MCI. The inclusion of many voxels with much slower atrophic rates and with lower effect sizes tends to inflate the number of voxels assessed to the point where no FDR-controlling threshold can be found. Nevertheless, it is also important to examine the results across the entire brain to better understand the factors influencing brain atrophy in normal aging and AD.
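The dilution effect described above can be illustrated with a toy calculation. The sketch assumes the standard Benjamini-Hochberg step-up procedure for the FDR-controlling threshold (the exact FDR variant is not restated in this section), and uses synthetic p-values and a hypothetical function name, `bh_critical_p`. The same 50 “signal” voxels are embedded first in a small search region and then in a much larger one.

```python
def bh_critical_p(pvals, q=0.05):
    """Benjamini-Hochberg critical p-value at FDR level q.

    Returns the largest sorted p-value p(i) satisfying p(i) <= q * i / m,
    or None when no p-value meets the criterion (no threshold exists).
    """
    m = len(pvals)
    critical = None
    for i, p in enumerate(sorted(pvals), start=1):
        if p <= q * i / m:
            critical = p
    return critical


# Synthetic p-values: 50 voxels with a genuine effect (p from 0.0004 to 0.02),
# padded with null voxels (p = 0.5, a deliberately crude stand-in).
signal = [0.0004 * k for k in range(1, 51)]
small_roi = signal + [0.5] * 50        # 100 voxels in total
whole_brain = signal + [0.5] * 9950    # 10,000 voxels in total

print(bh_critical_p(small_roi))    # a threshold exists in the small region
print(bh_critical_p(whole_brain))  # None: the same signal no longer passes
```

In this toy setting the signal voxels are identical in both cases; only the number of additional voxels with no effect changes, which is enough to eliminate the threshold in the larger search region.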
Our study is one of many that support the use of structural MRI for providing valid surrogate markers in clinical drug trials. MRI is also useful for detecting factors that affect structural changes in anatomical regions involved in AD. CSF biomarkers, despite their value for early diagnosis, might not be so effective for tracking disease progression over time or even for evaluating therapeutic interventions in MCI and AD. For example, their n80s (measures of sample size requirements to detect a fixed percent reduction in the rate of progression) are 1,000 to 10,000 times larger than those from structural MRI (Table 2).
TBM-derived maps of atrophic rates, coupled with voxel-based statistics, offer an easy-to-implement process to investigate factors that exert negative and positive influences on aging and AD. Full 3D maps are used in these correlations, as opposed to only one biomarker measure per individual. This type of map-based method may offer more information and spatial detail on the profile of effects, and may offer better statistical power if effect sizes are not constant across the brain.
Each AD biomarker, derived from structural MRI, clinical, or CSF measures, can be used independently to evaluate drug treatment effects, providing a surrogate outcome measure to track the rate of disease progress. As a result of using different biomarkers, the sample size estimates (n80) should be interpreted with care. For example, a 25% reduction in the atrophic rate (measured by MRI) may have a different functional significance for a patient than a 25% reduction in the rate of decline for clinical or cognitive test scores; similarly, it may also have a different biological significance than a 25% reduction in the rate of change in CSF biomarkers. For example, there may be important and relevant biological events that do not have an immediate imaging correlate. Future efforts will focus on combining multiple biomarkers that measure different aspects of disease progress to reduce the sample size even further.
This study has some limitations. The age and sex effects on atrophic rates, which were still significant here after controlling for education, BMI, and ApoE4, need to be replicated in future, independent studies. A more complete dataset from a large number of subjects with MRI, PIB-PET, [18F]fluorodeoxyglucose (FDG)-PET, diffusion tensor imaging (DTI), resting-state functional MRI, and arterial spin labeling is now being collected to explore the complementary value of each of these neuroimaging markers. Future longitudinal ADNI studies will make use of more than two serial scans, allowing acceleration hypotheses regarding age effects to be tested in the same subjects. More advanced statistical designs, such as random effects or mixed effects models, may then be used to estimate intra-subject variance and group effects with repeated measures (Fitzmaurice, et al., 2004, Frost, et al., 2004, Schuff, et al., 2009).
Acknowledgments and Author Contributions
Data collection and sharing for this project was funded by the Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: Abbott, AstraZeneca AB, Bayer Schering Pharma AG, Bristol-Myers Squibb, Eisai Global Clinical Development, Elan Corporation, Genentech, GE Healthcare, GlaxoSmithKline, Innogenetics, Johnson and Johnson, Eli Lilly and Co., Medpace, Inc., Merck and Co., Inc., Novartis AG, Pfizer Inc, F. Hoffman-La Roche, Schering-Plough, Synarc, Inc., and Wyeth, as well as non-profit partners the Alzheimer's Association and Alzheimer's Drug Discovery Foundation, with participation from the U.S. Food and Drug Administration. Private sector contributions to ADNI are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of California, Los Angeles. This research was also supported by NIH grants P30 AG010129, K01 AG030514, and the Dana Foundation. Algorithm development and image analysis for this study was funded by grants to P.T. from the NIBIB (R01 EB007813, R01 EB008281, R01 EB008432), NICHD (R01 HD050735), and NIA (R01 AG020098), and National Institutes of Health through the NIH Roadmap for Medical Research, Grants U54-RR021813 (CCB) (to AWT and PT).
Author contributions were as follows: XH, DH, SL, AT, and PT performed the image analyses; CJ and MW contributed substantially to the image and data acquisition, study design, quality control, calibration and pre-processing, databasing and image analysis. We thank Anders Dale for his contributions to the image pre-processing and the ADNI project.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
*Data used in the preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (www.loni.ucla.edu/ADNI). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. ADNI investigators include (complete listing available at: http://www.loni.ucla.edu/ADNI/Collaboration/ADNI_Manuscript_Citations.pdf).