Alzheimer’s disease (AD) is a common neurodegenerative disorder of late life with a complex genetic basis. Although several genes are known to play a role in rare early-onset AD, only the APOE gene is known to have a high contribution to risk of the common late-onset form of the disease (LOAD, onset > 60 years). APOE genotypes vary in their AD risk as well as age-at-onset distributions, and it is likely that other loci will similarly affect AD age-at-onset. Here we present the first analysis of age-at-onset in the NIMH LOAD sample that allows for both a multilocus trait model and genetic heterogeneity among the contributing sites, while at the same time accommodating age censoring, effects of known genetic covariates, and full pedigree and marker information. The results provide evidence for genomic regions not previously implicated in this data set, including regions on chromosomes 7q, 15, and 19p. They also affirm evidence for loci on chromosomes 1q, 6p, 9q, 11, and, of course, the APOE locus on 19q, all of which have been reported previously in the same sample. The analyses failed to find evidence for linkage to chromosome 10 with inclusion of unaffected subjects and extended pedigrees. Several regions implicated in these analyses in the NIMH sample have been previously reported in genome scans of other AD samples. These results, therefore, provide independent confirmation of AD loci in family-based samples on chromosomes 1q, 7q, 19p, and suggest that further efforts towards identifying the underlying causal loci are warranted.
Alzheimer’s disease (AD, MIM 104300) is a fatal and devastating neurodegenerative disorder. Late onset AD (LOAD), defined by the onset of symptoms after age 60 years, is the most prevalent form of the disease, affecting 5–10% of individuals ≥ 65 years of age and 10–45% of individuals > 85 years of age [Canadian Study of Health and Aging Working Group 1994; Evans et al., 1989; Fratiglioni et al., 1999]. Because of the late onset, delay of onset, even in the absence of prevention or cure, would provide substantial benefit to the patients and to society. It is therefore important to identify risk factors that influence age-at-onset of AD, as well as disease risk.
There is ample evidence for a genetic basis to both risk [Akesson 1969; Bergem and Lannfelt 1997; Breitner et al., 1986; Lautenschlager et al., 1996; Meyer and Breitner 1998; Mohs et al., 1987] and age-at-onset of AD [Daw et al., 1999; Daw et al., 2000]. The existence of a genetic basis to AD was confirmed by identification of causal mutations resulting in early-onset AD in the amyloid precursor protein (APP) [Goate et al., 1991], presenilin 1 (PSEN1) [Sherrington et al., 1995] and presenilin 2 (PSEN2) [Levy-Lahad et al., 1995] genes, and by identification of APOE as the only well-established susceptibility gene for LOAD [Corder et al., 1993]. APOE also affects age-at-onset, with onset age decreasing with number of ε4 alleles and increasing with number of ε2 alleles [Corder et al., 1994; Corder et al., 1993]. Only 10–20% of the genetic variance for LOAD risk [Bennett et al., 1995; Slooter et al., 1998] and variance in age-at-onset [Daw et al., 1999; Daw et al., 2000] can be attributed to APOE, suggesting that additional LOAD loci remain to be discovered.
Genome scans using a variety of samples and designs have searched for additional contributing loci. Several regions have been implicated across multiple pedigree-based genome scans based on different samples [Bertram et al., 2005; Blacker et al., 2003; Farrer et al., 2003; Hahs et al., 2006; Lee et al., 2006; Li et al., 2006; Pericak-Vance et al., 2000; Rademakers et al., 2005; Scott et al., 2003; Wijsman et al., 2004]. Genomewide association studies (GWAS) have also recently been used, starting with moderate [Coon et al., 2007] and more recently large samples [Harold et al., 2009; Lambert et al., 2009; Seshadri et al., 2010] of typically unrelated subjects, although related subjects have also been used [Wijsman et al., 2011]. These studies have implicated additional loci, several of which give consistent results across studies of subjects of European descent, although not always in non-European samples [Bertram et al., 2007]. However, for these loci contribution to AD risk is much smaller than has been estimated for APOE. Other than APOE, which is detectable in both family-based linkage analyses [Holmans et al., 2005; Myers et al., 2002; Pericak-Vance et al., 2000; Sillen et al., 2008] and population-based association analyses [Abraham et al., 2008; Beecham et al., 2009; Coon et al., 2007; Grupe et al., 2007; Harold et al., 2009; Lambert et al., 2009; Wijsman et al., 2011], there is also only modest overlap between results from pedigree-based vs. GWAS designs. However, a meta-analysis of linkage-based genome scans identified regions on chromosomes 1q, 7p and 8p with strong evidence of linkage across studies, and with additional regions showing suggestive results across studies, including 6p and 19p [Butler et al., 2009].
The complex genetic basis of LOAD may explain the lack of strong consensus in regions implicated across studies. In addition to genetic heterogeneity and multilocus contributions to the phenotype, there is likely to be heterogeneity in exposure and response to environmental risk factors. Even APOE is likely to contribute differently to AD risk across samples because allele frequencies vary considerably among populations; e.g., there is a gradient of the high-risk ε4 allele frequency ranging from a low of 5–9% in Mediterranean populations to over 20% in Scandinavian populations [Corbo and Scacchi 1999]. Some explicit allowance for heterogeneity has been introduced through use of stratification on earlier vs. later onset of AD [Blacker et al., 2003], through use of a 2-locus risk model [Curtis et al., 2001], and through use of an oligogenic model with age-at-onset [Daw et al., 1999; Wijsman et al., 2005]. However, despite this complexity, most previous genome scan analyses of LOAD have been carried out under models that have limited accommodation for genetic heterogeneity, using methods that either ignore, or do not explicitly acknowledge heterogeneity.
A small number of studies have focused on age-at-onset of LOAD rather than disease risk. The well-established LOAD locus, APOE, appears to affect risk of AD through genotype-dependent age-at-onset [Breitner et al., 1998; Corder et al., 1994; Jarvik et al., 1995]. Analysis of age-at-onset in pedigree-based samples is complicated by limited analytic options to accommodate a complex mode-of-inheritance, censoring of age information in unaffected subjects, and applicability to general pedigrees. Previous searches for loci affecting age-at-onset in LOAD samples have made compromises, including ignoring unaffected subjects [Holmans et al., 2005], ignoring their age censoring [Lee et al., 2008], or ignoring pedigree information when adjusting for the known effects of APOE [Dickson et al., 2008].
Here we present the first analysis of age-at-onset in the NIMH LOAD sample that allows for both a multilocus trait model and genetic heterogeneity among the contributing sites within the sample, while at the same time accommodating age censoring and effects of known genetic covariates. This large sample represents the combined data from three different recruitment sites that differ with respect to both genetic ancestries of the local populations and overall recruitment procedures. Both of these issues are likely to cause heterogeneity in the sample. Our analyses detect differing underlying multilocus models across sites, supported by subsequent genome scans, providing evidence of this heterogeneity. Our results both affirm some regions identified previously in this sample, as well as identify additional regions with evidence for AD age-at-onset loci. Some of the new regions that we identify also have been reported in other samples, and therefore serve as confirmation of relevant AD loci.
We used the publicly-available National Institute of Mental Health (NIMH) Genetic Initiative Alzheimer disease sample (https://www.nimhgenetics.org/available_data/alzheimers_disease/). We used the data as provided, and we restricted all analyses to European Americans as defined by the “race=white” indicator. Subject collection and evaluation followed a standard protocol at three clinical sites (Centers 50–52: C50, C51, C52) as described elsewhere [Blacker et al., 1997]. The three sites differed in their geographic locations and points of contact for systematic recruitment efforts: recruiting through a memory disorders clinic, a private nonprofit geriatric outpatient clinic and nursing home, and through private hospital outpatient clinics. The three centers also differed in terms of the fraction of subjects recruited through systematic efforts as compared to non-systematic efforts such as referral from a clinician or Alzheimer’s disease center, with 19%, 59%, and 69% of the sample representing subjects recruited by non-systematic procedures in C50, C51, and C52, respectively.
For the purpose of analysis, we defined affectation status as “affected” for individuals with definite, probable or possible AD based on NINCDS/ADRDA criteria [Blacker et al., 1997; McKhann et al., 1984], and as “unaffected” for individuals without dementia, or presumed to have no dementia. Among the affected subjects, autopsy documentation was available for 33%, 26%, and 35% in centers 50, 51 and 52, respectively. For other diagnosis groups, affectation status was considered to be unknown (missing). Age information used in the analysis was the age at which the first symptoms of AD were reported for affected individuals and the latest age at which an individual was observed for unaffected individuals, with all individuals with both age and diagnosis information used for analysis.
Genotyping is described elsewhere [Blacker et al., 2003; Blacker et al., 1997]. The markers consisted of 362 multiallelic markers spaced at ~10cM intervals. We restrict our presentation here to the genotypes as originally provided, since alternative options for compressing rare alleles into composite alleles for analysis did not affect the results (not shown). Marker map positions were obtained from the Rutgers map [Kong et al., 2004] and were converted to map positions based on the Haldane map function because of assumptions inherent in the underlying analysis methods.
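As a rough illustration of the map-function conversion described above, the sketch below converts cumulative marker positions to Haldane positions one interval at a time, passing through the recombination fraction. It assumes the source positions are expressed under the Kosambi map function, which the text does not state; the function names are hypothetical and are not part of any analysis package.

```python
import math

def kosambi_to_theta(d_cm):
    """Recombination fraction implied by a Kosambi distance given in cM."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)

def theta_to_haldane(theta):
    """Haldane distance in cM for a recombination fraction theta < 0.5."""
    return -50.0 * math.log(1.0 - 2.0 * theta)

def convert_map(positions_cm):
    """Convert sorted cumulative positions (assumed Kosambi cM) to Haldane cM.

    Each inter-marker interval is converted separately and the results are
    re-accumulated, because map functions apply to intervals, not to
    cumulative positions.
    """
    converted = []
    prev_pos, total = 0.0, 0.0
    for pos in positions_cm:
        theta = kosambi_to_theta(pos - prev_pos)
        total += theta_to_haldane(theta)
        converted.append(total)
        prev_pos = pos
    return converted
```

For short intervals the two map functions nearly agree; for intervals of about 10 cM the Haldane distance is somewhat larger because it assumes no crossover interference.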
We used age-at-onset as the phenotype of interest, modeled as a quantitative trait [Daw et al., 1999; Heath 1997]. Age-at-onset was treated as censored for unaffected individuals and as observed (non-censored) for affected individuals. If either affection status or age information was missing, age-at-onset was treated as missing. We performed all analyses both jointly for the data from all three clinical sites, as well as separately for the three sites in order to investigate possible center effects. In addition, there were two versions of joint analysis of the complete sample: (1) assuming every individual is drawn from the same genetic population with the same underlying model parameters (All1), and (2) assuming the three centers may differ in their genetic populations through differing allele frequencies of underlying trait loci (All3).
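The phenotype coding rules in this paragraph can be sketched as a small function. This is only an illustration of the stated rules: the record layout, the status strings, and the names `Phenotype` and `code_age_at_onset` are assumptions made for the example and do not correspond to the NIMH data files.

```python
from typing import NamedTuple, Optional

class Phenotype(NamedTuple):
    age: Optional[float]       # age used in the analysis, or None if missing
    censored: Optional[bool]   # True if age is a censoring age, None if missing

def code_age_at_onset(status, onset_age, last_seen_age):
    """Code one individual following the rules described in the text.

    status is assumed to be "affected", "unaffected", or anything else
    (treated as unknown).
    """
    if status == "affected" and onset_age is not None:
        # Affected: onset age is observed (non-censored).
        return Phenotype(onset_age, False)
    if status == "unaffected" and last_seen_age is not None:
        # Unaffected: onset, if any, occurs after the last observed age.
        return Phenotype(last_seen_age, True)
    # Missing affection status or missing age: phenotype is missing.
    return Phenotype(None, None)
```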
The analysis approach used here differs in two important ways from alternatives such as variance-component methods used for analysis of censored age-at-onset data [Dickson et al., 2008]. First, original ages, rather than residuals from a pre-analysis, are used, and the censoring of age data in unaffected subjects is explicitly taken into account in the model. This, in turn, allows for more efficient use of APOE genotype information as a covariate [Wijsman and Yu 2004] because unobserved APOE genotypes can be probabilistically inferred from the existing data. Second, by allowing different allele frequencies for underlying quantitative trait loci (QTL), this approach can account for both the familial relatedness in pedigrees as well as effects such as center effects due to either ascertainment or population differences.
Here, we briefly describe the Bayesian oligogenic MCMC analyses used for most analyses and implemented in the program Loki version 2.4.7 [Daw et al., 1999; Heath 1997]. Age-at-onset, Y, is modeled as a censored quantitative trait. Multilocus trait genotype-specific means are assumed to be normally distributed with a common within-genotype residual variance, after adjusting for covariates. The model is:
$$Y = \mu + X\beta + \sum_{i=1}^{k} Q_i \alpha_i + e$$
where μ is an overall baseline, X is the incidence matrix for covariate effects, β is the vector of covariate effects, Qi is the incidence matrix for the effect of diallelic QTL i, αi is the vector of additive and dominance effects for QTL i, and e is the normally distributed residual environmental effect. The model allows for multiple QTLs with the number of trait loci, k, considered as a random variable, not as a constant.
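To make the role of censoring in such a model concrete, the following sketch shows one standard way a right-censored normal trait contributes to a likelihood: an affected individual contributes the normal density at the observed onset age, and an unaffected individual contributes the probability that onset lies beyond the last observed age. This is a generic illustration under the assumption that the genotype-specific mean is simply the baseline plus covariate and QTL effects; it is not taken from Loki's source code, and all names are hypothetical.

```python
import math

def genotype_mean(mu, covariate_effect, qtl_effects):
    """Expected age-at-onset: baseline plus covariate and QTL genotype effects."""
    return mu + covariate_effect + sum(qtl_effects)

def log_likelihood(age, censored, mean, sd):
    """Log-likelihood contribution of one individual under a normal model.

    censored=False: age is the observed onset age (normal log-density).
    censored=True:  age is the last observed age without disease, so the
                    contribution is log P(Y > age), the normal survival function.
    """
    z = (age - mean) / sd
    if censored:
        return math.log(0.5 * math.erfc(z / math.sqrt(2.0)))
    return -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))
```

Treating unaffected ages this way uses the information that onset has not yet occurred, rather than discarding unaffected subjects or treating their ages as onset ages.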
A Bayesian reversible-jump MCMC sampler [Green 1995] was used to estimate posterior distributions of model parameters conditional on the data and prior distributions on parameters. The parameters of interest are covariate effects, the number of QTLs contributing to the trait, the allele frequency and genotype effects for each QTL, and for linkage analysis, the location of each QTL. The reported allele frequency, pA, is the frequency of allele A, and homozygote (BB) and heterozygote (AB) effects are μBB -μAA and μAB -μAA, respectively, where μij is the mean genotypic value for genotype ij. For each QTL, the allele labels A and B were arbitrarily assigned such that the genotype BB effect was non-negative.
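The labeling convention in the last sentence can be illustrated with a short helper that converts genotype means into reported effects and swaps the allele labels when needed so that the BB effect is non-negative. This is a minimal sketch of the convention as stated; the function name and argument layout are hypothetical.

```python
def canonical_qtl(p_a, mu_aa, mu_ab, mu_bb):
    """Return (pA, AB effect, BB effect) with labels chosen so BB effect >= 0.

    Effects are expressed relative to the AA genotype mean:
    AB effect = mu_AB - mu_AA and BB effect = mu_BB - mu_AA.
    """
    if mu_bb < mu_aa:
        # Swap allele labels: the old B allele becomes A, so its frequency
        # becomes the reported pA and the homozygote means trade places.
        p_a, mu_aa, mu_bb = 1.0 - p_a, mu_bb, mu_aa
    return p_a, mu_ab - mu_aa, mu_bb - mu_aa
```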
Oligogenic segregation analysis was performed to determine the complexity of the underlying mode of inheritance. We identified models for QTLs associated with age-at-onset for the complete sample (All1 and All3), and also for each of three centers (C50, C51, C52) separately. Segregation analysis describes the underlying QTL model(s) without use of markers or trait locus positions, with the complexity of the oligogenic trait model serving as a useful indicator of the genetic basis of the trait for further linkage analysis [Igo et al., 2006]. All segregation analyses were performed both without and with APOE as a covariate, and were based on 100,000 MCMC iterations, with every 10th iteration used for computation of posterior distributions. We used a uniform distribution between 0 and 1 for QTL allele frequencies. The prior distribution on the number of QTLs was assumed to be Poisson with mean 4. The trait locus effects were assumed to have a normally distributed prior with mean 0 and variance 512, which is about four times the phenotypic variance [Wijsman and Yu 2004].
To investigate whether certain QTL models are derived by a few individuals with the most extreme phenotypic values, we performed additional segregation analyses that excluded unaffected individuals with age < 46 or > 91 years. The values of 46 and 91 are based on minimum and maximum of ages of individuals in C52, which had a narrower range of age data in unaffected subjects than did the other two centers, and also had a less complex posterior distribution in the QTL model space. QTL models were compared first, without the young unaffected individuals (age < 46); second, without the old unaffected individuals (age > 91); and third, without both the young and old unaffected individuals for C50, C51 and All3.
We performed linkage analysis for All1, C50, C51, C52, and All3 as in the segregation analysis, carrying out analysis without and with including APOE both as a covariate and as a marker. A sex-averaged meiotic map was used with the assumption of a complete genome length of 3,000cM, with map positions based on the Haldane map function. All linkage analyses were based on 500,000 MCMC iterations, with every 10th iteration used for inference. We used a uniform distribution for the prior distribution on QTL locations. The prior distributions on other parameters were the same as the segregation analysis. Since we used a Bayesian approach, which does not result in conventional LOD scores or P-values, we report log (base 10) of the Bayes’ factor (BF) for every 2 cM bin, logBF, which is the ratio of posterior to prior odds of linkage. We use the maximum logBF, log(BFmax) in regions with high values as a test of linkage.
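The per-bin logBF described above can be sketched as follows. The posterior probability of linkage to a bin is estimated as the fraction of retained MCMC iterations that place at least one QTL in the bin. The prior probability shown here is an assumption made for illustration: with a Poisson prior (mean 4) on the number of QTLs and a uniform location prior over a 3,000 cM genome, the number of QTLs falling in a 2 cM bin is Poisson with mean 4 × 2/3000. The exact prior calculation used by the analysis software may differ, and the function names are hypothetical.

```python
import math

def prior_prob_bin(bin_cm=2.0, genome_cm=3000.0, mean_qtl=4.0):
    """Prior probability that at least one QTL falls in a bin (assumed form)."""
    return 1.0 - math.exp(-mean_qtl * bin_cm / genome_cm)

def posterior_prob_bin(iterations, lo_cm, hi_cm):
    """Fraction of retained iterations with at least one QTL in [lo_cm, hi_cm).

    iterations is a list with one entry per retained iteration; each entry
    is the list of QTL positions (cM) on this chromosome in that iteration.
    """
    hits = sum(1 for locs in iterations if any(lo_cm <= x < hi_cm for x in locs))
    return hits / len(iterations)

def log10_bf(posterior, prior):
    """log10 of the ratio of posterior odds to prior odds of linkage.

    Both probabilities must lie strictly between 0 and 1.
    """
    posterior_odds = posterior / (1.0 - posterior)
    prior_odds = prior / (1.0 - prior)
    return math.log10(posterior_odds / prior_odds)
```

Because the prior probability for a single 2 cM bin is small, even a modest posterior probability can translate into a logBF well above zero, which is why the scale is interpreted relative to prior odds rather than as a raw probability.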
Calibration of logBF as supporting evidence of linkage is approximate. However, log10 BF>1.5 corresponds to strong evidence in favor of the alternative hypothesis of linkage [Kass and Raftery 1995]. In addition, in our experience, most cases with log BF>1.5 appear to be equivalent to p<0.001, calibrated either by reference to a different calibrated linkage statistic, or by carrying out computationally expensive simulations [Igo and Wijsman 2008].
We performed additional linkage analyses on chromosome 10 using affected sibling pairs (ASPs) with onset ages ≥ 65. Previous studies of part [Holmans et al., 2005; Myers et al., 2000] or all of this sample [Myers et al., 2002] reported strong evidence for linkage to Chromosome 10 based on allele sharing in ASPs, but we did not find such evidence using MCMC linkage analysis of age-at-onset in the complete sample, which also included additional relatives (see results). This analysis was, therefore, performed to determine whether the discrepancy was explained by differences in the sample included in the analysis or by difference in the analytic approach. To make our analysis comparable with the previous results, we used 304 ASPs in 303 pedigrees (1289 total individuals) and ignored unaffected siblings.
We carried out standard non-parametric linkage analysis with the Kong and Cox [Kong and Cox 1997] linear model using Merlin [Abecasis et al., 2002], and report multipoint maximum LOD scores. The Kong and Cox LOD score is maximized on a single parameter, δ, representing the degree of allele sharing among affected individuals. Under the null hypothesis, δ=0, and higher δ corresponds to the alternative of excess allele sharing. For comparison, MCMC linkage analysis was performed on the same sample (N=1289), and the BF was computed with and without adjustment for APOE effects.
Center-specific analyses identified evidence of linkage to chromosomes 1 and 7 that appeared to be consistent with linkage signals in a recent genome scan for modifiers of age-at-onset among early-onset AD families with a known PSEN2 mutation [Marchani et al., 2010]. To determine whether these signals were centered over the same regions, we performed a joint analysis including both the large early-onset (EO) family from this previous analysis with the greatest contribution to the published linkage signals, and either the individual NIMH centers supporting linkage to these regions, or all NIMH centers. In these analyses, we allowed allele frequencies to vary between each center and the early-onset family. We merged the marker maps from the two studies, and different markers with identical positions based on a Haldane map function were jittered to be 0.1cM apart to avoid analytical constraints. Analyses included APOE as a covariate, and were based upon 2,000,000 MCMC iterations after a 1,000 iteration burn-in, saving every 50th iteration for posterior parameter estimation.
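The marker-map merging step can be illustrated with a small helper that separates markers sharing a position. This is a sketch under the assumption that tied markers are simply pushed upward so that successive markers are at least 0.1 cM apart; the text does not specify the exact jittering rule, and `jitter_positions` is a hypothetical name.

```python
def jitter_positions(positions_cm, step=0.1):
    """Return sorted map positions with tied markers spaced `step` cM apart.

    A marker that would coincide with (or fall behind) the previously placed
    marker is moved to `step` cM beyond it; other markers keep their position.
    """
    placed = []
    for pos in sorted(positions_cm):
        if placed and pos <= placed[-1]:
            pos = placed[-1] + step
        placed.append(pos)
    return placed
```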
There was evidence for heterogeneity among the samples from the three NIMH centers. The centers differed in terms of family size and phenotype distributions, as well as in the frequencies of the APOE alleles (Table 1); APOE is well known to have a significant effect on age-at-onset [Corder et al., 1993]. Center 50 had both the largest pedigrees and the highest missing data rate. Although mean ages were similar among the centers for both affected and unaffected individuals, there were fewer extreme high and low values in Center 52 than in the other centers: Centers 50 and 51 included a few young individuals with age below 40, including one affected and eight unaffected individuals for Center 50, and six unaffected individuals for Center 51; Center 52 had no such individuals. Similarly, APOE frequencies varied considerably, with Center 51 representing the most extreme values among the three centers for both the ε3 and ε4 alleles.
We found three main QTL models affecting age-at-onset of AD, and defined them based on their homozygote (BB) and heterozygote (AB) effects (Figure 1, Table 2). Two of the three models were obtained when all three centers were analyzed, both jointly as one population (All1) (Figure 1A, B), and when all three centers were permitted center-specific QTL allele frequencies (All3) (Figure 1N, O). Model 1 is slightly over-dominant, in which the trait values for the AB genotype were lower than those for either homozygote. The mean genotype effects for the AB and BB genotypes of Model 1 in analysis All3 were −7.53 and 34.37 years without APOE adjustment and −6.68 and 33.22 years with APOE adjustment, respectively. In Model 2, the heterozygote effect was higher than those of either homozygote except in the analysis of C50 alone: the mean age-at-onset for the AB and BB genotypes in analysis All3 were 9.93 and 3.13 years greater than that of the AA genotype without APOE adjustment, and 7.89 and 2.48 years greater with APOE adjustment, respectively. Model 2 was the least well-defined of the three main models, and may represent multiple smaller models that improve overall model fit, thus leading to overall larger confidence intervals upon model extraction. Model 3 is close to a dominant model, with similar effects of the AB and BB genotypes: the mean effects of the AB and BB genotypes in analysis All3 were 22.43 and 22.65 years without APOE adjustment and 20.21 and 21.11 years with APOE adjustment, respectively.
Segregation analyses performed separately for each center gave different results, suggesting the existence of among-center heterogeneity. The three centers contribute to different QTL models, so the model spaces shifted slightly depending on which center was used in the analysis. In the analysis of C50 alone, all three models were identified. The models were similar for analysis both with and without APOE adjustment (Figure 1D, F), but with smaller parameter standard deviations with APOE adjustment. In the analysis of C51 alone, Models 2 and 3 had greater overlap (Figure 1H, I): with APOE adjustment, there was no model equivalent to Model 2 and, again, parameter standard deviations were reduced relative to the analysis without APOE adjustment. For C52, the QTL model spaces were the most different from the other two centers (Figure 1K, L). Notably, there was no model equivalent to Model 1, the positions of Models 2 and 3 were also quite different from those of other centers, and there was less evidence for clarification of the QTL models with adjustment for APOE. These differences in the QTL models suggested that center effects should not be ignored in the linkage analysis.
The three QTL models were clearly separated in analysis of the full sample that allowed for different allele frequencies among the three centers (All3) (Figure 1N, O). In Model 1, frequencies of allele A for C50 and C51 were similar to each other (Table 2), but C52 had a much higher A allele frequency (that is, a rare minor allele), which is consistent with the absence of a QTL model equivalent to Model 1 in the analysis of C52 only. C52 had a different allele frequency for Model 2 as well. For Model 3, allele frequencies were similar for all three centers.
Overall, adjustment for APOE did not substantially alter QTL models, but did improve the precision of their estimates. The mean parameter values based on the posterior parameter distributions obtained for QTLs with APOE adjustment had lower standard deviations than without APOE adjustment. In addition, the number of QTLs in each model decreased under all analysis models that included an APOE adjustment (Figure 1, column 3).
Additional segregation analyses showed that Model 1 is driven by the oldest unaffected individuals (age > 91 years). This model disappeared or weakened when unaffected old individuals (N=6 in C50, N=9 in C51) were excluded (Supplementary Figure 1). For example, with APOE adjustment, removal of the oldest unaffected subjects led to the disappearance of Model 1 for C51, and a reduction of about 10 years for the homozygote effect for All3 and C50. The other QTL models (2 and 3) were essentially unaffected by exclusion of the outliers, and excluding young unaffected individuals similarly had little effect on QTL models (not shown). One possible reason for the absence of Model 1 for C52 in the segregation analysis may be that C52 includes no extremely old, unaffected individuals.
Our genomewide linkage scan found several regions with strong to moderate evidence of linkage with age-at-onset of LOAD. Figure 2 presents the results of all chromosomes across analysis conditions, including the all-centers analyses, All1 and All3. In the analysis of the complete data set, we detected two strong linkage signals, on chromosomes 6 and 19 (Figure 2A, E). We identified moderate signals on chromosomes 1 (Figure 2B, D) and 7 (Figure 2C, D), both supported by at least two centers. We also identified strong signals on chromosomes 9 and 15, each in only one specific center (Figure 2C, D). As discussed further below, results without or with including APOE as a covariate and a marker generally differed, but in most cases, the differences were modest.
As expected, strong evidence of linkage was obtained near the APOE locus on 19q when APOE was not included as either a marker or covariate (Figure 3A). This was particularly striking when the complete sample was included in the analysis (logBFmax=1.62 and 1.50 for All1 and All3 respectively, at 71–73cM). The analysis of each of C51 and C52 gave moderate evidence for linkage to the location of APOE, but with the position of the maximum BF located at slightly different positions (logBFmax=1.05, 71cM for C51; logBFmax=0.85, 97cM for C52). However, the evidence for linkage to APOE was weak when the analysis was performed for C50 alone (logBFmax=−0.28, 71cM). Consistent with an interpretation of APOE as the cause of the strong evidence for linkage at ~70cM, evidence for linkage to 19q dropped considerably when APOE was included as a covariate and a marker (Figure 3B). For all the analyses except that of C50 alone, which had little evidence for linkage to this region, the logBFmax near APOE was decreased with APOE adjustment (logBFmax=0.58 and 0.78 for All1 and All3 respectively, at 71–73cM; logBFmax=0.61, 71cM for C51; logBFmax= 0.28, 97cM for C52).
We also detected moderate evidence for linkage to 19p. In the analysis without APOE adjustment, there was only weak evidence for linkage to 19p. This signal strengthened considerably in the analysis that included APOE both as a covariate and as a marker for analysis of All1 (logBFmax=0.99, 21cM for All1) with weaker but similar results for All3 and C52 (Figure 3B).
To investigate the effect of APOE adjustment on QTL models, we compared the QTL models near APOE for analysis with and without adjustment for APOE. Figure 3C and D shows the QTL models accepted in the MCMC analysis of All3 with locations in the APOE region from 55cM to 95cM on chromosome 19 in the analysis. Without adjustment for APOE, we found a QTL model with the mean trait effects of μAB -μAA = −8 and μBB -μAA = 30, which is equivalent to Model 1 in the segregation analysis. However this QTL model disappeared when APOE was included both as a covariate and as a marker, suggesting that Model 1 represents the effect of APOE. In addition, adjustment for APOE resulted in merging of two pairs of models that represent Model 2 and Model 3. This is what would be expected from the analysis model, which models genotype effects as deviation from a baseline overall mean: in the absence of an APOE adjustment, different APOE genotypes would be represented in the baseline as sampled in the MCMC process, leading to multiple similar models with an effect, as seen here.
Other than chromosome 19q, chromosome 6p exhibited the strongest genomewide evidence of linkage in the analysis of the complete data (Figure 4, logBFmax=1.63 and 1.23 for All1 and All3 respectively, at 71cM). Two of the three centers, C51 and C52, also individually provided support for linkage to this region both with and without adjustment for APOE, with the third center providing support for linkage in the absence of APOE adjustment. To the extent that the BF provides localization, results from the different centers are all consistent with a QTL at the same location, since the logBF maximizes at the same position in analysis of All1 and All3 as well as the data from individual centers. These results are similar to those obtained for chromosome 19q near APOE, which was also supported most strongly by C51 and C52. This signal on chromosome 6 decreased considerably when APOE was included as a covariate and a marker in the linkage analysis (Figure 4B vs. E).
There was moderate support for linkage to chromosome 7q across centers, with strong support in one family. Without APOE adjustment, one moderate signal was detected in the analysis of C51 (Figure 4C, logBFmax =1.19, 167cM). This signal became much stronger with APOE adjustment (logBFmax= 1.67, 167cM for C51) with supporting evidence at the same region in the analysis of C52 (Figure 4F). In the analysis of All1 and All3, we found linkage signals at ~167cM, but these were not as strong as in the analysis of C51. The logBF was essentially coincident in the analysis of all vs. only individual centers, consistent with a single contributing locus in the separate samples. As for chromosome 19, when the analysis was performed for C50 only, the evidence for linkage was very weak on the entire chromosome without or with APOE adjustment.
There was moderately strong evidence for linkage to chromosome 1 (Figure 4A, D). This evidence was strongest for the individual center C52 (logBFmax=1.08, 169cM) without APOE adjustment (Figure 4A). There was supporting evidence for linkage from C50 (logBFmax=0.68, 187cM), but this signal dissipated when APOE was included as a covariate (Figure 4D). In analyses of C51, with or without APOE adjustment, there was no evidence of linkage to chromosome 1.
Regions with evidence for linkage on chromosomes 9 and 15 were obtained in single centers only (Figure 2). For both chromosomes, results were insensitive to APOE adjustment, although in both cases APOE adjustment weakened signals slightly. Although there was weak evidence for linkage at ~133 cM on chromosome 9 in the analysis of All3, strong evidence for linkage was obtained for this region for C52, logBFmax= 1.69 and 1.44 without and with APOE adjustment, respectively (Supplementary Figure 4A). Similarly strong evidence for linkage at ~41cM on chromosome 15 was detected for C51 (logBFmax= 1.34 and 1.11 without and with APOE adjustment, respectively, Supplementary Figure 5A). Presumably, the weaker evidence of linkage to both these regions in the complete data (All3) is explained by the lack of support from the other two centers.
Similar to the model space observed on chromosome 19, the QTL models on chromosome 9, and to a lesser extent on chromosome 15, were affected by APOE adjustment. On chromosome 9 there were parallel distributions of models evident in the absence of adjustment for APOE (Supplementary Figure 4B). These models combined into a single model after adjustment for APOE (Supplementary Figure 4C). Similar, but weaker, evidence for QTL models was obtained for chromosome 15, and again parallel models merged when APOE was included as a covariate (Supplementary Figure 5B, C). The most probable explanation for this merging of models with APOE adjustment is again the structure of the model as genotypic effects relative to a baseline, with multiple possible APOE baseline genotypes.
Evidence for linkage of age-at-onset to chromosome 10 was very weak. MCMC and non-parametric linkage analyses of 303 ASPs suggest that evidence for linkage to chromosome 10 is specific to use of ASPs, rather than to use of the more extended families and unaffected subjects. In MCMC linkage analysis of the ASPs, we found very weak support for linkage to chromosome 10, with a maximum logBF=0.45 (BF=2.81) obtained at 109 cM, centered over the same position as the maximum LOD score. When all subjects were included, evidence for linkage dropped considerably: in the analysis of All3, the evidence for linkage to the region ~109 cM dropped to logBF=−0.52 (BF=0.3). In the non-parametric linkage analysis of 303 ASPs, we found suggestive evidence of linkage (LOD=1.57, 107.88 cM), which is similar to the MCMC linkage analysis of 303 ASPs, and similar to the analysis presented originally on the complete sample [Blacker et al., 2003] including non-European Americans. This result suggests that absence of evidence for linkage of age-at-onset to chromosome 10 is explained by use of all European Americans in our analysis, instead of only ASPs with onset ages ≥ 65.
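The relation between the logBF and BF values quoted for chromosome 10 can be checked with a short calculation. The sketch below assumes that logBF denotes a base-10 logarithm, an assumption that is consistent with the quoted pairs but is not stated in this section; the function name `bayes_factor` is illustrative only and is not part of any analysis software used in the study.

```python
def bayes_factor(log10_bf: float) -> float:
    """Convert a base-10 log Bayes factor to the Bayes factor itself.

    Assumption: logBF in the text is log10(BF).
    """
    return 10.0 ** log10_bf


# logBF values quoted in the text for the ~109 cM region on chromosome 10.
asp_only = bayes_factor(0.45)       # MCMC analysis restricted to the ASPs
all_subjects = bayes_factor(-0.52)  # MCMC analysis of All3

print(f"ASPs only: BF = {asp_only:.2f}")
print(f"All3:      BF = {all_subjects:.2f}")
```

Under this assumption the converted values agree with the reported BF=2.81 and BF=0.3 to within the rounding of the two-decimal logBF values. A BF below 1, as in the All3 analysis, indicates that the data favor the absence of linkage at that position over its presence.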
Evidence of linkage of age-at-onset to chromosomes 1 and 7 strengthened with the inclusion of an early onset AD family with a known PSEN2 mutation (Table 3), with evidence of age-at-onset modifier loci on these chromosomes [Marchani et al., 2010]. The stronger linkage signal was detected at the same region of chromosome 7 for the analysis of C51 (logBFmax=1.43 and 1.91 without and with APOE adjustment, respectively, at 167 cM). On chromosome 1, the linkage signal in the analysis of C52 was also strengthened both without (logBFmax=1.10, 170 cM) and with (logBFmax=1.21, 195 cM) APOE adjustment, but the regions were slightly shifted compared to those identified in the analysis of NIMH data alone, especially when APOE was included.
Our results provide evidence for multiple loci affecting age-at-onset of AD in the complete European-American NIMH sample, with heterogeneity across data collection centers. For this data set, evidence obtained here for linkage to the regions identified on chromosomes 7q, 15, and 19p is new. Evidence for loci on chromosomes 1q, 6p, 9q, 11, and 19q has been reported previously with an AD risk model and/or an age-at-onset model [Blacker et al., 2003; Dickson et al., 2008; Holmans et al., 2005]. The approach to handling age-at-onset information used here for the first time allowed determination of the center(s) that contribute to individual linkage signals. This demonstrated that evidence for linkage of age-at-onset to chromosomes 6p, 7q, 11 and 19q was supported by more than one center, while evidence for linkage to chromosomes 1q, 9q, 15, and 19p was obtained primarily within individual centers. These results also demonstrate the advantage of analysis of this sample with a more sophisticated and complex model than has been previously used.
Differences between center-specific and global results support heterogeneity across centers. This is supported by differences in both sample composition and in results from the genome scans. In the multi-center NIMH sample, sample heterogeneity that contributes to genetic heterogeneity may be the consequence of variation in sample ascertainment as well as geographically correlated genetic ancestry differences across centers. Even though inclusion criteria were similar across the collection sites, de facto ascertainment differences are suggested by differences in the resulting sample characteristics, including pedigree sizes and age ranges. This could inadvertently have an effect on the resulting genetic architecture of the samples. Evidence for differences in genetic background is suggested by differences in APOE allele frequencies among centers, as well as by differences in the QTL models obtained by oligogenic segregation analysis. Such genetic heterogeneity may be inevitable in any sample collected from geographically distinct regions. For example, two of the centers that contributed subjects to the NIMH sample used here also contributed subjects to the NIA LOAD sample, with a recent analysis identifying substantially different fractions of subjects with southern-European and Ashkenazi Jewish ancestral origins between those two centers [Wijsman et al., 2011]. While heterogeneity may be impossible to eliminate in large samples, knowledge of which data components contribute to evidence of linkage in any specific genomic region is useful for further studies leading to gene identification.
Other than the APOE region, the regions implicated in this sample differ from those identified by association-based approaches [Bertram et al., 2007]. This is not surprising, since factors that affect power to detect trait loci differ for linkage-based vs. population-based designs. For example, the strength of evidence for linkage near APOE in the individual NIMH centers is inversely related to the APOE ε4 allele frequency in the centers, as would be expected by the known deleterious effect of high allele frequencies on power to detect evidence of linkage [Risch and Merikangas 1996]. In contrast, association-based approaches do poorly in the presence of rare alleles – the situation that favors linkage-based designs. The linkage-based design is therefore more likely than the population-based design to lead to identification of loci with rare risk alleles that have large differences in genotype-specific risks. Identification of such loci is also likely to translate into easier subsequent mechanistic studies than would use of genotypes with the very small differences in risks currently being identified in population-based studies. Possible approaches for identifying loci in regions with evidence of linkage require dense genotyping in the pedigrees with evidence for linkage, as discussed elsewhere [Blangero 2009; Blangero et al., 2009; Wijsman et al., 2010]. The complementary strengths of different designs support the importance of tackling the genetic basis of AD with multiple approaches, since it is unlikely that any single approach will identify all of the loci underlying genetic variance in age-at-onset or disease risk.
The results obtained here provide replication and additional support for several previously implicated regions from other AD samples. The strong evidence for linkage on chromosome 7q reported here overlaps a previously reported linkage signal in the analysis of age-at-onset modifiers of early onset AD [Marchani et al., 2010], age-at-onset in a Caribbean Hispanic sample [Lee et al., 2008], and AD status in a Dutch sample [Rademakers et al., 2005]. A candidate gene in this region is NOS3 [Bertram et al., 2007]. Similarly, evidence for linkage of age-at-onset of AD to chromosome 1, previously also reported in the NIMH sample in analysis of AD status [Blacker et al., 2003; Kehoe et al., 1999] and AD age-at-onset [Dickson et al., 2008], has also been reported in the analysis of age-at-onset in a Caribbean Hispanic sample [Lee et al., 2008], as an age-at-onset modifier in early onset AD [Marchani et al., 2010], and in analysis of AD status in other independent samples [Hiltunen et al., 2001; Liu et al., 2007]. This region on chromosome 1 is notable for the location of nicastrin, which is a component of γ-secretase along with presenilin 1 and additional subunits [Kimberly et al., 2003], and is also one of the three regions with strongest support in a meta-analysis of genome scans for AD-risk susceptibility [Butler et al., 2009]. Regions identified on chromosomes 6 and 15 contain candidate genes VEGF and ADAM10, respectively [Bertram et al., 2007]. The region on chromosome 19p identified here for the first time was previously reported in two different samples in an analysis of dementia risk [Hahs et al., 2006] and of age-at-onset [Wijsman et al., 2004], with similar strengthening of evidence with adjustment for APOE as seen here. This region also was highlighted in meta-analysis [Butler et al., 2009], and contains several candidate genes, including LDLR [Bertram et al., 2007].
The region on chromosome 19q with strong linkage evidence is, of course, expected because it contains APOE, the strongest known LOAD genetic risk factor [Corder et al., 1993]. In the absence of adjustment for APOE, the variation among centers with respect to the position of the maximum BF in the analyses of the data from individual centers may simply reflect the use of a diallelic QTL model to analyze an effect governed by the triallelic APOE locus [Rosenthal and Wijsman 2010]. For the 19q region, this hypothesis is supported by the strong reduction in evidence for linkage to this region upon adjustment for APOE. However, for other regions with some variability in the location of the maximum BF, such as on chromosome 1, more data would be needed to begin to distinguish between hypotheses of multiallelic QTLs, multiple loci, or simple statistical variability in location estimates.
Identification of several regions that appear to replicate results in other, independent, samples is encouraging, as is the evidence that such regions can be detected with either age-at-onset or disease risk as an endpoint phenotype. The genomic regions in the NIMH sample that may be most amenable to further investigation are such regions that coincide with those in genome scans of other samples. These regions may contain relatively rare variants of interest, since this architecture best explains the replication across only a subset of samples and is more readily detected in a family-based than population-based design. Fortunately, rapidly developing technologies are increasingly making it possible to search directly for such variants, rather than having to depend on indirect association with tag-SNPs. These newer technologies, coupled with the existence of multiple samples that support particular regions, are likely to facilitate identification of the causal sites in future studies.
Supplementary Figure 1. Posterior distribution of models obtained from MCMC oligogenic segregation analysis with adjustment for APOE on the complete sample (top row), and on the sample after excluding the information on 15 unaffected subjects with age > 91 years from C50 and C51 (bottom row). Analyses are shown for All3 (panels A, D), C50 (panels B, E), and C51 (panels C, F).
Supplementary Figure 2. Color version of Figure 2, showing results of genome scan for age-at-onset of AD. For each panel, magenta line: log(BF) obtained without adjustment for APOE; cyan line: log(BF) obtained with adjustment for APOE as a major gene covariate. Panel A: All1; B: Center 50; C: Center 51; D: Center 52; and E: All3. Vertical dotted lines depict boundaries between chromosomes; horizontal dotted line is at logBF=1.
Supplementary Figure 3. Color version of Figure 4 showing MCMC multipoint oligogenic linkage analyses for chromosome 1 (panels A, D), 6 (panels B, E) and 7 (panels C, F), with upper row (panels A–C) carried out without adjustment for APOE, and lower row (panels D–F) carried out with adjustment for APOE. Analysis configurations are represented by magenta: All1; orange: Center 50; green: Center 51; cyan: Center 52; purple: All3.
Supplementary Figure 4. Chromosome 9, linkage analysis of All3 and C52 (panel A), and QTL models without (panel B) and with (panel C) adjustment for APOE.
Supplementary Figure 5. Chromosome 15, linkage analysis of All3 and C51 (Panel A), and QTL models without (panel B) and with (panel C) adjustment for APOE.
Financial support was provided by National Institutes of Health (NIH) grants P50 AG005136 and T32 AG000258 from the National Institute on Aging (NIA), Veterans Affairs research funds, and an anonymous foundation. Data and biomaterials for the NIMH sample were collected in three projects that participated in the National Institute of Mental Health (NIMH) Alzheimer Disease Genetics Initiative, funded from 1991 to 1998 by U01 MH46281 (MS Albert, D Blacker), U01 MH46290 (S Bassett, GA Chase, MF Folstein) and U01 MH46373 (RCP Go, LE Harrell). Genotyping services were provided by the Center for Inherited Disease Research, which is fully funded through a federal contract from the NIH to The Johns Hopkins University, Contract Number N01 HG65403.