A recently published genome-wide association study (GWAS) of late-onset Alzheimer's disease (LOAD) revealed genome-wide significant association of variants in or near MS4A4A, CD2AP, EPHA1 and CD33. Meta-analyses of this and a previously published GWAS revealed significant association at ABCA7 and MS4A, independent evidence for association of CD2AP, CD33 and EPHA1 and an opposing yet significant association of a variant near ARID5B. In this study, we genotyped five variants (in or near CD2AP, EPHA1, ARID5B, and CD33) in a large (2,634 LOAD, 4,201 controls), independent dataset comprising six case-control series from the USA and Europe. We performed meta-analyses of the association of these variants with LOAD and tested for association using logistic regression adjusted by age-at-diagnosis, gender, and APOE ε4 dosage.
We found no significant evidence of series heterogeneity. Associations with LOAD were successfully replicated for EPHA1 (rs11767557; OR = 0.87, p = 5 × 10⁻⁴) and CD33 (rs3865444; OR = 0.92, p = 0.049), with odds ratios comparable to those previously reported. Although the two ARID5B variants (rs2588969 and rs4948288) showed significant association with LOAD in meta-analysis of our dataset (p = 0.046 and 0.008, respectively), the associations did not survive adjustment for covariates (p = 0.30 and 0.11, respectively). We had insufficient evidence in our data to support the association of the CD2AP variant (rs9349407, p = 0.56).
Our data overwhelmingly support the association of EPHA1 and CD33 variants with LOAD risk: addition of our data to the results previously reported (total n > 42,000) increased the strength of evidence for these variants, providing impressive p-values of 2.1 × 10⁻¹⁵ (EPHA1) and 1.8 × 10⁻¹³ (CD33).
Following the identification of the APOE ε4 allele as a risk factor for late-onset Alzheimer's disease (LOAD) in 1993, consistent replication of subsequently identified candidates was not achieved until 2009, when two genome-wide association studies (GWAS) [2,3] identified associations of variants in or near CLU, PICALM, and CR1 with LOAD, which were consistently replicated in multiple large, independent case-control studies [4-17]. Subsequently, a GWAS published in 2010 reported genome-wide significant association of a variant near BIN1, which also replicated well in follow-up studies [14-19]. These results demonstrate the utility of the hypothesis-free GWAS approach for identifying loci that associate with LOAD and the necessity of pooling samples and data from multiple centers to obtain resources with sufficient statistical power (GWAS typically > 14,000, follow-up typically total > 28,000) to detect the modest ORs (e.g. 0.8/1.2) associated with these variants in GWAS and follow-up studies.
Two recently published companion studies by Hollingworth et al. and Naj et al. performed meta-analysis of two large GWAS datasets (n > 75,000). Besides APOE, CLU, PICALM, and CR1, the meta-analyses revealed association at ABCA7 (p = 5 × 10⁻²¹), MS4A6A (p = 1.2 × 10⁻¹⁶), MS4A4E (p = 1.1 × 10⁻¹⁰), EPHA1 (p = 6 × 10⁻¹⁰), CD2AP (p = 8.6 × 10⁻⁹) and CD33 (p = 1.6 × 10⁻⁹). In addition, the two datasets revealed opposing association (Naj et al. OR = 0.93, p = 0.001; Hollingworth et al. OR = 1.06, p = 0.03) of the variant near ARID5B (rs2588969) with LOAD, suggesting potential heterogeneity at this locus. In this study, we genotyped the variants identified at the CD2AP, EPHA1, and CD33 loci in our independent case-control dataset comprising six case-control series (n = 6,835). To assess the opposing associations at the ARID5B locus, we also genotyped the two ARID5B variants included in the Hollingworth et al. study. Genotypes from our follow-up case-control series (Mayo2) for variants in ABCA7, MS4A6A and MS4A4E were included in Stage 3 of the Hollingworth et al. study, so we have not included these three variants in this study. We have performed meta-analyses of five variants (at the CD2AP, EPHA1, ARID5B and CD33 loci) in our six case-control series, which showed no significant series heterogeneity. Furthermore, we have performed logistic regression analysis of our pooled series adjusting for covariates. Finally, we have used a Fisher's combined test to evaluate the significance of the association of these five variants in our data combined with the data in the Hollingworth et al. and Naj et al. studies.
We genotyped five variants (CD2AP: rs9349407; EPHA1: rs11767557; ARID5B: rs2588969 and rs4948288; CD33: rs3865444) in our independent follow-up case-control series (Mayo2) from three North American and three European Caucasian series. Detailed information about these samples is shown in Table 1 and genotype counts are shown in Table 2. Samples used in this study do not overlap with those included in the Naj et al. study and have not been included in any of the published LOAD GWAS. The Mayo2 dataset included in the Hollingworth et al. publication only included genotypes for ABCA7, MS4A6A and MS4A4E.
Meta-analyses of allelic association in the six Mayo2 series performed using a DerSimonian-Laird random effects model (Figure 1) revealed a significant pooled OR for the EPHA1 variant (Figure 1b; OR = 0.88, p = 0.008) comparable to that previously published by Naj et al. (OR = 0.87) and by Hollingworth et al. (OR = 0.90). As shown in Figures 1c and 1d, we also observed significant association for both ARID5B variants (rs2588969, OR = 1.08, p = 0.046; rs4948288, OR = 1.11, p = 0.008) with ORs comparable to those reported by Hollingworth et al. (OR = 1.06 and 1.07, respectively) and in the opposing direction to those reported by Naj et al. for rs2588969 (Stage 1+2 OR = 0.93, p = 7.7 × 10⁻⁴). As shown in Figures 1a and 1e, we did not observe significant association for CD2AP (OR = 0.98, p = 0.76) or CD33 (OR = 0.96, p = 0.32) in our meta-analyses. Breslow-Day tests provided no significant evidence that the ORs for any of these variants were heterogeneous among our series (all p > 0.25), as shown in Figure 1. The variant with the most heterogeneity was CD2AP (rs9349407), where the estimated percentage of variation due to heterogeneity across studies (I²) was 25.1% (95% CI 0%-70%), suggesting the presence of some heterogeneity for that variant.
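The random-effects pooling described above can be illustrated with a minimal Python sketch. It is not the StatsDirect implementation used in the study: the function names are hypothetical, the allele counts are invented for illustration (the real genotype counts are reported in the paper's tables), and the p-value is a simple two-sided Wald test on the pooled log OR.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log OR and its variance from a 2x2 allele-count table.

    a, b: minor/major allele counts in cases; c, d: the same in controls.
    """
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def dersimonian_laird(tables):
    """Pool per-series allelic ORs with the DerSimonian-Laird estimator."""
    effects, variances = zip(*(log_odds_ratio(*t) for t in tables))
    w = [1 / v for v in variances]                      # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = len(tables) - 1
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0  # between-series variance
    w_re = [1 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    p = math.erfc(abs(pooled / se) / math.sqrt(2))      # two-sided normal p-value
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I-squared, in percent
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "p": p,
        "tau2": tau2,
        "i2": i2,
    }


# Invented allele counts for six series: (case minor, case major, control minor, control major)
example_tables = [
    (160, 640, 300, 1000),
    (90, 410, 150, 550),
    (120, 480, 260, 940),
    (70, 330, 110, 440),
    (200, 800, 330, 1170),
    (60, 240, 95, 335),
]
result = dersimonian_laird(example_tables)
print(f"pooled OR = {result['or']:.2f}, p = {result['p']:.3g}, I2 = {result['i2']:.1f}%")
```

When the between-series variance estimate is zero, the random-effects weights reduce to the fixed-effect inverse-variance weights, so the two models give the same pooled OR.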
To adjust for important covariates, we included age-at-diagnosis/entry, sex and APOE ε4 dosage in logistic regression analyses of all five variants in each of the six Mayo2 series; in our analysis of all Mayo2 series combined, series was included as an additional covariate. Table 3 shows the results for the six Mayo2 series combined (Mayo follow-up) as well as for each of the six individual Mayo2 series. For the purpose of comparison, we have also included in Table 3 the published GWAS results for the same variants. Adjustment for covariates revealed comparable ORs to those obtained in the meta-analyses, with improved p-values for the EPHA1 (OR = 0.87, p = 5 × 10⁻⁴), CD33 (OR = 0.92, p = 0.049) and CD2AP (OR = 0.97, p = 0.56) loci. However, the associations of the ARID5B variants were no longer significant following adjustment for covariates (rs2588969: OR = 1.05, p = 0.30, rs4948288: OR = 1.07, p = 0.11), suggesting that these associations may be dependent upon the series, age-at-diagnosis/entry, sex and/or APOE ε4 dosage of the individual.
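The covariate-adjusted analysis was run in PLINK; the following Python sketch only illustrates the underlying additive logistic model, logit P(LOAD) = β0 + β1 × (minor allele dosage) + covariate terms, fitted by Newton-Raphson. The function names and the toy data are assumptions made for illustration. In a real analysis each row would also carry age-at-diagnosis/entry, sex, APOE ε4 dosage and series indicators after the genotype dosage.

```python
import math


def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def _score_and_information(X, y, beta):
    """Gradient of the log-likelihood and the Fisher information matrix."""
    p = len(beta)
    grad = [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    for xi, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, xi))
        mu = 1.0 / (1.0 + math.exp(-eta))
        for j in range(p):
            grad[j] += (yi - mu) * xi[j]
            for k in range(p):
                info[j][k] += mu * (1.0 - mu) * xi[j] * xi[k]
    return grad, info


def fit_logistic(rows, y, max_iter=50, tol=1e-10):
    """Fit a logistic model; rows hold predictors (no intercept), y holds 0/1 status.

    Returns (coefficients, standard errors, two-sided Wald p-values),
    with the intercept in position 0.
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]
    beta = [0.0] * len(X[0])
    for _ in range(max_iter):
        grad, info = _score_and_information(X, y, beta)
        step = _solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    _, info = _score_and_information(X, y, beta)
    n = len(beta)
    se = [
        math.sqrt(_solve(info, [1.0 if i == j else 0.0 for i in range(n)])[j])
        for j in range(n)
    ]
    pvals = [math.erfc(abs(b / s) / math.sqrt(2)) for b, s in zip(beta, se)]
    return beta, se, pvals


# Toy data: a single binary predictor, 100 subjects per predictor level.
toy_rows = [[0]] * 100 + [[1]] * 100
toy_status = [1] * 30 + [0] * 70 + [1] * 50 + [0] * 50
coef, std_err, p_values = fit_logistic(toy_rows, toy_status)
print(f"OR per unit of predictor = {math.exp(coef[1]):.3f}, p = {p_values[1]:.3g}")
```

With a single binary predictor and no covariates, the fitted OR coincides with the OR of the corresponding 2 × 2 table, which provides a quick sanity check before covariate columns are added.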
To estimate the overall association of these five variants in our data combined with the previously published associations, we used Fisher's method to combine the p-values for all associations (Table 3; Mayo2/ADGC/Hollingworth). Adding our data to those previously reported increased the strength of evidence for all variants as LOAD risk modifiers (CD2AP: p = 6.5 × 10⁻¹¹, EPHA1: p = 2.1 × 10⁻¹⁵, ARID5B rs2588969: p = 2.3 × 10⁻⁹, ARID5B rs4948288: p = 4.0 × 10⁻⁴, CD33: p = 1.8 × 10⁻¹³).
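Fisher's method combines k independent p-values through the statistic X = -2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The sketch below is a minimal illustration: the function name is hypothetical and the input p-values are invented, so it does not reproduce the combined values reported here. Note that the method ignores the direction of effect, which matters when studies report opposing ORs.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Uses the closed-form chi-square survival function for even degrees of
    freedom: sf(x; 2k) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Invented p-values for one variant in three independent datasets.
example_p = fisher_combined_p([5e-4, 1e-6, 2e-3])
print(f"combined p = {example_p:.3g}")
```

For two studies the result reduces to p1 × p2 × (1 - ln(p1 × p2)), which is a convenient check by hand.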
We report here successful replication of the association of two variants with LOAD in a large (n = 6,835), independent case-control study: rs11767557, which is located 3 kb upstream of EPHA1 (p = 5 × 10⁻⁴), and rs3865444, which is located 373 bp upstream of CD33 (p = 0.049). The ORs we observed in our meta-analyses (EPHA1 = 0.88, CD33 = 0.96) were comparable to those reported by both Naj et al. (EPHA1 = 0.87, CD33 = 0.89) and by Hollingworth et al. (EPHA1 = 0.90, CD33 = 0.89) such that the estimated p-values for association of these variants in all data (n > 42,000) were an impressive 2.1 × 10⁻¹⁵ for EPHA1 and 1.8 × 10⁻¹³ for CD33.
Although our meta-analyses showed successful replication of the association of the ARID5B variants rs2588969 (OR = 1.08, p = 0.046) and rs4948288 (OR = 1.11, p = 0.008) with a direction of association consistent with that reported by Hollingworth et al. (OR = 1.06 and 1.07, respectively), the associations did not survive adjustment for age-at-diagnosis/entry, sex and APOE ε4 status (p = 0.30 and 0.11, respectively). This covariate-dependent association could explain the opposing association reported by Naj et al. in their discovery (OR = 0.88) and replication (OR = 1.05) datasets for rs2588969, the only ARID5B variant they tested. Therefore, while the estimated p-values for association of the ARID5B variants in all datasets combined were highly significant (rs2588969: p = 2.3 × 10⁻⁹; rs4948288: p = 4.0 × 10⁻⁴), these associations should be interpreted with caution, taking into account the age-at-diagnosis/entry, sex and APOE ε4 dosage of the populations. Finally, although the estimated p-value for association of rs9349407 (located in intron 1 of CD2AP) in all datasets was 6.5 × 10⁻¹¹, there was no evidence for association of this variant in our dataset alone (OR = 0.97, p = 0.56).
Our Mayo2 collection of case-control series provided a total of 2,634 LOAD cases and 4,201 controls. Combining across series to perform global tests of significance for additive genotypic trend tests gave us 80% power to detect ORs ranging from 1.17 (or 0.855) for variants with a minor allele frequency (MAF) of 0.2 to 1.13 (or 0.883) for variants with a MAF of 0.45 in controls. The study provided approximately 60% power to detect the OR of 1.11 that we report for CD2AP (MAF = 0.27).
Case-control studies such as this are not designed to ascertain whether the variants with reported association with LOAD risk are the functional variants, but they can identify a linkage disequilibrium (LD) block within which a truly functional variant may reside. Our results indicate that the EPHA1 and CD33 variants represent excellent candidates for targeted deep sequencing or high density genotyping in order to define the locus further, followed by subsequent functional studies of nearby genes to elucidate the mechanism behind these associations. With the exception of rs9349407, which lies within intron 1 of CD2AP, all of these variants lie within intergenic regions, but for ease of reading we have thus far referred only to the nearest gene for each variant. This by no means signifies that these variants (or the functional variants in LD with them) are assumed to affect the expression or function of the nearest gene; they may instead affect other nearby genes. Until it is known which gene underlies these associations, all nearby genes should be included in follow-up functional investigation (all genes that reside within 100 kb of these variants are listed in Additional file 1, Table S1).
Taken together with our previous publications [5,18,20,21], we have now performed follow-up association studies of 25 of the top GWAS-identified candidate LOAD genes and successfully replicated the association of eleven variants (in or near ABCA7, BIN1, CD33, CLU, CR1, EPHA1, GAB2, LOC651924, MS4A6A/4E and PICALM), eight of which are currently ranked in the top ten (after APOE) on AlzGene. This recent success in replicating genetic association highlights the utility of multiple, large case-control follow-up studies to confirm the novel associations reported by large GWAS, thus confirming them as good candidate genes for functional follow-up studies.
Approval was obtained from the ethics committee or institutional review board of each institution responsible for the ascertainment and collection of samples. Written informed consent was obtained for all individuals who participated in this study.
The Mayo2 case-control series consisted of Caucasian subjects from the United States ascertained at the Mayo Clinic Jacksonville, Mayo Clinic Rochester, or through the Mayo Clinic Brain Bank. Additional Caucasian subjects from Europe were obtained from Norway, Poland, and from six research institutes in the United Kingdom that are part of the Alzheimer's Research UK (ARUK) Network. Although the ARUK samples used in this follow-up do not overlap with those employed in the original GWAS publication by Hollingworth et al., the same subject/sample ascertainment methodology was followed. The ARUK series included here are from Bristol, Leeds, Manchester, Nottingham, Oxford and Southampton. Since the Manchester cohort only consisted of LOAD cases, the Manchester cases were combined with subjects in the Nottingham series.
All genotyping was performed at the Mayo Clinic in Jacksonville using TaqMan® SNP Genotyping Assays in an ABI PRISM® 7900HT Sequence Detection System with 384-Well Block Module from Applied Biosystems, California, USA. The genotype data were analyzed using the SDS software version 2.2.3 (Applied Biosystems, California, USA).
Meta-analysis of allelic association and Breslow-Day tests were performed using StatsDirect v2.5.8 software. Meta-analyses were performed using the results from each individual case-control series. Summary ORs and 95% CI were calculated using the DerSimonian and Laird (1986) random-effects model. Breslow-Day tests were used to test for heterogeneity between populations. PLINK software (http://pngu.mgh.harvard.edu/purcell/plink/) was used to perform logistic regression analysis under an additive model adjusting for age-at-diagnosis, sex and APOE ε4 dose as covariates. In our analysis of all series combined, series was included as an additional covariate. Since genotype counts were not reported for series included in the Naj et al. or Hollingworth et al. studies, we employed a Fisher combined test to combine p-values across series. Power calculations, based on a Mantel-Haenszel chi-square test that pooled across six different study groups, were obtained to estimate the detectable odds ratios for an ordinal effect using a range of minor allele frequencies spanning those expected from the candidate variants.
ABCA7: ATP-binding cassette, sub-family A (ABC1), member 7; AD: Alzheimer's disease; ADGC: Alzheimer's Disease Genetics Consortium; APOE: apolipoprotein E; ARID5B: AT rich interactive domain 5B (MRF1-like); ARUK: Alzheimer's Research United Kingdom; BIN1: bridging integrator 1; bp: base pair; CD2AP: CD2-associated protein; CD33: CD33 molecule; CI: confidence interval; CLU: clusterin; CR1: complement component (3b/4b) receptor 1 (Knops blood group); EPHA1: EPH receptor A1; GAB2: GRB2-associated binding protein 2; GERAD: Genetic and Environmental Risk in Alzheimer's Disease Consortium; GWAS: genome-wide association study; kb: kilobases; LD: linkage disequilibrium; LOAD: late-onset Alzheimer's disease; MAF: minor allele frequency; MS4A4A: membrane-spanning 4-domains, subfamily A, member 4; OR: odds ratio; PICALM: phosphatidylinositol binding clathrin assembly protein; SD: standard deviation.
The authors declare that they have no competing interests.
Study concept and design: MMC and SGY. Sample Collection and Diagnosis: ARUK, DWD, JOA, MB, NRG-R, RCP, SBS, and ZKW. Genotyping: MMC and TAH. DNA Sample Preparation: GDB, ML and ZFG. Analysis and interpretation of data: JEC, KM, MMC, OB, SGY and VSP. Drafting of the manuscript: MMC and OB. Critical revision of the manuscript for important intellectual content: KM, MMC, OB, SGY and VSP. Study supervision: KM, MMC and SGY. All authors have read and approved the final manuscript.
Table S1. Genes located within 100 kb of the five variants tested in this study. Chr, chromosome. Base pair positions (bp) are relative to the NCBI Human Genome build 36.1. The position of the variant relative to the gene is given as 5' (upstream from the gene's transcription start site) or 3' (downstream from the gene's last exon). Distance indicates the number of base pairs from the variant position to the gene's nearest exon.
We thank contributors, including the Alzheimer's disease centers who collected samples used in this study, as well as subjects and their families, whose help and participation made this work possible. We thank the members of the Alzheimer's Research UK (ARUK) consortium who contributed samples to the ARUK resource. This work was supported by grants from the US National Institutes of Health, NIA R01 AG18023 (NRG-R, SGY); Mayo Alzheimer's Disease Research Center, P50 AG16574 (RCP, DWD, NRG-R, SGY); Mayo Alzheimer's Disease Patient Registry, U01 AG06576 (RCP); and US National Institute on Aging, AG25711, AG17216, AG03949 (DWD). Samples from the National Cell Repository for Alzheimer's Disease (NCRAD), which receives government support under a cooperative agreement grant (U24AG21886) awarded by the National Institute on Aging (NIA), were used in this study. This project was also generously supported by the Robert and Clarice Smith Postdoctoral Fellowship (MMC); Robert and Clarice Smith and Abigail Van Buren Alzheimer's Disease Research Program (RCP, DWD, NRG-R, SGY) and by the Palumbo Professorship in Alzheimer's Disease Research (SGY). KM is funded by the Alzheimer's Research UK and the Big Lottery Fund. ZKW is partially supported by the NIH/NINDS 1RC2NS070276, NS057567, P50NS072187, Mayo Clinic Florida (MCF) Research Committee CR programs (MCF #90052018 and MCF #90052030), Dystonia Medical Research Foundation, and the gift from Carl Edward Bolch, Jr., and Susan Bass Bolch (MCF #90052031/PAU #90052). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
The Alzheimer's Research UK Consortium: Peter Passmore, David Craig, Janet Johnston, Bernadette McGuinness, Stephen Todd, Queen's University Belfast, UK; Reinhard Heun (now at Royal Derby Hospital), Heike Kölsch, University of Bonn, Germany; Patrick G. Kehoe, University of Bristol, UK; Nigel M. Hooper, Emma R.L.C. Vardy (now at University of Newcastle), University of Leeds, UK; David M. Mann, University of Manchester, UK; Kristelle Brown, Noor Kalsheker, Kevin Morgan, University of Nottingham, UK; A. David Smith, Gordon Wilcock, Donald Warden, University of Oxford (OPTIMA), UK; Clive Holmes, University of Southampton, UK.