Alzheimer’s disease is a complex and progressive neurodegenerative disease leading to loss of memory, cognitive impairment, and ultimately death. To date, six large-scale genome-wide association studies have been conducted to identify SNPs that influence disease predisposition. These studies have confirmed the well-known APOE ε4 risk allele, identified a novel variant that influences disease risk within the APOE ε4 population, found a SNP that modifies the age of disease onset, and reported the first sex-linked susceptibility variant. Here we report a genome-wide scan of Alzheimer’s disease in a set of 331 cases and 368 controls, extending analyses for the first time to include assessments of copy number variation. In line with previous reports, no new SNPs show genome-wide significance. We also screened for effects of copy number variation, and while no finding reached significance, a duplication in CHRNA7 appears interesting enough to warrant further investigation.
Alzheimer’s disease (AD) is the most common form of dementia and is characterized by progressively worsening cognitive function. Although the etiology of AD is not fully understood, it is characterized neuropathologically by neuronal and synapse loss, gliosis, and the accumulation of intraneuronal neurofibrillary tangles and extraneuronal amyloid deposits.
Family history is a key indicator of disease risk as it is estimated that the heritability of this disorder approaches 80%. There are two recognized forms of AD, comprising autosomal dominant familial AD and sporadic AD. The familial form is rare, typically occurs before the age of 60, and accounts for < 1% of all AD cases. This disorder is largely attributed to rare mutations in the gene encoding amyloid-β protein precursor (AβPP) and mutations in presenilin-1 (PSEN1) and presenilin-2 (PSEN2) [5,6]. Sporadic AD, also termed late onset because the majority of cases occur after the age of 60, is the more common form. The most well-defined susceptibility gene for this form of AD is APOE. This gene, located on chromosome 19, is responsible for the production of a protein that transports cholesterol and other fats throughout the body. In the presence of the ε4 variant of this gene, there is increased risk for both forms of AD, with the number of copies present indicative of level of risk. Collectively these four genes account for 10–50% of the overall genetic risk for this disorder. This suggests that several genetic risk factors have yet to be fully characterized.
Since the identification of APOE in AD, close to 1,000 papers have been published reporting and refuting genetic associations with AD outside of the unequivocal APOE association. Recently, a meta-analysis suggested that there are no more than 12 reproducible associations with AD risk. Furthermore, six genome-wide association studies have been published to date [10,11]. The first genome-wide scan confirmed the APOE association with AD risk and reported that no other association approached that level of significance. Another genome-wide association study found that a SNP (and a corresponding haplotype) in the GAB2 gene was associated with AD, and that this risk was substantially increased in the presence of APOE ε4 allele(s). Recently a scan was published that again confirmed the APOE ε4 association and identified three additional candidate SNPs that conferred AD risk: a SNP located in GOLPH2, an intergenic SNP on chromosome 9, and an intergenic SNP located between ATP8B4 and SLC27A2. The fourth scan reported a possible association at the 12q13 locus. The fifth scan reported that a SNP and haplotype in PCDH11X, on the X chromosome, affected AD risk in the populations studied, with the effects most pronounced in homozygous females. Finally, the sixth scan reported a SNP located at 14q31 (rs11159647) which was found to modify age of onset in over 4000 patients.
A growing body of evidence suggests that structural variation, including copy number variants (CNVs), across the genome is common and likely contributes to human disease [16,17]. In fact, a rare duplication of the AβPP gene has been linked to early onset AD. Other CNVs may also contribute to this disease state; however, this has yet to be investigated. With new methods that identify CNVs from dense genome-wide SNP genotyping data, we are now able to begin to assess how these genetic variants impact human disease [16,18].
In this work, we assessed genetic predisposition to late onset AD in subjects of European ancestry using the Illumina HumanHap 550 K genotyping platform to evaluate both SNP-level associations and CNVs across the genome. While based on a smaller sample size than previously conducted studies, these data add to the collection of genomic data available for the assessment of the genetic contribution to AD, and for the first time report results of a genome-wide CNV scan in AD.
Samples collected and stored by the Joseph and Kathleen Bryan Alzheimer’s Disease Research Center at Duke University were used in this study. A total of 331 cases with a clinical diagnosis of dementia were evaluated; > 80% of the patients had a clinical diagnosis of AD. Controls (n = 368) consisted largely of unaffected spouses of cases. Sample demographics are included in Table 1. Full clinical data were available for 28% of the control subjects and 72% of cases. In these cases the dementia status was clinically determined and recorded at the time the subjects consented to study participation. We note that a portion of the samples used in this study likely overlap those reported in Beecham et al.
An additional 531 neurologically normal control subjects (age 20–68, average age 25), whose samples were collected and anonymously databased as part of the Duke Genetics of Memory project, were used to secondarily assess the frequency of CNVs in a population of subjects not enriched for AD. This study was performed according to standards set forth by the Duke University Institutional Review Board.
Genome-wide genotyping was performed using Illumina HumanHap 550K chips. DNA was extracted using standard protocols. Genotyping quality was assessed using previously published methods. Briefly, all SNPs that were successfully called in > 99% of subjects (the 1% rule) were included in the analysis. All subjects were also required to have a genotyping success rate of > 99% across the SNPs that passed the 1% rule.
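The two-step call-rate filter described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the data layout (a dictionary of per-subject genotype lists, with `None` marking a failed call) and the function names `call_rate` and `qc_filter` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical layout: subject ID -> genotype calls per SNP (None = no call).
Genotypes = Dict[str, List[Optional[int]]]


def call_rate(calls: Sequence[Optional[int]]) -> float:
    """Fraction of non-missing calls; 0.0 for an empty sequence."""
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


def qc_filter(genotypes: Genotypes, min_rate: float = 0.99) -> Tuple[List[int], List[str]]:
    """Apply the SNP-level filter first, then the subject-level filter."""
    subjects = list(genotypes)
    n_snps = len(genotypes[subjects[0]])

    # Step 1 (the "1% rule"): keep SNPs called in more than min_rate of subjects.
    kept_snps = [
        j for j in range(n_snps)
        if call_rate([genotypes[s][j] for s in subjects]) > min_rate
    ]

    # Step 2: keep subjects whose success rate exceeds min_rate
    # across the SNPs that survived step 1.
    kept_subjects = [
        s for s in subjects
        if call_rate([genotypes[s][j] for j in kept_snps]) > min_rate
    ]
    return kept_snps, kept_subjects
```

The order matters: subject call rates are computed only over SNPs that passed the first step, so a handful of poorly performing SNPs does not cause otherwise good samples to be dropped.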
Additional genotyping of rs2373115 (GAB2) was performed using a commercially available Taqman-based genotyping assay (standard protocol, Applied Biosystems).
All subjects that passed SNP QC procedures were entered into the CNV analysis. The CNV calls were generated using the PennCNV software (version 200 june 26) using the Log R ratio (LRR) and B allele frequency (BAF). Standard PennCNV quality control checks were used to exclude samples for whom PennCNV calling would be considered unreliable. Samples were excluded for LRR standard deviation > 0.28, BAF median > 0.55 or < 0.45, BAF drift > 0.002, or WF > 0.04 or < −0.04. Additionally, to ensure only high-confidence CNVs were included in the analysis, we excluded any CNV for which the difference of the log likelihood of the most likely copy number state and the less likely copy number state was less than 10 (generated using the -conf function in PennCNV), CNVs that were called based on data from fewer than 10 SNPs, and finally CNVs which overlap at least 50% with previously published regions prone to false positives, as described previously.
All CNVs that exceeded 1 Mb and that were called based on < 200 SNPs were then subjected to visual inspection to ensure that the CNVs did not span a centromere. Any centromeric CNV by this definition was removed from further analysis. Finally, an additional scan was made for larger than expected per-chromosome CNV burdens, which may indicate possible false positives in the CNV calling. Specifically, samples with > 1000 affected SNPs on a single chromosome were outliers (> 3 standard deviations away from the mean) in the cohort distributions. This occurred in a total of seven subjects, including 3 cases and 4 controls.
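The numeric sample-level and call-level thresholds listed above can be written as two simple predicates. The sketch below assumes the relevant PennCNV summary statistics have already been parsed; the record types `SampleQC` and `CNVCall` and their field names are hypothetical and do not correspond to PennCNV's own output format.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    lrr_sd: float      # standard deviation of the Log R ratio
    baf_median: float  # median B allele frequency
    baf_drift: float   # BAF drift
    wf: float          # waviness factor


@dataclass
class CNVCall:
    n_snps: int        # number of SNPs supporting the call
    conf: float        # confidence score (log-likelihood difference)
    fp_overlap: float  # fraction overlapping false-positive-prone regions


def sample_passes(q: SampleQC) -> bool:
    """True if the sample is NOT excluded by the thresholds in the text."""
    return (
        q.lrr_sd <= 0.28
        and 0.45 <= q.baf_median <= 0.55
        and q.baf_drift <= 0.002
        and -0.04 <= q.wf <= 0.04
    )


def cnv_passes(c: CNVCall) -> bool:
    """True if the call is NOT excluded: confidence of at least 10,
    at least 10 SNPs, and under 50% overlap with flagged regions."""
    return c.conf >= 10 and c.n_snps >= 10 and c.fp_overlap < 0.5
```

The visual inspection of large calls near centromeres and the per-chromosome burden check are not captured here, since they depend on cohort-level distributions and manual review.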
Genes included within CNVs were assessed using an internally designed annotation system which interfaces with Ensembl databases and provides a list of protein-coding genes, miRNAs, rRNAs, and snRNAs that are included or disrupted by the CNV.
Logistic regression of case-control status against all SNP data collected on the chip was performed using the PLINK genome-wide association analysis toolkit. To control for the possibility of spurious associations resulting from population stratification, we used a modified EIGENSTRAT method [19,22]. A Bonferroni correction was used to correct for multiple testing. The threshold for a genome-wide significant finding was therefore defined as 9.1 × 10⁻⁸.
Association testing was performed to evaluate the impact of common CNVs on AD predisposition using a set of SNPs validated to be tagging CNVs, kindly provided by Drs. McCarroll and Altshuler (personal communication). A total of 285 common CNVs were found to be tagged by a SNP. Of these, 199 were represented on the Illumina HumanHap 550K genotyping chip either directly or via a proxy with an r² greater than 0.8. Each of these 199 SNPs was evaluated in the SNP scan (logistic regression with EIGENSTRAT population correction and a Bonferroni correction for the 199 SNPs tested).
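As a rough check on the two Bonferroni corrections mentioned above, the sketch below computes a per-test threshold from a family-wise error rate. The family-wise rate of 0.05 is an assumption; the text gives the resulting genome-wide threshold and the count of 199 tagging SNPs, but not the alpha level or the exact number of SNPs tested genome-wide.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold for a family-wise error rate alpha."""
    return alpha / n_tests


def implied_tests(alpha: float, threshold: float) -> float:
    """Number of tests implied by a reported Bonferroni threshold."""
    return alpha / threshold


ALPHA = 0.05  # assumed; not stated in the text

# Threshold for the 199 CNV-tagging SNPs.
cnv_tag_threshold = bonferroni_threshold(ALPHA, 199)

# Number of tests implied by the reported genome-wide threshold of 9.1e-8.
genome_wide_tests = implied_tests(ALPHA, 9.1e-8)

print(f"CNV-tagging SNP threshold: {cnv_tag_threshold:.2e}")
print(f"Implied genome-wide tests: {genome_wide_tests:,.0f}")
```

Under the assumed alpha, the reported genome-wide threshold corresponds to roughly 550,000 tests, which is in line with the size of the genotyping chip.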
Finally, we screened the following previously reported neuropsychiatric risk regions for AD-associated CNVs: 1q21.1 [24–26], 3q13.3, 3q24, 4q28.3, 7q34–36.1, 8p22, 15q11.2, 15q11.2–13.3 [30,31], 15q13.3 [24–26,32], 16p11.2, 16p13.11–12.3 [20,24,33], 22q11.2 [34,35], 22q13, and Xp22.33.
After correction for population stratification using the EIGENSTRAT method [19,22], there was no indication of any elevation of association due to stratification (Fig. 1), and only one SNP in the TOMM40 gene, located upstream of the APOE gene, achieved genome-wide significance (Table 2). The effect of this SNP can be attributed in full to the previously reported APOE ε4 association with AD risk. Additional top associations are shown in Table 1, none of which have been previously reported to be associated with AD and none of which appear to be particularly suggestive of having any connection to AD. A full chromosome level view of p values generated in the genome-wide association study is shown in Fig. 2 and a comprehensive listing of p values for each SNP evaluated in this study is available at http://www.genome.duke.edu/labs/goldstein/data/.
We directly evaluated previously reported genome-wide significant associations in AD. The GAB2 SNP, rs2373115, reported by Reiman and colleagues, was not genotyped on the Illumina platform; however, this SNP was genotyped independently in our samples. We observed no association of rs2373115 alone or when evaluated for an interaction with number of copies of the APOE ε4 allele (p = 0.53, multiple logistic regression with an interaction term, number of APOE ε4 alleles × rs2373115). Three additional candidate SNPs were reported, none of which were significant in this data set. We also found no association at 12q13 or 14q31 as reported by Beecham et al. and Bertram et al., respectively. Finally, the risk SNP in PCDH11X previously reported was not associated in our data set in females alone or in a combined model incorporating gender.
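For reference, the interaction test for rs2373115 and APOE ε4 can be written as a logistic model of the following general form. This is a sketch under stated assumptions: the additive (0/1/2) coding of the rs2373115 genotype G is assumed, the symbols are introduced here for illustration, and any covariates used for stratification correction are omitted.

```latex
\operatorname{logit} \Pr(\mathrm{AD} = 1)
  = \beta_0 + \beta_1 G + \beta_2 E + \beta_3 \, (G \times E),
\qquad G \in \{0, 1, 2\}, \quad E \in \{0, 1, 2\}
```

Here E is the number of APOE ε4 alleles, and the interaction is assessed by testing the null hypothesis that β₃ = 0; the p value of 0.53 quoted above would correspond to that test.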
We also tested 21 genes implicated in candidate gene studies (Table 3). None were significant after correcting for the number of associations tested for all genes or within single genes.
Using a set of tagging SNPs for 199 common CNVs across the genome, we found no significant impact of CNVs on AD predisposition. Next, we screened for differences in CNV burden, average CNV size, average number of genes (both protein coding and nonprotein coding) or exons within or disrupted by CNVs, and enhanced presence of rare gene-disrupting CNVs as defined previously for schizophrenia, and found no overall differences. Unlike epilepsy and schizophrenia, we showed no clear excess of large deletions or duplications [20,24], although we note that the two heterozygous deletions greater than 1 Mb were present only in cases at 11q22.3 and 2q32.1. Effects of excess large CNV burden may be evident in studies involving a larger sample size. As several neuropsychiatric diseases have been shown to share common risk loci [20,24–26,32,38], we then screened all previously reported neuropsychiatric risk regions for rare CNVs enriched in AD cases (see Materials and Methods for a list of regions evaluated). Using this approach, we found a duplication within the schizophrenia and epilepsy associated risk region at 15q13.3 [24,26,32], affecting the CHRNA7 gene, with a total of six out of 276 cases (2%) and one out of 322 controls (0.3%) having this duplication. The implicated CNV duplicates CHRNA7 and an approximately 300 kb region upstream of the gene (Fig. 3), a much smaller region than implicated in schizophrenia and epilepsy associated deletions [24,26,32]. This gene is one of several neuronal nicotinic cholinergic receptors which contribute to a wide range of neuronal processes. Importantly, this receptor is acted on by a neurotransmitter, acetylcholine, which is regulated by a primary class of drugs (acetylcholinesterase inhibitors) used to treat AD.
Several investigations have implicated genetic variants in CHRNA7, other genes in the cholinergic receptor family, and genes encoding proteins involved in the associated biochemical pathway, in AD susceptibility [39–42], although many of these associations have failed to be consistently replicated (cf. Alzgene http://www.alzforum.org/res/com/gen/alzgene ). This association did not achieve statistical significance in our study, with a p value of 0.053 (uncorrected for multiple testing); however, given its reported role in intracellular amyloid accumulation and genetic evidence for its involvement in AD [39–42], we have highlighted it in this work as a potential risk variant. The Database of Genomic Variants reports this duplication to be present in HapMap participants at a frequency of 0.005, which is consistent with the frequency in our neurologically normal follow-up cohort (n = 531, see Methods).
As this screen for CNVs enriched in AD patients is small scale, additional work is needed to validate and confirm these associations. A full list of CNVs identified in this cohort, after applying the quality control checks defined in the methods section, is available at http://www.genome.duke.edu/labs/goldstein/data/.
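The case-control comparison for the CHRNA7 duplication (six of 276 cases versus one of 322 controls) can be examined with a standard test on a 2×2 table. The text does not name the test behind the reported p value, so the use of a two-sided Fisher's exact test below is an assumption. The function name is hypothetical, and the implementation uses only the Python standard library.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that x of the col1 carriers fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Rows: cases, controls. Columns: duplication present, absent.
p_value = fisher_exact_two_sided(6, 270, 1, 321)
odds_ratio = (6 * 321) / (270 * 1)
print(f"two-sided p = {p_value:.3f}, sample odds ratio = {odds_ratio:.2f}")
```

Under this assumed test, the p value should land close to the reported 0.053, with a sample odds ratio of about 7. That estimate rests on only seven carriers in total, which is why the finding is presented as suggestive rather than significant.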
There have now been six genome-wide scans involving close to 6000 AD cases, identifying several genome-wide significant discoveries that together account for only a small fraction of the heritability of sporadic AD. The failure to identify new common SNPs contributing to susceptibility suggests that for AD, as with many other common diseases, much of the heritability may be due to rare relatively high penetrant variants [20,25,26,31]. While additional work is needed to fully assess the impact of rare structural variants in AD, here in the first CNV scan in AD, we report a rare duplication affecting the CHRNA7 gene which may contribute to AD risk.
Funding for this work was provided by the Joseph and Kathleen Bryan Alzheimer’s Disease Research Center (NIA grant P30 AG028377) and the Institute for Genome Sciences and Policy at Duke University.
Authors’ disclosures available online (http://www.jalz.com/disclosures/view.php?id=94).