We performed the largest GWAS to date in PD. Unlike previous studies, we focused exclusively on cases having a positive family history of PD, which we hypothesize reflects an increased genetic contribution to disease risk. Using this approach, we detected consistent evidence of association to several chromosomal regions. Notably, we detected association to SNPs within or near two candidate genes previously associated with PD: SNCA and MAPT. Neither of these genes was identified in the two previous GWAS of PD.
SNCA was the first gene in which mutations were identified as causing PD (Polymeropoulos et al. 1997
). It is thought that aberrant aggregation of the α-synuclein protein results in cell damage and ultimately neuronal death. Subsequent analyses have shown that point mutations (Kruger et al. 1998
; Zarranz et al. 2004
) as well as gene duplications (Chartier-Harlin et al. 2004
; Ibanez et al. 2004
) and triplications (Singleton et al. 2003
) can result in PD; however, mutations in SNCA
are a rare cause of autosomal dominant PD. More recently, several studies reported that variation in the promoter region of SNCA
, specifically the dinucleotide repeat polymorphism known as Rep1, acts as a susceptibility factor for PD, increasing the risk for disease (Kruger et al. 1999
; Maraganore et al. 2006
). Association has also been reported at the 3′ end of the gene (Mueller et al. 2005
), and a 3′ SNP (rs356219) was identified to be associated with SNCA mRNA levels in substantia nigra and cerebellum (Fuchs et al. 2008
). The SNPs in SNCA for which we detected evidence of association (p < 1 × 10⁻⁴) lie within intron 4 and the 3′ region of the gene. The rs356229 SNP that we report, whose minor allele increases the risk of PD (OR = 1.35), exhibits modest LD with rs356219 (HapMap CEPH D′ = 0.65, r²
= 0.39). The evidence that α-synuclein levels in the brain are influenced by genetic variability in the 3′ region of the gene (Fuchs et al. 2008
) and the LD between the reported SNPs in SNCA
provide a link between our GWAS results and SNCA.
MAPT encodes microtubule-associated protein tau, which regulates microtubule dynamics and assembles microtubules into parallel arrays within axons. Aggregation of tau is a pathological hallmark of several neurodegenerative disorders collectively known as tauopathies, including Pick disease and Alzheimer disease, as well as several disorders with parkinsonian features such as progressive supranuclear palsy, corticobasal degeneration, and fronto-temporal dementia with parkinsonism. Linkage of PD to the MAPT
region was previously reported (Scott et al. 2001
) and several studies have indicated that a large haplotype block containing MAPT
is associated with a small but significant increase in risk for PD (Healy et al. 2004
; Tobin et al. 2008
; Zabetian et al. 2007
; Zhang et al. 2005
). The deleterious haplotype (H1) and the protective haplotype (H2) actually represent groups of subhaplotypes that arose from an inversion of 900 kb on chromosome 17 several million years ago (Stefansson et al. 2005
); however, associations with these subhaplotypes have not been replicated (Zabetian et al. 2007
). The SNPs that define the parent haplotypes of H1 and H2 are in complete linkage disequilibrium with each other (r²
= 1), indicating that the functional variation could be anywhere within this large 900 kb region and not necessarily within the MAPT
gene. Complex permutations of alternative splicing lead to many different isoforms of tau; if the association with H1 is due to variation that upsets this delicate balance of isoforms, it may help to explain the variety of neurodegenerative phenotypes that exhibit tau pathology.
Within this MAPT region, which exhibits wide-ranging LD, are several additional genes including C17orf69, CRHR1 and IMP5. A SNP between C17orf69 and CRHR1 provided the strongest evidence of association using the recessive model and had an even smaller p value when included as part of our meta-analysis. Evidence of association to this region was also strengthened when meta-analysis was performed using the additive model. Minor alleles of SNPs genotyped in this study that tag the H2 haplotype include rs12185268/G, rs12373139/A, rs1981997/A, and rs8070723/G, all of which were highlighted in the results of the additive meta-analysis. Both SNPs in IMP5 identified in the meta-analysis are missense polymorphisms. Given the complex LD structure within this chromosomal region, it is not yet clear whether the region harbors multiple susceptibility genes (or alleles) or, conversely, whether the evidence of association with multiple SNPs in different genes reflects a single susceptibility allele. We favor the former hypothesis, although further genotyping and analysis are clearly warranted to resolve this issue. Nonetheless, both the primary GWA analysis and meta-analysis support the existing hypothesis that the complex genomic region around MAPT is related to PD risk.
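The two LD measures quoted in this discussion, D′ and r², can behave quite differently, which is why markers that tag H1/H2 with r² = 1 are interchangeable while a pair with D′ = 0.65 and r² = 0.39 is only a partial proxy. The following is a minimal sketch of the standard definitions for two biallelic loci; the function name and the example frequencies are illustrative assumptions, not values taken from this study.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D_prime, r2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first locus (0 < p_a < 1)
    p_b  : frequency of allele B at the second locus (0 < p_b < 1)
    """
    # Raw disequilibrium: observed minus expected haplotype frequency.
    d = p_ab - p_a * p_b

    # D' rescales D by the largest magnitude it could take given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two allele indicators.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies: identical allele frequencies and complete
# co-occurrence give D' = 1 and r^2 = 1 (the markers are perfect proxies).
perfect = ld_stats(p_ab=0.2, p_a=0.2, p_b=0.2)

# Hypothetical frequencies: allele A only ever occurs with B (D' = 1),
# but the allele frequencies differ, so r^2 is well below 1.
partial = ld_stats(p_ab=0.2, p_a=0.2, p_b=0.5)
```

When r² = 1, as for the H1/H2-defining SNPs, association testing alone cannot tell the markers apart, which is consistent with the point above that the functional variant could lie anywhere in the region.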
In order to evaluate replication of our top findings and to identify SNPs with modest p
values that may nonetheless be true associations, we performed a meta-analysis. The focus of the present study is a comparison of PD cases and controls, a design also employed by Fung et al. In contrast, a previous GWAS by Maraganore and colleagues (Maraganore et al. 2005
) initially employed a discordant sibling design. As noted by others (Defazio et al. 2006
), a discordant sibling design is less powerful than a case–control design since the unaffected sibling may still have inherited susceptibility alleles that, as a result of incomplete penetrance, are not expressed. Therefore, we thought it most appropriate to include in our meta-analysis only the study of Fung et al., which, like our own study, was an analysis of unrelated cases and controls. We considered combining the genotypic datasets from Fung et al. with our study and testing for association on the combined dataset; however, due to the potential variation introduced by genotyping in differing laboratories with unique control samples and protocols and the different ascertainment scheme of the cases (familial vs. sporadic), we elected to perform a conservative meta-analysis using the results of association tests performed in each study separately. The meta-analysis results have provided support for association to several novel genes and regions not previously reported in GWAS of PD.
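To illustrate what combining per-study association results (rather than pooled genotypes) can look like, the sketch below uses a sample-size-weighted Z-score approach. This is an assumption chosen for illustration: the text above does not specify which meta-analytic method was used, and the function names, p values, effect directions, and sample sizes in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_NORM = NormalDist()  # standard normal distribution


def p_to_z(p_two_sided, direction):
    """Convert a two-sided p value (0 < p < 1) into a signed Z score.

    direction: +1 if the tested allele increases risk in that study,
               -1 if it decreases risk.
    """
    z = _NORM.inv_cdf(1 - p_two_sided / 2)
    return z if direction >= 0 else -z


def combine_studies(studies):
    """Sample-size-weighted Z-score meta-analysis.

    studies: list of (p_two_sided, direction, n_samples) tuples, one per
             study, each computed separately within that study.
    Returns (combined_z, combined_two_sided_p).
    """
    weighted_sum = sum(sqrt(n) * p_to_z(p, d) for p, d, n in studies)
    z = weighted_sum / sqrt(sum(n for _, _, n in studies))
    p = 2 * (1 - _NORM.cdf(abs(z)))
    return z, p


# Hypothetical example: one SNP, same risk allele in both studies.
z_same, p_same = combine_studies([(0.01, +1, 1000), (0.01, +1, 1000)])

# Hypothetical example: equal evidence but opposite directions cancels out.
z_opp, p_opp = combine_studies([(0.01, +1, 1000), (0.01, -1, 1000)])
```

Because each study contributes only its own test statistic, differences in genotyping laboratory, control samples, or case ascertainment do not enter as artifacts of pooling; consistent direction of effect strengthens the combined signal, while discordant directions weaken it.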
To prioritize among these novel genes and regions, we carefully reviewed the evidence for association from nearby SNPs, any published literature about the function of the gene or its potential role in PD susceptibility, and the meta-analysis results. The evidence for a possible association with the LD block region containing GAK
(cyclin G associated kinase, a cell cycle regulator) and DGKQ
(diacylglycerol kinase, theta) increased following meta-analysis. GAK
is a particularly promising candidate because it is one of 137 genes shown to be differentially expressed in PD, with a 1.56-fold change in expression in the substantia nigra pars compacta of PD patients as compared to controls (Grunblatt et al. 2004
). No SNPs within the other 136 differentially expressed genes (or within 50 kb of these genes) highlighted in this expression study (Grunblatt et al. 2004
) were significantly associated with PD susceptibility in our sample (p
< 0.0001). Less is known about DGKQ
; however, it is thought to be involved in the phosphatidylinositol signaling system (KEGG pathway ID: hsa04070) and is expressed in the brain. The gene PIK3CD
, identified among top recessive model results, is involved in the same pathway as DGKQ
. There is another gene (TMEM175
) between GAK and DGKQ
; however, while there were SNPs genotyped in this gene, none showed suggestive evidence of association with PD. Nevertheless, it is possible that a disease risk modifying variant could be present in any of the genes in this region.
For the SNPs presented in the accompanying tables, we performed a secondary analysis in a broader set of individuals encompassing 902 cases (PROGENI, n
= 491; GenePD, n
= 411) and 881 controls (see Supplemental Methods III
and Supplemental Table 1
) including 40 cases and 14 controls of Hispanic or Asian descent and 19 cases from whole genome amplified samples. Results were largely similar to those obtained in the primary sample (see Supplemental Tables 2A, B).
One limitation of our study is the difference in ascertainment that resulted in differences in the age and gender distribution between our case and control populations. Because the age at exam for the controls was on average 7 years younger than the average age of onset of the cases, it is possible that a small number of the controls might develop PD as they age. However, the lifetime risk of PD is only approximately 1%; therefore, if a few controls were to develop PD, this would have little effect on the power of the current study. As with any association study, the greatest concern is the possibility of population stratification within cases and controls. We employed stringent criteria and detected no evidence that any of the first 10 MDS components (a proxy for population stratification) was significantly associated with disease status in the final sample. These results indicate that the sample is relatively homogeneous and unlikely to be biased by admixture.
The results obtained from this study do not meet genomewide significance based on a conservative Bonferroni correction for multiple testing (1.5 × 10⁻⁷). Although our sample size is more than twice that of previous GWAS, we still have limited power to detect, at a genomewide significant level, the small to moderate effect sizes often seen in susceptibility alleles for complex diseases such as PD. It is likely that some of the true association results will not lie among the most significant association results. We, therefore, turned to other lines of evidence to discern which among our strongest association results are most likely to be true positive results. Notably, two of our strongest association results were in the regions that include SNCA and MAPT; both genes have been previously reported as associated with PD susceptibility and therefore independent replication has been demonstrated in the existing literature. Meta-analysis demonstrates consistency of the DGKQ/GAK region in two independent studies.
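The Bonferroni threshold quoted above follows from dividing a family-wise error rate by the number of tests. The sketch below works the arithmetic in both directions; the family-wise error rate of 0.05 is an assumption (it is conventional but not stated in this section), and the function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def implied_number_of_tests(alpha, threshold):
    """Number of tests implied by a reported Bonferroni threshold."""
    return alpha / threshold


def is_genomewide_significant(p_value, threshold):
    """True if a single-SNP p value passes the corrected threshold."""
    return p_value < threshold


ALPHA = 0.05            # assumed family-wise error rate
REPORTED = 1.5e-7       # threshold quoted in the text

# Under the assumed alpha, the reported threshold corresponds to
# roughly a third of a million tests.
n_implied = implied_number_of_tests(ALPHA, REPORTED)

# A SNP at the p < 1e-4 level discussed for SNCA does not pass.
passes = is_genomewide_significant(1e-4, REPORTED)
```

Under these assumptions, a result at p = 1 × 10⁻⁴ falls about three orders of magnitude short of the corrected threshold, which is why prior biological evidence and consistency across studies are used here to prioritize regions.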
It is possible that the genes related to familial PD differ from those related to sporadic PD. Finding an appropriate sample to directly replicate our association results is hindered by the dearth of samples enriched for familial PD. Future directions include the recruitment and analysis of an independent sample of familial PD patients and collaboration with investigators who have already collected large samples of sporadic PD that can be used for replication. In addition, we will perform analyses utilizing CNVs. The methodology for best calling CNVs is still evolving, and we will apply new and existing algorithms to ensure we obtain consistent, robust results prior to dissemination of findings.
In summary, we have performed the largest GWAS to date in PD. We have limited our PD cases to only those with a family history of PD, thereby potentially increasing the contribution of genetic risk factors. Using this case–control design, we detected evidence of association to two chromosomal regions that encompassed previously reported genes: SNCA and MAPT. In addition, we found consistent evidence of association to DGKQ/GAK. Further analyses are warranted in these and additional chromosomal regions nominated in this study to evaluate the evidence of association in both familial and sporadic PD cohorts.