With the exception of MAPT and SNCA, we failed to replicate the most promising susceptibility loci from recent GWASs undertaken in European-derived populations. For MAPT and SNCA, the strength of association in our dataset was comparable to that observed in these and other studies ().
The PARK16 region on chromosome 1q32 includes several candidate genes and was originally identified in a Japanese GWAS that included replication in two independent samples [7]. The most significant markers in the combined three-stage GWAS were rs947211 (OR, 1.30; p = 1.5 × 10⁻¹²) and rs823156 (OR, 1.37; p = 3.6 × 10⁻⁹). Subsequently, Tan and colleagues found that four of five PARK16 SNPs examined in 433 PD patients and 916 controls of Han Chinese ancestry were associated with PD, including rs947211 (OR, 0.71; p = 2 × 10⁻³) and rs823156 (OR, 0.68; p = 5 × 10⁻³) [11]. The association of PARK16 with PD in European-derived populations has been less consistent, and GWASs undertaken in individuals of European origin have yielded disparate findings [8–9]. Simon-Sanchez and colleagues reported successful replication of PARK16 in a two-stage GWAS of subjects from the U.S., Germany, and the U.K., with the strongest signal in the combined sample emanating from two low-frequency SNPs, rs823128 (OR, 0.66; p = 7.3 × 10⁻⁸) and rs11240572 (OR, 0.67; p = 6.1 × 10⁻⁷) [8]. In contrast, Hamza and colleagues reported a failure to replicate PARK16 in a single-stage, U.S.-based GWAS, with p values ranging from 0.03 to 0.15 [9]. A smaller GWAS of European-Americans with familial PD also failed to detect an association between PARK16 and PD [6]. Similarly, neither of the two PARK16 SNPs we examined in this study reached significance. One possible explanation for these findings is that PARK16 might harbor a population-specific PD risk variant that occurs in Asians but is rare or absent in Europeans. For example, the LRRK2 G2385R SNP conveys an approximately 2.5-fold risk for PD in Asians, but it is essentially absent in other populations [4, 12]. Another possibility is that a true but weaker association signal exists in populations of European origin such that most studies conducted to date have been under-powered. In support of this, even in studies where rs947211 and rs823156 failed to reach significance, the direction of the effect was the same, with ORs below 1.0 (). If the true effect size for these PARK16 SNPs is similar to that observed by Simon-Sanchez and colleagues (ORs of 0.85–0.88), then our sample might well have lacked adequate power.
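The dependence of power on effect size can be illustrated with a minimal sketch. It is not the power calculation used in this study: it assumes a two-sided allelic test under a normal approximation, and the sample sizes and the control allele frequency below are hypothetical placeholders rather than values from our dataset. Only the odds ratios (0.85 and 0.88) are taken from the text.

```python
from statistics import NormalDist


def case_allele_freq(control_freq: float, odds_ratio: float) -> float:
    """Allele frequency in cases implied by a control frequency and an allelic OR."""
    return odds_ratio * control_freq / (1.0 - control_freq + odds_ratio * control_freq)


def allelic_test_power(n_cases: int, n_controls: int, control_freq: float,
                       odds_ratio: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test comparing allele frequencies
    between cases and controls (normal approximation, unpooled variance)."""
    std_normal = NormalDist()
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    # Each individual contributes two alleles to the comparison.
    se = (p1 * (1 - p1) / (2 * n_cases) + p0 * (1 - p0) / (2 * n_controls)) ** 0.5
    z_effect = abs(p1 - p0) / se
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic falls in either rejection tail.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


if __name__ == "__main__":
    HYPOTHETICAL_CONTROL_FREQ = 0.30  # assumption, not a value from the study
    for n in (500, 1000, 4000):       # hypothetical numbers of cases = controls
        for odds_ratio in (0.85, 0.88):
            power = allelic_test_power(n, n, HYPOTHETICAL_CONTROL_FREQ, odds_ratio)
            print(f"n={n} per group, OR={odds_ratio}: power={power:.2f}")
```

Under these assumptions, power for odds ratios close to 1.0 rises only slowly with sample size, which is consistent with the possibility that modest PARK16 effects go undetected in smaller European samples.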
Hamza and colleagues reported an association reaching genome-wide significance for rs3129882 located in intron 1 of the HLA-DRA gene (OR, 1.31; p = 2.9 × 10⁻⁸), and designated this region PARK18 [9]. They observed marginal associations for this SNP in two smaller GWASs (), which when combined yielded an OR of 1.18 (p = 1.1 × 10⁻³), and cited this as evidence for replication. In contrast, we failed to detect an association of this SNP with PD in our sample (OR, 1.01; p = 0.88), and considered several possible explanations for this discordance. First, we do not believe that power was a major limitation, even if one takes into consideration the “winner’s curse” phenomenon in which the estimate of the genetic effect in the first positive report is biased upward [13]. Our sample provided 99% power to detect an effect at the OR reported in the original study (1.31) and 84% power at the OR observed in the combined replication sample (1.18). Another possibility is that the observed association of PARK18 with PD might represent a spurious finding resulting from population structure. Structure is of particular concern for the highly polymorphic HLA region, in which many markers display substantial differences in allele frequency across European subpopulations [14]. Hamza and colleagues observed significant structure in their case-control sample and a frequency gradient for the PD-associated HLA-DRA allele such that individuals of Northern- and Southern-European origin had the lowest and highest allele frequencies, respectively. For example, the allele frequencies in controls of Scandinavian and Italian ancestry were 0.35 and 0.47, respectively. A popular principal component analysis-based method (EIGENSTRAT) [15] was used to address this issue, and the results remained highly significant after correction for structure. However, several authors have demonstrated that in some instances, even when EIGENSTRAT accurately detects population structure, it still fails to properly correct for its effect [16–17]. Though we did not directly test for structure in our dataset, a recent population-based study of 800 control subjects ascertained from across Spain found little to no evidence of it [18]. Thus, it is unlikely that significant structure existed in our sample, which was composed predominantly of individuals from Northern Spain. A third potential explanation for the inconsistency in results is that the risk conveyed by PARK18 might be dependent on interactions with unrecognized environmental factors that differ across populations. While this argument could be invoked for any of the loci studied here, it is particularly relevant for HLA. Variation within the HLA region is well known to influence susceptibility to a number of infectious diseases, and viral infection has long been postulated to be a risk factor for PD [19–20]. Thus, if a specific HLA allele increases susceptibility to a given infectious disease etiologically linked to PD, the association of that allele with PD might only be observed in populations where the infectious agent is common. A similar scenario in which Epstein-Barr virus and HLA-DRB1*1501 interact to increase risk for multiple sclerosis has recently been proposed [21].
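How population structure alone can produce an apparent association is illustrated by the minimal sketch below. It uses the two control allele frequencies quoted above (0.35 and 0.47) and assumes that the allele has no effect on PD within either subpopulation. The ancestry proportions among cases and controls are hypothetical and are not taken from any of the studies discussed. The sketch makes no claim about whether such confounding occurred in the study by Hamza and colleagues, and it does not implement EIGENSTRAT.

```python
def odds(freq: float) -> float:
    """Convert an allele frequency to odds."""
    return freq / (1.0 - freq)


def pooled_allelic_or(stratum_freqs, case_mix, control_mix) -> float:
    """Allelic OR in the pooled sample when the allele has NO effect within
    any subpopulation (cases and controls share each stratum's frequency).

    stratum_freqs: allele frequency in each subpopulation
    case_mix:      proportion of cases drawn from each subpopulation
    control_mix:   proportion of controls drawn from each subpopulation
    """
    case_freq = sum(w * f for w, f in zip(case_mix, stratum_freqs))
    control_freq = sum(w * f for w, f in zip(control_mix, stratum_freqs))
    return odds(case_freq) / odds(control_freq)


if __name__ == "__main__":
    # Frequencies quoted in the text for Scandinavian and Italian controls.
    freqs = [0.35, 0.47]

    # Hypothetical ancestry proportions (assumptions for illustration only).
    matched = pooled_allelic_or(freqs, [0.5, 0.5], [0.5, 0.5])
    unmatched = pooled_allelic_or(freqs, [0.3, 0.7], [0.7, 0.3])

    print(f"Ancestry-matched cases and controls: OR = {matched:.2f}")
    print(f"Ancestry-unmatched cases and controls: OR = {unmatched:.2f}")
```

When cases and controls have the same ancestry mix, the pooled odds ratio equals 1.0. When the mix differs, the pooled odds ratio departs from 1.0 even though no stratum carries a true effect, which is why genetically homogeneous samples or family-based designs are attractive for this locus.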
An important limitation of our study was that we examined only a small number of markers at each locus, selected on the basis of results from previous studies. Had we taken a more comprehensive approach, such as genotyping a full set of tagging SNPs, we might have detected significant associations at additional loci.
GWASs have provided a wealth of data and identified a number of susceptibility genes for complex diseases in recent years. However, rigorous and repeated replication in well-designed follow-up studies is still a prerequisite before promising candidate genes can be considered bona fide risk factors for disease. In PD, MAPT and SNCA have reached that status, a process that took several years. We believe the PARK16, PARK17, and PARK18 loci require further validation. Because the effect size for PARK16 might be smaller in European-derived populations, particular care must be taken to ensure that replication samples are large enough to provide adequate power. Because of concerns about the effect of population structure, future replication efforts for PARK18 might benefit from inclusion of more genetically homogeneous case-control samples and the use of family-based association analysis.