Genome-wide association studies have identified several single nucleotide polymorphisms (SNPs) as reproducibly associated with risk of myocardial infarction (MI)1-3, a leading cause of death and disability. We tested both SNPs and copy number variants (CNVs) for association with early-onset MI in 2,967 cases and 3,075 matched controls. The design called for any variant with P < 0.001 to be tested for replication in up to 18,822 additional individuals. SNPs at eight loci reached genome-wide significance, two of which are newly identified: PHACTR1 (P = 6 × 10⁻¹⁰) and MRPS6/KCNE2 (P = 2 × 10⁻⁹). We tested 554 common CNVs (> 1% frequency) for association with MI; none met the pre-specified threshold for replication testing (P < 10⁻³), and the Q-Q plot did not deviate from the null distribution. We identified 8,065 rare CNVs but did not detect a greater CNV burden in cases as compared to controls, in genes as compared to the genome as a whole, or at any individual locus. Common SNPs at eight loci were reproducibly associated with risk of MI, but a systematic, well-powered test of common and rare CNVs failed to identify additional associations with risk of MI.
Myocardial infarction (MI) is heritable4 and among the leading causes of death and disability worldwide5. Whereas the majority of MIs occur in individuals >65 years old, 5-10% of new MIs occur in younger patients and these events are associated with substantially greater heritability5,6. Thus, early-onset MI is a promising phenotype for genetic mapping.
Genome-wide association studies (GWASs) of common SNPs have been reported for MI and coronary artery disease1-3,7, with each study finding common SNPs on chromosome 9p21.3 associated with MI or coronary artery disease. In addition to 9p21.3, these papers proposed at least eight other loci as harboring SNPs associated with coronary artery disease. Some of these loci await definitive replication, but even if all were valid they would explain a small fraction of the risk of MI.
Structural variants, another class of human DNA sequence variation, may account for some of the unexplained heritability in MI and other common diseases8,9. Common CNVs have been associated with Crohn’s disease10 and body mass index11, and rare CNVs have been related to risk for autism12 and schizophrenia13-16. To our knowledge, no integrated assessment of SNPs and CNVs in the same samples has been reported for MI or any other trait. Several technological developments now make such systematic surveys possible, including hybrid oligonucleotide microarrays17 and analytical methods18 to simultaneously assess SNPs and CNVs genome-wide in each sample.
We designed a three-stage GWAS of early-onset MI with SNPs, common CNVs, and rare CNVs (Figure 1). Stage 1 consisted of the Myocardial Infarction Genetics Consortium (MIGen), a collection of 2,967 cases of early-onset MI (in men ≤50 years old or women ≤60 years old) and 3,075 age- and sex-matched controls free of MI from six international sites: Boston and Seattle in the United States as well as Sweden, Finland, Spain, and Italy (Table 1 and Supplementary Methods). The mean age at the time of MI was 41 years among males and 47 years among females.
Variants with P < 0.001 were advanced through two stages of replication (Figure 1, see Methods for power calculations). In total, 1,441 SNPs, including a SNP at each of eight loci recently proposed from GWA or candidate gene studies for coronary artery disease3,7,19, were taken forward into Stage 2, an in silico analysis of these SNPs in four recently completed GWA studies for MI. Stage 2 consisted of an effective symmetric sample size of 3,942 cases of MI and 3,942 controls (Supplementary Methods and Supplementary Table 1). Thirty-three SNPs were taken forward from Stage 2 into Stage 3, consisting of an additional 6 studies with an effective symmetric sample size of 5,469 cases of MI and 5,469 controls (Supplementary Methods and Supplementary Table 2). Stage 3 included 25 SNPs with the best combined statistical evidence in Stages 1 and 2 and 8 SNPs from previously reported loci (Methods).
After Stages 1, 2, and 3, we observed that SNPs at 8 loci were associated with MI at a pre-specified threshold for genome-wide significance of P < 5 × 10⁻⁸ (corresponding to P < 0.05 after adjusting for ~ 1 million independent tests20) (Table 2). Six of the eight previously-reported associations were confirmed (Table 2) with P ranging from 2 × 10⁻⁸ to 1 × 10⁻⁴¹. As the Stage 2 samples were used to implicate some of these previous findings, the data we present are not fully independent of prior reports. These six genetic association signals map to 9p21.3, CXCL12, CELSR2/PSRC1/SORT1, MIA3, LDLR and PCSK93,7. Three of the SNPs (those at the CELSR2/PSRC1/SORT1, LDLR, and PCSK9 loci) have also previously been shown to relate to plasma low-density lipoprotein cholesterol, a causal risk factor for MI7,21. The risk alleles at the eight loci ranged in frequency from 13% to 84%. Each copy of the risk allele conferred excess odds of MI ranging from 13% to 28%.
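The genome-wide threshold quoted above follows from a Bonferroni-style adjustment of a 0.05 family-wise error rate over roughly one million independent tests. A minimal sketch of that arithmetic is given here; the function names are illustrative and not part of the study's analysis pipeline.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P threshold that holds the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def passes_threshold(p_value: float, alpha: float, n_tests: int) -> bool:
    """True when p_value is below the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# ~1 million independent common-variant tests, as cited in the text.
GENOME_WIDE = bonferroni_threshold(0.05, 1_000_000)
print(f"genome-wide threshold: {GENOME_WIDE:.0e}")
```

Under this rule a combined P of 2 × 10⁻⁹ passes, whereas a combined P of 4 × 10⁻⁷ does not.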
Three of the loci previously suggested by Samani et al.3 did not meet our pre-specified threshold of P < 5 × 10⁻⁸. Across Stages 1, 2, and 3, the statistical evidence was the following: rs17228212 in SMAD3 (odds ratio 1.03, 95% confidence interval 0.99 - 1.07, P = 0.15); rs2943634 on 2q36 (odds ratio 1.05, 95% confidence interval 1.01 - 1.10, P = 0.01); and rs6922269 in MTHFD1L (odds ratio 1.09, 95% confidence interval 1.05 - 1.14, P = 2 × 10⁻⁵).
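An odds ratio, its 95% confidence interval, and its P value are linked through the standard error of the log odds ratio. The sketch below recovers an approximate two-sided P value from a reported odds ratio and interval, assuming a Wald-type interval that is symmetric on the log scale (an assumption; the exact test used for each estimate is described in the Methods). Because the published values are rounded to two decimals, the recovered P values agree with the reported ones only approximately.

```python
import math


def p_from_odds_ratio_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided P from an odds ratio and its 95% CI.

    Assumes the interval is exp(log(OR) +/- z_crit * SE).
    Returns (z, p).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail
    return z, p


# Rounded estimates quoted in the text for the three loci.
for label, est in {
    "SMAD3": (1.03, 0.99, 1.07),
    "2q36": (1.05, 1.01, 1.10),
    "MTHFD1L": (1.09, 1.05, 1.14),
}.items():
    z, p = p_from_odds_ratio_ci(*est)
    print(f"{label}: z = {z:.2f}, approximate P = {p:.2g}")
```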
Two novel associations were observed with genome-wide significance: (i) in an intron of phosphatase and actin regulator 1 (PHACTR1) on chromosome 6 (rs12526453, odds ratio 1.13, P = 7 × 10⁻¹⁰) and (ii) in an intergenic region between mitochondrial ribosomal protein S6 (MRPS6), solute carrier family 5 (inositol transporters) member 3 (SLC5A3) and potassium voltage-gated channel, Isk-related family, member 2 (KCNE2) on chromosome 21 (rs9982601, odds ratio 1.19, P = 2 × 10⁻⁹). PHACTR1 is an inhibitor of protein phosphatase 1, an enzyme that dephosphorylates serine and threonine residues on a range of proteins22. MRPS6 encodes a subunit of the mitochondrial ribosomal protein 28S23. SLC5A3 is a gene embedded within MRPS6 and encodes a protein that transports sodium and myo-inositol in response to hypertonic stress24. KCNE2 encodes a subunit of a potassium channel and mutations in this gene cause inherited arrhythmias25. The mechanisms by which gene(s) at these two loci lead to MI remain to be defined.
At two additional new loci (in an intron of WDR12 and near SYT7), the statistical evidence for association across Stages 1, 2, and 3 was consistent (combined P for each at 4 × 10⁻⁷) but did not meet our pre-specified genome-wide threshold (Table 2). These loci require follow-up in additional samples.
Of the eight validated loci, non-coding SNPs at 9p21.3 have been the most widely replicated, confer the largest effect size and are supported by the strongest statistical evidence26. While it is possible that 9p21.3 SNPs act through as-yet unidentified coding variants, non-coding SNPs may affect function by altering the level of gene expression. Thus, we explored whether the 9p21.3 SNP from our study might be related to the mRNA level of nearby genes in three biologically-relevant human tissues - liver, subcutaneous fat, and visceral fat (Methods).
The MI-associated SNP at 9p21.3 (rs4977574) was strongly associated with mRNA level of cyclin-dependent kinase inhibitor 2B (CDKN2B), a gene located ~89 kilobases from the SNP. Compared with the mRNA level in a reference pool of individuals, carriers of the risk G allele at 9p21.3 had about the same level of expression of CDKN2B in subcutaneous fat tissue whereas carriers of the non-risk A allele had ~15% lower transcript level (P = 4 × 10⁻⁶ in 698 subcutaneous fat samples, Figure 2). The same SNP was also associated with CDKN2B transcript level in visceral fat tissue (P = 1 × 10⁻⁴) but not associated in human liver (P = 0.84). In each of the three tissues, this genotype was not associated with mRNA level of other neighboring transcripts on 9p21.3 including CDKN2A, MTAP, or ANRIL (P > 0.05 for each genotype-transcript association). CDKN2B, a downstream target of the transforming growth factor beta pathway, has been shown to decrease cell survival27. These results suggest the hypothesis that genetic variation at 9p21.3 leads to atherosclerosis through CDKN2B.
To evaluate the cumulative effect of these eight SNPs on risk for MI, we constructed an MI genotype score composed of the 8 SNPs, modeling the number of risk alleles carried by each individual in the MIGen GWAS (Stage 1). In logistic regression models including age, gender, and principal components of ancestry, individuals in the top quintile of MI genotype score had a two-fold increased risk for MI compared with the bottom quintile (odds ratio 2.05, 95% confidence interval 1.74 to 2.42; P = 4 × 10⁻²⁵, Table 3). The MI genotype score confers risk of a magnitude comparable to other established risk factors such as plasma low-density lipoprotein cholesterol (odds ratio 1.62, 95% confidence interval 1.17 - 2.25 for top versus bottom quintile as previously reported28).
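The construction of such a score can be sketched as follows. This is a simplified, unadjusted illustration only: the analysis described above used logistic regression with age, gender, and principal components of ancestry as covariates, which the sketch omits. The function names, the rank-based quintile rule, and the example counts are all hypothetical.

```python
def genotype_score(risk_allele_counts):
    """Unweighted score: total risk alleles (0, 1, or 2 per SNP)."""
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each SNP must carry 0, 1, or 2 risk alleles")
    return sum(risk_allele_counts)


def quintile_labels(scores):
    """Assign quintiles 1..5 by rank; ties are broken by input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // len(scores) + 1
    return labels


def odds_ratio(cases_top, controls_top, cases_bottom, controls_bottom):
    """Crude (unadjusted) odds ratio for top versus bottom quintile."""
    return (cases_top * controls_bottom) / (controls_top * cases_bottom)


# Hypothetical individual: risk-allele counts at the eight SNPs.
print(genotype_score([2, 1, 0, 1, 2, 0, 1, 1]))
# Hypothetical 2x2 counts, for illustration only (not study data).
print(odds_ratio(40, 20, 20, 40))
```

An unweighted count treats every risk allele as contributing equally; weighting each allele by its estimated log odds ratio is a common alternative when effect sizes differ.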
While the GWA approach has met with some success in MI, these variants, in sum, explain a small fraction of the variance; the current MI genotype score explains only 2.4% of the variance in risk for early-onset MI. Thus, we tested the hypothesis that systematic assessment of structural variants, common and rare, might identify additional loci contributing to MI.
We first used the CANARY algorithm18 to test 554 commonly segregating CNVs (> 1% frequency) for association with early-onset MI in 2,783 cases and 2,865 controls that passed sample quality control for CNV analysis (Methods). The estimated genomic control lambda for the entire set of CNVs was ~1.23; for 316 CNVs with allele frequency greater than 5%, lambda was ~1.05. We did not observe any CNV with evidence for association surpassing our pre-specified threshold for replication of P < 0.001. In fact, the strongest association (P = 0.002, Supplementary Table 3) did not pass the Bonferroni correction for 554 tests, let alone genome-wide significance for SNPs. A plot of the observed versus expected P value distribution did not show deviation from the null distribution (Figure 3).
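For reference, the genomic control lambda is conventionally computed as the median observed association statistic divided by the median expected under the null. The sketch below assumes each CNV test yields a P value that can be mapped to a 1-degree-of-freedom chi-square statistic, and it assumes a family-wise alpha of 0.05 for the Bonferroni comparison; neither detail is stated in the text, and the function names are illustrative.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a 1-df chi-square distribution (about 0.455).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_chi2_1df(p):
    """Convert a two-sided P value to a 1-df chi-square statistic."""
    if not 0 < p <= 1:
        raise ValueError("P value must lie in (0, 1]")
    # Use the lower tail so very small P values keep their precision.
    return _STD_NORMAL.inv_cdf(p / 2) ** 2


def genomic_control_lambda(p_values):
    """Median observed chi-square divided by its expectation under the null."""
    return median(p_to_chi2_1df(p) for p in p_values) / CHI2_1DF_MEDIAN


# Bonferroni threshold for 554 common CNV tests (alpha = 0.05 assumed).
cnv_bonferroni = 0.05 / 554
strongest_p = 0.002  # strongest common-CNV association reported in the text
print(f"threshold = {cnv_bonferroni:.2g}, passes = {strongest_p < cnv_bonferroni}")
```

A lambda near 1 indicates that the bulk of the test statistics follow the null distribution; values well above 1 point to inflation, for example from stratification or genotyping artifacts.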
To detect rare CNVs, we used Birdseye18 and restricted analysis to autosomal deletions and duplications that were both rare (< 1% frequency in our samples) and large (greater than 100 kb). After stringent quality control filtering (Supplementary Methods), the analysis included 5,955 individuals and 8,065 CNVs (39% deletions). The mean number of rare CNVs per individual was 1.35 and the median was 1.
Using the same methods recently described in a successful study of schizophrenia14, we evaluated case/control differences in rare CNVs across three parameters: the overall burden of rare CNVs genome-wide, the number of genes overlapped by rare CNVs, and the total kilobase extent of rare CNVs. Controlling for sample collection site, there were no case/control differences in genome-wide rare CNV rate (P = 0.39), the number of genes intersected by rare CNVs (P = 0.74) or the total kilobase extent of rare CNVs (P = 0.77). Furthermore, there were no differences in rare CNV rate when restricting analysis to only gene-intersecting rare CNVs (P = 0.55), deletions (P = 0.57) or duplications (P = 0.34). Searching for specific loci with increased rates of rare CNVs in cases versus controls, only 4 regions showed uncorrected P values of P < 0.01; however, the lowest P value after correction for multiple testing was P = 0.96.
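One common way to compare rare CNV burden between cases and controls while controlling for collection site is a permutation test in which case/control labels are shuffled only within each site. The sketch below illustrates that idea for the per-individual CNV count; it is an assumption-laden illustration rather than the exact procedure of the cited schizophrenia study, and the function names, the one-sided statistic, and the default number of permutations are hypothetical choices.

```python
import random


def burden_difference(counts, is_case):
    """Mean rare-CNV count in cases minus mean count in controls."""
    case = [c for c, k in zip(counts, is_case) if k]
    ctrl = [c for c, k in zip(counts, is_case) if not k]
    return sum(case) / len(case) - sum(ctrl) / len(ctrl)


def stratified_permutation_p(counts, is_case, site, n_perm=1000, seed=0):
    """One-sided empirical P for excess burden in cases.

    Labels are permuted within each site, so between-site differences
    in CNV rate cannot generate a spurious case/control difference.
    """
    rng = random.Random(seed)
    observed = burden_difference(counts, is_case)

    # Group individual indices by collection site.
    by_site = {}
    for idx, s in enumerate(site):
        by_site.setdefault(s, []).append(idx)

    hits = 0
    for _ in range(n_perm):
        permuted = list(is_case)
        for indices in by_site.values():
            labels = [is_case[i] for i in indices]
            rng.shuffle(labels)
            for i, label in zip(indices, labels):
                permuted[i] = label
        if burden_difference(counts, permuted) >= observed:
            hits += 1
    # Add-one correction keeps the empirical P strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

The same skeleton applies to the other two burden measures by replacing the per-individual count with the number of genes intersected or the total kilobases spanned.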
In conclusion, we screened common SNPs and both common and rare CNVs for association with early-onset MI in a large sample. Our study suggests four main conclusions. First, there are eight gene regions at which common SNPs are associated with MI with genome-wide significance and replication, two of which were newly implicated by this study. Second, at 9p21.3, we show that the SNP with the best statistical evidence for MI risk is also correlated with expression of a neighboring gene - CDKN2B - in human fat tissue. Third, whereas the effects of the individual SNPs are modest, the overall effect (in a comparison of extreme quintiles) of an eight-SNP score (two-fold increased risk for MI) is comparable in predictive value to plasma LDL cholesterol28.
Fourth, and in contrast to the positive results for genetic mapping of MI via SNP analysis, we were unable to detect common or rare CNVs associated with risk for MI. The current analysis is directly comparable to a recent study of schizophrenia that found convincing evidence for rare CNVs associated with disease both at specific loci and for three specific genome-wide burden measures14: both studies are of similar sample size, used the same genotyping platform, and were analyzed by the same methods and by the same analyst. The different results indicate that the genetic architecture of MI may differ from that of schizophrenia (based on natural selection, genetic complexity or other factors), and that the remaining inherited risk for MI must be due to some combination of common SNPs for which we do not yet have sufficient power, CNVs not measured in our analysis, rare point mutations, and non-additive interactions. However, by systematically measuring all forms of genetic variation in appropriate samples, it should be possible to identify the architecture of each trait and increase information about the pathophysiology of disease.