We jointly analyzed data from two distinct but complementary genomewide association studies of coronary artery disease and myocardial infarction that performed ascertainment in similar ways and that involved the same genotyping platform. Sequential and combined analyses of the two data sets allowed us to identify several new genetic loci, which individually and in aggregate considerably affect the risk of coronary artery disease.
The association of chromosome 9p21.3 with coronary artery disease was the strongest found in the WTCCC study. 5 The finding that this locus was also most strongly associated with myocardial infarction in the German study provides compelling proof of its involvement in coronary artery disease. The evidence of association is strong, the risk variant is common, and each copy of the allele substantially increases the probability of the disease. These findings unequivocally demonstrate a major genetic risk variant at this locus.
Indeed, during revision of this manuscript, two other genomewide studies reported a strong association of the same 9p21.3 locus with coronary artery disease and myocardial infarction, 13,14 making this the most highly replicated locus for coronary artery disease identified to date. The region contains the coding sequences of genes for two cyclin-dependent kinase inhibitors, CDKN2A (encoding the prototypic INK4 protein p16INK4a) and CDKN2B (encoding p15INK4b), which play an important role in the regulation of the cell cycle and may be implicated, through their role in TGF-β-induced growth inhibition, in the pathogenesis of atherosclerosis. 15-17 Although regulation of one or both of the CDKN2 genes may explain the association with coronary artery disease, other explanations also need to be considered, including involvement of the methylthioadenosine phosphorylase (MTAP) gene or of other expressed sequences located in the region. The same region has also recently been associated with increased susceptibility to type 2 diabetes, 18-20 raising the possibility of a shared, rather than a single, mechanism causing both coronary artery disease and diabetes.
The association of chromosome 6q25.1 with coronary artery disease maps to the MTHFD1L gene, which encodes the mitochondrial isozyme of C1-tetrahydrofolate (THF) synthase. 21,22 The family of C1-THF synthases is used in a variety of cellular processes, particularly the synthesis of purine and methionine. 21 Therefore, MTHFD1L activity may also contribute to plasma homocysteine levels, 21,23 raising the possibility of a link between MTHFD1L variants and this risk factor for coronary artery disease. 24 A preliminary analysis of data from 1070 persons in the AtheroGene study 25 has not revealed an association between rs6922269 genotypes and plasma homocysteine levels (Tiret L, Blankenberg S: personal communication). Nevertheless, further studies in a wider range of subjects are needed to investigate this possibility.
Our findings demonstrate the main strength of a genomewide approach, namely, the possibility of identifying hitherto unsuspected loci that increase susceptibility to complex diseases. However, the mechanisms underlying the newly identified associations are often not immediately obvious. Indeed, the mechanisms for the association of signals on chromosomes 9p21.3, 6q25.1, and 2q36.3 with coronary artery disease all require elucidation. Similarly, none of the chromosomal loci identified in the combined analysis have previously been strongly linked to coronary artery disease. However, genes in several of the loci (PSRC1 at 1p13.3, MIA3 at 1q41, and SMAD3 at 15q22.33) play a role in cell growth or inhibition. 26-29 These processes are fundamental for the formation and progression of atherosclerotic plaque and also for plaque instability. 30 Our results suggest that genetic regulation of these processes plays an important role in the development of coronary artery disease and myocardial infarction.
Some loci from the WTCCC study that we attempted to replicate did not show association in the German study. These negative data underscore the need to view genomewide associations with caution, despite their statistical strength, until they have been replicated in appropriate validation samples. In this context, caution should also be exercised with regard to the four loci identified in our combined analysis.
31 Our primary objective was to identify loci with significant associations with coronary artery disease independently of any biologic assumptions. Nonetheless, the genotyping platform also offered an opportunity to examine genetic variants in genes with previously reported associations. Indeed, several showed evidence of an association in one of our studies. However, only SNPs in the lipoprotein lipase gene had evidence of an association in both studies. This finding is in agreement with those in most recent systematic studies that were largely unsuccessful in replicating initial findings in candidate genes. 32 However, many of the previously studied gene variants are poorly tagged on the GeneChip array, which clearly fails to cover the full extent of even common variation in these genes.
Whether our findings can be translated into better prevention or treatment for coronary artery disease will become clear only over time and with further research. Although the odds ratios for each locus are modest, as anticipated for a polygenic disorder, the estimates of population attributable fractions for the three validated loci are substantial, both individually and in aggregate. This observation offers the potential for improved overall coronary risk prediction. However, the case subjects in both studies had a strong family history of premature coronary artery disease. This might have enhanced the power to detect an association with coronary artery disease, but it also might have increased the estimated population attributable risks beyond those for sporadic cases. Further analysis of the loci in a wider range of subjects is therefore necessary. Further studies are also needed to investigate the associations of the loci with other types of atherosclerotic disease, as well as with cardiovascular risk factors and markers. At a genetic level, studies should focus on fine mapping of the associated regions and thorough investigation of candidate genes. Our results provide a framework for all these additional studies.
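To illustrate why modest per-locus effects can still yield a substantial population attributable fraction when the risk variant is common, the following minimal sketch uses Levin's formula, PAF = p(RR − 1) / [1 + p(RR − 1)]. This is an assumption made for illustration, not necessarily the estimator used in this study. The function names are hypothetical, the odds ratio is treated as an approximation of the relative risk, and the aggregate calculation assumes that the loci act independently and multiplicatively.

```python
from math import prod


def attributable_fraction(exposure_prevalence, relative_risk):
    """Levin's population attributable fraction for a single exposure.

    exposure_prevalence: proportion of the population carrying the risk
        genotype (between 0 and 1).
    relative_risk: relative risk in carriers versus non-carriers; an odds
        ratio is only an approximation of this quantity.
    """
    if not 0.0 <= exposure_prevalence <= 1.0:
        raise ValueError("exposure_prevalence must lie between 0 and 1")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    excess = exposure_prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def combined_attributable_fraction(fractions):
    """Aggregate fraction for several exposures.

    Assumes the exposures are independent and combine multiplicatively,
    which is an illustrative simplification.
    """
    return 1.0 - prod(1.0 - f for f in fractions)
```

With purely hypothetical inputs (not values from this study), a variant carried by half of the population with a relative risk of 2.0 gives `attributable_fraction(0.5, 2.0)`, which is one third. Two such independent variants combine to less than the sum of their individual fractions.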
Our analysis has several important limitations. Although the GeneChip array typed over 500,000 variants, a substantial percentage could not be evaluated, for reasons given in the Supplementary Appendix. Furthermore, to reduce the effect of multiple testing, we used only the rather conservative Cochran–Armitage test for trend (an additive model) to screen the WTCCC data for significant associations. These limitations make it likely that some loci were missed and that further analysis of the data and subsequent validation will reveal other loci.
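For readers unfamiliar with the trend test mentioned above, the following is a minimal sketch of a Cochran–Armitage test under an additive model. It is not the analysis code used in this study. The function name, the genotype scores of 0, 1, and 2 risk alleles, and the use of the N (rather than N − 1) variance form are assumptions made for illustration.

```python
import math


def cochran_armitage_trend(case_counts, control_counts, scores=(0, 1, 2)):
    """Cochran-Armitage test for trend on a 2 x k genotype table.

    case_counts, control_counts: counts per genotype class, ordered by
        number of risk alleles.
    scores: per-class weights; (0, 1, 2) corresponds to an additive model.
    Returns (z, p), where p is the two-sided P value from the standard
    normal approximation.
    """
    if not (len(case_counts) == len(control_counts) == len(scores)):
        raise ValueError("counts and scores must have the same length")
    class_totals = [r + s for r, s in zip(case_counts, control_counts)]
    n_total = sum(class_totals)
    n_cases = sum(case_counts)
    if n_total == 0:
        raise ValueError("table is empty")
    case_fraction = n_cases / n_total

    # Score-weighted excess of cases over the count expected under no trend.
    numerator = sum(
        w * (r - n * case_fraction)
        for w, r, n in zip(scores, case_counts, class_totals)
    )
    # Variance of the numerator under the null hypothesis.
    sum_w_n = sum(w * n for w, n in zip(scores, class_totals))
    sum_w2_n = sum(w * w * n for w, n in zip(scores, class_totals))
    variance = case_fraction * (1.0 - case_fraction) * (
        sum_w2_n - sum_w_n ** 2 / n_total
    )
    if variance <= 0.0:
        raise ValueError("variance is zero; the table carries no information")

    z = numerator / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p
```

Because the test has a single degree of freedom, it is well powered for additive effects. Loci acting under strongly dominant or recessive models may be detected less efficiently, which is consistent with the limitation noted above.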
Nonetheless, by using a sequential strategy of initial replication and subsequent combination of information from the two genomewide association studies, we were able to describe several new genetic loci for coronary artery disease and myocardial infarction that have a considerable effect on the risk of these diseases and that merit in-depth follow-up studies. Most important, the finding that a single locus was the strongest signal in two separate studies carries promise for clinically relevant progress in our understanding of the genetics of coronary artery disease. As the current activity in genomewide association studies of complex traits accelerates, our approach may also provide a paradigm for combining the results of such studies to maximize the amount of valuable information that can be extracted from these expensive and laborious experiments.