Coronary heart disease (CHD) is the single greatest cause of death worldwide (1). Although CHD is highly heritable, the DNA sequence variations that confer cardiovascular risk remain largely unknown. To identify sequence variants associated with CHD, we undertook a genome-wide association study using 100,000 single-nucleotide polymorphisms (SNPs). To minimize false positive associations without unduly sacrificing statistical power, we designed the study to comprise three sequential case-control comparisons performed at a nominal significance threshold of P < 0.025 (Fig. 1). For the initial genome-wide scan, cases and controls were Caucasian men and women from Ottawa, Canada, who participated in the Ottawa Heart Study (OHS). Cases had severe, premature CHD with a documented onset before the age of 60 years and culminating in coronary artery revascularization (table S1). To limit confounding by factors that strongly predispose to premature CHD, we excluded individuals with diabetes or plasma cholesterol levels consistent with monogenic hypercholesterolemia (>280 mg/dL). Controls were healthy Caucasian men (>65 years) and women (>70 years) from Ottawa who had no symptoms or history of CHD (table S1).
Fig. 1 Study design for identification and validation of sequence variants associated with CHD. Assuming independence, the probability of any single SNP achieving a nominal significance level of 0.025 in all three studies with the associations being in the same ...
Custom oligonucleotide arrays (3) were used to assay 100,000 SNPs arranged at ~30-kb intervals throughout the genome in 322 cases and 312 controls (data set designated as OHS-1). Of these, 9636 SNPs deviated strongly from Hardy-Weinberg equilibrium (P < 0.001) or did not meet quality-control criteria (3), and 17,500 were subpolymorphic (minor allele frequency < 1%) in the sample. The remaining 72,864 SNPs were entered into the analysis, and 2586 were associated with CHD at a nominal significance threshold of 0.025 (table S2). These 2586 SNPs were genotyped in an independent sample of 311 cases and 326 controls from Ottawa (OHS-2) using the same criteria as OHS-1 (table S1). Of these, 50 were associated with CHD at a nominal significance threshold of 0.025, with the same direction of effect (table S2).
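The two SNP-level filters described above (Hardy-Weinberg deviation at P < 0.001 and minor allele frequency < 1%) can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the text does not say which Hardy-Weinberg test was applied or in which samples, so the Pearson chi-square test with 1 degree of freedom and the function names `hwe_chi2_test` and `passes_filters` are assumptions, and the other quality-control criteria (3) are not modeled.

```python
import math


def hwe_chi2_test(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Assumption: the source does not name the test it used; this is one
    common choice. Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        # Monomorphic SNP: no deviation can be measured.
        return 0.0, 1.0
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def passes_filters(n_aa, n_ab, n_bb, hwe_alpha=0.001, min_maf=0.01):
    """Keep a SNP only if it is polymorphic enough and consistent with HWE."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    maf = min(p, 1.0 - p)
    if maf < min_maf:  # "subpolymorphic" in the text
        return False
    _, p_value = hwe_chi2_test(n_aa, n_ab, n_bb)
    return p_value >= hwe_alpha


# Hypothetical genotype counts (AA, AB, BB), not data from the study.
print(passes_filters(25, 50, 25))  # genotypes in exact HWE proportions
print(passes_filters(50, 0, 50))   # no heterozygotes at all
print(passes_filters(0, 1, 99))    # minor allele frequency of 0.5%
```

With small genotype counts an exact test is often preferred over the chi-square approximation; the sketch uses the latter only because it needs nothing beyond the standard library.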
To limit attrition of true positive associations due to inadequate statistical power, we performed the third case-control comparison in a much larger prospective study of CHD risk, the Atherosclerosis Risk in Communities (ARIC) study, which enrolled and followed 11,478 Caucasians (4). Only 2 of the 50 SNPs identified in the Ottawa cohorts were significantly associated with incident CHD in the ARIC population (table S2). These two SNPs, rs10757274 and rs2383206, were located within 20 kb of each other on chromosome 9p21 and were in strong linkage disequilibrium (r2
To validate the association between rs10757274 and rs2383206 and CHD, we assayed both SNPs in three additional independent cohorts: the Copenhagen City Heart Study (CCHS), a prospective study of ischemic heart disease in 10,578 Danish men and women (5); the Dallas Heart Study (DHS), a population-based probability sample of Dallas County residents (6); and a third sample of 647 cases and 847 controls from the Ottawa Heart Study population (OHS-3). In the CCHS, cases were participants who experienced an ischemic cardiovascular event during the 20-year follow-up period, whereas controls were those who did not develop CHD over the same time interval. In the DHS, cases and controls were defined by using electron-beam computed tomography to measure coronary artery calcium, an index of coronary atherosclerosis (7). In OHS-3, cases were participants who had documented CHD before the age of 55 (men) or 65 (women) years, whereas controls were men aged > 65 years and women aged > 70 years who did not have symptoms of CHD (table S1). In all three validation studies, both SNPs were significantly associated with CHD (Table 1).
Table 1 Association between SNPs rs10757274 and rs2383206 and CHD. Values are numbers of individuals in each genotype group. P values were calculated by χ² tests on allele counts. HW, Hardy-Weinberg equilibrium.
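A χ² test on allele counts, as named in the table legend, collapses each genotype group into counts of the two alleles and compares cases with controls in a 2 × 2 table. The sketch below shows one way to do this; the genotype counts are hypothetical, the function names are illustrative, and the absence of a continuity correction is an assumption, since the legend does not say whether one was applied.

```python
import math


def allele_counts(n_aa, n_ab, n_bb):
    """Convert genotype counts (AA, AB, BB) into allele counts (A, B)."""
    return 2 * n_aa + n_ab, 2 * n_bb + n_ab


def allelic_chi2(case_genotypes, control_genotypes):
    """Pearson chi-square test (1 df, no continuity correction) on the
    2 x 2 table of allele counts in cases versus controls.

    Returns (chi2, p_value, odds_ratio for allele A).
    """
    a, b = allele_counts(*case_genotypes)     # cases: A alleles, B alleles
    c, d = allele_counts(*control_genotypes)  # controls: A alleles, B alleles
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Hypothetical genotype counts (AA, AB, BB); not taken from the study.
cases = (90, 150, 60)
controls = (60, 150, 90)
chi2, p_value, odds_ratio = allelic_chi2(cases, controls)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2g}, allelic OR = {odds_ratio:.2f}")
```

The allelic test treats the two alleles of each person as independent observations, which is reasonable only when genotypes are close to Hardy-Weinberg proportions; that is presumably why the table also reports a Hardy-Weinberg column.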
The magnitude of CHD risk associated with the risk allele was determined by Cox proportional-hazards modeling in the ARIC and CCHS cohorts. The hazard ratios associated with the risk alleles were comparable in the two populations and indicated a graded increase in risk from noncarriers to heterozygotes to homozygotes (Table 2). The two SNPs (rs10757274 and rs2383206) define an allele that was associated with an ~15 to 20% increase in CHD risk in the 50% of individuals who were heterozygous for the allele and an ~30 to 40% increase in CHD risk in the 25% of Caucasians who were homozygous for the allele. The population attributable risk associated with the risk allele was 12.5 to 15% in the ARIC population and 10 to 13% in the CCHS cohort.
Table 2 Risk of CHD as a function of rs10757274 and rs2383206 in the ARIC study and the CCHS. ARIC expected event values are based on the log-rank test. ARIC incidence rates are measured in number of events per 10,000 person-years of follow-up. Ranges in parentheses ...
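To show how a population attributable risk follows from genotype frequencies and relative risks, the sketch below applies the standard multi-level form of Levin's formula to the rounded figures quoted in the text (50% heterozygotes with a 15 to 20% risk increase, 25% homozygotes with a 30 to 40% increase). It treats hazard ratios as relative risks, which is an assumption, and it is not expected to reproduce the cohort-specific values reported for ARIC and CCHS, which rest on the actual estimates in Table 2.

```python
def population_attributable_fraction(exposure_levels):
    """Levin's formula for several exposure levels.

    exposure_levels: iterable of (prevalence, relative_risk) pairs, one per
    exposed group, with relative risks taken against the unexposed group.
    """
    excess = sum(p * (rr - 1.0) for p, rr in exposure_levels)
    return excess / (1.0 + excess)


# Rounded inputs from the text; hazard ratios are used as relative risks.
low = population_attributable_fraction([(0.50, 1.15), (0.25, 1.30)])
high = population_attributable_fraction([(0.50, 1.20), (0.25, 1.40)])
print(f"Illustrative attributable fraction: {low:.1%} to {high:.1%}")
```

The result lands in the same general range as the reported figures, which is the only point of the exercise: a modest per-genotype risk can account for a sizeable share of cases when three-quarters of the population carry at least one copy of the allele.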
The mechanistic basis for the association between the risk allele defined by rs10757274 and rs2383206 and CHD is not known. The allele may increase the development of atherosclerotic plaques, promote thrombogenesis, or increase the tendency of atherosclerotic plaques to rupture. The finding that the risk allele was associated with coronary artery calcification in the DHS and with severe premature atherosclerosis in OHS-1 suggests that it promotes CHD by increasing the atherosclerotic burden. The risk allele was not associated with any of the major risk factors for atherosclerosis in ARIC or CCHS (table S3 and table S4), and the association remained significant in models that considered multiple possible confounding covariates (including age, gender, plasma lipid levels, blood pressure, diabetes, and plasma C-reactive protein concentrations; table S3). These analyses suggest that the effect of the chromosome 9 risk allele on CHD was not mediated by any of the established risk factors for cardiovascular disease.
To fine-map the locus associated with CHD, we assayed SNPs spaced at ~5-kb intervals across the region extending 175 kb upstream and downstream of rs10757274 and rs2383206 in 500 cases and 500 controls from OHS-2 and OHS-3. Eight additional SNPs at the locus spanning a 58-kb region (extending from 22,062,301 to 22,120,389) were significantly associated with CHD (Fig. 2). All eight were in strong linkage disequilibrium with each other and with rs10757274 and rs2383206. The region demarcated by these SNPs was flanked on both sides by ~50-kb regions in which none of the 30 SNPs tested were associated with CHD. Two of 65 SNPs in the 350-kb region surrounding the 58-kb risk locus were associated with CHD at the nominal significance threshold, but neither was in strong linkage disequilibrium with rs10757274 and rs2383206. These data indicate that the risk allele comprises a single haplotype that spans ~58 kb.
Fig. 2 Fine mapping of the genomic interval on chromosome 9 associated with CHD. (A) SNPs spaced ~5 kb apart in the interval extending 175 kb upstream and downstream of rs10757274 and rs2383206 were assayed in 500 cases and 500 controls from the OHS population ...
Inspection of the UCSC Genome Browser (http://genome.ucsc.edu) and BLAST searches against the National Center for Biotechnology Information (NCBI) (www.ncbi.nih.gov/blast) nr (nonredundant) nucleotide sequence database revealed no annotated genes or microRNAs within the 58-kb interval. A number of spliced expressed sequence tags (ESTs) map within the interval, but none contained an open reading frame that extends more than a few amino acids. Resequencing of the 58-kb interval in two homozygotes for the risk allele and one homozygote for the reference allele revealed 66 polymorphisms (SNPs plus small insertions or deletions), of which 35 were specific to the risk allele (table S5). Only one of these variants, a copy number variation in a run of nine consecutive CAT repeats (extending from nucleotide 22110787 to 22110814, NCBI build 36.1), mapped to a spliced transcript (BI765545) that appears to be part of a large noncoding RNA of unknown function (8). Polymerase chain reaction (PCR) amplification of cDNAs confirmed expression of the transcript in placenta and transformed lymphocytes (fig. S1). It is possible that variation in the expression or function of this transcript may be associated with risk of CHD.
Alternatively, the risk allele may alter a regulatory element that affects the expression of a gene (or genes) located outside of the 58-kb interval. Cross-species sequence alignments revealed several conserved segments within the 58-kb interval that may contain such regulatory elements (fig. S2). It is also possible that the risk allele extends beyond the 58-kb interval defined in this study and that the functional sequence variants that confer risk of CHD are located outside of the interval. Resequencing the coding regions of the two genes most proximal to the risk locus, CDKN2A, revealed only a single variant (Ala158 in CDKN2A) that was present in 6 of the 96 individuals examined and is thus unlikely to explain the CHD risk associated with the locus. The localization of the risk locus to a region devoid of known genes implicates a previously unrecognized gene or regulatory element that can substantially affect CHD independently of established risk factors. Further studies will be required to elucidate the mechanism by which the locus modulates CHD risk.
Comparison of the Yoruba and Centre d’Etude du Polymorphisme Humain (CEPH) data from the International HapMap Project (www.hapmap.org) revealed notable ethnic differences in allele frequencies in the risk interval (table S6). Of the 10 alleles that were significantly associated with CHD in whites, 3 were virtually absent from the Yoruba population, and 6 others were much less common. Both rs10757274 and rs2383206 were present at appreciable frequencies among African-Americans in ARIC and DHS, but neither SNP was associated with CHD in either population (table S7). The apparent ethnic differences in association between these SNPs and CHD in ARIC may reflect differences in statistical power, but this cannot explain the ethnic differences observed in DHS, where African-Americans are the largest group. Accordingly, it seems more likely that the functional sequence variants associated with the risk allele in whites are less common in African-Americans. This notion is consistent with our finding that the frequencies of several alleles associated with CHD risk factors differ widely among ethnic groups (9). Comprehensive analysis of the locus in African-Americans may allow further refinement of the risk interval.
The results of this study illustrate both the perils and the promise of whole-genome association. The initial scan and the first replicate screen both generated substantially more SNPs that achieved the prespecified significance threshold than would be predicted by chance alone, as indicated by permutation testing (table S2). Yet only two of these SNPs (comprising one allele) survived further replication, despite the use of a large sample (i.e., ARIC) with high statistical power. This finding highlights the necessity for adequate replication to protect against artifacts that may occur because of population stratification, multiple testing, or other factors to which whole-genome association studies are particularly susceptible. The consistent replication of the chromosome 9 risk allele in six independent study samples indicates that the approach can be productively applied to conditions as complex as CHD, which is known to be influenced by a variety of environmental and genetic factors (12). Furthermore, analysis of 50 randomly selected regions of 500 kb each indicated that the 72,864 informative SNPs used in the initial scan provided 30 to 40% of the power that would be obtained by assaying all phase II HapMap SNPs. Therefore, scans with denser SNP panels and larger sample sizes may reveal further loci associated with CHD risk.
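The comparison with chance can be made concrete with a back-of-the-envelope calculation using only the counts given in the text. This is not the permutation analysis reported in table S2; it is a rough sketch that assumes independent SNPs, independent stages, and two-sided tests, so that under the null hypothesis a replicate is both significant and concordant in direction with probability 0.025 / 2. The function name is illustrative.

```python
ALPHA = 0.025  # nominal significance threshold used at every stage


def expected_null_hits(n_tested, alpha=ALPHA, same_direction=False):
    """Expected number of SNPs passing a stage if no SNP is truly associated.

    Assumption: tests are two-sided, so requiring the same direction of
    effect as an earlier stage halves the per-SNP probability.
    """
    per_snp = alpha / 2 if same_direction else alpha
    return n_tested * per_snp


stage1 = expected_null_hits(72_864)                      # initial scan (OHS-1)
stage2 = expected_null_hits(2_586, same_direction=True)  # replicate (OHS-2)

# Probability that one null SNP passes all three stages with a consistent
# direction, and the expected number of such SNPs across the whole scan.
joint = ALPHA * (ALPHA / 2) ** 2
expected_survivors = 72_864 * joint

print(f"Stage 1: about {stage1:.0f} expected by chance, 2586 observed")
print(f"Stage 2: about {stage2:.0f} expected by chance, 50 observed")
print(f"All three stages: p = {joint:.2e} per SNP, "
      f"about {expected_survivors:.2f} expected survivors")
```

Under these assumptions roughly 1822 and 32 SNPs would pass the first two stages by chance, against 2586 and 50 observed, while fewer than one null SNP would be expected to survive all three stages.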