Comparison of patients with coronary heart disease and controls in genome-wide association studies has revealed several single nucleotide polymorphisms (SNPs) associated with coronary heart disease. We aimed to establish the external validity of these findings and to obtain more precise risk estimates using a prospective cohort design.
We tested 13 recently discovered SNPs for association with coronary heart disease in a case-control design including participants differing from those in the discovery samples (3829 participants with prevalent coronary heart disease and 48 897 controls free of the disease) and a prospective cohort design including 30 725 participants free of cardiovascular disease from Finland and Sweden. We modelled the 13 SNPs as a multilocus genetic risk score and used Cox proportional hazards models to estimate the association of genetic risk score with incident coronary heart disease. For case-control analyses we analysed associations between individual SNPs and quintiles of genetic risk score using logistic regression.
In prospective cohort analyses, 1264 participants had a first coronary heart disease event during a median 10·7 years' follow-up (IQR 6·7–13·6). Genetic risk score was associated with a first coronary heart disease event. When compared with the bottom quintile of genetic risk score, participants in the top quintile were at 1·66-times increased risk of coronary heart disease in a model adjusting for traditional risk factors (95% CI 1·35–2·04, p value for linear trend=7·3×10⁻¹⁰). Adjustment for family history did not change these estimates. Genetic risk score did not improve C index over traditional risk factors and family history (p=0·19), nor did it have a significant effect on net reclassification improvement (2·2%, p=0·18); however, it did have a small effect on integrated discrimination improvement (0·004, p=0·0006). Results of the case-control analyses were similar to those of the prospective cohort analyses.
Using a genetic risk score based on 13 SNPs associated with coronary heart disease, we can identify the 20% of individuals of European ancestry who are at roughly 70% increased risk of a first coronary heart disease event. The potential clinical use of this panel of SNPs remains to be defined.
The Wellcome Trust; Academy of Finland Center of Excellence for Complex Disease Genetics; US National Institutes of Health; the Donovan Family Foundation.
Coronary heart disease is complex in origin, with contributions from lifestyle and genetic factors.1 Family history of premature coronary heart disease is an independent risk factor, suggesting that inherited DNA sequence variants contribute to risk of the disease. Using a case-control design, genome-wide association studies have identified single nucleotide polymorphisms (SNPs) at 13 genomic regions associated (p<5×10⁻⁸) with coronary heart disease, myocardial infarction, or both.2 In discovery studies, each copy of the risk allele at these loci was estimated to increase risk of myocardial infarction by 12–92%.
Discovery genome-wide association studies for myocardial infarction or coronary heart disease have ascertained cases on the basis of early age of disease onset or affected family members, and as such the reported effect estimates might not be representative of the general population. Although efficient for discovery, cross-sectional and case-control designs have the potential for several types of bias, whereas the prospective cohort study is regarded as the gold standard in epidemiological investigations.3 Therefore, we set out to answer two questions: first, are the reported genetic association findings externally generalisable in studies differing from the discovery studies; and second, can more precise risk estimates be obtained with a prospective cohort design?
We tested the 13 recently discovered SNPs for association with coronary heart disease in two designs: a case-control design including participants differing from those in the discovery samples (3829 participants with prevalent coronary heart disease and 48 897 controls free of the disease); and a prospective cohort design including 30 725 participants free of cardiovascular disease from Finland and Sweden. Coronary heart disease was defined as myocardial infarction, unstable angina pectoris, coronary revascularisation (coronary artery bypass graft or percutaneous transluminal coronary angioplasty), or death due to coronary heart disease. Cardiovascular disease included coronary heart disease and ischaemic stroke events. Detailed case definitions are described in the webappendix.
Participants from seven cohorts were included in our analyses (table 1). The FINRISK 1992, 1997, and 2002 cohorts consist of a representative random sample selected from inhabitants of different regions in Finland aged 25–74 years. The survey included a mailed questionnaire and a clinical examination at which a blood sample was drawn. The study protocol has been described previously.4 23 036 individuals participated in these cohorts and genotype data were available from 20 927 participants.
The Health 2000 study was based on a stratified two-stage cluster sampling from the National Population Register to represent the total Finnish population aged 30 years and older.5 The survey included an interview about medical history, health-related lifestyle habits, and a clinical examination at which a blood sample was drawn. 6200 people participated in the study. After exclusion of individuals older than 80 years and without sufficient genotype data, the final dataset consisted of 5796 participants. A detailed methodology report is available online.6
The Malmö Diet and Cancer (MDC) study was a community-based prospective epidemiological cohort of 28 449 people recruited for a baseline examination between 1991 and 1996.7 From this cohort, 6103 people were randomly selected to participate in the Cardiovascular Cohort (MDC-CC), which sought to investigate risk factors for cardiovascular disease. All participants underwent a medical history, physical examination, and laboratory assessment for cardiovascular risk factors, as described previously.8 Final data with genotypes were available for 5104 participants.
During follow-up of the FINRISK and Health 2000 cohorts, data for admission to hospital and mortality were obtained from the Finnish National Hospital Discharge Register and the Finnish National Causes-of-Death Register. These registers have excellent validity and coverage.9,10 Follow-up ended on Dec 31, 2007. Follow-up of the MDC-CC is as previously described.11
The Malmö Preventive Project (MPP) is a cohort from southern Sweden that was set up in 1974. 33 346 individuals were screened during 1974–92. Information concerning lifestyle factors and medical history was obtained from a questionnaire. All participants underwent physical examination and biochemical analyses. Of individuals who participated in the baseline examinations, 17 284 were rescreened during 2002–06. The final data with genotypes included 14 884 individuals.
The COROGENE cohort initially included all consecutive Finnish patients undergoing coronary angiogram in the Helsinki University Central Hospital between June, 2006, and March, 2008 (n=5330). For these patients, a questionnaire, information about previous medical conditions and cardiovascular risk factors, hospital records for patients' history, laboratory measurements, electrocardiogram, echocardiogram, and medication were obtained. Of these patients, 2118 (53%) had acute coronary syndrome and were selected as COROGENE cases. The controls for COROGENE cases were selected from the Helsinki-Vantaa region participants of FINRISK 1997, 2002, and 2007 by risk set sampling.12 For each case, two controls matched by sex and birth year and free of acute coronary syndrome were sampled. In total, 2101 cases and 3914 controls (of which 1453 were unique) formed the final genotyped COROGENE case-control sample.
The FINRISK 1992, 1997, 2002 and Health 2000 study protocols were approved by the ethics committee of the National Institute for Health and Welfare, the MDC-CC and MPP study protocols by the ethics committee of Lund University, and the COROGENE study protocol by the ethics committee of Helsinki University Hospital, Internal Medicine. All participants provided written informed consent.
We selected SNPs from genome-wide association studies published before June, 2009 in which phenotypes studied were myocardial infarction or coronary heart disease, and association between a SNP and myocardial infarction or coronary heart disease exceeded a genome-wide association threshold (p<5×10⁻⁸). 13 SNPs from seven reports13–19 met these criteria, including 1q41 in MIA3, 1p32 near PCSK9, 1p13 near CELSR2–PSRC1–SORT1, 2q33 in WDR12, 6p24 in PHACTR1, 9p21 near CDKN2A–CDKN2B, 10q11 near CXCL12, 19p13 near LDLR, 21q22 near SLC5A3–MRPS6–KCNE2, 3q22 in MRAS, 6q26 in LPA, 12q24 near HNF1A, and 12q24 in SH2B3.
Samples were genotyped with the Sequenom platform (iPlex MassARRAY, San Diego, CA, USA) at the Institute for Molecular Medicine Finland FIMM (FINRISK 1992 and 2002), the Wellcome Trust Sanger Institute, UK (Health 2000), or the Broad Institute, USA (FINRISK 1997 and MDC-CC), and with Sequenom or Taqman (Applied Biosystems, Foster City, CA, USA) platforms at Lund University, Sweden (MDC-CC and MPP). COROGENE was genotyped with Illumina 610K chip (Illumina HumanHap 610-Quad SNP array, San Diego, CA, USA) at the Sanger Institute. Genotypes were manually curated with call rates above 97%.
We tested associations between SNPs and incident cardiovascular events using Cox proportional hazards models adjusted for traditional risk factors: sex, LDL and HDL cholesterol, current cigarette smoking, body-mass index, systolic and diastolic blood pressure, blood pressure treatment, and prevalent type 2 diabetes. Age was used as the baseline timescale in the Cox models. The proportional hazards assumption was met when tested with scaled Schoenfeld residuals.20
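As a compact reminder of the model form described above, the Cox proportional hazards model with age as the timescale can be written as follows. The symbols are our own notation and are not taken from the original analysis: a is attained age, G_i is the genetic exposure for participant i (a risk allele count, or an indicator for a genetic risk score quintile), and x_i is the vector of traditional risk factors.

```latex
h_i(a) = h_0(a)\,\exp\!\left(\beta_{G}\,G_i + \boldsymbol{\gamma}^{\top}\mathbf{x}_i\right),
\qquad
\mathrm{HR}_{G} = \exp(\beta_{G})
```

Under this form, the hazard ratio for the top versus bottom quintile is the exponentiated coefficient of the top-quintile indicator, with the bottom quintile as the reference group.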
We constructed a multilocus genetic risk score for each individual by summing the number of risk alleles (0/1/2) for each of the 13 SNPs weighted by their estimated effect sizes in the discovery sample (table 2 shows SNP-specific weights). Missing genotype values were imputed with the cohort-specific averages of risk allele frequencies. Estimates of association between the genetic risk score divided into quintiles and time to coronary heart disease, cardiovascular disease, and myocardial infarction were calculated with Cox proportional hazards models. For each cohort we calculated 95% CIs for hazard ratios (HRs) and tested the null hypothesis of no linear effect over the quintiles using a 1 df Wald test.
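The construction of the score can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study (analyses were done in R): the SNP names and weights are hypothetical rather than those in table 2, missing genotypes are assumed to be replaced by the cohort mean risk-allele count (twice the risk allele frequency), and quintiles are assigned by simple rank.

```python
from statistics import mean

# Hypothetical SNP weights (log odds ratios from a discovery sample).
# Illustrative values only; they are not the weights reported in table 2.
WEIGHTS = {"snp_a": 0.25, "snp_b": 0.10, "snp_c": 0.15}


def impute_missing(genotypes):
    """Replace missing allele counts (None) with the cohort mean count for that SNP."""
    snps = list(genotypes[0].keys())
    means = {
        s: mean(g[s] for g in genotypes if g[s] is not None) for s in snps
    }
    return [
        {s: (g[s] if g[s] is not None else means[s]) for s in snps}
        for g in genotypes
    ]


def genetic_risk_score(genotype, weights=WEIGHTS):
    """Weighted sum of risk-allele counts (0, 1, or 2 per SNP)."""
    return sum(weights[s] * genotype[s] for s in weights)


def quintile_labels(scores):
    """Assign each score to a quintile by rank (1 = lowest, 5 = highest)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // len(scores) + 1
    return labels
```

In use, `impute_missing` would be applied separately within each cohort before scoring, so that the imputed values reflect cohort-specific allele frequencies.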
For prevalent case-control analyses, we analysed individual SNP and quintiles of genetic risk score associations using a logistic regression model adjusted for age and sex. COROGENE data were analysed with conditional logistic regression. Each cohort was analysed separately, and the estimates, weighted by the inverse of their standard errors, were combined across cohorts with fixed effects meta-analysis.21
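The pooling step can be illustrated with a short sketch. It assumes the conventional fixed-effect estimator, in which each cohort's log odds ratio is weighted by the inverse of its squared standard error (inverse-variance weighting); the exact weighting used in the study may differ in detail, and the function names are our own.

```python
import math


def fixed_effect_meta(estimates, standard_errors):
    """Pool per-cohort log odds ratios with inverse-variance weights.

    Returns the pooled log odds ratio and its standard error.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


def odds_ratio_with_ci(log_or, se, z=1.96):
    """Convert a log odds ratio and its SE to an odds ratio with a 95% CI."""
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )
```

Pooling on the log scale and exponentiating at the end keeps the confidence interval symmetric around the log odds ratio, which is why the reported intervals are asymmetric on the odds ratio scale.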
To evaluate the potential value of genetic risk score in risk prediction, we used two cohorts (FINRISK 1992 and 1997) and up to 10-year follow-up. First, we compared the receiver operating characteristic (ROC) curves22 of models with and without genetic risk score. The statistical significance of change in the area under the ROC curve (AUC) between models was tested with the correlated C-index approach.23 Second, we calculated net reclassification improvement (NRI) and clinical NRI24 using the Kaplan-Meier approach with bootstrap-based p values,25 and integrated discrimination improvement (IDI).26 The model calibration was tested with the Hosmer-Lemeshow goodness-of-fit test.
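The three discrimination and reclassification measures can be illustrated with a simplified sketch. It treats the outcome as a binary event indicator with complete follow-up, so it does not reproduce the Kaplan-Meier adjustment for censoring, the correlated C-index test, or the bootstrap p values described above. The function names are our own, and the category cut-offs assume four 10-year risk bands (0–5%, 5–10%, 10–20%, and >20%).

```python
from bisect import bisect_right
from statistics import mean

# Cut-offs for four 10-year risk categories: 0-5%, 5-10%, 10-20%, >20%.
# Placing a risk that falls exactly on a boundary in the higher
# category is an assumption of this sketch.
CUTS = [0.05, 0.10, 0.20]


def category(risk, cuts=CUTS):
    """Index of the risk category (0 = lowest) for a predicted risk."""
    return bisect_right(cuts, risk)


def auc(risk, event):
    """Share of case/non-case pairs where the case has the higher risk (ties count half)."""
    cases = [r for r, e in zip(risk, event) if e]
    controls = [r for r, e in zip(risk, event) if not e]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def nri(old, new, event):
    """Categorical NRI: net upward moves in events minus net upward moves in non-events."""
    def net_up(flag):
        idx = [i for i, e in enumerate(event) if bool(e) == flag]
        up = sum(category(new[i]) > category(old[i]) for i in idx)
        down = sum(category(new[i]) < category(old[i]) for i in idx)
        return (up - down) / len(idx)
    return net_up(True) - net_up(False)


def idi(old, new, event):
    """IDI: change in the gap between mean predicted risk of events and non-events."""
    def gap(risk):
        cases = mean(r for r, e in zip(risk, event) if e)
        controls = mean(r for r, e in zip(risk, event) if not e)
        return cases - controls
    return gap(new) - gap(old)
```

A clinical NRI in this simplified setting would apply `nri` only to participants whose risk under the old model falls in the intermediate categories.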
Since each of the 13 reported SNPs has previously been associated with coronary heart disease or myocardial infarction at significance levels exceeding a stringent genome-wide threshold, in this report we regarded an association to be significant if a two-sided p value was less than 0·05 (for the same risk allele in the same direction as in the original report). The R statistical package (version 2.11.1) was used for all analyses.
The sponsors had no role in the conduct or interpretation of the study. The corresponding author had full access to all data in the study and had final responsibility for the decision to submit for publication.
52 726 participants in FINRISK 1992, 1997, and 2002, Health 2000, MDC-CC, MPP, and COROGENE were included in our analysis of prevalent cases versus controls. The total number of prevalent cases of coronary heart disease was 3829 (7%). 30 725 participants from five cohorts (FINRISK 1992, 1997, and 2002, Health 2000, and MDC-CC) were included in the prospective cohort analyses. Median follow-up was 10·7 years (IQR 6·7–13·6). 1264 (4%) incident cases of coronary heart disease occurred during follow-up. Table 1 shows background characteristics along with risk factor distributions for the cohorts.
In single SNP analyses for prevalent cases, 9p21 near CDKN2A–CDKN2B, 21q22 near SLC5A3–MRPS6–KCNE2, and 1q41 in MIA3 were associated with coronary heart disease, cardiovascular disease, and myocardial infarction, and 19p13 near LDLR was associated with prevalent coronary heart disease (webappendix p 2). In analysis of incident cases, 6q26 in LPA and 9p21 near CDKN2A–CDKN2B were associated with all three endpoints. Additionally, 6p24 in PHACTR1 was associated with incident coronary heart disease and cardiovascular disease, 12q24 in SH2B3 with cardiovascular disease and myocardial infarction, and 21q22 near SLC5A3–MRPS6–KCNE2 with coronary heart disease and myocardial infarction (table 2). Overall, seven of the 13 variants were associated in at least one analysis.
Genetic risk score was strongly associated with incident coronary heart disease, cardiovascular disease, and myocardial infarction when adjusted for age and sex (webappendix p 3) and traditional risk factors (table 3). Adjustment for traditional risk factors did not substantially change the estimates from the model adjusted for age and sex only. Participants in the top quintile of genetic risk score were estimated to have 1·66-times increased risk of coronary heart disease compared with those in the bottom quintile (95% CI 1·35–2·04, p value for linear trend across the quintiles=7·3×10⁻¹⁰), 1·50-times increased risk of cardiovascular disease (95% CI 1·29–1·75, p=1·9×10⁻¹⁰), and 1·46-times increased risk of myocardial infarction (95% CI 1·15–1·86, p=2·8×10⁻⁵).
Results were broadly similar when the genetic risk score was divided into tertiles. In models adjusted for traditional risk factors, participants in the top tertile of the genetic risk score were estimated to have a 1·56-times increased risk of coronary heart disease compared with those in the bottom tertile (95% CI 1·33–1·83, p value for linear trend across tertiles=2·8×10⁻⁸), 1·40-times increased risk of cardiovascular disease (95% CI 1·24–1·58, p=1·4×10⁻⁸), and 1·34-times increased risk of myocardial infarction (95% CI 1·12–1·62, p=0·0015).
The genetic risk score conferred risk comparable to other established risk factors such as plasma LDL cholesterol (HR 2·08, 95% CI 1·57–2·76, for top vs bottom quintile of LDL cholesterol in FINRISK studies), systolic blood pressure (HR 1·66, 95% CI 1·19–2·30, for top vs bottom quintile of systolic blood pressure in FINRISK studies), or plasma C-reactive protein (HR 1·79, 95% CI 1·15–2·80, for top vs bottom quintile in FINRISK studies). Although the group means were statistically different, the distribution of each quantitative risk factor between those who went on to develop coronary heart disease and those who did not was broadly overlapping (figure).
Table 4 shows results for prevalent events. The odds ratio for coronary heart disease between the highest and lowest quintile group was 1·63 (95% CI 1·24–2·15, p=4·8×10⁻⁵), for cardiovascular disease was 1·30 (95% CI 1·15–1·47, p=2·6×10⁻⁸), and for myocardial infarction was 1·56 (95% CI 1·38–1·76, p=1·2×10⁻¹⁵).
Additionally, we investigated whether adjustment for a history of early-onset myocardial infarction among first-degree relatives changed genetic risk score estimates in the FINRISK studies. Family history was significantly associated with incident events (HR for coronary heart disease 1·40, 95% CI 1·20–1·64; webappendix p 4), but the effect of the genetic risk score did not change after adjustment for family history (webappendix p 5).
We then investigated whether the genetic risk score association was dominated by rs4977574 at 9p21 near CDKN2A–CDKN2B, which is the strongest myocardial infarction locus reported to date. After adjustment for rs4977574, the HR between the highest and lowest genetic risk score quintile was 1·51 (95% CI 1·19–1·91) for coronary heart disease. Thus, other variants in the genetic risk score seem to have predictive power beyond the 9p21 locus (webappendix p 5).
The AUC estimates for coronary heart disease, cardiovascular disease, and myocardial infarction models with traditional risk factors and genetic risk score were 0·872, 0·853, and 0·881, respectively, and they were not significantly higher than the estimates from the models with only traditional risk factors (0·871, p=0·19; 0·853, p=0·48; and 0·880, p=0·35, for coronary heart disease, cardiovascular disease, and myocardial infarction, respectively).
Table 5 and table 6 show risk reclassification results for coronary heart disease. When participants were classified into four risk categories (0–5%, 5–10%, 10–20%, and >20%) on the basis of their 10-year predicted risk, 22 (13%) participants with coronary heart disease who were classified in the 10–20% risk category by the model with traditional risk factors moved into the greater than 20% category when genetic risk score was included in the model. Similarly, 54 (13%) of the participants without incident coronary heart disease were reclassified from the greater than 20% to the 10–20% category. IDI was significant for coronary heart disease (IDI 0·004, p=0·0006), cardiovascular disease (IDI 0·004, p=0·0004), and for myocardial infarction (IDI 0·003, p=0·03). For coronary heart disease, overall NRI was not significant (NRI 2·2%, p=0·182), but there was a significant improvement in reclassification of participants at intermediate risk (clinical NRI 9·7%, p=3×10⁻⁶). The calibration of the models with (p=0·52) and without (p=0·47) genetic risk score was good.
Using case-control and prospective cohort samples independent from the discovery samples, we sought to validate recently discovered genetic risk factors for coronary heart disease and to estimate the magnitude of risk conferred by these genetic risk factors in the population setting. We found that a genetic risk score including 13 SNPs associated with coronary heart disease or myocardial infarction was associated with risk of prevalent and incident coronary heart disease (even after we accounted for traditional risk factors), and that the 20% of individuals of European ancestry who carry the most risk alleles have a roughly 1·7-times increased risk of coronary heart disease when compared with those in the lowest quintile.
These findings allow us to draw several conclusions. First, the results from case-control discovery samples do seem to generalise to independent samples, including those from prospective cohorts. Second, the magnitude of effect conferred by genetic risk score (roughly 1·7-times increased risk in our study) is attenuated when compared with the discovery reports (about 2·2-times in one report9). Third, even though family history of early-onset cardiovascular disease raised the risk of cardiovascular events by 25–40%, adjustment for family history had no effect on the risk estimates for the genetic risk score. In view of measurement error for family history, and since genetic variants only account for a small proportion of familial risk, this finding might not be unexpected.
Finally, although strongly associated with risk of incident coronary heart disease, genetic risk score did not improve risk discrimination when assessed by the C index. This pattern, in which a biomarker is associated with incident disease yet does not improve risk discrimination, has been seen with several other biomarkers, including C-reactive protein and B-type natriuretic peptide.27 Since some have argued that the C index might be an insensitive measure of risk discrimination, newer approaches have been developed, including IDI, NRI, and clinical NRI. The tested genetic risk score slightly improved risk prediction for coronary heart disease and myocardial infarction when assessed by IDI and clinical NRI. Overall, these results emphasise the challenge of risk prediction for complex traits on the basis of any single factor.28
Our combined results show much larger risk differences between the tails of genetic risk score than in a recently reported follow-up study of women enrolled in a clinical trial.29 The difference cannot be accounted for by the inclusion of men in our study, since our estimates and predictions are similar for both sexes. Also, nine of the genetic loci in our genetic risk score are the same as in the cardiovascular disease score reported by Paynter and colleagues,29 and the statistical models are comparable. We suggest three potential reasons for the differing results. First, Paynter and colleagues included a range of endpoints such as strokes (203 of the 777 outcomes) that have not been associated with the genetic variants studied. The inclusion of such outcomes might have diminished their effect estimates. Second, Paynter and colleagues had a small number of myocardial infarction events (199 events) and thus lower statistical power might account for the absence of association in that study. Finally, the women studied were a selected set of health professional participants who volunteered for the clinical trial and as such might not represent the full range of risk seen in the general population.
Although we present the largest effort to date to study the association between genetic risk score and risk of incident cardiovascular disease, our results should be interpreted in the context of several potential limitations. Our SNP panel could be incomplete. For example, we did not systematically evaluate SNPs related to cardiovascular risk markers, as Paynter and colleagues did.29 Whereas hundreds of blood biomarkers have been shown to be associated with cardiovascular disease in observational epidemiology research, few have been proven to causally relate to the disease; plasma LDL cholesterol and lipoprotein(a) are notable exceptions. Therefore, we did not consider SNPs that are only associated with cardiovascular risk markers. Our study was undertaken in individuals of Swedish and Finnish descent and hence the results might not be generalisable to others in Europe or to other ancestries. We divided the continuous genetic risk score into five groups and compared risk between the top and bottom quintiles. Since alternative categorisations are possible we also generated comparisons of the top and bottom third of the distribution, and these results were similar to those seen with quintiles.
In conclusion, a genetic risk score based on 13 SNPs from genome-wide association studies for myocardial infarction and coronary heart disease was associated with a first coronary heart disease event, with a relative risk estimate of 1·7 between the highest and lowest quintiles of genetic risk score. Genetic risk score improved risk reclassification in participants who were at intermediate risk on the basis of traditional risk factors. Whether this genetic risk score will have clinical usefulness remains to be defined in future studies.
We wish to dedicate this paper to the memory of Leena Peltonen, who passed away on March 11, 2010. This work was supported by the Wellcome Trust (WT089062/Z/09/Z, WT089061/Z/09/Z), US National Institutes of Health (R01 HL087676), and the Donovan Family Foundation. We thank David Altshuler for guidance and helpful suggestions regarding study design and analyses. LP was supported by the Center of Excellence for Complex Disease Genetics of the Academy of Finland (grants 213506, 129680), the Biocentrum Helsinki Foundation, and The Nordic Center of Excellence in Disease Genetics. VS is supported by the Academy of Finland (grant number 129494), the Finnish Foundation for Cardiovascular Research, and the Sigrid Juselius Foundation. JS was funded by the Finnish Foundation for Cardiovascular Research and a special governmental subsidy for health sciences research (EVO). MP was supported by the Finnish Foundation for Cardiovascular Research and the Finnish Academy Salve Program (grant number 129322).
SR, OM, VS, LP, and SK are all senior authors, and planned and managed the project. SR led the data analysis. ET, MO-M, and ASH prepared the data and did the data analyses. CG took part in genotyping. M-LL, JS, and MSN provided COROGENE data. MP and AJ provided clinical guidance and Finnish data. AS took part in data analysis for the Swedish cohorts. KS participated in study planning, did literature searches, and took part in genotyping. SR, ET, ASH, MO-M, VS, and SK wrote the report (with significant contributions from other authors).
SK has received consultancy fees from Merck and Daiichi Sankyo and grants from Alnylam and Pfizer. All other authors declare that they have no conflicts of interest.