Coronary Artery Disease (CAD) is a critical determinant of morbidity and mortality. Previous studies have identified several cardiovascular disease (CVD) risk factors, which may partly arise from a shared genetic basis with CAD, and thus be useful for discovery of CAD genes.
We aimed to improve discovery of CAD genes, and inform the etiologic relationship between CAD and several CVD risk factors using a shared polygenic signal-informed statistical framework.
Using genome-wide association studies (GWAS) summary statistics and shared polygenic pleiotropy-informed conditional and conjunctional false discovery rate (FDR) methodology, we systematically investigated genetic overlap between CAD and 8 traits related to CVD risk factors: low density lipoprotein (LDL) cholesterol, high density lipoprotein (HDL) cholesterol, triglycerides (TG), type 2 diabetes (T2D), C-reactive protein (CRP), body mass index (BMI), systolic blood pressure (SBP) and type 1 diabetes (T1D). We found significant enrichment of single nucleotide polymorphisms (SNPs) associated with CAD as a function of their association with LDL, HDL, TG, T2D, CRP, BMI, SBP and T1D. Applying the conditional FDR method to the enriched phenotypes, we identified 67 novel loci associated with CAD (overall conditional FDR < 0.01). Further, we identified 53 loci with significant effects in both CAD and at least one of LDL, HDL, TG, T2D, CRP, SBP and T1D.
The observed polygenic overlap between CAD and cardio-metabolic risk factors indicates an etiological relation that warrants further investigation. The new gene loci identified implicate novel genetic mechanisms related to CAD.
Coronary artery disease (CAD) is a leading cause of death worldwide. The development of CAD is influenced by both genetic and environmental factors, as evidenced by its high heritability (40–50%) in twin and family studies1. Genome-wide association studies (GWAS) in CAD have identified a total of 46 genetic variants reaching genome-wide significance for CAD2. However, the identified genetic variants explain only a small proportion of the estimated heritability2, i.e. only a small amount of the familial clustering of CAD. This apparent paradox is widely seen across GWAS for complex traits and is termed “the missing heritability problem”3, 4. Recent discoveries nonetheless suggest that existing GWAS can capture more of the heritability due to common variants if proper statistical tools are used5–7.
Hypertension8, obesity9, abdominal fat10, diabetes11, dyslipidemia12–14 and inflammation, as reflected by high levels of C-reactive protein (CRP)15, are associated with CAD. Several studies have found overlapping pathophysiology16, but the underlying shared genetic factors and the extent of the polygenic overlap across these phenotypes are largely unknown. We have developed an analytical framework for complex traits building on the polygenic overlap17 between two or more phenotypes6. This method has the potential to capture more of the polygenic effects in complex traits18, and has successfully been applied to psychiatric6, cardiovascular19, neurological diseases20 and cancer21. This ‘shared polygenic signal’ method could be particularly informative in CAD, a disease with known co-morbidities and overlapping pathophysiology with related cardiovascular and metabolic disorders2, 22–25.
We used this approach to leverage the power of multiple large genomic studies to describe the extent of the polygenic overlap and identify overlapping SNPs between CAD and 8 associated traits and cardiovascular disease (CVD) risk factors where recent GWAS results are available: low density lipoprotein (LDL) cholesterol26, high density lipoprotein (HDL) cholesterol26, triglycerides (TG)26, type 2 diabetes (T2D)27, CRP28, body mass index (BMI)29, systolic blood pressure (SBP)30, 31 and type 1 diabetes (T1D)32. By combining data from these different GWAS, we hypothesized that the shared polygenic signal approach can improve discovery of CAD genes, and inform the etiologic relationship between CAD and CVD risk factors.
We obtained summary statistics from large-scale genomic studies (p-values and risk allele when available) from public access websites or through collaboration with investigators. The summary statistics are based on the Metabochip33 for CAD2 (n=194,427 including 63,746 cases) and T2D27 (n=149,830), and standard GWAS for LDL26 (n=95,454), HDL26 (n=99,900), TG26 (n=96,568), BMI29 (n=123,865), SBP31 (n=203,056) and T1D32 (n=16,559), and CRP (n=66,185)28. Details on the inclusion criteria and phenotype characteristics of the different GWAS are described in the original publications.
There were some overlapping controls between CAD and T2D and also between CAD and T1D. In both instances this was mainly due to the inclusion of one or more sub-studies employing a shared control design (e.g. used by the Wellcome Trust Case Control Consortium and deCODE Genetics)34 (see Online Table I). There was also some sample overlap between CAD and LDL, HDL, TG, BMI and SBP (Online Table I). Note that even without raw data, an upper bound for the amount of sample overlap is obtainable from the original publications by comparing the sub-study definitions and sample sizes for CAD and each secondary trait (correlation of uncorrected test statistics due to sample overlap is given in Online Table I; see LeBlanc et al (in prep) for details).
The Women’s Genome Health Study (WGHS), initiated in 1992, is an ongoing prospective cohort including 23,294 initially healthy North American women of European ancestry with whole genome genotype data, in which major incident health events, including myocardial infarction (MI) and coronary heart disease (CHD; composed of MI, CHD death, and coronary revascularization), are recorded during follow-up35. Over the approximately 20 years of follow-up, there were 387 cases of incident MI and 1007 cases of incident CHD among the 23,294 women.
The relevant institutional review boards or ethics committees approved the research protocol of the individual GWAS and all participants provided written informed consent.
We used Matlab (version R2013a) for all statistical analysis unless otherwise indicated. First, we looked for evidence of overlapping polygenic signal for CAD and each secondary trait. In the absence of an overlapping polygenic signal, the expectation is that the p-value distribution for CAD is independent of the p-values in the secondary trait. The dependency of the p-value distribution for CAD on each secondary trait can be visually explored using conditional quantile-quantile plots to evaluate genetic ‘enrichment’ in CAD as a function of a secondary phenotype. Quantile-quantile plots are a descriptive tool for visualizing the difference between an observed distribution and a theoretical distribution. With GWAS, quantiles of the observed (nominal) p-values, denoted by ‘p’, are plotted on the y-axis, with the quantiles of the theoretical null distribution (i.e. the uniform distribution), here denoted by ‘q’, on the x-axis. Conventionally, the -log10 transform is used to emphasize tail areas. If there is no deviation from the null distribution and thus no true genetic association present, a quantile-quantile plot falls on the 1:1 line. Leftward deflections of the observed distribution from the null line reflect increased tail probabilities in the distribution of the test statistics, and consequently an over-abundance of low p-values compared to that expected by chance, termed ‘enrichment’. Here, we constructed conditional quantile-quantile plots to investigate if enrichment in the primary phenotype (CAD) is related to significance in a given secondary phenotype, as visualized by a leftward deflection from the null line on the conditional quantile-quantile plot. A conditional quantile-quantile plot was separately constructed for CAD and each of the 8 secondary traits. To test for statistical significance associated with these conditional quantile-quantile plots, we used the Anderson-Darling test21. In brief, this is a statistical test of whether a given sample of data is drawn from a given probability distribution and allows us to determine if an observed leftward deflection is statistically significant (for additional details see 21). In this case, we used the set of SNPs with GWAS p>0.1 in the secondary trait, i.e., SNPs that are signal depleted in the secondary trait, as the comparison set.
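The construction of a conditional quantile-quantile plot can be illustrated with a minimal Python sketch. It is not the Matlab code used in the study. The function name, the stratum cut-offs (secondary-trait p ≤ 1, 0.1, 0.01 and 0.001) and the use of i/n as the null quantile of the i-th smallest p-value are assumptions made for illustration, and all p-values are assumed to be strictly positive.

```python
import math

def conditional_qq(p_primary, p_secondary, cutoffs=(1.0, 0.1, 0.01, 0.001)):
    """Return {cutoff: [(null_quantile, observed), ...]} on the -log10 scale."""
    curves = {}
    for cut in cutoffs:
        # Primary-trait p-values of the SNPs whose secondary-trait p-value passes the cutoff
        stratum = sorted(p for p, q in zip(p_primary, p_secondary) if q <= cut)
        n = len(stratum)
        # Under the null, the i-th smallest of n uniform p-values is expected near i / n
        curves[cut] = [(-math.log10((i + 1) / n), -math.log10(p))
                       for i, p in enumerate(stratum)]
    return curves
```

Each returned curve can be plotted against the 1:1 line; enrichment appears as curves for stricter cut-offs departing further from that line.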
Second, once statistically significant enrichment was confirmed, we computed conditional False Discovery Rates (FDR), a statistical framework that leverages shared polygenic signal6, 18, to improve the discovery of SNPs for the primary trait of interest, CAD. The standard FDR is designed to control the expected proportion of incorrectly rejected null hypotheses, and is employed to correct for multiple comparisons. An extension of the standard FDR is the conditional FDR6, which in our application, is used to incorporate information from GWAS summary statistics of a second phenotype. The conditional FDR is defined as the probability of a SNP being null in the first phenotype given that the p-values in the first and second phenotype are as small as or smaller than the observed ones (see Supplemental Methods). Importantly, ranking SNPs according to conditional FDR re-orders SNPs compared to their raw CAD p-values, and this new ranking favors SNPs showing signal in both CAD and the given secondary trait. In contrast, the standard FDR does not re-rank the SNPs compared to their raw CAD p-values, but instead suggests a different significance cut-off compared to the Bonferroni correction.
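As a hedged illustration of this definition, the sketch below computes a simple empirical conditional FDR in Python. It assumes the conservative plug-in estimate in which the CAD p-value is divided by the empirical conditional distribution function of CAD p-values among SNPs at least as significant in the secondary trait. The estimator actually used is described in the Supplemental Methods and may differ in detail; the function name and toy inputs are hypothetical.

```python
def conditional_fdr(p1, p2):
    """Empirical conditional FDR per SNP: p1_i / F(p1_i | P2 <= p2_i), capped at 1."""
    out = []
    for a, b in zip(p1, p2):
        # SNPs at least as significant as this one in the secondary trait
        stratum = [x for x, y in zip(p1, p2) if y <= b]
        # Fraction of that stratum at least as significant in the primary trait
        frac = sum(x <= a for x in stratum) / len(stratum)
        out.append(min(1.0, a / frac))
    return out

# Two SNPs with the same primary-trait p-value but different secondary-trait p-values
cfdr = conditional_fdr([0.02, 0.02, 0.5, 0.9, 0.7, 0.3],
                       [0.001, 0.8, 0.002, 0.6, 0.9, 0.4])
```

In this toy input the first two SNPs share a primary-trait p-value of 0.02, yet the first, which is also strongly associated with the secondary trait, receives the lower conditional FDR. This is the re-ranking effect described above. The quadratic loop is chosen for readability, not for genome-scale data.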
In additional analysis, we computed the conjunctional FDR18 to detect loci showing strong evidence of association with both CAD and the given secondary trait. Low values in conditional FDR can be driven by association with both phenotypes or with the primary phenotype alone, whereas low values in conjunctional FDR are driven by association with both phenotypes.
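The contrast between the two quantities can be sketched in Python under the assumption, taken from the cited framework, that the conjunctional FDR of a SNP is the larger of its two conditional FDR values (CAD given the secondary trait, and the secondary trait given CAD). The function name and the numbers are illustrative only.

```python
def conjunctional_fdr(cfdr_cad_given_trait, cfdr_trait_given_cad):
    """Conjunctional FDR per SNP, taken as the maximum of the two conditional FDRs."""
    return [max(a, b) for a, b in zip(cfdr_cad_given_trait, cfdr_trait_given_cad)]

# SNP 0: associated with both traits; SNP 1: associated with CAD only
conj = conjunctional_fdr([0.001, 0.002], [0.003, 0.70])
```

Here both SNPs have a low conditional FDR for CAD, but only the first has a low conjunctional FDR (0.003 versus 0.70), because the second shows no evidence of association with the secondary trait.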
The application and interpretation of FDR-based methodology is more challenging for post-GWAS specialized SNP panels such as the Metabochip33. The standard FDR is widely applied in GWAS where any given SNP is assumed to have the same prior probability of association as all other SNPs. The Metabochip (~200,000 SNPs) is designed to follow up SNPs of interest relating to metabolic and cardiovascular traits, including fine mapping around genome-wide significant SNPs. As such, the true positives (and the false positives) come in large dependent clumps. Large-scale dependence in the signal can lead to biased FDR36. To correct for this bias, we used an LD-pruned set of SNPs to estimate the conditional FDR distribution, which was then used for estimating the conditional FDR for the full SNP set (see the Supplemental Methods for details of this estimation procedure). To visualize the conditional and conjunctional FDR, we constructed Manhattan plots. Detailed information on conditional quantile-quantile plots, Manhattan plots, as well as conditional and conjunctional FDR can be found in earlier reports6, 18 and/or in the Supplement.
The conditional FDR assumes independent samples for CAD and each of the secondary traits. However, several of the participants were included in both a secondary trait GWAS and in the CAD study. Partially overlapping subjects between studies leads to dependencies between the test statistics for different traits for a given SNP under the null hypothesis37. We estimated the expected correlation of the cross-trait GWAS test statistics under the null hypothesis of no genetic associations using a similar method to the one described for GWAS meta-analysis37, 38 and corrected for the estimated correlation due to shared subjects using the Mahalanobis transformation (LeBlanc et al in prep). These corrected test statistics were used in all further analysis.
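A minimal sketch of a Mahalanobis (whitening) transformation for a pair of z-scores is given below. It assumes that, under the null hypothesis, the two test statistics have unit variance and a known correlation rho arising from shared subjects, and it applies the symmetric inverse square root of the 2×2 correlation matrix. How rho is estimated, and the exact correction used in the analysis, are described in the cited work and are not reproduced here; the function names are hypothetical.

```python
import math

def whitening_weights(rho):
    """Entries (a, b) of the symmetric inverse square root [[a, b], [b, a]]
    of the null correlation matrix [[1, rho], [rho, 1]], for |rho| < 1."""
    u = 1.0 / math.sqrt(1.0 + rho)   # from the eigenvalue 1 + rho
    v = 1.0 / math.sqrt(1.0 - rho)   # from the eigenvalue 1 - rho
    return (u + v) / 2.0, (u - v) / 2.0

def decorrelate(z_primary, z_secondary, rho):
    """Return the two lists of transformed z-scores, uncorrelated under the null."""
    a, b = whitening_weights(rho)
    return ([a * x + b * y for x, y in zip(z_primary, z_secondary)],
            [b * x + a * y for x, y in zip(z_primary, z_secondary)])
```

With rho = 0 the transformation leaves the z-scores unchanged; for small positive rho each statistic is adjusted by subtracting a small multiple of the other.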
As an internal validation of stratified enrichment, we performed a stratified replication rate analysis using methods described previously,18 where the contributing studies of the CARDIoGRAMplusC4D Consortium were repeatedly divided into independent discovery and validation sets. The purpose of this analysis is to show that an observed pattern of stratified enrichment is not due to spurious effects. In brief, we randomly selected half of the studies (24) for the discovery set, used the remaining studies for replication, and repeated this procedure 200 times. For each SNP in the replication set and the discovery set, we computed a meta-analysis test statistic (Liptak’s method). For the discovery set, we calculated the associated two-tailed p-values, whereas for the replication samples they were converted to one-tailed p-values in order to preserve the direction of effect in the discovery sample. We then created a vector of -log10(p-value) cutoffs and binned SNPs according to their p-values in the discovery set. For each bin, we kept track of the SNPs’ respective p-values in the replication set. The replication rate for each bin is then defined as the proportion of SNPs in that bin that have a replication p-value < 0.05. We checked for stratified replication rates by plotting the replication rate curves for four strata based on significance in each secondary trait, using the same strata definitions as for the conditional quantile-quantile plots.
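The replication rate calculation can be sketched in Python as follows. The sketch assumes that each study contributes a signed z-score per SNP, that Liptak’s method is the weighted sum of z-scores divided by the root sum of squared weights (the weights themselves, for example square roots of sample sizes, are an assumption), and that bins are half-open intervals on the -log10 scale. All function names are hypothetical.

```python
import math

def liptak_z(z_scores, weights):
    """Weighted combination of per-study z-scores for one SNP."""
    return sum(w * z for w, z in zip(weights, z_scores)) / math.sqrt(sum(w * w for w in weights))

def two_tailed_p(z):
    return math.erfc(abs(z) / math.sqrt(2.0))

def one_tailed_p(z_rep, z_disc):
    # Small only when the replication effect has the same sign as the discovery effect
    signed = math.copysign(1.0, z_disc) * z_rep
    return 0.5 * math.erfc(signed / math.sqrt(2.0))

def replication_rates(z_disc, z_rep, edges, alpha=0.05):
    """Proportion of SNPs with one-tailed replication p < alpha, per discovery bin."""
    rates = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        hits = [one_tailed_p(r, d) < alpha
                for d, r in zip(z_disc, z_rep)
                if lo <= -math.log10(two_tailed_p(d)) < hi]
        rates[(lo, hi)] = sum(hits) / len(hits) if hits else None
    return rates
```

Running this separately within each secondary-trait stratum, and averaging over the repeated random splits, would give stratified replication rate curves of the kind described above.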
For all novel CAD SNPs identified in the conditional FDR analysis, we checked for nominal replication (p<0.05) in the WGHS. Since the WGHS data is collected prospectively, we conducted age-adjusted Cox regression over approximately 20 yrs of follow-up ending in 2013 for both MI and CHD.
We tested whether the novel CAD SNPs discovered in the current study are associated with genotype-dependent gene expression in various tissue types. Such SNPs are known as eQTLs. To this end, we cross-referenced our novel findings from the conditional FDR analysis with three cis-eQTL databases: in whole blood39 (the most powerful eQTL database available), adipose tissue40 (relevant for metabolic disease) and lymphoblastoid cells (LCL)40. The whole blood eQTL data were collected in a large collaborative effort (n=5311 samples), whereas the adipose and LCL eQTLs are from a sample of approximately n=850. We considered a SNP to be an eQTL using an FDR q-value cutoff of 0.05. The FDR q-values were already available for whole blood, while for adipose tissue and LCL we downloaded the publicly available eQTL data and calculated q-values using the qvalue package available from Bioconductor (version 2.14) in R (version 3.1.1).
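For readers unfamiliar with q-values, the following Python sketch computes Benjamini-Hochberg adjusted p-values, which coincide with q-values when the proportion of true null hypotheses is conservatively taken to be 1. This is a stand-in for illustration, not the algorithm of the Bioconductor qvalue package, which additionally estimates the null proportion from the data; the function name is hypothetical.

```python
def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])   # indices from smallest to largest p
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q

# A SNP-gene pair would be called an eQTL if its value falls at or below 0.05
q = bh_qvalues([0.01, 0.04, 0.03, 0.5])
```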
To better understand the biological context of our results, we conducted an Ingenuity Pathway Analysis (IPA, QIAGEN Redwood City, www.qiagen.com/ingenuity) including all previously reported CAD genes and the nearest annotated gene for each novel SNP reported in our study. The available molecules and/or relationships in the IPA Knowledge Base for mammals (human, mouse or rat) were considered. We set the confidence filter to relationships where the confidence is experimentally observed. We allowed a maximum size of 35 genes for generating networks and we allowed up to 25 networks in the overall analysis. IPA computes a score for each network according to the fit of that network to a set of focus genes, and p-values are calculated using the right-tailed Fisher’s exact test.
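The right-tailed Fisher’s exact test for over-representation can be written as a hypergeometric tail probability. The Python sketch below is a generic illustration and makes no claim about IPA’s internal implementation; the function and argument names and the example counts are hypothetical.

```python
from math import comb

def right_tailed_fisher(k, n_focus, n_annotated, n_total):
    """P(X >= k): probability of seeing at least k pathway genes among n_focus
    focus genes drawn from n_total genes, n_annotated of which are in the pathway."""
    upper = min(n_focus, n_annotated)
    total = comb(n_total, n_focus)
    tail = sum(comb(n_annotated, i) * comb(n_total - n_annotated, n_focus - i)
               for i in range(k, upper + 1))
    return tail / total

# Toy example: all 5 focus genes fall in a 5-gene pathway out of 10 genes in total
p = right_tailed_fisher(5, 5, 5, 10)
```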
We used a two-step analysis strategy. First we assessed overlapping polygenic enrichment for CAD and each of the other traits via conditional quantile-quantile plots, and applied the Anderson-Darling test to define which of the 8 secondary traits show significant polygenic overlap. This test requires the direction of the association and as this information was unavailable for SBP, we relied on a visual inspection of the conditional quantile-quantile plot for SBP. As illustrated in Online Table II, all testable traits showed significant enrichment after Bonferroni correction for 21 tests and SBP showed strong visual evidence for enrichment. Therefore all 8 secondary traits were retained for the second step of the analysis. Second we applied conditional and conjunctional FDR methods to identify new CAD risk loci and to identify overlapping loci between CAD and each of the 8 associated traits. Overall FDR thresholds of 0.01 and 0.05 were chosen for conditional and conjunctional FDR respectively. Conservatively adjusting for the 8 secondary traits being considered21 this translated to thresholds of 0.01/8 and 0.05/8 for conditional and conjunctional FDR.
Conditional quantile-quantile plots for CAD conditioned on nominal p-values of association with LDL, CRP, T1D and T2D showed significant enrichment across different levels of significance (Figure 1). Similar significant enrichment patterns were seen for HDL, TG, SBP and BMI (see Online Figure I). The increasing leftward shift with more strictly defined strata based on nominal p-values of associated phenotypes suggests a greater proportion of true associations for a given nominal CAD p-value. This is indicative of cross-trait polygenic enrichment. As illustrated in Figure 1, panel A: LDL, the proportion of SNPs in the −log10(pLDL) ≥ 3 category reaching a given significance level (e.g., −log10(pCAD) > 6) is much greater than for the all SNPs category, indicating a high level of enrichment (Figure 1).
Stratified replication rates were observed for all secondary traits with the exception of BMI (Online Figure II), indicating that the observed enrichment in the conditional quantile-quantile plots is also associated with increased replication rates. The observed pattern of stratified enrichment does not result from spurious effects, and replication rate is increased by conditioning on significance in each of the secondary traits, with the possible exception of BMI.
Conditional and conjunctional FDR were calculated for CAD paired with each of the 8 secondary phenotypes showing enrichment. The results of each analysis were filtered as follows. First, we filtered the lists of significant SNPs by their linkage disequilibrium patterns as observed in the 1000 Genomes41 dataset and report only the most significant result per annotated gene. We considered a SNP to be an independent finding if the linkage disequilibrium, defined using r2, was less than 0.2 with all other SNPs in the filtered list. Second, we further filtered the list of significant SNPs for novelty with respect to previously published CAD SNPs. We filtered out any previously reported genes and SNPs, including SNPs in linkage disequilibrium (r2>0.2) with those previously reported SNPs. Thus, the list of significant SNPs presented in Table 1 represents, to the best of our knowledge, independent novel SNPs for CAD. The corresponding conditional Manhattan plot is given in Figure 2.
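The linkage disequilibrium step of this filtering can be sketched as a greedy pruning in Python. The sketch assumes that pairwise r2 values are available in a lookup table and that missing pairs are effectively unlinked; it covers neither the one-SNP-per-gene rule nor the novelty filter. The function name and the SNP identifiers are hypothetical.

```python
def ld_prune(significance, r2, threshold=0.2):
    """significance: {snp: conditional FDR or p-value};
    r2: {frozenset({snp_a, snp_b}): r^2}, with missing pairs treated as 0."""
    kept = []
    # Visit SNPs from most to least significant
    for snp in sorted(significance, key=significance.get):
        # Keep a SNP only if it is below the r^2 threshold with every SNP already kept
        if all(r2.get(frozenset((snp, other)), 0.0) < threshold for other in kept):
            kept.append(snp)
    return kept

kept = ld_prune({"rsA": 1e-8, "rsB": 1e-6, "rsC": 1e-5},
                {frozenset(("rsA", "rsB")): 0.9, frozenset(("rsB", "rsC")): 0.5})
```

In this toy input rsB is dropped because it is in strong linkage disequilibrium with the more significant rsA, whereas rsC is kept because its only recorded linkage is with the already discarded rsB.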
Over all 8 secondary traits, we identified 101 SNPs associated with CAD, 67 of which have not previously been associated with CAD (previously reported SNPs not shown). Many of these new loci are located in regions with borderline significant association with CAD in previous studies42, as is evident from the CAD association p-value column given in Table 1. Of interest, several of the identified loci are found in the conditional analyses for several risk factors. These loci are not found using standard methods applying a genome-wide Bonferroni correction.
We looked to the WGHS for independent validation of these 67 new CAD SNPs and 12 of these show nominal replication for at least one endpoint (CHD or MI); see Online Table III.
Of the 67 novel CAD loci, 32 show genotype-dependent gene expression in whole blood, regulating the expression of 57 unique genes, and 42 of the 67 SNPs would not have been detected using the standard (unconditioned) FDR. We found evidence for 16 and 18 loci having an eQTL effect in adipose tissue and LCL respectively (Table 2). For six of these loci we observed an eQTL effect on the same gene in both whole blood and adipose tissue. Interestingly, 18 loci show an effect on the gene expression of more than one gene.
To further evaluate genetic overlap, we used the conjunctional FDR to identify SNPs with significant effects in both CAD and its associated risk factors. The conjunctional Manhattan plot for CAD is shown in Online Figure II. We identified 53 loci achieving conjunctional FDR<0.05, after adjustment for using multiple risk factors and pruning the results in the same manner as for the conditional FDR (Online Table IV; corresponding z-scores in Online Table V).
Follow-up Ingenuity Pathways Analysis (IPA) identified highly significantly associated “Top Canonical Pathways” relevant to CAD (e.g. LXR (liver X receptor)/RXR (retinoid X receptor) Activation, FXR (farnesoid X receptor)/RXR Activation and Atherosclerosis Signaling; Online Table V). Additionally, in “Top Diseases and Bio Function”, CAD-relevant diseases and functions (Cardiovascular Disease and Lipid Metabolism) rank highest in the subgroups “Diseases and Disorders” and “Molecular and Cellular Functions”.
Combining data from large-scale genomic studies from different phenotypes in a conditional FDR framework, we show polygenic overlap between CAD and several CVD risk factor phenotypes and identify 67 novel CAD susceptibility loci. Further, conjunctional FDR analysis identified 53 novel loci associated with both CAD and the CVD risk factors LDL, HDL, TG, T1D, T2D, CRP and SBP. Importantly, we validated the conditional FDR approach by showing that replication rates in independent CAD sub-studies increase as a function of p-value in each secondary trait, with the possible exception of BMI. In addition, we see nominal replication for 12/67 SNPs in the WGHS. Overall, these results suggest that a proportion of the clinically and epidemiologically observed association between these phenotypes can be explained by overlapping genetic loci (pleiotropy) and not simply shared environmental risk factors. The findings also provide further evidence that CAD is a highly polygenic disease.
Our findings of polygenic overlap provide novel insights into the relationship between CAD and major CVD risk factors. We demonstrate an interesting genetic dissociation among these risk factors and CAD, with strong enrichment for lipids, inflammation and metabolic disorders. The combination of dyslipidemia (i.e., high TG and LDL cholesterol and low HDL cholesterol), T2D, and high blood pressure forms the metabolic syndrome12–14, 43, 44, and all of these factors (particularly LDL) showed strong genetic overlap with CAD. This is in agreement with recent reports suggesting a common genetic basis for regulation of lipid and glucose homeostasis45, while previous studies did not show common genes for the different components of the metabolic syndrome46, but revealed a strong lipid gene contribution. It is further supported by the pathway analysis that identified “Atherosclerosis Signaling” and “FXR/RXR Activation” among the three most relevant pathways. Genes activated by FXR have been shown to influence vascular tension and regulate the unloading of cholesterol from foam cells47. Another important finding is the overlap between CAD and T2D. Based on conditional analysis of these two phenotypes, 21 novel loci were identified. This is in line with previous single gene studies suggesting a genetic link between T2D and CAD48.
The strong shared polygenic signal between LDL and CAD emphasizes the important role of LDL in CAD development, and supports the notion that risk genes for atherosclerosis, such as LDL genes, are causal for CAD as recently suggested49. Finally, two of the phenotypes most strongly overlapping with CAD were CRP and T1D, two immune-related phenotypes. CRP is regarded as a reliable marker of systemic inflammation and its role as a biomarker in CAD has been attributed to its ability to reflect up-stream inflammatory pathways. However, the finding in the present study suggests that the link between CRP and CAD may also reflect overlapping genetic loci. T1D is related to auto-immune mechanisms and its genetic overlap with CAD underlines the important role of the immune system in CAD, and could be due to a large number of overlapping genes between immune and lipid phenotypes50. In fact, the bidirectional interaction between inflammation and lipids is regarded as a phenotypic hallmark of atherosclerosis, and our findings suggest that this phenotype could reflect overlapping genes between these two interacting pathophysiological arms of atherogenesis. The pathway analysis revealed “LXR/RXR Activation” as the top-ranked canonical pathway. LXR/RXR are heterodimeric nuclear receptors/transcription factors. LXR acts as a cholesterol sensor, and LXR pathway activation has been shown to stimulate lipogenesis and hypertriglyceridemia51. LXR/RXR can also modulate inflammatory responses to cholesterol exposure and could represent a regulator of the interaction between lipids and inflammation, the most important pathway in the pathogenesis of CAD.
In the original CAD GWAS and follow-up Metabochip study, 46 loci were identified2. By combining the original CAD results with the CVD risk factor phenotypes GWAS, we identified 101 significant loci associated with CAD, of which 67 are novel, using the conditional FDR approach. Even though the original CAD study was quite large2, the increased power provided by additional GWAS of associated phenotypes together with the conditional FDR method more than doubled gene discovery. The novel SNPs discovered here contribute to explaining more of the missing heritability for CAD, but we cannot quantify how much more is explained since we are working at the summary-statistic level. These findings underline the cost-effectiveness of the current statistical methods and highlight several interesting genes in CAD pathology. IL1F10 (interleukin 1 family, member 10 (theta)) was identified in the pathway analysis of the CAD GWAS; it is known to bind IL1R and stimulate the NF-kB pathway. VEGFA (vascular endothelial growth factor A) is well known in the CVD field but, to the best of our knowledge, its involvement has never been shown in genetic studies. SLC18A1 (solute carrier family 18 (vesicular monoamine transporter), member 1) has been implicated in neuropsychiatric disorders, but not previously in CVD. SERPINH1 (serpin peptidase inhibitor, clade H (heat shock protein 47), member 1, (collagen binding protein 1)) is a heat shock protein, known to be involved in atherosclerosis. ILF3 (interleukin enhancer binding factor 3, 90kDa) is a matrix metalloproteinase, well studied in the CVD research field, and the findings of IL1F10 and ILF3 underscore the role of the IL-1 cytokine family in CAD.
Although nearest-gene annotation can be informative, the vast majority of discovered SNPs are located outside coding DNA regions52. Therefore, annotating the identified genetic variants to the correct causal genes for the phenotype of interest often remains challenging52. One of the potential mechanisms whereby SNPs may affect phenotype variations is through altered gene expression. We successfully identified eQTL effects in whole blood, LCL and adipose tissue, suggesting these genes as potential causal candidates. Of interest, some of the genetic variants showed an effect on the gene expression of more than one gene. We speculate that the shared effect of the genetic variants on the phenotypes under study might be explained by the regulation of several different genes, but further studies would be necessary to connect the genes with altered gene expression seen in Table 2 to the clinical phenotypes. Moreover, the majority of the genes regulated by the genetic variants were different from the nearest annotated gene. Given that the original whole blood eQTL study has markedly different power and used different statistical eQTL definitions than the LCL and adipose tissue eQTL studies, a detailed cross-tissue comparison is not possible. Further studies are needed to determine the functional mechanisms involved in the novel CAD loci identified here.
There are certain limitations associated with the present results. Due to the overlap in some of the GWAS samples examined, we cannot completely exclude the contribution from environmental or behavioral factors. The shared participants between genomic studies could also affect the findings. However, we did adjust for overlapping subjects, and used strict FDR thresholds to account for the 8 secondary traits. Although clinical comorbidity and shared pathophysiology between these phenotypes pose a challenge for the interpretation of the basis of the shared polygenic signals, their utility for increasing the power to detect new loci for CAD is not affected. The question remains whether the identified shared genes are independent of other phenotypes (biological pleiotropy), or whether the current findings result from overlapping phenotypes (mediated by other phenotypes), as several of these risk factors can be co-occurring (mediated pleiotropy)53. However, it appears reasonable to interpret our findings as reflecting the existence of shared genetically determined pathophysiological processes across CAD and the associated phenotypes. In general, FDR methodology is a less conservative approach to multiple testing than Bonferroni correction. However, by using the conditional FDR, we are not simply relaxing the significance threshold, but are increasing power and incorporating useful information from a second trait into the analysis, allowing us to identify the SNPs more likely to replicate. We have not strictly replicated all of these findings in independent samples but we have shown that replication rates increase by conditioning on significance in the secondary traits, and have shown that 12 SNPs nominally replicate in the WGHS.
While the prospective design of the WGHS makes it suitable for validation of the candidate CAD associations, the numbers of incident events of MI and CHD were much smaller than in the discovery sample, which was composed of a preponderance of men compared with the all-female composition of the WGHS. However, in spite of much lower power and the possibility of differences according to sex, the WGHS is the largest and most relevant independent dataset we were able to access and we found nominal association for 12 novel CAD loci.
In conclusion, we found substantial polygenic overlap between CAD and several related conditions, importantly LDL, T2D and CRP, providing more evidence for a fundamental etiological relationship between these phenotypes that cannot be explained by lifestyle factors. The 67 novel CAD loci identified here provide new insight into genetic mechanisms of CAD and may form the basis for earlier diagnosis and new prevention and treatment strategies.
Clinical and epidemiological evidence suggests a relationship between CAD and cardio-metabolic traits. In the presence of a shared polygenic signal (i.e., a large number of shared risk variants each with a small effect), traits with overlapping pathophysiology with CAD can be used in combination with novel statistical methodology to improve discovery of variants associated with CAD. Using large-scale genetic data from CAD and genetic data from hypertension, obesity, abdominal fat, diabetes, dyslipidemia, and inflammation (C-reactive protein), we found a polygenic overlap between CAD and each of these related traits. We identified 67 novel CAD risk variants and 53 risk variants jointly associated with CAD and at least one other related trait. These results highlight the importance of shared polygenic risk factors between coronary artery disease and cardiovascular risk factors. Our findings provide important insights into molecular mechanisms underlying coronary artery disease and have potential implications for prevention and treatment strategies.
The authors would like to thank the DIAGRAM Consortium, the CARDIoGRAMplusC4D Consortium, the Global Lipids Genetics Consortium, The International Consortium for Blood Pressure GWAS, the GEFOS Consortium, the Giant Consortium, the Type 1 Diabetes Consortium and the CHARGE Consortium Inflammation Working Group for the summary statistics GWAS data.
SOURCES OF FUNDING
The present project was supported by the Research Council of Norway (213837, 223273); the South East Norway Health Authority (2013–123); the Kristian Gerhard Jebsen Foundation; in addition to the National Institutes of Health grants R01AG031224, R01EB000790, and RC2DA29475, and the European Union project tegrat@senegVC; a grant from the Leducq Foundation, CADgenomics; and BMBF, Forschungskonsortien zur Systemmedizin, e:AtheroSysMED; DFG, SFB 1123. The WGHS is supported by HL043851, HL080467, and HL099355 from the National Heart, Lung, and Blood Institute and CA047988 from the National Cancer Institute with collaborative scientific support and funding for genotyping provided by Amgen. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
This manuscript was sent to Elizabeth M. McNally, Consulting Editor, for review by expert referees, editorial decision, and final disposition.