eQTL analyses are important to improve the understanding of genetic association results. Here, we performed a genome-wide association and global gene expression study to identify functionally relevant variants affecting the risk of coronary artery disease (CAD).
In a genome-wide association analysis of 2,078 CAD cases and 2,953 controls, we identified 950 single nucleotide polymorphisms (SNPs) that were associated with CAD at P<10⁻³. Subsequent in silico and wet-lab replication stages and a final meta-analysis of 21,428 CAD cases and 38,361 controls revealed a novel association signal at chromosome 10q23.31 within the LIPA (Lysosomal Acid Lipase A) gene (P=3.7×10⁻⁸; OR 1.1; 95% CI: 1.07-1.14). The association of this locus with global gene expression was assessed by genome-wide expression analyses in the monocyte transcriptome of 1,494 individuals. The results showed a strong association of this locus with expression of the LIPA transcript (P=1.3×10⁻⁹⁶). An assessment of LIPA SNPs and transcript with cardiovascular phenotypes revealed an association of LIPA transcript levels with impaired endothelial function (P=4.4×10⁻³).
The use of data on genetic variants and the addition of data on global monocytic gene expression led to the identification of the novel functional CAD susceptibility locus LIPA, located on chromosome 10q23.31. The respective eSNPs associated with CAD strongly affect LIPA gene expression level, which itself was related to endothelial dysfunction, a precursor of CAD.
Coronary artery disease (CAD) remains one of the major causes of death. Recent data indicate that classical risk factors and novel risk markers account for a large proportion of disease risk.1,2 Despite these considerable advances, it remains apparent that the underlying causes of CAD are multifactorial and involve a complex interplay between acquired and inherited risk factors. The advent of genome-wide association (GWA) studies led to the identification of several genetic loci that associate with the risk of CAD.3-7 The majority of these associations are located in genomic regions for which functional understanding is lacking.8 Consequently, there exists a substantial gap in our understanding about how these single nucleotide polymorphisms (SNPs) affect the pathophysiological mechanisms through which the loci contribute to disease. Variation in gene expression appears to be an important intermediate step underlying susceptibility of complex diseases.9-15 The abundance of a gene transcript can be directly modified by polymorphisms; thus, transcript abundance mediated by genetic variation either alone or in combination with environmental factors might be considered as a quantitative trait that can be mapped.15 When combined with GWA data, the analysis of the transcriptome can help to clarify and categorize effects of CAD-associated SNPs on gene expression (eSNPs).
In the present study, a genome-wide association case-control study in 5,031 individuals followed by two stages of replication and a final meta-analysis of 59,789 cases and controls was performed. This approach led to the identification of a novel CAD susceptibility locus on chromosome 10q23.31, LIPA. Additionally, eQTL analysis using a dataset of global monocytic gene expression revealed a strong effect of LIPA eSNPs on LIPA transcript levels and LIPA transcript levels in turn showed association to prevalent cardiovascular risk factors and phenotypes of subclinical disease.
A GWA study using the Genome-Wide Human SNP 6.0 Array (Affymetrix, Santa Clara, USA) was conducted to discover SNPs associated with CAD in the CADomics study (Coronary Artery Disease and Genomics), a case-control study of CAD (2,078 CAD cases and 2,953 controls). Replication of SNPs was performed in two steps. SNPs associated with CAD in the discovery stage at a threshold P-value of <10⁻³ entered the first replication stage (in silico replication in 9,487 cases and 30,171 controls of the following studies with European ancestry: CHARGE, GerMIFS I, GerMIFS II, MIGen, WTCCC-CAD, PennCATH, MedStar). Based on a threshold P-value of <10⁻⁴ in the pooled analysis of the discovery and the first replication stage, SNPs were selected for the second replication stage (wet lab replication in 9,863 cases and 5,237 controls of the following studies with European ancestry: ECTIM, AngioLueb, GoKard, LURIC, popgen, MORGAM). A final meta-analysis was performed in 21,428 cases and 38,361 controls. SNPs passing a conservative threshold of statistical significance at P<5×10⁻⁸ in the final meta-analysis were further evaluated for their association with global gene expression in 1,494 apparently healthy, population-based samples from the Gutenberg Heart Express (GHSExpress) study to identify SNPs (eSNPs) that affect gene expression (eQTL transcripts). Finally, we explored eSNPs and respective eQTL transcripts for their association with cardiovascular risk factors and phenotypes of subclinical disease. The study design is depicted in Figure 1.
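The staged selection described above can be illustrated with a minimal Python sketch. Only the three P-value thresholds come from the study design; the function name, argument names and stage labels are hypothetical and chosen for illustration, and the study itself was analysed in R.

```python
# Thresholds as stated in the study design; everything else is illustrative.
DISCOVERY_P = 1e-3      # discovery GWA threshold
POOLED_P = 1e-4         # pooled discovery + in silico replication threshold
GENOME_WIDE_P = 5e-8    # final meta-analysis threshold


def stage_reached(p_discovery, p_pooled=None, p_final=None):
    """Return the furthest stage a SNP reaches, given its P-values.

    A P-value of None means the SNP was not analysed at that stage.
    """
    if p_discovery >= DISCOVERY_P:
        return "not selected"
    if p_pooled is None or p_pooled >= POOLED_P:
        return "in silico replication"
    if p_final is None or p_final >= GENOME_WIDE_P:
        return "wet-lab replication"
    return "eQTL follow-up"
```

A SNP is carried forward only if it passes every earlier threshold, which mirrors the funnel from 950 discovery SNPs to the few loci evaluated for gene expression.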
CADomics is a case-control study including the hospital-based catheter-lab AtheroGene Registry16 and the population-based Gutenberg-Heart Study (GHS). For the present analysis, individuals with angiographically proven CAD (stenosis >50% in one major coronary artery), nearly 60% presenting with acute myocardial infarction, were included as cases, and individuals without a history of myocardial infarction and/or history of CAD were taken from the population-based cohort as controls. The GHSExpress study is a subsample of GHS participants – who served as controls in the CADomics study – from which RNA was directly extracted from monocytes isolated from fresh blood samples. Characteristics of the CADomics and the GHSExpress study samples are provided in Table 1 and Supplementary Table 1. Further detailed description of the studies is provided in the Supplemental Material. Descriptions of the studies used for replication stages are provided in the Supplemental Material and Supplementary Table 2.
For CADomics, genomic DNA was isolated from buffy-coats of EDTA plasma samples as described elsewhere.17 Genotyping was conducted on the Affymetrix Genome-Wide Human SNP 6.0 Array; quality control on sample and SNP level was performed according to standardized criteria.18 Genotyping was performed in individuals of European descent only. A detailed description of genotyping methods and quality control is provided in the Supplemental Material. In total, 5,031 samples and 608,247 SNPs were included in the analyses. Supplementary Table 3 provides information on genotyping platforms and methods used for all replication studies.
Isolation of total RNA and analysis of gene expression were performed as recently described.15 In brief, total RNA was isolated from monocytes of 1,606 participants of the GHSExpress Study and hybridized to Illumina HT-12 v3 BeadChips (Illumina Inc., San Diego, USA). Arrays were quantile-normalized and transformed using the arcsinh function. After quality control, 14,027 expressed RefSeq transcripts in 1,494 samples were used for eQTL analyses. Detailed description of the methods is given in the Supplemental Material.
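As a rough illustration of the two preprocessing steps named above, the following Python sketch quantile-normalizes a set of arrays and applies the arcsinh transformation. It is a simplified stand-in, not the pipeline used in the study: the function names are hypothetical, ties are broken by position rather than averaged, and the order of the two steps is an assumption based on the wording of the text.

```python
import math


def quantile_normalize(arrays):
    """Quantile-normalize equally long arrays (one list of intensities per array).

    Each value is replaced by the mean, across arrays, of the values that
    share its rank. Ties are broken by position (a simplification).
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # probe indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def arcsinh_transform(arrays):
    """Apply the inverse hyperbolic sine to every intensity."""
    return [[math.asinh(v) for v in a] for a in arrays]
```

After quantile normalization every array has the same empirical distribution; arcsinh behaves like a logarithm for large intensities but remains defined at zero and for negative background-corrected values.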
eQTL transcripts and eSNPs were investigated for associations with prevalent cardiovascular risk factors (LDL- and HDL-cholesterol, triglycerides, diabetes mellitus, HbA1c, systolic and diastolic blood pressure) and phenotypes of subclinical disease (flow-mediated vasodilation and carotid macroangiopathy). Methods of risk factor measurements and descriptions of phenotype assessment are described in the Supplemental Material.
In the discovery GWA analysis, association of CAD with SNPs was tested using an additive genetic model in a logistic regression. In both replication steps (in silico and wet lab replication), fixed-effects meta-analysis using inverse-variance weighting was performed with the R package MetABEL.19
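The pooling step can be sketched as follows. This is a generic fixed-effects, inverse-variance-weighted combination of per-study log odds ratios written in Python for illustration; it does not reproduce the MetABEL interface, and the function names and the normal-approximation P-value are assumptions.

```python
import math


def fixed_effects_meta(betas, ses):
    """Combine per-study log odds ratios with inverse-variance weights.

    betas: per-study effect estimates (log OR per risk allele)
    ses:   their standard errors
    Returns (pooled beta, pooled SE, two-sided P from a normal approximation).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log odds ratio and its SE to an OR with a 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

Under the additive model, each study contributes a per-allele log odds ratio; studies with smaller standard errors receive proportionally more weight in the pooled estimate.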
Associations between SNPs and transcripts were investigated using the median test20 with a significance level of P-value <10⁻⁸, corresponding to a P-value of <10⁻¹² in an analysis of variance (ANOVA)20 for the samples that passed quality control for both genotype and expression data. SNPs located within 500 kb of either the 5′ or 3′ end of the associated gene were considered cis-acting SNPs; all others were considered trans-acting. Only associations of transcripts without SNPs in probe sequences are reported.21 Associations of eSNPs and eQTL transcripts with cardiovascular risk factors were analysed using logistic and linear regression for qualitative and quantitative traits, respectively. Triglycerides and HbA1c were log-transformed prior to analysis.
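The cis/trans rule can be expressed as a small Python function. The 500 kb window is taken from the text; the function name, the coordinate convention (base-pair positions on a named chromosome) and the treatment of SNPs inside the gene body as cis are assumptions made for this sketch.

```python
CIS_WINDOW_BP = 500_000  # 500 kb, as stated in the text


def classify_esnp(snp_chrom, snp_pos, gene_chrom, gene_start, gene_end):
    """Label a SNP-transcript pair as 'cis' or 'trans'.

    A SNP counts as cis if it lies on the same chromosome as the gene and
    no more than 500 kb beyond either gene end; SNPs inside the gene body
    are treated as cis (assumption).
    """
    if snp_chrom != gene_chrom:
        return "trans"
    if gene_start - CIS_WINDOW_BP <= snp_pos <= gene_end + CIS_WINDOW_BP:
        return "cis"
    return "trans"
```

With made-up coordinates, `classify_esnp("10", 1_200_000, "10", 1_000_000, 1_040_000)` returns `"cis"`, whereas the same SNP paired with a gene on another chromosome returns `"trans"`.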
P-values were corrected for multiple testing using the false discovery rate (FDR)22 and a significance level of 0.05.
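The text does not name the FDR procedure, so the following Python sketch assumes the widely used Benjamini-Hochberg step-up method; the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order.

    Assumes the Benjamini-Hochberg procedure; the study's exact FDR
    method is not specified in the text.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A test is then declared significant when its adjusted P-value falls below the chosen FDR level, here 0.05.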
All analyses were performed using R, version 2.10.1 (http://www.r-project.org).
The discovery GWAS revealed 950 SNPs that were associated with CAD at a level of P<10⁻³ in the 2,078 CAD cases and 2,953 population-based controls of the CADomics study. The strongest association was observed for the previously described region at 9p21.3 (lead SNP rs1333049: P=4.28×10⁻⁷, OR 1.22; 95% CI: 1.12-1.32). Detailed results of all associated SNPs are provided in Supplementary Table 4.
All 950 SNPs were selected for in silico replication in 7 independent case-control studies (9,487 cases and 30,171 controls). Only SNPs with P<10⁻⁴ in the pooled analysis of CADomics and the in silico replication studies were selected for wet lab replication (Supplementary Table 4). For loci with several CAD-associated SNPs, tagSNPs were selected for replication. A total of 20 SNPs was genotyped in 6 additional replication studies including 9,863 cases and 5,237 controls. Results of the discovery GWA study, both replication stages and the subsequent meta-analysis finally including 21,428 cases and 38,361 controls are presented in Table 2.
As expected, the chromosome 9p21.3 locus revealed the strongest association with CAD in the meta-analysis of all 14 studies included (lead SNP rs1333049: P=7.12×10⁻⁵⁸, OR 1.27, 95% CI: 1.23-1.31, Supplementary Figure 1). A locus on chromosome 10q23.31, so far not known to be associated with CAD, also reached genome-wide significance in the meta-analysis (Figure 2A; rs1412444: P=3.71×10⁻⁸; OR 1.1; 95% CI: 1.07-1.14; rs2246833: P=4.35×10⁻⁸; OR 1.1; 95% CI: 1.06-1.14).
All SNPs that reached genome-wide significance (Table 2) were further tested for association with monocytic transcripts in cis (SNPs located within 500 kb of either the 5′ or 3′ end of the associated gene) and trans. SNPs rs1412444 and rs2246833, located on chromosome 10q23.31 in intronic regions of the LIPA (Lysosomal Acid Lipase A) gene, showed a strong association with expression of the LIPA transcript itself (P=1.3×10⁻⁹⁶ and P=4.0×10⁻⁹⁶, respectively; Figure 2B and Table 3). Both LIPA SNPs were in strong linkage disequilibrium (r²=0.985) and for both SNPs the CAD risk allele (T) was associated with higher LIPA expression. Figure 2C displays regional plots for the association of LIPA eSNPs and eQTL transcripts in relation to CAD. A “platform validation” was conducted in 119 monocytic samples using qRT-PCR analyses and the association of LIPA SNPs with LIPA transcripts was successfully replicated (rs1412444: P=3.87×10⁻⁸, rs2246833: P=1.52×10⁻⁸; see also Supplementary Figure 2).
The CAD-associated SNPs in the 9p21.3 region, rs1333049, rs7865618 and rs7044859, showed no association to global monocytic gene expression.
To explore potential mechanisms mediating the genetic risk, the relationship of LIPA mRNA transcript and the respective LIPA eSNPs rs1412444 and rs2246833 to cardiovascular risk factors (LDL- and HDL-cholesterol, triglycerides, diabetes mellitus, HbA1c, systolic and diastolic blood pressure) and subclinical atherosclerotic disease (endothelial function and carotid macroangiopathy) was investigated. Detailed results are provided in Table 4 (A: eQTL transcript, B: eSNPs). Elevated LIPA expression was significantly associated with lower HDL-cholesterol levels (P=2.5×10⁻³) and impaired endothelial function measured by flow-mediated vasodilation (P=4.04×10⁻³), whereas associations with higher levels of LDL-cholesterol and triglycerides did not reach statistical significance. In contrast, no significant association between LIPA eSNPs and any cardiovascular risk factor was observed.
In addition to SNPs identified in our analysis we performed an eQTL analysis for SNPs previously reported to be associated with CAD and/or myocardial infarction3-5,7,23, but not found in our analysis. Of 26 SNPs investigated (Supplementary Table 5), only 3 SNPs in two loci were associated with eQTL transcripts (Table 3). In our data, the locus on chromosome 1p13 (represented by SNPs rs599839 and rs629301) revealed a strong association with PSRC1 transcripts with the risk allele for both SNPs associated with decreased transcript levels of PSRC1. For the second locus, the risk allele of SNP rs6725887, located within the WDR12 gene on chromosome 2q33, was associated with decreased FAM117B transcript levels (located close to WDR12).
The association of these eSNPs and eQTL transcripts with cardiovascular risk factors and phenotypes of subclinical disease was further analysed (Table 4, A: eQTL transcript, B: eSNPs). Significant associations between increased PSRC1 transcript levels and lower LDL cholesterol levels (P=8.2×10⁻³), higher HDL cholesterol levels (P=3.0×10⁻³), lower systolic and diastolic blood pressure (P=9.9×10⁻⁵ and P=3.5×10⁻⁴, respectively) and an improved endothelial function (P=2.2×10⁻⁴) were observed. As previously reported,3,4,24 the risk alleles of eSNPs rs599839 and rs629301 were robustly associated with increasing LDL cholesterol levels (P=3.96×10⁻⁴ and P=3.93×10⁻⁴). In addition, the risk alleles were associated with the extent of atherosclerotic plaques (P=1.44×10⁻³ and P=1.23×10⁻³). No significant association was found for FAM117B transcript levels and respective eSNPs with cardiovascular risk factors and phenotypes of subclinical disease.
A genome-wide association study for coronary artery disease was performed and identified loci were further evaluated to explore their potential functional relevance by (1) testing functionality of genetic variants in relation to gene expression, and (2) correlating expression levels with CAD risk factors and disease precursors like endothelial function and carotid atherosclerosis.
In addition to the previously known locus on chromosome 9p21, our study identified the LIPA gene on chromosome 10q23 as a novel CAD susceptibility locus (P=3.71×10⁻⁸ and P=4.35×10⁻⁸ for SNPs rs1412444 and rs2246833). In the subsequent eQTL analysis, LIPA genotypes displayed a strong association with LIPA transcripts (P=1.31×10⁻⁹⁶ and P=3.97×10⁻⁹⁶, respectively), with the CAD risk allele being associated with higher LIPA expression. Further, elevated LIPA expression itself was related to lower HDL cholesterol levels and impaired endothelial function, a precursor of CAD.
In humans, the LIPA gene encodes lysosomal acid lipase (LAL).25,26 LAL hydrolyzes cholesteryl esters and triglycerides delivered to the lysosome. If LAL is missing and/or not active, triglycerides and cholesteryl esters accumulate in the cell, resulting in foam cell formation and, as a consequence, in atherosclerotic plaque.27 Mutations in the LIPA gene are the cause of cholesteryl ester storage disease (CESD) and Wolman's disease.28,29 Patients suffering from these diseases also suffer from premature cardiovascular disease.29 Residual LAL activity determines the severity of clinical symptoms, with Wolman's patients having the lowest residual activity.30
Our data demonstrate that the LIPA CAD risk allele is associated with increased LIPA expression. Increased intrinsic LIPA expression might enhance intracellular release of fatty acids and cholesterol via the lysosomal route27 possibly explaining the association of the risk allele with impaired endothelial function, a precursor of atherosclerosis31. Furthermore, increased LIPA expression is expected to be associated with increased LAL-activity. Unesterified cholesterol is a hallmark of atherosclerotic lesions.32 In fact, cholesteryl ester hydrolysis has been shown to be a critical step in the enzymatic modification of LDL particles in the intima conferring the ability to activate complement to LDL and rendering them proatherogenic.33,34 Thus, the risk allele could increase the generation of enzymatically modified LDL and free cholesterol in the arterial intima, thereby promoting foam cell formation, complement activation, and an inflammatory process.
The significant association of the LIPA eSNPs rs1412444 and rs2246833 with CAD, their strong association with expression and the relation between transcript levels and subclinical disease in apparently healthy individuals strongly support a causal role for the LIPA gene in atherosclerosis.
We also studied the relationship of previously published loci to gene expression, cardiovascular risk factors and phenotypes. The association of the risk alleles on the 1p13 locus with decreased PSRC1 transcript and increased LDL cholesterol levels had been reported previously.24 Further, our data showed significant association for 1p13 eSNPs and PSRC1 transcript levels with blood pressure and endothelial function, indicating that this genetic risk locus might act through these CAD risk factors. In human liver, the 1p13 locus affects transcript levels of CELSR2, PSRC1 and SORT1 with the strongest regulatory effect for SORT1.3,24 Further, in a recent study by Musunuru et al.,35 liver-specific transcriptional regulation of the SORT1 gene by C/EBP transcription factors was shown and SORT1 has been nominated as the causal gene at the 1p13 locus for LDL cholesterol and MI. However, as previously reported by our group,15 SORT1 was not cis-regulated in our dataset of global monocytic gene expression. This suggests a different mechanism of transcript regulation at the 1p13 locus in monocytes and does not exclude PSRC1 as an important contributor to lipid levels and coronary artery disease.
Some limitations merit consideration. Cases comprise individuals with severe coronary atherosclerosis documented by angiography and myocardial infarction. Gene expression studies were performed in monocytes. Hence, other cell types might yield different results. Finally, we did not test expression profiles in cases. However, as patients are on CAD treatment, medication would most likely severely modify expression patterns.
Overall, the use of genome-wide SNP data and the monocyte transcriptome (GHSExpress, http://genecanvas.ecgene.net/uploads/ForReview/15) led to the identification of a novel locus potentially relevant for the development of CAD. The respective eSNPs strongly affected LIPA gene expression, and the LIPA expression level itself was related to subclinical disease as assessed by vascular endothelial function. The consistency of our results between genetic variants, LIPA expression level and disease precursor identifies LIPA as an attractive research candidate for follow-up functional studies, also emphasized by the association between LAL deficiency and the rare CESD and Wolman's diseases.
We appreciate the contribution of participants of the Gutenberg Heart Study and the AtheroGene Registry. We gratefully acknowledge the excellent medical and technical assistance of all technicians, study nurses, and coworkers involved in the Gutenberg Heart Study. We acknowledge Andreas Weith, Detlev Mennerich and Werner Rust for help during technical performance of GWA and global gene expression experiments.
Funding Sources: The Gutenberg Heart Study is funded through the government of Rheinland-Pfalz (“Stiftung Rheinland Pfalz für Innovation”, contract number AZ 961-386261/733), the research programs “Wissen schafft Zukunft” and “Schwerpunkt Vaskuläre Prävention” of the Johannes Gutenberg-University of Mainz and its contract with Boehringer Ingelheim and PHILIPS Medical Systems including an unrestricted grant for the Gutenberg Heart Study. Specifically, the research reported in this article was supported by the National Genome Network “NGFNplus” (contract number project A3 01GS0833 and 01GS0831) by the Federal Ministry of Education and Research, Germany.
Journal Subject Codes: Epidemiology; Risk Factors; Functional genomics; Gene expression; Genetics of cardiovascular disease; Genomics
Conflict of Interest Disclosures: Muredach P. Reilly and Daniel J Rader were supported by GlaxoSmithKline through an Alternate Drug Discovery Initiative research alliance award.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.