Both the prevalence and incidence of heart failure (HF) are increasing, especially among African-Americans, but no large-scale genome-wide association study (GWAS) of HF-related metabolites has been reported. We sought to identify novel genetic variants that are associated with metabolites previously reported to relate to HF incidence. GWASs of three metabolites identified previously as risk factors for incident HF (pyroglutamine, dihydroxy docosatrienoic acid and X-11787, being either hydroxy-leucine or hydroxy-isoleucine) were performed in 1260 African-Americans free of HF at the baseline examination of the Atherosclerosis Risk in Communities (ARIC) study. A significant association on chromosome 5q33 (rs10463316, MAF = 0.358, p-value = 1.92×10−10) was identified for pyroglutamine. One region on chromosome 2p13, which contained a nonsynonymous substitution in N-acetyltransferase 8 (NAT8), was associated with X-11787 (rs13538, MAF = 0.481, p-value = 1.71×10−23). The smallest p-value for dihydroxy docosatrienoic acid was observed for rs4006531 on chromosome 8q24 (MAF = 0.400, p-value = 6.98×10−7). None of the above SNPs was individually associated with incident HF, but a genetic risk score (GRS) created by summing the most significant risk alleles from each metabolite was associated with an 11% greater risk of HF per allele. In summary, we identified three loci associated with previously reported HF-related metabolites. Further use of metabolomics technology will facilitate replication of these findings in independent samples.
Heart failure (HF) is a leading cause of hospitalization among the elderly, and its incidence and prevalence continue to rise[Haldeman, Croft, Giles, & Rashidee, 1999; Kannel, 2000; Rich, 1997; Roger et al., 2011]. The pathogenesis of HF includes genetic and environmental factors, and understanding the role of genetic variations and their interactions with the environment may improve our understanding of HF onset and progression. Genome-wide association studies (GWAS) have proven to be a powerful tool for identifying genes and genomic regions having common sequence variation affecting a trait of interest, such as HF[Hirschhorn & Daly, 2005].
Metabolomics is the high-throughput study of the small molecule end products of a variety of chemical processes in a biologic system[Nicholson, Lindon, & Holmes, 1999]. Metabolites are the ultimate downstream product of gene function and environmental exposures[van der Greef, Stroobant, & van der Heijden, 2004], and thus may enhance our understanding of the role of both genes and the environment in the etiology of HF. Several studies have reported differences in the metabolome between cases and controls for common chronic diseases, including cardiovascular disease and HF[Dunn et al., 2007; Griffin, Atherton, Shockcor, & Atzori, 2011; Tuunanen, Ukkonen, & Knuuti, 2008; T. J. Wang et al., 2011; Z. Wang et al., 2011]. Recent studies combining genetics and metabolomics have provided novel functional insights related to several chronic diseases, including cardiovascular disease and type 2 diabetes[Suhre et al., 2011]. To date, no GWAS has been performed to interrogate the collective roles of genetics and the metabolome in the onset of incident HF. Zheng et al.[Zheng et al., 2013] identified three metabolites that were significantly associated with incident HF among African-Americans: pyroglutamine, dihydroxy docosatrienoic acid, and an unnamed compound (X-11787), which was revealed to be an isoform of either hydroxy-leucine or hydroxy-isoleucine. Identifying genetic factors that influence the levels of these novel metabolites may provide insights into their possible identity and function.
The ARIC study is a prospective cohort study designed to ascertain the etiology and predictors of cardiovascular disease (CVD), which enrolled 15,792 middle-aged adults from four U.S. communities (Forsyth County, NC; Jackson, MS; suburbs of Minneapolis, MN; and Washington County, MD) between 1987 and 1989. A detailed description of the ARIC study design and methods has been published elsewhere[The ARIC investigators, 1989]. Metabolomic profiles were measured in African-Americans randomly selected from the Jackson, MS field center, who were included in this genome-wide association study of HF-related metabolites (N = 1,260). African-Americans from both Jackson, MS and Forsyth County, NC who were free of prevalent HF were analyzed for the associations between incident HF and top ranking genome-wide significant variants (N = 2,225). Participants were excluded if they had prevalent HF, if they were first degree relatives of someone else in the study, if there were sample handling errors or discrepancies in race or sex between reported data and genotype data, or if they did not give consent for use of DNA information.
Metabolite profiling was completed in June 2010 using fasting serum samples that had been stored at −80 °C since collection at the baseline examination in 1987–1989. In total, detection and quantification of 602 metabolites was completed by Metabolon Inc. (Durham, USA) using an untargeted, gas chromatography-mass spectrometry and liquid chromatography-mass spectrometry (GC-MS and LC-MS)-based quantification protocol, the details of which were described elsewhere[Zheng et al., 2013].
Incident HF was defined during follow-up by either (1) a first hospitalization which included an International Classification of Diseases, 9th revision, discharge code of 428 (428.0 to 428.9) in any position for those without a prior HF hospitalization, or (2) a death certificate with a 428 (HF) or International Classification of Diseases, 10th revision, code I50 (HF) in any position[Loehr, Rosamond, Chang, Folsom, & Chambless, 2008; White et al., 1996]. Individuals were followed up for HF events from enrollment (baseline) until death or December 31, 2008, and those who were lost to follow-up were censored at the date of last contact.
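The code-based part of this case definition can be sketched as follows. This is a minimal illustration only, not the ARIC surveillance software: the function names `is_hf_code` and `has_hf_code` are hypothetical, codes are assumed to be written with a decimal point (for example `428.0`), and the additional requirement of no prior HF hospitalization is not modeled.

```python
def is_hf_code(code: str, revision: int) -> bool:
    """Return True if a single ICD code falls in the HF category.

    Assumes dotted notation such as "428.0" or "I50.9" (an assumption of
    this sketch, not something stated in the text).
    """
    normalized = code.strip().upper()
    if revision == 9:
        # ICD-9 category 428 (428.0 to 428.9)
        return normalized == "428" or normalized.startswith("428.")
    if revision == 10:
        # ICD-10 category I50
        return normalized == "I50" or normalized.startswith("I50.")
    raise ValueError("revision must be 9 or 10")


def has_hf_code(codes: list[str], revision: int) -> bool:
    """Return True if an HF code appears in any position of a record."""
    return any(is_hf_code(code, revision) for code in codes)


# Example: a discharge record with HF listed in the second position
print(has_hf_code(["410.1", "428.9"], revision=9))
```

Checking every position, rather than only the primary code, mirrors the "in any position" wording of the definition.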
In the ARIC study, autosomal single-nucleotide polymorphisms (SNPs) were genotyped on the Affymetrix 6.0 chip and were imputed to ≈ 2.5 million SNPs based on a panel of cosmopolitan reference haplotypes from HapMap CEU and YRI. Imputation was performed with MACH v1.0, and allele dosage information was summarized in the imputation results. SNPs were excluded if they had no chromosomal location, were monomorphic, had a call rate < 95%, or had a Hardy-Weinberg equilibrium p-value < 10−6. For each SNP, the ratio of the observed to the expected variance of the dosage served as the measure of imputation quality.
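The exclusion rules and the variance-ratio quality measure can be illustrated with a short sketch. It assumes that the expected variance is the binomial value 2p(1 − p) for an allele of frequency p and that the observed variance is the population variance of the dosages; the function names are hypothetical, and MACH's own implementation may differ in detail.

```python
from statistics import pvariance


def imputation_quality(dosages: list[float]) -> float:
    """Observed dosage variance divided by the expected binomial variance."""
    n = len(dosages)
    freq = sum(dosages) / (2 * n)      # estimated allele frequency p
    expected = 2 * freq * (1 - freq)   # variance of a 0/1/2 genotype under HWE
    if expected == 0:
        return 0.0                     # monomorphic SNP carries no information
    return pvariance(dosages) / expected


def passes_qc(has_location: bool, maf: float, call_rate: float, hwe_p: float) -> bool:
    """Apply the four exclusion criteria listed in the text."""
    return has_location and maf > 0 and call_rate >= 0.95 and hwe_p >= 1e-6


# Perfectly resolved genotypes give a ratio of 1; fully uncertain dosages give 0
print(imputation_quality([0, 1, 2, 1]), imputation_quality([1, 1, 1, 1]))
```

A ratio near 1 indicates dosages that behave like confidently called genotypes, while values near 0 indicate that imputation added little information for that SNP.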
The three incident HF-related metabolites identified by Zheng et al.[Zheng et al., 2013] were treated as continuous variables, and covariates in the model were measured at the same examination at which the serum samples later used for metabolite measurement were obtained. Linear regression analyses were applied to each metabolite separately, adjusting for age, sex and the first 10 principal components derived from principal components analysis to account for population stratification[Price et al., 2006]. Individuals with metabolite levels below the detectable limit of the assay were assigned the lowest detected value for that metabolite across all samples, and all metabolite values were natural log-transformed prior to analysis. SNP effects were estimated under an additive genetic model. SNPs with minor allele frequency (MAF) < 5% were excluded. Quantile-quantile (QQ) plots were generated for each analysis to illustrate the distribution of the observed and expected p-values for all eligible SNPs, and regional plots showing linkage disequilibrium (LD) and the location of nearby genes (if any) were generated for the top ranking SNPs for each metabolite. Genome-wide significance was defined as a p-value < 5×10−8, and a p-value < 1×10−5 was considered suggestive evidence for association. If more than one significant or suggestive SNP clustered at a locus, the SNP with the smallest p-value was reported as the sentinel marker. The analyses were performed using ProbABEL[Aulchenko, Struchalin, & van Duijn, 2010] and R (www.r-project.org).
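The metabolite preprocessing and the significance rules described above can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code (the analyses themselves used ProbABEL and R): undetected values are represented as `None`, and the helper names `log_transform`, `classify` and `sentinel` are hypothetical.

```python
import math

GENOME_WIDE = 5e-8   # genome-wide significance threshold from the text
SUGGESTIVE = 1e-5    # suggestive threshold from the text


def log_transform(levels: list) -> list:
    """Assign the lowest detected value to undetected entries, then take ln."""
    floor = min(v for v in levels if v is not None)
    return [math.log(floor if v is None else v) for v in levels]


def classify(p_value: float) -> str:
    """Label a p-value according to the two thresholds."""
    if p_value < GENOME_WIDE:
        return "genome-wide significant"
    if p_value < SUGGESTIVE:
        return "suggestive"
    return "not significant"


def sentinel(locus_results: dict):
    """Return the SNP with the smallest p-value among hits at one locus."""
    hits = {rsid: p for rsid, p in locus_results.items() if p < SUGGESTIVE}
    return min(hits, key=hits.get) if hits else None


# p-values reported in this article
print(classify(1.92e-10), "|", classify(6.98e-7))
print(sentinel({"rs6546857": 9.58e-24, "rs13538": 1.71e-23}))
```

Applied to the reported results, the rules label the pyroglutamine signal as genome-wide significant and the dihydroxy docosatrienoic acid signal as suggestive.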
To evaluate the cumulative effect of the identified genetic variants, a genetic risk score (GRS) was constructed by summing the number of risk raising alleles (0/1/2) for the top ranking genome-wide significant SNP of each metabolite. The proportional hazards assumption was examined and not rejected according to our assessment using the methods developed by Grambsch and Therneau[Grambsch & Therneau, 1994] and these analyses were performed using R (www.r-project.org).
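The construction of the score amounts to a simple allele count, sketched below. The SNP identifiers and alleles in the example are placeholders, not the alleles used in this study, and the sketch assumes hard genotype calls even though the imputed data provide dosages.

```python
def genetic_risk_score(genotypes: dict, risk_alleles: dict) -> int:
    """Sum the number of risk-raising alleles (0, 1 or 2) across SNPs.

    genotypes:    SNP id -> pair of observed alleles, e.g. ("A", "G")
    risk_alleles: SNP id -> allele treated as risk-raising
    """
    return sum(genotypes[snp].count(allele) for snp, allele in risk_alleles.items())


# Hypothetical individual typed at three placeholder SNPs
genotypes = {"snp_a": ("A", "G"), "snp_b": ("T", "T"), "snp_c": ("C", "C")}
risk_alleles = {"snp_a": "A", "snp_b": "T", "snp_c": "G"}
print(genetic_risk_score(genotypes, risk_alleles))  # 1 + 2 + 0 risk alleles
```

With one SNP per metabolite, the score for an individual ranges from 0 to twice the number of SNPs included.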
A total of 1,260 African-Americans were included in the genome-wide association analyses, and their baseline characteristics are shown in Table 1. In this study, pyroglutamine and X-11787 each had at least one locus that reached genome-wide significance (p-value < 5×10−8). The Manhattan plots, QQ plots and detailed information about statistically significant and suggestive loci are provided in Supplemental Tables 1–3 and Figures 1–2.
For pyroglutamine, the most significant SNP exceeding the genome-wide significance threshold was rs10463316 (MAF = 0.358, p-value = 1.92 × 10−10) (Figure 1A). rs10463316 is an intergenic SNP located on chromosome 5q33, 18.93 kb from the SLC36A2 gene (solute carrier family 36, member 2).
One region encompassing two genes on chromosome 2p13 was significantly associated with X-11787 levels. The top ranking SNP is intergenic (rs6546857, MAF = 0.477, p-value = 9.58 × 10−24), 0.91 kb from ALMS1 (Alstrom syndrome 1). The other gene in this same region is NAT8 (N-acetyltransferase 8). The sentinel SNP in NAT8 is a missense variant, rs13538 (MAF = 0.481, p-value = 1.71 × 10−23), which causes a phenylalanine to serine substitution (F143S) within the acetyltransferase domain of N-acetyltransferase 8. These two SNPs (rs6546857 and rs13538) are in strong linkage disequilibrium (LD) with r2 ≥ 0.8 (Figure 1B). After conditioning on the missense variant, no other SNP in this region showed a statistically significant relation to X-11787 levels (data not shown).
The SNP with the smallest p-value for dihydroxy docosatrienoic acid was rs4006531 on chromosome 8q24 (MAF = 0.400, p-value = 6.98 × 10−7), 90.48 kb from a hypothetical gene LOC10013023. No SNP reached the genome-wide significance threshold for this metabolite.
The top ranking SNP for each metabolite was chosen to construct a GRS in 2,225 African-Americans free of HF at the baseline examination who had been monitored for the onset of HF for up to 22 years (396 HF events). Baseline characteristics of these individuals are shown in Table 1. Even though none of these SNPs was significantly related to incident HF individually, the association between the cumulative GRS and incident HF was statistically significant in a Cox proportional hazards model after adjusting for traditional risk factors (HR = 1.11, 95% CI: 1.02–1.22, p-value = 0.019, Table 2). When a model that includes the traditional risk factors, the three metabolites and the GRS is fit, the GRS is no longer statistically significant (data not shown), suggesting that the metabolites likely mediate these genetic effects.
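To make the per-allele hazard ratio concrete, the following sketch works through the arithmetic on the reported estimate. It assumes a Wald-type confidence interval that is symmetric on the log scale and a log-linear effect of the score, as in a standard Cox model; the function names are hypothetical, and because the published HR and interval are rounded, the recomputed p-value is only expected to approximate the reported 0.019.

```python
import math


def se_from_ci(lower: float, upper: float, z_crit: float = 1.959964) -> float:
    """Standard error of log(HR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def two_sided_p(hr: float, se: float) -> float:
    """Two-sided normal p-value for the Wald statistic log(HR) / SE."""
    z = abs(math.log(hr)) / se
    return math.erfc(z / math.sqrt(2))


def hr_for_alleles(hr_per_allele: float, n_alleles: int) -> float:
    """Hazard ratio for n additional risk alleles under a log-linear effect."""
    return hr_per_allele ** n_alleles


se = se_from_ci(1.02, 1.22)
print(round(se, 4), round(two_sided_p(1.11, se), 3))
print(round(hr_for_alleles(1.11, 3), 2))  # three additional risk alleles
```

Under these assumptions, an individual carrying three more risk alleles than another would have roughly a 37% higher hazard of HF, other covariates being equal.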
This study utilized an untargeted metabolomics approach combined with genome-wide association screening in a large, well-defined sample of African-Americans to identify two genetic loci that influence two HF-related metabolic traits. The gene most likely influencing pyroglutamine levels is SLC36A2, which is an electrogenic amino acid symporter for amino acids with small side chains, especially glutamine[Bode, 2001; Boll, Foltz, Rubio-Aliaga, Kottra, & Daniel, 2002]. Pyroglutamine is a cyclic derivative of glutamine and it is plausible, but not proven, that variants in SLC36A2 may affect pyroglutamine transport. A recent family-based study reported interaction between SLC36A2 and SLC6A20, a proline imino transporter, on the onset of iminoglycinuria[Broer et al., 2008]. Our results provide new insights into SLC36A2 function at a population level.
We previously explored the identity of metabolite X-11787, which has a likely chemical formula of C6H13NO3, consistent with the chemical structures of either hydroxy-leucine or hydroxy-isoleucine[Zheng et al., 2013]. Hydroxy-leucines are oxidized from leucine or hydroperoxyleucines and provide useful in vivo markers of protein oxidation[Fu & Dean, 1997]. Protein oxidation is implicated in aging and oxidative stress[Berlett & Stadtman, 1997], which is associated with a number of diseases, such as atherosclerosis, hypertension, diabetes, and chronic kidney disease[de Champlain et al., 2004; Maritim, Sanders, & Watkins, 2003; Singh & Jialal, 2006; Small, Coombes, Bennett, Johnson, & Gobe, 2012]. The genetic locus influencing X-11787 contains two genes on chromosome 2p13, ALMS1 and NAT8. Mutations in ALMS1 are reported to cause Alström syndrome, a rare autosomal recessive disease sharing several features with the metabolic syndrome, namely obesity, hyperinsulinemia, and hypertriglyceridemia[Joy et al., 2007]. NAT8 participates in the development and maintenance of normal kidney and liver function[Ozaki, Fujiwara, Nakamura, & Takahashi, 1998]. Recent GWAS have reported that this locus is associated with kidney function and chronic kidney disease[Chambers et al., 2010; Kottgen et al., 2010]. Given NAT8’s function and the fact that X-11787 has an independent association with incident HF after adjusting for BMI, glucose levels, lipid levels and kidney function, we hypothesize that NAT8 influences an underlying, but as yet unknown, process that affects both HF and kidney disease.
To date, only two studies have used a genome-wide association study approach to localize genes (or gene regions) associated with incident HF, one including both European Americans and African Americans and the other in European Americans only[Larson et al., 2007; Smith et al., 2010]. In this sample of African-Americans from the ARIC study, none of the top genetic loci associated with HF were associated with any of the three metabolites reported previously to be related to incident HF[Zheng et al., 2013]. Therefore, the loci identified in earlier GWAS most likely have effects outside of biologic pathways represented in the serum metabolome measured here. Common diseases are generally thought to result from complex interactions between multiple genetic and environmental predisposing factors[Hirschhorn & Daly, 2005]. Several recent studies have shown that common genetic variants have joint effects on cardiovascular diseases, such as CHD[Anderson et al., 2010; Morrison et al., 2007]; however, such studies of HF are rare. Our study underscores the potential multifactorial nature of HF.
This study, to our knowledge, is the first GWAS to reveal genetic risk variants for the human metabolomic profile in African-Americans. In addition, it is the first study to estimate genetic effects on HF risk via disease-related metabolites. Our study also has several limitations. First, there are no appropriate African-American sample sets with metabolomic profiles or enough incident HF events that can be used for replication. Second, although strong signals were detected at several genetic loci, the generalizability of these results has not been established. Third, through our internal quality control process, we limited our analyses to those metabolites with few missing values, and the remaining missing values were imputed with the lowest observed value. If the very low values were due to genetic variation, the sensitivity of the metabolomic technology may have affected the statistical power of these analyses.
From a panel of three HF-related metabolites, we identified two loci that reached genome-wide significance for two of the metabolites. These findings contribute to the knowledge-base of HF physiology and to our understanding of the human metabolic profile. Further use of metabolomics technology should enable replication of these findings.
We acknowledge the essential role of the Atherosclerosis Risk in Communities (ARIC) Study in the development and support of this article. The authors also thank the staff and participants of the ARIC study for their important contributions. The metabolomics research was sponsored by the National Human Genome Research Institute (NHGRI). B.Y. and Y.Z. are supported in part by a training fellowship from Burroughs Wellcome Fund – The Houston Laboratory and Population Science Training Program in Gene-Environment Interaction (BWF Grant No. 1008200). J.A.N. is supported by a K01 from the National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases (5K01DK082729-04).
Sources of Funding
The Atherosclerosis Risk in Communities Study is carried out as a collaborative study supported by National Heart, Lung, and Blood Institute contracts (HHSN268201100005C, HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C, and HHSN268201100012C), R01HL087641, R01HL59367 and R01HL086694; National Human Genome Research Institute grant U01HG004402; and National Institutes of Health contract HHSN268200625226C. Infrastructure was partly supported by Grant Number UL1RR025005, a component of the National Institutes of Health and NIH Roadmap for Medical Research.