Modification of DNA by methylation of cytosines at CpG dinucleotides is a widespread phenomenon that leads to changes in gene expression, thereby influencing and regulating many biological processes. Recent technical advances in the genome-wide determination of single-base DNA methylation have enabled epigenome-wide association studies (EWASs). Early EWASs established robust associations of age and gender with the degree of CpG methylation at specific sites. Other studies uncovered associations with cigarette smoking. However, these studies have so far been conducted mainly in Caucasians, raising the question of whether their findings can be extrapolated to other populations.
Here, we present an EWAS with age, gender, and smoking status in a family study of 123 individuals of Arab descent. We determined DNA methylation at over 450,000 CpG sites using the Illumina Infinium HumanMethylation450 BeadChip, applied state-of-the-art data processing protocols, including correction for blood cell type heterogeneity and hidden confounders, and eliminated probes containing SNPs at the targeted CpG site using 40× whole-genome sequencing data. Using this approach, we could replicate the leading published EWAS associations with age, gender and smoking, and recovered hallmarks of gender-specific epigenetic changes. Interestingly, we could even replicate the recently reported precise prediction of chronological age based on the methylation of only a few selected CpG sites.
Our study supports the view that, when applied with state-of-the-art protocols to account for all potential confounders, DNA methylation arrays represent powerful tools for EWAS with more complex phenotypes that can also be successfully applied to non-Caucasian populations.
Electronic supplementary material
The online version of this article (doi:10.1186/s13148-014-0040-6) contains supplementary material, which is available to authorized users.
DNA methylation; Age; Gender; Smoking; Association study; Epigenetics
Using a nontargeted metabolomics approach of 447 fasting plasma metabolites, we searched for novel molecular markers that arise before and after hyperglycemia in a large population-based cohort of 2,204 females (115 type 2 diabetic [T2D] case subjects, 192 individuals with impaired fasting glucose [IFG], and 1,897 control subjects) from TwinsUK. Forty-two metabolites from three major fuel sources (carbohydrates, lipids, and proteins) were found to significantly correlate with T2D after adjusting for multiple testing; of these, 22 were previously reported as associated with T2D or insulin resistance. Fourteen metabolites were found to be associated with IFG. Among the metabolites identified, the branched-chain keto-acid metabolite 3-methyl-2-oxovalerate was the strongest predictive biomarker for IFG after glucose (odds ratio [OR] 1.65 [95% CI 1.39–1.95], P = 8.46 × 10⁻⁹) and was moderately heritable (h² = 0.20). The association was replicated in an independent population (n = 720, OR 1.68 [1.34–2.11], P = 6.52 × 10⁻⁶) and validated in 189 twins with urine metabolomics taken at the same time as plasma (OR 1.87 [1.27–2.75], P = 1 × 10⁻³). Results confirm an important role for catabolism of branched-chain amino acids in T2D and IFG. In conclusion, this T2D-IFG biomarker study has surveyed the broadest panel of nontargeted metabolites to date, revealing both novel and known associated metabolites and providing potential novel targets for clinical prediction and a deeper understanding of causal mechanisms.
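As a rough consistency check on the reported odds ratio, the standard error of the log odds ratio can be back-calculated from the 95% CI and converted into a Wald z-statistic. The sketch below assumes a symmetric Wald interval on the log-odds scale, which the abstract does not state; the function name `wald_from_ci` is illustrative and not taken from the study.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Back-calculate SE, z and two-sided P from an OR and its 95% CI.

    Assumption: the CI is a symmetric Wald interval on the log-odds scale.
    """
    log_or = math.log(odds_ratio)
    # Width of the interval on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability: 2 * Q(|z|) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_two_sided

# Discovery estimate for 3-methyl-2-oxovalerate and IFG quoted in the text.
se, z, p = wald_from_ci(1.65, 1.39, 1.95)
```

Because the published interval is rounded to two decimals, the recovered P value can only be expected to agree with the reported one to within its order of magnitude.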
Genome-wide association scans with high-throughput metabolic profiling provide unprecedented insights into how genetic variation influences metabolism and complex disease. Here we report the most comprehensive exploration of genetic loci influencing human metabolism to date, including 7,824 adult individuals from two European population studies. We report genome-wide significant associations at 145 metabolic loci and their biochemical connectivity regarding more than 400 metabolites in human blood. We extensively characterize the resulting in vivo blueprint of metabolism in human blood by integrating it with information regarding gene expression, heritability, overlap with known drug targets, previous association with complex disorders and inborn errors of metabolism. We further developed a database and web-based resources for data mining and results visualization. Our findings contribute to a greater understanding of the role of inherited variation in blood metabolic diversity, and identify potential new opportunities for pharmacologic development and disease understanding.
With diminishing costs of next generation sequencing (NGS), whole genome analysis becomes a standard tool for identifying genetic causes of inherited diseases. Commercial NGS service providers in general not only provide raw genomic reads, but further deliver SNP calls to their clients. However, the question for the user arises whether to use the SNP data as is, or process the raw sequencing data further through more sophisticated SNP calling pipelines with more advanced algorithms.
Here we report a detailed comparison of SNPs called using the popular GATK multiple-sample calling protocol with SNPs delivered by the Illumina CASAVA pipeline as part of a 40x whole genome sequencing project by Illumina Inc of 171 human genomes of Arab descent (108 unrelated Qatari genomes, 19 trios, and 2 families with rare diseases). GATK multi-sample calling identifies more variants than the CASAVA pipeline. The additional variants from GATK are robust in terms of Mendelian consistency but weak in terms of statistical parameters such as the transition/transversion (Ts/Tv) ratio. However, these additional variants do not make a difference in detecting the causative variants in the studied phenotype.
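The transition/transversion ratio used above as a quality parameter can be computed directly from the reference and alternate alleles of a call set. The following is a minimal sketch, not the pipeline used in the study; the function name and the toy call list are hypothetical.

```python
# Transitions: purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
BASES = {"A", "C", "G", "T"}

def ts_tv_ratio(snps):
    """Return the Ts/Tv ratio for an iterable of (ref, alt) alleles."""
    ts = tv = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        # Skip indels, ambiguous bases and non-variant records.
        if ref not in BASES or alt not in BASES or ref == alt:
            continue
        if (ref, alt) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ts / tv

# Toy call set: three transitions and two transversions.
calls = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
ratio = ts_tv_ratio(calls)
```

A subset of calls whose ratio falls well below that of the full call set is generally read as a sign of a higher false-positive rate in that subset.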
Both pipelines, GATK multi-sample calling and Illumina CASAVA single sample calling, have highly similar performance in SNP calling at the level of putatively causative variants.
Electronic supplementary material
The online version of this article (doi:10.1186/1756-0500-7-747) contains supplementary material, which is available to authorized users.
NGS; GATK; CASAVA; WGS pipeline; Mendelian inheritance; Qatari population; Multi-sample calling; Genotype calling; Variant; Trios; Illumina
With the help of epigenome-wide association studies (EWAS), increasing knowledge on the role of epigenetic mechanisms such as DNA methylation in disease processes is obtained. In addition, EWAS aid the understanding of behavioral and environmental effects on DNA methylation. In terms of statistical analysis, specific challenges arise from the characteristics of methylation data. First, methylation β-values represent proportions with skewed and heteroscedastic distributions. Thus, traditional modeling strategies assuming a normally distributed response might not be appropriate. Second, recent evidence suggests that not only mean differences but also variability in site-specific DNA methylation associates with diseases, including cancer. The purpose of this study was to compare different modeling strategies for methylation data in terms of model performance and performance of downstream hypothesis tests. Specifically, we used the generalized additive models for location, scale and shape (GAMLSS) framework to compare beta regression with Gaussian regression on raw, binary logit and arcsine square root transformed methylation data, with and without modeling a covariate effect on the scale parameter.
Using simulated and real data from a large population-based study and an independent sample of cancer patients and healthy controls, we show that beta regression does not outperform competing strategies in terms of model performance. In addition, Gaussian models for location and scale showed an improved performance as compared to models for location only. The best performance was observed for the Gaussian model on binary logit transformed β-values, referred to as M-values. Our results further suggest that models for location and scale are specifically sensitive towards violations of the distribution assumption and towards outliers in the methylation data. Therefore, a resampling procedure is proposed as a mode of inference and shown to diminish type I error rate in practically relevant settings. We apply the proposed method in an EWAS of BMI and age and reveal strong associations of age with methylation variability that are validated in an independent sample.
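The transformation behind the best-performing model can be written down in a few lines: M-values are commonly defined as the base-2 logit of the β-value. The sketch below shows the forward and inverse transform; the clipping constant `eps`, which keeps β away from 0 and 1, is an assumption added here and is not taken from the study.

```python
import math

def beta_to_m(beta, eps=1e-6):
    """Base-2 logit of a methylation beta-value (proportion in [0, 1])."""
    # Clip to avoid division by zero or log of zero at the boundaries.
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))

def m_to_beta(m):
    """Inverse transform: map an M-value back to a beta-value."""
    return 2.0 ** m / (2.0 ** m + 1.0)

m_half = beta_to_m(0.5)              # unmethylated and methylated signal equal
m_high = beta_to_m(0.8)              # log2(0.8 / 0.2)
round_trip = m_to_beta(beta_to_m(0.2))
```

The logit stretches the scale near 0 and 1, which is why a Gaussian model on M-values is less affected by the heteroscedasticity of raw β-values.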
Models for location and scale are promising tools for EWAS that may help to understand the influence of environmental factors and disease-related phenotypes on methylation variability and its role during disease development.
DNA methylation; Beta regression; GAMLSS; Infinium HumanMethylation450k BeadChip; EWAS; Modeling variability; Resampling; Model performance; Model comparison; Models for location and scale
High-throughput screening techniques that analyze the metabolic endpoints of biological processes can identify the contributions of genetic predisposition and environmental factors to the development of common diseases. Studies applying controlled physiological challenges can reveal dysregulation in metabolic responses that may be predictive for or associated with these diseases. However, large-scale epidemiological studies with well controlled physiological challenge conditions, such as extended fasting periods and defined food intake, pose logistic challenges. Culturally and religiously motivated behavioral patterns of life style changes provide a natural setting that can be used to enroll a large number of study volunteers. Here we report a proof of principle study conducted within a Muslim community, showing that a metabolomics study during the Holy Month of Ramadan can provide a unique opportunity to explore the pre-prandial and postprandial response of human metabolism to nutritional challenges. Up to five blood samples were obtained from eleven healthy male volunteers, taken directly before and two hours after consumption of a controlled meal in the evening on days 7 and 26 of Ramadan, and after an over-night fast several weeks after Ramadan. The observed increases in glucose, insulin and lactate levels at the postprandial time point confirm the expected physiological response to food intake. Targeted metabolomics further revealed significant and physiologically plausible responses to food intake by an increase in bile acid and amino acid levels and a decrease in long-chain acyl-carnitine and polyamine levels. A decrease in the concentrations of a number of phospholipids between samples taken on days 7 and 26 of Ramadan shows that the long-term response to extended fasting may differ from the response to short-term fasting. 
The present study design is scalable to larger populations and may be extended to the study of the metabolic response in defined patient groups such as individuals with type 2 diabetes.
Metabolomics; Nutritional challenging; Ramadan fasting; Study design; Clinical research
Individualized Medicine aims at providing optimal treatment for an individual patient at a given time based on his specific genetic and molecular characteristics. This requires excellent clinical stratification of patients as well as the availability of genomic data and biomarkers as prerequisites for the development of novel diagnostic tools and therapeutic strategies. The University Medicine Greifswald, Germany, has launched the “Greifswald Approach to Individualized Medicine” (GANI_MED) project to address major challenges of Individualized Medicine. Herein, we describe the implementation of the scientific and clinical infrastructure that allows future translation of findings relevant to Individualized Medicine into clinical practice.
Clinical patient cohorts (N > 5,000) with an emphasis on metabolic and cardiovascular diseases are being established following a standardized protocol for the assessment of medical history, laboratory biomarkers, and the collection of various biosamples for bio-banking purposes. A multi-omics based biomarker assessment including genome-wide genotyping, transcriptome, metabolome, and proteome analyses complements the multi-level approach of GANI_MED. Comparisons with the general background population as characterized by our Study of Health in Pomerania (SHIP) are performed. A central data management structure has been implemented to capture and integrate all relevant clinical data for research purposes. Ethical research projects on informed consent procedures, reporting of incidental findings, and economic evaluations were launched in parallel.
Personalized Medicine; Individualized Medicine; Epidemiology
The mechanism of action of antihypertensive and lipid-lowering drugs in the human organism is still not fully understood. New insights into the drugs’ action can be provided by a metabolomics-driven approach, which offers a detailed view of the physiological state of an organism. Here, we report a metabolome-wide association study with 295 metabolites in human serum from 1,762 participants of the KORA F4 (Cooperative Health Research in the Region of Augsburg) study population. Our intent was to find variations of metabolite concentrations related to the intake of various drug classes and, based on the associations found, to generate new hypotheses about on-target as well as off-target effects of these drugs. In total, we found 41 significant associations for the drug classes investigated: for beta-blockers (11 associations), angiotensin-converting enzyme (ACE) inhibitors (four associations), diuretics (seven associations), statins (ten associations), and fibrates (nine associations), the top hits were pyroglutamine, phenylalanylphenylalanine, pseudouridine, 1-arachidonoylglycerophosphocholine, and 2-hydroxyisobutyrate, respectively. For beta-blockers we observed significant associations with metabolite concentrations that are indicative of drug side-effects, such as increased serotonin and decreased free fatty acid levels. Intake of ACE inhibitors and statins was associated with metabolites that provide insight into the action of the drug itself on its target, such as an association of ACE inhibitors with des-Arg(9)-bradykinin and aspartylphenylalanine, a substrate and a product of the drug-inhibited ACE. The intake of statins, which reduce blood cholesterol levels, resulted in changes in the concentration of metabolites of the biosynthesis as well as of the degradation of cholesterol. Fibrates showed the strongest association with 2-hydroxyisobutyrate, which might be a breakdown product of fenofibrate and, thus, a possible marker for the degradation of this drug in the human organism.
The analysis of diuretics showed a heterogeneous picture that is difficult to interpret. Taken together, our results provide a basis for a deeper functional understanding of the action and side-effects of antihypertensive and lipid-lowering drugs in the general population.
Electronic supplementary material
The online version of this article (doi:10.1007/s10654-014-9910-7) contains supplementary material, which is available to authorized users.
Beta-blockers; Angiotensin-converting enzyme inhibitors; Diuretics; Statins; Fibrates; Metabolomics
We aimed to assess whether whole blood expression quantitative trait loci (eQTLs) with effects in cis and trans are robust and can be used to identify regulatory pathways affecting disease susceptibility.
Materials and Methods
We performed whole-genome eQTL analyses in 890 participants of the KORA F4 study and in two independent replication samples (SHIP-TREND, N = 976 and EGCUT, N = 842) using linear regression models and Bonferroni correction.
In the KORA F4 study, 4,116 cis-eQTLs (defined as SNP-probe pairs where the SNP is located within a 500 kb window around the transcription unit) and 94 trans-eQTLs reached genome-wide significance and overall 91% (92% of cis-, 84% of trans-eQTLs) were confirmed in at least one of the two replication studies. Different study designs including distinct laboratory reagents (PAXgene™ vs. Tempus™ tubes) did not affect reproducibility (separate overall replication overlap: 78% and 82%). Immune response pathways were enriched in cis- and trans-eQTLs and significant cis-eQTLs were partly coexistent in other tissues (cross-tissue similarity 40–70%). Furthermore, four chromosomal regions displayed simultaneous impact on multiple gene expression levels in trans, and 746 eQTL-SNPs have been previously reported to have clinical relevance. We demonstrated cross-associations between eQTL-SNPs, gene expression levels in trans, and clinical phenotypes as well as a link between eQTLs and human metabolic traits via modification of gene regulation in cis.
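The cis/trans distinction used above reduces to a positional test on each SNP-probe pair. The sketch below assumes that the 500 kb window means a 500 kb flank on each side of the transcription unit and that coordinates are 1-based on the same genome build; both are assumptions, and the function name is illustrative.

```python
def is_cis(snp_chrom, snp_pos, gene_chrom, gene_start, gene_end, window=500_000):
    """Return True if the SNP lies within `window` bp of the transcription unit.

    Assumption: the window is applied as a flank on each side of the gene.
    """
    if snp_chrom != gene_chrom:
        return False  # different chromosome: trans by definition
    return gene_start - window <= snp_pos <= gene_end + window

# Hypothetical transcription unit on chromosome 1, 1,000,000-1,050,000.
near = is_cis("1", 1_400_000, "1", 1_000_000, 1_050_000)      # inside the flank
far = is_cis("1", 400_000, "1", 1_000_000, 1_050_000)         # outside the flank
other_chrom = is_cis("2", 1_020_000, "1", 1_000_000, 1_050_000)
```

Pairs that fail this test would be treated as candidate trans-eQTLs and tested against the genome-wide threshold.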
Our data suggest that whole blood is a robust tissue for eQTL analysis and may be used both for biomarker studies and to enhance our understanding of molecular mechanisms underlying gene-disease associations.
The date palm is one of the oldest cultivated fruit trees. It is critical in many ways to cultures in arid lands by providing highly nutritious fruit while surviving extreme heat and environmental conditions. Despite its importance from antiquity, few genetic resources are available for improving the productivity and development of the dioecious date palm. To date there has been no genetic map and no sex chromosome has been identified.
Here we present the first genetic map for date palm and identify the putative date palm sex chromosome. We placed ~4000 markers on the map using nearly 1200 framework markers spanning a total of 1293 cM. We have integrated the genetic map, derived from the Khalas cultivar, with the draft genome and placed up to 19% of the draft genome sequence scaffolds onto linkage groups for the first time. This analysis revealed approximately 1.9 cM/Mb on the map. Comparison of the date palm linkage groups revealed significant long-range synteny to oil palm. Analysis of the date palm sex-determination region suggests it is telomeric on linkage group 12 and recombination is not suppressed in the full chromosome.
Based on a modified genotyping-by-sequencing approach we have overcome challenges due to lack of genetic resources and provide the first genetic map for date palm. Combined with the recent draft genome sequence of the same cultivar, this resource offers a critical new tool for date palm biotechnology, palm comparative genomics and a better understanding of sex chromosome development in the palms.
Sex chromosome; Genotyping by sequencing; Comparative genomics
Emerging technologies based on mass spectrometry or nuclear magnetic resonance enable the monitoring of hundreds of small metabolites from tissues or body fluids. Profiling of metabolites can help elucidate causal pathways linking established genetic variants to known disease risk factors such as blood lipid traits.
We applied statistical methodology to dissect causal relationships between single nucleotide polymorphisms, metabolite concentrations, and serum lipid traits, focusing on 95 genetic loci reproducibly associated with the four main serum lipids (total-, low-density lipoprotein-, and high-density lipoprotein- cholesterol and triglycerides). The dataset used included 2,973 individuals from two independent population-based cohorts with data for 151 small molecule metabolites and four main serum lipids. Three statistical approaches, namely conditional analysis, Mendelian randomization, and structural equation modeling, were compared to investigate causal relationship at sets of a single nucleotide polymorphism, a metabolite, and a lipid trait associated with one another.
A subset of three lipid-associated loci (FADS1, GCKR, and LPA) have a statistically significant association with at least one main lipid and one metabolite concentration in our data, defining a total of 38 cross-associated sets of a single nucleotide polymorphism, a metabolite and a lipid trait. Structural equation modeling provided sufficient discrimination to indicate that the association of a single nucleotide polymorphism with a lipid trait was mediated through a metabolite at 15 of the 38 sets, and involving variants at the FADS1 and GCKR loci.
These data provide a framework for evaluating the causal role of components of the metabolome (or other intermediate factors) in mediating the association between established genetic variants and diseases or traits.
The cross talk between the stroma and cancer cells plays a major role in phenotypic modulation. During peritoneal carcinomatosis, ovarian cancer cells interact with mesenchymal stem cells (MSCs), resulting in increased metastatic ability. Understanding the transcriptomic changes underlying the phenotypic modulation will allow identification of key genes to target. However, in the context of personalized medicine, we must consider inter- and intratumoral heterogeneity. In this study we used a pathway-based approach to illustrate the role of cell line background in transcriptomic modification during cross talk with MSCs.
We used two ovarian cancer cell lines as a surrogate for different ovarian cancer subtypes: OVCAR3 for an epithelial and SKOV3 for a mesenchymal subtype. We co-cultured them with MSCs. Genome-wide gene expression was determined after cell sorting. Ingenuity pathway analysis was used to decipher the cell-specific transcriptomic changes related to different pro-metastatic traits (adherence, migration, invasion, proliferation and chemoresistance).
We demonstrate that co-culture of ovarian cancer cells in direct cellular contact with MSCs induces broad transcriptomic changes related to enhanced metastatic ability. Genes related to cellular adhesion, invasion, migration, proliferation and chemoresistance were enriched under these experimental conditions. Network analysis of differentially expressed genes clearly shows a cell type-specific pattern.
Contact with the mesenchymal niche increases metastatic initiation and expansion through modification of the cancer cells’ transcriptome, in a manner dependent on the cellular subtype. Personalized medicine strategies might benefit from network analysis revealing the subtype-specific nodes to target in order to disrupt the acquired pro-metastatic profile.
Ovarian cancer; Mesenchymal stem cell; Transcriptome; Genomic modification; Metastasis
Changes in an individual’s human metabolic phenotype (metabotype) over time can be indicative of disorder-related modifications. Studies covering several months to a few years have shown that metabolic profiles are often specific for an individual. This “metabolic individuality” and detected changes may contribute to personalized approaches in human health care. However, it is not clear whether such individual metabotypes persist over longer time periods. Here we investigate the conservation of metabotypes characterized by 212 different metabolites of 818 participants from the Cooperative Health Research in the Region of Augsburg (KORA) population in Germany, taken within a 7-year time interval. For replication, we used paired samples from 83 non-related individuals from the TwinsUK study. Results indicated that over 40 % of all study participants could be uniquely identified after 7 years based on their metabolic profiles alone. Moreover, 95 % of the study participants showed a high degree of metabotype conservation (>70 %) whereas the remaining 5 % displayed major changes in their metabolic profiles over time. These latter individuals were likely to have undergone important biochemical changes between the two time points. We further show that metabolite conservation was positively associated with heritability (rank correlation 0.74), although there were some notable exceptions. Our results suggest that monitoring changes in metabotypes over several years can trace changes in health status and may provide indications for disease onset. Moreover, our study findings provide a general reference for metabotype conservation over longer time periods that can be used in biomarker discovery studies.
Electronic supplementary material
The online version of this article (doi:10.1007/s11306-014-0629-y) contains supplementary material, which is available to authorized users.
Metabolomics; Longitudinal study; Heritability; Population study
Genome-wide association studies (GWAS) have identified many common single nucleotide polymorphisms (SNPs) that associate with clinical phenotypes, but these SNPs usually explain just a small part of the heritability and have relatively modest effect sizes. In contrast, SNPs that associate with metabolite levels generally explain a higher percentage of the genetic variation and demonstrate larger effect sizes. Still, the discovery of SNPs associated with metabolite levels is challenging since testing all metabolites measured in typical metabolomics studies with all SNPs comes with a severe multiple testing penalty. We have developed an automated workflow approach that utilizes prior knowledge of biochemical pathways present in databases like KEGG and BioCyc to generate a smaller SNP set relevant to the metabolite. This paper explores the opportunities and challenges in the analysis of GWAS of metabolomic phenotypes and provides novel insights into the genetic basis of metabolic variation through the re-analysis of published GWAS datasets.
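The multiple testing penalty described above can be made concrete with a Bonferroni calculation. The SNP and metabolite counts below are hypothetical and serve only to show how restricting the SNP set to pathway-relevant genes relaxes the per-test threshold; they are not figures from the study.

```python
def bonferroni_threshold(n_snps, n_metabolites, alpha=0.05):
    """Per-test significance threshold when every SNP is tested against
    every metabolite and the family-wise error rate is held at alpha."""
    return alpha / (n_snps * n_metabolites)

# Hypothetical numbers: 500,000 SNPs genome-wide versus 2,000 SNPs in
# pathway-relevant genes, each tested against 200 metabolites.
genome_wide = bonferroni_threshold(500_000, 200)
pathway_based = bonferroni_threshold(2_000, 200)

# Factor by which the pathway-based threshold is less stringent.
relaxation = pathway_based / genome_wide
```

Under these assumed counts the threshold is relaxed by the same factor by which the SNP set shrinks, which is the gain the prior-knowledge filter is meant to deliver.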
Re-analysis of the published GWAS dataset from Illig et al. (Nature Genetics, 2010) using a pathway-based workflow (http://www.myexperiment.org/packs/319.html) confirmed previously identified hits and identified a new locus of human metabolic individuality, associating aldehyde dehydrogenase 1 family member L1 (ALDH1L1) with serine/glycine ratios in blood. Replication in an independent GWAS dataset of phospholipids (Demirkan et al., PLoS Genetics, 2012) identified two novel loci supported by additional literature evidence: GPAM (glycerol-3-phosphate acyltransferase) and CBS (cystathionine beta-synthase). In addition, the workflow approach provided novel insight into the affected pathways and the relevance of some of these gene-metabolite pairs in disease development and progression.
We demonstrate the utility of automated exploitation of background knowledge present in pathway databases for the analysis of GWAS datasets of metabolomic phenotypes. We report novel loci and potential biochemical mechanisms that contribute to our understanding of the genetic basis of metabolic variation and its relationship to disease development and progression.
Genome-wide association; Metabolite; Genotype-phenotype prioritization; Bioinformatics; Pathway databases
Genome-wide association studies (GWAS) have identified many risk loci for complex diseases, but effect sizes are typically small and information on the underlying biological processes is often lacking. Associations with metabolic traits as functional intermediates can overcome these problems and potentially inform individualized therapy. Here we report a comprehensive analysis of genotype-dependent metabolic phenotypes using a GWAS with non-targeted metabolomics. We identified 37 genetic loci associated with blood metabolite concentrations, of which 25 exhibit effect sizes that are unusually high for GWAS and account for 10–60% differences in metabolite levels per allele copy. Our associations provide new functional insights for many disease-related associations that have been reported in previous studies, including cardiovascular and kidney disorders, type 2 diabetes, cancer, gout, venous thromboembolism, and Crohn’s disease. Taken together, our study advances our knowledge of the genetic basis of metabolic individuality in humans and generates many new hypotheses for biomedical and pharmaceutical research.
Systems biology enables the identification of gene networks that modulate complex traits. Comprehensive metabolomic analyses provide innovative phenotypes that are intermediate between the initiator of genetic variability, the genome, and raw phenotypes that are influenced by a large number of environmental effects. The present study combines two concepts, systems biology and metabolic analyses, in an approach without prior functional hypothesis in order to dissect genes and molecular pathways that modulate differential growth at the onset of puberty in male cattle. Furthermore, this integrative strategy was applied to specifically explore distinctive gene interactions of non-SMC condensin I complex, subunit G (NCAPG) and myostatin (GDF8), known modulators of pre- and postnatal growth that are only partially understood for their molecular pathways affecting differential body weight.
Our study successfully established gene networks and interacting partners affecting growth at the onset of puberty in cattle. We demonstrated the biological relevance of the created networks by comparison to randomly created networks. Our data showed that GnRH (Gonadotropin-releasing hormone) signaling is associated with divergent growth at the onset of puberty and revealed two highly connected hubs, BTC and DGKH, within the network. Both genes are known to directly interact with the GnRH signaling pathway. Furthermore, a gene interaction network for NCAPG containing 14 densely connected genes revealed novel information concerning the functional role of NCAPG in divergent growth.
Merging both concepts, systems biology and metabolomic analyses, successfully yielded new insights into gene networks and interacting partners affecting growth at the onset of puberty in cattle. Genetic modulation in GnRH signaling was identified as key modifier of differential cattle growth at the onset of puberty. In addition, the benefit of our innovative concept without prior functional hypothesis was demonstrated by data suggesting that NCAPG might contribute to vascular smooth muscle contraction by indirect effects on the NO pathway via modulation of arginine metabolism. Our study shows for the first time in cattle that integration of genetic, physiological and metabolomics data in a systems biology approach will enable (or contribute to) an improved understanding of metabolic and gene networks and genotype-phenotype relationships.
Cattle; SEGFAM; Systems biology; Metabolomics; Genome-wide association study; Divergent growth; Puberty
Serum metabolite concentrations provide a direct readout of biological processes in the human body, and are associated with disorders such as cardiovascular and metabolic diseases. Here we present a genome-wide association study with 163 metabolic traits using 1809 participants from the KORA population, followed up in the TwinsUK cohort with 422 participants. In eight out of nine replicated loci (FADS1, ELOVL2, ACADS, ACADM, ACADL, SPTLC3, ETFDH, SLC16A9) the genetic variant is located in or near enzyme or solute carrier coding genes, where the associating metabolic traits match the proteins’ function. Many of these loci are located in rate limiting steps of important enzymatic reactions. Use of metabolite concentration ratios as proxies for enzymatic reaction rates reduces the variance and yields robust statistical associations with p-values between 3 × 10⁻²⁴ and 6.5 × 10⁻¹⁷⁹. These loci explained 5.6% to 36.3% of the observed variance. For several loci, associations with clinically relevant parameters have previously been reported.
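One intuition for why concentration ratios reduce variance is that any factor scaling both metabolites in a sample cancels in the ratio. The sketch below illustrates this with made-up concentrations and a hypothetical per-sample scaling factor; it is an illustration of the principle, not the study's analysis.

```python
import math
import statistics

def log_ratios(substrate, product):
    """Log of the substrate/product concentration ratio for each sample."""
    return [math.log(s / p) for s, p in zip(substrate, product)]

# Hypothetical per-sample scaling factor (e.g. dilution) shared by both
# metabolites, applied to fixed underlying concentrations of 10 and 4.
scaling = [0.8, 1.0, 1.3, 1.1]
substrate = [10.0 * f for f in scaling]
product = [4.0 * f for f in scaling]

ratios = log_ratios(substrate, product)
var_single = statistics.pvariance([math.log(s) for s in substrate])
var_ratio = statistics.pvariance(ratios)
```

Here the single metabolite carries all of the scaling noise while the log ratio is constant across samples, so a genotype effect on the ratio would stand out against a much smaller background variance.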
Previously, we reported strong influences of genetic variants on metabolic phenotypes, some of them with clinical relevance. Here, we hypothesize that DNA methylation may have an important and potentially independent effect on human metabolism. To test this hypothesis, we conducted what is to the best of our knowledge the first epigenome-wide association study (EWAS) between DNA methylation and metabolic traits (metabotypes) in human blood. We assess 649 blood metabolic traits from 1814 participants of the Kooperative Gesundheitsforschung in der Region Augsburg (KORA) population study for association with methylation of 457 004 CpG sites, determined on the Infinium HumanMethylation450 BeadChip platform. Using the EWAS approach, we identified two types of methylome–metabotype associations. One type is driven by an underlying genetic effect; the other type is independent of genetic variation and potentially driven by common environmental and life-style-dependent factors. We report eight CpG loci at genome-wide significance that have a genetic variant as confounder (P = 3.9 × 10⁻²⁰ to 2.0 × 10⁻¹⁰⁸, r² = 0.036 to 0.221). Seven loci display CpG site-specific associations to metabotypes, but do not exhibit any underlying genetic signals (P = 9.2 × 10⁻¹⁴ to 2.7 × 10⁻²⁷, r² = 0.008 to 0.107). We further identify several groups of CpG loci that associate with the same metabotype, such as 4-vinylphenol sulfate and 4-androsten-3-beta,17-beta-diol disulfate. In these cases, the association between CpG methylation and metabotype is likely the result of a common external environmental factor, including smoking. Our study shows that EWAS analyses with large numbers of metabolic traits in large population cohorts are, in principle, feasible. Taken together, our data suggest that DNA methylation plays an important role in regulating human metabolism.
Diabetes is generally diagnosed too late. Therefore, biomarkers indicating early stages of β-cell dysfunction and mass reduction would facilitate timely counteraction. Transgenic pigs expressing a dominant-negative glucose-dependent insulinotropic polypeptide receptor (GIPRdn) reveal progressive deterioration of glucose control and reduction of β-cell mass, providing a unique opportunity to study metabolic changes during the prediabetic period. Plasma samples from intravenous glucose tolerance tests of 2.5- and 5-month-old GIPRdn transgenic and control animals were analyzed for 163 metabolites by targeted mass spectrometry. Analysis of variance revealed that 26 of 163 parameters were influenced by the interaction Genotype × Age (P ≤ 0.0001) and thus are potential markers for progression within the prediabetic state. Among them, the concentrations of seven amino acids (Phe, Orn, Val, xLeu, His, Arg, and Tyr) were increased in 2.5-month-old but decreased in 5-month-old GIPRdn transgenic pigs versus controls. Furthermore, specific sphingomyelins, diacylglycerols, and ether phospholipids were decreased in plasma of 5-month-old GIPRdn transgenic pigs. Alterations in plasma metabolite concentrations were associated with liver transcriptome changes in relevant pathways. The concentrations of a number of plasma amino acids and lipids correlated significantly with β-cell mass of 5-month-old pigs. These metabolites represent candidate biomarkers of early phases of β-cell dysfunction and mass reduction.
Serum urate, the final breakdown product of purine metabolism, is causally involved in the pathogenesis of gout, and implicated in cardiovascular disease and type 2 diabetes. Serum urate levels differ markedly between men and women; however, the biological processes underlying its regulation are still not completely understood and are assumed to result from a complex interplay between genetic, environmental and lifestyle factors. In order to describe the metabolic vicinity of serum urate, we analyzed 355 metabolites in 1,764 individuals of the population-based KORA F4 study and constructed a metabolite network around serum urate using Gaussian Graphical Modeling in a hypothesis-free approach. We subsequently investigated the effect of sex and urate-lowering medication on all 38 metabolites assigned to the network. Within the resulting network three main clusters could be detected around urate, including the well-known pathway of purine metabolism, as well as several dipeptides, a group of essential amino acids, and a group of steroids. Of the 38 assigned metabolites, 25 showed strong differences between sexes. Association with uricostatic medication intake was not confined to purine metabolism but was seen for seven metabolites within the network. Our findings highlight pathways that are important in the regulation of serum urate and suggest that dipeptides, amino acids, and steroid hormones play a role in its regulation. The findings might have an impact on the development of specific targets in the treatment and prevention of hyperuricemia.
Electronic supplementary material
The online version of this article (doi:10.1007/s11306-013-0565-2) contains supplementary material, which is available to authorized users.
Gaussian Graphical Modeling; Metabolite network; Pathway reconstruction; Allopurinol; Uric acid; Purine metabolism
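Gaussian Graphical Modeling rests on partial correlations, which can be read off the inverse covariance (precision) matrix. The following is a minimal sketch on synthetic data, not the study's implementation: the three variables `a`, `b` and `c` form a hypothetical chain in which `a` and `c` are related only through `b`. It shows that the ordinary Pearson correlation between `a` and `c` is substantial, while their partial correlation is close to zero, so a network built on partial correlations would not draw that indirect edge.

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation matrix from the inverse covariance (precision) matrix."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

rng = np.random.default_rng(2)
n = 5000
# Hypothetical chain a -> b -> c: a and c are correlated only through b.
a = rng.normal(size=n)
b = a + rng.normal(size=n)
c = b + rng.normal(size=n)
data = np.column_stack([a, b, c])

pearson = np.corrcoef(data, rowvar=False)
pcor = partial_correlations(data)
print("Pearson r(a, c):", round(pearson[0, 2], 3))
print("Partial r(a, c | b):", round(pcor[0, 2], 3))
```

With many metabolites and fewer samples the covariance matrix may not be invertible, in which case a regularized estimator would be needed; that case is outside this sketch.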
The aim was to characterise associations between the circulating thyroid hormones, free thyroxine (FT4) and thyrotropin (TSH), and the metabolite profiles in serum samples from participants of the German population-based KORA F4 study. Analyses were based on the metabolite profile of 1463 euthyroid subjects. In serum samples, obtained after overnight fasting (≥8 h), 151 different metabolites were quantified in a targeted approach including amino acids, acylcarnitines (ACs), and phosphatidylcholines (PCs). Associations between metabolites and thyroid hormone concentrations were analysed using adjusted linear regression models. To draw conclusions on thyroid hormone related pathways, intra-class metabolite ratios were additionally explored. We discovered 154 significant associations (Bonferroni p < 1.75 × 10−04) between FT4 and various metabolites and metabolite ratios belonging to AC and PC groups. Significant associations with TSH were lacking. High FT4 levels were associated with increased concentrations of many ACs and various sums of ACs of different chain length, and with an increased ratio of C2 to C0. The inverse associations observed between FT4 and many serum PCs reflected the general decrease in PC concentrations. Similar results were found in subgroup analyses, e.g., in weight-stable subjects or in obese subjects. Further, results were independent of different parameters for liver or kidney function, or inflammation, which supports the notion of an independent FT4 effect. In fasting euthyroid adults, higher serum FT4 levels are associated with increased serum AC concentrations and an increased ratio of C2 to C0, which is indicative of an overall enhanced fatty acyl mitochondrial transport and β-oxidation of fatty acids.
Electronic supplementary material
The online version of this article (doi:10.1007/s11306-013-0563-4) contains supplementary material, which is available to authorized users.
Targeted metabolomics; Serum metabolites; Free thyroxine; Thyrotropin; Thyroid hormones; Epidemiology
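The Bonferroni threshold quoted above follows from simple arithmetic, sketched here. The nominal significance level of 0.05 is an assumption, since the abstract does not state it, and the helper names are hypothetical. Under that assumption, a threshold of 1.75 × 10−04 corresponds to roughly 286 tests, which would be consistent with testing the metabolites together with the additional ratios.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def implied_number_of_tests(alpha, threshold):
    """Number of tests implied by a reported Bonferroni threshold."""
    return alpha / threshold

# Assumed nominal alpha of 0.05; the threshold is the one reported in the abstract.
implied = implied_number_of_tests(0.05, 1.75e-4)
print(f"Implied number of tests: {implied:.1f}")
print(f"Threshold for 286 tests: {bonferroni_threshold(0.05, 286):.3e}")
```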
Background Human ageing is a complex, multifactorial process and early developmental factors affect health outcomes in old age.
Methods Metabolomic profiling on fasting blood was carried out in 6055 individuals from the UK. Stepwise regression was performed to identify a panel of independent metabolites which could be used as a surrogate for age. We also investigated the association with birthweight overall and within identical discordant twins and with genome-wide methylation levels.
Results We identified a panel of 22 metabolites which combined are strongly correlated with age (R2 = 59%) and with age-related clinical traits independently of age. One particular metabolite, C-glycosyl tryptophan (C-glyTrp), correlated strongly with age (beta = 0.03, SE = 0.001, P = 7.0 × 10−157) and lung function (FEV1 beta = −0.04, SE = 0.008, P = 1.8 × 10−8 adjusted for age and confounders) and was replicated in an independent population (n = 887). C-glyTrp was also associated with bone mineral density (beta = −0.01, SE = 0.002, P = 1.9 × 10−6) and birthweight (beta = −0.06, SE = 0.01, P = 2.5 × 10−9). The difference in C-glyTrp levels explained 9.4% of the variance in the difference in birthweight between monozygotic twins. An epigenome-wide association study in 172 individuals identified three CpG sites associated with levels of C-glyTrp (P < 2 × 10−6). We replicated one CpG site in the promoter of the WDR85 gene in an independent sample of 350 individuals (beta = −0.20, SE = 0.04, P = 2.9 × 10−8). WDR85 is a regulator of translation elongation factor 2, essential for protein synthesis in eukaryotes.
Conclusions Our data illustrate how metabolomic profiling linked with epigenetic studies can identify some key molecular mechanisms potentially determined in early development that produce long-term physiological changes influencing human health and ageing.
Ageing; metabolomics; epigenetics; twin studies; developmental origins of health and disease; birthweight
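The idea of selecting a panel of metabolites by stepwise regression can be sketched as a greedy forward search. This is a minimal illustration on synthetic data under stated assumptions, not the procedure used in the study: the stopping rule (a minimum gain in R² of 0.01), the function names, and the simulated age model with two informative metabolites are all hypothetical, and the abstract does not specify the selection criterion.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    return 1.0 - residual.var() / y.var()

def forward_stepwise(X, y, min_gain=0.01):
    """Greedily add the predictor that most improves R^2 until the gain is below min_gain."""
    selected, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [(r_squared(X[:, selected + [j]], y), j) for j in remaining]
        r2, j = max(scores)
        if r2 - best_r2 < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best_r2 = r2
    return selected, best_r2

rng = np.random.default_rng(3)
n, n_metabolites = 1000, 10
X = rng.normal(size=(n, n_metabolites))          # synthetic standardized metabolite levels
# Assumed model: only metabolites 2 and 5 carry information about age.
age = 50.0 + 8.0 * X[:, 2] + 5.0 * X[:, 5] + rng.normal(0.0, 6.0, size=n)

panel, r2 = forward_stepwise(X, age)
print("Selected metabolite indices:", panel, "combined R^2:", round(r2, 3))
```

In practice the stopping rule would more commonly be based on a p-value or an information criterion, and the selected panel should be validated in independent samples.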
Advanced glycation end products (AGEs) have been shown to be a predictor of cardiovascular risk in Caucasian subjects. In this study we examined whether the existing reference values are usable for non-Caucasian ethnicities. Furthermore, we assessed whether gender and smoking affect AGEs.
AGEs were determined by a non-invasive method of skin auto-fluorescence (AF). AF was measured in 200 Arabs, 99 South Asians, 35 Filipinos and 14 subjects of other/mixed ethnicity in the Qatar Metabolomics Study on Diabetes (QMDiab). Using multivariate linear regression analysis and adjusting for age and type 2 diabetes, we assessed whether ethnicity, gender and smoking were associated with AF.
The mean AF was 2.27 arbitrary units (AU) (SD: 0.63). Arabs and Filipinos had a significantly higher AF than the South Asian population (0.25 AU (95% CI: 0.11‒0.39), p = 0.001 and 0.34 AU (95% CI: 0.13‒0.55), p = 0.001, respectively). Also, AF was significantly higher in females (0.41 AU (95% CI: 0.29‒0.53), p < 0.001). AF was associated with smoking (0.21 AU (95% CI: 0.01‒0.41), p = 0.04) and increased with the number of pack-years smoked (p = 0.02).
This study suggests that the existing reference values should take ethnicity, gender and smoking into account. Larger studies in specific ethnicities are necessary to create ethnic- and gender-specific reference values.
skin auto-fluorescence; advanced glycation endproducts; type 2 diabetes; gender differences; ethnicity; smoking; epidemiology
Metabolomics helps to identify links between environmental exposures and intermediate biomarkers of disturbed pathways. We previously reported variations in phosphatidylcholines in male smokers compared with non-smokers in a cross-sectional pilot study with a small sample size, but knowledge of the reversibility of smoking effects on metabolite profiles is limited. Here, we extend our metabolomics study with a large prospective study including female smokers and quitters.
Using a targeted metabolomics approach, we quantified 140 metabolite concentrations for 1,241 fasting serum samples in the population-based Cooperative Health Research in the Region of Augsburg (KORA) human cohort at two time points: a baseline survey conducted between 1999 and 2001 and a follow-up after seven years. Metabolite profiles were compared among groups of current smokers, former smokers and never smokers, and were further assessed for their reversibility after smoking cessation. Changes in metabolite concentrations from baseline to follow-up were investigated in a longitudinal analysis comparing current smokers, never smokers and smoking quitters, who were current smokers at baseline but former smokers by the time of follow-up. In addition, we constructed protein-metabolite networks with smoking-related genes and metabolites.
We identified 21 smoking-related metabolites in the baseline investigation (18 in men and six in women, with three overlaps) enriched in amino acid and lipid pathways, which were significantly different between current smokers and never smokers. Moreover, 19 out of the 21 metabolites were found to be reversible in former smokers. In the follow-up study, 13 reversible metabolites in men were measured, of which 10 were confirmed to be reversible in male quitters. Protein-metabolite networks are proposed to explain the consistent reversibility of smoking effects on metabolites.
We showed that smoking-related changes in human serum metabolites are reversible after smoking cessation, consistent with the known cardiovascular risk reduction. The metabolites identified may serve as potential biomarkers to evaluate the status of smoking cessation and characterize smoking-related diseases.
metabolic network; metabolomics; molecular epidemiology; smoking; smoking cessation
Nuclear magnetic resonance spectroscopy (NMR) provides robust readouts of many metabolic parameters in one experiment. However, identification of clinically relevant markers in 1H NMR spectra is a major challenge. Association of NMR-derived quantities with genetic variants can uncover biologically relevant metabolic traits. Using NMR data of plasma samples from 1,757 individuals from the KORA study together with 655,658 genetic variants, we show that ratios between NMR intensities at two chemical shift positions can provide informative and robust biomarkers. We report seven loci of genetic association with NMR-derived traits (APOA1, CETP, CPS1, GCKR, FADS1, LIPC, PYROXD2) and characterize these traits biochemically using mass spectrometry. These ratios may now be used in clinical studies.