Genome-wide association studies (GWAS) have identified many common single nucleotide polymorphisms (SNPs) that associate with clinical phenotypes, but these SNPs usually explain just a small part of the heritability and have relatively modest effect sizes. In contrast, SNPs that associate with metabolite levels generally explain a higher percentage of the genetic variation and demonstrate larger effect sizes. Still, the discovery of SNPs associated with metabolite levels is challenging since testing all metabolites measured in typical metabolomics studies with all SNPs comes with a severe multiple testing penalty. We have developed an automated workflow approach that utilizes prior knowledge of biochemical pathways present in databases like KEGG and BioCyc to generate a smaller SNP set relevant to the metabolite. This paper explores the opportunities and challenges in the analysis of GWAS of metabolomic phenotypes and provides novel insights into the genetic basis of metabolic variation through the re-analysis of published GWAS datasets.
Re-analysis of the published GWAS dataset from Illig et al. (Nature Genetics, 2010) using a pathway-based workflow (http://www.myexperiment.org/packs/319.html) confirmed previously identified hits and identified a new locus of human metabolic individuality, associating aldehyde dehydrogenase 1 family member L1 (ALDH1L1) with serine/glycine ratios in blood. Replication in an independent GWAS dataset of phospholipids (Demirkan et al., PLoS Genetics, 2012) identified two novel loci supported by additional literature evidence: GPAM (glycerol-3-phosphate acyltransferase) and CBS (cystathionine beta-synthase). In addition, the workflow approach provided novel insight into the affected pathways and the relevance of some of these gene-metabolite pairs in disease development and progression.
We demonstrate the utility of automated exploitation of background knowledge present in pathway databases for the analysis of GWAS datasets of metabolomic phenotypes. We report novel loci and potential biochemical mechanisms that contribute to our understanding of the genetic basis of metabolic variation and its relationship to disease development and progression.
Genome-wide association; Metabolite; Genotype-phenotype prioritization; Bioinformatics; Pathway databases
Genome-wide association studies (GWAS) have identified many risk loci for complex diseases, but effect sizes are typically small and information on the underlying biological processes is often lacking. Associations with metabolic traits as functional intermediates can overcome these problems and potentially inform individualized therapy. Here we report a comprehensive analysis of genotype-dependent metabolic phenotypes using a GWAS with non-targeted metabolomics. We identified 37 genetic loci associated with blood metabolite concentrations, of which 25 exhibit effect sizes that are unusually high for GWAS and account for 10-60% of metabolite levels per allele copy. Our associations provide new functional insights for many disease-related associations that have been reported in previous studies, including cardiovascular and kidney disorders, type 2 diabetes, cancer, gout, venous thromboembolism, and Crohn’s disease. Taken together our study advances our knowledge of the genetic basis of metabolic individuality in humans and generates many new hypotheses for biomedical and pharmaceutical research.
Systems biology enables the identification of gene networks that modulate complex traits. Comprehensive metabolomic analyses provide innovative phenotypes that are intermediate between the initiator of genetic variability, the genome, and raw phenotypes that are influenced by a large number of environmental effects. The present study combines two concepts, systems biology and metabolic analyses, in an approach without prior functional hypothesis in order to dissect genes and molecular pathways that modulate differential growth at the onset of puberty in male cattle. Furthermore, this integrative strategy was applied to specifically explore distinctive gene interactions of non-SMC condensin I complex, subunit G (NCAPG) and myostatin (GDF8), known modulators of pre- and postnatal growth that are only partially understood for their molecular pathways affecting differential body weight.
Our study successfully established gene networks and interacting partners affecting growth at the onset of puberty in cattle. We demonstrated the biological relevance of the created networks by comparison to randomly created networks. Our data showed that GnRH (Gonadotropin-releasing hormone) signaling is associated with divergent growth at the onset of puberty and revealed two highly connected hubs, BTC and DGKH, within the network. Both genes are known to directly interact with the GnRH signaling pathway. Furthermore, a gene interaction network for NCAPG containing 14 densely connected genes revealed novel information concerning the functional role of NCAPG in divergent growth.
Merging both concepts, systems biology and metabolomic analyses, successfully yielded new insights into gene networks and interacting partners affecting growth at the onset of puberty in cattle. Genetic modulation in GnRH signaling was identified as a key modifier of differential cattle growth at the onset of puberty. In addition, the benefit of our innovative concept without prior functional hypothesis was demonstrated by data suggesting that NCAPG might contribute to vascular smooth muscle contraction by indirect effects on the NO pathway via modulation of arginine metabolism. Our study shows for the first time in cattle that integrating genetic, physiological and metabolomics data in a systems biology approach can contribute to an improved understanding of metabolic and gene networks and genotype-phenotype relationships.
Cattle; SEGFAM; Systems biology; Metabolomics; Genome-wide association study; Divergent growth; Puberty
Serum metabolite concentrations provide a direct readout of biological processes in the human body, and are associated with disorders such as cardiovascular and metabolic diseases. Here we present a genome-wide association study with 163 metabolic traits using 1809 participants from the KORA population, followed up in the TwinsUK cohort with 422 participants. In eight out of nine replicated loci (FADS1, ELOVL2, ACADS, ACADM, ACADL, SPTLC3, ETFDH, SLC16A9) the genetic variant is located in or near enzyme or solute carrier coding genes, where the associating metabolic traits match the proteins’ function. Many of these loci are located in rate limiting steps of important enzymatic reactions. Use of metabolite concentration ratios as proxies for enzymatic reaction rates reduces the variance and yields robust statistical associations with p-values between 3×10−24 and 6.5×10−179. These loci explained 5.6% to 36.3% of the observed variance. For several loci, associations with clinically relevant parameters have previously been reported.
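The variance argument for metabolite ratios can be illustrated with a small sketch. It is not the analysis used in the study: all numbers are invented, and the names `shared`, `efficiency`, `substrate` and `product` are hypothetical. The sketch assumes that both metabolites of a substrate/product pair carry a common per-sample factor, which cancels in the log ratio so that only the genotype-dependent conversion term remains.

```python
import math

# Hypothetical toy data: every value below is invented for illustration.
# Per-sample factor shared by both metabolites (e.g. sample dilution).
shared = [0.8, 1.0, 1.3, 0.9, 1.2, 1.1]
# Assumed genotype-dependent conversion efficiency of a hypothetical enzyme.
efficiency = [0.5, 0.5, 1.0, 1.0, 2.0, 2.0]

substrate = [10.0 * s for s in shared]
product = [10.0 * s * e for s, e in zip(shared, efficiency)]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Log ratio product/substrate, used as a proxy for the reaction rate.
log_ratio = [math.log(p / s) for p, s in zip(product, substrate)]

# Variation left over after removing the genotype effect (log efficiency).
residual_single = [math.log(p) - math.log(e) for p, e in zip(product, efficiency)]
residual_ratio = [r - math.log(e) for r, e in zip(log_ratio, efficiency)]

print("residual variance, product alone:", variance(residual_single))
print("residual variance, log ratio:    ", variance(residual_ratio))
```

In this toy setting the single metabolite keeps the variance of the shared factor, whereas the ratio retains essentially none of it, which is one way to read the stronger associations reported for ratios.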
Previously, we reported strong influences of genetic variants on metabolic phenotypes, some of them with clinical relevance. Here, we hypothesize that DNA methylation may have an important and potentially independent effect on human metabolism. To test this hypothesis, we conducted what is to the best of our knowledge the first epigenome-wide association study (EWAS) between DNA methylation and metabolic traits (metabotypes) in human blood. We assess 649 blood metabolic traits from 1814 participants of the Kooperative Gesundheitsforschung in der Region Augsburg (KORA) population study for association with methylation of 457 004 CpG sites, determined on the Infinium HumanMethylation450 BeadChip platform. Using the EWAS approach, we identified two types of methylome–metabotype associations. One type is driven by an underlying genetic effect; the other type is independent of genetic variation and potentially driven by common environmental and life-style-dependent factors. We report eight CpG loci at genome-wide significance that have a genetic variant as confounder (P = 3.9 × 10−20 to 2.0 × 10−108, r2 = 0.036 to 0.221). Seven loci display CpG site-specific associations to metabotypes, but do not exhibit any underlying genetic signals (P = 9.2 × 10−14 to 2.7 × 10−27, r2 = 0.008 to 0.107). We further identify several groups of CpG loci that associate with the same metabotype, such as 4-vinylphenol sulfate and 4-androsten-3-beta,17-beta-diol disulfate. In these cases, the association between CpG methylation and metabotype is likely the result of a common external environmental factor, including smoking. Our study shows that EWAS with large numbers of metabolic traits in large population cohorts are, in principle, feasible. Taken together, our data suggest that DNA methylation plays an important role in regulating human metabolism.
Diabetes is generally diagnosed too late. Therefore, biomarkers indicating early stages of β-cell dysfunction and mass reduction would facilitate timely counteraction. Transgenic pigs expressing a dominant-negative glucose-dependent insulinotropic polypeptide receptor (GIPRdn) reveal progressive deterioration of glucose control and reduction of β-cell mass, providing a unique opportunity to study metabolic changes during the prediabetic period. Plasma samples from intravenous glucose tolerance tests of 2.5- and 5-month-old GIPRdn transgenic and control animals were analyzed for 163 metabolites by targeted mass spectrometry. Analysis of variance revealed that 26 of 163 parameters were influenced by the interaction Genotype × Age (P ≤ 0.0001) and thus are potential markers for progression within the prediabetic state. Among them, the concentrations of seven amino acids (Phe, Orn, Val, xLeu, His, Arg, and Tyr) were increased in 2.5-month-old but decreased in 5-month-old GIPRdn transgenic pigs versus controls. Furthermore, specific sphingomyelins, diacylglycerols, and ether phospholipids were decreased in plasma of 5-month-old GIPRdn transgenic pigs. Alterations in plasma metabolite concentrations were associated with liver transcriptome changes in relevant pathways. The concentrations of a number of plasma amino acids and lipids correlated significantly with β-cell mass of 5-month-old pigs. These metabolites represent candidate biomarkers of early phases of β-cell dysfunction and mass reduction.
Serum urate, the final breakdown product of purine metabolism, is causally involved in the pathogenesis of gout, and implicated in cardiovascular disease and type 2 diabetes. Serum urate levels differ markedly between men and women; however, the biological processes underlying its regulation are still not completely understood, and urate levels are assumed to result from a complex interplay between genetic, environmental and lifestyle factors. In order to describe the metabolic vicinity of serum urate, we analyzed 355 metabolites in 1,764 individuals of the population-based KORA F4 study and constructed a metabolite network around serum urate using Gaussian Graphical Modeling in a hypothesis-free approach. We subsequently investigated the effect of sex and urate lowering medication on all 38 metabolites assigned to the network. Within the resulting network three main clusters could be detected around urate, including the well-known pathway of purine metabolism, as well as several dipeptides, a group of essential amino acids, and a group of steroids. Of the 38 assigned metabolites, 25 showed strong differences between sexes. Association with uricostatic medication intake was not confined to purine metabolism but was seen for seven metabolites within the network. Our findings highlight pathways that are important in the regulation of serum urate and suggest that dipeptides, amino acids, and steroid hormones play a role in its regulation. The findings might have an impact on the development of specific targets in the treatment and prevention of hyperuricemia.
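Gaussian graphical models connect two metabolites only when they remain correlated after conditioning on all other measured metabolites, that is, when their partial correlation is non-zero. The sketch below shows the simplest possible case with three variables, where the full-order partial correlation reduces to the textbook first-order formula. It is an illustration only: the data are invented, centred values for a hypothetical chain A → B → C, and a real analysis with hundreds of metabolites would instead work from the inverse covariance matrix.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after conditioning on the single variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical centred (e.g. log-scaled) levels for a chain A -> B -> C:
# A and C each equal B plus an independent disturbance.
b = [2.0, 2.0, -2.0, -2.0]
a = [3.0, 1.0, -1.0, -3.0]   # b + [1, -1, 1, -1]
c = [3.0, 1.0, -3.0, -1.0]   # b + [1, -1, -1, 1]

print("corr(A, C):         ", pearson(a, c))
print("partial(A, C | B):  ", partial_corr(a, c, b))
print("partial(A, B | C):  ", partial_corr(a, b, c))
```

In this constructed example A and C are strongly correlated, but the partial correlation given B vanishes, so a GGM would draw edges A–B and B–C and omit the indirect A–C edge.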
Electronic supplementary material
The online version of this article (doi:10.1007/s11306-013-0565-2) contains supplementary material, which is available to authorized users.
Gaussian Graphical Modeling; Metabolite network; Pathway reconstruction; Allopurinol; Uric acid; Purine metabolism
Background Human ageing is a complex, multifactorial process and early developmental factors affect health outcomes in old age.
Methods Metabolomic profiling on fasting blood was carried out in 6055 individuals from the UK. Stepwise regression was performed to identify a panel of independent metabolites which could be used as a surrogate for age. We also investigated the association with birthweight overall and within identical discordant twins and with genome-wide methylation levels.
Results We identified a panel of 22 metabolites which combined are strongly correlated with age (R2 = 59%) and with age-related clinical traits independently of age. One particular metabolite, C-glycosyl tryptophan (C-glyTrp), correlated strongly with age (beta = 0.03, SE = 0.001, P = 7.0 × 10−157) and lung function (FEV1 beta = −0.04, SE = 0.008, P = 1.8 × 10−8 adjusted for age and confounders) and was replicated in an independent population (n = 887). C-glyTrp was also associated with bone mineral density (beta = −0.01, SE = 0.002, P = 1.9 × 10−6) and birthweight (beta = −0.06, SE = 0.01, P = 2.5 × 10−9). The difference in C-glyTrp levels explained 9.4% of the variance in the difference in birthweight between monozygotic twins. An epigenome-wide association study in 172 individuals identified three CpG sites associated with levels of C-glyTrp (P < 2 × 10−6). We replicated one CpG site in the promoter of the WDR85 gene in an independent sample of 350 individuals (beta = −0.20, SE = 0.04, P = 2.9 × 10−8). WDR85 is a regulator of translation elongation factor 2, essential for protein synthesis in eukaryotes.
Conclusions Our data illustrate how metabolomic profiling linked with epigenetic studies can identify some key molecular mechanisms potentially determined in early development that produce long-term physiological changes influencing human health and ageing.
Ageing; metabolomics; epigenetics; twin studies; developmental origins of health and disease; birthweight
Advanced glycation end products (AGEs) have been shown to be a predictor of cardiovascular risk in Caucasian subjects. In this study we examine whether the existing reference values are useable for non-Caucasian ethnicities. Furthermore, we assessed whether gender and smoking affect AGEs.
AGEs were determined by a non-invasive method of skin auto-fluorescence (AF). AF was measured in 200 Arabs, 99 South Asians, 35 Filipinos and 14 subjects of other/mixed ethnicity in the Qatar Metabolomics Study on Diabetes (QMDiab). Using multivariate linear regression analysis and adjusting for age and type 2 diabetes, we assessed whether ethnicity, gender and smoking were associated with AF.
The mean AF was 2.27 arbitrary units (AU) (SD: 0.63). Arabs and Filipinos had a significantly higher AF than the South Asian population (0.25 AU (95% CI: 0.11‒0.39), p = 0.001 and 0.34 AU (95% CI: 0.13‒0.55), p = 0.001, respectively). Also, AF was significantly higher in females (0.41 AU (95% CI: 0.29‒0.53), p < 0.001). AF was associated with smoking (0.21 AU (95% CI: 0.01‒0.41), p = 0.04) and increased with the number of pack-years smoked (p = 0.02).
This study suggests that the existing reference values should take ethnicity, gender and smoking into account. Larger studies in specific ethnicities are necessary to create ethnic- and gender-specific reference values.
skin auto-fluorescence; advanced glycation endproducts; type 2 diabetes; gender differences; ethnicity; smoking; epidemiology
Metabolomics helps to identify links between environmental exposures and intermediate biomarkers of disturbed pathways. We previously reported variations in phosphatidylcholines in male smokers compared with non-smokers in a cross-sectional pilot study with a small sample size, but knowledge of the reversibility of smoking effects on metabolite profiles is limited. Here, we extend our metabolomics study with a large prospective study including female smokers and quitters.
Using a targeted metabolomics approach, we quantified 140 metabolite concentrations for 1,241 fasting serum samples in the population-based Cooperative Health Research in the Region of Augsburg (KORA) human cohort at two time points: a baseline survey conducted between 1999 and 2001 and a follow-up after seven years. Metabolite profiles were compared among groups of current smokers, former smokers and never smokers, and were further assessed for their reversibility after smoking cessation. Changes in metabolite concentrations from baseline to follow-up were investigated in a longitudinal analysis comparing current smokers, never smokers and smoking quitters, who were current smokers at baseline but former smokers by the time of follow-up. In addition, we constructed protein-metabolite networks with smoking-related genes and metabolites.
We identified 21 smoking-related metabolites in the baseline investigation (18 in men and six in women, with three overlaps) enriched in amino acid and lipid pathways, which were significantly different between current smokers and never smokers. Moreover, 19 out of the 21 metabolites were found to be reversible in former smokers. In the follow-up study, 13 reversible metabolites in men were measured, of which 10 were confirmed to be reversible in male quitters. Protein-metabolite networks are proposed to explain the consistent reversibility of smoking effects on metabolites.
We showed that smoking-related changes in human serum metabolites are reversible after smoking cessation, consistent with the known cardiovascular risk reduction. The metabolites identified may serve as potential biomarkers to evaluate the status of smoking cessation and characterize smoking-related diseases.
metabolic network; metabolomics; molecular epidemiology; smoking; smoking cessation
Nuclear magnetic resonance spectroscopy (NMR) provides robust readouts of many metabolic parameters in one experiment. However, identification of clinically relevant markers in 1H NMR spectra is a major challenge. Association of NMR-derived quantities with genetic variants can uncover biologically relevant metabolic traits. Using NMR data of plasma samples from 1,757 individuals from the KORA study together with 655,658 genetic variants, we show that ratios between NMR intensities at two chemical shift positions can provide informative and robust biomarkers. We report seven loci of genetic association with NMR-derived traits (APOA1, CETP, CPS1, GCKR, FADS1, LIPC, PYROXD2) and characterize these traits biochemically using mass spectrometry. These ratios may now be used in clinical studies.
Recent genome-wide association studies (GWAS) with metabolomics data linked genetic variation in the human genome to differences in individual metabolite levels. A strong relevance of this metabolic individuality for biomedical and pharmaceutical research has been reported. However, a considerable number of the molecules currently quantified by modern metabolomics techniques are chemically unidentified. The identification of these “unknown metabolites” is still a demanding and intricate task, limiting their usability as functional markers of metabolic processes. As a consequence, previous GWAS largely ignored unknown metabolites as metabolic traits for the analysis. Here we present a systems-level approach that combines genome-wide association analysis and Gaussian graphical modeling with metabolomics to predict the identity of the unknown metabolites. We apply our method to original data of 517 metabolic traits, of which 225 are unknowns, and genotyping information on 655,658 genetic variants, measured in 1,768 human blood samples. We report previously undescribed genotype–metabotype associations for six distinct gene loci (SLC22A2, COMT, CYP3A5, CYP2C18, GBA3, UGT3A1) and one locus not related to any known gene (rs12413935). Overlaying the inferred genetic associations, metabolic networks, and knowledge-based pathway information, we derive testable hypotheses on the biochemical identities of 106 unknown metabolites. As a proof of principle, we experimentally confirm nine concrete predictions. We demonstrate the benefit of our method for the functional interpretation of previous metabolomics biomarker studies on liver detoxification, hypertension, and insulin resistance. Our approach is generic in nature and can be directly transferred to metabolomics data from different experimental platforms.
Genome-wide association studies on metabolomics data have demonstrated that genetic variation in metabolic enzymes and transporters leads to concentration changes in the respective metabolite levels. The conventional goal of these studies is the detection of novel interactions between the genome and the metabolic system, providing valuable insights for both basic research and clinical applications. In this study, we borrow the metabolomics GWAS concept for a novel, entirely different purpose. Metabolite measurements frequently produce signals where a certain substance can be reliably detected in the sample, but it has not yet been elucidated which specific metabolite this signal actually represents. The concept is comparable to a fingerprint: each one is uniquely identifiable, but as long as it is not registered in a database one cannot tell to whom this fingerprint belongs. This issue greatly reduces the usability of metabolomics analyses. The genetic associations of such an “unknown,” however, give us concrete evidence of the metabolic pathway this substance is most probably involved in. Moreover, we complement the approach with a specific measure of correlation between metabolites, providing further evidence of the metabolic processes of the unknown. For a number of cases, this even allows for a concrete identity prediction, which we then experimentally validate in the lab.
Nutrition plays an important role in human metabolism and health. Metabolomics is a promising tool for clinical, genetic and nutritional studies. A key question is to what extent metabolomic profiles reflect nutritional patterns in an epidemiological setting. We assessed the relationship between metabolomic profiles and nutritional intake in women from a large cross-sectional community study. Food frequency questionnaires (FFQs) were applied to 1,003 women from the TwinsUK cohort with targeted metabolomic analyses of serum samples using the Biocrates Absolute-IDQ™ Kit p150 (163 metabolites). We analyzed seven nutritional parameters: coffee intake, garlic intake and nutritional scores derived from the FFQs summarizing fruit and vegetable intake, alcohol intake, meat intake, hypo-caloric dieting and a “traditional English” diet. We studied the correlation between metabolite levels and dietary intake patterns in the larger population and identified for each trait between 14 and 20 independent monozygotic twin pairs discordant for nutritional intake and replicated results in this set. Results from both analyses were then meta-analyzed. For the metabolites associated with nutritional patterns, we calculated heritability using structural equation modelling. In total, 42 metabolite-nutrient intake associations were statistically significant in the discovery samples (Bonferroni P < 4 × 10−5) and 11 metabolite-nutrient intake associations remained significant after validation. We found the strongest associations for fruit and vegetable intake and a glycerophospholipid (phosphatidylcholine diacyl C38:6, P = 1.39 × 10−9) and a sphingolipid (sphingomyelin C26:1, P = 6.95 × 10−13). We also found significant associations for coffee (confirming a previous association with C10 reported in an independent study), garlic intake and hypo-caloric dieting.
Using the twin study design we find that two thirds of the metabolites associated with nutritional patterns have a significant genetic contribution, and the remaining third are solely environmentally determined. Our data confirm the value of metabolomic studies for nutritional epidemiologic research.
Electronic supplementary material
The online version of this article (doi:10.1007/s11306-012-0469-6) contains supplementary material, which is available to authorized users.
Metabolomics; Twins; Dietary pattern; Nutrition habits; Food questionnaires
A targeted metabolomics approach was used to identify candidate biomarkers of pre-diabetes. The relevance of the identified metabolites is further corroborated with a protein-metabolite interaction network and gene expression data.
Three metabolites (glycine, lysophosphatidylcholine (LPC) (18:2) and acetylcarnitine C2) were found with significantly altered levels in pre-diabetic individuals compared with normal controls. Lower levels of glycine and LPC (18:2) were found to predict risks for pre-diabetes and type 2 diabetes (T2D). Seven T2D-related genes (PPARG, TCF7L2, HNF1A, GCK, IGF1, IRS1 and IDE) are functionally associated with the three identified metabolites. The unique combination of methodologies, including prospective population-based and nested case–control, as well as cross-sectional studies, was essential for the identification of the reported biomarkers.
Type 2 diabetes (T2D) can be prevented in pre-diabetic individuals with impaired glucose tolerance (IGT). Here, we have used a metabolomics approach to identify candidate biomarkers of pre-diabetes. We quantified 140 metabolites for 4297 fasting serum samples in the population-based Cooperative Health Research in the Region of Augsburg (KORA) cohort. Our study revealed significant metabolic variation in pre-diabetic individuals that is distinct from known diabetes risk indicators, such as glycosylated hemoglobin levels, fasting glucose and insulin. We identified three metabolites (glycine, lysophosphatidylcholine (LPC) (18:2) and acetylcarnitine) that had significantly altered levels in IGT individuals as compared to those with normal glucose tolerance, with P-values ranging from 2.4 × 10−4 to 2.1 × 10−13. Lower levels of glycine and LPC were found to be predictors not only for IGT but also for T2D, and were independently confirmed in the European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam cohort. Using metabolite–protein network analysis, we identified seven T2D-related genes that are associated with these three IGT-specific metabolites by multiple interactions with four enzymes. The expression levels of these enzymes correlate with changes in the metabolite concentrations linked to diabetes. Our results may help to develop novel strategies to prevent T2D.
early diagnostic biomarkers; IGT; metabolomics; prediction; T2D
Systems biology is a field in biological science that focuses on the combination of several or all “omics” approaches in order to find out how genes, transcripts, proteins and metabolites act together in the network of life. Metabolomics, as an analog to genomics, transcriptomics and proteomics, is increasingly integrated into biological studies, and transcriptomic and metabolomic experiments are often combined in one setup. At first glance the two data types seem to be completely different, but both produce information on biological entities, either transcripts or metabolites. Both types can be overlaid on metabolic pathways to obtain biological information on the studied system. For the joint analysis of both data types the MassTRIX webserver was updated. MassTRIX is freely available at www.masstrix.org.
To characterise the influence of the fat free mass on the metabolite profile in serum samples from participants of the population-based KORA (Cooperative Health Research in the Region of Augsburg) S4 study.
Subjects and Methods
Analyses were based on metabolite profile from 965 participants of the S4 and 890 weight-stable subjects of its seven-year follow-up study (KORA F4). 190 different serum metabolites were quantified in a targeted approach including amino acids, acylcarnitines, phosphatidylcholines (PCs), sphingomyelins and hexose. Associations between metabolite concentrations and the fat free mass index (FFMI) were analysed using adjusted linear regression models. To draw conclusions on enzymatic reactions, intra-metabolite class ratios were explored. Pairwise relationships among metabolites were investigated and illustrated by means of Gaussian graphical models (GGMs).
We found 339 significant associations between FFMI and various metabolites in KORA S4. Among the most prominent associations (p-values 4.75×10−16–8.95×10−06) with higher FFMI were increasing concentrations of the branched-chain amino acids (BCAAs), ratios of BCAAs to glucogenic amino acids, and carnitine concentrations. For various PCs, a decrease in chain length or in saturation of the fatty acid moieties could be observed with increasing FFMI, as well as an overall shift from acyl-alkyl PCs to diacyl PCs. These findings were reproduced in KORA F4. The established GGMs supported the regression results and provided a comprehensive picture of the relationships between metabolites. In a sub-analysis, most of the discovered associations were absent in obese subjects but present in non-obese subjects, possibly indicating derangements in skeletal muscle metabolism.
A set of serum metabolites strongly associated with FFMI was identified and a network explaining the relationships among metabolites was established. These results offer a novel and more complete picture of the FFMI effects on serum metabolites in a data-driven network.
Genome-wide association studies (GWAS) with metabolic traits and metabolome-wide association studies (MWAS) with traits of biomedical relevance are powerful tools to identify the contribution of genetic, environmental and lifestyle factors to the etiology of complex diseases. Hypothesis-free testing of ratios between all possible metabolite pairs in GWAS and MWAS has proven to be an innovative approach in the discovery of new biologically meaningful associations. The p-gain statistic was introduced as an ad-hoc measure to determine whether a ratio between two metabolite concentrations carries more information than the two corresponding metabolite concentrations alone. So far, only a rule of thumb was applied to determine the significance of the p-gain.
Here we explore the statistical properties of the p-gain through simulation of its density and by sampling of experimental data. We derive critical values of the p-gain for different levels of correlation between metabolite pairs and show that B/(2*α) is a conservative critical value for the p-gain, where α is the level of significance and B the number of tested metabolite pairs.
We show that the p-gain is a well defined measure that can be used to identify statistically significant metabolite ratios in association studies and provide a conservative significance cut-off for the p-gain for use in future association studies with metabolic traits.
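The critical value B/(2*α) can be made concrete with a few lines of code. The sketch below assumes the commonly used definition of the p-gain, namely the smaller of the two single-metabolite p-values divided by the p-value of the ratio; that definition is not spelled out in the abstract and is labeled here as an assumption. The function names and all numeric inputs are hypothetical.

```python
def p_gain(p_m1, p_m2, p_ratio):
    """p-gain under the assumed definition: min(p_m1, p_m2) / p_ratio.

    p_m1, p_m2 -- association p-values of the two single metabolites
    p_ratio    -- association p-value of their concentration ratio
    """
    return min(p_m1, p_m2) / p_ratio


def critical_p_gain(n_pairs, alpha=0.05):
    """Conservative critical value B / (2 * alpha) stated in the abstract.

    n_pairs -- B, the number of tested metabolite pairs
    alpha   -- level of significance
    """
    return n_pairs / (2 * alpha)


# Hypothetical example: 100 tested metabolite pairs at alpha = 0.05.
threshold = critical_p_gain(n_pairs=100, alpha=0.05)
gain = p_gain(p_m1=1e-6, p_m2=1e-4, p_ratio=1e-20)

print("critical p-gain:", threshold)
print("observed p-gain:", gain)
print("ratio is informative:", gain > threshold)
```

Under these invented inputs the ratio would count as significantly more informative than either metabolite alone, because its p-gain exceeds the threshold; the threshold grows linearly with the number of tested pairs.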
p-gain; Metabolomics; MWAS; GWAS; Genome-wide association studies; Metabolome-wide association studies
The tumor microenvironment is an important driver of ovarian cancer progression, but the relationship between mesenchymal cells and ovarian cancer cells remains unclear. The objective of this study was to determine the biological modifications induced in ovarian cancer cells by mesenchymal cells. To address this issue, we used two different ovarian cancer cell lines (NIH:OVCAR3 and SKOV3) and co-cultured them with mesenchymal cells. Upon co-culture the different cell populations were sorted to study their transcriptome and biological properties. Transcriptomic analysis revealed that three biological-function gene clusters were enriched upon contact with mesenchymal cells. These were related to the increase of metastatic abilities (adhesion, migration and invasion), proliferation and chemoresistance in vitro. Therefore, contact with the mesenchymal cell niche could increase metastatic initiation and expansion through modification of cancer cells. Taken together, these findings suggest that pathways involved in hetero-cellular interaction may be targeted to disrupt the acquired pro-metastatic profile.
Dysregulation of fatty acid oxidation plays a pivotal role in the pathophysiology of obesity and insulin resistance. Medium- and short-chain-3-hydroxyacyl-coenzyme A (CoA) dehydrogenase (SCHAD) (gene name, hadh) catalyzes the third reaction of the mitochondrial β-oxidation cascade, the oxidation of 3-hydroxyacyl-CoA to 3-ketoacyl-CoA, for medium- and short-chain fatty acids. We identified hadh as a putative obesity gene by comparison of two genome-wide scans, a quantitative trait locus analysis previously performed in the polygenic obese New Zealand obese mouse and an earlier described small interfering RNA-mediated mutagenesis in Caenorhabditis elegans. In the present study, we show that mice lacking SCHAD (hadh−/−) displayed a lower body weight and a reduced fat mass in comparison with hadh+/+ mice under high-fat diet conditions, presumably due to impaired fuel efficiency, the loss of acylcarnitines via the urine, and increased body temperature. Food intake, total energy expenditure, and locomotor activity were not altered in knockout mice. Hadh−/− mice exhibited normal fat tolerance at 20 °C. However, during cold exposure, knockout mice were unable to clear triglycerides from the plasma and to maintain their normal body temperature, indicating that SCHAD plays an important role in adaptive thermogenesis. Blood glucose concentrations in the fasted and postprandial state were significantly lower in hadh−/− mice, whereas insulin levels were elevated. Accordingly, insulin secretion in response to glucose and glucose plus palmitate was elevated in isolated islets of knockout mice. Therefore, our data indicate that SCHAD is involved in thermogenesis, in the maintenance of body weight, and in the regulation of nutrient-stimulated insulin secretion.
We have performed a metabolite quantitative trait locus (mQTL) study of the 1H nuclear magnetic resonance spectroscopy (1H NMR) metabolome in humans, building on recent targeted knowledge of genetic drivers of metabolic regulation. Urine and plasma samples were collected from two cohorts of individuals of European descent, with one cohort comprised of female twins donating samples longitudinally. Sample metabolite concentrations were quantified by 1H NMR and tested for association with genome-wide single-nucleotide polymorphisms (SNPs). Four metabolites' concentrations exhibited significant, replicable association with SNP variation (8.6×10−11
Physiological concentrations of metabolites—small molecules involved in biochemical processes in living systems—can be measured and used to diagnose and predict disease states. A common goal is to detect and clinically exploit statistical differences in metabolite concentrations between diseased and healthy individuals. As a basis for the design and interpretation of case-control studies, it is useful to have a characterization of metabolic diversity amongst healthy individuals, some of which stems from inter-individual genetic variation. When a single genetic locus has a sufficiently strong effect on metabolism, its genomic position can be determined by collecting metabolite concentration data and genome-wide genotype data on a set of individuals and searching for associations between the two data sets—a so-called metabolite quantitative trait locus (mQTL) study. By so tracing mQTLs, we can identify the genetic drivers of metabolism, characterize how the nature or quantity of the corresponding expressed protein(s) feeds forward to influence metabolite levels, and specify disease-predictive models that incorporate mutual dependence amongst genetics, environment, and metabolism.
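The search for associations between genotype and metabolite concentration described above can be sketched, for a single SNP and a single metabolite, as an ordinary least-squares regression of concentration on allele dosage. This is only an illustration under an assumed additive model (dosage coded 0, 1 or 2); the function name is hypothetical and the abstract does not specify the model actually used.

```python
import math
from typing import Sequence, Tuple


def additive_association(dosages: Sequence[float],
                         concentrations: Sequence[float]) -> Tuple[float, float, float]:
    """Regress metabolite concentration on allele dosage for one SNP.

    Returns (beta, standard_error, t_statistic) for the slope of a simple
    linear regression. An additive genetic model is an assumption here.
    """
    n = len(dosages)
    if n != len(concentrations) or n < 3:
        raise ValueError("need at least 3 paired observations")

    mean_x = sum(dosages) / n
    mean_y = sum(concentrations) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0.0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(dosages, concentrations))

    beta = sxy / sxx                      # effect per copy of the allele
    intercept = mean_y - beta * mean_x
    rss = sum((y - intercept - beta * x) ** 2
              for x, y in zip(dosages, concentrations))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    t_stat = beta / std_err if std_err > 0 else math.inf
    return beta, std_err, t_stat
```

A genome-wide scan would repeat such a test for every SNP and metabolite and then apply a multiple-testing correction; converting the t-statistic to a p-value requires a Student t distribution, which is omitted here to keep the sketch free of third-party dependencies.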
Metabolomic profiling and the integration of whole-genome genetic association data has proven to be a powerful tool to comprehensively explore gene regulatory networks and to investigate the effects of genetic variation at the molecular level. Serum metabolite concentrations allow a direct readout of biological processes, and association of specific metabolomic signatures with complex diseases such as Alzheimer's disease and cardiovascular and metabolic disorders has been shown. There are well-known correlations between sex and the incidence, prevalence, age of onset, symptoms, and severity of a disease, as well as the reaction to drugs. However, most of the studies published so far did not consider the role of sexual dimorphism and did not analyse their data stratified by sex. This study investigated sex-specific differences of serum metabolite concentrations and their underlying genetic determination. For discovery and replication we used more than 3,300 independent individuals from KORA F3 and F4 with metabolite measurements of 131 metabolites, including amino acids, phosphatidylcholines, sphingomyelins, acylcarnitines, and C6-sugars. A linear regression approach revealed significant concentration differences between males and females for 102 out of 131 metabolites (p-values<3.8×10⁻⁴; Bonferroni-corrected threshold). Sex-specific genome-wide association studies (GWAS) showed genome-wide significant differences in beta-estimates for SNPs in the CPS1 locus (carbamoyl-phosphate synthase 1, significance level: p<3.8×10⁻¹⁰; Bonferroni-corrected threshold) for glycine. We showed that the metabolite profiles of males and females are significantly different and, furthermore, that specific genetic variants in metabolism-related genes depict sexual dimorphism. Our study provides new important insights into sex-specific differences of cell regulatory processes and underscores that studies should consider sex-specific effects in design and interpretation.
The combination of genomic and metabolic studies in recent years has provided astonishing results. However, most of the studies published so far did not consider the role of sexual dimorphism and did not analyse their data stratified by sex. The investigation of 131 serum metabolite concentrations of >3,300 population-based samples (KORA F3/F4) revealed significant differences in the metabolite profile of males and females. Furthermore, a genome-wide picture of sex-specific genetic variations in human metabolism (>2,000 subjects from KORA F3/F4 cohorts) was investigated. Sex-specific genome-wide association studies (GWAS) showed differences in the effect of genetic variations on metabolites in men and women. SNPs in the CPS1 (carbamoyl-phosphate synthase 1) locus showed genome-wide significant differences in beta-estimates of sex-specific association analysis (significance level: 3.8×10⁻¹⁰) for glycine. As global metabolomic techniques are more and more refined to identify more compounds in single biological samples, the predictive power of this new technology will greatly increase. This suggests that metabolites, which may be used as predictive biomarkers to indicate the presence or severity of a disease, have to be used selectively depending on sex.
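The two Bonferroni-corrected thresholds quoted above are consistent with dividing a nominal significance level by the 131 metabolites tested. The sketch below reproduces that arithmetic; the nominal levels of 0.05 (metabolite comparison) and 5×10⁻⁸ (genome-wide) are assumptions that happen to match the reported values, not figures stated in the abstract.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


N_METABOLITES = 131  # number of metabolites reported in the study

# Assumed nominal levels: 0.05 for the sex-difference tests and
# 5e-8 (a conventional genome-wide level) for the sex-specific GWAS.
metabolite_threshold = bonferroni_threshold(0.05, N_METABOLITES)
gwas_threshold = bonferroni_threshold(5e-8, N_METABOLITES)
```

Rounded to two significant figures, these come out at about 3.8×10⁻⁴ and 3.8×10⁻¹⁰ respectively.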
Human plasma and serum are widely used matrices in clinical and biological studies. However, different collecting procedures and the coagulation cascade influence concentrations of both proteins and metabolites in these matrices. The effects on metabolite concentration profiles have not been fully characterized.
We analyzed the concentrations of 163 metabolites in plasma and serum samples collected simultaneously from 377 fasting individuals. To ensure data quality, 41 metabolites with low measurement stability were excluded from further analysis. In addition, plasma and corresponding serum samples from 83 individuals were re-measured in the same plates and mean correlation coefficients (r) of all metabolites between the duplicates were 0.83 and 0.80 in plasma and serum, respectively, indicating significantly better stability of plasma compared to serum (p = 0.01). Metabolite profiles from plasma and serum were clearly distinct with 104 metabolites showing significantly higher concentrations in serum. In particular, 9 metabolites showed relative concentration differences larger than 20%. Despite differences in absolute concentration between the two matrices, for most metabolites the overall correlation was high (mean r = 0.81±0.10), which reflects a proportional change in concentration. Furthermore, when two groups of individuals with different phenotypes were compared with each other using both matrices, more metabolites with significantly different concentrations could be identified in serum than in plasma. For example, when 51 type 2 diabetes (T2D) patients were compared with 326 non-T2D individuals, 15 more significantly different metabolites were found in serum, in addition to the 25 common to both matrices.
Our study shows that reproducibility was good in both plasma and serum, and better in plasma. Furthermore, as long as the same blood preparation procedure is used, either matrix should generate similar results in clinical and biological studies. The higher metabolite concentrations in serum, however, make it possible to provide more sensitive results in biomarker detection.
With the advent of high-throughput targeted metabolic profiling techniques, the question of how to interpret and analyze the resulting vast amount of data becomes more and more important. In this work we address the reconstruction of metabolic reactions from cross-sectional metabolomics data, that is, without the requirement for time-resolved measurements or specific system perturbations. Previous studies in this area mainly focused on Pearson correlation coefficients, which, however, are generally incapable of distinguishing between direct and indirect metabolic interactions.
In our new approach we propose the application of a Gaussian graphical model (GGM), an undirected probabilistic graphical model estimating the conditional dependence between variables. GGMs are based on partial correlation coefficients, that is, pairwise Pearson correlation coefficients conditioned against the correlation with all other metabolites. We first demonstrate the general validity of the method and its advantages over regular correlation networks with computer-simulated reaction systems. Then we estimate a GGM on data from a large human population cohort, covering 1020 fasting blood serum samples with 151 quantified metabolites. The GGM is much sparser than the correlation network, shows a modular structure with respect to metabolite classes, and is stable to the choice of samples in the data set. Using the example of human fatty acid metabolism, we demonstrate for the first time that high partial correlation coefficients generally correspond to known metabolic reactions. This feature is evaluated both manually, by investigating specific pairs of high-scoring metabolites, and systematically, on a literature-curated model of fatty acid synthesis and degradation. Our method detects many known reactions along with possibly novel pathway interactions, representing candidates for further experimental examination.
In summary, we demonstrate strong signatures of intracellular pathways in blood serum data, and provide a valuable tool for the unbiased reconstruction of metabolic reactions from large-scale metabolomics data sets.
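The distinction between direct and indirect interactions that motivates the GGM can be illustrated with a small sketch. It uses the standard identity that the partial correlation between variables i and j, given all others, equals −P_ij / sqrt(P_ii · P_jj), where P is the inverse of the correlation matrix. The pure-Python matrix inversion and the three-metabolite chain A → B → C are illustrative assumptions, not the authors' implementation; with real data a numerical library would normally be used.

```python
import math
from typing import List

Matrix = List[List[float]]


def invert(matrix: Matrix) -> Matrix:
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment with the identity matrix: [A | I] -> [I | A^-1].
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(corr: Matrix) -> Matrix:
    """Partial correlation of each pair, conditioned on all remaining variables."""
    precision = invert(corr)
    n = len(corr)
    return [[1.0 if i == j else
             -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
             for j in range(n)]
            for i in range(n)]


# Hypothetical chain A -> B -> C: A and C correlate only through B,
# so their Pearson correlation is 0.8 * 0.8 = 0.64.
chain_corr = [[1.00, 0.80, 0.64],
              [0.80, 1.00, 0.80],
              [0.64, 0.80, 1.00]]
chain_pcor = partial_correlations(chain_corr)
```

In this toy chain the Pearson correlation between A and C is substantial, yet their partial correlation vanishes once B is conditioned on, while the A–B and B–C partial correlations remain clearly positive. This is the property that makes the GGM sparser than the plain correlation network.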
The Munich Information Center for Protein Sequences (MIPS at the Helmholtz Center for Environmental Health, Neuherberg, Germany) has many years of experience in providing annotated collections of biological data. Selected data sets of high relevance, such as model genomes, are subjected to careful manual curation, while the bulk of high-throughput data is annotated by automatic means. High-quality reference resources developed in the past and still actively maintained include Saccharomyces cerevisiae, Neurospora crassa and Arabidopsis thaliana genome databases as well as several protein interaction data sets (MPACT, MPPI and CORUM). More recent projects are PhenomiR, the database on microRNA-related phenotypes, and MIPS PlantsDB for integrative and comparative plant genome research. The interlinked resources SIMAP and PEDANT provide homology relationships as well as up-to-date and consistent annotation for 38 000 000 protein sequences. PPLIPS and CCancer are versatile tools for proteomics and functional genomics interfacing to a database of compilations from gene lists extracted from literature. A novel literature-mining tool, EXCERBT, gives access to structured information on classified relations between genes, proteins, phenotypes and diseases extracted from Medline abstracts by semantic analysis. All databases described here, as well as the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.helmholtz-muenchen.de).
Metabolomics is the rapidly evolving field of the comprehensive measurement of ideally all endogenous metabolites in a biological fluid. However, no single analytic technique covers the entire spectrum of the human metabolome. Here we present results from a multiplatform study, in which we investigate what kind of results can presently be obtained in the field of diabetes research when combining metabolomics data collected on a complementary set of analytical platforms in the framework of an epidemiological study.
Forty individuals with self-reported diabetes and 60 controls (male, over 54 years) were randomly selected from the participants of the population-based KORA (Cooperative Health Research in the Region of Augsburg) study, representing an extensively phenotyped sample of the general German population. Concentrations of over 420 unique small molecules were determined in overnight-fasting blood using three different techniques, covering nuclear magnetic resonance and tandem mass spectrometry. Known biomarkers of diabetes could be replicated by this multiple metabolomic platform approach, including sugar metabolites (1,5-anhydroglucitol), ketone bodies (3-hydroxybutyrate), and branched chain amino acids. In some cases, diabetes-related medication could be detected (pioglitazone, salicylic acid).
Our study depicts the promising potential of metabolomics in diabetes research by identification of a series of known and also novel, deregulated metabolites that associate with diabetes. Key observations include perturbations of metabolic pathways linked to kidney dysfunction (3-indoxyl sulfate), lipid metabolism (glycerophospholipids, free fatty acids), and interaction with the gut microflora (bile acids). Our study suggests that metabolic markers hold the potential to detect diabetes-related complications already under sub-clinical conditions in the general population.