Using a nontargeted metabolomics approach of 447 fasting plasma metabolites, we searched for novel molecular markers that arise before and after hyperglycemia in a large population-based cohort of 2,204 females (115 type 2 diabetic [T2D] case subjects, 192 individuals with impaired fasting glucose [IFG], and 1,897 control subjects) from TwinsUK. Forty-two metabolites from three major fuel sources (carbohydrates, lipids, and proteins) were found to significantly correlate with T2D after adjusting for multiple testing; of these, 22 were previously reported as associated with T2D or insulin resistance. Fourteen metabolites were found to be associated with IFG. Among the metabolites identified, the branched-chain keto-acid metabolite 3-methyl-2-oxovalerate was the strongest predictive biomarker for IFG after glucose (odds ratio [OR] 1.65 [95% CI 1.39–1.95], P = 8.46 × 10⁻⁹) and was moderately heritable (h² = 0.20). The association was replicated in an independent population (n = 720, OR 1.68 [1.34–2.11], P = 6.52 × 10⁻⁶) and validated in 189 twins with urine metabolomics taken at the same time as plasma (OR 1.87 [1.27–2.75], P = 1 × 10⁻³). Results confirm an important role for catabolism of branched-chain amino acids in T2D and IFG. In conclusion, this T2D-IFG biomarker study has surveyed the broadest panel of nontargeted metabolites to date, revealing both novel and known associated metabolites and providing potential novel targets for clinical prediction and a deeper understanding of causal mechanisms.
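The odds ratio, confidence interval, and P value reported for 3-methyl-2-oxovalerate can be related to one another with a short calculation. The sketch below is only a rough consistency check: it assumes a Wald-type 95% interval that is symmetric on the log-odds scale, which the abstract does not state, and the function name `wald_check` is hypothetical. Because the published figures are rounded, the recomputed P value should only be expected to land within an order of magnitude of the reported one.

```python
import math

def wald_check(or_point, ci_low, ci_high, z_crit=1.959964):
    """Back out SE(log OR), the z statistic, and a two-sided P value.

    Assumes the interval is a Wald 95% CI, i.e. exp(log OR +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(or_point) / se
    # Two-sided tail probability of a standard normal: 2 * Q(|z|) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values quoted in the abstract for IFG: OR 1.65 [95% CI 1.39-1.95]
se, z, p = wald_check(1.65, 1.39, 1.95)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, two-sided P = {p:.1e}")
```

The same function can be applied to the replication and urine results to see how the wider intervals translate into larger P values.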
Genome-wide association scans with high-throughput metabolic profiling provide unprecedented insights into how genetic variation influences metabolism and complex disease. Here we report the most comprehensive exploration of genetic loci influencing human metabolism to date, including 7,824 adult individuals from two European population studies. We report genome-wide significant associations at 145 metabolic loci and their biochemical connectivity regarding more than 400 metabolites in human blood. We extensively characterize the resulting in vivo blueprint of metabolism in human blood by integrating it with information regarding gene expression, heritability, overlap with known drug targets, previous association with complex disorders and inborn errors of metabolism. We further developed a database and web-based resources for data mining and results visualization. Our findings contribute to a greater understanding of the role of inherited variation in blood metabolic diversity, and identify potential new opportunities for pharmacologic development and disease understanding.
High-throughput screening techniques that analyze the metabolic endpoints of biological processes can identify the contributions of genetic predisposition and environmental factors to the development of common diseases. Studies applying controlled physiological challenges can reveal dysregulation in metabolic responses that may be predictive for or associated with these diseases. However, large-scale epidemiological studies with well controlled physiological challenge conditions, such as extended fasting periods and defined food intake, pose logistic challenges. Culturally and religiously motivated behavioral patterns of life style changes provide a natural setting that can be used to enroll a large number of study volunteers. Here we report a proof of principle study conducted within a Muslim community, showing that a metabolomics study during the Holy Month of Ramadan can provide a unique opportunity to explore the pre-prandial and postprandial response of human metabolism to nutritional challenges. Up to five blood samples were obtained from eleven healthy male volunteers, taken directly before and two hours after consumption of a controlled meal in the evening on days 7 and 26 of Ramadan, and after an over-night fast several weeks after Ramadan. The observed increases in glucose, insulin and lactate levels at the postprandial time point confirm the expected physiological response to food intake. Targeted metabolomics further revealed significant and physiologically plausible responses to food intake by an increase in bile acid and amino acid levels and a decrease in long-chain acyl-carnitine and polyamine levels. A decrease in the concentrations of a number of phospholipids between samples taken on days 7 and 26 of Ramadan shows that the long-term response to extended fasting may differ from the response to short-term fasting. 
The present study design is scalable to larger populations and may be extended to the study of the metabolic response in defined patient groups such as individuals with type 2 diabetes.
Metabolomics; Nutritional challenging; Ramadan fasting; Study design; Clinical research
The mechanism of action of antihypertensive and lipid-lowering drugs in the human organism is still not fully understood. New insights into the drugs’ action can be provided by a metabolomics-driven approach, which offers a detailed view of the physiological state of an organism. Here, we report a metabolome-wide association study with 295 metabolites in human serum from 1,762 participants of the KORA F4 (Cooperative Health Research in the Region of Augsburg) study population. Our intent was to find variations of metabolite concentrations related to the intake of various drug classes and, based on the associations found, to generate new hypotheses about on-target as well as off-target effects of these drugs. In total, we found 41 significant associations for the drug classes investigated: for beta-blockers (11 associations), angiotensin-converting enzyme (ACE) inhibitors (four associations), diuretics (seven associations), statins (ten associations), and fibrates (nine associations), the top hits were pyroglutamine, phenylalanylphenylalanine, pseudouridine, 1-arachidonoylglycerophosphocholine, and 2-hydroxyisobutyrate, respectively. For beta-blockers we observed significant associations with metabolite concentrations that are indicative of drug side-effects, such as increased serotonin and decreased free fatty acid levels. Intake of ACE inhibitors and statins was associated with metabolites that provide insight into the action of the drug itself on its target, such as an association of ACE inhibitors with des-Arg(9)-bradykinin and aspartylphenylalanine, a substrate and a product of the drug-inhibited ACE. The intake of statins, which reduce blood cholesterol levels, resulted in changes in the concentration of metabolites involved in both the biosynthesis and the degradation of cholesterol. Fibrates showed the strongest association with 2-hydroxyisobutyrate, which might be a breakdown product of fenofibrate and, thus, a possible marker for the degradation of this drug in the human organism.
The analysis of diuretics showed a heterogeneous picture that is difficult to interpret. Taken together, our results provide a basis for a deeper functional understanding of the action and side-effects of antihypertensive and lipid-lowering drugs in the general population.
Electronic supplementary material
The online version of this article (doi:10.1007/s10654-014-9910-7) contains supplementary material, which is available to authorized users.
Beta-blockers; Angiotensin-converting enzyme inhibitors; Diuretics; Statins; Fibrates; Metabolomics
Metabolomic screening of fasting plasma from nondiabetic subjects identified α-hydroxybutyrate (α-HB) and linoleoyl-glycerophosphocholine (L-GPC) as joint markers of insulin resistance (IR) and glucose intolerance. To test the predictivity of α-HB and L-GPC for incident dysglycemia, α-HB and L-GPC measurements were obtained in two observational cohorts, comprising 1,261 nondiabetic participants from the Relationship between Insulin Sensitivity and Cardiovascular Disease (RISC) study and 2,580 from the Botnia Prospective Study, with 3-year and 9.5-year follow-up data, respectively. In both cohorts, α-HB was a positive correlate and L-GPC a negative correlate of insulin sensitivity, with α-HB reciprocally related to indices of β-cell function derived from the oral glucose tolerance test (OGTT). In follow-up, α-HB was a positive predictor (adjusted odds ratios 1.25 [95% CI 1.00–1.60] and 1.26 [1.07–1.48], respectively, for each standard deviation of predictor), and L-GPC was a negative predictor (0.64 [0.48–0.85] and 0.67 [0.54–0.84]) of dysglycemia (RISC) or type 2 diabetes (Botnia), independent of familial diabetes, sex, age, BMI, and fasting glucose. Corresponding areas under the receiver operating characteristic curve were 0.791 (RISC) and 0.783 (Botnia), similar in accuracy when substituting α-HB and L-GPC with 2-h OGTT glucose concentrations. When their activity was examined, α-HB inhibited and L-GPC stimulated glucose-induced insulin release in INS-1e cells. α-HB and L-GPC are independent predictors of worsening glucose tolerance, physiologically consistent with a joint signature of IR and β-cell dysfunction.
Changes in an individual’s metabolic phenotype (metabotype) over time can be indicative of disorder-related modifications. Studies covering several months to a few years have shown that metabolic profiles are often specific for an individual. This “metabolic individuality” and detected changes may contribute to personalized approaches in human health care. However, it is not clear whether such individual metabotypes persist over longer time periods. Here we investigate the conservation of metabotypes, characterized by 212 different metabolites, in samples from 818 participants of the Cooperative Health Research in the Region of Augsburg (KORA) population study in Germany, taken within a 7-year time interval. For replication, we used paired samples from 83 non-related individuals from the TwinsUK study. Results indicated that over 40 % of all study participants could be uniquely identified after 7 years based on their metabolic profiles alone. Moreover, 95 % of the study participants showed a high degree of metabotype conservation (>70 %) whereas the remaining 5 % displayed major changes in their metabolic profiles over time. These latter individuals were likely to have undergone important biochemical changes between the two time points. We further show that metabolite conservation was positively associated with heritability (rank correlation 0.74), although there were some notable exceptions. Our results suggest that monitoring changes in metabotypes over several years can trace changes in health status and may provide indications for disease onset. Moreover, our study findings provide a general reference for metabotype conservation over longer time periods that can be used in biomarker discovery studies.
Electronic supplementary material
The online version of this article (doi:10.1007/s11306-014-0629-y) contains supplementary material, which is available to authorized users.
Metabolomics; Longitudinal study; Heritability; Population study
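The idea of uniquely identifying a participant from a metabolic profile can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the study's procedure: the abstract does not describe the matching rule, so the code assumes that an individual counts as identified when their own follow-up profile has the single highest Pearson correlation with their baseline profile. All data are synthetic, and the names `identification_rate` and `simulate` are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def identification_rate(baseline, followup):
    """Fraction of individuals whose own follow-up profile is the unique best match."""
    hits = 0
    for i, profile in enumerate(baseline):
        scores = [pearson(profile, other) for other in followup]
        best = max(scores)
        if scores[i] == best and scores.count(best) == 1:
            hits += 1
    return hits / len(baseline)

def simulate(n_people, n_metabolites, noise_sd, rng):
    """Synthetic profiles: a stable personal component plus visit-specific noise."""
    personal = [[rng.gauss(0, 1) for _ in range(n_metabolites)]
                for _ in range(n_people)]

    def visit():
        return [[v + rng.gauss(0, noise_sd) for v in p] for p in personal]

    return visit(), visit()

rng = random.Random(0)
stable_rate = identification_rate(*simulate(50, 20, 0.05, rng))  # conserved metabotypes
noisy_rate = identification_rate(*simulate(50, 20, 3.0, rng))    # weakly conserved
print(stable_rate, noisy_rate)
```

The contrast between the two simulated settings shows why the identification rate can serve as a summary of metabotype conservation: the more the visit-specific variation dominates the stable personal component, the fewer individuals are matched to themselves.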
Genome-wide association studies (GWAS) have identified many risk loci for complex diseases, but effect sizes are typically small and information on the underlying biological processes is often lacking. Associations with metabolic traits as functional intermediates can overcome these problems and potentially inform individualized therapy. Here we report a comprehensive analysis of genotype-dependent metabolic phenotypes using a GWAS with non-targeted metabolomics. We identified 37 genetic loci associated with blood metabolite concentrations, of which 25 exhibit effect sizes that are unusually high for GWAS and account for 10–60% differences in metabolite levels per allele copy. Our associations provide new functional insights for many disease-related associations that have been reported in previous studies, including cardiovascular and kidney disorders, type 2 diabetes, cancer, gout, venous thromboembolism, and Crohn’s disease. Taken together, our study advances our knowledge of the genetic basis of metabolic individuality in humans and generates many new hypotheses for biomedical and pharmaceutical research.
Polymorphisms in the transcription factor 7-like 2 (TCF7L2) gene have been shown to display a powerful association with type 2 diabetes. The aim of the present study was to evaluate metabolic alterations in carriers of a common TCF7L2 risk variant.
Seventeen non-diabetic subjects carrying the T risk allele at the rs7903146 TCF7L2 locus and 24 subjects carrying no risk allele underwent an intravenous glucose tolerance test and a euglycemic-hyperinsulinemic clamp. Plasma samples were analysed for concentrations of 163 metabolites by targeted mass spectrometry.
TCF7L2 risk allele carriers had a reduced first-phase insulin response and normal insulin sensitivity. Under fasting conditions, carriers of TCF7L2 rs7903146 exhibited a non-significant increase of plasma sphingomyelins (SMs), phosphatidylcholines (PCs) and lysophosphatidylcholines (lysoPCs) species. A significant genotype effect was detected in response to challenge tests in 6 SMs (C16:0, C16:1, C18:0, C18:1, C24:0, C24:1), 5 hydroxy-SMs (C14:1, C16:1, C22:1, C22:2, C24:1), 4 lysoPCs (C14:0, C16:0, C16:1, C17:0), 3 diacyl-PCs (C28:1, C36:6, C40:4) and 4 long-chain acyl-alkyl-PCs (C40:2, C40:5, C44:5, C44:6).
Plasma metabolomic profiling identified alterations of phospholipid metabolism in response to challenge tests in subjects with TCF7L2 rs7903146 genotype. This may reflect a genotype-mediated link to early metabolic abnormalities prior to the development of disturbed glucose tolerance.
Bacterial infectious diseases are the result of multifactorial processes affected by the interplay between virulence factors and host targets. The host-Pseudomonas and Coxiella interaction database (HoPaCI-DB) is a publicly available manually curated integrative database (http://mips.helmholtz-muenchen.de/HoPaCI/) of host–pathogen interaction data from Pseudomonas aeruginosa and Coxiella burnetii. The resource provides structured information on 3585 experimentally validated interactions between molecules, bioprocesses and cellular structures extracted from the scientific literature. Systematic annotation and interactive graphical representation of disease networks make HoPaCI-DB a versatile knowledge base for biologists and network biology approaches.
Serum metabolite concentrations provide a direct readout of biological processes in the human body, and are associated with disorders such as cardiovascular and metabolic diseases. Here we present a genome-wide association study with 163 metabolic traits using 1809 participants from the KORA population, followed up in the TwinsUK cohort with 422 participants. In eight out of nine replicated loci (FADS1, ELOVL2, ACADS, ACADM, ACADL, SPTLC3, ETFDH, SLC16A9) the genetic variant is located in or near enzyme or solute carrier coding genes, where the associating metabolic traits match the proteins’ function. Many of these loci are located in rate-limiting steps of important enzymatic reactions. Use of metabolite concentration ratios as proxies for enzymatic reaction rates reduces the variance and yields robust statistical associations with p-values between 3×10⁻²⁴ and 6.5×10⁻¹⁷⁹. These loci explained 5.6% to 36.3% of the observed variance. For several loci, associations with clinically relevant parameters have previously been reported.
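Why a concentration ratio can give a stronger genetic association than either metabolite alone can be seen in a toy simulation. The sketch below is an illustration under assumptions that are not taken from the study: a substrate and a product share a large common source of variation, and the genotype shifts only the conversion between them. The allele frequency, effect size, and noise levels are arbitrary, and the data are synthetic.

```python
import math
import random

def r_squared(x, y):
    """Squared Pearson correlation, i.e. variance explained in a simple linear fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

rng = random.Random(1)
genotype, log_product, log_ratio = [], [], []
for _ in range(2000):
    g = sum(rng.random() < 0.3 for _ in range(2))  # minor-allele count (assumed MAF 0.3)
    shared = rng.gauss(0, 0.5)                     # variation common to both metabolites
    log_sub = shared + rng.gauss(0, 0.1)           # log substrate concentration
    log_prod = shared + 0.2 * g + rng.gauss(0, 0.1)  # genotype shifts the conversion
    genotype.append(g)
    log_product.append(log_prod)
    log_ratio.append(log_prod - log_sub)           # log(product / substrate)

r2_single = r_squared(genotype, log_product)
r2_ratio = r_squared(genotype, log_ratio)
print(f"variance explained: product alone {r2_single:.3f}, ratio {r2_ratio:.3f}")
```

Taking the log ratio cancels the shared component, so the genotype effect stands out against much less residual variance. Real data would also need covariate adjustment and a correction for the number of ratios tested, which this sketch omits.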
Previously, we reported strong influences of genetic variants on metabolic phenotypes, some of them with clinical relevance. Here, we hypothesize that DNA methylation may have an important and potentially independent effect on human metabolism. To test this hypothesis, we conducted what is to the best of our knowledge the first epigenome-wide association study (EWAS) between DNA methylation and metabolic traits (metabotypes) in human blood. We assess 649 blood metabolic traits from 1814 participants of the Kooperative Gesundheitsforschung in der Region Augsburg (KORA) population study for association with methylation of 457 004 CpG sites, determined on the Infinium HumanMethylation450 BeadChip platform. Using the EWAS approach, we identified two types of methylome–metabotype associations. One type is driven by an underlying genetic effect; the other type is independent of genetic variation and potentially driven by common environmental and life-style-dependent factors. We report eight CpG loci at genome-wide significance that have a genetic variant as confounder (P = 3.9 × 10⁻²⁰ to 2.0 × 10⁻¹⁰⁸, r² = 0.036 to 0.221). Seven loci display CpG site-specific associations to metabotypes, but do not exhibit any underlying genetic signals (P = 9.2 × 10⁻¹⁴ to 2.7 × 10⁻²⁷, r² = 0.008 to 0.107). We further identify several groups of CpG loci that associate with the same metabotype, such as 4-vinylphenol sulfate and 4-androsten-3-beta,17-beta-diol disulfate. In these cases, the association between CpG-methylation and metabotype is likely the result of a common external environmental factor, including smoking. Our study shows that EWAS with large numbers of metabolic traits in large population cohorts are, in principle, feasible. Taken together, our data suggest that DNA methylation plays an important role in regulating human metabolism.
HSC-Explorer (http://mips.helmholtz-muenchen.de/HSC/) is a publicly available, integrative database containing detailed information about the early steps of hematopoiesis. The resource aims at providing fast and easy access to relevant information, in particular to the complex network of interacting cell types and molecules, from the wealth of publications in the field through visualization interfaces. It provides structured information on more than 7000 experimentally validated interactions between molecules, bioprocesses and environmental factors. Information is manually derived by expert annotators through critical reading of the scientific literature. Hematopoiesis-relevant interactions are accompanied by context information such as model organisms and experimental methods, enabling assessment of the reliability and relevance of experimental results. Usage of established vocabularies facilitates downstream bioinformatics applications and the conversion of results into complex networks. Several predefined datasets (Selected topics) offer insights into stem cell behavior, the stem cell niche and signaling processes supporting hematopoietic stem cell maintenance. HSC-Explorer provides a versatile web-based resource for scientists entering the field of hematopoiesis, enabling users to inspect the associated biological processes through interactive graphical presentation.
Serum urate, the final breakdown product of purine metabolism, is causally involved in the pathogenesis of gout, and implicated in cardiovascular disease and type 2 diabetes. Serum urate levels highly differ between men and women; however the underlying biological processes in its regulation are still not completely understood and are assumed to result from a complex interplay between genetic, environmental and lifestyle factors. In order to describe the metabolic vicinity of serum urate, we analyzed 355 metabolites in 1,764 individuals of the population-based KORA F4 study and constructed a metabolite network around serum urate using Gaussian Graphical Modeling in a hypothesis-free approach. We subsequently investigated the effect of sex and urate lowering medication on all 38 metabolites assigned to the network. Within the resulting network three main clusters could be detected around urate, including the well-known pathway of purine metabolism, as well as several dipeptides, a group of essential amino acids, and a group of steroids. Of the 38 assigned metabolites, 25 showed strong differences between sexes. Association with uricostatic medication intake was not only confined to purine metabolism but seen for seven metabolites within the network. Our findings highlight pathways that are important in the regulation of serum urate and suggest that dipeptides, amino acids, and steroid hormones are playing a role in its regulation. The findings might have an impact on the development of specific targets in the treatment and prevention of hyperuricemia.
Electronic supplementary material
The online version of this article (doi:10.1007/s11306-013-0565-2) contains supplementary material, which is available to authorized users.
Gaussian Graphical Modeling; Metabolite network; Pathway reconstruction; Allopurinol; Uric acid; Purine metabolism
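Gaussian graphical models connect two metabolites only when they remain correlated after conditioning on the other measured metabolites, which is what lets them separate direct from indirect relationships. The sketch below shows this for the smallest possible case, three synthetic variables forming a chain A → B → C, using the first-order partial correlation formula. It is an illustration under assumptions: in a network with many metabolites the partial correlations condition on all remaining variables and are usually obtained from the inverse covariance matrix, and none of the numbers come from the study.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Synthetic chain: A drives B, B drives C; A and C are linked only through B.
rng = random.Random(2)
a = [rng.gauss(0, 1) for _ in range(2000)]
b = [x + rng.gauss(0, 1) for x in a]
c = [x + rng.gauss(0, 1) for x in b]

r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
p_ac_given_b = partial_corr(r_ac, r_ab, r_bc)  # indirect link: should vanish
p_ab_given_c = partial_corr(r_ab, r_ac, r_bc)  # direct link: should survive
print(f"r(A,C) = {r_ac:.2f}, r(A,C | B) = {p_ac_given_b:.2f}, r(A,B | C) = {p_ab_given_c:.2f}")
```

In a network drawn from such values, A and C would show a clear marginal correlation yet receive no edge, while the A-B edge would be kept.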
The aim was to characterise associations between circulating thyroid hormones, free thyroxine (FT4) and thyrotropin (TSH), and the metabolite profiles in serum samples from participants of the German population-based KORA F4 study. Analyses were based on the metabolite profile of 1463 euthyroid subjects. In serum samples, obtained after overnight fasting (≥8 h), 151 different metabolites were quantified in a targeted approach including amino acids, acylcarnitines (ACs), and phosphatidylcholines (PCs). Associations between metabolites and thyroid hormone concentrations were analysed using adjusted linear regression models. To draw conclusions on thyroid hormone related pathways, intra-class metabolite ratios were additionally explored. We discovered 154 significant associations (Bonferroni p < 1.75 × 10⁻⁴) between FT4 and various metabolites and metabolite ratios belonging to AC and PC groups. Significant associations with TSH were lacking. High FT4 levels were associated with increased concentrations of many ACs and various sums of ACs of different chain length, and with an increased ratio of C2 to C0. The inverse associations observed between FT4 and many serum PCs reflected the general decrease in PC concentrations. Similar results were found in subgroup analyses, e.g., in weight-stable subjects or in obese subjects. Further, results were independent of different parameters for liver or kidney function, or inflammation, which supports the notion of an independent FT4 effect. In fasting euthyroid adults, higher serum FT4 levels are associated with increased serum AC concentrations and an increased ratio of C2 to C0, which is indicative of an overall enhanced fatty acyl mitochondrial transport and β-oxidation of fatty acids.
Electronic supplementary material
The online version of this article (doi:10.1007/s11306-013-0563-4) contains supplementary material, which is available to authorized users.
Targeted metabolomics; Serum metabolites; Free thyroxine; Thyrotropin; Thyroid hormones; Epidemiology
Nuclear magnetic resonance spectroscopy (NMR) provides robust readouts of many metabolic parameters in one experiment. However, identification of clinically relevant markers in 1H NMR spectra is a major challenge. Association of NMR-derived quantities with genetic variants can uncover biologically relevant metabolic traits. Using NMR data of plasma samples from 1,757 individuals from the KORA study together with 655,658 genetic variants, we show that ratios between NMR intensities at two chemical shift positions can provide informative and robust biomarkers. We report seven loci of genetic association with NMR-derived traits (APOA1, CETP, CPS1, GCKR, FADS1, LIPC, PYROXD2) and characterize these traits biochemically using mass spectrometry. These ratios may now be used in clinical studies.
Recent genome-wide association studies (GWAS) with metabolomics data linked genetic variation in the human genome to differences in individual metabolite levels. A strong relevance of this metabolic individuality for biomedical and pharmaceutical research has been reported. However, a considerable amount of the molecules currently quantified by modern metabolomics techniques are chemically unidentified. The identification of these “unknown metabolites” is still a demanding and intricate task, limiting their usability as functional markers of metabolic processes. As a consequence, previous GWAS largely ignored unknown metabolites as metabolic traits for the analysis. Here we present a systems-level approach that combines genome-wide association analysis and Gaussian graphical modeling with metabolomics to predict the identity of the unknown metabolites. We apply our method to original data of 517 metabolic traits, of which 225 are unknowns, and genotyping information on 655,658 genetic variants, measured in 1,768 human blood samples. We report previously undescribed genotype–metabotype associations for six distinct gene loci (SLC22A2, COMT, CYP3A5, CYP2C18, GBA3, UGT3A1) and one locus not related to any known gene (rs12413935). Overlaying the inferred genetic associations, metabolic networks, and knowledge-based pathway information, we derive testable hypotheses on the biochemical identities of 106 unknown metabolites. As a proof of principle, we experimentally confirm nine concrete predictions. We demonstrate the benefit of our method for the functional interpretation of previous metabolomics biomarker studies on liver detoxification, hypertension, and insulin resistance. Our approach is generic in nature and can be directly transferred to metabolomics data from different experimental platforms.
Genome-wide association studies on metabolomics data have demonstrated that genetic variation in metabolic enzymes and transporters leads to concentration changes in the respective metabolite levels. The conventional goal of these studies is the detection of novel interactions between the genome and the metabolic system, providing valuable insights for both basic research and clinical applications. In this study, we borrow the metabolomics GWAS concept for a novel, entirely different purpose. Metabolite measurements frequently produce signals where a certain substance can be reliably detected in the sample, but it has not yet been elucidated which specific metabolite this signal actually represents. The concept is comparable to a fingerprint: each one is uniquely identifiable, but as long as it is not registered in a database one cannot tell to whom this fingerprint belongs. Obviously, this issue tremendously reduces the usability of a metabolomics analysis. The genetic associations of such an “unknown,” however, give us concrete evidence of the metabolic pathway this substance is most probably involved in. Moreover, we complement the approach with a specific measure of correlation between metabolites, providing further evidence of the metabolic processes of the unknown. For a number of cases, this even allows for a concrete identity prediction, which we then experimentally validate in the lab.
The pathobiology of common diseases is influenced by heterogeneous factors interacting in complex networks. CIDeR (http://mips.helmholtz-muenchen.de/cider/) is a publicly available, manually curated, integrative database of metabolic and neurological disorders. The resource provides structured information on 18,813 experimentally validated interactions between molecules, bioprocesses and environmental factors extracted from the scientific literature. Systematic annotation and interactive graphical representation of disease networks make CIDeR a versatile knowledge base for biologists, analysis of large-scale data and systems biology approaches.
To characterise the influence of the fat free mass on the metabolite profile in serum samples from participants of the population-based KORA (Cooperative Health Research in the Region of Augsburg) S4 study.
Subjects and Methods
Analyses were based on metabolite profile from 965 participants of the S4 and 890 weight-stable subjects of its seven-year follow-up study (KORA F4). 190 different serum metabolites were quantified in a targeted approach including amino acids, acylcarnitines, phosphatidylcholines (PCs), sphingomyelins and hexose. Associations between metabolite concentrations and the fat free mass index (FFMI) were analysed using adjusted linear regression models. To draw conclusions on enzymatic reactions, intra-metabolite class ratios were explored. Pairwise relationships among metabolites were investigated and illustrated by means of Gaussian graphical models (GGMs).
We found 339 significant associations between FFMI and various metabolites in KORA S4. Among the most prominent associations (p-values 4.75×10⁻¹⁶ to 8.95×10⁻⁶) with higher FFMI were increasing concentrations of the branched-chain amino acids (BCAAs), ratios of BCAAs to glucogenic amino acids, and carnitine concentrations. For various PCs, a decrease in chain length or in saturation of the fatty acid moieties could be observed with increasing FFMI, as well as an overall shift from acyl-alkyl PCs to diacyl PCs. These findings were reproduced in KORA F4. The established GGMs supported the regression results and provided a comprehensive picture of the relationships between metabolites. In a sub-analysis, most of the associations discovered in non-obese subjects were absent in obese subjects, possibly indicating derangements in skeletal muscle metabolism.
A set of serum metabolites strongly associated with FFMI was identified and a network explaining the relationships among metabolites was established. These results offer a novel and more complete picture of the FFMI effects on serum metabolites in a data-driven network.
Human plasma and serum are widely used matrices in clinical and biological studies. However, different collecting procedures and the coagulation cascade influence concentrations of both proteins and metabolites in these matrices. The effects on metabolite concentration profiles have not been fully characterized.
We analyzed the concentrations of 163 metabolites in plasma and serum samples collected simultaneously from 377 fasting individuals. To ensure data quality, 41 metabolites with low measurement stability were excluded from further analysis. In addition, plasma and corresponding serum samples from 83 individuals were re-measured in the same plates and mean correlation coefficients (r) of all metabolites between the duplicates were 0.83 and 0.80 in plasma and serum, respectively, indicating significantly better stability of plasma compared to serum (p = 0.01). Metabolite profiles from plasma and serum were clearly distinct with 104 metabolites showing significantly higher concentrations in serum. In particular, 9 metabolites showed relative concentration differences larger than 20%. Despite differences in absolute concentration between the two matrices, for most metabolites the overall correlation was high (mean r = 0.81±0.10), which reflects a proportional change in concentration. Furthermore, when two groups of individuals with different phenotypes were compared with each other using both matrices, more metabolites with significantly different concentrations could be identified in serum than in plasma. For example, when 51 type 2 diabetes (T2D) patients were compared with 326 non-T2D individuals, 15 more significantly different metabolites were found in serum, in addition to the 25 common to both matrices.
Our study shows that reproducibility was good in both plasma and serum, and better in plasma. Furthermore, as long as the same blood preparation procedure is used, either matrix should generate similar results in clinical and biological studies. The higher metabolite concentrations in serum, however, make it possible to provide more sensitive results in biomarker detection.
Metabolomics is an emerging field that is based on the quantitative measurement of as many small organic molecules occurring in a biological sample as possible. Due to recent technical advances, metabolomics can now be used widely as an analytical high-throughput technology in drug testing and epidemiological metabolome and genome wide association studies. Analogous to chip-based gene expression analyses, the enormous amount of data produced by modern kit-based metabolomics experiments poses new challenges regarding their biological interpretation in the context of various sample phenotypes. We developed metaP-server to facilitate data interpretation. metaP-server provides automated and standardized data analysis for quantitative metabolomics data, covering the following steps from data acquisition to biological interpretation: (i) data quality checks, (ii) estimation of reproducibility and batch effects, (iii) hypothesis tests for multiple categorical phenotypes, (iv) correlation tests for metric phenotypes, (v) optionally including all possible pairs of metabolite concentration ratios, (vi) principal component analysis (PCA), and (vii) mapping of metabolites onto colored KEGG pathway maps. Graphical output is clickable and cross-linked to sample and metabolite identifiers. Interactive coloring of PCA and bar plots by phenotype facilitates on-line data exploration. For users of commercial metabolomics kits, cross-references to the HMDB, LipidMaps, KEGG, PubChem, and CAS databases are provided. metaP-server is freely accessible at http://metabolomics.helmholtz-muenchen.de/metap2/.
A new machine learning-based method is presented here for the identification of metabolic pathways related to specific phenotypes in multiple microbial genomes.
Identifying the biochemical basis of microbial phenotypes is a main objective of comparative genomics. Here we present a novel method using multivariate machine learning techniques for comparing automatically derived metabolic reconstructions of sequenced genomes on a large scale. Applying our method to 266 genomes directly led to testable hypotheses such as the link between the potential of microorganisms to cause periodontal disease and their ability to degrade histidine, a link also supported by clinical studies.
The PEDANT genome database provides exhaustive annotation of nearly 3000 publicly available eukaryotic, eubacterial, archaeal and viral genomes with more than 4.5 million proteins by a broad set of bioinformatics algorithms. In particular, all completely sequenced genomes from the NCBI's Reference Sequence collection (RefSeq) are covered. The PEDANT processing pipeline has been sped up by an order of magnitude through the utilization of precalculated similarity information stored in the similarity matrix of proteins (SIMAP) database, making it possible to process newly sequenced genomes immediately as they become available. PEDANT is freely accessible to academic users at http://pedant.gsf.de. For programmatic access Web Services are available at http://pedant.gsf.de/webservices.jsp.
The PEDANT genome database (http://pedant.gsf.de) provides exhaustive automatic analysis of genomic sequences by a large variety of established bioinformatics tools through a comprehensive Web-based user interface. One hundred and seventy seven completely sequenced and unfinished genomes have been processed so far, including large eukaryotic genomes (mouse, human) published recently. In this contribution, we describe the current status of the PEDANT database and novel analytical features added to the PEDANT server in 2002. Those include: (i) integration with the BioRS™ data retrieval system which allows fast text queries, (ii) pre-computed sequence clusters in each complete genome, (iii) a comprehensive set of tools for genome comparison, including genome comparison tables and protein function prediction based on genomic context, and (iv) computation and visualization of protein–protein interaction (PPI) networks based on experimental data. The availability of functional and structural predictions for 650 000 genomic proteins in well organized form makes PEDANT a useful resource for both functional and structural genomics.