There is a general consensus that supports the need for standardized reporting of metadata or information describing large-scale metabolomics and other functional genomics data sets. Reporting of standard metadata provides a biological and empirical context for the data, facilitates experimental replication, and enables the re-interrogation and comparison of data by others. Accordingly, the Metabolomics Standards Initiative is building a general consensus concerning the minimum reporting standards for metabolomics experiments, and the Chemical Analysis Working Group (CAWG) is part of this community effort. This article proposes the minimum reporting standards related to the chemical analysis aspects of metabolomics experiments, including sample preparation, experimental analysis, quality control, metabolite identification, and data pre-processing. These minimum standards currently focus mostly upon mass spectrometry and nuclear magnetic resonance spectroscopy due to the popularity of these techniques in metabolomics. However, additional input concerning other techniques is welcomed and can be provided via the CAWG on-line discussion forum at http://msi-workgroups.sourceforge.net/ or http://Msiemail@example.com. Further, community input related to this document can also be provided via this electronic forum.
Metabolomics; Metabolite profiling; Metabolite identification; Minimum reporting standards; Chemical analysis; Mass spectrometry; Nuclear magnetic resonance; Flux; Isotopomer analysis; GC-MS; LC-MS; CE-MS; NMR; Quality control; Method validation
Lipid secretions from algae offer a great opportunity for engineering biofuel feedstocks. The lipid exudates could be interesting from a process engineering perspective because lipids could be collected directly from the medium without harvesting and disrupting cells. We here report on the extracellular secretions of algal metabolites from the strain UTEX 2341 (Chlorella minutissima) into the culture medium. No detailed analysis of these lipid secretions has been performed to date. Using multiple mass spectrometric platforms, we observed around 1000 compounds and were able to annotate 50 lipids by means of liquid chromatography coupled to accurate mass quadrupole time-of-flight mass spectrometry (LC-QTOF), direct infusion with positive and negative electrospray ion trap mass spectrometry and gas chromatography coupled to mass spectrometry (GC–MS). These compounds were annotated by tandem mass spectral (MS/MS) database matching and retention time range filtering. We observed a series of triacylglycerols (TG), sulfoquinovosyldiacylglycerols (SQDG), phosphatidylinositols and phosphatidylglycerols, as well as betaine lipids diacylglyceryl-N,N,N-trimethylhomoserines (DGTS).
Biofuel; Algae; Lipids; LC–MS; GC–MS
We have shown that lithium treatment improves motor coordination in a spinocerebellar ataxia type 1 (SCA1) disease mouse model (Sca1154Q/+). To learn more about disease pathogenesis and molecular contributions to the neuroprotective effects of lithium, we investigated metabolomic profiles of cerebellar tissue and plasma from SCA1-model treated and untreated mice. Metabolomic analyses of wild-type and Sca1154Q/+ mice, with and without lithium treatment, were performed using gas chromatography time-of-flight mass spectrometry and BinBase mass spectral annotations. We detected 416 metabolites, of which 130 were identified. We observed specific metabolic perturbations in Sca1154Q/+ mice and major effects of lithium on metabolism, centrally and peripherally. Compared to wild-type, the cerebellar metabolic profile of Sca1154Q/+ mice revealed changes in glucose, lipids, and metabolites of the tricarboxylic acid cycle and purines. Fewer metabolic differences were noted in Sca1154Q/+ mouse plasma versus wild-type. In both genotypes, the major lithium responses in cerebellum involved energy metabolism, purines, unsaturated free fatty acids, and aromatic and sulphur-containing amino acids. The largest metabolic difference with lithium was a 10-fold increase in ascorbate levels in wild-type cerebella (p<0.002), with lower levels of threonate, a major ascorbate catabolite. In contrast, Sca1154Q/+ mice that received lithium showed no elevated cerebellar ascorbate levels. Our data emphasize that lithium regulates a variety of metabolic pathways, including purine, oxidative stress and energy production pathways. The purine metabolite level, reduced in the Sca1154Q/+ mice and restored upon lithium treatment, might relate to lithium's neuroprotective properties.
Therapeutic response to selective serotonin (5-HT) reuptake inhibitors in Major Depressive Disorder (MDD) varies considerably among patients, and the onset of antidepressant therapeutic action is delayed until after 2 to 4 weeks of treatment. The objective of this study was to analyze changes within the methoxyindole and kynurenine (KYN) branches of the tryptophan pathway to determine whether differential regulation within these branches may contribute to the mechanism of variation in response to treatment. A metabolomics approach was used to characterize early biochemical changes in the tryptophan pathway and to correlate these changes with treatment outcome. Outpatients with MDD were randomly assigned to sertraline (n = 35) or placebo (n = 40) in a double-blind 4-week trial; response to treatment was measured using the 17-item Hamilton Rating Scale for Depression (HAMD17). A targeted electrochemistry-based metabolomic platform (LCECA) was used to profile serum samples from MDD patients. The response rate was slightly higher for sertraline than for placebo (21/35 [60%] vs. 20/40 [50%], respectively, χ2(1) = 0.75, p = 0.39). Patients showing a good response to sertraline had higher pretreatment levels of 5-methoxytryptamine (5-MTPM), greater reduction in 5-MTPM levels after treatment, an increase in 5-Methoxytryptophol (5-MTPOL) and Melatonin (MEL) levels, and decreases in the KYN/MEL and 3-Hydroxykynurenine (3-OHKY)/MEL ratios post-treatment compared to pretreatment. These changes were not seen in the patients showing poor response to sertraline. In the placebo group, more favorable treatment outcome was associated with increases in 5-MTPOL and MEL levels and significant decreases in the KYN/MEL and 3-OHKY/MEL ratios; changes in 5-MTPM levels were not associated with the 4-week response.
These results suggest that recovery from a depressed state due to treatment with drug or with placebo could be associated with preferential utilization of serotonin for production of melatonin and 5-MTPOL.
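The response-rate comparison reported above (21/35 vs. 20/40 responders) can be re-derived with a few lines of code. The sketch below assumes the reported statistic is a Pearson chi-square test on the 2x2 table without continuity correction; the function name is illustrative and not taken from the study.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Responders / non-responders taken from the abstract:
# sertraline 21 / 14, placebo 20 / 20.
chi2, p_value = chi_square_2x2(21, 14, 20, 20)
print(f"chi2(1) = {chi2:.2f}, p = {p_value:.2f}")  # chi2(1) = 0.75, p = 0.39
```

Under this assumption the result agrees with the reported χ2(1) = 0.75, p = 0.39.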
Wilson’s disease (WD) is an inherited disorder of copper metabolism characterized by liver disease and/or neurologic and psychiatric pathology. The disease is a result of mutation in ATP7B, which encodes the ATP7B copper transporting ATPase. Loss of copper transport function by ATP7B results in copper accumulation primarily in the liver, but also in other organs including the brain. Studies in the Atp7b−/− mouse model of WD revealed specific transcript and metabolic changes that precede development of liver pathology, most notably downregulation of transcripts in the cholesterol biosynthetic pathway. In order to gain insight into the molecular mechanisms of transcriptomic and metabolic changes, we used a systems approach analysing the pre-symptomatic hepatic nuclear proteome and liver metabolites. We found that ligand-activated nuclear receptors FXR/NR1H4 and GR/NR3C1 and nuclear receptor interacting partners are less abundant in Atp7b−/− hepatocyte nuclei, while DNA repair machinery and the nucleus-localized glutathione peroxidase, SelH, are more abundant. Analysis of metabolites revealed an increase in polyol sugar alcohols, indicating a change in osmotic potential that precedes hepatocyte swelling observed later in disease. This work is the first application of quantitative Multidimensional Protein Identification Technology (MuDPIT) to a model of WD to investigate protein-level mechanisms of WD pathology. The systems approach using “shotgun” proteomics and metabolomics in the context of previous transcriptomic data reveals molecular-level mechanisms of WD development and facilitates targeted analysis of hepatocellular copper toxicity.
copper; Wilson’s Disease; ATP7B; liver; nuclear receptor; lipid metabolism; DNA repair; proteomics; metabolomics; transcriptomics
Breast cancer is the most common cancer in women worldwide, and the development of new technologies for better understanding of the molecular changes involved in breast cancer progression is essential. Metabolic changes precede overt phenotypic changes, because cellular regulation ultimately affects the use of small-molecule substrates for cell division, growth or environmental changes such as hypoxia. Differences in metabolism between normal cells and cancer cells have been identified. Because small alterations in enzyme concentrations or activities can cause large changes in overall metabolite levels, the metabolome can be regarded as the amplified output of a biological system. The metabolome coverage in human breast cancer tissues can be maximized by combining different technologies for metabolic profiling. Researchers are investigating alterations in the steady state concentrations of metabolites that reflect amplified changes in genetic control of metabolism. Metabolomic results can be used to classify breast cancer on the basis of tumor biology, to identify new prognostic and predictive markers and to discover new targets for future therapeutic interventions. Here, we examine recent results, including those from the European FP7 project METAcancer consortium, that show that integrated metabolomic analyses can provide information on the stage, subtype and grade of breast tumors and give mechanistic insights. We predict an intensified use of metabolomic screens in clinical and preclinical studies focusing on the onset and progression of tumor development.
breast cancer; metabolomics; lipidomics; biomarker analysis
We set out to test the hypothesis that pharmacometabolomic data could be efficiently merged with pharmacogenomic data by SNP imputation of metabolomic-derived pathway data on a “scaffolding” of genome-wide association (GWA) SNP data to broaden and accelerate “pharmacometabolomics-informed pharmacogenomic” studies by eliminating the need for initial genotyping and by making broader SNP association testing possible.
We previously genotyped 131 tag SNPs for six genes encoding enzymes in the glycine synthesis and degradation pathway using DNA from 529 depressed patients treated with citalopram/escitalopram to pursue a glycine metabolomics “signal” associated with selective serotonin reuptake inhibitor response. We identified a significant SNP in the glycine dehydrogenase gene. Subsequently, GWAS SNP data were generated for the same patients. In this study, we compared SNP imputation within 200 kb of these same six genes with results of the previous tag SNP strategy as a rapid strategy for merging pharmacometabolomic and pharmacogenomic data.
Imputed genotype data provided greater coverage and higher resolution than did tag SNP genotyping, with a higher average genotype concordance between genotyped and imputed SNP data for “1000 Genomes” (96.4%) than HapMap 2 (93.2%) imputation. Many low p-value SNPs with novel locations within genes were observed for imputed compared with tag SNPs, thus altering the focus for subsequent functional genomic studies.
These results indicate that the use of GWAS data to impute SNPs for genes in pathways identified by other “omics” approaches makes it possible to rapidly and economically identify SNP markers to “broaden” and accelerate pharmacogenomic studies.
Pharmacometabolomics; pharmacogenomics; imputation; tag SNPs; 1000 Genomes; HapMap; selective serotonin reuptake inhibitors; SSRIs; major depressive disorder; MDD
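The genotype concordance between genotyped and imputed SNP data mentioned above can be illustrated with a minimal sketch. It assumes concordance is defined as the fraction of overlapping (sample, SNP) calls on which both sources agree, and that genotype strings are already normalized to the same allele order; the sample and SNP identifiers are hypothetical.

```python
def genotype_concordance(genotyped, imputed):
    """Fraction of shared (sample, SNP) calls on which both sources agree."""
    shared = genotyped.keys() & imputed.keys()
    if not shared:
        raise ValueError("no overlapping genotype calls to compare")
    matches = sum(1 for key in shared if genotyped[key] == imputed[key])
    return matches / len(shared)

# Hypothetical calls keyed by (sample, SNP); not data from the study.
genotyped = {
    ("s1", "snpA"): "AG", ("s1", "snpB"): "CC",
    ("s2", "snpA"): "AA", ("s2", "snpB"): "CT",
}
imputed = {
    ("s1", "snpA"): "AG", ("s1", "snpB"): "CC",
    ("s2", "snpA"): "AG", ("s2", "snpB"): "CT",
    ("s3", "snpA"): "GG",  # imputed only, so ignored in the comparison
}
concordance = genotype_concordance(genotyped, imputed)
print(f"{concordance:.1%}")  # 75.0%
```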
Antihypertensive drugs are among the most commonly prescribed drugs for chronic disease worldwide. The response to antihypertensive drugs varies substantially between individuals, and important factors such as race that contribute to this heterogeneity are poorly understood. In this study we use metabolomics, a global biochemical approach, to investigate biochemical changes induced by the beta-adrenergic receptor blocker atenolol in Caucasians and African Americans. Plasma from individuals treated with atenolol was collected at baseline (untreated) and after a 9 week treatment period and analyzed using a GC-TOF metabolomics platform. The metabolomic signature of atenolol exposure included saturated (palmitic), monounsaturated (oleic, palmitoleic) and polyunsaturated (arachidonic, linoleic) free fatty acids, which decreased in Caucasians after treatment but were not different in African Americans (p<0.0005, q<0.03). Similarly, the ketone body 3-hydroxybutyrate was significantly decreased in Caucasians by 33% (p<0.0001, q<0.0001) but was unchanged in African Americans. The contribution of genetic variation in genes that encode lipases to the racial differences in atenolol-induced changes in fatty acids was examined. SNP rs9652472 in LIPC was found to be associated with the change in oleic acid in Caucasians (p<0.0005) but not African Americans, whereas the PLA2G4C SNP rs7250148 was associated with oleic acid change in African Americans (p<0.0001) but not Caucasians. Together, these data indicate that atenolol-induced changes in the metabolome are dependent on race and genotype. This study represents a first step of a pharmacometabolomic approach to phenotype patients with hypertension and gain mechanistic insights into racial variability in changes that occur with atenolol treatment, which may influence response to the drug.
Induced pluripotent stem cells are different from embryonic stem cells as shown by epigenetic and genomics analyses. Depending on cell types and culture conditions, such genetic alterations can lead to different metabolic phenotypes which may impact replication rates, membrane properties and cell differentiation. We here applied a comprehensive metabolomics strategy incorporating nanoelectrospray ion trap mass spectrometry (MS), gas chromatography-time of flight MS, and hydrophilic interaction- and reversed phase-liquid chromatography-quadrupole time-of-flight MS to examine the metabolome of induced pluripotent stem cells (iPSCs) compared to parental fibroblasts as well as to reference embryonic stem cells (ESCs). With over 250 identified metabolites and a range of structurally unknown compounds, quantitative and statistical metabolome data were mapped onto metabolite networks describing the metabolic state of iPSCs relative to other cell types. Overall, iPSCs exhibited a striking metabolic shift away from parental fibroblasts and toward ESCs, suggestive of near complete metabolic reprogramming. Differences between pluripotent cell types were not observed in carbohydrate or hydroxyl acid metabolism, pentose phosphate pathway metabolites, or free fatty acids. However, significant differences between iPSCs and ESCs were evident in phosphatidylcholine and phosphatidylethanolamine lipid structures, essential and non-essential amino acids, and metabolites involved in polyamine biosynthesis. Together our findings demonstrate that during cellular reprogramming, the metabolome of fibroblasts is also reprogrammed to take on an ESC-like profile, but there are select unique differences apparent in iPSCs. The identified metabolomics signatures of iPSCs and ESCs may have important implications for functional regulation of maintenance and induction of pluripotency.
One of the major obstacles in metabolomics is the identification of unknown metabolites. We tested constraints for re-identifying the correct structures of 29 known metabolite peaks from GCT premier accurate mass chemical ionization GC-TOF mass spectrometry data without any use of mass spectral libraries. Correct elemental formulas were retrieved within the top-3 hits for most molecular ion adducts using the “Seven Golden Rules” algorithm. An average of 514 potential structures per formula was downloaded from the PubChem chemical database and in-silico derivatized using the ChemAxon software package. After chemical curation, Kovats retention indices (RI) were predicted for up to 747 potential structures per formula using the NIST MS group contribution algorithm and corrected for the contribution of trimethylsilyl groups using the Fiehnlib RI library. When matching the range of predicted RI values against the experimentally determined peak retention, all but three incorrect formulas were excluded. For all remaining isomeric structures, accurate mass electron ionization spectra were predicted using the MassFrontier software and scored against experimental spectra. Using a mass error window of 10 ppm for fragment ions, 89% of all isomeric structures were removed and the correct structure was reported within the top-5 hits in 73% of the cases.
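The 10 ppm mass error window for fragment ions mentioned above can be made concrete with a short sketch. The scoring function is a deliberately simplified stand-in (fraction of predicted fragments that find an observed peak inside the window) and is not the scoring used by MassFrontier or by the authors; all m/z values are hypothetical.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

def within_window(measured_mz, theoretical_mz, tolerance_ppm=10.0):
    """True if the measured m/z lies inside the +/- tolerance window."""
    return abs(ppm_error(measured_mz, theoretical_mz)) <= tolerance_ppm

def fraction_matched(predicted_fragments, observed_peaks, tolerance_ppm=10.0):
    """Share of predicted fragment m/z values matched by any observed peak."""
    if not predicted_fragments:
        return 0.0
    matched = sum(
        1
        for predicted in predicted_fragments
        if any(within_window(obs, predicted, tolerance_ppm) for obs in observed_peaks)
    )
    return matched / len(predicted_fragments)

# Hypothetical values: a deviation of 0.0007 at m/z 147.0655 is about 4.8 ppm
# (accepted), while 0.0035 is about 23.8 ppm (rejected).
print(within_window(147.0662, 147.0655))  # True
print(within_window(147.0690, 147.0655))  # False
print(fraction_matched([100.0, 200.0], [100.0005, 200.01]))  # 0.5
```

Candidate structures whose predicted fragments are poorly matched under such a window could then be ranked lower or discarded.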
Changes in energy metabolism of the cells are common to many kinds of tumors and are considered a hallmark of cancer. Gas chromatography followed by time-of-flight mass spectrometry (GC-TOFMS) is a well-suited technique to investigate the small molecules in the central metabolic pathways. However, the metabolic changes between invasive carcinoma and normal breast tissues have not so far been investigated in a large cohort of breast cancer samples.
A cohort of 271 breast cancer and 98 normal tissue samples was investigated using GC-TOFMS-based metabolomics. A total number of 468 metabolite peaks could be detected; out of these 368 (79%) were significantly changed between cancer and normal tissues (p<0.05 in training and validation set). Furthermore, 13 tumor and 7 normal tissue markers were identified that separated cancer from normal tissues with a sensitivity and a specificity of >80%. Two-metabolite classifiers, constructed as ratios of the tumor and normal tissues markers, separated cancer from normal tissues with high sensitivity and specificity. Specifically, the cytidine-5-monophosphate / pentadecanoic acid metabolic ratio was the most significant discriminator between cancer and normal tissues and allowed detection of cancer with a sensitivity of 94.8% and a specificity of 93.9%.
For the first time, a comprehensive metabolic map of breast cancer was constructed by GC-TOF analysis of a large cohort of breast cancer and normal tissues. Furthermore, our results demonstrate that spectrometry-based approaches have the potential to contribute to the analysis of biopsies or clinical tissue samples complementary to histopathology.
Breast cancer; Metabolomics; Gas chromatography; Mass spectrometry; Cancer detection
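The two-metabolite ratio classifiers described above can be sketched as a simple threshold rule, from which sensitivity and specificity follow directly. The sketch assumes that a higher tumor-marker/normal-marker ratio indicates cancer; the ratios, labels and threshold below are hypothetical and are not data from the study.

```python
def sensitivity_specificity(ratios, is_cancer, threshold):
    """Classify a sample as cancer when its marker ratio exceeds the threshold."""
    tp = fn = tn = fp = 0
    for ratio, truth in zip(ratios, is_cancer):
        predicted_cancer = ratio > threshold
        if truth and predicted_cancer:
            tp += 1
        elif truth:
            fn += 1
        elif predicted_cancer:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)  # share of cancer samples detected
    specificity = tn / (tn + fp)  # share of normal samples correctly cleared
    return sensitivity, specificity

# Hypothetical tumor-marker / normal-marker intensity ratios.
ratios = [4.0, 3.0, 0.8, 0.5, 0.4, 2.5]
is_cancer = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(ratios, is_cancer, threshold=1.0)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In practice the threshold would be chosen on a training set and the two figures reported on an independent validation set, in line with the training and validation split mentioned in the abstract.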
Statins are widely prescribed for reducing LDL-cholesterol (C) and risk for cardiovascular disease (CVD), but there is considerable variation in therapeutic response. We used a gas chromatography-time-of-flight mass-spectrometry-based metabolomics platform to evaluate global effects of simvastatin on intermediary metabolism. Analyses were conducted in 148 participants in the Cholesterol and Pharmacogenetics study who were profiled pre and six weeks post treatment with 40 mg/day simvastatin: 100 randomly selected from the full range of the LDL-C response distribution and 24 each from the top and bottom 10% of this distribution (“good” and “poor” responders, respectively). The metabolic signature of drug exposure in the full range of responders included essential amino acids, lauric acid (p<0.0055, q<0.055), and alpha-tocopherol (p<0.0003, q<0.017). Using the HumanCyc database and pathway enrichment analysis, we observed that the metabolites of drug exposure were enriched for the pathway class amino acid degradation (p<0.0032). Metabolites whose change correlated with LDL-C lowering response to simvastatin in the full range responders included cystine, urea cycle intermediates, and the dibasic amino acids ornithine, citrulline and lysine. These dibasic amino acids share plasma membrane transporters with arginine, the rate-limiting substrate for nitric oxide synthase (NOS), a critical mediator of cardiovascular health. Baseline metabolic profiles of the good and poor responders were analyzed by orthogonal partial least square discriminant analysis so as to determine the metabolites that best separated the two response groups and could be predictive of LDL-C response. Among these were xanthine, 2-hydroxyvaleric acid, succinic acid, stearic acid, and fructose. 
Together, the findings from this study indicate that clusters of metabolites involved in multiple pathways not directly connected with cholesterol metabolism may play a role in modulating the response to simvastatin treatment.
Exposure to dioxins has been shown to contribute to the development of inflammatory diseases such as atherosclerosis. Macrophage-mediated inflammation is a critical event in the initiation of atherosclerosis. Previously we showed that treatment of macrophages with 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) leads to aryl hydrocarbon receptor (AhR)-dependent activation of inflammatory mediators and the formation of cholesterol laden foam cells. However, the mechanisms responsible for the formation of atherosclerotic lesions mediated through AhR have not been identified.
Methods and Results
An in vitro macrophage and an ApoE−/− mouse model were used to determine whether chemokines and their receptors are responsible for the AhR-mediated atherogenesis. Exposure of ApoE−/− mice to TCDD caused a time-dependent progression of atherosclerosis, which was associated with induction of inflammatory genes including Interleukin (IL)-8 as well as F4/80 and matrix metalloproteinase (MMP)-12. High fat diet enhanced the TCDD-mediated inflammatory response and deteriorated the formation of complex atheromas. Treatment with a CXCR2 inhibitor and an AhR antagonist reduced the TCDD-induced progression of early atherosclerotic lesions in ApoE−/− mice.
The results suggest that CXCR2 mediates the atherogenic activity of environmental pollutants, such as dioxins, and contributes to the development of atherosclerosis through the induction of a vascular inflammatory response by activating the AhR-signaling pathway.
AhR; IL-8; TCDD; MMP; CXCR2
Exposure to environmental tobacco smoke (ETS) leads to higher rates of pulmonary diseases and infections in children. To study the biochemical changes that may precede lung diseases, metabolomic effects on fetal and maternal lungs and plasma from rats exposed to ETS were compared to filtered air control animals. Genome-reconstructed metabolic pathways may be used to map and interpret dysregulation in metabolic networks. However, mass spectrometry-based non-targeted metabolomics datasets often comprise many metabolites for which links to enzymatic reactions have not yet been reported. Hence, network visualizations that rely on current biochemical databases are incomplete and also fail to visualize novel, structurally unidentified metabolites.
We present a novel approach to integrate biochemical pathway and chemical relationships to map all detected metabolites in network graphs (MetaMapp) using the KEGG reactant pair database and Tanimoto chemical and NIST mass spectral similarity scores. In fetal and maternal lungs, and in maternal blood plasma from pregnant rats exposed to environmental tobacco smoke (ETS), 459 unique metabolites comprising 179 structurally identified compounds were detected by gas chromatography time of flight mass spectrometry (GC-TOF MS) and BinBase data processing. MetaMapp graphs in Cytoscape showed much clearer metabolic modularity and complete content visualization compared to conventional biochemical mapping approaches. Cytoscape visualization of differential statistics results using these graphs showed that overall, fetal lung metabolism was more impaired than lung and blood metabolism in dams. Fetuses from ETS-exposed dams expressed lower lipid and nucleotide levels and higher amounts of energy metabolism intermediates than control animals, indicating lower biosynthetic rates of metabolites for cell division, structural proteins and lipids that are critical for lung development.
MetaMapp efficiently visualizes mass spectrometry-based metabolomics datasets as network graphs in Cytoscape and highlights metabolic alterations that can be associated with the higher rate of pulmonary diseases and infections in children prenatally exposed to ETS. The MetaMapp scripts can be accessed at http://metamapp.fiehnlab.ucdavis.edu.
Metabolic networks; Enzymatic pathways; Perinatal lung development; Lung surfactants
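The Tanimoto chemical similarity used above to link metabolites in a network graph can be illustrated with a minimal sketch. It assumes each compound is represented by the set of "on" bits of a structural fingerprint and that an edge is drawn when the score reaches a cutoff; the fingerprints, compound names and the 0.5 cutoff are hypothetical and do not reproduce the MetaMapp implementation.

```python
from itertools import combinations

def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared on-bits divided by all on-bits."""
    union = bits_a | bits_b
    if not union:
        return 0.0
    return len(bits_a & bits_b) / len(union)

def similarity_edges(fingerprints, threshold):
    """Return (name_a, name_b, score) for every pair at or above the threshold."""
    edges = []
    for name_a, name_b in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[name_a], fingerprints[name_b])
        if score >= threshold:
            edges.append((name_a, name_b, score))
    return edges

# Hypothetical fingerprints given as sets of on-bit positions.
fingerprints = {
    "met_A": {1, 2, 3, 4},
    "met_B": {2, 3, 4, 5},
    "met_C": {10, 11},
}
edges = similarity_edges(fingerprints, threshold=0.5)
print(edges)  # [('met_A', 'met_B', 0.6)]
```

An edge list of this kind can be exported as a simple table and loaded into a network viewer such as Cytoscape, so that structurally related compounds cluster together even when no enzymatic reaction links them in a pathway database.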
Genetic polymorphisms of the organic cation transporter 2 (OCT2), encoded by SLC22A2, have been investigated in association with metformin disposition. A functional decrease in transport function has been shown to be associated with the OCT2 variants. Using metabolomics, our study aims at a comprehensive monitoring of primary metabolite changes in order to understand biochemical alterations associated with OCT2 polymorphisms and to discover potential endogenous metabolites related to the genetic variation of OCT2. Using GC-TOF MS based metabolite profiling, clear clustering of samples was observed in Partial Least Square Discriminant Analysis, showing that metabolic profiles were linked to the genetic variants of OCT2. Tryptophan and uridine presented the most significant alteration in SLC22A2-808TT homozygous and the SLC22A2-808G>T heterozygous variants relative to the reference. Tryptophan in particular showed gene-dose effects of transporter activity according to OCT2 genotypes and the greatest linear association with the pharmacokinetic parameters (Clrenal, Clsec, Cl/F/kg, and Vd/F/kg) of metformin. An inhibition assay demonstrated the inhibitory effect of tryptophan on the uptake of 1-methyl-4-phenylpyridinium in a concentration-dependent manner, and a subsequent uptake experiment revealed differential tryptophan uptake rates in the oocytes expressing the OCT2 reference and variant (808G>T). Our results collectively indicate that tryptophan can serve as one of the endogenous substrates for OCT2 as well as a biomarker candidate indicating the variability of the transport activity of OCT2.
Major depressive disorder (MDD) is a common psychiatric disease. Selective serotonin reuptake inhibitors (SSRIs) are an important class of drugs used to treat MDD. However, many patients do not respond adequately to SSRI therapy. We used a pharmacometabolomics-informed pharmacogenomic research strategy to identify citalopram/escitalopram treatment outcome biomarkers. Metabolomic assay of plasma samples from 20 escitalopram remitters and 20 non-remitters showed that glycine was negatively associated with treatment outcome (p=0.0054). That observation was pursued by genotyping tag single nucleotide polymorphisms (SNPs) for genes encoding glycine synthesis and degradation enzymes using 529 DNA samples from SSRI-treated MDD patients. The rs10975641 SNP in the glycine dehydrogenase gene was associated with treatment outcome phenotypes. Rs10975641 was then genotyped and was significant (p=0.02) in DNA from 1245 MDD patients in the STAR*D depression study. These results highlight both a possible role for glycine in SSRI response and the use of pharmacometabolomics to “inform” pharmacogenomics.
Selective serotonin reuptake inhibitors; SSRIs; major depressive disorder; MDD; pharmacometabolomics; metabolomics; pharmacogenomics; glycine; glycine dehydrogenase; GLDC; escitalopram; citalopram
Metabolomics is the methodology that identifies and measures global pools of small molecules (of less than about 1,000 Da) of a biological sample, which are collectively called the metabolome. Metabolomics can therefore reveal the metabolic outcome of a genetic or environmental perturbation of a metabolic regulatory network, and thus provide insights into the structure and regulation of that network. Because of the chemical complexity of the metabolome and limitations associated with individual analytical platforms for determining the metabolome, it is currently difficult to capture the complete metabolome of an organism or tissue, which is in contrast to genomics and transcriptomics. This paper describes the analysis of Arabidopsis metabolomics data sets acquired by a consortium that includes five analytical laboratories, bioinformaticists, and biostatisticians, which aims to develop and validate metabolomics as a hypothesis-generating functional genomics tool. The consortium is determining the metabolomes of Arabidopsis T-DNA mutant stocks, grown in a standardized controlled environment optimized to minimize environmental impacts on the metabolomes. Metabolomics data were generated with seven analytical platforms, and the combined data are being provided to the research community to formulate initial hypotheses about genes of unknown function (GUFs). A public database (www.PlantMetabolomics.org) has been developed to provide the scientific community with access to the data along with tools to allow for its interactive analysis. Exemplary datasets are discussed to validate the approach; they illustrate how initial hypotheses can be generated from the consortium-produced metabolomics data and integrated with prior knowledge to provide a testable hypothesis concerning the functionality of GUFs.
Arabidopsis; metabolomics; gene annotation; functional genomics; database
Validation of analytical methods is critical in metabolomics. Over the past ten years there has been significant progress in the field in terms of both the number of hypotheses tested and the results obtained, with an impact on the fields of biology, medicine and nutrition. Therefore, data reliability, reproducibility and integrity have become extremely important in the analytical laboratory.
In this presentation, the classical GLP approach to sample management and quality control in GC-MS analysis will be explained, and good data and out-of-control data will be compared. High sample throughput and the complexity of the samples make it challenging to control GC-MS data. The injector and analytical column contribute to uncertainty as they accumulate non-volatile components of the matrix. The change in the chemistry of the column and the injector influences the abundances of the metabolites detected. These parameters have to be controlled by analyzing quality control standards, internal standards and FAME markers throughout the analysis, so that the biological variations between different experimental conditions are easy to detect from mass spectral data. The goal is to illustrate the relevance of quality control to statistical data analysis and study outcomes.
Consumption of large amounts of fructose or sucrose increases lipogenesis and circulating triglycerides in humans. Although the underlying molecular mechanisms responsible for this effect are not completely understood, it is possible that as reported for rodents, high fructose exposure increases expression of the lipogenic enzymes fatty acid synthase (FAS) and acetyl-CoA carboxylase (ACC-1) in human liver. Since activation of the hexosamine biosynthesis pathway (HBP) is associated with increases in the expression of FAS and ACC-1, it raises the possibility that HBP-related metabolites would contribute to any increase in hepatic expression of these enzymes following fructose exposure. Thus, we compared lipogenic gene expression in human-derived HepG2 cells after incubation in culture medium containing glucose alone or glucose plus 5 mM fructose, using the HBP precursor 10 mM glucosamine (GlcN) as a positive control. Cellular metabolite profiling was conducted to analyze differences between glucose and fructose metabolism. Despite evidence for the active uptake and metabolism of fructose by HepG2 cells, expression of FAS or ACC-1 did not increase in these cells compared with those incubated with glucose alone. Levels of UDP-N-acetylglucosamine (UDP-GlcNAc), the end-product of the HBP, did not differ significantly between the glucose and fructose conditions. Exposure to 10 mM GlcN for 10 minutes to 24 hours resulted in 8-fold elevated levels of intracellular UDP-GlcNAc (P<0.001), as well as a 74–126% increase in FAS (P<0.05) and 49–95% increase in ACC-1 (P<0.01) expression above controls. It is concluded that in HepG2 liver cells cultured under standard conditions, sustained exposure to fructose does not result in an activation of the HBP or increased lipogenic gene expression. 
Should this scenario manifest in human liver in vivo, it would suggest that high fructose consumption promotes triglyceride synthesis primarily through its action to provide lipid precursor carbon and not by activating lipogenic gene expression.
Volatile compounds comprise diverse chemical groups with wide-ranging sources and functions. These compounds originate from major pathways of secondary metabolism in many organisms and play essential roles in chemical ecology in both plant and animal kingdoms. In past decades, sampling methods and instrumentation for the analysis of complex volatile mixtures have improved; however, design and implementation of database tools to process and store the complex datasets have lagged behind.
The volatile compound BinBase (vocBinBase) is an automated peak annotation and database system developed for the analysis of GC-TOF-MS data derived from complex volatile mixtures. The vocBinBase DB is an extension of the previously reported metabolite BinBase software developed to track and identify derivatized metabolites. The BinBase algorithm uses deconvoluted spectra and peak metadata (retention index, unique ion, spectral similarity, peak signal-to-noise ratio, and peak purity) from the Leco ChromaTOF software, and annotates peaks using a multi-tiered filtering system with stringent thresholds. The vocBinBase algorithm assigns the identity of compounds existing in the database. Volatile compound assignments are supported by the Adams mass spectral-retention index library, which contains over 2,000 plant-derived volatile compounds. Novel molecules that are not found within vocBinBase are automatically added using strict mass spectral and experimental criteria. Users obtain fully annotated data sheets with quantitative information for all volatile compounds for studies that may consist of thousands of chromatograms. The vocBinBase database may also be queried across different studies, and currently comprises 1,537 unique mass spectra generated from 1.7 million deconvoluted mass spectra of 3,435 samples (18 species). Mass spectra with retention indices and volatile profiles are available as a free download under the CC-BY agreement (http://vocbinbase.fiehnlab.ucdavis.edu).
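The multi-tiered filtering idea can be illustrated with a short Python sketch. This is not the BinBase implementation: the class names, the order of the tiers, the threshold values and the 0 to 1 similarity scale are all assumptions chosen for illustration, and peak purity is omitted for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class Bin:
    """A database entry (hypothetical structure)."""
    name: str
    retention_index: float
    unique_ion: int

@dataclass
class Peak:
    """A deconvoluted peak with its metadata (hypothetical structure)."""
    retention_index: float
    unique_ion: int
    signal_to_noise: float
    # Precomputed spectral similarity (0 to 1) to each bin, keyed by bin name.
    similarity: dict = field(default_factory=dict)

def annotate(peak, bins, min_sn=10.0, ri_window=5.0, min_similarity=0.8):
    """Return the name of the best matching bin, or None if no bin passes all tiers."""
    # Tier 1: discard low-quality peaks outright.
    if peak.signal_to_noise < min_sn:
        return None
    candidates = []
    for b in bins:
        # Tier 2: the retention index must fall inside the window.
        if abs(peak.retention_index - b.retention_index) > ri_window:
            continue
        # Tier 3: the unique ion must agree.
        if peak.unique_ion != b.unique_ion:
            continue
        # Tier 4: the spectral similarity must reach the threshold.
        score = peak.similarity.get(b.name, 0.0)
        if score >= min_similarity:
            candidates.append((score, b.name))
    return max(candidates)[1] if candidates else None

# Made-up bins and peaks.
bins = [Bin("bin_A", 1030.0, 68), Bin("bin_B", 1032.0, 93)]
good_peak = Peak(1031.0, 93, 50.0, {"bin_A": 0.60, "bin_B": 0.92})
noisy_peak = Peak(1031.0, 93, 3.0, {"bin_B": 0.92})
print(annotate(good_peak, bins), annotate(noisy_peak, bins))
```

In a scheme of this kind, a high-quality peak for which `annotate` returns None would be a candidate for a new database entry, subject to stricter criteria.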
The BinBase database algorithms have been successfully modified to allow for tracking and identification of volatile compounds in complex mixtures. The database is capable of annotating large datasets (hundreds to thousands of samples) and is well-suited for between-study comparisons such as chemotaxonomy investigations. This novel volatile compound database tool is applicable to research fields spanning chemical ecology to human health. The BinBase source code is freely available at http://binbase.sourceforge.net/ under the LGPL 2.0 license agreement.
The metabolite profile changes induced by Fe deficiency in leaves and xylem sap of several Strategy I plant species have been characterized. We have confirmed that Fe deficiency causes consistent changes both in the xylem sap and leaf metabolite profiles. The main changes in the xylem sap metabolite profile in response to Fe deficiency include consistent decreases in amino acids, N-related metabolites and carbohydrates, and increases in TCA cycle metabolites. In tomato, Fe resupply causes a transitory flush of xylem sap carboxylates, but within 1 day the metabolite profile of the xylem sap from Fe-deficient plants becomes similar to that of Fe-sufficient controls. The main changes in the metabolite profile of leaf extracts in response to Fe deficiency include consistent increases in amino acids and N-related metabolites, carbohydrates and TCA cycle metabolites. In leaves, selected pairs of amino acids and TCA cycle metabolites show high correlations, with the sign depending on the Fe status. These data suggest that in low-photosynthesis, C-starved Fe-deficient plants, anaplerotic reactions involving amino acids can be crucial for short-term survival.
anaplerotic reactions; chlorosis; iron deficiency; leaves; metabolomics; xylem sap
At least two independent parameters are necessary for compound identification in metabolomics. We have compiled 2,212 electron impact mass spectra and retention indices for quadrupole and time-of-flight GC/MS for over 1,000 primary metabolites below 550 Da, covering lipids, amino acids, fatty acids, amines, alcohols, sugars, amino-sugars, sugar alcohols, sugar acids, organic phosphates, hydroxyl acids, aromatics, purines and sterols as methoximated and trimethylsilylated mass spectra under electron impact ionization. Compounds were selected from different metabolic pathway databases. The structural diversity of the libraries was found to overlap strongly with that of the metabolites represented in the BioMeta/KEGG pathway database, based on chemical fingerprints and calculations using Instant-JChem. In total, the FiehnLib libraries comprised 68% more compounds and twice as many spectra with higher spectral diversity than the public Golm Metabolite Database. The FiehnLib libraries contain a range of unique compounds that are not included in the 4,345 trimethylsilylated spectra of the commercial NIST05 mass spectral database. The libraries can be used in conjunction with GC/MS software but also support compound identification in the public BinBase metabolomic database, which currently comprises 5,598 unique mass spectra generated from 19,032 samples covering 279 studies of 47 species (plants, animals and microorganisms).
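To illustrate why two independent parameters are needed, here is a hedged Python sketch that accepts a library hit only when both the retention index and the mass spectrum agree. Cosine similarity is used as one common spectral score; the tolerance, the score threshold and the library entries are made-up assumptions and do not come from FiehnLib.

```python
import math

def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def identify(query_ri, query_spec, library, ri_tolerance=10.0, min_score=0.7):
    """Return (score, name) hits that pass both the retention index and spectral filters."""
    hits = []
    for name, (lib_ri, lib_spec) in library.items():
        # Parameter 1: retention index within a hypothetical tolerance.
        if abs(query_ri - lib_ri) > ri_tolerance:
            continue
        # Parameter 2: spectral similarity above a hypothetical threshold.
        score = cosine_similarity(query_spec, lib_spec)
        if score >= min_score:
            hits.append((score, name))
    return sorted(hits, reverse=True)

# Two made-up isomers with identical spectra but different retention indices.
spectrum = {73: 100.0, 147: 40.0, 204: 80.0}
library = {
    "isomer_1": (1900.0, dict(spectrum)),
    "isomer_2": (1950.0, dict(spectrum)),
}
hits = identify(1902.0, spectrum, library)
print(hits)
```

In this constructed case the spectrum alone cannot separate the two isomers; only the retention index filter narrows the result to a single candidate.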
Insulin resistance progressing to type 2 diabetes mellitus (T2DM) is marked by a broad perturbation of macronutrient intermediary metabolism. Understanding the biochemical networks that underlie metabolic homeostasis and how they associate with insulin action will help unravel diabetes etiology and should foster discovery of new biomarkers of disease risk and severity. We examined differences in plasma concentrations of >350 metabolites in fasted obese T2DM vs. obese non-diabetic African-American women, and utilized principal components analysis to identify 158 metabolite components that strongly correlated with fasting HbA1c over a broad range of the latter (r = −0.631; p<0.0001). In addition to many unidentified small molecules, specific metabolites that were increased significantly in T2DM subjects included certain amino acids and their derivatives (i.e., leucine, 2-ketoisocaproate, valine, cystine, histidine), 2-hydroxybutanoate, long-chain fatty acids, and carbohydrate derivatives. Leucine and valine concentrations rose with increasing HbA1c, and significantly correlated with plasma acetylcarnitine concentrations. It is hypothesized that this reflects a close link between abnormalities in glucose homeostasis, amino acid catabolism, and efficiency of fuel combustion in the tricarboxylic acid (TCA) cycle. It is speculated that a mechanism for potential TCA cycle inefficiency concurrent with insulin resistance is “anaplerotic stress” emanating from reduced amino acid-derived carbon flux to TCA cycle intermediates, which if coupled to perturbation in cataplerosis would lead to net reduction in TCA cycle capacity relative to fuel delivery.
Summary: Metabolomic publications and databases use different database identifiers or even trivial names, which prevents queries across databases or between studies. The best way to annotate metabolites is by chemical structures, encoded by the International Chemical Identifier code (InChI) or InChIKey. We have implemented a web-based Chemical Translation Service that performs batch conversions of the most common compound identifiers, including CAS, CHEBI, compound formulas, Human Metabolome Database HMDB, InChI, InChIKey, IUPAC name, KEGG, LipidMaps, PubChem CID+SID, SMILES and chemical synonym names. Batch conversion downloads of 1410 CIDs are performed in 2.5 min. Structures are automatically displayed.
Implementation: The software was implemented in Groovy and JAVA, the web frontend was implemented in GRAILS and the database used was PostgreSQL.
Availability: The source code and an online web interface are freely available. Chemical Translation Service (CTS): http://cts.fiehnlab.ucdavis.edu
The structural elucidation of small molecules using mass spectrometry plays an important role in modern life sciences and bioanalytical approaches. This review covers different soft and hard ionization techniques and figures of merit for modern mass spectrometers, such as mass resolving power, mass accuracy, isotopic abundance accuracy, accurate mass multiple-stage MS(n) capability, as well as hybrid mass spectrometric and orthogonal chromatographic approaches. The latter part discusses mass spectral data handling strategies, which include background and noise subtraction, adduct formation and detection, charge state determination, accurate mass measurements, elemental composition determinations, and complex data-dependent setups with ion maps and ion trees. The importance of mass spectral library search algorithms for tandem mass spectra and multiple-stage MS(n) mass spectra, as well as of mass spectral tree libraries that combine multiple-stage mass spectra, is outlined. The subsequent chapter discusses mass spectral fragmentation pathways, biotransformation reactions and drug metabolism studies, the mass spectral simulation and generation of in silico mass spectra, expert systems for mass spectral interpretation, and the use of computational chemistry to explain gas-phase phenomena. A single chapter discusses data handling for hyphenated approaches including mass spectral deconvolution for clean mass spectra, cheminformatics approaches and structure retention relationships, and retention index predictions for gas and liquid chromatography. The last section reviews the current state of electronic data sharing of mass spectra and discusses the importance of software development for the advancement of structure elucidation of small molecules.
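Two of the figures of merit named above have simple standard definitions, sketched here in Python: mass accuracy as a relative error in parts per million (ppm), and mass resolving power as m/z divided by the peak width (here taken at full width at half maximum, FWHM). The function names and the numerical values are illustrative assumptions and are not taken from the review.

```python
def mass_error_ppm(measured_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

def resolving_power(mz, fwhm):
    """Mass resolving power, m / delta-m, using the peak width at half maximum."""
    return mz / fwhm

def mz_window(theoretical_mz, tolerance_ppm):
    """Lower and upper m/z bounds for a given ppm tolerance."""
    delta = theoretical_mz * tolerance_ppm * 1e-6
    return theoretical_mz - delta, theoretical_mz + delta

# Made-up example: an ion expected at m/z 500.0000 and observed at m/z 500.0010,
# with a peak that is 0.05 m/z units wide at half maximum.
error = mass_error_ppm(500.0010, 500.0000)
power = resolving_power(500.0, 0.05)
low, high = mz_window(500.0, 5.0)
print(f"error = {error:.2f} ppm, R = {power:.0f}, 5 ppm window = {low:.4f} to {high:.4f}")
```

A window function of this kind is what typically restricts the candidate list in elemental composition determination: the tighter the ppm tolerance an instrument supports, the fewer formulas fall inside the window.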
Electronic supplementary material
The online version of this article (doi:10.1007/s12566-010-0015-9) contains supplementary material, which is available to authorized users.
Structure elucidation; Mass spectrometry; Tandem mass spectra; Fragmentation prediction; Mass spectral interpretation; Mass spectral library search; Multistage tandem mass spectrometry