In this era of precision medicine, the deep and comprehensive characterization of tumor phenotypes will lead to therapeutic strategies beyond classical factors such as primary sites or anatomical staging. Recently, “-omics” approaches have expanded our knowledge of tumor biology. Such approaches have been extensively implemented in order to provide biomarkers for monitoring of the disease as well as to improve readouts of therapeutic impact. The application of metabolomics to the study of cancer is especially beneficial, since it reflects the biochemical consequences of many cancer type-specific pathophysiological processes. Here, we characterize metabolic profiles of colon and ovarian cancer cell lines to provide broader insight into differentiating metabolic processes for prospective drug development and clinical screening.
We applied non-targeted, mass spectrometry-based metabolomics combined with ultrahigh-performance liquid chromatography and gas chromatography for the metabolic phenotyping of four cancer cell lines: two from colon cancer (HCT15, HCT116) and two from ovarian cancer (OVCAR3, SKOV3). We used the MetaP server for statistical data analysis.
A total of 225 metabolites were detected in all four cell lines; 67 of these molecules significantly discriminated colon cancer from ovarian cancer cells. Metabolic signatures revealed in our study suggest elevated tricarboxylic acid cycle and lipid metabolism in ovarian cancer cell lines, as well as increased β-oxidation and urea cycle metabolism in colon cancer cell lines.
Our study provides a panel of distinct metabolic fingerprints between colon and ovarian cancer cell lines. These may serve as potential drug targets, and now can be evaluated further in primary cells, biofluids, and tissue samples for biomarker purposes.
Electronic supplementary material
The online version of this article (doi:10.1186/s12967-015-0576-z) contains supplementary material, which is available to authorized users.
High blood pressure is a major contributor to the global burden of disease, and discovering novel causal pathways of blood pressure regulation has been challenging. We tested blood pressure associations with 280 fasting blood metabolites in 3980 TwinsUK females. Survival analysis for all-cause mortality was performed on significant independent metabolites (P<8.9×10⁻⁵). Replication was conducted in 2 independent cohorts, KORA (n=1494) and Hertfordshire (n=1515). Three independent animal experiments were performed to establish causality: (1) blood pressure change after increasing circulating metabolite levels in Wistar–Kyoto rats; (2) circulating metabolite change after salt-induced blood pressure elevation in spontaneously hypertensive stroke-prone rats; and (3) mesenteric artery response to noradrenaline and carbachol in metabolite-treated and control rats. Of the 15 metabolites that showed an independent significant association with blood pressure, only hexadecanedioate, a dicarboxylic acid, showed concordant association with blood pressure (systolic BP: β [95% confidence interval], 1.31 [0.83–1.78], P=6.81×10⁻⁸; diastolic BP: 0.81 [0.5–1.11], P=2.96×10⁻⁷) and mortality (hazard ratio [95% confidence interval], 1.49 [1.08–2.05]; P=0.02) in TwinsUK. The blood pressure association was replicated in KORA and Hertfordshire. In the animal experiments, we showed that oral hexadecanedioate increased both circulating hexadecanedioate and blood pressure in Wistar–Kyoto rats, whereas blood pressure elevation with oral sodium chloride in hypertensive rats did not affect hexadecanedioate levels. Vascular reactivity to noradrenaline was significantly increased in mesenteric resistance arteries from hexadecanedioate-treated rats compared with controls, indicated by the shift to the left of the concentration–response curve (P=0.013). Relaxation to carbachol did not show any difference.
Our findings indicate that hexadecanedioate is causally associated with blood pressure regulation through a novel pathway that merits further investigation.
blood pressure; fatty acid synthases; hypertension; metabolomics; mortality
Biological systems consist of multiple organizational levels all densely interacting with each other to ensure function and flexibility of the system. Simultaneous analysis of cross-sectional multi-omics data from large population studies is a powerful tool to comprehensively characterize the underlying molecular mechanisms on a physiological scale. In this study, we systematically analyzed the relationship between fasting serum metabolomics and whole blood transcriptomics data from 712 individuals of the German KORA F4 cohort. Correlation-based analysis identified 1,109 significant associations between 522 transcripts and 114 metabolites summarized in an integrated network, the ‘human blood metabolome-transcriptome interface’ (BMTI). Bidirectional causality analysis using Mendelian randomization did not yield any statistically significant causal associations between transcripts and metabolites. A knowledge-based interpretation and integration with a genome-scale human metabolic reconstruction revealed systematic signatures of signaling, transport and metabolic processes, i.e. metabolic reactions mainly belonging to lipid, energy and amino acid metabolism. Moreover, the construction of a network based on functional categories illustrated the cross-talk between the biological layers at a pathway level. Using a transcription factor binding site enrichment analysis, this pathway cross-talk was further confirmed at a regulatory level. Finally, we demonstrated how the constructed networks can be used to gain novel insights into molecular mechanisms associated with intermediate clinical traits. Overall, our results demonstrate the utility of a multi-omics integrative approach to understand the molecular mechanisms underlying both normal physiology and disease.
Biological systems operate on multiple, intertwined organizational layers that can nowadays be accessed by high-throughput measurement methods, the so-called ‘omics’ technologies. A major aim in the field of systems biology is to understand the flow of biological information between the different layers at a systems level in both health and disease. To unravel the complex mechanisms underlying those molecular processes and to understand how the different functional levels interact with each other, an integrated analysis of multiple layers, i.e. a ‘multi-omics’ approach, is required. In our present study, we investigate the relationship between circulating metabolites in serum and whole-blood gene expression measured in the blood of individuals from a population-based cohort. To this end, we constructed a correlation network that displays which transcripts and metabolites show the same trend of up- and down-regulation. We derived a functional characterization of the network by developing a novel computational analysis. The analysis revealed systematic signatures of signaling, transport and metabolic processes on both a regulatory and a pathway level. Moreover, integrating the network with associations to clinical markers such as HDL-cholesterol, LDL-cholesterol and TG identified coordinately activated pathways or modules which might help to assess the molecular machinery behind such an intermediate phenotype.
Metabolomics has opened new avenues for studying metabolic alterations in type 2 diabetes. While many urine and blood metabolites have been associated individually with diabetes, a complete systems view analysis of metabolic dysregulations across multiple biofluids and over varying timescales of glycaemic control is still lacking.
Here we report a broad metabolomics study in a clinical setting, covering 2,178 metabolite measures in saliva, blood plasma and urine from 188 individuals with diabetes and 181 controls of Arab and Asian descent. Using multivariate linear regression we identified metabolites associated with diabetes and markers of acute, short-term and long-term glycaemic control.
Ninety-four metabolite associations with diabetes were identified at a Bonferroni level of significance (p < 2.3 × 10⁻⁵), 16 of which have never been reported. Sixty-five of these diabetes-associated metabolites were associated with at least one marker of glycaemic control in the diabetes group. Using Gaussian graphical modelling, we constructed a metabolic network that links diabetes-associated metabolites from three biofluids across three different timescales of glycaemic control.
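As a rough check on the significance level quoted above, the Bonferroni threshold can be reproduced from the number of metabolite measures. This is only a sketch: the family-wise error rate of 0.05 is an assumption (the abstract does not state it), and the helper `is_significant` is a hypothetical name introduced for illustration.

```python
# Bonferroni correction: divide the family-wise error rate by the number of tests.
n_tests = 2178   # metabolite measures across saliva, plasma and urine (from the abstract)
alpha = 0.05     # ASSUMPTION: family-wise error rate, not stated in the abstract

threshold = alpha / n_tests
print(f"Bonferroni threshold: {threshold:.2e}")


def is_significant(p_value: float, n_tests: int, alpha: float = 0.05) -> bool:
    """Hypothetical helper: True if p_value passes the Bonferroni-corrected threshold."""
    return p_value < alpha / n_tests
```

Under this assumption the computed value rounds to the 2.3 × 10⁻⁵ reported in the abstract.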
Our study reveals a complex network of biochemical dysregulation involving metabolites from different pathways of diabetes pathology, and provides a reference framework for future diabetes studies with metabolic endpoints.
Electronic supplementary material
The online version of this article (doi:10.1007/s00125-015-3636-2) contains peer-reviewed but unedited supplementary material, which is available to authorised users.
Arab population; Asian population; Blood metabolomics; Gaussian graphical modelling; Glycaemic control; Metabolic dysregulation; Partial correlation; Saliva metabolomics; Systems biology; Type 2 diabetes; Urine metabolomics
The date palm (Phoenix dactylifera L.) is one of the oldest cultivated trees and is intimately tied to the history of human civilization. There are hundreds of commercial cultivars with distinct fruit shapes, colors, and sizes growing mainly in arid lands from the west of North Africa to India. The origin of date palm domestication is still uncertain, and few studies have attempted to document genetic diversity across multiple regions. We conducted genotyping-by-sequencing on 70 female cultivar samples from across the date palm–growing regions, including four Phoenix species as the outgroup. Here, for the first time, we generate genome-wide genotyping data for 13,000–65,000 SNPs in a diverse set of date palm fruit and leaf samples. Our analysis provides the first genome-wide evidence confirming recent findings that the date palm cultivars segregate into two main regions of shared genetic background from North Africa and the Arabian Gulf. We identify genomic regions with high densities of geographically segregating SNPs and also observe higher levels of allele fixation on the recently described X-chromosome than on the autosomes. Our results fit a model with two centers of earliest cultivation including date palms autochthonous to North Africa. These results adjust our understanding of human agricultural history and will provide the foundation for more directed functional studies and a better understanding of genetic diversity in date palm.
date palm; domestication; genotyping-by-sequencing; population genetics; plant sex chromosomes
Feed efficiency is a paramount factor for livestock economy. Previous studies have indicated a substantial heritability of several feed efficiency traits. In our study, we investigated the genetic background of residual feed intake, a commonly used parameter of feed efficiency, in a cattle resource population generated from crossing dairy and beef cattle. Starting from a whole genome association analysis, we subsequently performed combined phenotype-metabolome-genome analysis taking a systems biology approach by inferring gene networks based on partial correlation and information theory approaches. Our data about biological processes enriched with genes from the feed efficiency network suggest that genetic variation in feed efficiency is driven by genetic modulation of basic processes relevant to general cellular functions. When looking at the predicted upstream regulators from the feed efficiency network, the Tumor Protein P53 (TP53) and Transforming Growth Factor beta 1 (TGFB1) genes stood out regarding significance of overlap and number of target molecules in the data set. These results further support the hypothesis that TP53 is a major upstream regulator for genetic variation of feed efficiency. Furthermore, our data revealed a significant effect of both the Non-SMC Condensin I Complex, Subunit G (NCAPG) I442M (rs109570900) and the Growth/differentiation factor 8 (GDF8) Q204X (rs110344317) loci on residual feed intake and feed conversion. For both loci, the growth-promoting allele at the onset of puberty was associated with a negative, but favorable effect on residual feed intake. The elevated energy demand for increased growth triggered by the NCAPG 442M allele is obviously not fully compensated for by an increased efficiency in converting feed into body tissue.
As a consequence, the individuals carrying the NCAPG 442M allele had an additional demand for energy uptake that is reflected by the association of the allele with increased daily energy intake as observed in our study.
Excess body weight is a major risk factor for cardiometabolic diseases. The complex molecular mechanisms of body weight change-induced metabolic perturbations are not fully understood. Specifically, in-depth molecular characterization of long-term body weight change in the general population is lacking. Here, we pursued a multi-omic approach to comprehensively study metabolic consequences of body weight change during a seven-year follow-up in a large prospective study.
We used data from the population-based Cooperative Health Research in the Region of Augsburg (KORA) S4/F4 cohort. At follow-up (F4), two-platform serum metabolomics and whole blood gene expression measurements were obtained for 1,631 and 689 participants, respectively. Using weighted correlation network analysis, omics data were clustered into modules of closely connected molecules, followed by the formation of a partial correlation network from the modules. Association of the omics modules with previous annual percentage weight change was then determined using linear models. In addition, we performed pathway enrichment analyses, stability analyses, and assessed the relation of the omics modules with clinical traits.
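The idea behind a partial correlation network can be illustrated with the simplest case: the correlation between two variables after removing the linear effect of a third. The sketch below uses the standard first-order partial correlation formula on made-up toy data; it is not the study's pipeline, which would condition on all other modules at once, and the names `pearson`, `partial_corr`, `x`, `y` and `z` are illustrative assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (hypothetical): x and y both follow a shared driver z plus unrelated noise.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [zi + e for zi, e in zip(z, [1, -1, 1, -1, 1, -1, 1, -1])]
y = [zi + f for zi, f in zip(z, [1, 1, -1, -1, 1, 1, -1, -1])]

raw = pearson(x, y)              # strong, because both track z
partial = partial_corr(x, y, z)  # much weaker once z is accounted for
print(f"raw r = {raw:.2f}, partial r = {partial:.2f}")
```

In a network built this way, an edge is drawn only where the partial correlation remains substantial, which is why such networks tend to retain direct rather than mediated relationships.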
Four metabolite and two gene expression modules were significantly and stably associated with body weight change (P-values ranging from 1.9 × 10⁻⁴ to 1.2 × 10⁻²⁴). The four metabolite modules covered major branches of metabolism, with VLDL, LDL and large HDL subclasses, triglycerides, branched-chain amino acids and markers of energy metabolism among the main representative molecules. One gene expression module suggests a role of weight change in red blood cell development. The other gene expression module largely overlaps with the lipid-leukocyte (LL) module previously reported to interact with serum metabolites, for which we identify additional co-expressed genes. The omics modules were interrelated and showed cross-sectional associations with clinical traits. Moreover, weight gain and weight loss showed largely opposing associations with the omics modules.
Long-term weight change in the general population globally associates with serum metabolite concentrations. An integrated metabolomics and transcriptomics approach improved the understanding of molecular mechanisms underlying the association of weight gain with changes in lipid and amino acid metabolism, insulin sensitivity, mitochondrial function as well as blood cell development and function.
Electronic supplementary material
The online version of this article (doi:10.1186/s12916-015-0282-y) contains supplementary material, which is available to authorized users.
Metabolomics; Transcriptomics; Weight change; Obesity; Molecular epidemiology; Bioinformatics
Modification of DNA by methylation of cytosines at CpG dinucleotides is a widespread phenomenon that leads to changes in gene expression, thereby influencing and regulating many biological processes. Recent technical advances in the genome-wide determination of single-base DNA-methylation enabled epigenome-wide association studies (EWASs). Early EWASs established robust associations between age and gender with the degree of CpG methylation at specific sites. Other studies uncovered associations with cigarette smoking. However, so far these studies were mainly conducted in Caucasians, raising the question of whether these findings can also be extrapolated to other populations.
Here, we present an EWAS with age, gender, and smoking status in a family study of 123 individuals of Arab descent. We determined DNA methylation at over 450,000 CpG sites using the Illumina Infinium HumanMethylation450 BeadChip, applied state-of-the-art data processing protocols, including correction for blood cell type heterogeneity and hidden confounders, and eliminated probes containing SNPs at the targeted CpG site using 40× whole-genome sequencing data. Using this approach, we could replicate the leading published EWAS associations with age, gender and smoking, and recovered hallmarks of gender-specific epigenetic changes. Interestingly, we could even replicate the recently reported precise prediction of chronological age based on the methylation of only a few selected CpG sites.
Our study supports the view that when applied with state-of-the-art protocols to account for all potential confounders, DNA methylation arrays represent powerful tools for EWAS with more complex phenotypes that can also be successfully applied to non-Caucasian populations.
Electronic supplementary material
The online version of this article (doi:10.1186/s13148-014-0040-6) contains supplementary material, which is available to authorized users.
DNA methylation; Age; Gender; Smoking; Association study; Epigenetics
Background: The prevalence of type 2 diabetes (T2D) in Qatar and the Middle East is one of the highest in the world. It is estimated that about one quarter of the individuals with T2D are undiagnosed. Elevated HbA1c levels are an indicator of T2D or a pre-diabetic state. In this study we set out to examine which factors, such as anthropometric and socio-demographic risk factors, are associated with elevated HbA1c levels in a population without T2D. Methods: We examined 191 subjects with no record of T2D. Anthropometrics and HbA1c were measured. Socio-demographic (age, gender, ethnicity and educational level) and health information were assessed through questionnaires. Elevated HbA1c levels were defined as >6.0% (>42 mmol/mol). Individual risk factors were examined in relationship to having elevated HbA1c levels using logistic regression. Results: Thirty-eight (20%) study participants had elevated HbA1c levels. Participants of South Asian and Filipino descent were more likely to present with elevated HbA1c levels than Arab participants (adjusted odds ratios (OR): 13.30 (95% confidence interval (CI): 4.24, 41.79), p < 0.001 for South Asian and 4.54 (95% CI: 1.04, 19.83), p = 0.04 for Filipinos). A body mass index of above 30 kg/m² was associated with elevated HbA1c levels (adjusted OR: 2.90 (95% CI: 1.29, 6.51), p = 0.01). Neither gender nor educational level was associated with elevated HbA1c levels. Conclusions: Elevated HbA1c levels in individuals not diagnosed with diabetes were most frequently found in the South Asian and Filipino immigrant population. Special attention should therefore be given to the early identification of T2D in these subjects.
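To make the link between a logistic regression coefficient and the odds ratios reported above concrete, the following sketch converts between the two. It assumes a Wald-type 95% interval (symmetric on the log-odds scale with z = 1.96), which the abstract does not state; the function names are hypothetical.

```python
import math

Z_95 = 1.96  # ASSUMPTION: Wald-type 95% confidence interval


def odds_ratio_ci(beta, se, z=Z_95):
    """Turn a log-odds coefficient and its standard error into (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_ci(odds_ratio, lower, upper, z=Z_95):
    """Recover the log-odds coefficient and standard error from a reported OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Reported adjusted OR for body mass index above 30 kg/m²: 2.90 (95% CI: 1.29, 6.51)
beta, se = coef_from_ci(2.90, 1.29, 6.51)
or_bmi, lower, upper = odds_ratio_ci(beta, se)
print(f"beta = {beta:.3f}, SE = {se:.3f}, OR = {or_bmi:.2f} ({lower:.2f}, {upper:.2f})")

# Share of participants with elevated HbA1c: 38 of 191
prevalence = 38 / 191
print(f"prevalence = {prevalence:.1%}")
```

The round trip reproduces the published interval to within rounding, which is consistent with (but does not prove) a Wald interval having been used.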
HbA1c; undiagnosed type 2 diabetes; public health; pre-diabetes; ethnic differences
Using a nontargeted metabolomics approach of 447 fasting plasma metabolites, we searched for novel molecular markers that arise before and after hyperglycemia in a large population-based cohort of 2,204 females (115 type 2 diabetic [T2D] case subjects, 192 individuals with impaired fasting glucose [IFG], and 1,897 control subjects) from TwinsUK. Forty-two metabolites from three major fuel sources (carbohydrates, lipids, and proteins) were found to significantly correlate with T2D after adjusting for multiple testing; of these, 22 were previously reported as associated with T2D or insulin resistance. Fourteen metabolites were found to be associated with IFG. Among the metabolites identified, the branched-chain keto-acid metabolite 3-methyl-2-oxovalerate was the strongest predictive biomarker for IFG after glucose (odds ratio [OR] 1.65 [95% CI 1.39–1.95], P = 8.46 × 10⁻⁹) and was moderately heritable (h² = 0.20). The association was replicated in an independent population (n = 720, OR 1.68 [1.34–2.11], P = 6.52 × 10⁻⁶) and validated in 189 twins with urine metabolomics taken at the same time as plasma (OR 1.87 [1.27–2.75], P = 1 × 10⁻³). Results confirm an important role for catabolism of branched-chain amino acids in T2D and IFG. In conclusion, this T2D-IFG biomarker study has surveyed the broadest panel of nontargeted metabolites to date, revealing both novel and known associated metabolites and providing potential novel targets for clinical prediction and a deeper understanding of causal mechanisms.
Genome-wide association scans with high-throughput metabolic profiling provide unprecedented insights into how genetic variation influences metabolism and complex disease. Here we report the most comprehensive exploration of genetic loci influencing human metabolism to date, including 7,824 adult individuals from two European population studies. We report genome-wide significant associations at 145 metabolic loci and their biochemical connectivity regarding more than 400 metabolites in human blood. We extensively characterize the resulting in vivo blueprint of metabolism in human blood by integrating it with information regarding gene expression, heritability, overlap with known drug targets, previous association with complex disorders and inborn errors of metabolism. We further developed a database and web-based resources for data mining and results visualization. Our findings contribute to a greater understanding of the role of inherited variation in blood metabolic diversity, and identify potential new opportunities for pharmacologic development and disease understanding.
Motivation: Linking genes and functional information to genetic variants identified by association studies remains difficult. Resources containing extensive genomic annotations are available but often not fully utilized due to heterogeneous data formats. To enhance their accessibility, we integrated many annotation datasets into a user-friendly webserver.
Availability and implementation:
Supplementary data are available at Bioinformatics online.
With diminishing costs of next generation sequencing (NGS), whole genome analysis becomes a standard tool for identifying genetic causes of inherited diseases. Commercial NGS service providers in general not only provide raw genomic reads, but further deliver SNP calls to their clients. However, the question for the user arises whether to use the SNP data as is, or process the raw sequencing data further through more sophisticated SNP calling pipelines with more advanced algorithms.
Here we report a detailed comparison of SNPs called using the popular GATK multi-sample calling protocol with SNPs delivered by the Illumina CASAVA pipeline as part of a 40× whole-genome sequencing project by Illumina Inc. of 171 human genomes of Arab descent (108 unrelated Qatari genomes, 19 trios, and 2 families with rare diseases). GATK multi-sample calling identifies more variants than the CASAVA pipeline. The additional variants from GATK are robust in terms of Mendelian consistency but weak in terms of statistical parameters such as the Ts/Tv ratio. However, these additional variants do not make a difference in detecting the causative variants in the studied phenotype.
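The transition/transversion (Ts/Tv) ratio used above as a quality parameter can be computed directly from a list of SNP calls. The sketch below is a minimal illustration on invented calls, not the evaluation code used in the study; it assumes biallelic SNPs given as (reference, alternate) base pairs with the two bases differing.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """A transition swaps purine for purine (A<->G) or pyrimidine for pyrimidine (C<->T)."""
    pair = {ref.upper(), alt.upper()}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps) -> float:
    """Ts/Tv ratio for an iterable of (ref, alt) pairs; assumes ref != alt for every pair."""
    snps = list(snps)
    transitions = sum(1 for ref, alt in snps if is_transition(ref, alt))
    transversions = len(snps) - transitions
    return transitions / transversions if transversions else float("inf")


# Invented example calls: four transitions and two transversions.
calls = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
print(ts_tv_ratio(calls))
```

Because random errors produce transversions about twice as often as transitions, a set of extra calls with a markedly lower ratio than the rest of the genome is commonly read as a sign of a higher false-positive rate.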
Both pipelines, GATK multi-sample calling and Illumina CASAVA single-sample calling, have highly similar performance in SNP calling at the level of putatively causative variants.
Electronic supplementary material
The online version of this article (doi:10.1186/1756-0500-7-747) contains supplementary material, which is available to authorized users.
NGS; GATK; CASAVA; WGS pipeline; Mendelian inheritance; Qatari population; Multi-sample calling; Genotype calling; Variant; Trios; Illumina
With the help of epigenome-wide association studies (EWAS), increasing knowledge on the role of epigenetic mechanisms such as DNA methylation in disease processes is obtained. In addition, EWAS aid the understanding of behavioral and environmental effects on DNA methylation. In terms of statistical analysis, specific challenges arise from the characteristics of methylation data. First, methylation β-values represent proportions with skewed and heteroscedastic distributions. Thus, traditional modeling strategies assuming a normally distributed response might not be appropriate. Second, recent evidence suggests that not only mean differences but also variability in site-specific DNA methylation associates with diseases, including cancer. The purpose of this study was to compare different modeling strategies for methylation data in terms of model performance and performance of downstream hypothesis tests. Specifically, we used the generalized additive models for location, scale and shape (GAMLSS) framework to compare beta regression with Gaussian regression on raw, binary logit and arcsine square root transformed methylation data, with and without modeling a covariate effect on the scale parameter.
Using simulated and real data from a large population-based study and an independent sample of cancer patients and healthy controls, we show that beta regression does not outperform competing strategies in terms of model performance. In addition, Gaussian models for location and scale showed an improved performance as compared to models for location only. The best performance was observed for the Gaussian model on binary logit transformed β-values, referred to as M-values. Our results further suggest that models for location and scale are specifically sensitive towards violations of the distribution assumption and towards outliers in the methylation data. Therefore, a resampling procedure is proposed as a mode of inference and shown to diminish type I error rate in practically relevant settings. We apply the proposed method in an EWAS of BMI and age and reveal strong associations of age with methylation variability that are validated in an independent sample.
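The transformations of methylation β-values compared above can be written down in a few lines. This is a sketch of the standard definitions only (binary logit, i.e. the M-value, and the arcsine square root); it does not reproduce the GAMLSS models, and it assumes β lies strictly between 0 and 1, so any offset used in practice to handle boundary values is left out.

```python
import math


def m_value(beta: float) -> float:
    """Binary logit of a methylation proportion: M = log2(beta / (1 - beta))."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def beta_from_m(m: float) -> float:
    """Inverse transform: beta = 2^M / (2^M + 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)


def arcsine_sqrt(beta: float) -> float:
    """Arcsine square root transform of a proportion in [0, 1]."""
    return math.asin(math.sqrt(beta))


for b in (0.1, 0.5, 0.8):
    print(f"beta={b:.1f}  M={m_value(b):+.3f}  arcsine-sqrt={arcsine_sqrt(b):.3f}")
```

The logit stretches values near 0 and 1, which is one reason a Gaussian model on M-values can behave better than one on raw β-values.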
Models for location and scale are promising tools for EWAS that may help to understand the influence of environmental factors and disease-related phenotypes on methylation variability and its role during disease development.
DNA methylation; Beta regression; GAMLSS; Infinium HumanMethylation450k BeadChip; EWAS; Modeling variability; Resampling; Model performance; Model comparison; Models for location and scale
High-throughput screening techniques that analyze the metabolic endpoints of biological processes can identify the contributions of genetic predisposition and environmental factors to the development of common diseases. Studies applying controlled physiological challenges can reveal dysregulation in metabolic responses that may be predictive for or associated with these diseases. However, large-scale epidemiological studies with well-controlled physiological challenge conditions, such as extended fasting periods and defined food intake, pose logistic challenges. Culturally and religiously motivated behavioral patterns of lifestyle changes provide a natural setting that can be used to enroll a large number of study volunteers. Here we report a proof-of-principle study conducted within a Muslim community, showing that a metabolomics study during the Holy Month of Ramadan can provide a unique opportunity to explore the pre-prandial and postprandial response of human metabolism to nutritional challenges. Up to five blood samples were obtained from eleven healthy male volunteers, taken directly before and two hours after consumption of a controlled meal in the evening on days 7 and 26 of Ramadan, and after an overnight fast several weeks after Ramadan. The observed increases in glucose, insulin and lactate levels at the postprandial time point confirm the expected physiological response to food intake. Targeted metabolomics further revealed significant and physiologically plausible responses to food intake by an increase in bile acid and amino acid levels and a decrease in long-chain acyl-carnitine and polyamine levels. A decrease in the concentrations of a number of phospholipids between samples taken on days 7 and 26 of Ramadan shows that the long-term response to extended fasting may differ from the response to short-term fasting.
The present study design is scalable to larger populations and may be extended to the study of the metabolic response in defined patient groups such as individuals with type 2 diabetes.
Metabolomics; Nutritional challenging; Ramadan fasting; Study design; Clinical research
Individualized Medicine aims at providing optimal treatment for an individual patient at a given time based on his specific genetic and molecular characteristics. This requires excellent clinical stratification of patients as well as the availability of genomic data and biomarkers as prerequisites for the development of novel diagnostic tools and therapeutic strategies. The University Medicine Greifswald, Germany, has launched the “Greifswald Approach to Individualized Medicine” (GANI_MED) project to address major challenges of Individualized Medicine. Herein, we describe the implementation of the scientific and clinical infrastructure that allows future translation of findings relevant to Individualized Medicine into clinical practice.
Clinical patient cohorts (N > 5,000) with an emphasis on metabolic and cardiovascular diseases are being established following a standardized protocol for the assessment of medical history, laboratory biomarkers, and the collection of various biosamples for bio-banking purposes. A multi-omics based biomarker assessment including genome-wide genotyping, transcriptome, metabolome, and proteome analyses complements the multi-level approach of GANI_MED. Comparisons with the general background population as characterized by our Study of Health in Pomerania (SHIP) are performed. A central data management structure has been implemented to capture and integrate all relevant clinical data for research purposes. Ethical research projects on informed consent procedures, reporting of incidental findings, and economic evaluations were launched in parallel.
Personalized Medicine; Individualized Medicine; Epidemiology
The mechanism of action of antihypertensive and lipid-lowering drugs in the human organism is still not fully understood. New insights on the drugs’ action can be provided by a metabolomics-driven approach, which offers a detailed view of the physiological state of an organism. Here, we report a metabolome-wide association study with 295 metabolites in human serum from 1,762 participants of the KORA F4 (Cooperative Health Research in the Region of Augsburg) study population. Our intent was to find variations of metabolite concentrations related to the intake of various drug classes and, based on the associations found, to generate new hypotheses about on-target as well as off-target effects of these drugs. In total, we found 41 significant associations for the drug classes investigated: For beta-blockers (11 associations), angiotensin-converting enzyme (ACE) inhibitors (four assoc.), diuretics (seven assoc.), statins (ten assoc.), and fibrates (nine assoc.) the top hits were pyroglutamine, phenylalanylphenylalanine, pseudouridine, 1-arachidonoylglycerophosphocholine, and 2-hydroxyisobutyrate, respectively. For beta-blockers we observed significant associations with metabolite concentrations that are indicative of drug side-effects, such as increased serotonin and decreased free fatty acid levels. Intake of ACE inhibitors and statins was associated with metabolites that provide insight into the action of the drug itself on its target, such as an association of ACE inhibitors with des-Arg(9)-bradykinin and aspartylphenylalanine, a substrate and a product of the drug-inhibited ACE. The intake of statins, which reduce blood cholesterol levels, resulted in changes in the concentration of metabolites of the biosynthesis as well as of the degradation of cholesterol. Fibrates showed the strongest association with 2-hydroxyisobutyrate, which might be a breakdown product of fenofibrate and, thus, a possible marker for the degradation of this drug in the human organism.
The analysis of diuretics showed a heterogeneous picture that is difficult to interpret. Taken together, our results provide a basis for a deeper functional understanding of the action and side-effects of antihypertensive and lipid-lowering drugs in the general population.
Electronic supplementary material
The online version of this article (doi:10.1007/s10654-014-9910-7) contains supplementary material, which is available to authorized users.
Beta-blockers; Angiotensin-converting enzyme inhibitors; Diuretics; Statins; Fibrates; Metabolomics
We aimed to assess whether whole blood expression quantitative trait loci (eQTLs) with effects in cis and trans are robust and can be used to identify regulatory pathways affecting disease susceptibility.
Materials and Methods
We performed whole-genome eQTL analyses in 890 participants of the KORA F4 study and in two independent replication samples (SHIP-TREND, N = 976 and EGCUT, N = 842) using linear regression models and Bonferroni correction.
In the KORA F4 study, 4,116 cis-eQTLs (defined as SNP-probe pairs where the SNP is located within a 500 kb window around the transcription unit) and 94 trans-eQTLs reached genome-wide significance and overall 91% (92% of cis-, 84% of trans-eQTLs) were confirmed in at least one of the two replication studies. Different study designs including distinct laboratory reagents (PAXgene™ vs. Tempus™ tubes) did not affect reproducibility (separate overall replication overlap: 78% and 82%). Immune response pathways were enriched in cis- and trans-eQTLs and significant cis-eQTLs were partly coexistent in other tissues (cross-tissue similarity 40–70%). Furthermore, four chromosomal regions displayed simultaneous impact on multiple gene expression levels in trans, and 746 eQTL-SNPs have been previously reported to have clinical relevance. We demonstrated cross-associations between eQTL-SNPs, gene expression levels in trans, and clinical phenotypes as well as a link between eQTLs and human metabolic traits via modification of gene regulation in cis.
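The cis/trans distinction used above can be illustrated with a minimal sketch. It assumes that the 500 kb window extends on either side of the transcription unit (the text does not state whether the window is per side or in total); the `Transcript` type and the `classify_eqtl` function are hypothetical names introduced for illustration only and are not taken from the study's pipeline.

```python
from dataclasses import dataclass

# Assumption: 500 kb on EITHER side of the transcription unit.
CIS_WINDOW_BP = 500_000


@dataclass(frozen=True)
class Transcript:
    """Genomic location of a transcription unit (hypothetical helper type)."""
    chrom: str
    start: int  # first base of the transcription unit (bp)
    end: int    # last base of the transcription unit (bp)


def classify_eqtl(snp_chrom: str, snp_pos: int, transcript: Transcript,
                  window: int = CIS_WINDOW_BP) -> str:
    """Label a SNP-probe pair as "cis" or "trans".

    A pair is "cis" when the SNP is on the same chromosome as the
    transcription unit and lies within `window` bp of it; every other
    pair is "trans".
    """
    if snp_chrom != transcript.chrom:
        return "trans"
    if transcript.start - window <= snp_pos <= transcript.end + window:
        return "cis"
    return "trans"


# Illustrative coordinates only, not real loci.
gene = Transcript(chrom="1", start=1_000_000, end=1_050_000)
label_near = classify_eqtl("1", 1_400_000, gene)
label_far = classify_eqtl("1", 1_600_000, gene)
```

Under a narrower reading of the definition (500 kb in total), the `window` argument would be halved.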
Our data suggest that whole blood is a robust tissue for eQTL analysis and may be used both for biomarker studies and to enhance our understanding of molecular mechanisms underlying gene-disease associations.
The date palm is one of the oldest cultivated fruit trees. It is critical in many ways to cultures in arid lands by providing highly nutritious fruit while surviving extreme heat and environmental conditions. Despite its importance from antiquity, few genetic resources are available for improving the productivity and development of the dioecious date palm. To date there has been no genetic map and no sex chromosome has been identified.
Here we present the first genetic map for date palm and identify the putative date palm sex chromosome. We placed ~4000 markers on the map using nearly 1200 framework markers spanning a total of 1293 cM. We have integrated the genetic map, derived from the Khalas cultivar, with the draft genome and placed up to 19% of the draft genome sequence scaffolds onto linkage groups for the first time. This analysis revealed approximately 1.9 cM/Mb on the map. Comparison of the date palm linkage groups revealed significant long-range synteny to oil palm. Analysis of the date palm sex-determination region suggests it is telomeric on linkage group 12 and recombination is not suppressed in the full chromosome.
Based on a modified genotyping-by-sequencing approach, we have overcome challenges due to the lack of genetic resources and provide the first genetic map for date palm. Combined with the recent draft genome sequence of the same cultivar, this resource offers a critical new tool for date palm biotechnology, palm comparative genomics, and a better understanding of sex chromosome development in the palms.
Sex chromosome; Genotyping by sequencing; Comparative genomics
Emerging technologies based on mass spectrometry or nuclear magnetic resonance enable the monitoring of hundreds of small metabolites from tissues or body fluids. Profiling of metabolites can help elucidate causal pathways linking established genetic variants to known disease risk factors such as blood lipid traits.
We applied statistical methodology to dissect causal relationships between single nucleotide polymorphisms, metabolite concentrations, and serum lipid traits, focusing on 95 genetic loci reproducibly associated with the four main serum lipids (total, low-density lipoprotein, and high-density lipoprotein cholesterol, and triglycerides). The dataset used included 2,973 individuals from two independent population-based cohorts with data for 151 small molecule metabolites and four main serum lipids. Three statistical approaches, namely conditional analysis, Mendelian randomization, and structural equation modeling, were compared to investigate causal relationships within sets of a single nucleotide polymorphism, a metabolite, and a lipid trait associated with one another.
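Of the three approaches named above, Mendelian randomization is the simplest to illustrate. The sketch below shows the standard single-instrument Wald ratio, with the SNP as instrument, the metabolite as exposure, and the lipid trait as outcome. It is a generic textbook form under simplifying assumptions (a valid instrument, and a first-order standard error that ignores uncertainty in the SNP-metabolite effect), not necessarily the estimator used in this study; the function name and the example numbers are hypothetical.

```python
def wald_ratio(beta_snp_outcome: float, se_snp_outcome: float,
               beta_snp_exposure: float) -> tuple[float, float]:
    """Single-instrument Mendelian randomization (Wald ratio).

    beta_snp_outcome  : effect of the SNP on the outcome (e.g. a lipid trait)
    se_snp_outcome    : standard error of that effect
    beta_snp_exposure : effect of the SNP on the exposure (e.g. a metabolite)

    Returns (causal_estimate, standard_error). The standard error is the
    first-order approximation se_outcome / |beta_exposure|, which treats
    the SNP-exposure effect as known without error.
    """
    if beta_snp_exposure == 0:
        raise ValueError("SNP has no effect on the exposure; ratio undefined")
    estimate = beta_snp_outcome / beta_snp_exposure
    std_error = se_snp_outcome / abs(beta_snp_exposure)
    return estimate, std_error


# Hypothetical per-allele effects, for illustration only.
estimate, std_error = wald_ratio(beta_snp_outcome=0.10,
                                 se_snp_outcome=0.02,
                                 beta_snp_exposure=0.50)
```

The estimate is interpreted as the change in the lipid trait per unit change in the metabolite, provided the SNP affects the lipid only through that metabolite.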
A subset of three lipid-associated loci (FADS1, GCKR, and LPA) showed a statistically significant association with at least one main lipid and one metabolite concentration in our data, defining a total of 38 cross-associated sets of a single nucleotide polymorphism, a metabolite, and a lipid trait. Structural equation modeling provided sufficient discrimination to indicate that the association of a single nucleotide polymorphism with a lipid trait was mediated through a metabolite at 15 of the 38 sets, involving variants at the FADS1 and GCKR loci.
These data provide a framework for evaluating the causal role of components of the metabolome (or other intermediate factors) in mediating the association between established genetic variants and diseases or traits.
The cross talk between the stroma and cancer cells plays a major role in phenotypic modulation. During peritoneal carcinomatosis, ovarian cancer cells interact with mesenchymal stem cells (MSC), resulting in increased metastatic ability. Understanding the transcriptomic changes underlying the phenotypic modulation will allow identification of key genes to target. However, in the context of personalized medicine, we must consider inter- and intratumoral heterogeneity. In this study we used a pathway-based approach to illustrate the role of cell line background in transcriptomic modification during cross talk with MSC.
We used two ovarian cancer cell lines as surrogates for different ovarian cancer subtypes: OVCAR3 for an epithelial and SKOV3 for a mesenchymal subtype. We co-cultured them with MSCs. Genome-wide gene expression was determined after cell sorting. Ingenuity pathway analysis was used to decipher the cell-specific transcriptomic changes related to different pro-metastatic traits (adherence, migration, invasion, proliferation, and chemoresistance).
We demonstrate that co-culture of ovarian cancer cells in direct cellular contact with MSCs induces broad transcriptomic changes related to enhanced metastatic ability. Genes related to cellular adhesion, invasion, migration, proliferation, and chemoresistance were enriched under these experimental conditions. Network analysis of differentially expressed genes clearly shows a cell type-specific pattern.
Contact with the mesenchymal niche increases metastatic initiation and expansion through modification of the cancer cells’ transcriptome, in a manner dependent on the cellular subtype. Personalized medicine strategies might benefit from network analysis that reveals the subtype-specific nodes to target in order to disrupt the acquired pro-metastatic profile.
Ovarian cancer; Mesenchymal stem cell; Transcriptome; Genomic modification; Metastasis
Changes in an individual’s human metabolic phenotype (metabotype) over time can be indicative of disorder-related modifications. Studies covering several months to a few years have shown that metabolic profiles are often specific for an individual. This “metabolic individuality” and detected changes may contribute to personalized approaches in human health care. However, it is not clear whether such individual metabotypes persist over longer time periods. Here we investigate the conservation of metabotypes characterized by 212 different metabolites in 818 participants from the Cooperative Health Research in the Region of Augsburg, Germany, population, based on samples taken within a 7-year time interval. For replication, we used paired samples from 83 non-related individuals from the TwinsUK study. Results indicated that over 40 % of all study participants could be uniquely identified after 7 years based on their metabolic profiles alone. Moreover, 95 % of the study participants showed a high degree of metabotype conservation (>70 %), whereas the remaining 5 % displayed major changes in their metabolic profiles over time. These latter individuals were likely to have undergone important biochemical changes between the two time points. We further show that metabolite conservation was positively associated with heritability (rank correlation 0.74), although there were some notable exceptions. Our results suggest that monitoring changes in metabotypes over several years can trace changes in health status and may provide indications for disease onset. Moreover, our study findings provide a general reference for metabotype conservation over longer time periods that can be used in biomarker discovery studies.
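The notion of uniquely identifying a participant from a metabolic profile can be sketched as a matching problem. The example below is an illustration under an assumed criterion, not the study's actual procedure: a participant counts as identified when the follow-up profile most strongly Pearson-correlated with their baseline profile is their own. The function names and toy profiles are hypothetical, and profiles are assumed to be non-constant and of equal length.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def uniquely_identified(baseline: dict[str, list[float]],
                        follow_up: dict[str, list[float]]) -> set[str]:
    """Return the IDs whose baseline profile best matches their own follow-up.

    Assumed criterion: among all follow-up profiles, the one with the
    highest correlation to a participant's baseline profile must belong
    to the same participant.
    """
    identified = set()
    for pid, profile in baseline.items():
        best_match = max(follow_up,
                         key=lambda other: pearson(profile, follow_up[other]))
        if best_match == pid:
            identified.add(pid)
    return identified


# Toy data: "a" and "b" keep their profile shape, "c" changes markedly.
baseline = {"a": [1, 2, 3, 4], "b": [4, 3, 2, 1], "c": [1, 3, 2, 4]}
follow_up = {"a": [2, 3, 4, 5], "b": [4, 3, 2, 1], "c": [4, 1, 3, 2]}
matched = uniquely_identified(baseline, follow_up)
fraction_identified = len(matched) / len(baseline)
```

In practice, metabolite concentrations would typically be normalized before comparison so that a few high-abundance metabolites do not dominate the correlation.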
Electronic supplementary material
The online version of this article (doi:10.1007/s11306-014-0629-y) contains supplementary material, which is available to authorized users.
Metabolomics; Longitudinal study; Heritability; Population study
Genome-wide association studies (GWAS) have identified many common single nucleotide polymorphisms (SNPs) that associate with clinical phenotypes, but these SNPs usually explain just a small part of the heritability and have relatively modest effect sizes. In contrast, SNPs that associate with metabolite levels generally explain a higher percentage of the genetic variation and demonstrate larger effect sizes. Still, the discovery of SNPs associated with metabolite levels is challenging since testing all metabolites measured in typical metabolomics studies with all SNPs comes with a severe multiple testing penalty. We have developed an automated workflow approach that utilizes prior knowledge of biochemical pathways present in databases like KEGG and BioCyc to generate a smaller SNP set relevant to the metabolite. This paper explores the opportunities and challenges in the analysis of GWAS of metabolomic phenotypes and provides novel insights into the genetic basis of metabolic variation through the re-analysis of published GWAS datasets.
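The multiple testing penalty mentioned above can be made concrete with a short calculation. The sketch assumes a Bonferroni correction (the text does not name the correction used), and the SNP and metabolite counts are hypothetical round numbers chosen only to show the order of magnitude of the effect; they are not taken from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical study dimensions, for illustration only.
n_metabolites = 200
n_snps_genome_wide = 500_000   # every SNP tested against every metabolite
n_snps_pathway_set = 2_000     # SNPs near pathway-relevant genes, per metabolite

# Testing all SNP-metabolite pairs.
genome_wide = bonferroni_threshold(0.05, n_snps_genome_wide * n_metabolites)

# Testing only the pathway-derived SNP set for each metabolite.
pathway_based = bonferroni_threshold(0.05, n_snps_pathway_set * n_metabolites)

# Factor by which the per-test threshold is relaxed.
relaxation = pathway_based / genome_wide
```

With these illustrative numbers the pathway-based SNP set performs 250 times fewer tests, so the per-test threshold is 250 times less stringent.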
Re-analysis of the published GWAS dataset from Illig et al. (Nature Genetics, 2010) using a pathway-based workflow (http://www.myexperiment.org/packs/319.html) confirmed previously identified hits and identified a new locus of human metabolic individuality, associating aldehyde dehydrogenase 1 family member L1 (ALDH1L1) with serine/glycine ratios in blood. Replication in an independent GWAS dataset of phospholipids (Demirkan et al., PLoS Genetics, 2012) identified two novel loci supported by additional literature evidence: GPAM (glycerol-3-phosphate acyltransferase) and CBS (cystathionine beta-synthase). In addition, the workflow approach provided novel insight into the affected pathways and the relevance of some of these gene-metabolite pairs in disease development and progression.
We demonstrate the utility of automated exploitation of background knowledge present in pathway databases for the analysis of GWAS datasets of metabolomic phenotypes. We report novel loci and potential biochemical mechanisms that contribute to our understanding of the genetic basis of metabolic variation and its relationship to disease development and progression.
Genome-wide association; Metabolite; Genotype-phenotype prioritization; Bioinformatics; Pathway databases
Genome-wide association studies (GWAS) have identified many risk loci for complex diseases, but effect sizes are typically small and information on the underlying biological processes is often lacking. Associations with metabolic traits as functional intermediates can overcome these problems and potentially inform individualized therapy. Here we report a comprehensive analysis of genotype-dependent metabolic phenotypes using a GWAS with non-targeted metabolomics. We identified 37 genetic loci associated with blood metabolite concentrations, of which 25 exhibit effect sizes that are unusually high for GWAS and account for 10-60% of metabolite levels per allele copy. Our associations provide new functional insights for many disease-related associations that have been reported in previous studies, including cardiovascular and kidney disorders, type 2 diabetes, cancer, gout, venous thromboembolism, and Crohn’s disease. Taken together, our study advances our knowledge of the genetic basis of metabolic individuality in humans and generates many new hypotheses for biomedical and pharmaceutical research.
Systems biology enables the identification of gene networks that modulate complex traits. Comprehensive metabolomic analyses provide innovative phenotypes that are intermediate between the initiator of genetic variability, the genome, and raw phenotypes that are influenced by a large number of environmental effects. The present study combines two concepts, systems biology and metabolic analyses, in an approach without prior functional hypothesis in order to dissect genes and molecular pathways that modulate differential growth at the onset of puberty in male cattle. Furthermore, this integrative strategy was applied to specifically explore distinctive gene interactions of non-SMC condensin I complex, subunit G (NCAPG) and myostatin (GDF8), known modulators of pre- and postnatal growth that are only partially understood for their molecular pathways affecting differential body weight.
Our study successfully established gene networks and interacting partners affecting growth at the onset of puberty in cattle. We demonstrated the biological relevance of the created networks by comparison to randomly created networks. Our data showed that GnRH (Gonadotropin-releasing hormone) signaling is associated with divergent growth at the onset of puberty and revealed two highly connected hubs, BTC and DGKH, within the network. Both genes are known to directly interact with the GnRH signaling pathway. Furthermore, a gene interaction network for NCAPG containing 14 densely connected genes revealed novel information concerning the functional role of NCAPG in divergent growth.
Merging both concepts, systems biology and metabolomic analyses, successfully yielded new insights into gene networks and interacting partners affecting growth at the onset of puberty in cattle. Genetic modulation in GnRH signaling was identified as a key modifier of differential cattle growth at the onset of puberty. In addition, the benefit of our innovative concept without prior functional hypothesis was demonstrated by data suggesting that NCAPG might contribute to vascular smooth muscle contraction by indirect effects on the NO pathway via modulation of arginine metabolism. Our study shows for the first time in cattle that integration of genetic, physiological, and metabolomics data in a systems biology approach will enable (or contribute to) an improved understanding of metabolic and gene networks and genotype-phenotype relationships.
Cattle; SEGFAM; Systems biology; Metabolomics; Genome-wide association study; Divergent growth; Puberty