Genomics, transcriptomics, proteomics, and metabolomics have delivered large arrays of data, allowing one to correlate physiological states with patterns of gene expression, protein levels, and metabolite abundance. A major challenge in the analysis and interpretation of these data is deriving models of causation from correlations. Mouse models of diabetes provide a unique method for exploring correlation structure, since metabolic dysregulation creates a window for simultaneous application of multiple “omic” technologies.
We have previously shown that diabetes traits show strong heritability in an F2 intercross between the diabetes-resistant C57BL/6 leptin(ob/ob) and the diabetes-susceptible BTBR leptin(ob/ob) mouse strains. We assume that the disease phenotype is brought about by a complex pattern of gene expression changes in key tissues. However, we also recognize the complexity inherent in discriminating the gene expression changes that cause diabetes from those that occur as a consequence of the disease. For example, many genes are known to be responsive to elevated blood glucose levels. Through correlation alone, it is difficult to distinguish these “reactive” genes from ones that are “causal” for the disease.
We have taken advantage of the high heritability of mRNA abundance phenotypes and, using microarray technology, have mapped the loci controlling gene expression at the genome-wide level. This establishes at least one node in a network, simply because genetic variation leads to changes in gene expression and not vice versa. However, it does not establish whether the link between a locus and a phenotype is direct or via multiple steps and pathways.
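One common way to ask whether a locus acts on a metabolite through a transcript is a conditional-independence check: if the chain locus → transcript → metabolite holds, the correlation between locus and metabolite should vanish once the transcript is controlled for. The sketch below illustrates that idea on simulated data using partial correlation. It is not the analysis used in this study; the function names, effect sizes, and genotype coding (0, 1, 2) are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate_chain(n=5000, seed=0):
    """Simulate a hypothetical chain: genotype -> transcript -> metabolite."""
    rng = random.Random(seed)
    # F2 genotypes coded as allele counts, drawn in the expected 1:2:1 ratio.
    genotype = [rng.choice([0, 1, 1, 2]) for _ in range(n)]
    # The transcript depends on genotype; the metabolite depends only on the transcript.
    transcript = [g + rng.gauss(0, 1) for g in genotype]
    metabolite = [t + rng.gauss(0, 1) for t in transcript]
    return genotype, transcript, metabolite


if __name__ == "__main__":
    g, t, m = simulate_chain()
    print("locus vs metabolite:", round(pearson(g, m), 3))
    print("locus vs metabolite | transcript:", round(partial_corr(g, m, t), 3))
    print("locus vs transcript | metabolite:", round(partial_corr(g, t, m), 3))
```

In this simulated chain, conditioning on the transcript removes the locus-metabolite association, whereas conditioning on the metabolite leaves the locus-transcript association in place. With real data the contrast is rarely this clean, and a vanishing partial correlation is compatible with more than one network.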
The purpose of the current study was to explore the possibility that the levels of metabolites in tissues are sufficiently heritable in an F2 intercross to provide significant linkage signals, leading to metabolic QTL. Given that many pathways converge upon common metabolites and that these pathways have multiple controllers, any one genetic locus may not alter metabolite levels significantly, and therefore may not be identified as a metabolite QTL. Nonetheless, in our F2 sample, we found significant linkage signals, including some that are quite strong (e.g. tyrosine: LOD>7, p<0.005; chromosome 2).
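For readers unfamiliar with LOD scores, the following minimal sketch shows how a score can be computed at a single marker by comparing a model with one mean per genotype class against a model with a single overall mean (simple marker regression with normal residuals). It is an illustration only: the mapping method actually used in the study is not described in this section, and the genotype labels and phenotype values below are invented.

```python
import math
from collections import defaultdict


def lod_single_marker(genotypes, phenotype):
    """LOD score at one marker: (n/2) * log10(RSS_null / RSS_genotype).

    genotypes: one label per animal (for an F2, three classes are expected).
    phenotype: one numeric trait value per animal, e.g. a metabolite level.
    Assumes the genotype model leaves some residual variation (RSS > 0).
    """
    n = len(phenotype)
    grand_mean = sum(phenotype) / n
    # Null model: every animal shares a single mean.
    rss_null = sum((y - grand_mean) ** 2 for y in phenotype)

    # Genotype model: each genotype class has its own mean.
    by_genotype = defaultdict(list)
    for g, y in zip(genotypes, phenotype):
        by_genotype[g].append(y)
    rss_geno = 0.0
    for values in by_genotype.values():
        class_mean = sum(values) / len(values)
        rss_geno += sum((y - class_mean) ** 2 for y in values)

    return (n / 2) * math.log10(rss_null / rss_geno)


if __name__ == "__main__":
    # Hypothetical labels: B = C57BL/6 allele, R = BTBR allele.
    marker = ["BB", "BB", "BR", "BR", "RR", "RR"]
    trait = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    print("LOD:", round(lod_single_marker(marker, trait), 2))
```

A genome scan repeats this calculation at every marker, and significance thresholds are usually set empirically, for example by permuting phenotypes among animals.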
Our results reveal that metabolites can be mapped to distinct genetic regions, much like mRNA transcripts. Although QTL mapping in an F2 sample does not provide sufficient resolution to identify individual genes with high certainty, it can yield novel information about regulatory networks. Phenotypes mapping to the same locus can be hypothesized to be co-regulated by that locus. With our definition of “phenotype” now including transcripts, metabolites, and physiological traits, we can begin to devise relationships between these phenotypes and genetic regions.
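The idea of co-mapping can be made concrete with a small sketch that groups phenotypes whose QTL peaks fall close together on the same chromosome. The phenotype names, peak positions, and the 10 cM window are all hypothetical and chosen only to show the grouping logic; they are not results from this study.

```python
def group_comapping(peaks, window_cm=10.0):
    """Group phenotypes whose QTL peaks lie near each other.

    peaks: iterable of (phenotype_name, chromosome, position_cM) tuples.
    Two neighbouring peaks on the same chromosome join the same group
    when the gap between them is at most window_cm.
    """
    groups = []
    for name, chrom, pos in sorted(peaks, key=lambda p: (p[1], p[2])):
        last = groups[-1] if groups else None
        if last and last["chrom"] == chrom and pos - last["end"] <= window_cm:
            # Close enough to the previous peak: extend the current group.
            last["members"].append(name)
            last["end"] = pos
        else:
            # New chromosome or a large gap: start a new group.
            groups.append({"chrom": chrom, "start": pos, "end": pos, "members": [name]})
    return groups


if __name__ == "__main__":
    # Invented peaks mixing metabolite and transcript phenotypes.
    example = [
        ("metabolite_A", "9", 40.0),
        ("transcript_X", "9", 45.0),
        ("transcript_Y", "9", 80.0),
        ("metabolite_B", "2", 30.0),
    ]
    for grp in group_comapping(example):
        print(grp["chrom"], grp["start"], grp["end"], grp["members"])
```

Each resulting group is a candidate set of co-regulated phenotypes. Because F2 support intervals are wide, such groups suggest shared regulation by a region rather than by a single gene.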
This F2 study provides evidence of co-regulation of biologically related pathways. An example is the correlations we found between amino acids and short-chain acyl-carnitine derivatives. These findings are consistent with our understanding of metabolic physiology. In a catabolic, “glucose starved” state, muscle degrades proteins and delivers amino acids to the liver for glucose production. The liver transaminates amino acids to corresponding α-keto acid gluconeogenic substrates. Alpha-ketoglutarate is often the α-keto acid acceptor for these transaminase reactions, generating glutamate as a product. Glutamate, which can also be generated from glutamine in the glutaminase reaction, is then deaminated by glutamate dehydrogenase to produce ammonia, which is fixed through the urea cycle. Additionally, hepatic fatty acid oxidation and amino acid catabolism yield even- and odd-numbered short-chain acyl-CoAs, which can be used for fuel and for production of ketone bodies. These short-chain acyl-CoA species are readily converted to the cognate carnitine esters, which we have profiled by MS/MS in this study.
The amino acid metabolites provide the most striking evidence of functional clustering. We see in both the correlation matrix and the genetic linkage data that the majority of amino acids group together. However, a subset of the amino acids (asx, glx, arginine, and ornithine) uniquely maps to chromosome 7. Our data predict that these metabolites are driven by different genetic regulators, leading to a unique mapping signature, even within a group of highly correlated metabolites. The C/EBP transcription factors have been shown to alter expression of enzymes acting in the urea cycle and gluconeogenic pathway, and the C/EBPα isoform is encoded on chromosome 7. Although we cannot determine that metabolites are mapping to the same individual genes, we can identify genetic regions that coordinate groups of metabolites and transcripts and contain plausible candidate genes.
The relationship between mRNA transcripts and metabolites, however, can be bi-directional. Our network identifies a specific metabolite, glx, that regulates gene expression. This is consistent with previous studies in which glutamine alone increases hepatic expression of argininosuccinate synthetase and phosphoenolpyruvate carboxykinase but, when combined with other essential amino acids, alters additional transcripts of urea cycle and gluconeogenic pathways. Our work extends these prior observations by showing that glutamine also changes expression of Agxt and Slc1a2 but does not alter Slc38a3, despite the positive correlation with this transcript. The combination of pathway construction based on transcriptional and metabolic profiling and direct model testing in living cells provides evidence for a new pathway by which glx can regulate a key gluconeogenic enzyme. Future studies will be needed to investigate whether this pathway is perturbed in the development of diabetes.
The glutamine-induced reduction in Slc1a2 expression was unexpected given that this glutamate transporter is upstream of glx in the best-proposed causal network (solid lines). Slc1a2 mRNA abundance, however, maps in trans (to a locus distinct from the physical location of the gene) to chromosome 9, with its expression QTL (eQTL) overlapping the glx metabolite QTL (mQTL). It is therefore possible that glutamine could regulate Slc1a2, as indicated by the second causal network (dotted lines). Several studies have shown that Slc1a2 expression in astrocytes is reduced by increased ammonia. Despite the positive correlation between Slc1a2 and glx in vivo, the glutamine-treated hepatocytes produce ammonia via glutaminase, which could decrease expression of hepatic Slc1a2 in vitro. We also did not predict altered expression of Ivd, which encodes an enzyme of leucine oxidation. It is interesting to note that Ivd is a case where a gene maps both in cis (to the locus containing the Ivd gene) and in trans, here overlapping with the glx mQTL on chromosomes 2 and 13. Studies have shown that glutamine has an inverse relationship with leucine oxidation, and this could be mediated by a glutamine-induced decrease in Ivd expression.
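The cis/trans distinction used above can be expressed as a simple rule: an eQTL is called cis when its peak lies near the gene it controls, and trans otherwise. The sketch below encodes that rule. The 10 Mb window, the function name, and the example coordinates are assumptions for illustration; the criteria actually applied in the study are not stated in this section.

```python
def classify_eqtl(gene_chrom, gene_pos_mb, qtl_chrom, qtl_pos_mb, cis_window_mb=10.0):
    """Label an eQTL as 'cis' or 'trans' relative to the gene's own location.

    A peak on the gene's chromosome within cis_window_mb of the gene is 'cis'.
    Any other peak, including a distant one on the same chromosome, is 'trans'.
    """
    same_chrom = gene_chrom == qtl_chrom
    if same_chrom and abs(gene_pos_mb - qtl_pos_mb) <= cis_window_mb:
        return "cis"
    return "trans"


if __name__ == "__main__":
    # Hypothetical gene on chromosome 7 with two linkage peaks.
    gene = ("7", 50.0)
    peaks = [("7", 55.0), ("9", 30.0)]
    labels = [classify_eqtl(gene[0], gene[1], c, p) for c, p in peaks]
    print(labels)  # a transcript can carry both a cis and a trans eQTL
```

A transcript with both labels, as described for Ivd, indicates local control plus regulation from elsewhere in the genome.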
We show that the combined use of eQTL and mQTL with correlations allows one to derive a network and establish data-driven hypotheses about metabolite and gene expression relationships. For example, glycine and serine are the two amino acids most highly correlated with glx, and the transcript most highly correlated with glx is Agxt (Table S2). Indeed, in our experiments, Agxt was upregulated by glutamine. We hypothesize that the upregulation of Agxt by glx is one mechanism by which glx is correlated with glycine and serine, since Agxt catalyzes the transamination of glyoxylate to form glycine, which can then be converted to serine. In further support of this hypothesis, in the F2 sample, serine and glycine correlate with Agxt (r>0.5, p<0.01).
The concurrent use of transcriptomics and metabolomics is not limited to one biochemical pathway. For example, the correlation between amino acids and transcripts of carbohydrate and lipid metabolism might reflect a broader signaling function of amino acids beyond pathways of protein metabolism. Furthermore, this correlation, co-mapping, and causal network analysis can uncover roles for transcripts of unknown function. We note that Riken clones and ESTs are among the transcripts highly correlated to individual metabolites (Table S3). By incorporating these transcripts of unknown function as nodes into causal networks, along with transcripts from known pathways, we may infer the functions of these previously unidentified mRNA species.
In conclusion, this study shows that metabolites, in addition to transcripts and physiological traits, can be mapped to genetic regions, providing a powerful tool to establish connections between genetic loci and physiological traits. The groups of metabolites and transcripts that are correlated or co-map to physiological traits in our F2 sample may offer insight into metabolic pathways that are causal or reactive to diabetes pathology.