A rigorous, rapid computational method for generating in silico tissue-specific metabolic models is presented. MBA carves a tissue-specific model out of a generic species model, based on existing literature and molecular data characterizing the tissue's metabolism. The resulting model satisfies stoichiometric, mass-balance and thermodynamic constraints. The algorithm is structured according to the reliability of its input data: the tissue-specific core is divided into a more reliable human-curated set (CH) and a set derived from omics data (CM), which are treated with different levels of confidence. Despite the heuristic nature of the algorithm, model construction is consistent: most reactions selected for the final model appear in the large majority of the candidate solutions. The algorithm is applied to construct the first stoichiometric genome-scale liver metabolic model. The resulting liver model can perform a range of hepatic metabolic functions and correctly depicts the metabolic profile of the liver under different physiological and genetic conditions, a task in which it outperforms the generic human model.
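The consistency claim above, that reactions chosen for the final model recur across most candidate solutions, can be illustrated with a minimal sketch. It assumes each candidate solution is simply a set of reaction identifiers. The function names, the toy reaction IDs, and the `min_fraction` threshold are hypothetical and are not taken from the MBA implementation.

```python
from collections import Counter


def reaction_frequencies(candidate_models):
    """Return, for each reaction, the fraction of candidate models containing it."""
    if not candidate_models:
        raise ValueError("need at least one candidate model")
    counts = Counter()
    for model in candidate_models:
        counts.update(set(model))  # count each reaction at most once per model
    n_models = len(candidate_models)
    return {rxn: count / n_models for rxn, count in counts.items()}


def recurrent_reactions(candidate_models, min_fraction=0.9):
    """Reactions present in at least `min_fraction` of the candidate models.

    `min_fraction` is an illustrative threshold, not a value from the paper.
    """
    freqs = reaction_frequencies(candidate_models)
    return {rxn for rxn, frac in freqs.items() if frac >= min_fraction}


# Hypothetical toy data: four candidate models over five reactions.
candidates = [
    {"R1", "R2", "R3"},
    {"R1", "R2", "R4"},
    {"R1", "R2", "R3"},
    {"R1", "R3", "R5"},
]
freqs = reaction_frequencies(candidates)
stable = recurrent_reactions(candidates, min_fraction=0.75)
```

In this toy case `R1` appears in every candidate, while `R4` and `R5` each appear in only one. A high share of reactions with frequencies near 1 would indicate that the heuristic converges on similar models across runs.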
MBA is akin to a recently published method by Christian et al (2009) in that it defines a set of reactions as a core and attempts to carve out the minimal model that contains this core, but it differs in that it is based on optimization rather than on a network expansion procedure that does not consider the stoichiometric constraints. MBA has its pros and cons. On the pros side, it is a generic and fast approach to generating tissue-specific models, which are fairly accurate and useful. It is practically unlimited in its scalability and can process a large variety of data sources. Importantly, using cross-validation, one can readily obtain a quantitative assessment of the model's consistency and reliability. On the cons side, several limitations should be noted. First, because the approach hinges upon a generic species model, its accuracy depends on the quality of that model. This dependency may be alleviated in the future with an extended computational approach that includes the possibility of adding reactions from a universal pool during model construction. Second, the accuracy of the molecular omics data used to determine the tissue core is an obvious limiting factor. To mitigate this effect, our approach treats human-curated pathway data preferentially and requires evidence from multiple molecular sources; even so, it is likely that not all inaccuracies in the molecular data are filtered out. Bearing these potential caveats in mind, the accuracy of the liver model generated here is quite remarkable.
As the final liver model contains more than 95% of the CM core, we further tested the added value of having a moderate-reliability core over simply including all of its reactions in the resulting liver model (as is done for the high-reliability core CH). To this end, we constructed a model that has a fixed, predetermined core composed of both the original CH and CM reaction sets (i.e., now all defined as CH reactions), and studied its ability to predict flux alterations. The resulting model is denoted the strict model, because its construction does not tolerate the removal of any core reaction. The strict model performs similarly to the liver model in predicting inner fluxes, obtaining AUCs of 0.6061 and 0.6433 for increasing and decreasing fluxes, respectively. However, comparing the models by their ability to predict extracellular fluxes given the intracellular fluxes, in a global five-fold cross-validation test (Supplementary Figure 4) as well as in biomarker prediction (Supplementary Figure 5), shows the advantage of the liver model built by the standard, flexible MBA version. Although the liver model contains most of the CM reactions, 100 reactions distinguish it from the strict model. These reactions are involved in 22 metabolic pathways (see Supplementary information, data set II, for a full description) and affect the performance of the models.
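For readers unfamiliar with the AUC values quoted above, the following minimal sketch computes the area under the ROC curve as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The scores and labels are hypothetical toy values, not data from this study, and the function is a generic illustration rather than the evaluation code used here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the positive
    example has the higher score; tied pairs count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: a predicted tendency for each flux to increase, and
# whether an increase was actually measured.
predicted_change = [0.9, 0.8, 0.4, 0.35, 0.1]
measured_increase = [True, False, True, False, False]
value = auc(predicted_change, measured_increase)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values slightly above 0.6 indicate a modest but better-than-chance ability to rank fluxes by their direction of change.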
[Table: Content of the input and output sets of MBA]
The MBA approach opens up opportunities for many promising future applications. First and foremost, the liver model developed in this study can help in the rational design of bioengineered artificial devices that emulate hepatic metabolism (e.g., BAL; Yang et al, 2009). Hepatocytes, the main components of the BAL (Strain and Neuberger, 2002), tend to rapidly lose their functionality owing to metabolic transformations (e.g., lipid accumulation and reduced ammonia removal). Applying the liver metabolic model to optimize the functionality of these cells, and to predict ways of inhibiting the metabolic processes that contribute to this unwanted transformation, can help overcome this hurdle. Second, MBA can serve for the rapid development of metabolic models for a variety of human tissues, providing a computational opportunity to probe the metabolism of tissues such as the kidney, heart, and brain on a genomic scale. Third, MBA can be used to generate tissue models for any organism for which a generic model exists, such as Mus musculus (Sheikh et al, 2005; Quek and Nielsen, 2008) and the model plant Arabidopsis thaliana (Poolman et al, 2009