Combining in silico tools and expert manual curation, we produced an accurate genome-scale metabolic model of the oleaginous yeast Y. lipolytica, using a functional metabolic model of the phylogenetically related yeast S. cerevisiae as a scaffold for the reconstruction. The method developed in the present work can be used for genome-scale metabolic model reconstruction of other organisms, making it a useful tool for biotechnology and research.
We noticed that, although the list of S. cerevisiae reactions absent from Y. lipolytica was short, there was a substantial number of changes in gene associations between the two organisms. Also, the loss of some phenotypes in Y. lipolytica, compared to S. cerevisiae, was characterized by the loss of a small number of genes.
Thirteen new transport reactions were added to the new model to connect enzymatic reactions inside the peroxisome with molecular species in the cytosol, and to import species from the extracellular space into the cytosol. We could not find genes encoding all of these transporters, but we expect that the eventual characterization of the 1 034 (16%) Y. lipolytica genes of unknown function will provide evidence for some of them. The limited accuracy in predicting the outcome of some experiments could be explained by reactions missing from the model, especially those for the transport of specific carbon sources. This points to possible ways of improving the model.
The modifications to the draft model performed by the manual curators allowed us to formalize a set of edit operations over metabolic models. This facilitated an automatic iteration process, from improvements to the reconstruction method, to improved draft models, to automatic application of curator edits, to automatic assertion of accuracy.
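As an illustration of what a formalized set of edit operations can look like, the following minimal sketch stores curator edits as data and replays them on a draft model. It is not the implementation used in this work: the `Model` and `Reaction` layouts, the three operation names, and all reaction, metabolite, and gene identifiers are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Reaction:
    """A reaction with its stoichiometry and gene association (hypothetical layout)."""
    stoichiometry: dict      # metabolite id -> coefficient
    gene_rule: str = ""      # e.g. "G1 or G2"; empty if no gene is known


@dataclass
class Model:
    reactions: dict = field(default_factory=dict)   # reaction id -> Reaction


# Each edit is a plain tuple, so a curator's changes can be stored and replayed.
def add_reaction(rid, stoichiometry, gene_rule=""):
    return ("add", rid, stoichiometry, gene_rule)


def remove_reaction(rid):
    return ("remove", rid)


def set_gene_rule(rid, gene_rule):
    return ("set_rule", rid, gene_rule)


def apply_edits(model, edits):
    """Return a new model with the edits applied in order; the input is not mutated."""
    reactions = {
        rid: Reaction(dict(r.stoichiometry), r.gene_rule)
        for rid, r in model.reactions.items()
    }
    for edit in edits:
        kind, rid = edit[0], edit[1]
        if kind == "add":
            reactions[rid] = Reaction(dict(edit[2]), edit[3])
        elif kind == "remove":
            # Tolerate drafts that already lack the reaction.
            reactions.pop(rid, None)
        elif kind == "set_rule":
            # Skip silently if a newer draft no longer contains the reaction.
            if rid in reactions:
                reactions[rid].gene_rule = edit[2]
        else:
            raise ValueError(f"unknown edit kind: {kind}")
    return Model(reactions)


# Hypothetical draft with one scaffold-derived reaction, and two curator edits.
draft = Model({"R_scaffold_only": Reaction({"a_c": -1, "b_c": 1}, "SC_GENE")})
edits = [
    remove_reaction("R_scaffold_only"),
    add_reaction("T_perox", {"a_x": -1, "a_c": 1}),  # transport with no known gene
]
curated = apply_edits(draft, edits)
```

Because the edits are kept separate from the draft, the same list can be reapplied whenever the reconstruction method produces a new draft, which is the kind of iteration described above.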
The present model can be used to predict growth under different media conditions and gene knock-outs. It can also be used as a general description of the state of the art in Y. lipolytica metabolism. Data from high-throughput experiments, such as microarrays and metabolomics, can be mapped to this model to obtain an overview of metabolic changes under different media conditions.
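For reference, growth predictions of this kind are commonly obtained with flux balance analysis, which in its standard textbook form is the linear program sketched below. The symbols are not taken from this work and are assumptions of the sketch: S is the stoichiometric matrix, v the vector of reaction fluxes, c a vector selecting the biomass reaction, and l_j, u_j the lower and upper bounds on flux j.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& l_j \le v_j \le u_j \quad \text{for every reaction } j .
\end{aligned}
```

Under this formulation, a medium is typically described by the bounds on the exchange reactions, and a gene knock-out by setting l_j = u_j = 0 for every reaction whose gene association can no longer be satisfied. Growth is predicted when the optimal biomass flux exceeds a chosen threshold.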
Current understanding of Y. lipolytica is constantly improving, and a number of features of its metabolism are the subject of ongoing work that should in turn lead to improvements to the model. Multigene families such as POX1–POX6 in peroxisomal β-oxidation could be modeled with better precision, since there are enzymatic specificities linked to the length of the carbon chain (e.g. Pox2 for long-chain and Pox3 for short-chain fatty acids; see for example [26]). This is also true for the multigene family LIP1–LIP19 of triacylglyceride hydrolases, where chain-length specificity also exists [3], although the specificities of the ALK1–ALK19 genes are not completely known. In general, lipid metabolism in Y. lipolytica is still under study and there is a lack of knowledge in several areas, such as transport between compartments, or the link between nitrogen abundance and the production of either lipid or citric acid [11].
Expansion of families of isozymes is detectable through expansion of paralogous protein families, but the method used here cannot capture differences among isozymes because FBA does not differentiate isozyme activities in the same reaction. Dynamic models that describe the kinetics of individual enzymes in reactions must be developed. This will require acquiring and integrating metabolic and transcriptomic data for targeted pathways, and developing the corresponding models. Alvarez-Vasquez et al., for example, used biochemical systems theory to develop a model of S. cerevisiae sphingolipid metabolism; more recently, Gupta et al. developed a quantitative model of this pathway in mammalian cells by combining metabolite and transcriptome data in their estimation of kinetic rate constants. In general, the constraint-based FBA approach used here for validation cannot describe Y. lipolytica metabolic pathways with the same precision as dynamic differential equation models, but does have the merit of permitting a whole-genome model.
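The limitation regarding isozymes can be made concrete with a small sketch of how boolean gene associations are usually evaluated in constraint-based models. This is an illustration under stated assumptions, not the code used in this work: the function `reaction_is_active` is hypothetical, and the rule joining POX1 to POX6 with "or" under a single reaction is a deliberate simplification for the example, not the association used in the model.

```python
import re


def reaction_is_active(gene_rule, knocked_out):
    """Evaluate a boolean gene rule ('and' = complex subunits, 'or' = isozymes)."""
    tokens = re.findall(r"\(|\)|[A-Za-z0-9_]+", gene_rule)
    if not tokens:
        return True  # no gene association: assume the reaction stays available
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == "or":
            pos += 1
            rhs = parse_and()      # parse first so tokens are always consumed
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] == "and":
            pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = parse_or()
            pos += 1               # skip the closing parenthesis
            return value
        return tok not in knocked_out   # a gene is "on" unless knocked out

    return parse_or()


# Hypothetical rule: all six acyl-CoA oxidase genes attached to one reaction.
pox_rule = "POX1 or POX2 or POX3 or POX4 or POX5 or POX6"
single_ko_long = reaction_is_active(pox_rule, {"POX2"})
single_ko_short = reaction_is_active(pox_rule, {"POX3"})
all_ko = reaction_is_active(pox_rule, {f"POX{i}" for i in range(1, 7)})
```

With such a rule, deleting POX2 or POX3 gives the same answer (the reaction remains available), so the chain-length specificities of the individual enzymes are invisible to the model. Only the deletion of all six genes switches the reaction off.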
The most pressing need in further iterations of the model is refinement of alkane degradation for decane and hexadecane. Indeed, the analysis in [10] of the growth of ANT1 and ABC1 mutants on alkanes was performed on n-alkanes from C10 to C16, including C11, C13, and C15. Also, Y. lipolytica is described in [29] as growing on n-alkane paraffin (petroleum distillate) containing n-alkane oil (C12 to C18 n-alkanes) and also on n-paraffin wax (C20 and above, solid alkane). This suggests that it is necessary to introduce all even and odd chain lengths including C1, since Y. lipolytica could use very long alkane chains above C20.