This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Pseudomonas putida is the best-studied pollutant-degrading bacterium and is harnessed by industrial biotechnology to synthesize fine chemicals. Since the publication of P. putida KT2440's genome, some in silico analyses of its metabolic and biotechnological capacities have been published. However, a global understanding of the capabilities of P. putida KT2440 requires the construction of a metabolic model that enables the integration of classical experimental data along with genomic and high-throughput data. The constraint-based reconstruction and analysis (COBRA) approach has been successfully used to build and analyze in silico genome-scale metabolic reconstructions.
We present a genome-scale reconstruction of P. putida KT2440's metabolism, iJN746, which was constructed based on genomic, biochemical, and physiological information. This manually-curated reconstruction accounts for 746 genes, 950 reactions, and 911 metabolites. iJN746 captures biotechnologically relevant pathways, including polyhydroxyalkanoate synthesis and catabolic pathways of aromatic compounds (e.g., toluene, benzoate, phenylacetate, nicotinate), not described in other metabolic reconstructions or biochemical databases. The predictive potential of iJN746 was validated using experimental data including growth performance and gene deletion studies. Furthermore, in silico growth on toluene was found to be oxygen-limited, suggesting the existence of oxygen-efficient pathways not yet annotated in P. putida's genome. Moreover, we evaluated the production efficiency of polyhydroxyalkanoates from various carbon sources and found fatty acids as the most prominent candidates, as expected.
Here we present the first genome-scale reconstruction of P. putida, a biotechnologically interesting all-rounder. Taken together, this work illustrates the utility of iJN746 as i) a knowledge base, ii) a discovery tool, and iii) an engineering platform to explore P. putida's potential in bioremediation and bioplastic production.
Pseudomonas putida is a non-pathogenic member of rRNA group I of the genus Pseudomonas that colonizes many different environments and is well known for its broad metabolic versatility and genetic plasticity [1,2]. P. putida KT2440 is a TOL plasmid-cured, spontaneous restriction-deficient derivative of P. putida mt-2 [3,4]. This strain represents the first host-vector biosafety system for cloning in gram-negative soil bacteria and hence has been extensively used as a host for gene cloning and expression of heterologous genes [5-8]. Consequently, large efforts have been made to exploit these capacities in a diverse range of biotechnological applications, including i) bioremediation of contaminated areas [9,10]; ii) quality improvement of fossil fuels, e.g., by desulphurization; iii) biocatalytic production of fine chemicals [9,12-14]; iv) production of bioplastic [15-17]; and v) use as agents of plant growth promotion and plant pest control [18,19].
Since the publication of P. putida KT2440's genome, our knowledge about this strain has significantly increased and various "-omics" data sets have become available, such as transcriptomic [22,23], proteomic, and fluxomic data [25,26]. Subsequently, some in silico analyses of its metabolic and biotechnological capacities have been published [27,28]. However, a systemic understanding of the metabolic and biotechnological capabilities of P. putida KT2440 requires the construction of a more comprehensive model enabling the integration of the canonical experimental data along with genomic and high-throughput data in a hierarchical and coherent fashion.
The constraint-based reconstruction and analysis (COBRA) approach is one possible modeling approach that uses stoichiometric information about the biochemical transformations taking place in a target organism to construct the model. While a metabolic reconstruction is unique to the target organism, one can derive many different condition-specific models from a single reconstruction. This conversion of a metabolic reconstruction of an organism into models requires the imposition of physicochemical and environmental constraints to define systems boundaries [30-32]. The conversion also includes the transformation of the reaction list into a computable, mathematical matrix format. In this so-called S matrix, where S stands for stoichiometric, the rows correspond to the network metabolites and the columns to the network reactions. The coefficients of the substrates and products of each reaction are entered in the corresponding cell of the matrix. This conversion can be done automatically (e.g., using the Matlab-based COBRA toolbox). Once in this format, numerous mathematical tools can be used to interrogate the metabolic network properties in silico. Many of the published mathematical tools have been reviewed and encoded in Matlab format. A large subset of these tools relies on linear programming (LP), a mathematical tool used to find a solution to an optimization problem (e.g., the maximal possible growth rate of a metabolic network under a given set of environmental constraints). While LP-based tools are very helpful in studying reconstructed metabolic networks, some questions may be better addressed without having to choose an objective function. Those methods are called unbiased methods, in contrast to biased LP-based methods, because they identify all feasible flux distributions under the given set of environmental constraints rather than only the optimal distributions.
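The conversion of a reaction list into an S matrix can be illustrated with a minimal sketch. The three-reaction network below is hypothetical and is not taken from iJN746; the function names `build_s_matrix` and `mass_balance` are likewise illustrative and do not correspond to the COBRA toolbox. The sketch only shows the row/column convention described above (rows are metabolites, columns are reactions, substrates negative, products positive) and the steady-state check S·v = 0.

```python
# Hypothetical toy network (not part of iJN746):
#   R_uptake:  -> A
#   R_convert: A -> 2 B
#   R_export:  B ->
# Substrates carry negative coefficients, products positive ones.
reactions = {
    "R_uptake": {"A": 1},
    "R_convert": {"A": -1, "B": 2},
    "R_export": {"B": -1},
}


def build_s_matrix(reactions):
    """Return (metabolites, reaction_ids, S); rows = metabolites, columns = reactions."""
    reaction_ids = list(reactions)
    metabolites = sorted({m for stoich in reactions.values() for m in stoich})
    S = [[reactions[r].get(m, 0) for r in reaction_ids] for m in metabolites]
    return metabolites, reaction_ids, S


def mass_balance(S, fluxes):
    """Compute S . v; a vector of zeros means the flux distribution is at steady state."""
    return [sum(coeff * v for coeff, v in zip(row, fluxes)) for row in S]


metabolites, reaction_ids, S = build_s_matrix(reactions)
steady = mass_balance(S, [1, 1, 2])      # balanced flux distribution
unbalanced = mass_balance(S, [1, 1, 1])  # B accumulates
```

An LP-based method would then search, among all flux vectors satisfying S·v = 0 and the imposed bounds, for the one that maximizes a chosen objective such as the biomass reaction.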
The COBRA approach [30,32] has been successfully used to build and analyze genome-scale in silico reconstructions for representatives of archaea (e.g., Methanosarcina barkeri), of bacteria (e.g., E. coli; B. subtilis; H. pylori; M. tuberculosis [39,40]; S. aureus [41,42]; L. lactis), and of eukarya (e.g., human). The numerous mathematical tools have been used for i) identification and filling of knowledge gaps (e.g., missing gene annotations); ii) prediction of the outcome of adaptive evolution [46-48]; iii) design of engineered production strains; and iv) the understanding of topological features of metabolic networks [50-53]. A recent review illustrates the variety of questions that have been addressed to E. coli's metabolic network using different biased and unbiased COBRA methods.
Here, we describe a highly detailed, genome-scale metabolic reconstruction of Pseudomonas putida KT2440. Based on the naming convention for metabolic networks, this genome-scale reconstruction was named iJN746, where i stands for in silico, JN are the initials of the constructor, and 746 corresponds to the number of included metabolic genes. The reconstruction was built using the COBRA approach [30,32] and validated using flux balance analysis (FBA). The in silico metabolic network was further evaluated by comparing i) predicted growth rate capacities on different carbon sources and ii) predicted essential genes with experimental data from P. putida KT2440 and P. aeruginosa. Finally, we show the utility of the P. putida reconstruction to analyze its biodegradative (i.e., toluene degradation) and biotechnological (i.e., bioplastic production) capacities.
The metabolic reconstruction of P. putida KT2440, iJN746, was constructed based on its annotated genome sequence, primary and review publications, various genetic and biochemical databases (i.e., KEGG Database, PSEUDOCYC, and SYSTOMONAS), and biochemical information found in Pseudomonas-specific and biochemical textbooks.
iJN746 accounts for 746 open reading frames (ORFs), whose corresponding gene products are involved in 810 metabolic and transport reactions (Table 1). A total of 140 non-gene-associated reactions were included in iJN746 based on physiological evidence in the literature supporting their presence in P. putida's metabolism. Hence, the reconstruction captures a total of 950 metabolic reactions and 911 metabolites distributed over three different cellular compartments: cytoplasm, periplasm, and extracellular space. Each metabolite was placed in one or more of these compartments depending on the cellular localization of the catalyzing enzyme, and the flux across outer and inner membranes was enabled by transport reactions.
The reactions included in iJN746 were divided into 55 specific pathways, or subsystems, based on their functional role (Figure 1A). In general, the transport subsystem was found to be the subsystem with the highest number of gene-associated reactions, highlighting the importance of cellular transport for P. putida. This observation agrees well with the known lifestyle of P. putida and reflects the fact that approximately 12% of the P. putida genome encodes transport-associated gene products. Reactions related to amino acid metabolism were also found to be very important, since the de novo synthesis pathways for all 20 amino acids are present in P. putida's genome. Moreover, P. putida is known for its capability to utilize many amino acids as a carbon and nitrogen source [21,60]. A third group of great importance contained reactions involved in aromatic acid degradation pathways, which reflects the physiological ability of P. putida to use many of these compounds as a carbon and energy source (see Figure 2). Furthermore, despite the absence of the TOL pathway in KT2440's genome, the plasmid genes and the corresponding reactions were included in the P. putida metabolic reconstruction, since the TOL plasmid is present in the parental strain P. putida mt-2 and this paradigmatic plasmid is often used to expand P. putida KT2440's metabolic capacities [6,12]. Finally, reactions associated with lipid metabolism constituted another important subsystem group. In fact, P. putida KT2440 can synthesize and accumulate medium-side-chain polyhydroxyalkanoates (msc-PHAs), which are lipid-related polymers, from a wide range of carbon sources [17,61]. This ability is of special interest for biotechnological purposes (reviewed in [62,63]) and therefore, we incorporated both the msc-PHA biosynthetic and TOL biodegradative pathways into the metabolic reconstruction (see below).
Every network reaction was associated with a confidence score based on the available evidence for its presence in the P. putida metabolic network (Figure 1B). For instance, reactions whose enzymes have been biochemically studied in P. putida received a confidence score of 4. If physiological or genetic knockout information was available, a score of 3 was associated with the network reaction. Reactions associated with enzymes that were only annotated in P. putida's genome but had no further experimental evidence were given a confidence score of 2. Finally, during the evaluation of the network functionality (i.e., biomass precursor production), some reactions had to be added to the network for which no genetic or experimental evidence could be found. Those reactions represent modeling hypotheses, which need further experimental validation and thus received a confidence score of 1. Upon completion, the reconstruction had an overall average confidence score of 2.83. In fact, two thirds of P. putida's metabolic pathways have been very well or well studied, while only a third of the subsystems were primarily based on the genome annotation (Figure 1B). This high level of confidence is also reflected by the number of references that led to this metabolic reconstruction. Almost 90% of the internal reactions (844) have at least one associated citation, while a total of 176 unique primary and review publications were reviewed and incorporated into this reconstruction. Consequently, this first genome-scale reconstruction of P. putida's metabolism represents a comprehensive knowledge base summarizing and categorizing the information currently available. The content of this knowledge base will be easily accessible through the BiGG database http://BiGG.ucsd.edu.
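The scoring scheme above can be written down as a small lookup, shown here as a hedged sketch. The evidence labels, the reaction identifiers, and the helper `mean_confidence` are hypothetical names introduced for illustration; only the four score levels (4, 3, 2, 1) come from the text, and the toy reaction list does not reproduce the reported average of 2.83.

```python
# Score levels as described in the text; the label strings are illustrative.
EVIDENCE_SCORES = {
    "biochemical": 4,               # enzyme biochemically studied in P. putida
    "physiological_or_genetic": 3,  # physiological or knockout evidence
    "annotation_only": 2,           # supported only by the genome annotation
    "modeling_hypothesis": 1,       # added to make the network functional
}


def mean_confidence(reaction_evidence):
    """Average confidence score over a mapping of reaction id -> evidence label."""
    scores = [EVIDENCE_SCORES[label] for label in reaction_evidence.values()]
    return sum(scores) / len(scores)


# Hypothetical reactions, for illustration only.
toy_reactions = {
    "RXN_1": "biochemical",
    "RXN_2": "physiological_or_genetic",
    "RXN_3": "annotation_only",
    "RXN_4": "annotation_only",
}

average = mean_confidence(toy_reactions)  # (4 + 3 + 2 + 2) / 4
```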
The properties of iJN746 were compared with the properties of recently published reconstructions of E. coli MG1655 (iAF1260), B. subtilis (iYO844), M. tuberculosis H37Rv (iNJ661), and P. aeruginosa PAO1 (iMO1056) (Table 1). We found that the percentage of included ORFs was smaller in iJN746 than in the other reconstructions. Consequently, it can be expected that the number of metabolic functions present in P. putida is larger than currently identified in the genome annotation and literature. In fact, the number of included non-gene-associated reactions was twice that of the E. coli metabolic reconstruction. Furthermore, the species knowledge index (SKI), which relates the number of PubMed abstracts of an organism to its number of ORFs, was much lower for P. putida compared to the other reconstructions. In summary, this comparison indicates that the overall content coverage in iJN746 is comparable with other high-quality network reconstructions when the amount of available literature is considered.
A metabolic reconstruction for another representative of the genus Pseudomonas was published recently. A comparison of the P. putida and P. aeruginosa metabolic reconstructions was performed (Table 1). In contrast to P. putida, P. aeruginosa is an opportunistic human pathogen and as such more information about its metabolism and physiology is available, which is directly reflected by a SKI value 7 times higher than that of P. putida (Table 1). As a consequence, a larger number of metabolic genes were included in the P. aeruginosa metabolic reconstruction (14% of P. putida's genome vs. 18% of P. aeruginosa's genome). Despite being close relatives, these two representatives have significant differences in lifestyle and metabolic capabilities. Consequently, the two metabolic reconstructions have significant differences, emphasizing the importance of organism-specific reconstructions. For instance, the P. aeruginosa reconstruction contains pathways necessary for growth and production of common virulence factors, including alginate, rhamnolipids, phenazines, and quorum-sensing molecules, which are not present in P. putida's metabolic network. In contrast, P. aeruginosa's metabolic network does not account for pathways necessary to degrade aromatic compounds.
Flux balance analysis (FBA) can provide insight into the growth capabilities of the reconstructed network. Comparison of in silico growth performance with experimental data allows for the assessment of the predictive potential of the metabolic reconstruction and thus represents a valuable tool for network evaluation. Furthermore, in silico growth analysis may expand the known array of carbon, nitrogen, and energy sources of the reconstructed organism. In this study, the aerobic growth capabilities of iJN746 in iM9 medium supplemented with different carbon sources were determined qualitatively (Table 2) and quantitatively (Table 3). The growth simulation results reflected the metabolic versatility for which P. putida is well known, with a total of 59 carbon sources enabling in silico growth when added to the iM9 minimal medium (Table 2). Furthermore, we compared the in silico growth performance on different carbon, sulfur, and nitrogen sources with phenotyping data derived from the literature [see Additional file 1]. For instance, P. putida is found in terrestrial and aquatic environments around the world, with a preference for the rhizosphere, which is especially rich in carbon sources, amino acids, organic acids, and aromatic acids derived from seeds, roots, and other plant parts [66,67]. This niche specificity accounts for the broad carbon source usage of KT2440 and therefore, most of the known soil carbon sources were captured in iJN746 (Table 2). Of particular biotechnological importance is the ability of iJN746 to metabolize aromatic compounds, making it the first metabolic reconstruction to account for growth on these carbon sources. For example, aromatic compounds such as toluene or xylene are of special interest as they are archetypical pollutants. We therefore studied the toluene degradation process using iJN746 (see below).
No false positive carbon, nitrogen, or sulfur sources were found in iJN746. This was expected, since exchange reactions were included in the reconstruction only for metabolites that have been reported to be taken up or secreted by P. putida KT2440. In contrast, some disagreements, such as false negatives, were observed despite a good overall agreement with the in vivo data [Additional file 1]. For example, it was reported that P. putida can use L-alanine as a carbon and nitrogen source, but iJN746 cannot use this compound as a carbon or nitrogen source. This disagreement could not be resolved. In contrast, iJN746 was initially unable to use choline-O-sulphate, choline, or glycine betaine as carbon and nitrogen sources despite experimental evidence. However, the addition of two non-gene-associated reactions, betaine-homocysteine S-methyltransferase (EC 2.1.1.5) and dimethylglycine dehydrogenase (EC 1.5.99.2), enabled iJN746 to use these metabolites as carbon and nitrogen sources through glycine metabolism. In addition, choline-O-sulphate could also be used as a sulfur source [see Additional file 1]. The two added reactions represent a hypothesis that needs further experimental verification. These examples show how discrepancies between in silico predictions and physiological properties can be used to drive new discoveries, as was shown for E. coli.
P. putida KT2440, like other Pseudomonas species and rhizosymbionts, has an incomplete glycolytic pathway because of a missing 6-phosphofructokinase. However, P. putida KT2440 has a complete Entner-Doudoroff pathway, which allows for the utilization of glucose and other sugars as carbon sources (Table 2). Therefore, we investigated the properties of glucose metabolism in iJN746 to validate and evaluate the reconstructed network. For instance, comparison of predicted in silico growth with experimental data permits a direct assessment of the predictive potential of a reconstructed metabolic network. We therefore determined the aerobic growth capability of iJN746 in glucose-M9 minimal medium (iM9). Interestingly, iJN746 grew faster on glucose than experimental in vivo data suggested for P. putida KT2442 (Table 3). A similar difference in growth rate between in vivo measurements and in silico predictions was reported for P. aeruginosa. The difference in growth rate might be explained by an incomplete formulation of the biomass function, by higher energy maintenance requirements not accounted for in the current reconstruction [30,36], or by missing adaptation to glucose as the primary carbon source. Another explanation could be that P. putida KT2442 converts only a part of the glucose into biomass. In fact, a recent study showed that P. putida KT2442 accumulated low extracellular concentrations of gluconate and 2-ketogluconate when grown on glucose. P. putida metabolizes glucose exclusively via the Entner-Doudoroff pathway, in which 6-phosphogluconate is the key intermediate. This compound is produced by three convergent pathways: the glucokinase branch, the gluconokinase branch, and the 2-ketogluconate loop (Figure 3). The latter two pathways produce gluconate and 2-ketogluconate as intermediate compounds of glucose catabolism. iJN746 accounts for these alternate routes and the corresponding transport reactions for gluconate and 2-ketogluconate.
Aromatic compounds such as toluene or xylene are found in polluted soil. Some Pseudomonas species are known to grow on these compounds as a sole carbon source, making them interesting candidates for bioremediation of contaminated areas [9,10]. As indicated above, P. putida KT2440 can metabolize various aromatic acids, amino acids, sugars, organic acids, fatty acids, and organo-sulfur compounds (see Table 2). More specifically, P. putida KT2440 degrades many aromatic compounds into a limited number of intermediates using a few catabolic pathways that were captured in iJN746 (Figure 2). In particular, the toluene biodegradation pathway has been extensively studied in P. putida [73-75] and its genetic regulation is well known. In this study, we assessed the capability of iJN746 to quantitatively predict aerobic growth on toluene (Table 3). The comparison showed a much lower in silico growth rate than observed in vivo: 0.421 versus 0.72 h-1, i.e., only about 60% of the experimental value (Table 3). In the following, we used different mathematical tools to elucidate reasons for this significant discrepancy.
Linear programming (LP) problems have two parameters, shadow price and reduced cost, which can be used to characterize the optimal solution. While shadow prices are associated with each network metabolite, reduced costs are associated with each network reaction. The reduced cost signifies the amount by which the objective function (e.g., growth rate) would increase if the flux through a chosen reaction were increased by a single unit. Analyses of the reduced costs associated with uptake rates in the oxygen-limited toluene simulations identified the oxygen uptake rate (OUR) as the only uptake rate with a non-zero reduced cost, 0.021 g biomass/gDW/h. Given this value, the OUR would have to increase to 33 mmol oxygen/gDW/h to achieve the experimentally determined growth rate. At an OUR higher than 62 mmol oxygen/gDW/h, oxygen is no longer a growth-limiting factor but toluene is. Note that the upper limit of 18.5 mmol oxygen/gDW/h for the OUR was taken from measurements for E. coli corresponding to the normal oxygen diffusion rate under atmospheric oxygen conditions. Mathematically, the reduced cost analysis supports the hypothesis that oxygen is the limiting factor for toluene catabolism and hence causes the reduced in silico growth rate.
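The step from the reduced cost to the required OUR can be retraced with a short calculation. The sketch below is a linear extrapolation using only numbers quoted in the text (in silico growth rate 0.421, in vivo growth rate 0.72, reduced cost 0.021 per unit of OUR, default OUR limit 18.5). It assumes that the reduced cost stays constant over the whole range, which an LP guarantees only while the optimal basis does not change, so the result should be read as an estimate. The variable and function names are illustrative.

```python
# Values quoted in the text.
MU_IN_SILICO = 0.421        # predicted growth rate on toluene (h-1)
MU_IN_VIVO = 0.72           # experimentally observed growth rate (h-1)
REDUCED_COST_OUR = 0.021    # growth-rate gain per extra mmol oxygen/gDW/h
OUR_LIMIT = 18.5            # default upper bound on OUR (mmol oxygen/gDW/h)


def required_uptake(current_limit, mu_now, mu_target, reduced_cost):
    """Linearly extrapolate the uptake bound needed to reach mu_target.

    Assumption: the reduced cost is constant between the two growth rates.
    """
    return current_limit + (mu_target - mu_now) / reduced_cost


required_our = required_uptake(OUR_LIMIT, MU_IN_SILICO, MU_IN_VIVO, REDUCED_COST_OUR)
print(f"Estimated OUR needed: {required_our:.1f} mmol oxygen/gDW/h")
```

The extrapolated value of roughly 32.7 is consistent with the 33 mmol oxygen/gDW/h reported above.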
We performed a phase plane analysis to further elucidate the correlation between the toluene uptake rate (TUR), OUR, and biomass production rate (Figure 4). We analyzed all four cases listed in Table 3 and found a direct effect of increased OUR on the toluene uptake capability and biomass production rate (Figure 4A). The experimentally observed maximal growth rate (μmax) of 0.72 h-1 was achieved with a TUR ranging from 6 to 11.9 mmol toluene/gDW/h and an OUR higher than 33 mmol oxygen/gDW/h. Note that a higher TUR requires a higher OUR (Figure 4A), which indicates that the removal of intracellular oxygen was dependent on toluene availability. In fact, the three oxidative reactions involved in the conversion of toluene to 2-hydroxymuconate semialdehyde (toluene monooxygenase, benzoate 1,2-dioxygenase, and catechol 2,3-dioxygenase) were found to have the highest flux rates besides the flux through cytochrome c oxidase, an enzyme of oxidative phosphorylation (Figure 4B).
In order to better understand this situation, and since no detailed information about the OUR was found for P. putida KT2440 under toluene-dependent growth conditions, we carried out in vivo experiments to determine the OUR of P. putida KT2440 harboring the TOL plasmid (see Methods). As expected, the OUR in toluene-growing cells was higher than in glucose- or octanoate-growing cells: 20.93 compared to 15.34 and 14.88 mmol oxygen/gDW/h, respectively (Table 3). The measured OUR for growth on toluene did not explain the high oxygen requirement of the model, but clearly indicates the importance of oxygen uptake in toluene metabolism. Also, the measured OUR was slightly higher than the E. coli value that was used for the standard in silico simulations (20.93 vs. 18.5 mmol oxygen/gDW/h). In fact, oxygen-dependent growth of toluene-grown cells has been described for other P. putida strains. For example, Alagappan and Cowan reported a 10× higher oxygen half-saturation for P. putida F1 grown on toluene than for other aerobic organisms. Furthermore, the oxidative stress caused by toluene and other aromatic acids in the degradative process is well known [23,80]; however, this phenomenon was found to be mainly caused by reactive oxygen species due to incomplete oxygen reduction, indicating an active oxygen metabolism under this growth condition. Oxygen-limiting growth conditions were also reported for P. putida when grown on octanoate.
Taken together, our analysis suggests that the current P. putida metabolic network is incomplete. In fact, the current information and results suggest that the network is missing one or more reactions enabling a more oxygen-efficient catabolism of toluene and other highly reduced carbon sources (e.g. other aromatic compounds or fatty acids). This analysis represents a nice example of the broad range of applications for which iJN746 can be used to evaluate the consistency of experimental data and in silico prediction. iJN746 can serve as a platform to derive hypotheses about metabolic capabilities or missing functions in the network which can be ultimately tested in the laboratory. Hence, the metabolic reconstruction can help to increase our understanding and knowledge about this biotechnologically important organism.
iJN746 was used as a framework to analyze candidate essential genes in P. putida KT2440 in LB rich medium. To this end, the network reaction(s) associated with each gene were individually "deleted" by setting the flux to 0 and optimizing for the biomass function. We wished to compare the in silico essentiality predictions with experimental data to assess the predictive potential of the model. However, no large-scale experimental gene essentiality data are available for P. putida; such information can only be found for its phylogenetic relatives P. aeruginosa PAO1 and P. aeruginosa PA14 [82,83]. A recently published comparison between the P. putida and P. aeruginosa PAO1 genomes identified 3,143 potential orthologous pairs corresponding to 60% of P. putida's total ORFs, as well as large sections of conserved gene order (synteny). Therefore, we decided to compare our in silico single gene deletion results with the 335 essential metabolic and non-metabolic genes of P. aeruginosa [82,83]. About 12% (92) of the 746 metabolic genes present in iJN746 were predicted to be essential in iLB medium [see Additional file 2]. Of these predicted essential genes, 48 (about 52%) agreed with essential genes of P. aeruginosa [see Additional file 3]. More importantly, the 44 genes wrongly predicted as essential represent excellent targets for further refinement and expansion of the metabolism of iJN746 [see Additional file 4], as has been done for E. coli.
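The first half of an in silico gene deletion, deciding which reactions lose their catalyst when one gene is removed, can be sketched as follows. The representation is an assumption made for illustration: each reaction lists its alternative enzymes (isozymes), and each enzyme is the set of genes it needs. Gene and reaction identifiers are hypothetical and the function `blocked_reactions` is not part of any published toolbox. The subsequent step, fixing the flux of the blocked reactions to 0 and re-optimizing the biomass function by LP, is not shown.

```python
# Hypothetical gene-reaction associations (not taken from iJN746).
# Outer list: alternative enzymes (isozymes); inner set: genes of one enzyme.
gene_reaction_rules = {
    "RXN_single": [{"geneA"}],              # one gene, no backup
    "RXN_isozymes": [{"geneB"}, {"geneC"}],  # either gene suffices
    "RXN_complex": [{"geneA", "geneD"}],     # both subunits required
    "RXN_orphan": [],                        # non-gene-associated reaction
}


def blocked_reactions(rules, deleted_gene):
    """Return reactions that have no functional enzyme left after the deletion.

    A reaction stays active if at least one of its enzymes does not need the
    deleted gene. Non-gene-associated reactions are never blocked.
    """
    blocked = []
    for reaction, enzymes in rules.items():
        if enzymes and all(deleted_gene in enzyme for enzyme in enzymes):
            blocked.append(reaction)
    return blocked


print(blocked_reactions(gene_reaction_rules, "geneA"))
print(blocked_reactions(gene_reaction_rules, "geneB"))
```

Deleting `geneB` blocks nothing because `geneC` encodes an isozyme, which mirrors why genes with alternative loci tend to be predicted as non-essential.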
The disagreement between the experimental and computational results can reveal possible errors in the experimental data as well as in the reconstructed network. The disagreements might be caused by weak experimental or sequence evidence, either of which would have hindered the inclusion of the information into the reconstruction. For example, the fabB gene was predicted to be essential only in iJN746; however, after carrying out a detailed search of the Pseudomonas genomes using "The Pseudomonas Genome Database V2" http://www.Pseudomonas.com/ we found putative ORFs in the KT2440 and PAO1 genomes. These ORFs were annotated as alternative loci that could compensate for a fabB deletion. Both P. putida and P. aeruginosa have one copy of the fabB gene encoding the 3-oxoacyl-(acyl-carrier-protein) synthase I (PP_4175 and PA1609, respectively). In addition, both strains have a copy of the fabF gene encoding the 3-oxoacyl-(acyl-carrier-protein) synthase II (PP_1916, 40.92% identity with the fabB-KT gene, and PA2965, 42.34% identity with the fabB-PAO1 gene). Moreover, in the P. putida and P. aeruginosa genomes, some ORFs were putatively annotated to encode a 3-oxoacyl-(acyl-carrier-protein) synthase II: PP_3303 (35.94% identity) and PP_2780 (27.32% identity) in KT2440, and PA_1373 (36.17% identity) in the PAO1 strain. These putative ORFs were not included in iJN746 due to the lack of supporting evidence for their metabolic function, but this analysis showed that i) PAO1 has an isozyme present in its genome, and ii) KT2440 is very likely to have at least one other ORF encoding this or a similar function. The discrepancy between the in silico essentiality prediction and the in vitro observation for the msbA gene could be explained in a similar way. The msbA gene encodes a transporter of phosphatidylethanolamine, for which genetic redundancy is known in Pseudomonas sp. according to the Pseudomonas annotation present in "The Pseudomonas Genome Database V2". However, the supporting evidence for the alternative ORFs was not strong enough for them to be included in iJN746.
Finally, 37 genes were not predicted to be essential in iJN746 but were reported as essential genes in P. aeruginosa [see Additional files 4 and 3]. Of these false negatives, 13 genes encode tRNA synthetases, which are typically included in metabolic networks but are not functionally connected to the rest of the network. Hence, this disagreement was expected. Four additional false negative predictions, namely glyA (PP_0322 or PP_0671), folD (PP_1945 or PP_2265), fabZ (PP_4174 or PP_1602), and pyrH (PP_1771 or 1593), have at least one isozyme in KT2440, which was also accounted for in iJN746. For many of the remaining incorrectly predicted non-essential genes, the in silico deletion had a significant effect on the growth rate, reflecting their important roles in iJN746 metabolism [see Additional file 5].
In general, many of these discrepancies suggest that metabolites enabling growth in the knock-outs might be imported from the external rich media since the exact composition of LB medium is not known [37,38]. This observation indicates the importance of using well defined minimal media in the experimental in vivo or in vitro procedure to enable the usage of the generated data for in silico predictions and comparison.
Jacobs et al. reported a detailed amino acid auxotroph study in P. aeruginosa PAO1 using a minimal medium. We carried out another single gene deletion study in glucose iM9 medium and compared the results with this PAO1 study. Here, we found complete agreement between in vivo and in silico gene essentiality for six amino acids, namely arginine, histidine, isoleucine, valine, leucine, and tryptophan (Table 4). The presence of alternative loci in iJN746 explains the partial disagreement for argA, argE, ilvA, and argJ. In fact, genetic redundancy for these genes was reported in Pseudomonas species. This high correlation between in silico and in vivo data shows the utility of this approach when metabolic or anabolic reactions are analyzed in a well-defined minimal medium. The complete list of potential essential genes predicted in glucose iM9 medium is given in Additional file 6.
In the previous section, we used the metabolic reconstruction to assess the current knowledge of P. putida's metabolism by comparing and testing in silico predictions with physiological data. However, metabolic network reconstructions can also serve as engineering and design tools in addition to their use for discovery purposes. Here, we investigate the poly-3-hydroxyalkanoate (PHA) production capability of the metabolic network. PHAs are a class of microbially produced polyesters that have the potential to replace conventional, petrochemically derived plastics in packaging and coating applications. The biotechnological interest originates from their biodegradability and the broad range of physical properties depending on the number of carbons and side chains present in the PHA polymers. These polymers are stored by many microorganisms under inorganic nutrient-limited and carbon-excess growth conditions and are used as carbon and energy sources under starvation conditions. The medium-side-chain PHAs (msc-PHAs) are composed of C6 to C16 3-hydroxy fatty acids and are commonly produced by fluorescent Pseudomonas. P. putida KT2440 is therefore an excellent candidate for msc-PHA production studies, since i) the basic msc-PHA production processes in KT2440 are well known [17,61], ii) its genome is completely sequenced, iii) KT2440 has a well-known metabolic versatility (it can use a large number of carbon sources as PHA precursors), iv) it is a very good host-vector biosafety system for gene cloning and expression of heterologous genes, and v) this strain has been used in numerous biotechnological processes including msc-PHA production.
iJN746 accounts for msc-PHAs ranging from C6 to C14, including two unsaturated msc-PHAs and a mixed msc-PHA polymer consisting of C8 to C12 chains. We tested the msc-PHA production capability of iJN746 from the different carbon and energy sources listed in Table 2. All carbon sources were found to result in msc-PHA production under the chosen simulation condition (dilution rate of 0.2 h^-1). Many of these metabolites have been reported to yield PHA in Pseudomonas [see Additional file 7], although many studies focused on fatty acid or carbohydrate derived msc-PHAs. In general, it is assumed that carbon sources generating high levels of acetyl-CoA are good candidates for PHA production. Therefore, it was not surprising to find fatty acids and carbohydrates as the best PHA precursors in iJN746 as well (Figure 5). The list of candidate (in silico) precursors includes i) L-branched-chain amino acids (L-leucine, L-isoleucine, L-valine, etc.), ii) some aromatic compounds metabolized via the β-ketoadipate pathway (catechol, p-coumarate, etc.), and iii) other compounds (phenylacetic acid or glycerol) (Figure 5). Interestingly, phenylacetic acid and glycerol have been reported as excellent precursors for PHA [Additional file 7]. In fact, a recent study showed that P. putida CA3 can accumulate 0.17 g of PHA per g of phenylacetate.
Fatty acids resulted in the highest PHA production rate, both overall and when scaled per carbon (see Figure 5 and Additional file 7). In fact, fatty acids are converted into msc-PHAs quickly via β-oxidation. Experimental studies showed that the resulting msc-PHA monomers have the same or a smaller number of carbons as the fatty acids from which they are derived [61,85]. In contrast, in the model, higher-carbon msc-PHAs could be formed since the current model formulation does not exclude simultaneous fatty acid synthesis and β-oxidation. This situation has been experimentally demonstrated using hexanoate as a msc-PHA precursor. Huijberts et al. used inhibitors of fatty acid metabolism and demonstrated that, depending on the nature of the substrate, precursors for PHA synthesis could be derived from either β-oxidation or fatty acid biosynthesis. Interestingly, when hexanoate was used as the carbon source for msc-PHA accumulation, both routes could operate simultaneously. Carbohydrates, on the other hand, are converted into msc-PHA from intermediates of fatty acid synthesis and have been shown to result primarily in C8 and C10 monomers. The model, in contrast, is able to produce the full range of msc-PHAs from carbohydrates (Figure 5). These discrepancies suggest that, despite the broad specificity of the poly-(3-hydroxyalkanoate) polymerase, ranging from C6 to C16 3-hydroxy fatty acids, the PHA polymerizing enzyme system might prefer monomers with 8 or 10 carbon atoms, while larger and smaller monomers are incorporated less efficiently. This preference could also explain why, during growth on hexanoate, msc-PHA precursors are synthesized by elongation and the de novo fatty acid synthesis pathway, resulting preferentially in C8 and C10 monomers. Such differences in specific activity could be applied as additional constraints to the model to obtain results similar to those observed experimentally.
Taken together, this example illustrates how iJN746 could be employed as a tool to identify new substrates (catechol, p-coumarate, isoleucine, etc.) for production of the different msc-PHA monomers or msc-PHA mixtures. Furthermore, computational tools such as OptKnock or OptStrain could help to i) design higher-producing strains and/or ii) couple PHA production to growth rate. Such approaches have proven successful for other metabolic engineering designs such as lactate production in E. coli or succinate production in M. succiniciproducens.
Here, we presented the first genome-scale reconstruction of P. putida, a biotechnologically interesting all-rounder. iJN746 is a highly detailed reconstruction of the P. putida KT2440 metabolic network that captures the important biotechnological capabilities of this paradigmatic bacterium, such as biodegradation of aromatic compounds. Moreover, iJN746 represents a comprehensive knowledge base summarizing and categorizing the information currently available for P. putida KT2440. This study evaluated the metabolic network content and showed some examples of how iJN746 could be used for biotechnological purposes. Taken together, our results underline the value of iJN746 as a tool with which the P. putida community can study P. putida's metabolism and its biotechnological applications.
P. putida KT2440 harboring the TOL plasmid was used for in vivo determination of the oxygen uptake rate (OUR). The bacterium was grown at 30°C in M9 minimal medium with octanoate (15 mM), glucose (0.3% [wt/vol]), or toluene (6 mM) as a carbon source. Liquid cultures were agitated on a gyratory shaker operated at 250 rpm. For the OUR experiment, an overnight culture of the P. putida KT2440 strain grown on each carbon source was diluted in fresh M9 minimal medium with the appropriate carbon source until the turbidity at 600 nm (OD600) was 0.05. Samples were then incubated until the culture reached a turbidity at 600 nm of 0.6 for cells growing on glucose or octanoate and 0.45 for cells growing on toluene. Aliquots of 2 ml were taken for OUR determination; the cells were harvested by centrifugation, washed twice, and re-suspended in 1 ml of fresh medium containing the appropriate carbon source at the above concentrations. The OUR was measured by monitoring the substrate-dependent oxygen consumption rate at 30°C using an oxygen electrode (DW1 Hansa-Tech Oxygen Electrode, Hansa-Tech Oxygen Instrument Limited) in a 1-ml assay mixture. Cellular dry weight (CDW) was determined using previously published methods, using at least 3 parallel 10-ml cell suspensions that were harvested by centrifugation at 15,800 × g. The pellets were washed with 0.9% NaCl and then dried at 105°C for 24 h to a constant weight using pre-dried and weighed 2-ml Eppendorf cups.
The reconstruction process was done as described previously. Briefly, the genome annotation of P. putida KT2440 was obtained from TIGR (http://cmr.tigr.org/tigr-scripts/CMR/GenomePage.cgi?org=gpp, 06/27/2007) and was used as the framework of the network reconstruction. P. putida-specific primary and review literature and books were used to retrieve information about every network reaction: i) substrate specificity, ii) coenzyme specificity, iii) reaction directionality, iv) enzyme and reaction localization, and v) gene-protein-reaction (GPR) association. Relevant references were associated with every network reaction [see Additional files 7 and 8]. Public databases such as KEGG, PSEUDOCYC, and SYSTOMONAS were used when no literature evidence could be found for the previous reaction characteristics. Spontaneous reactions were included in the reconstruction if i) physiological evidence suggested their presence (e.g., the presence of at least the substrate or product in the reconstruction); and ii) textbooks or KEGG suggested the existence of such reactions. Every network reaction was mass- and charge-balanced assuming an intracellular pH of 7.2 [38,55]. Note that this mass and charge balancing also included balancing the network reactions for protons (H+), water (H2O), and various cofactors (e.g., adenosine triphosphate (ATP)). Non-gene-associated reactions were included when no corresponding gene was annotated in P. putida's genome but physiological or experimental data supported the presence of the biochemical transformation in P. putida's metabolism. Finally, the reversibility was determined from primary literature data for each particular enzyme or reaction, if available. This literature search resulted in a first manually curated reconstruction specific to P. putida's metabolism based on genome annotation and available biochemical evidence.
However, this initial reaction list is normally incomplete and will contain network gaps that may need to be filled depending on supporting evidence. This step again requires manual effort in searching the scientific literature for supporting information. If no P. putida-specific experimental evidence could be found for a transport reaction or biochemical transformation of a metabolite, no reaction or transporter was added to the network. Finally, the network capabilities were evaluated and compared with experimental data as described in Reed et al. Detailed lists of the genes, proteins, and reactions are contained in Additional file 8, and the definitions of all metabolites and their abbreviations are found in Additional file 9.
SimPheny (Genomatica Inc., San Diego, CA) software was used for the reconstruction and gap evaluation process.
The reconstructed metabolic network is often represented in a tabular format, listing all network reactions and metabolites in a human-readable manner along with confidence scores and comments (see Reed et al. for details). The conversion into a mathematical, or computer-readable, format can be done automatically by parsing the stoichiometric coefficients from the network reaction list (e.g., using the COBRA toolbox). The mathematical format is called a stoichiometric matrix, or S-matrix, where the rows correspond to the network metabolites and the columns represent the network reactions. For each reaction, the stoichiometric coefficients of the substrates are listed with a minus sign in the corresponding cell of the matrix, while the product coefficients are positive numbers, by definition. The resulting size of the S-matrix is m × n, where m is the number of metabolites and n the number of network reactions. Mathematically, the S-matrix is a linear transformation of the flux vector v = (v1, v2, ..., vn) to a vector of time derivatives of the concentration vector x = (x1, x2, ..., xm) as $\mathrm{d}x/\mathrm{d}t = S \cdot v$. At steady state, the change in concentration as a function of time is zero; hence, it follows that $S \cdot v = 0$. The set of possible flux vectors v that satisfy this equality constraint might be subject to further constraints by defining vi,min ≤ vi ≤ vi,max for reaction i. In fact, for every irreversible network reaction i, the lower bound was defined as vi,min ≥ 0 and the upper bound was defined as vi,max ≥ 0.
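The parsing step described above can be illustrated with a minimal sketch. The four-reaction network, its identifiers, and the helper names `build_s_matrix` and `concentration_change` are hypothetical and are not taken from iJN746 or from the COBRA toolbox; the sketch only shows how a reaction list maps to an S-matrix and how a steady-state flux vector can be checked.

```python
# Toy reconstruction: reaction id -> {metabolite: stoichiometric coefficient}.
# Substrates carry negative coefficients, products positive ones.
# All identifiers are hypothetical and not part of iJN746.
reactions = {
    "EX_A": {"A": -1},           # exchange reaction, written as "1 A ->"
    "R1": {"A": -1, "B": 1},     # A -> B
    "R2": {"B": -1, "C": 1},     # B -> C
    "EX_C": {"C": -1},           # exchange reaction, written as "1 C ->"
}


def build_s_matrix(reactions):
    """Return (metabolites, reaction ids, S), with S as m rows of n coefficients."""
    metabolites = sorted({met for stoich in reactions.values() for met in stoich})
    reaction_ids = list(reactions)
    s_matrix = [
        [reactions[rxn].get(met, 0) for rxn in reaction_ids]
        for met in metabolites
    ]
    return metabolites, reaction_ids, s_matrix


def concentration_change(s_matrix, flux):
    """Compute dx/dt = S . v for a flux vector v (one entry per reaction)."""
    return [sum(coef * v for coef, v in zip(row, flux)) for row in s_matrix]


metabolites, reaction_ids, S = build_s_matrix(reactions)

# Uptake of A is a negative exchange flux; secretion of C is a positive one.
steady_flux = [-1.0, 1.0, 1.0, 1.0]
print(metabolites, reaction_ids)
print(concentration_change(S, steady_flux))  # all entries are zero at steady state
```

A flux vector that leaves any entry of S · v non-zero would accumulate or deplete the corresponding metabolite and is therefore excluded from the steady-state solution space.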
Exchange reactions, which supply the network with nutrients or remove secretion products from the medium, were defined for all known medium components (see Additional file 9 for details). The uptake of a substrate by the network was defined by a flux rate vi < 0 and secretion of a by-product was defined by vi > 0 for every exchange reaction i. An exchange reaction is written as follows, e.g., for D-glucose exchange: Ex_glc-D: 1 glc-D →. Note that this exchange reaction is unbalanced. Exchange (uptake) reactions define the presence of media components as if one were adding metabolites to an in silico flask.
Finally, the application of constraints corresponding to different environmental conditions (e.g., minimal growth medium) or different genetic backgrounds (e.g., enzyme-deficient mutant) allows the transition from a metabolic network reconstruction to a condition-specific model. Note that the metabolic network reconstruction is unique to the target organism (and defined by its genome), while it can give rise to many different models by applying condition-specific constraints. All flux rates, vi, except biomass formation, are given in mmol/gDW/h.
It is generally assumed that the objective of living organisms is to divide and proliferate. Consequently, many metabolic network reconstructions have a so-called biomass function, in which all known metabolic precursors of cellular biomass are gathered (e.g., amino acids, nucleotides, phospholipids, vitamins, cofactors, energetic requirements, etc.) [36-39]. Since no detailed studies about P. putida's biomass composition are available, the biomass composition from E. coli [55,93] was used as a template for iJN746's biomass function. However, data from P. putida (e.g., membrane phospholipid composition) were added when available. The detailed calculation of the biomass composition is provided in Additional file 10.
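Combining the steady-state constraint, the flux bounds, and the biomass objective gives the usual linear programming form of a growth simulation. The block below is a compact restatement in the notation used for the S-matrix; the symbol $v_{\text{biomass}}$ for the flux through the biomass function is an assumed label, not one used in the original text.

```latex
\begin{aligned}
\max_{v} \quad & v_{\text{biomass}} \\
\text{subject to} \quad & S \cdot v = 0, \\
& v_{i,\min} \le v_i \le v_{i,\max}, \qquad i = 1, \dots, n .
\end{aligned}
```

Replacing $v_{\text{biomass}}$ with another flux (for example a demand reaction) changes the objective without altering the constraints.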
Aerobic growth was modeled in two different culture media: in silico M9 minimal medium (iM9) and in silico Luria-Bertani medium (iLB). For the iM9 simulations, and according to the well-described M9 minimal medium, the external metabolites CO2, Co2+, Fe2+, H+, H2O, Na+, Ni2+, NH4, Pi, and SO4 were allowed to enter and leave the network by setting the constraints on the corresponding exchange reactions (i) to vi,min ≥ -10^6 mmol/gDW/h and vi,max ≤ 10^6 mmol/gDW/h. The uptake rate for each carbon source was constrained to vi,min ≥ -10 mmol/gDW/h and vi,max ≤ 0 mmol/gDW/h. The oxygen uptake rate (OUR) was limited to vi,min ≥ -18.5 mmol/gDW/h (based on E. coli data), if not noted differently. In each individual simulation, all other external metabolites were only allowed to leave the system by constraining their exchange fluxes i between vi,min ≥ 0 and vi,max ≤ 10^6 mmol/gDW/h. The iLB medium was based on the published analysis of yeast extract and tryptone provided by the corresponding manufacturers, and the iLB simulations were performed according to previously published methods.
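The iM9 constraints can be summarized as a table of lower and upper bounds per exchange reaction. The following is a minimal sketch under several assumptions: the exchange-reaction identifiers and the helper `im9_bounds` are hypothetical, the large bound is taken to be 10^6 mmol/gDW/h, and the oxygen upper bound of 0 is an assumption, since the text only states the lower bound for oxygen in this medium.

```python
UNBOUNDED = 1e6  # mmol/gDW/h, assumed value of the large bound

# Hypothetical identifiers for the freely exchanged iM9 components
# (CO2, cobalt, iron, protons, water, sodium, nickel, ammonium, phosphate, sulfate).
FREE_EXCHANGES = [
    "EX_co2", "EX_cobalt2", "EX_fe2", "EX_h", "EX_h2o",
    "EX_na1", "EX_ni2", "EX_nh4", "EX_pi", "EX_so4",
]


def im9_bounds(all_exchanges, carbon_exchange, oxygen_exchange="EX_o2",
               max_carbon_uptake=10.0, max_oxygen_uptake=18.5):
    """Return {exchange id: (lower, upper)} in mmol/gDW/h for one iM9 simulation.

    Uptake is a negative flux and secretion a positive flux.
    """
    bounds = {}
    for rxn in all_exchanges:
        if rxn in FREE_EXCHANGES:
            bounds[rxn] = (-UNBOUNDED, UNBOUNDED)      # may enter and leave
        elif rxn == carbon_exchange:
            bounds[rxn] = (-max_carbon_uptake, 0.0)    # limited uptake only
        elif rxn == oxygen_exchange:
            # Upper bound of 0 is an assumption; only the lower bound is stated.
            bounds[rxn] = (-max_oxygen_uptake, 0.0)
        else:
            bounds[rxn] = (0.0, UNBOUNDED)             # may only leave
    return bounds


exchanges = FREE_EXCHANGES + ["EX_glc-D", "EX_o2", "EX_ac"]
glucose_im9 = im9_bounds(exchanges, carbon_exchange="EX_glc-D")
print(glucose_im9["EX_glc-D"], glucose_im9["EX_o2"], glucose_im9["EX_ac"])
```

Calling the helper with a different `carbon_exchange` yields a different condition-specific model from the same reconstruction.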
Phenotypic phase-plane analysis (PhPP) was carried out using SimPheny (Genomatica Inc., San Diego, CA). The underlying algorithm was described elsewhere [96,97]. The simulation was carried out using iM9 minimal medium (as described above), setting the bounds of toluene uptake between vi,min ≥ -11.9 mmol/gDW/h (based on a reported measurement) and vi,max ≤ 0 mmol/gDW/h, and those of oxygen uptake between vi,min ≥ -160 mmol/gDW/h and vi,max ≤ 0 mmol/gDW/h. The step size was chosen to be 35.
Reduced cost is a parameter of linear programming (LP) problems that is associated with each network reaction (vi) and represents the amount by which the objective function (e.g., growth rate) could be increased if the flux rate through this reaction were increased by a single unit. Reduced cost is often used to analyze the obtained optimal solution and to evaluate alternate solutions. In this study, we analyzed the reduced costs associated with uptake reactions to identify candidate reactions through which an increased flux would result in a higher growth rate (under the chosen simulation condition). The growth condition was iM9 medium with toluene as carbon source. The constraints were set as described above, and linear programming was employed to solve the optimization problem (maximizing growth).
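The interpretation of a reduced cost as "gain in the objective per unit of additional flux" can be illustrated with a deliberately small stand-in. The sketch below does not solve the iJN746 LP or read dual values from a solver; it uses a hypothetical two-nutrient toy whose optimum has a closed form, and estimates the sensitivity by relaxing one uptake limit at a time. The two per-biomass requirements are invented for illustration only.

```python
# Hypothetical requirements per unit of biomass (mmol/gDW); NOT values from iJN746.
TOLUENE_PER_BIOMASS = 10.0
OXYGEN_PER_BIOMASS = 40.0


def max_growth(toluene_limit, oxygen_limit):
    """Optimal growth rate (1/h) of the toy LP.

    Growth consumes both nutrients in fixed proportion, so the optimum is set
    by whichever uptake limit (magnitude, mmol/gDW/h) is exhausted first.
    """
    return min(toluene_limit / TOLUENE_PER_BIOMASS,
               oxygen_limit / OXYGEN_PER_BIOMASS)


def growth_gain_per_unit(limits, nutrient, step=1.0):
    """Finite-difference estimate of the growth gained per extra unit of uptake."""
    relaxed = dict(limits)
    relaxed[nutrient] += step
    return (max_growth(**relaxed) - max_growth(**limits)) / step


# Uptake limits quoted in the text, expressed as magnitudes of the lower bounds.
limits = {"toluene_limit": 11.9, "oxygen_limit": 18.5}
gain_oxygen = growth_gain_per_unit(limits, "oxygen_limit")
gain_toluene = growth_gain_per_unit(limits, "toluene_limit")
print(gain_oxygen, gain_toluene)
```

In this toy, only the oxygen limit has a non-zero sensitivity, which is the pattern one would look for when asking whether growth on a substrate is oxygen-limited. An LP solver reports the same kind of quantity directly as part of its dual solution.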
In order to determine the effect of a single gene deletion, all the reactions associated with each gene in iJN746 were individually "deleted" by setting the flux to 0 and optimizing for the biomass function. A deletion was defined as lethal if no positive flux value for the biomass function could be obtained for the given mutant in silico strain and medium. The simulations were performed using i) iLB rich medium for the general gene essentiality experiment and ii) glucose-iM9 minimal medium for the auxotrophy experiments (see above). The glucose uptake rate was fixed to vi,min = vi,max = -6.3 mmol/gDW/h in the latter study. OUR was set to vi,min ≥ -18.5 mmol/gDW/h in both cases.
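The deletion step itself amounts to overwriting flux bounds before the optimization is run. The sketch below shows only that bookkeeping, with hypothetical gene and reaction identifiers and hypothetical helper names; the optimization is left to an LP solver, and the numerical tolerance in `is_lethal` is an assumption. A fuller implementation would evaluate the GPR rule so that a reaction with an alternative locus stays active, which is omitted here.

```python
def knockout_bounds(bounds, gene_to_reactions, gene):
    """Return a copy of `bounds` with every reaction linked to `gene` fixed to zero flux."""
    mutant = dict(bounds)
    for rxn in gene_to_reactions.get(gene, []):
        mutant[rxn] = (0.0, 0.0)
    return mutant


def is_lethal(max_biomass_flux, tolerance=1e-9):
    """A deletion counts as lethal when no positive biomass flux is attainable.

    The tolerance guards against solver round-off and is an assumed value.
    """
    return max_biomass_flux <= tolerance


# Hypothetical wild-type bounds (mmol/gDW/h) and gene-to-reaction map.
wild_type = {"R1": (0.0, 1e6), "R2": (-1e6, 1e6), "R3": (0.0, 1e6)}
gene_to_reactions = {"geneA": ["R1"], "geneB": ["R2", "R3"]}

mutant_b = knockout_bounds(wild_type, gene_to_reactions, "geneB")
print(mutant_b)
```

The wild-type dictionary is left untouched, so the same starting bounds can be reused for every gene in turn.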
The msc-PHA production from each possible carbon source (Table 2) in iM9 medium was determined by setting the growth rate to vgrowth,min = vgrowth,max = 0.2 gDW/gDW/h. The lower bound of each carbon uptake reaction was set to vi,min ≥ -10 mmol/gDW/h and the upper bound was set to vi,max ≤ 0 mmol/gDW/h. The lower bound of the oxygen uptake rate was set to vi,min ≥ -20 mmol/gDW/h for all simulations. In iJN746, six types of msc-PHAs are defined, as well as msc-PHA compounds consisting of four different carbon chains [see Figure 5 and Additional file 7]. The corresponding demand functions were used independently as objective functions for the optimization problem. The resulting msc-PHA production rates were scaled by the number of carbons of the corresponding carbon sources to facilitate a yield comparison.
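The final scaling step can be written as a one-line calculation. In the sketch below, the carbon counts are those of the named compounds, whereas the production rates passed to the hypothetical helper `scale_per_carbon` are placeholder numbers chosen for illustration and are not results from iJN746.

```python
# Number of carbon atoms per molecule of each carbon source.
CARBON_ATOMS = {"glucose": 6, "toluene": 7, "octanoate": 8}


def scale_per_carbon(production_rate, carbon_source):
    """Divide an msc-PHA production rate (mmol/gDW/h) by the carbons of its source."""
    return production_rate / CARBON_ATOMS[carbon_source]


# Placeholder rates, for illustration only.
example_rates = {"glucose": 3.0, "octanoate": 4.0}
scaled = {source: scale_per_carbon(rate, source)
          for source, rate in example_rates.items()}
print(scaled)
```

Scaling in this way lets substrates of different size be compared on a per-carbon basis rather than per mole of substrate taken up.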
All computational simulations were performed using Matlab (The MathWorks Inc., Natick, MA) if not stated otherwise. TomLab (Tomlab Optimization Inc., San Diego, CA) was used as linear programming solver. Optimization formulations and the gene deletion studies employed the Matlab-based COBRA toolbox .
All authors conceived the study. JN carried out the reconstruction of Pseudomonas putida KT2440. JN and IT performed the analyses. JN, IT, and BOP designed the study and wrote the manuscript. All authors read and approved the final manuscript.
Table S1. Carbon, nitrogen, and sulfur sources, which enabled growth of iJN746.
Table S2. Essential genes predicted correctly in iJN746 compared with experimental data of P. aeruginosa.
Figure S1. Schematic representation of in silico gene essentiality in iJN746 (iLB medium) compared with experimental data of gene essentiality in P. aeruginosa.
Table S3. False-positive essential genes in iJN746 when compared with P. aeruginosa's experimental data .
Table S4. False-negative essential genes in iJN746. Genes that were not predicted to be essential in iJN746 but were reported as essential genes in P. aeruginosa .
Table S5. Predicted essential genes in Glucose-iM9 minimal medium. Not shown are genes that were also predicted to be essential in iLB rich medium.
Table S6. PHA polymer composition found in different Pseudomonas strains sorted by carbon sources.
Table S7. List of metabolites in iJN746. The file contains a detailed list of the metabolites present in the metabolic reconstruction. The molecular formulae, the charge, and the KEGG ID are shown.
Table S8. List of the reactions contained in iJN746. The file details the reactions accounted for in the metabolic reconstruction. The official name, the equation of the reaction, the subsystem, the EC number, and the GPR association are shown.
Table S9. List of biomass components in iJN746. This file contains the complete list of compounds which are part of Pseudomonas putida biomass.
We thank J.R. Luque-Ortega for help in the oxygen uptake experiments. We thank M. Abrahams, M. Mo, and S. Burning for critical reading of the manuscript. JN is grateful to T. Conrad for help during the San Diego stay and to E. Díaz and M.A. Prieto for their valuable help and suggestions during the metabolic reconstruction. JN is the recipient of an I3P predoctoral fellowship from the Consejo Superior de Investigaciones Científicas (CSIC), and JN's stay in San Diego was supported by a short-term I3P fellowship.