|Home | About | Journals | Submit | Contact Us | Français|
Constraint based modeling is an approach for quantitative prediction of net reaction flux in genome scale biochemical networks. In vivo, the second law of thermodynamics requires that net macroscopic flux be forward, when the transformed reaction Gibbs energy is negative. We calculate the latter by using (i) group contribution estimates of metabolite species Gibbs energy, combined with (ii) experimentally measured equilibrium constants. In an application to a genome scale stoichiometric model of E. coli metabolism, iAF1260, we demonstrate that quantitative prediction of reaction directionality is increased in scope and accuracy by integration of both data sources, transformed appropriately to in vivo pH, temperature and ionic strength. Comparison of quantitative versus qualitative assignment of reaction directionality in iAF1260, assuming an accommodating reactant concentration range of 0.02 – 20 mM, revealed that quantitative assignment leads to a low false positive, but high false negative, prediction of effectively irreversible reactions. The latter is partly due to the uncertainty associated with group contribution estimates. We also uncovered evidence that the high intracellular concentration of glutamate in E. coli may be essential to direct otherwise thermodynamically unfavorable essential reactions, such as the leucine transaminase reaction, in an anabolic direction.
Biological systems can be modeled at a large scale by taking an approach which balances computationally tractability with physically and biochemical realistic representation. Constraint-based modeling is a flexible and scalable approach for in silico phenotype prediction . It relies on an accurate biochemical network reconstruction which is a biochemically, genetically and genomically structured representation of experimental biochemical & molecular biological literature . In the case of metabolic networks, biochemical characterization of an enzyme establishes the substrate(s) and product(s) and genetic studies establish the gene-protein-reaction associations which tie a particular metabolic function in a model to a particular genomic location. A biochemical network reconstruction is then converted into a prototype computational model such that predictions may be compared with experimental data. In many cases, initial in silico tests suggest further refinements to the reconstruction underlying the prototype computational model. Iterative refinement of a constraint-based model, by comparison of prediction with experiment, supports its use for a priori in silico prediction of phenotypic capabilities for a posteriori in vivo experimental validation.
There have been many practical biological uses of constraint-based models, including study of bacterial evolution , analysis of network properties [4, 5, 6, 7], study of phenotypic behavior [8, 9], biological discovery [10, 11, 12], and metabolic engineering [13, 14, 15]. The growing scope of applications of genome scale metabolic reconstructions in metabolic engineering and other fields has recently been reviewed . The predictive fidelity of a constraint-based model is dependent on the accuracy of the constraints used to eliminate physicochemically and biochemically infeasible network states. Generally, the resulting constraint equations define an under-determined feasible set of network states. Therefore, in unicellular organisms, a biological objective, such as maximization of growth rate, can be used to predict a single network state within this feasible set depending on the objective. The sensitivity of in silico predictions to the choice of objective function was treated in detail by Savinell & Palsson , and more recently by comparison of predictions with fluxomics data .
In this work, we focus on the assignment of reaction directionality in stoichiometric, metabolic models since it has a significant effect on the feasible set of functional states [19, 20, 21, 22]. There are two forms of thermodynamic constraints on reaction directionality. Local thermodynamic constraints apply on a reaction, by reaction basis. Essentially, a negative reaction Gibbs energy dictates a net forward reaction flux. This application of thermodynamics to the direction of biochemical reactions has a long history , with the first comprehensive treatment by Burton, Krebs and Kornberg . Non-local thermodynamic constraints apply to sets of reactions  and arise due to the necessity to satisfy energy conservation, in addition to the second law of thermodynamics. Non-local thermodynamic constraints have been applied to small systems of biochemical reactions , but that approach “cannot be efficiently applied directly to genome-scale problems”  due to limitations imposed by computational complexity. Here, our focus is on local thermodynamic constraints for a system of reactions at genome scale. We summarize the theory underlying quantitative assignment of local reaction directionality at physiologically relevant conditions. Then, we apply this theory to a stoichiometric model of E. coli metabolism . Our study relies on the extensive body of work on the thermodynamics of biochemical reactions by Alberty [29, 30] which we apply to a genome scale model for the first time. We complemented Alberty's approach with ongoing efforts by Henry et al. [19, 20] and Jankowski et al.  which seek to estimate the standard Gibbs energy of metabolite species based on a group contribution methodology.
There exist two complimentary quantitative methods for assigning reaction directionality based on different ways of calculating standard Gibbs energy of formation for metabolite species. The first involves back-calculation of standard Gibbs energy of formation using experimentally measured equilibrium constants . In the absence of apparent equilibrium constants, standard Gibbs energy of formation for certain metabolite species cannot be back-calculated. In this case, a second complementary method involves estimation of standard Gibbs energy of formation of a metabolite species by summation of terms representing contributions to standard Gibbs energy of formation from different structural subgroups within that metabolite species [32, 31] (see Supplementary Material).
In constraint-based modeling of metabolic networks, it is typical to assume that all metabolite species concentrations are uniform throughout a single compartment enclosed by a lipid bilayer [33, 34]. This assumption is based on a local equilibrium hypothesis; that local variation in metabolite concentration is insignificant with respect to average compartment concentration . In theory, a chemical reaction will perturb the spatial distribution of metabolite concentration, especially for certain metabolite intermediates at low concentration. However, unless the rate of such a reaction is diffusion limited, one can expect such perturbations to be insignificant on longer timescales. A significant number of publications on possible deviations of intracellular reaction thermodynamics from a local equilibrium hypothesis have been published and are reviewed in . However, at present, due to the dearth of spatio-temporally resolved concentration data, one cannot be sure that any deviation from a local equilibrium hypothesis is actually significant in vivo. We assume the metabolite concentration is spatially invariant, and that, temperature and pressure are constant. Therefore one can define a single Gibbs energy of formation for each metabolite species which is valid throughout a single compartment.
The standard Gibbs energy of formation of a metabolite species, , is an experimental measurement of the change in chemical potential which accompanies the synthesis of one mole of a metabolite species from its constituent elements in standard conditions. Standard conditions, denoted with the superscript, o, specify the synthesis of a one molar solution of a metabolite species from its constituent elements in their natural state, at temperature 298.15K, atmospheric pressure 1 bar, zero ionic strength and zero pH. Throughout this study, we assume atmospheric pressure and all thermodynamic potentials are expressed in kJ mol−1.
In our treatment of thermodynamic potentials, we assume that, with respect to metabolite species, the interior of a cell can be approximated by a buffer solution. This assumption neglects the fact that in general, there are effects on the equilibrium constants of in vivoreactions due to macromolecular crowding , confinement, adsorption and high fluid phase viscosity . These effects are all a function of size and charge of the molecules involved. We confine ourselves to consideration of equilibrium constants for metabolic overall reactions between relatively small metabolites. Implicitly, with the exception of spontaneous reactions, all such reactions involve enzymes in elementary steps within the overall reactions. Enzymes are relatively large molecules but do not effect the equilibrium constant of an overall enzyme catalyzed reaction. For charged metabolites, there could be significant effects due to macromolecular adsorption and it is known that a higher intracellular fluid phase viscosity slows the rate of diffusion of metabolites . However, in this study, we omit a treatment of these effects. The effect of macromolecular crowding and confinement is much more significant for macromolecules than metabolites and we do not consider these further. We do however consider the effects of in vivo pH, temperature and ionic strength.
In any quantitative system of relative measurements it is essential to have a standard reference point. However, any thermodynamic potential given for a ex vivo chemical standard reference condition must be adjusted to reflect in vivo conditions. The procedure for adjusting standard Gibbs energy of formation for each metabolite species, to in vivo ionic strength, temperature and pH, has been discussed theoretically and illustrated with computational implementation in textbooks by Alberty [29, 30]. We follow closely the same theoretical procedure, which is detailed in the Supplementary Material. There have been attempts to indirectly measure in vivo ionic strength, without actually measuring the concentrations of all charged metabolite species . Such work would significantly refine our ability to model biochemical thermodynamics. However, at present, a pragmatic approach is to assume a physiological range of ionic strength between 0.05M and 0.35M , and then to adjust each metabolite species standard Gibbs energy of formation by assuming a particular ionic strength in this range.
We assume an ionic strength of 0.25 M then use the extended Debye-Hückel equation , to estimate the activity coefficient associated with each metabolite species, γj. The RT ln γi term is then absorbed into the standard Gibbs energy of formation at in vivo ionic strength. The advantage of this approach is that metabolomic data can be incorporated directly without adjusting it to represent ionic strength and charge dependent activities. The chemical standard reference temperature is typically room temperature, 298.15 K (25°C). However, generally for biochemical thermodynamics, reactions occur at body temperature, 310.15 K (37°C). It is possible to adjust metabolite species standard Gibbs energy of formation, within a narrow range of the chemical standard reference temperature, provided that the standard enthalpy of formation of a metabolite species is known, (See Supplementary Material).
Through a variety of homeostatic mechanisms , E. coli can grow over a wide range of external pH values, (5.5-9.0), yet maintains cytoplasmic pH between 7.6-7.8 [43, 44]. In contrast, the pH in the periplasmic space equals that of the external medium . Metabolites, co-factors and ions make up approximately 3.5% of the dry weight of a typical E. coli cell . While proteins account for approximately 55% of the dry weight. Cytoplasmic proteins represent a large buffering capacity, relative to the total pool of metabolites, co-factors and ions . Effectively, the large buffering capacity of intracellular proteins maintains a constant concentration of cytoplasmic hydrogen ions. In a thermodynamic treatment of any system, knowledge of the values of a constant intensive variables dictates the appropriate thermodynamic potential to be used. This is concept applies regardless of whether the intensive variable is temperature or pH. As stressed by Alberty , the appropriate thermodynamic potential may be mathematically defined using a Legendre transform. However, knowledge of the latter is not essential to understand why it is appropriate to define a new thermodynamic potential, rather it is an elegant mathematical device for doing so. In the Supplementary Material, we show how to define a new thermodynamic potential for constant pH by adapting the classical thought experiment of a ‘bath’ that maintains an intensive variable constant.
A reactant represents the properties of a set of related metabolite species. When pH is a known constant, a reactant may be treated as a pseudo-isomer group of metabolite species each of which is in a different state of hydrogen ion dissociation. e.g. at physiological ranges of pH, the reactant ATP consists of a series of different hydrogen ion dissociated forms. The concentration of the reactant ATP is given by the sum of the concentrations of the metabolite species in its pseudo-isomer group
Thus far, established practice in reconstruction of metabolic networks has been to approximate each reactant with a single predominant ionic form [28, 31]. Instead of this approximation, we follow the method of Alberty , and quantitatively amalgamate each property of a set of metabolite species into a property of the corresponding reactant. This retains the simplicity of qualitatively dealing with a single reactant but quantitatively representing the properties of a pseudo-isomer group.
Thermodynamically, at the timescale relevant for enzyme catalyzed reactions, the metabolite species which make up a single pseudo-isomer group may be considered to be in equilibrium with each other . The hydrogen ion transfer rate constants range between 109 – 1011 s−1 M−1 as compared to the fastest enzyme turnover rates of ~ 107 s−1 M−1 . Therefore, when modeling enzyme catalyzed reactions, it is reasonable to assume that the different metabolite species of a reactant are at equilibrium . Thermodynamic state variables  for each metabolite species, such as standard Gibbs energy of formation, , must be individually adjusted from chemical standard conditions, to in vivo pH, ionic strength, and temperature. Thereafter, assuming equilibrium between metabolite species in a pseudo-isomer group, the adjusted state variables for each metabolite species in a pseudo-isomer group are then combined, and represented by a reactant state variable. The reactant state variable we require is the standard transformed Gibbs energy of a reactant, . Note that we distinguish a reactant, with index i, from a metabolite species with index j. Note also, the slight change in superscript, from o to 0, which is used to remind the reader that the latter is now a function of ionic strength (See supplementary material A.3).
The standard transformed Gibbs energy of formation of a reactant can be calculated by assuming that each metabolite species in a group acts as an isomer and applying established techniques for isomer group thermodynamics . The equation for standard transformed reactant Gibbs energy of formation, , in terms of metabolite species standard transformed Gibbs energy of formation, , is
where the summation is over all the metabolite species, j, within the pseudo-isomer group, i. Each metabolite species standard transformed Gibbs energy of formation is individually a function of ionic strength, pH and temperature. It is important to realize that Eq. 1 does not represent a linear average of the standard transformed Gibbs energy of formation of each metabolite species. In other words, reactant standard transformed Gibbs energy of formation is not additive in the standard transformed Gibbs energy of its metabolite species.
At specified ionic strength, pH and temperature, the mole fraction of a metabolite species, rj, with respect to its pseudo-isomer group, may be calculated using
where, again, the summation is over all the metabolite species, j, within a particular pseudo-isomer group, i . This mole fraction rj, represents the fraction of a reactant which is present as a particular metabolite species j. It is evident from Eq. 2 that the mole fraction of a metabolite species is a non-linear function of its standard transformed Gibbs energy of formation. If each metabolite species in a pseudo-isomer group could be adjusted identically for temperature, ionic strength, and pH then it would be possible to adjust the properties of the reactant after it was approximated with the pseudo-isomer group approach. Mathematically, this follows since subtraction of any constant term from each in Eq. 2 has no effect on the mole fraction.
Transformed reactant Gibbs energy, , is given by , where is standard transformed reactant Gibbs energy and xi is the concentration of reactant i (Molar), which is the sum of the concentrations of its constituent metabolite species j. Quantitative metabolomic data , which can measure simultaneously a wide selection of reactant concentrations, is increasingly becoming available . Quantitative metabolomic data can be used to estimate in vivo reactant transformed Gibbs energy . Experimentally measured reactant concentrations indicate that the concentration of most amino acids , and reactants in central metabolism fall within the 0.02 – 20 mM range [21, 51]. Therefore, in the absence of metabolomic data for a particular reactant, we define the minimum and maximum standard transformed reactant Gibbs energy of formation, and , where we assume xmin = 0.02 mM and xmax = 20 mM. There are exception for water and dissolved gases (see Supplementary Material A.7).
Although, in principle, all microscopic processes in biochemistry are reversible, certain reactions are effectively irreversible at typical in vivo reactant concentrations. Let Sk m,1 denote a column from a stoichiometric matrix , then the transformed Gibbs energy for a biochemical reaction is . In most modeling situations, metabolomic data will not be available for all reactants in a model. In this situation, we assume a physiological concentration range for each reactant. We define the minimum and maximum transformed Gibbs energy for a biochemical reaction with and , whereinf denotes the infimum, sup denotes the supremum, and reactant standard transformed Gibbs energy of formation is bounded . A reaction may be assigned to be quantitatively forward, if , or quantitatively reverse if . We assigned a reaction to be quantitatively reversible if the physiological range of biochemical reaction transformed Gibbs energy spans the zero line, i.e. and .
Given a set of reactions assigned to be quantitatively reversible, we now propose a metric to rank them in a descending sequence according to probability that each reaction is irreversible in the forward direction. Standard reaction Gibbs energy is defined for one molar concentration for each reactant, which far exceeds physiological concentration ranges in E. coli. We define a new physiological group contribution estimate for each reactant standard transformed Gibbs energy of formation,, where is the 1 Molar group contribution estimate of reactant standard Gibbs energy of formation, transformed to in vivo conditions, and xi,m is the geometric mean of the concentration range for each reactant, xi,m exp ((ln (xi,min) + ln (xi,max))/2). This refines the approach of Henry et al., since xi,m represents a new standard concentration for each reactant, depending on its physiological range in vivo. We choose the arithmetic mean of minimum and maximum concentration since logarithmic concentration is linear in Gibbs energy.
In the group contribution method, the standard error in estimated reactant standard Gibbs energy of formation, , assumes that the actual value of standard Gibbs energy of formation is normally distributed. Likewise, we assume that the actual value of physiological standard transformed reaction Gibbs energy, ΔrG′m, is also normally distributed. This is justified by comparing the relative magnitude of with RT ln (xi,max/xi,m) for reactants in iAF1260. Assuming a concentration range of 0.02 – 20 mM, the former is greater than the latter for 85% of reactants, see Figure 1. That is, for the majority of metabolites, the uncertainty in estimated reactant standard Gibbs energy of formation is more significant than the uncertainty associated with a lack of metabolomic data. The qualitatively reversible reactions, that are assigned to be quantitatively reversible, can be ranked based on the probability that the real physiological standard transformed reaction Gibbs energy is negative, P(ΔrG′m < 0). This probability that a reactions direction is forward is given by the cumulative of a normal distribution
where er f denotes the error function . Ranking the reactions by the probability that physiological standard transformed reaction Gibbs energy is negative, P(ΔrG′m < 0), is superior to ranking reactions by mean estimated physiological standard transformed reaction Gibbs energy since the former accounts for the uncertainty in the group contribution method. (For the sake of clarity we omit a k subscript from P(ΔrG′m < 0), but it is specific to each reaction). Note that here probability is used in the Bayesian sense to reflect the confidence with which a directionality assignment can be made. If P(ΔrG′m < 0) = 0.5, then we have no confidence to assign direction either way.
We applied quantitative assignment of reaction directionality to the genome scale model of E. coli metabolism, iAF1260 (1668 reactants, 2076 reactions) . This highlighted a few pertinent methodological lessons which apply regardless of the organism of interest. The assumption that a metabolite is present as a single predominant metabolite species does not always apply. Figure 2 illustrates that certain reactants in E. coli do have significant mole fractions present as non-predominant metabolite species. For instance, acetyl-phosphate is almost equally present as C2H3O5P2− and C2H4O5P−, while a small mole fraction is predicted to be present as uncharged C2H5O5P. Each metabolite species has a different standard Gibbs energy of formation and responds individually to changes in pH, temperature and ionic strength. Where the necessary data was available, we calculated the standard transformed Gibbs energy of formation of each metabolite species as a function of pH, temperature and ionic strength. The standard transformed reactant Gibbs energy, and hence standard (Legendre) transformed reaction Gibbs energy is a function of the properties of each metabolite species in the respective pseudo-isomer groups, not just the predominant hydrogen ion dissociated form.
Another approach reported in the literature assumes that all reactants can be approximated by a single predominant metabolite species at a given pH [19, 20]. There, in order to represent a known constant hydrogen ion concentration, [H+], a new standard hydrogen ion Gibbs energy of formation is defined, ΔfGo(H+) RT ln [H+], and the Legendre transformation of all metabolite species at constant pH is omitted [19, 20]. Thereafter, standard ’transformed’ reaction Gibbs energy is calculated with the newly defined standard hydrogen ion Gibbs energy of formation and standard Gibbs energy of formation for the remaining reactants, each approximated by a single metabolite species. Figure 3 illustrates that there is a significant difference between the standard (Legendre) transformed reaction Gibbs energy and the standard ’transformed’ Gibbs energy of the same reaction, calculated when one assumes that each reactant can be approximated with one predominant metabolite species. As expected, in exceptional cases, where all reactants involved in a biochemical reaction exist as one metabolite species, the standard ’transformed’ Gibbs energy of a reaction is identical to the standard transformed Gibbs energy of the same reaction.
To quantitatively assign reaction directionality, where available, we used standard reactant Gibbs energy of formation back-calculated from experimentally measured equilibrium constants in preference to group contribution estimates. However, a minority of the reactants in E. coli (311/1668) have standard Gibbs energy of formation back-calculated from experimentally measured equilibrium constants . Most of the remaining reactant standard Gibbs energy of formation (1127/1668) can be estimated using group contribution methodology . Certain reactants have no estimated standard Gibbs energy of formation due to structural subgroups in metabolite species not covered by the current version of this method (230/1668 or 163/1039 unique iAF1260 reactants).
We compared quantitative assignment of reaction directionality with qualitatively assigned reaction reconstruction directionality in the genome scale metabolic model of E. coli, iAF1260 . The heuristics for the latter are given in the Supplementary Material A.8. Overall, quantitative assignment of reaction directionality results in a relaxation of qualitative directionality constraints, see Table 1. Directionality changes are in three categories: i) unchanged directionality ii) relative relaxation of reaction directionality (qualitatively forward yet quantitatively reversible), and iii) tightened or reversed reaction directionality. Unless otherwise specified, we refer to quantitative assignment at a temperature of 310.15 K, ionic strength of 0.25 M, pH of 7.7 [43, 44], and an accommodating physiological concentration range of 0.02 – 20 mM. Initially, we assumed thermodynamic reversibility for internal reactions (309/2077) where data is missing for at least one reactant.
In iAF1260, most of the internal reactions (1524/2077) have been qualitatively assigned to be irreversible in the forward direction. Of these forward reactions, about a third are also quantitatively forward (638/1524). Another third of qualitatively forward reactions are assigned to be quantitatively reversible (603/1524), see Table 1. Taking into account the uncertainty in group contribution estimation of metabolite species standard Gibbs energy of formation, the fraction of qualitatively forward reactions changed to quantitatively reversible almost doubles (1011/1524). Relaxation of directionality constraints for so many reactions would significantly increase the flexibility of the resulting iAF1260 constraint-based model. Such relaxation significantly effects model predictions, therefore, we conducted a detailed analysis of these qualitatively forward, yet quantitatively reverse reactions.
A tenth of qualitatively forward yet quantitatively reversible reactions (103/1011) exclusively involved reactant standard transformed Gibbs energy of formation back calculated from equilibrium constants. The majority of these reactions do indeed have a negative standard transformed reaction Gibbs energy, but within our accommodating physiological range of reactant concentration, they are quantitatively reversible (See Supplementary Figure 10). Strictly, these reactions are quantitatively reversible, i.e., and , but most are probably forward at in vivo concentrations.
Many of the quantitatively reversible reactions transport metabolites between compartments. Although the net electrical charge within a compartment is assumed to be neutral, an electrical potential difference can exist between compartments. In E. coli, the cytoplasmic side of the cytoplasmic membrane is more negative than the periplasmic side . A thorough treatment of electrochemical potential difference is beyond the scope of this current work. However, even when cytoplasm and periplasm are assumed to have an identical pH, an electrical potential difference still exists across the cytoplasmic membrane. We assumed an electrical potential difference of 90 mV, which corresponds to a Gibbs energy change of −8.7 kJ mol−1 per negative charge transported from cytoplasm to periplasm . Therefore, the passive transport of negatively charged reactants, e.g., formate, from cytoplasm to periplasm is thermodynamically favorable at equal concentrations of formate in both compartments. However, if the concentration of formate is high in the periplasm, and low in the cytoplasm, this can reverse the direction of formate diffusion. Therefore, many transport reactions are quantitatively assigned to be reversible within a physiological range of reactant concentration. This is the case regardless of the provenience of the reactant standard Gibbs energy of formation, see Supplementary Figure 11.
A small minority (134/1011) of the qualitatively forward yet quantitatively reversible reactions are assigned based solely on group contribution estimates of reactant standard transformed Gibbs energy of formation. The uncertainty in these estimates combines with variability in metabolite concentration to increase the feasible range of reaction Gibbs energy. The remaining majority (774/1011) of the qualitatively forward yet quantitatively reversible assignments are based on standard transformed Gibbs energy of formation for reactants estimated by the group contribution method combined with standard transformed Gibbs energy of formation for reactants back calculated from equilibrium constants. The latter reactants include most of the widely used co-enzymes  and prosthetic groups. In order to understand the reasons for relaxation of so many reaction directions ((134+774)/1011), we categorized the reactions. Figure 4 illustrates all reactions that are qualitatively forward yet quantitatively reversible (908/1011), using at least one group contribution estimate of reactant standard transformed Gibbs energy of formation. Apart from transport reactions, the reactions in Figure 4 are ordered by descending probability of the reaction being forward, P(ΔrG′m < 0), at physiological standard concentrations of reactants.
For most transport reactions, there is no uncertainty in physiological standard transformed reaction Gibbs energy since the same reactant is present on both sides of the reaction. In Figure 4, transport reactions are sorted by ascending physiological standard transformed reaction Gibbs energy since P(ΔrG′m < 0) is undefined if the denominator in Eq. 3 is zero. A non-zero standard transformed reaction Gibbs energy for a transport reaction is due to net transport of a charged metabolite between the cytoplasm and periplasm. Of the qualitatively forward yet quantitatively reversible reactions, 155/1011 result in net transport of a negative (or neutral) charge from the cytoplasm to the periplasm and therefore have a negative (or zero) physiological standard transformed reaction Gibbs energy, ΔrG′m ≤ 0. These transport reactions are qualitatively assigned to be forward, but within the physiological range of reactant concentration, they are quantitatively reversible.
The forward direction of a transport reaction, as for any reaction, is defined by convention. Assuming the concentration range of the reactant is the same in both compartments, reactions which transport a positively charged reactant from the cytoplasm to the periplasm have a positive physiological standard transformed reaction Gibbs energy, ΔrG′m > 0. Likewise, reactions which transport a negatively charged reactant from periplasm to the cytoplasm have a positive standard transformed reaction Gibbs energy. At equal cytoplasmic and periplasmic reactant concentrations these reactions would operate in reverse, despite being qualitatively assigned to be forward reactions. These 21 transport reactions are detailed in Supplementary Figure 12. Any facilitated diffusion reaction in this group would have to have a higher periplasmic than cytoplasmic concentration in order to proceed in the forward direction as qualitatively assigned, e.g., undecaprenyl phosphate transport cytoplasm to periplasm.
At an equal pH of cytoplasm and periplasm, any proton symport reaction, transporting a reactant with a charge of −2 into the cell, would require more than one proton to translocate across the periplasmic membrane in order to make the reaction thermodynamically favorable. It may be that the proton stoichiometry of these reactions needs to be revised in order to take this into account. For example the reaction ‘GlcNAc anhMurNAc tetrapeptide transport in via proton symport’, which symports anhMurNAc tetrapeptide (N-Acetyl-D-glucosamine (anhydrous) N-Acetylmuramyl-tetrapeptide), a biopolymer in the bacterial cell wall, with a charge of −2, from the periplasm to the cytoplasm, see Supplementary Figure 12. This result indicates how thermodynamic consideration of reaction directionality can highlight reactions with thermodynamically infeasible stoichiometry.
The probability that physiological standard transformed reaction Gibbs energy less than zero, P(ΔrG′m < 0), Eq. 3, attempts to quantify the probability that a reaction is indeed irreversible in the forward direction at physiological standard concentrations of reactants. Reactions that are most probable to be irreversible in the forward direction are those where the physiological standard transformed reaction Gibbs energy is negative, even one standard deviation from the mean, i.e. 1 ≥ P(ΔrG′m < 0) ≥ 0.86. Of the qualitatively forward, yet quantitatively reversible reactions, a quarter (221/1011) are probably forward by this reasoning. By the same reasoning for the opposite direction, 0.14 ≥ P(ΔrG′m < 0) ≥ 0, there are 11 reactions that are probably reverse, but are qualitatively assigned to operate in the forward direction, see Supplementary Figure 12. These quantitative reaction directionality assignments are interesting because they possibly change the direction of qualitative assignment.
The intermediate range of probability of forward physiological standard transformed reaction Gibbs energy, 0.86 ≥ P(ΔrG′m < 0) > 0.14, still accounts for half (500/1011) of the qualitatively forward, yet quantitatively reverse reactions. Figure 4 illustrates that a large uncertainty in physiological standard transformed reaction Gibbs energy precludes definitive quantitative forward assignment for many of these reactions. We arbitrarily set the 214 reactions with 0.86 ≥ P(ΔrG′m < 0) > 0.7 to be quantitatively forward, the 279 reactions with 0.7 ≥ P(ΔrG′m < 0) > 0.3 to be quantitatively reversible and the remaining 7 reactions with 0.3 ≥ P(ΔrG′m < 0) > 0.14 to be quantitatively reverse.
Whatever upper cutoff is chosen for qualitatively forward reactions, then in order to be logically consistent, one minus the same cutoff must be used to quantitatively assign other reactions to be reverse. Our conservative upper & lower cutoff was chosen to compensate for the uncertainty in group contribution estimates of reactant standard Gibbs energy. Even with these conservative cutoffs, the resulting model cannot produce biomass in glucose minimal medium, because certain qualitatively forward reactions are essential in this direction and therefore cannot be reversed. Computational analysis can reliably predict, which reaction directions are essential in vivo for a given boundary condition [8, 55]. We identified the 7 reactions that were essential for production of biomass in glucose minimal medium, yet seemed thermodynamically unfavorable according to 0.3 ≥ P(ΔrG′m < 0) (See Supplementary Figure 12). This highlights the importance of incorporating metabolite concentration data to refine thermodynamic assignment of reaction directionality  since it is ultimately the transformed Gibbs energy of a biochemical reaction, ΔrG′, that determines directionality.
The amino acid L-glutamate is the major nitrogen donor in the cell, distributing ~88% of the total nitrogen that ends up in biomass, largely via transamination reactions. It has been measured at a relatively high~ 100 mM concentration in a range of growth conditions . L-glutamate is a reactant in three of the qualitatively forward reactions, that could possibly be assigned reverse based on consideration of physiological standard transformed reaction Gibbs energy alone (Supplementary Figure 12). L-glutamate is a substrate for glutamate 5 kinase (E.C. 184.108.40.206) and leucine transaminase (E.C. 220.127.116.11), yet a product of the reaction catalyzed by 1-pyrroline 5-carboxylate dehydrogenase (EC 18.104.22.168). The reaction catalyzed by leucine transaminase is essential for production of biomass from glucose minimal medium in silico and in vivo . A high concentration of L-glutamate favors the forward direction of this reaction. In the same conditions, the other two reactions involving L-glutamate are not essential in silico.
Leucine transaminase catalyzes the transfer of the amine group in L-glutamate to 4-methyl-2-oxopentanoate to form L-leucine. This reaction can also be catalyzed by branched-chain-amino-acid transaminase (EC 22.214.171.124) which catalyses the first step in the catabolism of branched-chain amino acids, being leucine, isoleucine and valine. At 298.15 K, pH 7.21, and ionic strength 0.31 mol kg−1, the apparent equilibrium constant of
was reported to be 2.42 ± 0.25 with respect to the forward direction . Without error this corresponds to a standard transformed Gibbs energy of −2.19 kJ mol−1. This indicates that the reverse direction, biosynthesis of L-leucine, is thermodynamically unfavorable at equal concentrations of substrates and products. This indicates that the high concentration of glutamate observed in the cell may well be thermodynamically necessary to drive the biosynthesis of L-leucine. On the contrary, depending on the needs of the cell, all else being equal, the reaction will run in the opposite direction at low glutamate concentration. This illustrates how thermodynamic data can be used to interpret the meaning of metabolomic data .
Using quantitative assignment, a small minority of reaction directions are tightened or reversed, depending on the quantitative method used. Using group contribution estimates alone, there are at least 35 qualitatively reversible reactions that are deemed to be quantitatively irreversible, see Table 1. This number rises to 56 reactions when the uncertainty associated with the group contribution methodology is not taken into account. In contrast, the use of standard Gibbs energy of formation back calculated from experimentally measured equilibrium constants, in place of group contribution estimates, reduces the number of reaction directions that are tightened or reversed. In particular, of 553 qualitatively reversible internal reactions in iAF1260, only 6 are deemed to be quantitatively irreversible using standard transformed Gibbs energy of formation back calculated from experimentally measured equilibrium constants, in preference to group contribution estimates. Of course, group contribution estimates are still required for the majority of reactants. Of 1524 qualitatively forward reactions, only 5 are deemed to be quantitatively irreversible in the reverse direction.
The tightened or reversed directionality assignments, directly conflict with qualitative assignments and therefore we investigated the literature on biochemical and/or thermodynamic characterisation experiments for each of these reactions. The detailed results of this study are given in Supplementary Material A.9. Quantitative tightening of reaction directionality was supported for a minority of reactions, e.g. D-Erythrose-4-phosphate dehydrogenase reaction. Other reactions had experimental evidence indicating that quantitative assignment was too tight, which may perhaps be improved by more accurate modeling of the effect of of Magnesium on the standard transformed Gibbs energy of ATP, e.g. galactokinase reaction. As discussed by Alberty , the concept of a pseudo-isomer group may be extended to different metal ion dissociated forms, e.g., when pMg = −log10[Mg2+] is a known constant. The group contribution method seemed to have difficulty in estimating the standard Gibbs energy of metabolites with an imidazole ring structure, giving rise to erroneous, over tight assignments for certain reactions, e.g. IMP cyclohydrolase. Similarly, certain qualitatively forward reactions were also incorrectly predicted to be quantitatively reverse for reactions involving other metabolites with complex structure, e.g. glycogen. Ongoing refinements to the group contribution method can be expected to iron out these issues with structurally complex metabolite species.
Using the stoichiometry of the latest genome scale metabolic model, iAF1260, we quantitatively set reaction directionality according to local thermodynamic constraints and compared growth rate prediction against those experimentally measured. A first pass thermodynamic assignment of reaction directionality based on ΔrG′ alone, resulted in a minority of reactions being assigned thermodynamically forward (242/2077), and a few reactions being thermodynamically reverse (6/2077). A few of these reversed reactions were incorrectly quantitatively assigned, essential for growth, and therefore qualitatively set to forward, ignoring the thermodynamic assignment. Using flux balance analysis [33, 59] to maximize reaction representing the production of biomass, at first, the biomass production rate of the resulting model far exceeded the in vivo growth rate on glucose minimal medium. This indicates that significant reaction directionality constraints were missing since the published iAF1260 model does reproduce experimentally reported growth rates in various media.
At this point, we attempted to identify the minimum number of biochemically reasonable qualitative assignments necessary in order to match the growth rate observed in vivo. The qualitative assignments were based the on the experimental biochemical literature used to assign reaction directionality in the iAF1260 reconstruction . In addition, care was taken to qualitatively assess prediction of internal fluxes, such as flux through the electron transport chain, as expected during aerobic growth. First, we assigned the qualitative reconstruction directions to reactions without any thermodynamic data (310/2011). Similarly for reactions involving a quinone, since quinones are structurally complex metabolite species with significant uncertainty in group contribution estimates. Then, using the probabilistic criteria discussed above, we made these further quantitative assignments: P(ΔrG′m < 0) > 0.7 forward, 0.7 ≥ P(ΔrG′m < 0) > 0.3 reversible, and 0.3 ≥ P(ΔrG′m < 0) reverse, except for in silico essential reactions.
In addition, it was necessary to eliminate artefactual excess ATP synthesis due to (i) reversal of ATP consuming ATP-binding cassette transport reactions, and (ii) net shuttling of protons from the cytoplasm into the periplasmic compartment via transport reactions, leading to excess ATP synthase flux. We assigned reconstruction directions to all transport reactions involving proton symport or antiport, and all ATP-binding cassette transport reactions. The majority of transport reactions qualitatively assigned a forward direction in this way do indeed have a negative physiological standard transformed reaction Gibbs energy, see Supplementary Figure 11. For transport reactions, compartmentally resolved reactant concentration, if available, would significantly influence reaction directionality, since, for many such reactions, the standard Gibbs energy is close to zero.
We qualitatively assigned reaction directionality to cytoplasmic reactions involving the cofactors ATP, GTP, CTP & UTP. In addition, we incorporated experimentally reported concentration ranges for key cofactors (ATP, ADP, AMP, NAD, NADH, NADP, NADPH) measured for E. coli growing aerobically on glucose minimal medium . At glucose and oxygen uptake rates of 11 and 18.2 mmol g−1 h−1 respectively, the experimentally observed growth rate is 0.82 h−1 . The aforementioned qualitative reaction directionality assignments, which override the quantitative assignments, reduce the growth rate to 1.1 h−1, which compares favorably with that observed experimentally. At this margin of error, setting the effective stoichiometry of oxidative phosphorylation by constraints on the relative fluxes through various NADH dehydrogenases and terminal oxidases has a significant effect on growth rate . It is clear that the latter modeling issue awaits a thorough investigation. In summary, quantitative assignment of reaction directionality must be accompanied by qualitative assignment, for certain classes of reactions, in order to match experimental data with constraint based models of metabolism.
In principle, the application of the second law of thermodynamics to metabolic reactions results in a constraint on the direction of all network reactions. However, this assumes that all reactant concentrations and standard Gibbs energies are are known. Using experimentally measured apparent equilibrium constants, Alberty has published tables of standard transformed reactant Gibbs energies for 200 reactants [29, 30]. In addition, group contribution estimates of standard reactant Gibbs energy have recently become available for two thirds of the 15,000 reactants in the KEGG compound database . We have integrated these estimates, with Alberty's tables in order to quantitatively constrain the directions in the E. coli genome scale metabolic model.
Standard Gibbs energy applies to chemical standard conditions, which are markedly different from in vivo conditions. Therefore, following the methods developed by Alberty, we have theoretically summarized and practically implemented the main steps in transformation of chemical standard thermodynamic potentials into biochemical standard thermodynamic potentials. The large buffering capacity of proteins, relative to the size of the soluble reactant pool means that pH is effectively held constant. Therefore, we use the transformed reaction Gibbs energy to provide the criterion for spontaneous change, and therefore the direction of net flux, at in vivo temperature, pressure, pH and ionic strength.
Even in the absence of quantitative metabolomic data, thermodynamic assignment of reaction directionality can be made by assuming an accommodating range of reactant concentration (0.02 – 20 mM). In E. coli, our thermodynamic assignment of reaction directionality rarely conflicts with qualitatively assigned reconstruction direction based on experimental literature. This indicates that the same algorithms may be reliably used to constrain reaction directionality for other organisms without the same breadth of experimental evidence as exists for E. coli. However, a large proportion of reaction directions, which were qualitatively assigned to be irreversible in E. coli, become quantitatively reversible, by transformed reaction Gibbs energy alone. Partially, this is due to the accommodating range of reactant concentration, especially for transport reactions. However, the main reason is that the uncertainty associated with computational estimation of standard reactant Gibbs energy can give rise to a significant uncertainty in standard transformed reaction Gibbs energy. This is especially so when structurally complex reactants are involved.
We partially overcame the uncertainty issue by formulating a probabilistic metric expressing our confidence that a given reaction is irreversible, given the uncertainty in standard reactant Gibbs energy. Using this Bayesian probabilistic metric, it is possible to quantitatively assign a large number of reactions to be irreversible, at a given confidence cutoff. Even still, some qualitative assignment of reaction directionality is necessary in order to sufficiently constrain the genome scale E. coli model to agree with experimentally observed growth rate in glucose minimal medium. In particular, this is so for quinone coupled oxidative phosphorylation reactions transport reactions, and reactions involving nucleotide cofactors.
Quantitative metabolomics data, combined with standard transformed reactant Gibbs energy calculations, would increase the number of predicted effectively irreversible reactions. However, metabolite concentration data only becomes important for determining reaction directionality when the uncertainty in group contribution estimation of reactant Gibbs energy is low ( 6 kJ mol−1). Cofactor pair concentration ratios are particularly useful since these reactants are involved in such a large proportion of reactions and since experimentally derived standard reactant Gibbs energy are available for many cofactors from the tables compiled by Alberty . Beyond cofactors, the concentration of other key metabolites are also important for determining the direction of essential anabolic reactions. For instance, we provide evidence that the high concentration of cytoplasmic L-glutamate, observed in E. coli , may well be thermodynamically necessary to drive the biosynthesis of L-leucine.
We observed that reaction directionality is relatively insensitive to the specific adjustments for temperature, pH and ionic strength necessary to represent in vivo conditions. This is more due to the uncertainty in standard reactant Gibbs energy than the absence of metabolomic data. Integration of standard reactant Gibbs energy back calculated from equilibrium constants significantly reduces the number of false positive irreversible reactions, as compared to the use of group contribution estimates alone. Nevertheless, group contribution estimates are essential to match the broad scope of current genome scale models, therefore it is evident that these approaches to the thermodynamics of biochemical reactions are complementary.
This work was supported by NIH grant Grant 5R01GM057089-11 and a National University of Ireland, Galway, Science Faculty Fellowship.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.