Ultimately, systems biology strives to gain a quantitative systems-level understanding of complex and highly inter-related cellular processes and phenomena. The various interactions between the cellular domains and the mere number of the components involved, however, represent a complexity beyond intuitive comprehension. For this reason, mathematical models are required as tools to integrate the ever increasing biological knowledge and the data originating from the diverse cellular domains and, in a further step, to infer novel insight from the integrated available knowledge and data (

Kitano, 2002;

Stelling, 2004).

In order to fully exploit the wealth of information contained in genome-scale data, the mathematical model to be used to extract insight from the data should ideally have the same dimensionality. Unfortunately, the only existing class of genome-scale models are stoichiometric models (

Borodina and Nielsen, 2005), whose development was pioneered by Palsson (

Reed and Palsson, 2003). As these models only reflect the metabolic capabilities of an organism, today's basis for model-based mechanistic integration and analysis of genome-scale data is rather limited.

Nevertheless, already the stoichiometric models were demonstrated to be valuable tools for integration and analysis of a number of different omics data sets, such as fluxome data (

Blank *et al*, 2005), high-throughput growth phenotyping data (

Covert *et al*, 2004) and transcriptome data (

Covert *et al*, 2004;

Patil and Nielsen, 2005). Owing to the development of affordable and powerful mass spectrometers, large-scale sets of quantitative metabolite data are currently emerging into the area of systems biology (

Goodacre *et al*, 2004;

Nielsen and Oliver, 2005), and it is desired to also integrate these data in order to infer novel insight (

Stitt and Fernie, 2003;

Sato *et al*, 2004;

Nielsen and Oliver, 2005).

A natural approach for model-based analysis of metabolome data would be the extension of stoichiometric models by kinetic rate expressions for each enzymatic reaction. However, there is no comprehensive knowledge about

*in vivo* reaction mechanisms and parameters. In addition, the continuing challenges in the area of measurement (

Lafaye *et al*, 2005;

Wu *et al*, 2005) and computational analysis (

Voit *et al*, 2005) make it very unlikely that large-scale kinetic models will be available in the near future. For these reasons, large-scale sets of metabolome data cannot yet be assimilated into mathematical models (

Nielsen and Oliver, 2005). Consequently, insight, for instance into underlying regulatory mechanisms, can hardly be inferred.

In this work, we also draw on these fundamental principles and present a computational thermodynamics-based framework for the analysis of quantitative metabolome data. The mapping of such data onto a stoichiometric reaction network allows extraction of novel insight without requiring any kind of kinetic modeling. In the proposed network-embedded thermodynamic analysis (NET analysis), large-scale qualitative intracellular fluxes (derived from metabolic flux analysis) and metabolite concentrations are coupled to each other via the second law of thermodynamics and the metabolites' Gibbs energies of formation, and an optimization algorithm is employed to compute network-constrained, feasible ranges of Gibbs energies of reaction.

After illustrating the novel concept, first, we will apply the NET analysis tool to a small set of quantitative metabolite data acquired from *Escherichia coli* to demonstrate the practical application of the optimization framework as a tool (i) for consistency analysis of measured metabolite concentrations, (ii) for prediction of metabolite concentrations beyond the actually taken measurements and (iii) for identification of putative active sites of genetic or allosteric regulation. Then, we analyze a larger data set obtained from *Saccharomyces cerevisiae* in order to illustrate that the method is also applicable to more complex systems such as organisms with subcellular structures.

Network-embedded thermodynamic analysis

Assuming constant pressure and a closed system, according to the second law of thermodynamics a reaction occurs only in the direction of negative Gibbs energy of reaction, Δ_{r}*G*. This can be expressed in the inequalities

where *r* denotes the reaction rate, or in other words the net flux between metabolites participating in a reaction. Here, a negative reaction rate signifies a flux in backward direction.

The Gibbs energy of a reaction *j* can be calculated from the Gibbs energies of formation of the participating reactants *i*, Δ_{f}*G*_{i}, and the reactants' stoichiometric coefficients in the reaction *j*, *s*_{ij},

In turn, a metabolite's Gibbs energy of formation can be calculated from its standard Gibbs energy and its thermodynamic activity. In biochemistry, thermodynamic activities are typically replaced by molar concentrations, whereas the effect of ionic strength is taken into account by an adequate standard Gibbs energy. Moreover, possible reactant dissociation forms are lumped into a single reactant, and thus transformed Gibbs energies of formation, Δ

_{f}*G*_{i}′, are used, which in turn are calculated from the standard transformed Gibbs energies of formation, Δ

_{f}*G*_{i}′

^{0}, and the concentration

*c*_{i} of the particular metabolite

*i* (

Alberty, 2003),

For simplicity, ‘transformed Gibbs energies' will only be referred to as ‘Gibbs energies' in the following.

The presented equations form the foundation for NET analysis. Metabolite concentrations and metabolic fluxes are linked via thermodynamics and the stoichiometric network. The metabolites' Gibbs energies of formation determine, together with the stoichiometry, the Gibbs energies of reaction. These Gibbs energies of reaction and the flux directions are then coupled via the second law of thermodynamics to identify thermodynamically feasible ranges for the Gibbs energies of reaction and for concentrations of non-measured metabolites. provides an illustration of the input data required for NET analysis (i.e. metabolite concentrations, flux directions, a metabolic network model and Gibbs energies of formation) and the various insights, which can be obtained.

In the NET analysis, as an extension to an earlier employed method (

Mavrovouniotis, 1993;

Pissarra and Nielsen, 1997;

Stephanopoulos *et al*, 1998), the Gibbs energies of reaction are constrained by the mutual thermodynamic interdependencies of reactions in a network (i.e. the reactions' simultaneous action in the network). This way metabolite concentrations have to be feasible not only in view of one specific reaction but with respect to the functioning of the entire network (cf. ). As we will show below, this significantly limits the feasible ranges of Gibbs energies of reaction and therefore also the feasible concentration ranges of unmeasured metabolites.

In the following, we will outline the optimization procedure that underlies the NET analysis. The optimization determines the feasible range (i.e. upper and lower bounds) of the Gibbs energy of a particular reaction *k*, Δ_{r}*G*_{k}′, using metabolite concentrations, reaction directionalities, a metabolic network and thermodynamic data:

In this optimization, the Gibbs energy for the reaction

*k* is minimized and maximized under the following constraints: All reactions

*j* in the network (including the considered reaction

*k*) can only proceed in the direction of a negative Gibbs energy of reaction (

equations 4(a) and (b)). In turn, the Gibbs energies of reaction are determined by the reaction stoichiometries and the reactants' Gibbs energies of formation (

equation (4c)), which are a function of the predetermined standard Gibbs energies of formation and the metabolite concentrations (

equation (4d)). The latter are by default constrained to typical intracellular concentration ranges (see Materials and methods). These ranges should be defined cautiously such that they surely cover the possible variation in concentration under the considered experimental conditions. Measured concentration values are also considered with these constraints, which, owing to the fact that measurement uncertainties cannot be excluded, are typically allowed to vary by 10% around the measured values.

With an analogous procedure, we can determine feasible concentration ranges rather than Gibbs energies of reaction.

As mentioned above, a prerequisite for the NET analysis is the knowledge of the directions of intracellular fluxes (cf. ). These can either be defined based on preexisting knowledge (a), determined from experimental data (b), or (in case experimental flux data are not available) computed using flux balance analysis (FBA) (c). FBA employs linear programming to optimize a suitable cellular objective, while assuming steady state for the mass balances (

Kauffman *et al*, 2003;

Price *et al*, 2003).

In the context of assigning flux directions, it is important to note that one does not necessarily need to provide directions for all fluxes. If in doubt about a certain flux, no direction should be assigned to the particular reaction. In consequence, fewer constraints are imposed on the NET analysis optimization. This results in a larger solution space meaning that the computed ranges of feasible Gibbs energies of reaction and concentrations eventually become wider. Thus, neglecting a reaction can only lead to less insight from the measurement data but in no case will such an omission lead to wrong results. The same holds true for missing or unknown pathways: If a pathway is not considered, also only less insight is obtained.