Related Articles

1.  Metabolic network reconstruction of Chlamydomonas offers insight into light-driven algal metabolism 
A comprehensive genome-scale metabolic network of Chlamydomonas reinhardtii, including a detailed account of light-driven metabolism, is reconstructed and validated. The model provides a new resource for research of C. reinhardtii metabolism and in algal biotechnology.
The genome-scale metabolic network of Chlamydomonas reinhardtii (iRC1080) was reconstructed, accounting for >32% of the estimated metabolic genes encoded in the genome, and including extensive details of lipid metabolic pathways. This is the first metabolic network to explicitly account for stoichiometry and wavelengths of metabolic photon usage, providing a new resource for research of C. reinhardtii metabolism and developments in algal biotechnology. Metabolic functional annotation and the largest transcript verification of a metabolic network to date were performed, at least partially verifying >90% of the transcripts accounted for in iRC1080. Analysis of the network supports hypotheses concerning the evolution of latent lipid pathways in C. reinhardtii, including very long-chain polyunsaturated fatty acid and ceramide synthesis pathways. A novel approach for modeling light-driven metabolism was developed that accounts for both light source intensity and spectral quality of emitted light. The constructs resulting from this approach, termed prism reactions, were shown to significantly improve the accuracy of model predictions, and their use was demonstrated for evaluation of light source efficiency and design.
Algae have garnered significant interest in recent years, especially for their potential application in biofuel production. The hallmark model eukaryotic microalga Chlamydomonas reinhardtii has been widely used to study photosynthesis, cell motility and phototaxis, cell wall biogenesis, and other fundamental cellular processes (Harris, 2001). Characterizing algal metabolism is key to engineering production strains and understanding photobiological phenomena. Based on extensive literature on C. reinhardtii metabolism, its genome sequence (Merchant et al, 2007), and gene functional annotation, we have reconstructed and experimentally validated the genome-scale metabolic network for this alga, iRC1080, the first network to account for detailed photon absorption permitting growth simulations under different light sources. iRC1080 accounts for 1080 genes, associated with 2190 reactions and 1068 unique metabolites, and encompasses 83 subsystems distributed across 10 cellular compartments (Figure 1A). Its >32% coverage of estimated metabolic genes is a tremendous expansion over previous algal reconstructions (Boyle and Morgan, 2009; Manichaikul et al, 2009). The lipid metabolic pathways of iRC1080 are considerably expanded relative to existing networks, and chemical properties of all metabolites in these pathways are accounted for explicitly, providing sufficient detail to completely specify all individual molecular species: backbone molecule and stereochemical numbering of acyl-chain positions; acyl-chain length; and number, position, and cis–trans stereoisomerism of carbon–carbon double bonds. Such detail in lipid metabolism will be critical for model-driven metabolic engineering efforts.
We experimentally verified transcripts accounted for in the network under permissive growth conditions, detecting >90% of tested transcript models (Figure 1B) and providing validating evidence for the contents of iRC1080. We also analyzed the extent of transcript verification by specific metabolic subsystems. Some subsystems stood out as more poorly verified, including chloroplast and mitochondrial transport systems and sphingolipid metabolism, all of which exhibited <80% of transcripts detected, reflecting incomplete characterization of compartmental transporters and supporting a hypothesis of latent pathway evolution for ceramide synthesis in C. reinhardtii. Additional lines of evidence from the reconstruction effort similarly support this hypothesis including lack of ceramide synthetase and other annotation gaps downstream in sphingolipid metabolism. A similar hypothesis of latent pathway evolution was established for very long-chain fatty acids (VLCFAs) and their polyunsaturated analogs (VLCPUFAs) (Figure 1C), owing to the absence of this class of lipids in previous experimental measurements, lack of a candidate VLCFA elongase in the functional annotation, and additional downstream annotation gaps in arachidonic acid metabolism.
The network provides a detailed account of metabolic photon absorption by light-driven reactions, including photosystems I and II, light-dependent protochlorophyllide oxidoreductase, provitamin D3 photoconversion to vitamin D3, and rhodopsin photoisomerase; this network accounting permits the precise modeling of light-dependent metabolism. iRC1080 accounts for effective light spectral ranges through analysis of biochemical activity spectra (Figure 3A), either reaction activity or absorbance at varying light wavelengths. Defining effective spectral ranges associated with each photon-utilizing reaction enabled our network to model growth under different light sources via stoichiometric representation of the spectral composition of emitted light, termed prism reactions. Coefficients for different photon wavelengths in a prism reaction correspond to the ratios of photon flux in the defined effective spectral ranges to the total emitted photon flux from a given light source (Figure 3B). This approach distinguishes the amount of emitted photons that drive different metabolic reactions. We created prism reactions for most light sources that have been used in published studies for algal and plant growth, including solar light, various light bulbs, and LEDs. We also included regulatory effects resulting from lighting conditions, insofar as published studies enabled. Light and dark conditions have been shown to affect metabolic enzyme activity in C. reinhardtii on multiple levels: transcriptional regulation, chloroplast RNA degradation, translational regulation, and thioredoxin-mediated enzyme regulation. Through application of our light model and prism reactions, we were able to closely recapitulate experimental growth measurements under solar, incandescent, and red LED lights. Through unbiased sampling, we were able to establish the high statistical significance of the accuracy of growth predictions achievable through implementation of prism reactions.
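The coefficient definition above (photon flux within an effective spectral range divided by the total emitted photon flux) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the reaction labels, the spectral ranges, and the spectrum values are all hypothetical placeholders chosen only to show the arithmetic.

```python
def prism_coefficients(spectrum, effective_ranges):
    """Return one coefficient per photon-utilizing reaction.

    spectrum: dict mapping wavelength (nm) -> emitted photon flux (hypothetical data).
    effective_ranges: dict mapping a reaction label -> (low_nm, high_nm), inclusive.
    Each coefficient is the flux inside the range divided by the total flux.
    """
    total_flux = sum(spectrum.values())
    if total_flux <= 0:
        raise ValueError("total emitted photon flux must be positive")
    coefficients = {}
    for label, (low_nm, high_nm) in effective_ranges.items():
        in_range = sum(flux for wl, flux in spectrum.items()
                       if low_nm <= wl <= high_nm)
        coefficients[label] = in_range / total_flux
    return coefficients


# Hypothetical light source and spectral ranges (illustrative values only).
example_spectrum = {450: 10.0, 550: 20.0, 680: 50.0, 700: 20.0}
example_ranges = {"reaction_a": (670, 690), "reaction_b": (695, 705)}
example_coefficients = prism_coefficients(example_spectrum, example_ranges)
print(example_coefficients)
```

With these placeholder numbers, half of the emitted photons fall in the range assigned to `reaction_a` and one fifth in the range assigned to `reaction_b`; the remaining photons drive neither reaction.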
Finally, application of the photosynthetic model was demonstrated prospectively to evaluate light utilization efficiency under different light sources. The results suggest that, of the existing light sources, red LEDs provide the greatest efficiency, about three times as efficient as sunlight. Extending this analysis, the model was applied to design a maximally efficient LED spectrum for algal growth. The result was a 677-nm peak LED spectrum with a total incident photon flux of 360 μE/m2/s, suggesting that for the simple objective of maximizing growth efficiency, LED technology has already reached an effective theoretical optimum.
In summary, the C. reinhardtii metabolic network iRC1080 that we have reconstructed offers insight into the basic biology of this species and may be employed prospectively for genetic engineering design and light source design relevant to algal biotechnology. iRC1080 was used to analyze lipid metabolism and generate novel hypotheses about the evolution of latent pathways. The predictive capacity of metabolic models developed from iRC1080 was demonstrated in simulating mutant phenotypes and in evaluation of light source efficiency. Our network provides a broad knowledgebase of the biochemistry and genomics underlying global metabolism of a photoautotroph, and our modeling approach for light-driven metabolism exemplifies how integration of largely unvisited data types, such as physicochemical environmental parameters, can expand the diversity of applications of metabolic networks.
Metabolic network reconstruction encompasses existing knowledge about an organism's metabolism and genome annotation, providing a platform for omics data analysis and phenotype prediction. The model alga Chlamydomonas reinhardtii is employed to study diverse biological processes from photosynthesis to phototaxis. Recent heightened interest in this species results from an international movement to develop algal biofuels. Integrating biological and optical data, we reconstructed a genome-scale metabolic network for this alga and devised a novel light-modeling approach that enables quantitative growth prediction for a given light source, resolving wavelength and photon flux. We experimentally verified transcripts accounted for in the network and physiologically validated model function through simulation and generation of new experimental growth data, providing high confidence in network contents and predictive applications. The network offers insight into algal metabolism and potential for genetic engineering and efficient light source design, a pioneering resource for studying light-driven metabolism and quantitative systems biology.
doi:10.1038/msb.2011.52
PMCID: PMC3202792  PMID: 21811229
Chlamydomonas reinhardtii; lipid metabolism; metabolic engineering; photobioreactor
2.  RMBNToolbox: random models for biochemical networks 
BMC Systems Biology  2007;1:22.
Background
There is increasing interest in modeling biochemical and cell biological networks, as well as in the computational analysis of these models. The development of analysis methodologies and related software is rapid in the field. However, the number of available models is still relatively small and the model sizes remain limited. The lack of kinetic information is usually the limiting factor for the construction of detailed simulation models.
Results
We present a computational toolbox for generating random biochemical network models which mimic real biochemical networks. The toolbox is called Random Models for Biochemical Networks. The toolbox works in the Matlab environment, and it makes it possible to generate various network structures, stoichiometries, kinetic laws for reactions, and parameters therein. The generation can be based on statistical rules and distributions, and more detailed information about real biochemical networks can be used where it is known. The toolbox can be easily extended. The resulting network models can be exported in the format of Systems Biology Markup Language.
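As a rough illustration of what generating a random stoichiometry from simple statistical rules can look like, here is a minimal Python sketch. It does not reproduce the toolbox (which runs in Matlab) or its API; the function name, the uniform sampling rule, and the unit stoichiometric coefficients are assumptions made for this example.

```python
import random


def random_stoichiometry(n_species, n_reactions, max_order=2, seed=0):
    """Build a random stoichiometric matrix (rows: species, columns: reactions).

    Assumed rule for this sketch: each reaction draws 1..max_order substrates
    and 1..max_order products, all distinct species, with coefficients -1 / +1.
    """
    if n_species < 2 * max_order:
        raise ValueError("need at least 2 * max_order species")
    rng = random.Random(seed)  # seeded for reproducible networks
    matrix = [[0] * n_reactions for _ in range(n_species)]
    for j in range(n_reactions):
        n_substrates = rng.randint(1, max_order)
        n_products = rng.randint(1, max_order)
        chosen = rng.sample(range(n_species), n_substrates + n_products)
        for i in chosen[:n_substrates]:
            matrix[i][j] = -1  # consumed by reaction j
        for i in chosen[n_substrates:]:
            matrix[i][j] = 1   # produced by reaction j
    return matrix


for row in random_stoichiometry(n_species=6, n_reactions=4, seed=1):
    print(row)
```

A generator of this kind would still need kinetic laws and parameter values attached to each column before the network could be simulated.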
Conclusion
While more information is accumulating on biochemical networks, random networks can be used as an intermediate step towards their better understanding. Random networks make it possible to study the effects of various network characteristics on the overall behavior of the network. Moreover, the construction of artificial network models provides the ground truth data needed in the validation of various computational methods in the fields of parameter estimation and data analysis.
doi:10.1186/1752-0509-1-22
PMCID: PMC1896132  PMID: 17524136
3.  A Scalable Computational Framework for Establishing Long-Term Behavior of Stochastic Reaction Networks 
PLoS Computational Biology  2014;10(6):e1003669.
Reaction networks are systems in which the populations of a finite number of species evolve through predefined interactions. Such networks are found as modeling tools in many biological disciplines such as biochemistry, ecology, epidemiology, immunology, systems biology and synthetic biology. It is now well-established that, for small population sizes, stochastic models for biochemical reaction networks are necessary to capture randomness in the interactions. The tools for analyzing such models, however, still lag far behind their deterministic counterparts. In this paper, we bridge this gap by developing a constructive framework for examining the long-term behavior and stability properties of the reaction dynamics in a stochastic setting. In particular, we address the problems of determining ergodicity of the reaction dynamics, which is analogous to having a globally attracting fixed point for deterministic dynamics. We also examine when the statistical moments of the underlying process remain bounded with time and when they converge to their steady state values. The framework we develop relies on a blend of ideas from probability theory, linear algebra and optimization theory. We demonstrate that the stability properties of a wide class of biological networks can be assessed from our sufficient theoretical conditions that can be recast as efficient and scalable linear programs, well-known for their tractability. It is notably shown that the computational complexity is often linear in the number of species. We illustrate the validity, the efficiency and the wide applicability of our results on several reaction networks arising in biochemistry, systems biology, epidemiology and ecology. The biological implications of the results as well as an example of a non-ergodic biological network are also discussed.
Author Summary
In many biological disciplines, computational modeling of interaction networks is the key for understanding biological phenomena. Such networks are traditionally studied using deterministic models. However, it has been recently recognized that when the populations are small in size, the inherent random effects become significant and to incorporate them, a stochastic modeling paradigm is necessary. Hence, stochastic models of reaction networks have been broadly adopted and extensively used. Such models, for instance, form a cornerstone for studying heterogeneity in clonal cell populations. In biological applications, one is often interested in knowing the long-term behavior and stability properties of reaction networks even with incomplete knowledge of the model parameters. However for stochastic models, no analytical tools are known for this purpose, forcing many researchers to use a simulation-based approach, which is highly unsatisfactory. To address this issue, we develop a theoretical and computational framework for determining the long-term behavior and stability properties for stochastic reaction networks. Our approach is based on a mixture of ideas from probability theory, linear algebra and optimization theory. We illustrate the broad applicability of our results by considering examples from various biological areas. The biological implications of our results are discussed as well.
doi:10.1371/journal.pcbi.1003669
PMCID: PMC4072526  PMID: 24968191
4.  Propagation of kinetic uncertainties through a canonical topology of the TLR4 signaling network in different regions of biochemical reaction space 
Background
Signal transduction networks represent the information processing systems that dictate which dynamical regimes of biochemical activity can be accessible to a cell under certain circumstances. One of the major concerns in molecular systems biology is centered on the elucidation of the robustness properties and information processing capabilities of signal transduction networks. Achieving this goal requires the establishment of causal relations between the design principle of biochemical reaction systems and their emergent dynamical behaviors.
Methods
In this study, efforts were focused on the construction of a relatively well informed, deterministic, non-linear dynamic model, accounting for reaction mechanisms grounded on standard mass action and Hill saturation kinetics, of the canonical reaction topology underlying Toll-like receptor 4 (TLR4)-mediated signaling events. This signaling mechanism has been shown to be deployed in macrophages during a relatively short time window in response to lipopolysaccharide (LPS) stimulation, which leads to a rapidly mounted innate immune response. An extensive computational exploration of the biochemical reaction space inhabited by this signal transduction network was performed via local and global perturbation strategies. Importantly, a broad spectrum of biologically plausible dynamical regimes accessible to the network in widely scattered regions of parameter space was reconstructed computationally. Additionally, experimentally reported transcriptional readouts of target pro-inflammatory genes, which are actively modulated by the network in response to LPS stimulation, were also simulated. This was done with the main goal of carrying out an unbiased statistical assessment of the intrinsic robustness properties of this canonical reaction topology.
Results
Our simulation results provide convincing numerical evidence supporting the idea that a canonical reaction mechanism of the TLR4 signaling network is capable of performing information processing in a robust manner, a functional property that is independent of the signaling task required to be executed. Nevertheless, it was found that the robust performance of the network is not solely determined by its design principle (topology), but may also depend heavily on the network's current position in biochemical reaction space. Ultimately, our results enabled the identification of key rate-limiting steps that most effectively control the performance of the system under diverse dynamical regimes.
Conclusions
Overall, our in silico study suggests that biologically relevant and non-intuitive aspects on the general behavior of a complex biomolecular network can be elucidated only when taking into account a wide spectrum of dynamical regimes attainable by the system. Most importantly, this strategy provides the means for a suitable assessment of the inherent variational constraints imposed by the structure of the system when systematically probing its parameter space.
doi:10.1186/1742-4682-7-7
PMCID: PMC2907738  PMID: 20230643
5.  The Fidelity of Dynamic Signaling by Noisy Biomolecular Networks 
PLoS Computational Biology  2013;9(3):e1002965.
Cells live in changing, dynamic environments. To understand cellular decision-making, we must therefore understand how fluctuating inputs are processed by noisy biomolecular networks. Here we present a general methodology for analyzing the fidelity with which different statistics of a fluctuating input are represented, or encoded, in the output of a signaling system over time. We identify two orthogonal sources of error that corrupt perfect representation of the signal: dynamical error, which occurs when the network responds on average to other features of the input trajectory as well as to the signal of interest, and mechanistic error, which occurs because biochemical reactions comprising the signaling mechanism are stochastic. Trade-offs between these two errors can determine the system's fidelity. By developing mathematical approaches to derive dynamics conditional on input trajectories we can show, for example, that increased biochemical noise (mechanistic error) can improve fidelity and that both negative and positive feedback degrade fidelity, for standard models of genetic autoregulation. For a group of cells, the fidelity of the collective output exceeds that of an individual cell and negative feedback then typically becomes beneficial. We can also predict the dynamic signal for which a given system has highest fidelity and, conversely, how to modify the network design to maximize fidelity for a given dynamic signal. Our approach is general, has applications to both systems and synthetic biology, and will help underpin studies of cellular behavior in natural, dynamic environments.
Author Summary
Cells do not live in constant conditions, but in environments that change over time. To adapt to their surroundings, cells must therefore sense fluctuating concentrations and ‘interpret’ the state of their environment to see whether, for example, a change in the pattern of gene expression is needed. This task is achieved via the noisy computations of biomolecular networks. But what levels of signaling fidelity can be achieved and how are dynamic signals encoded in the network's outputs? Here we present a general technique for analyzing such questions. We identify two sources of signaling error: dynamic error, which occurs when the network responds to features of the input other than the signal of interest; and mechanistic error, which arises because of the inevitable stochasticity of biochemical reactions. We show analytically that increased biochemical noise can sometimes improve fidelity and that, for genetic autoregulation, feedback can be deleterious. Our approach also allows us to predict the dynamic signal for which a given signaling network has highest fidelity and to design networks to maximize fidelity for a given signal. We thus propose a new way to analyze the flow of information in signaling networks, particularly for the dynamic environments expected in nature.
doi:10.1371/journal.pcbi.1002965
PMCID: PMC3610653  PMID: 23555208
6.  Developing optimal input design strategies in cancer systems biology with applications to microfluidic device engineering 
BMC Bioinformatics  2009;10(Suppl 12):S4.
Background
Mechanistic models are becoming more and more popular in Systems Biology; identification and control of models underlying biochemical pathways of interest in oncology is a primary goal in this field. Unfortunately the scarce availability of data still limits our understanding of the intrinsic characteristics of complex pathologies like cancer: acquiring information for a system understanding of complex reaction networks is time consuming and expensive. Stimulus response experiments (SRE) have been used to gain a deeper insight into the details of biochemical mechanisms underlying cell life and functioning. Optimisation of the input time-profile, however, still remains a major area of research due to the complexity of the problem and its relevance for the task of information retrieval in systems biology-related experiments.
Results
We have addressed the problem of quantifying the information associated with an experiment using the Fisher Information Matrix, and we have proposed an optimal experimental design strategy based on an evolutionary algorithm to cope with the problem of information gathering in Systems Biology. On the basis of the theoretical results obtained in the field of control systems theory, we have studied the dynamical properties of the signals to be used in cell stimulation. The results of this study have been used to develop a microfluidic device for the automation of the process of cell stimulation for system identification.
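To make the role of the Fisher Information Matrix concrete, the following Python sketch scores two candidate experiments for a toy model. It is a simplified stand-in, not the paper's method: the exponential-decay model y(t) = A·exp(-k·t), the Gaussian noise assumption, the choice of sampling times (rather than an input time-profile) as the design variable, and all names and values are assumptions made for illustration. Under these assumptions the matrix is the sum over measurement times of the outer products of the parameter sensitivities, divided by the noise variance.

```python
import math


def sensitivities(t, amplitude, rate):
    """Partial derivatives of y(t) = amplitude * exp(-rate * t) w.r.t. (amplitude, rate)."""
    decay = math.exp(-rate * t)
    return (decay, -amplitude * t * decay)


def fisher_information(times, amplitude, rate, sigma):
    """2x2 Fisher Information Matrix for independent Gaussian noise of std sigma."""
    fim = [[0.0, 0.0], [0.0, 0.0]]
    for t in times:
        s = sensitivities(t, amplitude, rate)
        for a in range(2):
            for b in range(2):
                fim[a][b] += s[a] * s[b] / sigma ** 2
    return fim


def d_criterion(fim):
    """Determinant of the 2x2 matrix; larger means a smaller uncertainty ellipsoid."""
    return fim[0][0] * fim[1][1] - fim[0][1] * fim[1][0]


# Two hypothetical sampling schedules with the same number of measurements.
spread_times = [0.0, 1.0, 2.0, 4.0]
clustered_times = [1.0, 1.0, 1.0, 1.0]
for name, times in (("spread", spread_times), ("clustered", clustered_times)):
    fim = fisher_information(times, amplitude=1.0, rate=1.0, sigma=1.0)
    print(name, d_criterion(fim))
```

Repeating the same time point makes the matrix rank-deficient, so the two parameters cannot be separated; spreading the measurements yields a strictly positive determinant. An optimizer such as an evolutionary algorithm would search over designs to maximize a criterion of this kind.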
Conclusion
We have applied the proposed approach to the Epidermal Growth Factor Receptor pathway and we observed that it minimises the amount of parametric uncertainty associated with the identified model. A statistical framework based on Monte-Carlo estimations of the uncertainty ellipsoid confirmed the superiority of optimally designed experiments over canonical inputs. The proposed approach can be easily extended to multiobjective formulations that can also take advantage of identifiability analysis. Moreover, the availability of fully automated microfluidic platforms explicitly developed for the task of biochemical model identification will hopefully reduce the effects of the 'data rich-data poor' paradox in Systems Biology.
doi:10.1186/1471-2105-10-S12-S4
PMCID: PMC2762069  PMID: 19828080
7.  The slow-scale linear noise approximation: an accurate, reduced stochastic description of biochemical networks under timescale separation conditions 
BMC Systems Biology  2012;6:39.
Background
It is well known that the deterministic dynamics of biochemical reaction networks can be more easily studied if timescale separation conditions are invoked (the quasi-steady-state assumption). In this case the deterministic dynamics of a large network of elementary reactions are well described by the dynamics of a smaller network of effective reactions. Each of the latter represents a group of elementary reactions in the large network and has associated with it an effective macroscopic rate law. A popular method to achieve model reduction in the presence of intrinsic noise consists of using the effective macroscopic rate laws to heuristically deduce effective probabilities for the effective reactions which then enables simulation via the stochastic simulation algorithm (SSA). The validity of this heuristic SSA method is a priori doubtful because the reaction probabilities for the SSA have only been rigorously derived from microscopic physics arguments for elementary reactions.
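The heuristic SSA described above can be sketched in a few lines of Python for a single effective reaction, substrate to product, whose propensity is taken directly from a Michaelis-Menten rate law. This is an illustrative sketch of the heuristic under discussion, not code from the paper: the function name and parameter values are hypothetical, and `vmax` and `km` are assumed to be expressed in molecule-number units.

```python
import random


def heuristic_ssa(substrate, vmax, km, t_end, seed=0):
    """Gillespie simulation of S -> P with a Michaelis-Menten effective propensity.

    The propensity vmax * S / (km + S) is used as if the effective reaction
    were elementary, which is exactly the heuristic step whose validity is
    questioned in the text. Returns a list of (time, substrate, product).
    """
    rng = random.Random(seed)  # seeded for reproducibility
    t, product = 0.0, 0
    trajectory = [(t, substrate, product)]
    while substrate > 0:
        propensity = vmax * substrate / (km + substrate)
        t += rng.expovariate(propensity)  # exponential waiting time
        if t > t_end:
            break
        substrate -= 1
        product += 1
        trajectory.append((t, substrate, product))
    return trajectory


# Hypothetical parameters, chosen only to produce a short trajectory.
for time, s, p in heuristic_ssa(substrate=20, vmax=5.0, km=10.0, t_end=100.0, seed=3)[:5]:
    print(round(time, 3), s, p)
```

With only one reaction channel there is no channel selection step; a multi-reaction version would additionally pick which reaction fires in proportion to its propensity.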
Results
We here obtain, by rigorous means and in closed-form, a reduced linear Langevin equation description of the stochastic dynamics of monostable biochemical networks in conditions characterized by small intrinsic noise and timescale separation. The slow-scale linear noise approximation (ssLNA), as the new method is called, is used to calculate the intrinsic noise statistics of enzyme and gene networks. The results agree very well with SSA simulations of the non-reduced network of elementary reactions. In contrast the conventional heuristic SSA is shown to overestimate the size of noise for Michaelis-Menten kinetics, considerably under-estimate the size of noise for Hill-type kinetics and in some cases even miss the prediction of noise-induced oscillations.
Conclusions
A new general method, the ssLNA, is derived and shown to correctly describe the statistics of intrinsic noise about the macroscopic concentrations under timescale separation conditions. The ssLNA provides a simple and accurate means of performing stochastic model reduction and hence it is expected to be of widespread utility in studying the dynamics of large noisy reaction networks, as is common in computational and systems biology.
doi:10.1186/1752-0509-6-39
PMCID: PMC3532178  PMID: 22583770
8.  Using Chemistry and Microfluidics To Understand the Spatial Dynamics of Complex Biological Networks 
Accounts of chemical research  2008;41(4):549-558.
CONSPECTUS
Understanding the spatial dynamics of biochemical networks is both fundamentally important for understanding life at the systems level and also has practical implications for medicine, engineering, biology, and chemistry. Studies at the level of individual reactions provide essential information about the function, interactions, and localization of individual molecular species and reactions in a network. However, analyzing the spatial dynamics of complex biochemical networks at this level is difficult. Biochemical networks are non-equilibrium systems containing dozens to hundreds of reactions with nonlinear and time-dependent interactions, and these interactions are influenced by diffusion, flow, and the relative values of state-dependent kinetic parameters.
To achieve an overall understanding of the spatial dynamics of a network and the global mechanisms that drive its function, networks must be analyzed as a whole, where all of the components and influential parameters of a network are simultaneously considered. Here, we describe chemical concepts and microfluidic tools developed for network-level investigations of the spatial dynamics of these networks. Modular approaches can be used to simplify these networks by separating them into modules, and simple experimental or computational models can be created by replacing each module with a single reaction. Microfluidics can be used to implement these models as well as to analyze and perturb the complex network itself with spatial control on the micrometer scale.
We also describe the application of these network-level approaches to elucidate the mechanisms governing the spatial dynamics of two networks–hemostasis (blood clotting) and early patterning of the Drosophila embryo. To investigate the dynamics of the complex network of hemostasis, we simplified the network by using a modular mechanism and created a chemical model based on this mechanism by using microfluidics. Then, we used the mechanism and the model to predict the dynamics of initiation and propagation of blood clotting and tested these predictions with human blood plasma by using microfluidics. We discovered that both initiation and propagation of clotting are regulated by a threshold response to the concentration of activators of clotting, and that clotting is sensitive to the spatial localization of stimuli. To understand the dynamics of patterning of the Drosophila embryo, we used microfluidics to perturb the environment around a developing embryo and observe the effects of this perturbation on the expression of Hunchback, a protein whose localization is essential to proper development. We found that the mechanism that is responsible for Hunchback positioning is asymmetric, time-dependent, and more complex than previously proposed by studies of individual reactions.
Overall, these approaches provide strategies for simplifying, modeling, and probing complex networks without sacrificing the functionality of the network. Such network-level strategies may be most useful for understanding systems with nonlinear interactions where spatial dynamics is essential for function. In addition, microfluidics provides an opportunity to investigate the mechanisms responsible for robust functioning of complex networks. By creating nonideal, stressful, and perturbed environments, microfluidic experiments could reveal the function of pathways thought to be nonessential under ideal conditions.
doi:10.1021/ar700174g
PMCID: PMC2593841  PMID: 18217723
9.  Synthetic in vitro transcriptional oscillators 
A fundamental goal of synthetic biology is to understand design principles through engineering biochemical systems. Three in vitro synthetic transcriptional oscillators were constructed and analyzed: a two-node negative-feedback oscillator, an amplified negative-feedback oscillator, and a three-node ring oscillator. The in vitro oscillators are governed by similar design principles as previous theoretical studies and synthetic oscillators in vivo. Because of unintended reactions that arise even without the complexity of living cells, several challenges remain for predictive and robust oscillator performance.
Fundamental goals for synthetic biology are to understand the principles of biological circuitry from an engineering perspective and to establish engineering methods for creating biochemical circuitry to control molecular processes, both in vitro and in vivo (Benner and Sismour, 2005; Andrianantoandro et al, 2006). Here, we make use of a previously proposed class of in vitro biochemical systems, transcriptional circuits, that can be modularly wired into arbitrarily complex networks by changing the regulatory and coding sequence domains of DNA templates (Kim et al, 2006; Subsoontorn et al, 2011). Using design motifs for inhibitory and excitatory regulations, three different oscillator designs were constructed and characterized: a two-switch negative-feedback oscillator, loosely analogous to the p53–Mdm2-feedback loop (Bar-Or et al, 2000); the same oscillator augmented with a positive-feedback loop, loosely analogous to a synthetic relaxation oscillator (Atkinson et al, 2003); and a three-switch ring oscillator analogous to the repressilator (Elowitz and Leibler, 2000).
DNA and RNA hybridization reactions (Figure 1B) can be assembled to create either an inhibitable switch (Figure 1A, right and bottom) with a threshold set by the total concentration of its DNA activator strand (Figure 1C, bottom), or an activatable switch (Figure 1A, left and top) with a threshold set by its DNA inhibitor strand concentration (Figure 1C, top). This threshold mechanism is analogous to biological threshold mechanisms such as ‘inhibitor ultrasensitivity' (Ferrell, 1996) and ‘molecular titration' (Buchler and Louis, 2008). Using these design motifs, we constructed a two-switch negative-feedback oscillator (Figure 1A, inset): RNA activator rA1 activates the production of RNA inhibitor rI2 by modulating switch Sw21, while RNA inhibitor rI2, in turn, inhibits the production of RNA activator rA1 by modulating switch Sw12. A total of seven DNA strands are used, in addition to the two enzymes, bacteriophage T7 RNA polymerase and Escherichia coli ribonuclease H. The fact that such a negative-feedback loop can lead to temporal oscillations can be seen from a mathematical model of transcriptional networks. Experimental results showed qualitative agreement with predicted oscillator behavior from simple model simulations.
The fully optimized system revealed five complete oscillation cycles with a nearly 50% amplitude swing (Figure 3A) until, after ∼20 h, the production rate could no longer be sustained in the batch reaction. Gel measurements verified oscillations in RNA concentrations and switch states (Figure 3B and C). However, to our surprise, rather than oscillating with constant amplitude and constant mean, the RNA inhibitor concentration built up after each cycle. An extended mathematical model that incorporated an interference reaction from ‘waste’ product (Figure 3B and C) could qualitatively capture this behavior.
Using a new autoregulatory switch Sw11, we added a positive-feedback loop to the two-node oscillator to make an amplified negative feedback oscillator (Design II, Figure 1D). Further, we replaced the excitatory connection of Sw21 by a chain of two inhibitory connections, Sw23 and Sw31, to construct a three-switch ring oscillator (Design III, Figure 1D). All three oscillator designs could be tuned to reach the oscillatory regime in parameter space.
Reassuringly, our in vitro oscillators exhibit several design principles previously observed in vivo. (1) Introducing delay in a simple negative-feedback loop can help achieve stable oscillation (Novák and Tyson, 2008; Stricker et al, 2008). (2) The addition of a positive-feedback self-loop to a negative-feedback oscillator provides access to rich dynamics and improved tunability (Tsai et al, 2008). (3) Oscillations in biochemical ring oscillators (such as the repressilator) are sensitive to parameter asymmetry among individual components (Tuttle et al, 2005). (4) The saturation of degradation machinery and the management of waste products could play an important role.
However, several significant difficulties remain for predictive and robust oscillator performance: the limited lifetime of closed batch reactions, interference from waste products, and asymmetry of switch components make quantitative modeling and prediction difficult. As a complementary approach to the top-down view of systems biology, cell-free in vitro systems offer a valuable training ground to create and explore increasingly interesting and powerful information-based chemical systems (Simpson, 2006). In vitro oscillators could be used to orchestrate other chemical processes such as DNA nanomachines (Dittmer and Simmel, 2004) and to provide embedded controllers within prototype artificial cells (Noireaux and Libchaber, 2004; Griffiths and Tawfik, 2006).
The construction of synthetic biochemical circuits from simple components illuminates how complex behaviors can arise in chemistry and builds a foundation for future biological technologies. A simplified analog of genetic regulatory networks, in vitro transcriptional circuits, provides a modular platform for the systematic construction of arbitrary circuits and requires only two essential enzymes, bacteriophage T7 RNA polymerase and Escherichia coli ribonuclease H, to produce and degrade RNA signals. In this study, we design and experimentally demonstrate three transcriptional oscillators in vitro. First, a negative feedback oscillator comprising two switches, regulated by excitatory and inhibitory RNA signals, showed up to five complete cycles. To demonstrate modularity and to explore the design space further, a positive-feedback loop was added that modulates and extends the oscillatory regime. Finally, a three-switch ring oscillator was constructed and analyzed. Mathematical modeling guided the design process, identified experimental conditions likely to yield oscillations, and explained the system's robust response to interference by short degradation products. Synthetic transcriptional oscillators could prove valuable for systematic exploration of biochemical circuit design principles and for controlling nanoscale devices and orchestrating processes within artificial cells.
doi:10.1038/msb.2010.119
PMCID: PMC3063688  PMID: 21283141
cell free; in vitro; oscillation; synthetic biology; transcriptional circuits
10.  Systematic integration of experimental data and models in systems biology 
BMC Bioinformatics  2010;11:582.
Background
The behaviour of biological systems can be deduced from their mathematical models. However, multiple sources of data in diverse forms are required in the construction of a model in order to define its components and their biochemical reactions, and corresponding parameters. Automating the assembly and use of systems biology models is dependent upon data integration processes involving the interoperation of data and analytical resources.
Results
Taverna workflows have been developed for the automated assembly of quantitative parameterised metabolic networks in the Systems Biology Markup Language (SBML). An SBML model is built in a systematic fashion by the workflows, which start with the construction of a qualitative network using data from a MIRIAM-compliant genome-scale model of yeast metabolism. This is followed by parameterisation of the SBML model with experimental data from two repositories, the SABIO-RK enzyme kinetics database and a database of quantitative experimental results. The models are then calibrated and simulated in workflows that call out to COPASIWS, the web service interface to the COPASI software application for analysing biochemical networks. These systems biology workflows were evaluated for their ability to construct a parameterised model of yeast glycolysis.
Conclusions
Distributed information about metabolic reactions that have been described to MIRIAM standards enables the automated assembly of quantitative systems biology models of metabolic networks based on user-defined criteria. Such data integration processes can be implemented as Taverna workflows to provide a rapid overview of the components and their relationships within a biochemical system.
doi:10.1186/1471-2105-11-582
PMCID: PMC3008707  PMID: 21114840
11.  Comparing Transcription Rate and mRNA Abundance as Parameters for Biochemical Pathway and Network Analysis 
PLoS ONE  2010;5(3):e9908.
Cells adapt to extra- and intra-cellular signals by dynamically orchestrating the activities of pathways in their biochemical networks. Dynamic control of the gene expression process represents a major mechanism for pathway activity regulation. Gene expression has thus been routinely measured, most frequently at the steady-state mRNA abundance level using microarray technology. The results are widely used in statistical inference of the structures of underlying biochemical networks, with the assumption that functionally related genes exhibit similar dynamic profiles. Steady-state mRNA abundance, however, is a composite of two factors: transcription rate and mRNA degradation rate. The question asked here is therefore whether steady-state mRNA abundance or either of the two factors is a more informative measurement target for studying network dynamics. The yeast S. cerevisiae was used as the model organism, and transcription rate was chosen out of the two factors in this study because genome-wide determination of transcription rates has been reported for several physiological processes in this species. Our strategy is to test which one is a better measure of functional relatedness between genes. The analysis was performed on those S. cerevisiae genes that have bacterial orthologs as identified by reciprocal BLAST analysis, so that functional relatedness of a gene pair can be measured by the frequency at which their bacterial orthologs co-occur in the same operon in the collection of bacterial genomes. Transcription rate data were found to be generally a better indicator of functional relatedness than steady-state mRNA abundance, suggesting that transcription rate data are more informative for deciphering the logic used by cells in the dynamic regulation of biochemical network behaviors. The significance of this finding for network and systems biology, as well as biomedical research in general, is discussed.
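The statement that steady-state abundance is a composite of transcription and degradation rates can be made concrete with the simplest possible model, dm/dt = k_tx − k_deg·m, whose steady state is m* = k_tx / k_deg. This first-order model, the function name and the numerical values below are assumptions for illustration and are not taken from the study.

```python
def steady_state_abundance(transcription_rate, degradation_rate):
    """Steady state of dm/dt = transcription_rate - degradation_rate * m."""
    if degradation_rate <= 0:
        raise ValueError("degradation_rate must be positive")
    return transcription_rate / degradation_rate


# Two hypothetical genes: a tenfold difference in transcription rate is hidden
# by a tenfold difference in degradation rate, so their abundances coincide.
gene_a = steady_state_abundance(transcription_rate=10.0, degradation_rate=2.0)
gene_b = steady_state_abundance(transcription_rate=1.0, degradation_rate=0.2)

if __name__ == "__main__":
    print(gene_a, gene_b)
```

Under this assumption, two genes with identical abundance profiles can have very different transcription rates, which is why the two quantities need not carry the same information about regulation.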
doi:10.1371/journal.pone.0009908
PMCID: PMC2845646  PMID: 20361042
12.  A systematic molecular circuit design method for gene networks under biochemical time delays and molecular noises 
BMC Systems Biology  2008;2:103.
Background
Gene networks at the nanoscale are nonlinear stochastic processes. Time delays are common and substantial in these biochemical processes due to gene transcription, translation, post-translational protein modification and diffusion. Molecular noises in gene networks come from intrinsic fluctuations, transmitted noise from upstream genes, and the global noise affecting all genes. Knowledge of molecular noise filtering and biochemical process delay compensation in gene networks is crucial for understanding signal processing in gene networks and for designing noise-tolerant and delay-robust gene circuits for synthetic biology.
Results
A nonlinear stochastic dynamic model with multiple time delays is proposed for describing a gene network under process delays, intrinsic molecular fluctuations, and extrinsic molecular noises. Then, the stochastic biochemical processing scheme of gene regulatory networks for attenuating these molecular noises and compensating process delays is investigated from the nonlinear signal processing perspective. In order to improve the robust stability for delay toleration and noise filtering, a robust gene circuit for nonlinear stochastic time-delay gene networks is engineered based on the nonlinear robust H∞ stochastic filtering scheme. Further, in order to avoid solving these complicated noise-tolerant and delay-robust design problems, based on Takagi-Sugeno (T-S) fuzzy time-delay model and linear matrix inequalities (LMIs) technique, a systematic gene circuit design method is proposed to simplify the design procedure.
Conclusion
The proposed gene circuit design method has considerable potential for application in systems biology, synthetic biology and drug design, whether a gene regulatory network must be designed to improve the robust stability and filtering ability of a disease-perturbed gene network, or a synthetic gene network needs to perform robustly under process delays and molecular noises.
doi:10.1186/1752-0509-2-103
PMCID: PMC2661895  PMID: 19038029
13.  Construction of a computable cell proliferation network focused on non-diseased lung cells 
BMC Systems Biology  2011;5:105.
Background
Critical to advancing the systems-level evaluation of complex biological processes is the development of comprehensive networks and computational methods to apply to the analysis of systems biology data (transcriptomics, proteomics/phosphoproteomics, metabolomics, etc.). Ideally, these networks will be specifically designed to capture the normal, non-diseased biology of the tissue or cell types under investigation, and can be used with experimentally generated systems biology data to assess the biological impact of perturbations like xenobiotics and other cellular stresses. Lung cell proliferation is a key biological process to capture in such a network model, given the pivotal role that proliferation plays in lung diseases including cancer, chronic obstructive pulmonary disease (COPD), and fibrosis. Unfortunately, no such network has been available prior to this work.
Results
To further a systems-level assessment of the biological impact of perturbations on non-diseased mammalian lung cells, we constructed a lung-focused network for cell proliferation. The network encompasses diverse biological areas that lead to the regulation of normal lung cell proliferation (Cell Cycle, Growth Factors, Cell Interaction, Intra- and Extracellular Signaling, and Epigenetics), and contains a total of 848 nodes (biological entities) and 1597 edges (relationships between biological entities). The network was verified using four published gene expression profiling data sets associated with measured cell proliferation endpoints in lung and lung-related cell types. Predicted changes in the activity of core machinery involved in cell cycle regulation (RB1, CDKN1A, and MYC/MYCN) are statistically supported across multiple data sets, underscoring the general applicability of this approach for a network-wide biological impact assessment using systems biology data.
Conclusions
To the best of our knowledge, this lung-focused Cell Proliferation Network provides the most comprehensive connectivity map in existence of the molecular mechanisms regulating cell proliferation in the lung. The network is based on fully referenced causal relationships obtained from extensive evaluation of the literature. The computable structure of the network enables its application to the qualitative and quantitative evaluation of cell proliferation using systems biology data sets. The network is available for public use.
doi:10.1186/1752-0509-5-105
PMCID: PMC3160372  PMID: 21722388
14.  Scalable Rule-Based Modelling of Allosteric Proteins and Biochemical Networks 
PLoS Computational Biology  2010;6(11):e1000975.
Much of the complexity of biochemical networks comes from the information-processing abilities of allosteric proteins, be they receptors, ion-channels, signalling molecules or transcription factors. An allosteric protein can be uniquely regulated by each combination of input molecules that it binds. This “regulatory complexity” causes a combinatorial increase in the number of parameters required to fit experimental data as the number of protein interactions increases. It therefore challenges the creation, updating, and re-use of biochemical models. Here, we propose a rule-based modelling framework that exploits the intrinsic modularity of protein structure to address regulatory complexity. Rather than treating proteins as “black boxes”, we model their hierarchical structure and, as conformational changes, internal dynamics. By modelling the regulation of allosteric proteins through these conformational changes, we often decrease the number of parameters required to fit data, and so reduce over-fitting and improve the predictive power of a model. Our method is thermodynamically grounded, imposes detailed balance, and also includes molecular cross-talk and the background activity of enzymes. We use our Allosteric Network Compiler to examine how allostery can facilitate macromolecular assembly and how competitive ligands can change the observed cooperativity of an allosteric protein. We also develop a parsimonious model of G protein-coupled receptors that explains functional selectivity and can predict the rank order of potency of agonists acting through a receptor. Our methodology should provide a basis for scalable, modular and executable modelling of biochemical networks in systems and synthetic biology.
Author Summary
The complexity of biochemical networks challenges our ability to create quantitative and predictive models of cellular responses to extracellular changes. In these networks, the regulation of allosteric receptors and proteins by multiple drugs or endogenous ligands introduces “regulatory complexity” because a large number of parameters is required to describe such interactions. Protein interactions also give rise to “combinatorial complexity” by generating large numbers of protein complexes and covalent modification states. To address these twin problems, we propose a modelling framework that combines a modular description of protein structure and function with a rule-based description of protein interactions. We define the input-output function of an allosteric protein through its thermodynamic properties and structural components. We show that our “biomolecule-centric” methodology, in contrast to ad hoc approaches that emphasize the regulatory logic of interactions, can reduce the number of parameters required to model experimental observations. We also demonstrate how the application of our framework gives insights into the assembly of macromolecular complexes and increases the predictive power of a standard model of G protein-coupled receptors. These benefits are possible in many systems, given the ubiquity of allostery in biochemical networks. Our research delineates a fundamental relationship between allostery, modularity, and complexity in biochemical networks.
doi:10.1371/journal.pcbi.1000975
PMCID: PMC2973810  PMID: 21079669
15.  A method for inverse bifurcation of biochemical switches: inferring parameters from dose response curves 
BMC Systems Biology  2014;8(1):114.
Background
Within cells, stimuli are transduced into cell responses by complex networks of biochemical reactions. In many cell decision processes the underlying networks behave as bistable switches, converting graded stimuli or inputs into all-or-none cell responses. By observing how systems respond to different perturbations and developing mathematical models, insight can be gained into the underlying molecular mechanisms. Emergent properties of systems, like bistability, can be exploited for this purpose. One of the main challenges in modeling intracellular processes, from signaling pathways to gene regulatory networks, is to deal with high structural and parametric uncertainty, due to the complexity of the systems and the difficulty of obtaining experimental measurements. Formal methods that exploit structural properties of networks for parameter estimation can help to overcome these problems.
Results
We here propose a novel method to infer the kinetic parameters of bistable biochemical network models. Bistable systems typically show hysteretic dose response curves, in which the so-called bifurcation points can be located experimentally. We exploit the fact that, at the bifurcation points, a condition for multistationarity derived in the context of Chemical Reaction Network Theory must be fulfilled. Chemical Reaction Network Theory has attracted attention from the (systems) biology community since it connects the structure of biochemical reaction networks to qualitative properties of the corresponding model of ordinary differential equations. The inverse bifurcation method developed here makes it possible to determine the parameters that produce the expected behavior of the dose response curves and, in particular, the observed location of the bifurcation points given by experimental data.
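To illustrate what a bifurcation point of a hysteretic dose response curve is, the sketch below uses a one-variable toy switch, dx/dt = s + x²/(1 + x²) − k·x, where s plays the role of the stimulus. This model, its parameter k and all function names are assumptions chosen for illustration; it does not implement the Chemical Reaction Network Theory condition used in the paper. At a saddle-node (fold) point both the net rate and its derivative with respect to x vanish, and an inverse approach would adjust parameters such as k until the fold points fall at the experimentally observed stimulus values.

```python
def net_rate(x, s, k=0.5):
    """dx/dt for a toy switch: stimulus s, cooperative self-activation, linear decay."""
    return s + x**2 / (1.0 + x**2) - k * x


def rate_slope(x, k=0.5):
    """Derivative of net_rate with respect to x (independent of s)."""
    return 2.0 * x / (1.0 + x**2) ** 2 - k


def stimulus_at(x, k=0.5):
    """Stimulus s for which x is a steady state, i.e. net_rate(x, s, k) == 0."""
    return k * x - x**2 / (1.0 + x**2)


def count_steady_states(s, k=0.5, x_max=5.0, points=5001):
    """Count sign changes of net_rate on a uniform grid of x values in [0, x_max]."""
    xs = [x_max * i / (points - 1) for i in range(points)]
    values = [net_rate(x, s, k) for x in xs]
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)


if __name__ == "__main__":
    # For k = 0.5 the toy model has three steady states for s roughly between 0 and 0.067.
    for s in (0.03, 0.1):
        print(f"s={s}: {count_steady_states(s)} steady state(s)")
```

For k = 0.5, x = 1 is a fold point of this toy model: both `rate_slope(1.0)` and `stimulus_at(1.0)` evaluate to zero, so one edge of the bistable window sits at s = 0.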
Conclusions
Our inverse bifurcation method exploits inherent structural properties of bistable switches in order to estimate kinetic parameters of bistable biochemical networks, opening a promising route for developments in Chemical Reaction Network Theory towards kinetic model identification.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0114-2) contains supplementary material, which is available to authorized users.
doi:10.1186/s12918-014-0114-2
PMCID: PMC4263113  PMID: 25409687
Biochemical reaction network; Bistability; Saddle node bifurcation; Dose response curve; Chemical reaction network theory
16.  Metabolic Constraint-Based Refinement of Transcriptional Regulatory Networks 
PLoS Computational Biology  2013;9(12):e1003370.
There is a strong need for computational frameworks that integrate different biological processes and data-types to unravel cellular regulation. Current efforts to reconstruct transcriptional regulatory networks (TRNs) focus primarily on proximal data such as gene co-expression and transcription factor (TF) binding. While such approaches enable rapid reconstruction of TRNs, the overwhelming combinatorics of possible networks limits identification of mechanistic regulatory interactions. Utilizing growth phenotypes and systems-level constraints to inform regulatory network reconstruction is an unmet challenge. We present our approach Gene Expression and Metabolism Integrated for Network Inference (GEMINI) that links a compendium of candidate regulatory interactions with the metabolic network to predict their systems-level effect on growth phenotypes. We then compare predictions with experimental phenotype data to select phenotype-consistent regulatory interactions. GEMINI makes use of the observation that only a small fraction of regulatory network states are compatible with a viable metabolic network, and outputs a regulatory network that is simultaneously consistent with the input genome-scale metabolic network model, gene expression data, and TF knockout phenotypes. GEMINI preferentially recalls gold-standard interactions (p-value = 10^-172), significantly better than using gene expression alone. We applied GEMINI to create an integrated metabolic-regulatory network model for Saccharomyces cerevisiae involving 25,000 regulatory interactions controlling 1597 metabolic reactions. The model quantitatively predicts TF knockout phenotypes in new conditions (p-value = 10^-14) and revealed potential condition-specific regulatory mechanisms.
Our results suggest that a metabolic constraint-based approach can be successfully used to help reconstruct TRNs from high-throughput data, and highlight the potential of using a biochemically-detailed mechanistic framework to integrate and reconcile inconsistencies across different data-types. The algorithm and associated data are available at https://sourceforge.net/projects/gemini-data/
Author Summary
Cellular networks, such as metabolic and transcriptional regulatory networks (TRNs), do not operate independently but work together in unison to determine cellular phenotypes. Further, the phenotype and architecture of one network constrains the topology of other networks. Hence, it is critical to study network components and interactions in the context of the entire cell. Typically, efforts to reconstruct TRNs focus only on immediately proximal data such as gene co-expression and transcription factor (TF)-binding. Herein, we take a different strategy by linking candidate TRNs with the metabolic network to predict systems-level responses such as growth phenotypes of TF knockout strains, and compare predictions with experimental phenotype data to select amongst the candidate TRNs. Our approach goes beyond traditional data integration approaches for network inference and refinement by using a predictive network model (metabolism) to refine another network model (regulation) – thus providing an alternative avenue to this area of research. Understanding how the networks function together in a cell will pave the way for synthetic biology and has a wide-range of applications in biotechnology, drug discovery and diagnostics. Further we demonstrate how metabolic models can integrate and reconcile inconsistencies across different data-types.
doi:10.1371/journal.pcbi.1003370
PMCID: PMC3857774  PMID: 24348226
17.  Systems infection biology: a compartmentalized immune network of pig spleen challenged with Haemophilus parasuis 
BMC Genomics  2013;14:46.
Background
Network biology (systems biology) approaches are useful tools for elucidating the host infection processes that often accompany complex immune networks. Although many studies have recently focused on Haemophilus parasuis, a model Gram-negative bacterium, little attention has been paid to the host's immune response to infection. In this article, we use network biology to investigate infection with Haemophilus parasuis in an in vivo pig model.
Results
By targeting the spleen immunogenome, we established an expression signature indicative of H. parasuis infection using a PCA/GSEA combined method. We reconstructed the immune network and estimated the network topology parameters that characterize the immunogene expressions in response to H. parasuis infection. The results showed that the immune network of H. parasuis infection is compartmentalized (not globally linked). Statistical analysis revealed that the reconstructed network is scale-free but not small-world. Based on the quantitative topological prioritization, we inferred that the C1R-centered clique might play a vital role in responding to H. parasuis infection.
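Topology parameters of the kind mentioned above are typically derived from quantities such as the degree distribution (a heavy-tailed, power-law-like distribution is what "scale-free" refers to) and the clustering coefficient (one of the ingredients of small-world tests). The sketch below computes both for a tiny undirected graph. The node names and edges are hypothetical and unrelated to the reconstructed immune network, and the code does not perform the statistical tests used in the study.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (node, node) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def degree_distribution(adj):
    """Map each degree to the number of nodes that have it."""
    counts = defaultdict(int)
    for node in adj:
        counts[len(adj[node])] += 1
    return dict(counts)


def local_clustering(adj, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    neighbours = adj[node]
    if len(neighbours) < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    possible = len(neighbours) * (len(neighbours) - 1) / 2
    return linked / possible


def average_clustering(adj):
    nodes = list(adj)
    return sum(local_clustering(adj, n) for n in nodes) / len(nodes)


# Hypothetical toy graph: a triangle (A, B, C) with one pendant node D attached to C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
toy_adj = build_adjacency(toy_edges)

if __name__ == "__main__":
    print(degree_distribution(toy_adj))
    print(average_clustering(toy_adj))
```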
Conclusions
Here, we provide the first report of reconstruction of the immune network in H. parasuis-infected porcine spleen. The distinguishing feature of our work is the focus on utilizing the immunogenome for a network biology-oriented analysis. Our findings complement and extend the frontiers of knowledge of host infection biology for H. parasuis and also provide a new clue for systems infection biology of Gram-negative bacilli in mammals.
doi:10.1186/1471-2164-14-46
PMCID: PMC3610166  PMID: 23339624
Pig model; Haemophilus parasuis; Spleen; Immunogenome; Network; Quantitative topology; Scale-free; C1R
18.  The Signaling Petri Net-Based Simulator: A Non-Parametric Strategy for Characterizing the Dynamics of Cell-Specific Signaling Networks 
PLoS Computational Biology  2008;4(2):e1000005.
Reconstructing cellular signaling networks and understanding how they work are major endeavors in cell biology. The scale and complexity of these networks, however, render their analysis using experimental biology approaches alone very challenging. As a result, computational methods have been developed and combined with experimental biology approaches, producing powerful tools for the analysis of these networks. These computational methods mostly fall on either end of a spectrum of model parameterization. On one end is a class of structural network analysis methods; these typically use the network connectivity alone to generate hypotheses about global properties. On the other end is a class of dynamic network analysis methods; these use, in addition to the connectivity, kinetic parameters of the biochemical reactions to predict the network's dynamic behavior. These predictions provide detailed insights into the properties that determine aspects of the network's structure and behavior. However, the difficulty of obtaining numerical values of kinetic parameters is widely recognized to limit the applicability of this latter class of methods.
Several researchers have observed that the connectivity of a network alone can provide significant insights into its dynamics. Motivated by this fundamental observation, we present the signaling Petri net, a non-parametric model of cellular signaling networks, and the signaling Petri net-based simulator, a Petri net execution strategy for characterizing the dynamics of signal flow through a signaling network using token distribution and sampling. The result is a very fast method, which can analyze large-scale networks, and provide insights into the trends of molecules' activity-levels in response to an external stimulus, based solely on the network's connectivity.
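The idea of following signal flow through token counts, using connectivity alone, can be sketched as follows. This is a deliberately simplified stand-in and not the PathwayOracle implementation: the firing rules (an activation edge adds one token to its target, an inhibition edge removes one), the random firing order within each step, the averaging over runs and all names are assumptions made for this example.

```python
import random


def simulate_tokens(edges, initial, steps=10, runs=20, seed=0):
    """Average token count per molecule after `steps` rounds, over `runs` repeats.

    edges:   list of (source, target, sign) with sign +1 (activation) or -1 (inhibition)
    initial: dict mapping every molecule name to its starting token count
    """
    rng = random.Random(seed)
    totals = {name: 0 for name in initial}
    for _ in range(runs):
        tokens = dict(initial)
        for _ in range(steps):
            order = list(edges)
            rng.shuffle(order)              # random firing order stands in for unknown kinetics
            for source, target, sign in order:
                if tokens[source] == 0:
                    continue                # a molecule with no tokens cannot signal
                if sign > 0:
                    tokens[target] += 1     # activation adds a token downstream
                elif tokens[target] > 0:
                    tokens[target] -= 1     # inhibition removes a token, never below zero
        for name in totals:
            totals[name] += tokens[name]
    return {name: totals[name] / runs for name in totals}


# Hypothetical three-node cascade: A activates B, B inhibits C.
toy_edges = [("A", "B", +1), ("B", "C", -1)]
toy_initial = {"A": 5, "B": 0, "C": 3}

if __name__ == "__main__":
    print(simulate_tokens(toy_edges, toy_initial))
```

In this toy cascade the activity of B rises and that of C falls to zero, a qualitative trend obtained without any kinetic parameters.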
We have implemented the signaling Petri net-based simulator in the PathwayOracle toolkit, which is publicly available at http://bioinfo.cs.rice.edu/pathwayoracle. Using this method, we studied a MAPK1,2 and AKT signaling network downstream from EGFR in two breast tumor cell lines. We analyzed, both experimentally and computationally, the activity level of several molecules in response to a targeted manipulation of TSC2 and mTOR-Raptor. The results from our method agreed with experimental results in greater than 90% of the cases considered, and in those where they did not agree, our approach provided valuable insights into discrepancies between known network connectivities and experimental observations.
Author Summary
Many cellular behaviors including growth, differentiation, and movement are influenced by external stimuli. Such external stimuli are obtained, processed, and carried to the nucleus by the signaling network—a dense network of cellular biochemical reactions. Beyond being interesting for their role in directing cellular behavior, deleterious changes in a cell's signaling network can alter a cell's responses to external stimuli, giving rise to devastating diseases such as cancer. As a result, building accurate mathematical and computational models of cellular signaling networks is a major endeavor in biology. The scale and complexity of these networks render them difficult to analyze by experimental techniques alone, which has led to the development of computational analysis methods. In this paper, we present a novel computational simulation technique that can provide qualitatively accurate predictions of the behavior of a cellular signaling network without requiring detailed knowledge of the signaling network's parameters. Our approach makes use of recent discoveries that network structure alone can determine many aspects of a network's dynamics. When compared against experimental results, our method correctly predicted 90% of the cases considered. In those where it did not agree, our approach provided valuable insights into discrepancies between known network structure and experimental observations.
doi:10.1371/journal.pcbi.1000005
PMCID: PMC2265486  PMID: 18463702
19.  Training Signaling Pathway Maps to Biochemical Data with Constrained Fuzzy Logic: Quantitative Analysis of Liver Cell Responses to Inflammatory Stimuli 
PLoS Computational Biology  2011;7(3):e1001099.
Predictive understanding of cell signaling network operation based on general prior knowledge but consistent with empirical data in a specific environmental context is a current challenge in computational biology. Recent work has demonstrated that Boolean logic can be used to create context-specific network models by training proteomic pathway maps to dedicated biochemical data; however, the Boolean formalism is restricted to characterizing protein species as either fully active or inactive. To advance beyond this limitation, we propose a novel form of fuzzy logic sufficiently flexible to model quantitative data but also sufficiently simple to efficiently construct models by training pathway maps on dedicated experimental measurements. Our new approach, termed constrained fuzzy logic (cFL), converts a prior knowledge network (obtained from literature or interactome databases) into a computable model that describes graded values of protein activation across multiple pathways. We train a cFL-converted network to experimental data describing hepatocytic protein activation by inflammatory cytokines and demonstrate the application of the resultant trained models for three important purposes: (a) generating experimentally testable biological hypotheses concerning pathway crosstalk, (b) establishing capability for quantitative prediction of protein activity, and (c) prediction and understanding of the cytokine release phenotypic response. Our methodology systematically and quantitatively trains a protein pathway map summarizing curated literature to context-specific biochemical data. This process generates a computable model yielding successful prediction of new test data and offering biological insight into complex datasets that are difficult to fully analyze by intuition alone.
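The contrast between Boolean and graded logic described above can be illustrated with a few lines of code. The sketch assumes a normalized Hill transfer function for each interaction and the common fuzzy-logic convention of evaluating AND as a minimum and OR as a maximum; the parameter values, the gate wiring and the input names (cytokine_1, cytokine_2, cytokine_3) are hypothetical and are not taken from the trained models.

```python
def normalized_hill(x, n=3.0, ec50=0.5, gain=1.0):
    """Graded transfer function on [0, 1]; returns 0 at x = 0 and `gain` at x = 1."""
    return gain * (ec50**n + 1.0) * x**n / (ec50**n + x**n)


def and_gate(*inputs):
    """Fuzzy AND: the output is limited by the weakest input."""
    return min(inputs)


def or_gate(*inputs):
    """Fuzzy OR: the output follows the strongest input."""
    return max(inputs)


def node_activity(cytokine_1, cytokine_2, cytokine_3):
    """Hypothetical node: (cytokine_1 AND cytokine_2) OR cytokine_3, inputs in [0, 1]."""
    return or_gate(
        and_gate(normalized_hill(cytokine_1), normalized_hill(cytokine_2)),
        normalized_hill(cytokine_3),
    )


if __name__ == "__main__":
    print(node_activity(1.0, 1.0, 0.0))   # both AND inputs fully on
    print(node_activity(1.0, 0.5, 0.0))   # partial activation through the AND gate
    print(node_activity(1.0, 0.0, 0.0))   # AND gate blocked by the missing input
```

Unlike a Boolean model, an intermediate input such as 0.5 yields an intermediate output, which is what allows this kind of model to be fitted to quantitative measurements.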
Author Summary
Over the past few years, many methods have been developed to construct large-scale networks from the literature or databases of genetic and physical interactions. With the advent of high-throughput biochemical methods, it is also possible to measure the states and activities of many proteins in these biochemical networks under different conditions of cellular stimulation and perturbation. Here we use constrained fuzzy logic to systematically compare interaction networks to experimental data. This systematic comparison elucidates interactions that were theoretically possible but not actually operating in the biological system of interest, as well as data that was not described by interactions in the prior knowledge network, pointing to a need to increase our knowledge in specific parts of the network. Furthermore, the result of this comparison is a trained, quantitative model that can be used to make a priori quantitative predictions about how the cellular protein network will respond in conditions not initially tested.
doi:10.1371/journal.pcbi.1001099
PMCID: PMC3048376  PMID: 21408212
20.  Learning Gene Networks under SNP Perturbations Using eQTL Datasets 
PLoS Computational Biology  2014;10(2):e1003420.
The standard approach for identifying gene networks is based on experimental perturbations of gene regulatory systems, such as gene knock-out experiments, followed by genome-wide profiling of differential gene expression. However, this approach is significantly limited in that it is not possible to perturb more than one or two genes simultaneously to discover complex gene interactions or to distinguish between direct and indirect downstream regulation of the differentially expressed genes. As an alternative, genetical genomics studies have been proposed that treat naturally occurring genetic variants as potential perturbants of the gene regulatory system and recover gene networks via analysis of population gene-expression and genotype data. Despite the many advantages of genetical genomics data analysis, the computational challenge of simultaneously decoding the effects of multifactorial genetic perturbations from data has prevented its widespread application. In this article, we propose a statistical framework for learning gene networks that overcomes the limitations of experimental perturbation methods and addresses the challenges of genetical genomics analysis. We introduce a new statistical model, called a sparse conditional Gaussian graphical model, and describe an efficient learning algorithm that simultaneously decodes the perturbations of the gene regulatory system by a large number of SNPs to identify a gene network along with the expression quantitative trait loci (eQTLs) that perturb this network. While our statistical model captures direct genetic perturbations of the gene network, by performing inference on the probabilistic graphical model we obtain detailed characterizations of how the direct SNP perturbation effects propagate through the gene network to perturb other genes indirectly. We demonstrate our statistical method using HapMap-simulated and yeast eQTL datasets. In particular, the yeast gene network identified computationally by our method under SNP perturbations is well supported by the results from experimental perturbation studies related to the DNA replication stress response.
Author Summary
A complete understanding of how gene regulatory networks are wired in a biological system is important in many areas of biology and medicine. The most popular method for investigating a gene network has been based on experimental perturbation studies, where the expression of a gene is experimentally manipulated to observe how this perturbation affects the expression of other genes. Such experimental methods are costly and laborious, and they do not scale to perturbing more than two genes at a time. As an alternative, the genetical genomics approach uses genetic variants as naturally occurring perturbations of the gene regulatory system and learns gene networks by decoding the perturbation effects of genetic variants, given population gene-expression and genotype data. However, since there exist millions of genetic variants in genomes that simultaneously perturb a gene network, it is not obvious how to decode the effects of such multifactorial perturbations from data. Our statistical approach overcomes this computational challenge and recovers gene networks under SNP perturbations using probabilistic graphical models. As population gene-expression and genotype datasets are routinely collected to study genetic architectures of complex diseases and phenotypes, our approach can directly leverage these existing datasets to provide a more effective way of identifying gene networks.
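The abstract names a sparse conditional Gaussian graphical model without giving its form. The block below writes out one common parameterization of a conditional Gaussian graphical model as a sketch; the symbols (x for the SNP genotype vector, y for the expression vector, Θ_yy for the gene network, Θ_xy for direct SNP-to-gene effects, B for the propagated effects, λ for the penalty weights) are assumptions introduced here, not notation taken from the article.

```latex
% Conditional density of expression y given genotypes x (assumed parameterization)
p(\mathbf{y} \mid \mathbf{x}; \Theta)
  = \frac{1}{Z(\mathbf{x})}
    \exp\!\Big( -\tfrac{1}{2}\, \mathbf{y}^{\top} \Theta_{yy}\, \mathbf{y}
                - \mathbf{x}^{\top} \Theta_{xy}\, \mathbf{y} \Big),
\qquad \Theta_{yy} \succ 0 .

% Completing the square gives a Gaussian conditional distribution
\mathbf{y} \mid \mathbf{x} \sim
  \mathcal{N}\!\big( -\Theta_{yy}^{-1} \Theta_{xy}^{\top} \mathbf{x},\; \Theta_{yy}^{-1} \big).

% Direct effects: nonzero entries of \Theta_{xy}.
% Effects after propagation through the network (direct plus indirect):
B = -\,\Theta_{xy}\, \Theta_{yy}^{-1},
\qquad \mathbb{E}[\mathbf{y} \mid \mathbf{x}]^{\top} = \mathbf{x}^{\top} B .

% Sparsity is typically encouraged with \ell_1 penalties on both blocks
\min_{\Theta}\; -\sum_{i=1}^{N} \log p(\mathbf{y}_i \mid \mathbf{x}_i; \Theta)
  + \lambda_{xy} \lVert \Theta_{xy} \rVert_1
  + \lambda_{yy} \lVert \Theta_{yy} \rVert_1 .
```

Under this parameterization a SNP with a single nonzero entry in Θ_xy can still have many nonzero entries in the corresponding row of B, which is one way to read the distinction the abstract draws between direct perturbations and their indirect propagation.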
doi:10.1371/journal.pcbi.1003420
PMCID: PMC3937098  PMID: 24586125
21.  Metabolic modeling of endosymbiont genome reduction on a temporal scale 
This study explores the order in which individual metabolic genes are lost in an in silico evolutionary process leading from the metabolic network of Escherichia coli to that of the genome-reduced endosymbiont Buchnera aphidicola.
Simulating the reductive evolutionary process under several growth conditions, a remarkable correlation between in silico and phylogenetically reconstructed gene loss time is obtained.
A gene's k-robustness (its depth of backups) is a prime determinant of its loss time.
In silico gene loss times are a better predictor of actual loss times than genomic features and network properties.
Simulating the reductive evolutionary process by the loss of large blocks followed by single-gene deletions, as known to occur in evolution, yields a remarkable correspondence with the phylogenetic reconstruction and the block loss reported in the literature.
An open fundamental challenge in Systems Biology is whether a genome-scale model can predict patterns of genome evolution by realistically accounting for the associated biochemical constraints. In this study, we explore the order in which individual genes are lost in an in silico evolutionary process, leading from the metabolic network of Escherichia coli to that of the endosymbiont Buchnera aphidicola.
To evaluate the in silico gene loss time, we repeated the reductive evolutionary process introduced by Pál et al (2006), defining the in silico deletion time of a gene in a single run of the reductive evolutionary process as the number of genes deleted before its own deletion occurred. By comparing the in silico evaluations of the gene loss time to those obtained by a phylogenetic reconstruction (Figure 1), we could evaluate the ability of an in silico process to predict temporal patterns of genome reduction. Applying this procedure to a literature-based viable medium, we obtained a mean Spearman's correlation of 0.46 (53% of the maximal correlation, empirical P-value <9.9e−4) between in silico and phylogenetically reconstructed loss times. In order to provide an upper bound on evolutionary necessity stemming from metabolic constraints, we searched the space of potential growth media and biomass functions via a simulated annealing search algorithm aimed at identifying an environment/biomass function that maximizes the target correlation between in silico and reconstructed loss times. Simulating the reductive evolutionary process under the growth conditions and biomass function obtained in this process, we managed to improve the correlation between in silico and reconstructed loss times to a mean Spearman's correlation of 0.54 (63% of the maximal correlation, empirical P-value <9.9e−4, Figure 3).
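The deletion-time definition and the rank correlation described above can be sketched as follows. This is a toy illustration, not the authors' pipeline: the viability test is a hypothetical stand-in for a flux-balance growth check, the gene and function names are invented, and assigning retained genes the latest possible time is an assumption made here.

```python
import random

# Hypothetical toy network: each function needs at least one of its genes.
FUNCTIONS = {"f1": {"a", "b", "c"}, "f2": {"d"}, "f3": {"e", "f"}}
GENES = sorted(set().union(*FUNCTIONS.values()))


def is_viable(remaining):
    """Stand-in for a growth simulation: every function keeps a gene."""
    return all(genes & remaining for genes in FUNCTIONS.values())


def reductive_run(genes, viable, rng):
    """One run: visit genes in random order, delete each if still viable.

    A gene's loss time is the number of genes deleted before it.
    A single pass suffices here because the toy viability test is monotone.
    """
    remaining = set(genes)
    loss_time = {}
    order = list(genes)
    rng.shuffle(order)
    for g in order:
        if viable(remaining - {g}):
            loss_time[g] = len(loss_time)
            remaining.remove(g)
    n_deleted = len(loss_time)
    for g in remaining:  # never lost: assign the latest time (assumption)
        loss_time[g] = n_deleted
    return loss_time


def mean_loss_times(genes, viable, runs=200, seed=0):
    """Average loss time of each gene over repeated random runs."""
    rng = random.Random(seed)
    totals = {g: 0 for g in genes}
    for _ in range(runs):
        for g, t in reductive_run(genes, viable, rng).items():
            totals[g] += t
    return {g: totals[g] / runs for g in genes}


def ranks(values):
    """Ranks starting at 1, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Given a list of phylogenetically reconstructed loss times in the same gene order, `spearman([mean[g] for g in GENES], reconstructed)` would give the kind of correlation reported in the paragraph above.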
Examining the dependency of the predicted loss time of each gene on its intrinsic network-level properties we find a very strong inverse Spearman's correlation of −0.84 (empirical P-value <9.9e−4) between the order of gene loss predicted in silico and the k-robustness levels of the genes, the latter denoting the depth of their functional backups in the network (Deutscher et al, 2006). Moreover, in order to examine whether the relative loss time of a gene is influenced by its functional dependencies with other genes, we performed a flux-coupling analysis and identified pairs of reactions whose activities asymmetrically depend on each other, i.e., are directionally coupled (Burgard et al, 2004). We find that genes encoding reactions whose activity is needed for activating the other reaction (and not vice versa) have a tendency to be lost later, as one would expect (binomial P-value <1e−14).
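To make "depth of functional backups" concrete, here is a brute-force sketch on a hypothetical toy network. The definition used, the smallest number of other genes that must be removed (without killing the cell) before removing the gene of interest becomes lethal, is a simplified reading assumed for illustration and may differ in detail from the definition in Deutscher et al (2006).

```python
from itertools import combinations

# Hypothetical toy network: each function needs at least one of its genes.
FUNCTIONS = {"f1": {"a", "b", "c"}, "f2": {"d"}, "f3": {"e", "f"}}
GENES = sorted(set().union(*FUNCTIONS.values()))


def is_viable(remaining):
    """Stand-in for a growth simulation: every function keeps a gene."""
    return all(genes & remaining for genes in FUNCTIONS.values())


def k_robustness(gene, genes=GENES, viable=is_viable):
    """Smallest k such that some viable k-gene knockout background
    makes the additional loss of `gene` lethal; None if no such k exists.

    Essential genes get k = 0; genes with deeper backups get larger k.
    Exhaustive search, so only suitable for very small networks.
    """
    all_genes = set(genes)
    others = [g for g in genes if g != gene]
    for k in range(len(others) + 1):
        for combo in combinations(others, k):
            background = all_genes - set(combo)
            if viable(background) and not viable(background - {gene}):
                return k
    return None


if __name__ == "__main__":
    for g in GENES:
        print(g, k_robustness(g))
```

In this toy setting the essential gene has no backups and can never be removed, while genes with deeper backups can be removed in more contexts, which is the direction of the inverse correlation reported above.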
To assess the scale of these results, we examined as a control how well genomic features and network properties predict the phylogenetically reconstructed gene loss times. We examined the dependency of the latter on several factors that are known to be inversely correlated with the propensity of a gene to be lost (Brinza et al, 2009; Delmotte et al, 2006; Tamames et al, 2007), including the genes' mRNA levels, tAI values (Covert et al, 2004; Reis et al, 2004; Sharp and Li, 1987; Tuller et al, 2010a) and the number of partners the gene products have in a protein–protein interaction network. Remarkably, these genomic features yield considerably lower Spearman's correlations than that obtained by the in silico simulations. Moreover, multiply regressing the loss times from the phylogenetic reconstruction on the in silico gene loss time predictions and the genomic and network variables, we found that the (normalized) coefficient of the in silico predictions in the regression is much higher than those of the genomic features, further testifying to the considerable independent predictive power of the metabolic model.
Finally, simulating the evolutionary process as large block deletions followed by single-gene deletions, as is thought to occur in evolution (Moran and Mira, 2001; van Ham et al, 2003), a remarkable correspondence with the phylogenetic reconstruction was found. Namely, we find that after a certain number of genes are deleted from the genome, no further block deletions can occur due to the increasing density of essential genes. Notably, the maximum number of genes that can be deleted in blocks (i.e., until no more blocks can be deleted) corresponds to the number of genes appearing in our phylogenetic reconstruction from the LCA (last common ancestor of Buchnera and E. coli) to the LCSA (last common symbiotic ancestor, nodes 1–3 in Figure 1A), as described in the literature.
A fundamental challenge in Systems Biology is whether a cell-scale metabolic model can predict patterns of genome evolution by realistically accounting for associated biochemical constraints. Here, we study the order in which genes are lost in an in silico evolutionary process, leading from the metabolic network of Escherichia coli to that of the endosymbiont Buchnera aphidicola. We examine how this order correlates with the order in which the genes were actually lost, as estimated from a phylogenetic reconstruction. By optimizing this correlation across the space of potential growth and biomass conditions, we compute an upper bound estimate on the model's prediction accuracy (R=0.54). The model's network-based predictive ability outperforms predictions obtained using genomic features of individual genes, reflecting the effect of selection imposed by metabolic stoichiometric constraints. Thus, while the timing of gene loss might be expected to be a completely stochastic evolutionary process, remarkably, we find that metabolic considerations, on their own, make a marked 40% contribution to determining when such losses occur.
doi:10.1038/msb.2011.11
PMCID: PMC3094061  PMID: 21451589
constraint-based modeling; endosymbiont; evolution; metabolism
22.  Boolean Network Model Predicts Knockout Mutant Phenotypes of Fission Yeast 
PLoS ONE  2013;8(9):e71786.
Boolean networks (or: networks of switches) are extremely simple mathematical models of biochemical signaling networks. Under certain circumstances, Boolean networks, despite their simplicity, are capable of predicting dynamical activation patterns of gene regulatory networks in living cells. For example, the temporal sequence of cell cycle activation patterns in the yeasts S. pombe and S. cerevisiae is faithfully reproduced by Boolean network models. An interesting question is whether this simple model class could also predict a more complex cellular phenomenology, for example the cell cycle dynamics of various knockout mutants rather than the wild-type dynamics only. Here we show that a Boolean network model for the cell cycle control network of the yeast S. pombe correctly predicts the viability of a large number of known mutants. So far this had been left to the more detailed differential equation models of the biochemical kinetics of the yeast cell cycle network and was commonly thought to be out of reach for models as simplistic as Boolean networks. The new results support our vision that Boolean networks may complement other mathematical models in systems biology to a larger extent than expected so far, and may fill a gap where simplicity of the model and a preference for an overall dynamical blueprint of cellular regulation, instead of biochemical details, are the focus.
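The idea of predicting a knockout phenotype from a Boolean model can be sketched on a toy network. Everything below is hypothetical: the three nodes, their update rules and the reading of "returns to the resting state" as viable are illustrative assumptions, not the S. pombe model from the article.

```python
def step(state, rules, knocked_out=frozenset()):
    """Synchronous update; knocked-out nodes are clamped to False."""
    new_state = {node: bool(rule(state)) for node, rule in rules.items()}
    for node in knocked_out:
        new_state[node] = False
    return new_state


def run_to_attractor(initial, rules, knocked_out=frozenset(), max_steps=50):
    """Iterate until a state repeats and return the attractor as a list of
    states (a fixed point is a list of length one)."""
    state = dict(initial)
    for node in knocked_out:
        state[node] = False
    seen = []
    for _ in range(max_steps):
        key = tuple(sorted(state.items()))
        if key in seen:
            return [dict(k) for k in seen[seen.index(key):]]
        seen.append(key)
        state = step(state, rules, knocked_out)
    raise RuntimeError("no attractor found within max_steps")


# Hypothetical toy cycle: a transient start signal switches on a cyclin,
# which sustains itself until the inhibitor it induces switches it off.
RULES = {
    "start": lambda s: False,
    "cyclin": lambda s: (s["start"] or s["cyclin"]) and not s["inhibitor"],
    "inhibitor": lambda s: s["cyclin"],
}
INITIAL = {"start": True, "cyclin": False, "inhibitor": False}

if __name__ == "__main__":
    print("wild type:", run_to_attractor(INITIAL, RULES))
    print("inhibitor knockout:",
          run_to_attractor(INITIAL, RULES, knocked_out={"inhibitor"}))
```

In this toy the wild type returns to the all-off resting state, while the inhibitor knockout gets stuck with the cyclin permanently on; comparing the attractor reached by each mutant with the wild-type one is the kind of test a viability prediction could be based on.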
doi:10.1371/journal.pone.0071786
PMCID: PMC3777975  PMID: 24069138
23.  In silico models of cancer 
Cancer is a complex disease that involves multiple types of biological interactions across diverse physical, temporal, and biological scales. This complexity presents substantial challenges for the characterization of cancer biology, and motivates the study of cancer in the context of molecular, cellular, and physiological systems. Computational models of cancer are being developed to aid both biological discovery and clinical medicine. The development of these in silico models is facilitated by rapidly advancing experimental and analytical tools that generate information-rich, high-throughput biological data. Statistical models of cancer at the genomic, transcriptomic, and pathway levels have proven effective in developing diagnostic and prognostic molecular signatures, as well as in identifying perturbed pathways. Statistically-inferred network models can prove useful in settings where data overfitting can be avoided, and provide an important means for biological discovery. Mechanistically-based signaling and metabolic models that apply a priori knowledge of biochemical processes derived from experiments can also be reconstructed where data are available, and can provide insight and predictive ability regarding the dynamical behavior of these systems. At longer length scales, continuum and agent-based models of the tumor microenvironment and other tissue-level interactions enable modeling of cancer cell populations and tumor progression. Even though cancer has been among the most-studied human diseases using systems approaches, significant challenges remain before the enormous potential of in silico cancer biology can be fully realized.
doi:10.1002/wsbm.75
PMCID: PMC3157287  PMID: 20836040
Systems medicine; Cancer; Personalized medicine; Systems biology; Computational biology
24.  METANNOGEN: compiling features of biochemical reactions needed for the reconstruction of metabolic networks 
Background
One central goal of computational systems biology is the mathematical modelling of complex metabolic reaction networks. The first and most time-consuming step in the development of such models is the stoichiometric reconstruction of the network, i.e. the compilation of all metabolites, reactions and transport processes relevant to the considered network and their assignment to the various cellular compartments. Therefore an information system is required to collect and manage data from different databases and scientific literature in order to generate a metabolic network of biochemical reactions that can be subjected to further computational analyses.
Results
The computer program METANNOGEN facilitates the reconstruction of metabolic networks. It uses KEGG, the well-known database of biochemical reactions, as its primary information source, from which biochemical reactions relevant to the considered network can be selected, edited and stored in a separate, user-defined database. Reactions not contained in KEGG can be entered manually into the system. To aid the decision whether or not a reaction selected from KEGG belongs to the considered network, METANNOGEN contains information from SWISSPROT and ENSEMBL and provides Web links to a number of important information sources like METACYC, BRENDA, NIST, and REACTOME. If a reaction is reported to occur in more than one cellular compartment, a corresponding number of reactions is generated, each referring to one specific compartment. Transport processes of metabolites are entered like chemical reactions where reactants and products have different compartment attributes. The list of compartmentalized biochemical reactions and membrane transport processes compiled by means of METANNOGEN can be exported as an SBML file for further computational analysis. METANNOGEN is highly customizable with respect to the content of the SBML output file, additional data fields, the graphical input form, highlighting of project-specific search terms and dynamically generated Web links.
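The two conventions described above (one reaction copy per compartment, and transport written as a reaction whose reactants and products carry different compartment attributes) can be illustrated with a small data-structure sketch. The class names, field names and identifiers are hypothetical and are not METANNOGEN's internal representation or its SBML export.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Species:
    """A metabolite tagged with the compartment it resides in."""
    name: str
    compartment: str


@dataclass(frozen=True)
class Reaction:
    rid: str
    reactants: tuple
    products: tuple

    def is_transport(self):
        """A transport process spans more than one compartment."""
        comps = {s.compartment for s in self.reactants + self.products}
        return len(comps) > 1


def compartmentalize(rid, reactants, products, compartments):
    """Generate one copy of a reaction per compartment it occurs in."""
    return [
        Reaction(
            rid=f"{rid}_{c}",
            reactants=tuple(Species(m, c) for m in reactants),
            products=tuple(Species(m, c) for m in products),
        )
        for c in compartments
    ]


if __name__ == "__main__":
    # Hypothetical reaction R1 reported in two compartments.
    copies = compartmentalize("R1", ["A"], ["B"], ["cytosol", "mitochondrion"])
    # Transport of A entered like a reaction between compartments.
    transport = Reaction(
        "T_A",
        (Species("A", "cytosol"),),
        (Species("A", "mitochondrion"),),
    )
    for r in copies + [transport]:
        print(r.rid, "transport" if r.is_transport() else "conversion")
```

Keeping the compartment on the species rather than on the reaction is what lets transport and conversion share one representation.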
Conclusion
METANNOGEN is a flexible tool to manage information for the design of metabolic networks. The program requires Java Runtime Environment 1.4 or higher and about 100 MB of free RAM and about 200 MB of free HD space. It does not require installation and can be directly Java-webstarted from .
doi:10.1186/1752-0509-1-5
PMCID: PMC1839895  PMID: 17408512
25.  Robust simplifications of multiscale biochemical networks 
BMC Systems Biology  2008;2:86.
Background
Cellular processes such as metabolism, decision making in development and differentiation, signalling, etc., can be modeled as large networks of biochemical reactions. In order to understand the functioning of these systems, there is a strong need for general model reduction techniques that simplify models without losing their main properties. In systems biology we also need to compare models or to couple them as parts of larger models. In these situations reduction to a common level of complexity is needed.
Results
We propose a systematic treatment of model reduction of multiscale biochemical networks. First, we consider linear kinetic models, which appear as "pseudo-monomolecular" subsystems of multiscale nonlinear reaction networks. For such linear models, we propose a reduction algorithm based on a generalized theory of the limiting step that we have developed in [1]. Second, for non-linear systems we develop an algorithm based on dominant solutions of quasi-stationarity equations. For oscillating systems, quasi-stationarity and averaging are combined to eliminate time scales much faster and much slower than the period of the oscillations. In all cases, we obtain robust simplifications and also identify the critical parameters of the model. The methods are demonstrated on simple examples and on a more complex model of the NF-κB pathway.
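As background for the limiting-step idea mentioned above, the classical case of a single monomolecular cycle can be worked out in a few lines. This is the textbook statement only, given as a sketch; the generalized theory cited as [1] is not reproduced here, and the symbols (c_i for concentrations, k_i for rate constants, w for the stationary flux, b for the conserved total) are introduced for illustration.

```latex
% Irreversible monomolecular cycle A_1 -> A_2 -> ... -> A_n -> A_1
\dot{c}_i = k_{i-1} c_{i-1} - k_i c_i \quad (k_0 \equiv k_n,\; c_0 \equiv c_n),
\qquad \sum_{i=1}^{n} c_i = b .

% At steady state every step carries the same flux w
k_i c_i = w \;\;\text{for all } i
\quad\Longrightarrow\quad
c_i = \frac{w}{k_i},
\qquad
w = \frac{b}{\sum_{j=1}^{n} 1/k_j} .

% If one constant is much smaller than all the others, k_{\min} \ll k_j,
% the sum is dominated by 1/k_{\min}:
w \approx k_{\min}\, b ,
\qquad
c_{\min} \approx b ,
\qquad
c_i \approx \frac{k_{\min}}{k_i}\, b \ll b \;\; (i \neq \min) .
```

With well-separated rate constants the stationary flux is therefore controlled by a single parameter, which is the sense in which such a reduction also identifies critical parameters.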
Conclusion
Our approach allows critical parameter identification and produces hierarchies of models. Hierarchical modeling is important in "middle-out" approaches when there is need to zoom in and out several levels of complexity. Critical parameter identification is an important issue in systems biology with potential applications to biological control and therapeutics. Our approach also deals naturally with the presence of multiple time scales, which is a general property of systems biology models.
doi:10.1186/1752-0509-2-86
PMCID: PMC2654786  PMID: 18854041
