Related Articles

1.  Integrating Bayesian variable selection with Modular Response Analysis to infer biochemical network topology 
BMC Systems Biology  2013;7:57.
Background
Recent advancements in genetics and proteomics have led to the acquisition of large quantitative data sets. However, the use of these data to reverse engineer biochemical networks has remained a challenging problem. Many methods have been proposed to infer biochemical network topologies from different types of biological data. Here, we focus on unraveling network topologies from steady state responses of biochemical networks to successive experimental perturbations.
Results
We propose a computational algorithm which combines a deterministic network inference method termed Modular Response Analysis (MRA) and a statistical model selection algorithm called Bayesian Variable Selection, to infer functional interactions in cellular signaling pathways and gene regulatory networks. It can be used to identify interactions among individual molecules involved in a biochemical pathway or reveal how different functional modules of a biological network interact with each other to exchange information. In cases where not all network components are known, our method reveals functional interactions which are not direct but correspond to the interaction routes through unknown elements. Using computer simulated perturbation responses of signaling pathways and gene regulatory networks from the DREAM challenge, we demonstrate that the proposed method is robust against noise and scalable to large networks. We also show that our method can infer network topologies using incomplete perturbation datasets. Consequently, we have used this algorithm to explore the ERBB regulated G1/S transition pathway in certain breast cancer cells to understand the molecular mechanisms which cause these cells to become drug resistant. The algorithm successfully inferred many well characterized interactions of this pathway by analyzing experimentally obtained perturbation data. Additionally, it identified some molecular interactions which promote drug resistance in breast cancer cells.
Conclusions
The proposed algorithm provides a robust, scalable and cost-effective solution for inferring network topologies from biological data. It can potentially be applied to explore novel pathways which play important roles in life-threatening diseases such as cancer.
doi:10.1186/1752-0509-7-57
PMCID: PMC3726398  PMID: 23829771
Network inference; Bayesian statistics; Modular Response Analysis; Signaling pathways.
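The deterministic MRA step mentioned in entry 1 can be illustrated with a minimal sketch. It assumes the standard MRA setting: a square, invertible global response matrix R whose column j holds the steady-state responses of all modules to a perturbation that directly affects only module j. The network, the perturbation strengths and the function name `local_responses` are hypothetical, and the Bayesian variable selection step that the abstract combines with MRA is not reproduced here.

```python
import numpy as np

def local_responses(R):
    """Local response matrix r from a global response matrix R (MRA).

    Assumes R is square and invertible, and that perturbation j directly
    affects only module j. The diagonal of r is -1 by construction.
    """
    R_inv = np.linalg.inv(R)
    # r = -[diag(R^-1)]^-1 R^-1: divide each row of R^-1 by its diagonal entry.
    return -R_inv / np.diag(R_inv)[:, None]

# Hypothetical 3-module network: r_true[i, j] is the direct effect of j on i.
r_true = np.array([[-1.0, 0.0, -0.5],
                   [0.8, -1.0, 0.0],
                   [0.0, 0.6, -1.0]])
P = np.diag([0.3, 0.2, 0.4])      # assumed direct perturbation strengths
R = -np.linalg.inv(r_true) @ P    # simulated noise-free global responses
r_est = local_responses(R)
print(np.round(r_est, 3))
```

With noise-free data the direct interaction strengths are recovered exactly; with noisy or incomplete perturbation data a statistical layer such as the one the abstract describes would be needed to decide which entries are genuinely non-zero.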
2.  Functional Dissection of Regulatory Models Using Gene Expression Data of Deletion Mutants 
PLoS Genetics  2013;9(9):e1003757.
Genome-wide gene expression profiles accumulate at an alarming rate, and integrating the expression profiles generated by different laboratories to reverse engineer the cellular regulatory network has been a major challenge. To automatically infer gene regulatory pathways from genome-wide mRNA expression profiles before and after genetic perturbations, we introduced a new Bayesian network algorithm: Deletion Mutant Bayesian Network (DM_BN). We applied DM_BN to the expression profiles of 544 yeast single or double deletion mutants of transcription factors, chromatin remodeling machinery components, protein kinases and phosphatases in S. cerevisiae. The network inferred by this method identified causal regulatory and non-causal concurrent interactions among these regulators (genetically perturbed genes) that are strongly supported by the experimental evidence, and generated many new testable hypotheses. Compared to networks reconstructed by routine similarity measures or by alternative Bayesian network algorithms, the network inferred by DM_BN excels in both precision and recall. To facilitate its application in other systems, we packaged the algorithm into a user-friendly analysis tool that can be downloaded at http://www.picb.ac.cn/hanlab/DM_BN.html.
Author Summary
The complex functions of a living cell are carried out through hierarchically organized regulatory pathways composed of complex interactions between regulators themselves and between regulators and their targets. Here we developed a Bayesian network inference algorithm, Deletion Mutant Bayesian Network (DM_BN), to reverse engineer the yeast regulatory network based on the hypothesis that components of the same protein complexes or the same regulatory pathways share common target genes. We used this approach to analyze expression profiles of 544 single or double deletion mutants of transcription factors, chromatin remodeling machinery components, protein kinases and phosphatases in S. cerevisiae. The Bayesian network inferred by this method identified causal regulatory relationships and non-causal concurrent interactions among these regulators in different cellular processes that are strongly supported by the experimental evidence, and generated many testable hypotheses. Compared to networks reconstructed by routine similarity measures or by alternative Bayesian network algorithms, the network inferred by DM_BN excels in both precision and recall. To facilitate its application in other systems, we packaged the algorithm into a user-friendly analysis tool that can be downloaded at http://www.picb.ac.cn/hanlab/DM_BN.html.
doi:10.1371/journal.pgen.1003757
PMCID: PMC3764135  PMID: 24039601
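Entry 2 compares inferred networks by precision and recall. As a rough illustration of how such a comparison is usually computed on directed edge sets, here is a small sketch. The edge lists and regulator names are invented for the example, and this is not the evaluation code of DM_BN.

```python
def precision_recall(predicted_edges, reference_edges):
    """Precision and recall of predicted directed edges against a reference set."""
    predicted = set(predicted_edges)
    reference = set(reference_edges)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical regulators A..E; each tuple is (regulator, target).
predicted = [("A", "B"), ("B", "C"), ("A", "D")]
reference = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
precision, recall = precision_recall(predicted, reference)
print(precision, recall)
```

Here two of the three predicted edges are in the reference (precision 2/3) and two of the four reference edges are recovered (recall 1/2).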
3.  Network Modeling Reveals Prevalent Negative Regulatory Relationships between Signaling Sectors in Arabidopsis Immune Signaling 
PLoS Pathogens  2010;6(7):e1001011.
Biological signaling processes may be mediated by complex networks in which network components and network sectors interact with each other in complex ways. Studies of complex networks benefit from approaches in which the roles of individual components are considered in the context of the network. The plant immune signaling network, which controls inducible responses to pathogen attack, is such a complex network. We studied the Arabidopsis immune signaling network upon challenge with a strain of the bacterial pathogen Pseudomonas syringae expressing the effector protein AvrRpt2 (Pto DC3000 AvrRpt2). This bacterial strain feeds multiple inputs into the signaling network, allowing many parts of the network to be activated at once. mRNA profiles for 571 immune response genes of 22 Arabidopsis immunity mutants and wild type were collected 6 hours after inoculation with Pto DC3000 AvrRpt2. The mRNA profiles were analyzed as detailed descriptions of changes in the network state resulting from the genetic perturbations. Regulatory relationships among the genes corresponding to the mutations were inferred by recursively applying a non-linear dimensionality reduction procedure to the mRNA profile data. The resulting static network model accurately predicted 23 of 25 regulatory relationships reported in the literature, suggesting that predictions of novel regulatory relationships are also accurate. The network model revealed two striking features: (i) the components of the network are highly interconnected; and (ii) negative regulatory relationships are common between signaling sectors. Complex regulatory relationships, including a novel negative regulatory relationship between the early microbe-associated molecular pattern-triggered signaling sectors and the salicylic acid sector, were further validated. 
We propose that prevalent negative regulatory relationships among the signaling sectors make the plant immune signaling network a “sector-switching” network, which effectively balances two apparently conflicting demands, robustness against pathogenic perturbations and moderation of negative impacts of immune responses on plant fitness.
Author Summary
When a plant detects pathogen attack, this information is conveyed through a molecular signaling network to turn on a large variety of immune responses. We investigated how this plant immune signaling network was organized using the model plant Arabidopsis. Wild type and mutant plants with defects in immune signaling were challenged with a pathogen. Then, expression levels of many genes were measured using microarrays. Detailed analysis of the mutation effects on gene expression allowed us to build a signaling network model composed of the genes corresponding to the mutations. This model predicted that the network components are highly interconnected and that it is very common for network components that mediate different signaling events to inhibit each other. The prevalent signaling inhibitions in the network suggest that only part of the signaling network is usually used but that if this part is attacked by pathogens, other parts kick in and back up the function of the attacked part. We speculate that plant immune signaling is highly tolerant to pathogen attack due to this backup mechanism. We also speculate that using only part of the network at any one time helps minimize negative impacts of the immune response on plant fitness.
doi:10.1371/journal.ppat.1001011
PMCID: PMC2908620  PMID: 20661428
4.  TimeDelay-ARACNE: Reverse engineering of gene networks from time-course data by an information theoretic approach 
BMC Bioinformatics  2010;11:154.
Background
One of the main aims of molecular biology is to learn how molecular components interact with each other and to understand how gene function is regulated. Using microarray technology, it is possible to measure thousands of genes in a single analysis step, obtaining a picture of the cell's gene expression. Several methods have been developed to infer gene networks from steady-state data, but much less has been published on time-course data, so the development of algorithms to infer gene networks from time-series measurements is a current challenge in bioinformatics research. In order to detect dependencies between genes at different time delays, we propose an approach to infer gene regulatory networks from time-series measurements starting from a well-known algorithm based on information theory.
Results
In this paper we show how the ARACNE (Algorithm for the Reconstruction of Accurate Cellular Networks) algorithm can be used for gene regulatory network inference in the case of time-course expression profiles. The resulting method is called TimeDelay-ARACNE. It aims to extract dependencies between two genes at different time delays, providing a measure of these dependencies in terms of mutual information. The basic idea of the proposed algorithm is to detect time-delayed dependencies between the expression profiles by assuming a stationary Markov Random Field as the underlying probabilistic model. Less informative dependencies are filtered out using an automatically calculated threshold, retaining the most reliable connections. TimeDelay-ARACNE can infer small local networks of time-regulated gene-gene interactions, detecting their direction and discovering cyclic interactions even when only a small to medium number of measurements is available. We test the algorithm both on synthetic networks and on microarray expression profiles. The microarray measurements concern the S. cerevisiae cell cycle, E. coli SOS pathways and a recently developed network for in vivo assessment of reverse engineering algorithms. Our results are compared with ARACNE itself and with those of two previously published algorithms, Dynamic Bayesian Networks and systems of ODEs, showing that TimeDelay-ARACNE has good accuracy, recall and F-score for the network reconstruction task.
Conclusions
Here we report the adaptation of the ARACNE algorithm to infer gene regulatory networks from time-course data, so that the resulting network is represented as a directed graph. The proposed algorithm is expected to be useful in the reconstruction of small directed biological networks from time-course data.
doi:10.1186/1471-2105-11-154
PMCID: PMC2862045  PMID: 20338053
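The core idea of entry 4, scoring the dependency between two expression profiles at several time delays with mutual information, can be sketched as follows. This is only an illustration under stated assumptions: it uses a plain histogram estimator as a stand-in for whatever estimator TimeDelay-ARACNE uses, it omits the thresholding and pruning steps, and the synthetic profiles and the function names `mutual_information` and `delayed_mi` are hypothetical.

```python
import numpy as np

def mutual_information(x, y, bins=4):
    """Histogram-based mutual information (in nats) between two samples."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nonzero = pxy > 0
    return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])))

def delayed_mi(x, y, max_delay=3, bins=4):
    """Mutual information between x[t] and y[t + d] for d = 0..max_delay."""
    n = len(x)
    return {d: mutual_information(x[:n - d], y[d:], bins)
            for d in range(max_delay + 1)}

# Synthetic example: y follows x with a delay of two time points plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = np.roll(x, 2) + 0.1 * rng.normal(size=200)
scores = delayed_mi(x, y)
best_delay = max(scores, key=scores.get)
print(scores, best_delay)
```

Because the score is computed for x preceding y, a high value at a positive delay suggests a directed edge from x to y, which is how a time-delayed measure can yield a directed graph.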
5.  IRIS: a method for reverse engineering of regulatory relations in gene networks 
BMC Bioinformatics  2009;10:444.
Background
The ultimate aim of systems biology is to understand and describe how molecular components interact to manifest collective behaviour that is the sum of the single parts. Building a network of molecular interactions is the basic step in modelling a complex entity such as the cell. Even if gene-gene interactions only partially describe real networks because of post-transcriptional modifications and protein regulation, using microarray technology it is possible to combine measurements for thousands of genes into a single analysis step that provides a picture of the cell's gene expression. Several databases provide information about known molecular interactions and various methods have been developed to infer gene networks from expression data. However, network topology alone is not enough to perform simulations and predictions of how a molecular system will respond to perturbations. Rules for interactions among the single parts are needed for a complete definition of the network behaviour. Another interesting question is how to integrate information carried by the network topology, which can be derived from the literature, with large-scale experimental data.
Results
Here we propose an algorithm, called inference of regulatory interaction schema (IRIS), that uses an iterative approach to map gene expression profile values (both steady-state and time-course) into discrete states and a simple probabilistic method to infer the regulatory functions of the network. These interaction rules are integrated into a factor graph model. We test IRIS on two synthetic networks to determine its accuracy and compare it to other methods. We also apply IRIS to gene expression microarray data for the Saccharomyces cerevisiae cell cycle and for human B-cells and compare the results to literature findings.
Conclusions
IRIS is a rapid and efficient tool for the inference of regulatory relations in gene networks. A topological description of the network and a matrix of gene expression profiles are required as input to the algorithm. IRIS maps gene expression data onto discrete values and then computes regulatory functions as conditional probability tables. The suitability of the method is demonstrated for synthetic data and microarray data. The resulting network can also be embedded in a factor graph model.
doi:10.1186/1471-2105-10-444
PMCID: PMC2813854  PMID: 20030818
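Entry 5 describes mapping expression values onto discrete states and expressing regulatory functions as conditional probability tables. A minimal sketch of that general idea is given below. It assumes a single regulator, a fixed threshold and binary states, whereas IRIS is described as using an iterative discretization; the expression values and function names are hypothetical.

```python
from collections import Counter, defaultdict

def discretize(values, threshold):
    """Map continuous expression values to binary states (1 = high, 0 = low)."""
    return [1 if v > threshold else 0 for v in values]

def conditional_table(regulator_states, target_states):
    """Estimate P(target state | regulator state) from paired observations."""
    counts = defaultdict(Counter)
    for reg, tgt in zip(regulator_states, target_states):
        counts[reg][tgt] += 1
    return {reg: {state: n / sum(tgts.values()) for state, n in tgts.items()}
            for reg, tgts in counts.items()}

# Hypothetical expression profiles across six samples.
regulator = discretize([0.1, 0.9, 0.8, 0.2, 0.7, 0.3], threshold=0.5)
target = discretize([0.2, 0.8, 0.9, 0.1, 0.3, 0.2], threshold=0.5)
cpt = conditional_table(regulator, target)
print(cpt)
```

In this toy data the target is always low when the regulator is low and high in two of three samples when the regulator is high, which is the signature of an activating relation.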
6.  Modularity and hormone sensitivity of the Drosophila melanogaster insulin receptor/target of rapamycin interaction proteome 
First systematic analysis of the evolutionary conserved InR/TOR pathway interaction proteome in Drosophila. Quantitative mass spectrometry revealed that 22% of identified protein interactions are regulated by the growth hormone insulin affecting membrane proximal as well as intracellular signaling complexes. Systematic RNA interference linked a significant fraction of network components to the control of dTOR kinase activity. Combined biochemical and genetic data suggest dTTT, a dTOR-containing complex required for cell growth control by dTORC1 and dTORC2 in vivo.
Cellular growth is a fundamental process that requires constant adaptations to changing environmental conditions, like growth factor and nutrient availability, energy levels and more. Over the years, the insulin receptor/target of rapamycin pathway (InR/TOR) emerged as a key signaling system for the control of metazoan cell growth. Genetic screens carried out in the fruit fly Drosophila melanogaster identified key InR/TOR pathway components and their relationships. Phenotypes such as altered cell growth are likely to emerge from perturbed dynamic networks containing InR/TOR pathway components, which stably or transiently interact with other cellular proteins to form complexes and networks thereof. Systematic studies on the topology and dynamics of protein interaction networks are therefore highly relevant for gaining a systems-level understanding of deregulated cell growth. Despite much progress in genetic analysis, only a few systematic protein interaction studies have been reported for Drosophila, and most of them lack quantitative information representing the dynamic nature of such networks. Here, we present the first quantitative affinity purification mass spectrometry (AP–MS/MS) analysis on the evolutionary conserved InR/TOR signaling network in Drosophila. Systematic RNAi-based functional analysis of identified network components revealed key components linked to the regulation of the central effector kinase dTOR. These also include dTTT, a novel dTOR-containing complex required for the control of dTORC1 and dTORC2 in vivo.
For systematic AP–MS analysis, we generated Drosophila Kc167 cell lines inducibly expressing affinity-tagged bait proteins previously linked to InR/TOR signaling. Bait expressing Kc167 cell lines were harvested before and after insulin stimulation for subsequent affinity purification. Following LC–MS/MS analysis and probabilistic data filtering using SAINT (Choi et al, 2010), we generated a quantitative network model from 97 high confidence protein–protein interactions and 58 network components (Figure 2). The presented network displayed a high degree of orthologous interactions conserved also in human cells and identified a number of novel molecular interactions with InR/TOR signaling components for future hypothesis driven analysis.
To measure insulin-induced changes within the InR/TOR interaction proteome, we applied a recently introduced label-free quantitative MS approach (Rinner et al, 2007). The obtained quantitative data suggest that 22% of all interactions in the network are regulated by insulin. Major changes could be observed within the membrane proximal InR/chico/PI3K signaling complexes, and also in 14-3-3 protein containing signaling complexes and dTORC1, a complex that contains, besides dTOR, all major orthologous proteins also found in human mTORC1, including the two dTORC1 substrates d4E-BP (Thor) and S6 Kinase (S6K). Insulin triggered both dissociation and association of dTORC1 proteins. Among the proteins that showed enhanced binding to dTORC1 upon insulin stimulation we found Unkempt, a RING-finger protein with a proposed role in ubiquitin-mediated protein degradation (Lores et al, 2010). Besides dTORC1, our systematic AP–MS analysis also revealed the presence of dTORC2, the second major TOR complex in Drosophila. dTORC2 contains the Drosophila orthologues of human mTORC2 proteins, but in contrast to dTORC1 was not affected upon insulin stimulation. Interestingly, we also found a specific set of proteins that were not linked to the canonical TOR complexes TORC1 and TORC2 in dTOR purifications. These include LqfR (liquid facets related), Pontin, Reptin, Spaghetti and the gene product of CG16908. We found the same set of proteins when we used CG16908 as a bait, suggesting complex formation among the identified proteins. None of the dTORC1/2 components besides dTOR was identified in CG16908 purifications, indicating that these proteins form dTOR complexes distinct from dTORC1 and dTORC2. Based on known interaction information from other species and data obtained from this study we refer to this complex as dTTT (Drosophila TOR, TELO2, TTI1) (Horejsi et al, 2010; Hurov et al, 2010; Kaizuka et al, 2010).
A directed quantitative MS analysis of dTOR complex components suggests that dTORC1 is the most abundant dTOR complex we identified in Kc167 cells.
We next studied the potential roles of the identified network components for controlling the activity of the dInR/TOR pathway using systematic RNAi depletion and quantitative western blotting to measure the changes in abundance of phosphorylated substrates of dTORC1 (Thor/d4E-BP, dS6K) and dTORC2 (dPKB) in RNAi-treated cells (Figure 5). Overall, we could identify 16 proteins (out of 58) whose depletion caused at least a 50% increase or decrease in the levels of phosphorylated d4E-BP, S6K and/or PKB compared with control GFP RNAi. Besides established pathway components, we found several novel regulators within the dInR/TOR interaction network. For example, RNAi against the novel insulin-regulated dTORC1 component Unkempt resulted in enhanced phosphorylation of the dTORC1 substrate d4E-BP, which suggests a negative role for Unkempt on dTORC1 activity. In contrast, depletion of CG16908 and LqfR caused hypo-phosphorylation of all dTOR substrates similar to dTOR itself, suggesting a positive role for the dTTT complex on dTOR activity. Subsequently, we tested whether dTTT components also play a role in dTOR-mediated cell growth in vivo. Depletion of both dTTT components, CG16908 and LqfR, in the Drosophila eye resulted in a substantial decrease in eye size. Likewise, FLP-FRT-mediated mitotic recombination resulted in CG16908 and LqfR mutant clones with a similar reduced growth phenotype as observed in dTOR mutant clones. Hence, the combined biochemical and genetic analysis revealed dTTT as a dTOR-containing complex that is required for the activity of both dTORC1 and dTORC2 and thus plays a critical role in controlling cell growth.
Taken together, these results illustrate how a systematic quantitative AP–MS approach when combined with systematic functional analysis in Drosophila can reveal novel insights into the dynamic organization of regulatory networks for cell growth control in metazoans.
Using quantitative mass spectrometry, this study reports how insulin affects the modularity of the interaction proteome of the Drosophila InR/TOR pathway, an evolutionary conserved signaling system for the control of metazoan cell growth. Systematic functional analysis linked a significant number of identified network components to the control of dTOR activity and revealed dTTT, a dTOR complex required for in vivo cell growth control by dTORC1 and dTORC2.
Genetic analysis in Drosophila melanogaster has been widely used to identify a system of genes that control cell growth in response to insulin and nutrients. Many of these genes encode components of the insulin receptor/target of rapamycin (InR/TOR) pathway. However, the biochemical context of this regulatory system is still poorly characterized in Drosophila. Here, we present the first quantitative study that systematically characterizes the modularity and hormone sensitivity of the interaction proteome underlying growth control by the dInR/TOR pathway. Applying quantitative affinity purification and mass spectrometry, we identified 97 high confidence protein interactions among 58 network components. In all, 22% of the detected interactions were regulated by insulin affecting membrane proximal as well as intracellular signaling complexes. Systematic functional analysis linked a subset of network components to the control of dTORC1 and dTORC2 activity. Furthermore, our data suggest the presence of three distinct dTOR kinase complexes, including the evolutionary conserved dTTT complex (Drosophila TOR, TELO2, TTI1). Subsequent genetic studies in flies suggest a role for dTTT in controlling cell growth via a dTORC1- and dTORC2-dependent mechanism.
doi:10.1038/msb.2011.79
PMCID: PMC3261712  PMID: 22068330
cell growth; InR/TOR pathway; interaction proteome; quantitative mass spectrometry; signaling
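The hit criterion described in entry 6, a change of at least 50% in phosphorylated substrate levels relative to control RNAi, amounts to a simple threshold on signal ratios. The sketch below illustrates that rule only; the component names and ratio values are made up and do not come from the study.

```python
def flag_regulators(ratios, threshold=0.5):
    """Flag knockdowns whose phospho-signal changes by at least `threshold`.

    `ratios` maps a component name to (signal after knockdown) / (control signal).
    """
    hits = {}
    for name, ratio in ratios.items():
        if ratio >= 1 + threshold:
            hits[name] = "increase"
        elif ratio <= 1 - threshold:
            hits[name] = "decrease"
    return hits

# Hypothetical ratios for three network components (not measured values).
ratios = {"component_A": 1.8, "component_B": 0.4, "component_C": 1.1}
hits = flag_regulators(ratios)
print(hits)
```

An increase after knockdown points to a negative regulator and a decrease to a positive one, which mirrors the reasoning applied to Unkempt and to the dTTT components in the text.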
7.  The auxin signalling network translates dynamic input into robust patterning at the shoot apex 
We provide a comprehensive expression map of the different genes (TIR1/AFBs, ARFs and Aux/IAAs) involved in the signalling pathway regulating gene transcription in response to auxin in the shoot apical meristem (SAM). We demonstrate a relatively simple structure of this pathway using a high-throughput yeast two-hybrid approach to obtain the Aux/IAA-ARF full interactome. The topology of the signalling network was used to construct a model for auxin signalling and to predict a role for the spatial regulation of auxin signalling in patterning of the SAM. We used a new sensor to monitor the input in the auxin signalling pathway and to confirm the model prediction, thus demonstrating that auxin signalling is essential to create robust patterns at the SAM.
The plant hormone auxin is a key morphogenetic signal involved in the control of cell identity throughout development. A striking example of auxin action is at the shoot apical meristem (SAM), a population of stem cells generating the aerial parts of the plant. Organ positioning and patterning depends on local accumulations of auxin in the SAM, generated by polar transport of auxin (Vernoux et al, 2010). However, it is still unclear how auxin is distributed at cell resolution in tissues and how the hormone is sensed in space and time during development. A complex ensemble of 29 Aux/IAAs and 23 ARFs is central to the regulation of gene transcription in response to auxin (for review, see Leyser, 2006; Guilfoyle and Hagen, 2007; Chapman and Estelle, 2009). Protein–protein interactions govern the properties of this transduction pathway (Del Bianco and Kepinski, 2011). Limited interaction studies suggest that, in the absence of auxin, the Aux/IAA repressors form heterodimers with the ARF transcription factors, preventing them from regulating target genes. In the presence of auxin, the Aux/IAA proteins are targeted to the proteasome by an SCF E3 ubiquitin ligase complex (Chapman and Estelle, 2009; Leyser, 2006). In this process, auxin promotes the interaction between Aux/IAA proteins and the TIR1 F-box of the SCF complex (or its AFB homologues) that acts as an auxin co-receptor (Dharmasiri et al, 2005a, 2005b; Kepinski and Leyser, 2005; Tan et al, 2007). The auxin-induced degradation of Aux/IAAs would then release ARFs to regulate transcription of their target genes. This includes activation of most of the Aux/IAA genes themselves, thus establishing a negative feedback loop (Guilfoyle and Hagen, 2007). Although this general scenario provides a framework for understanding gene regulation by auxin, the underlying protein–protein network remains to be fully characterized.
In this paper, we combined experimental and theoretical analyses to understand how this pathway contributes to sensing auxin in space and time (Figure 1). We first analysed the expression patterns of the ARFs, Aux/IAAs and TIR1/AFBs genes in the SAM. Our results demonstrate a general tendency for most of the 25 ARFs and Aux/IAAs detected in the SAM: a differential expression with low levels at the centre of the meristem (where the stem cells are located) and high levels at the periphery of the meristem (where organ initiation takes place). We also observed a similar differential expression for TIR1/AFB co-receptors. To understand the functional significance of the distribution of ARFs and Aux/IAAs in the SAM, we next investigated the global structure of the Aux/IAA-ARF network using a high-throughput yeast two-hybrid approach and uncovered a rather simple topology that relies on three basic generic features: (i) Aux/IAA proteins interact with themselves, (ii) Aux/IAA proteins interact with ARF activators and (iii) ARF repressors have no or very limited interactions with other proteins in the network.
The results of our interaction analysis suggest a model for the Aux/IAA-ARF signalling pathway in the SAM, where transcriptional activation by ARF activators would be negatively regulated by two independent systems, one involving the ARF repressors, the other the Aux/IAAs. The presence of auxin would remove the inhibitory action of Aux/IAAs, but leave the ARF repressors to compete with ARF activators for promoter-binding sites. To explore the regulatory properties of this signalling network, we developed a mathematical model to describe the transcriptional output as a function of the signalling input that is the combinatorial effect of auxin concentration and of its perception. We then used the model and a simplified view of the meristem (where the same population of Aux/IAAs and ARFs exhibit a low expression at the centre and a high expression in the peripheral zone) for investigating the role of auxin signalling in SAM function. We show that in the model, for a given ARF activator-to-repressor ratio, the gene induction capacity increases with the absolute levels of ARF proteins. We thus predict that the differential expression of the ARFs generates differences in auxin sensitivities between the centre (low sensitivity) and the periphery (high sensitivity), and that the expression of TIR1/AFB participates in this regulation (prediction 1). We also use the model to analyse the transcriptional response to rapidly changing auxin concentrations. By simulating situations equivalent either to the centre or the periphery of our simplified representation of the SAM, we predict that the signalling pathway buffers its response to the auxin input via the balance between ARF activators and repressors, in turn generated by their differential spatial distributions (prediction 2).
To test the predictions from the model experimentally, we needed to assess both the input (auxin level and/or perception) and the output (target gene induction) of the signalling cascade. For measuring the transcriptional output, the widely used DR5 reporter is well suited (Figure 5) (Ulmasov et al, 1997; Sabatini et al, 1999; Benkova et al, 2003; Heisler et al, 2005). For assaying pathway input, we designed DII-VENUS, a novel auxin signalling sensor that comprises a constitutively expressed fusion of the auxin-binding domain (termed domain II or DII) (Dreher et al, 2006; Tan et al, 2007) of an IAA to a fast-maturing variant of YFP, VENUS (Figure 5). The degradation patterns from DII-VENUS indicate a high auxin signalling input both in flower primordia and at the centre of the SAM. This is in contrast to the organ-specific expression pattern of DR5::VENUS (Figure 5). These results indicate that the signalling pathway limits gene activation in response to auxin at the meristem centre and confirm the differential sensitivity to auxin between the centre and the periphery (prediction 1). We further confirmed the buffering capacities of the signalling pathway (prediction 2) by carrying out live imaging experiments to monitor DII-VENUS and DR5::VENUS expression in real time (Figure 5). This analysis reveals the presence of important temporal variations of DII-VENUS fluorescence, while DR5::VENUS does not show such global variations. Our approach thus provides evidence that the Aux/IAA-ARF pathway has a key role in patterning in the SAM, alongside the auxin transport system. Our results illustrate how the tight spatio-temporal regulation of both the distribution of a morphogenetic signal and the activity of the downstream signalling pathway provides robustness to a dynamic developmental process.
A comprehensive expression and interaction map of auxin signalling factors in the Arabidopsis shoot apical meristem is constructed and used to derive a mathematical model of auxin signalling, from which key predictions are experimentally confirmed.
The plant hormone auxin is thought to provide positional information for patterning during development. It is still unclear, however, precisely how auxin is distributed across tissues and how the hormone is sensed in space and time. The control of gene expression in response to auxin involves a complex network of over 50 potentially interacting transcriptional activators and repressors, the auxin response factors (ARFs) and Aux/IAAs. Here, we perform a large-scale analysis of the Aux/IAA-ARF pathway in the shoot apex of Arabidopsis, where dynamic auxin-based patterning controls organogenesis. A comprehensive expression map and full interactome uncovered an unexpectedly simple distribution and structure of this pathway in the shoot apex. A mathematical model of the Aux/IAA-ARF network predicted a strong buffering capacity along with spatial differences in auxin sensitivity. We then tested and confirmed these predictions using a novel auxin signalling sensor that reports input into the signalling pathway, in conjunction with the published DR5 transcriptional output reporter. Our results provide evidence that the auxin signalling network is essential to create robust patterns at the shoot apex.
doi:10.1038/msb.2011.39
PMCID: PMC3167386  PMID: 21734647
auxin; biosensor; live imaging; ODE; signalling
8.  Detecting and Removing Inconsistencies between Experimental Data and Signaling Network Topologies Using Integer Linear Programming on Interaction Graphs 
PLoS Computational Biology  2013;9(9):e1003204.
Cross-referencing experimental data with our current knowledge of signaling network topologies is one central goal of mathematical modeling of cellular signal transduction networks. We present a new methodology for data-driven interrogation and training of signaling networks. While most published methods for signaling network inference operate on Bayesian, Boolean, or ODE models, our approach uses integer linear programming (ILP) on interaction graphs to encode constraints on the qualitative behavior of the nodes. These constraints are posed by the network topology and their formulation as ILP allows us to predict the possible qualitative changes (up, down, no effect) of the activation levels of the nodes for a given stimulus. We provide four basic operations to detect and remove inconsistencies between measurements and predicted behavior: (i) find a topology-consistent explanation for responses of signaling nodes measured in a stimulus-response experiment (if none exists, find the closest explanation); (ii) determine a minimal set of nodes that need to be corrected to make an inconsistent scenario consistent; (iii) determine the optimal subgraph of the given network topology which can best reflect measurements from a set of experimental scenarios; (iv) find possibly missing edges that would improve the consistency of the graph with respect to a set of experimental scenarios the most. We demonstrate the applicability of the proposed approach by interrogating a manually curated interaction graph model of EGFR/ErbB signaling against a library of high-throughput phosphoproteomic data measured in primary hepatocytes. Our methods detect interactions that are likely to be inactive in hepatocytes and provide suggestions for new interactions that, if included, would significantly improve the goodness of fit. Our framework is highly flexible and the underlying model requires only easily accessible biological knowledge. 
All related algorithms were implemented in a freely available toolbox SigNetTrainer making it an appealing approach for various applications.
Author Summary
Cellular signal transduction is orchestrated by communication networks of signaling proteins commonly depicted on signaling pathway maps. However, each cell type may have distinct variants of signaling pathways, and wiring diagrams are often altered in disease states. The identification of truly active signaling topologies based on experimental data is therefore one key challenge in systems biology of cellular signaling. We present a new framework for training signaling networks based on interaction graphs (IG). In contrast to complex modeling formalisms, IG capture merely the known positive and negative edges between the components. This basic information, however, already sets hard constraints on the possible qualitative behaviors of the nodes when perturbing the network. Our approach uses Integer Linear Programming to encode these constraints and to predict the possible changes (down, neutral, up) of the activation levels of the involved players for a given experiment. Based on this formulation we developed several algorithms for detecting and removing inconsistencies between measurements and network topology. Demonstrated by EGFR/ErbB signaling in hepatocytes, our approach delivers direct conclusions on edges that are likely inactive or missing relative to canonical pathway maps. Such information drives the further elucidation of signaling network topologies under normal and pathological phenotypes.
doi:10.1371/journal.pcbi.1003204
PMCID: PMC3764019  PMID: 24039561
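The sign-consistency idea in the abstract above (predicting up, down, or no-effect changes from the signs of the edges alone) can be illustrated on a toy graph. The sketch below is not the authors' ILP formulation and does not use SigNetTrainer: it swaps in brute-force enumeration, which is only practical for very small graphs. The consistency rule is an assumed simplification (a non-zero change must be explained by at least one incoming influence of the same sign; a zero change requires either no influence or opposing influences), and the node names A, B, C are hypothetical.

```python
from itertools import product

def _explained(node, label, edges):
    """Check that the label of `node` is supported by its incoming influences."""
    influences = {sign * label[src]
                  for (src, dst), sign in edges.items() if dst == node}
    influences.discard(0)
    if label[node] == 0:
        # Zero change: no influence at all, or opposing influences that may cancel.
        return len(influences) != 1
    return label[node] in influences

def consistent_labelings(edges, inputs):
    """Enumerate all {-1, 0, +1} labelings consistent with the signed graph.

    edges:  dict mapping (source, target) -> +1 (activation) or -1 (inhibition)
    inputs: dict mapping perturbed node -> fixed sign of the perturbation
    """
    nodes = sorted({n for edge in edges for n in edge} | set(inputs))
    free = [n for n in nodes if n not in inputs]
    results = []
    for values in product((-1, 0, 1), repeat=len(free)):
        label = dict(inputs)
        label.update(zip(free, values))
        if all(_explained(n, label, edges) for n in free):
            results.append(label)
    return results

def measurement_is_consistent(measured, labelings):
    """True if at least one consistent labeling agrees with every measured sign."""
    return any(all(lab[n] == v for n, v in measured.items()) for lab in labelings)

# Toy graph (hypothetical): A activates B, B inhibits C; A is stimulated.
toy_edges = {("A", "B"): 1, ("B", "C"): -1}
toy_labelings = consistent_labelings(toy_edges, {"A": 1})
```

On this toy graph the only consistent outcome is B up and C down, so a measured increase of C would be flagged as inconsistent with the topology. An ILP solver replaces the enumeration when the graph is too large to enumerate.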
9.  Structural and functional protein network analyses predict novel signaling functions for rhodopsin 
Proteomic analyses, literature mining, and structural data were combined to generate an extensive signaling network linked to the visual G protein-coupled receptor rhodopsin. Network analysis suggests novel signaling routes to cytoskeleton dynamics and vesicular trafficking.
Using a shotgun proteomic approach, we identified the protein inventory of the light-sensing outer segment of the mammalian photoreceptor. These data, combined with literature mining, structural modeling, and computational analysis, offer a comprehensive view of signal transduction downstream of the visual G protein-coupled receptor rhodopsin. The network suggests novel signaling branches downstream of rhodopsin to cytoskeleton dynamics and vesicular trafficking. The network serves as a basis for elucidating physiological principles of photoreceptor function and suggests potential disease-associated proteins.
Photoreceptor cells are neurons capable of converting light into electrical signals. The rod outer segment (ROS) region of the photoreceptor cells is a cellular structure made of a stack of around 800 closed membrane disks loaded with rhodopsin (Liang et al, 2003; Nickell et al, 2007). In disc membranes, rhodopsin arranges itself into paracrystalline dimer arrays, enabling optimal association with the heterotrimeric G protein transducin as well as additional regulatory components (Ciarkowski et al, 2005). Disruption of these highly regulated structures and processes by germline mutations is the cause of severe blinding diseases such as retinitis pigmentosa, macular degeneration, or congenital stationary night blindness (Berger et al, 2010).
Traditionally, signal transduction networks have been studied by combining biochemical and genetic experiments addressing the relations among a small number of components. More recently, high-throughput experiments using techniques such as two-hybrid screening or co-immunoprecipitation coupled to mass spectrometry have added a new level of complexity (Ito et al, 2001; Gavin et al, 2002, 2006; Ho et al, 2002; Rual et al, 2005; Stelzl et al, 2005). However, these studies do not take into account space, time, or the fact that many of the interactions detected for a particular protein are mutually incompatible. Structural information can help discriminate between direct and indirect interactions and, more importantly, can determine whether two or more predicted partners of a given protein or complex can bind a target simultaneously or instead compete for the same interaction surface (Kim et al, 2006).
In this work, we build a functional and dynamic interaction network centered on rhodopsin on a systems level, using six steps: In step 1, we experimentally identified the proteomic inventory of the porcine ROS, and we compared our data set with a recent proteomic study from bovine ROS (Kwok et al, 2008). The union of the two data sets was defined as the ‘initial experimental ROS proteome'. After removal of contaminants and applying filtering methods, a ‘core ROS proteome', consisting of 355 proteins, was defined.
In step 2, proteins of the core ROS proteome were assigned to six functional modules: (1) vision, signaling, transporters, and channels; (2) outer segment structure and morphogenesis; (3) housekeeping; (4) cytoskeleton and polarity; (5) vesicles formation and trafficking, and (6) metabolism.
In step 3, a protein-protein interaction network was constructed based on literature mining. Because the experimental evidence for most of the interactions was co-immunoprecipitation or pull-down experiments, and many of the edges in the network are supported by a single piece of experimental evidence, often derived from high-throughput approaches, we refer to this network as the ‘fuzzy ROS interactome'. Structural information was used to predict binary interactions, based on the finding that similar domain pairs are likely to interact in a similar way (‘nature repeats itself') (Aloy and Russell, 2002). To increase the confidence in the resulting network, edges supported by a single piece of evidence not coming from yeast two-hybrid experiments were removed, the exception being interactions where the evidence was the existence of a three-dimensional structure of the complex itself, or of a highly homologous complex. This curated static network (‘high-confidence ROS interactome') comprises 660 edges linking the majority of the nodes. By considering only edges supported by at least one piece of evidence of direct binary interaction, we obtain a ‘high-confidence binary ROS interactome'. We next extended the published core pathway (Dell'Orco et al, 2009) using evidence from our high-confidence network. We find several new direct binary links to different cellular functional processes (Figure 4): the active rhodopsin interacts with Rac1 and the GTP form of Rho. There is also a connection between active rhodopsin and Arf4, as well as between PDEδ and both Rab13 and the GTP-bound form of Arl3, which links the vision cycle to vesicle trafficking and structure. We see a connection of PDEδ with prenyl-modified proteins, such as several small GTPases, as well as with rhodopsin kinase. Further, our network reveals several direct binary connections between Ca2+-regulated proteins and cytoskeleton proteins; these are CaMK2A with actinin, calmodulin with GAP43 and S100B, and PKC with 14-3-3 family members.
In step 4, part of the network was experimentally validated using three different approaches to identify physical protein associations that would occur under physiological conditions: (i) co-segregation/co-sedimentation experiments, (ii) immunoprecipitations combined with mass spectrometry and/or subsequent immunoblotting, and (iii) use of the glycosylated N-terminus of rhodopsin to isolate its associated protein partners by Concanavalin A affinity purification. In total, 60 co-purification and co-elution experiments supported interactions that were already in our literature network, and new evidence from 175 co-IP experiments in this work was added. Next, we aimed to provide additional independent experimental confirmation for two of the novel networks and functional links proposed based on the network analysis: (i) the proposed complex between Rac1, RhoA, CRMP-2, tubulin, and ROCK II in ROS was investigated by culturing retinal explants in the presence of a ROCK II-specific inhibitor (Figure 6). While the morphology of the retinas treated with ROCK II inhibitor appeared normal, immunohistochemistry analyses revealed several alterations at the protein level. (ii) We supported the hypothesis that PDEδ could function as a GDI for Rac1 in ROS by demonstrating that PDEδ and Rac1 co-localize in ROS and that PDEδ could dissociate Rac1 from ROS membranes in vitro.
In step 5, we use structural information to distinguish between mutually compatible (‘AND') or excluded (‘XOR') interactions. This enables breaking a network of nodes and edges into functional machines or sub-networks/modules. In the vision branch, both ‘AND' and ‘XOR' gates synergize. This may allow dynamic tuning of light and dark states. However, all connections from the vision module to other modules are ‘XOR' connections suggesting that competition, in connection with local protein concentration changes, could be important for transmitting signals from the core vision module.
In the last step, we map and functionally characterize the known mutations that produce blindness.
In summary, this represents the first comprehensive, dynamic, and integrative rhodopsin signaling network, which can be the basis for integrating and mapping newly discovered disease mutants, to guide protein or signaling branch-specific therapies.
Orchestration of signaling, photoreceptor structural integrity, and maintenance needed for mammalian vision remain enigmatic. By integrating three proteomic data sets, literature mining, computational analyses, and structural information, we have generated a multiscale signal transduction network linked to the visual G protein-coupled receptor (GPCR) rhodopsin, the major protein component of rod outer segments. This network was complemented by domain decomposition of protein–protein interactions and then qualified for mutually exclusive or mutually compatible interactions and ternary complex formation using structural data. The resulting information not only offers a comprehensive view of signal transduction induced by this GPCR but also suggests novel signaling routes to cytoskeleton dynamics and vesicular trafficking, predicting an important level of regulation through small GTPases. Further, it demonstrates a specific disease susceptibility of the core visual pathway due to the uniqueness of its components present mainly in the eye. As a comprehensive multiscale network, it can serve as a basis to elucidate the physiological principles of photoreceptor function, identify potential disease-associated genes and proteins, and guide the development of therapies that target specific branches of the signaling pathway.
doi:10.1038/msb.2011.83
PMCID: PMC3261702  PMID: 22108793
protein interaction network; rhodopsin signaling; structural modeling
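Step 5 of the summary above separates mutually compatible (‘AND') from mutually exclusive (‘XOR') interactions using structural information. A minimal sketch of that classification follows, assuming each partner of a shared target has already been annotated with the binding surface it occupies. The surface labels and the partner names P1 to P3 are hypothetical placeholders, not data from the study, and the rule (same surface means competition) is a simplification of what a structural analysis would establish.

```python
from itertools import combinations

def classify_partner_pairs(binding_sites):
    """Label each pair of partners of one target as 'XOR' or 'AND'.

    binding_sites maps partner name -> label of the surface it binds on the
    shared target. Partners that need the same surface compete (XOR);
    partners on different surfaces can bind at the same time (AND).
    """
    gates = {}
    for a, b in combinations(sorted(binding_sites), 2):
        same_surface = binding_sites[a] == binding_sites[b]
        gates[(a, b)] = "XOR" if same_surface else "AND"
    return gates

# Toy input: three hypothetical partners of a single target protein.
toy_sites = {"P1": "surface_A", "P2": "surface_A", "P3": "surface_B"}
toy_gates = classify_partner_pairs(toy_sites)
```

Here P1 and P2 compete for the same surface, while P3 can bind alongside either of them. Grouping partners by such gates is one way a static network of nodes and edges could be broken into the sub-networks the summary describes.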
10.  Dynamic interaction networks in a hierarchically organized tissue 
We have integrated gene expression profiling with database and literature mining, mechanistic modeling, and cell culture experiments to identify intercellular and intracellular networks regulating blood stem cell self-renewal. Blood stem cell fate in vitro is regulated non-autonomously by a coupled positive–negative intercellular feedback circuit, composed of megakaryocyte-derived stimulatory growth factors (VEGF, PDGF, EGF, and serotonin) versus monocyte-derived inhibitory factors (CCL3, CCL4, CXCL10, TGFB2, and TNFSF9). The antagonistic signals converge in a core intracellular network focused around PI3K, Raf, PLC, and Akt. Model simulations enable functional classification of the novel endogenous ligands and signaling molecules.
Intercellular (between cell) communication networks are required to maintain homeostasis and coordinate regenerative and developmental cues in multicellular organisms. Despite the recognized importance of intercellular networks in regulating adult stem and progenitor cell fate, the specific cell populations involved, and the underlying molecular mechanisms are largely undefined. Although a limited number of studies have applied novel bioinformatic approaches to unravel intercellular signaling in other cell systems (Frankenstein et al, 2006), a comprehensive analysis of intercellular communication in a stem cell-derived, hierarchical tissue network has yet to be reported.
As a model system to explore intercellular communication networks in a hierarchically organized tissue, we cultured human umbilical cord blood (UCB)-derived stem and progenitor cells in defined, minimal cytokine-supplemented liquid culture (Madlambayan et al, 2006). To systematically explore the molecular and cellular dynamics underlying primitive progenitor growth and differentiation, gene expression profiles of primitive (lineage negative; Lin−) and mature (lineage positive; Lin+) populations were generated during phases of stem cell expansion versus depletion. Parallel phenotypic and subproteomic experiments validated that mRNA expression correlated with complex measures of proteome activity (protein secretion and cell surface expression). Using a curated list of secreted ligand–receptor interactions and published expression profiles of purified mature blood populations, we implemented a novel algorithm to reconstruct the intercellular signaling networks established between stem cells and multi-lineage progeny in vitro. By correlating differential expression patterns with stem cell growth, we predict cell populations, pathways, and secreted ligands associated with stem cell self-renewal and differentiation (Figure 3A).
We then tested the correlative predictions in a series of cell culture experiments. UCB progenitor cell cultures were supplemented with saturating amounts of 18 putative regulatory ligands, or cocultured with purified mature blood lineages (megakaryocytes, monocytes, and erythrocytes), and analyzed for effects on total cell, progenitor, and primitive progenitor growth. At the primitive progenitor level, 3/5 novel predicted stimulatory ligands (EGF, PDGFB, and VEGF) displayed significant positive effects, 5/7 predicted inhibitory factors (CCL3, CCL4, CXCL10, TNFSF9, and TGFB2) displayed negative effects, whereas only 1/5 non-correlated ligands (CXCL7) displayed an effect. Also consistent with predictions from gene expression data, megakaryocytes and monocytes were found to stimulate and inhibit primitive progenitor growth, respectively, and these effects were attributable to differential secretome profiles of stimulatory versus inhibitory ligands.
Cellular responses to external stimuli, particularly in heterogeneous and dynamic cell populations, represent complex functions of multiple cell fate decisions acting both directly and indirectly on the target (stem cell) populations. Experimentally distinguishing the mode of action of cytokines is thus a difficult task. To address this we used our previously published interactive model of hematopoiesis (Kirouac et al, 2009) to classify experimentally identified regulatory ligands into one of four distinct functional categories based on their differential effects on cell population growth. TGFB2 was classified as a proliferation inhibitor, CCL4, CXCL10, SPARC, and TNFSF9 as self-renewal inhibitors, CCL3 a proliferation stimulator, and EGF, VEGF, and PDGFB as self-renewal stimulators.
Stem and progenitor cells exposed to combinatorial extracellular signals must propagate this information through intracellular molecular networks, and respond appropriately by modifying cell fate decisions. To explore how our experimentally identified positive and negative regulatory signals are integrated at the intracellular level, we constructed a blood stem cell self-renewal signaling network through extensive literature curation and protein–protein interaction (PPI) network mapping. We find that signal transduction pathways activated by the various stimulatory and inhibitory ligands converge on a limited set of molecular control nodes, forming a core subnetwork enriched for known regulators of self-renewal (Figure 6A). To experimentally test the intracellular signaling molecules computationally predicted as regulators of stem cell self-renewal, we obtained five small molecule antagonists against the kinases Phosphatidylinositol 3-kinase (PI3K), Raf, Akt, Phospholipase C (PLC), and MEK1. Liquid cultures were supplemented with the five molecules individually, and resultant cell population outputs compared against model simulations to deconvolute the functional effects on proliferation (and survival) versus self-renewal. This analysis classifies inhibition of PI3K and Raf activity as selectively targeting self-renewal, PLC as selectively targeting survival, and Akt as selectively targeting proliferation; MEK inhibition appears non-specific for these processes.
This represents the first systematic characterization of how cell fate decisions are regulated non-autonomously through lineage-specific interactions with differentiated progeny. The complex intercellular communication networks can be approximated as an antagonistic positive–negative feedback circuit, wherein progenitor expansion is modulated by a balance of megakaryocyte-derived stimulatory factors (EGF, PDGF, VEGF, and possibly serotonin) versus monocyte-derived inhibitory factors (CCL3, CCL4, CXCL10, TGFB2, and TNFSF9). This complex milieu of endogenous regulatory signals is integrated and processed within a core intracellular signaling network, resulting in modulation of cell-level kinetic parameters (proliferation, survival, and self-renewal). We reconstruct a stem cell associated intracellular network, and identify PI3K, Raf, Akt, and PLC as functionally distinct signal integration nodes, linking extracellular and intracellular signaling. These findings lay the groundwork for novel strategies to control blood stem cell self-renewal in vitro and in vivo.
Intercellular (between cell) communication networks maintain homeostasis and coordinate regenerative and developmental cues in multicellular organisms. Despite the importance of intercellular networks in stem cell biology, their rules, structure and molecular components are poorly understood. Herein, we describe the structure and dynamics of intercellular and intracellular networks in a stem cell derived, hierarchically organized tissue using experimental and theoretical analyses of cultured human umbilical cord blood progenitors. By integrating high-throughput molecular profiling, database and literature mining, mechanistic modeling, and cell culture experiments, we show that secreted factor-mediated intercellular communication networks regulate blood stem cell fate decisions. In particular, self-renewal is modulated by a coupled positive–negative intercellular feedback circuit composed of megakaryocyte-derived stimulatory growth factors (VEGF, PDGF, EGF, and serotonin) versus monocyte-derived inhibitory factors (CCL3, CCL4, CXCL10, TGFB2, and TNFSF9). We reconstruct a stem cell intracellular network, and identify PI3K, Raf, Akt, and PLC as functionally distinct signal integration nodes, linking extracellular and intracellular signaling. This represents the first systematic characterization of how stem cell fate decisions are regulated non-autonomously through lineage-specific interactions with differentiated progeny.
doi:10.1038/msb.2010.71
PMCID: PMC2990637  PMID: 20924352
cellular networks; hematopoiesis; intercellular signaling; self-renewal; stem cells
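The network reconstruction summarized above combines a curated list of ligand–receptor pairs with expression profiles of the cell populations. The sketch below shows only the simplest version of that matching step, assuming expression has already been reduced to a set of expressed genes per population. The function name, the population labels, and the gene identifiers LIG1, REC1, LIG2, and REC2 are hypothetical, and the correlation with stem cell growth that the study uses to rank candidates is not modeled here.

```python
def intercellular_edges(ligand_receptor_pairs, expressed):
    """Return sorted (sender, ligand, receptor, receiver) tuples.

    ligand_receptor_pairs: iterable of (ligand, receptor) gene names
    expressed: dict mapping population name -> set of expressed genes

    An edge is drawn whenever one population expresses the ligand and any
    population (possibly the same one) expresses the matching receptor.
    """
    edges = []
    for ligand, receptor in ligand_receptor_pairs:
        for sender, sender_genes in expressed.items():
            if ligand not in sender_genes:
                continue
            for receiver, receiver_genes in expressed.items():
                if receptor in receiver_genes:
                    edges.append((sender, ligand, receptor, receiver))
    return sorted(edges)

# Toy data with placeholder gene names (not taken from the study).
toy_pairs = [("LIG1", "REC1"), ("LIG2", "REC2")]
toy_expr = {"mature": {"LIG1", "LIG2"}, "primitive": {"REC1"}}
toy_edges = intercellular_edges(toy_pairs, toy_expr)
```

In the toy data only the LIG1/REC1 pair yields an edge, from the mature to the primitive population, because no population expresses REC2.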
11.  Using Chemistry and Microfluidics To Understand the Spatial Dynamics of Complex Biological Networks 
Accounts of chemical research  2008;41(4):549-558.
CONSPECTUS
Understanding the spatial dynamics of biochemical networks is both fundamentally important for understanding life at the systems level and also has practical implications for medicine, engineering, biology, and chemistry. Studies at the level of individual reactions provide essential information about the function, interactions, and localization of individual molecular species and reactions in a network. However, analyzing the spatial dynamics of complex biochemical networks at this level is difficult. Biochemical networks are non-equilibrium systems containing dozens to hundreds of reactions with nonlinear and time-dependent interactions, and these interactions are influenced by diffusion, flow, and the relative values of state-dependent kinetic parameters.
To achieve an overall understanding of the spatial dynamics of a network and the global mechanisms that drive its function, networks must be analyzed as a whole, where all of the components and influential parameters of a network are simultaneously considered. Here, we describe chemical concepts and microfluidic tools developed for network-level investigations of the spatial dynamics of these networks. Modular approaches can be used to simplify these networks by separating them into modules, and simple experimental or computational models can be created by replacing each module with a single reaction. Microfluidics can be used to implement these models as well as to analyze and perturb the complex network itself with spatial control on the micrometer scale.
We also describe the application of these network-level approaches to elucidate the mechanisms governing the spatial dynamics of two networks–hemostasis (blood clotting) and early patterning of the Drosophila embryo. To investigate the dynamics of the complex network of hemostasis, we simplified the network by using a modular mechanism and created a chemical model based on this mechanism by using microfluidics. Then, we used the mechanism and the model to predict the dynamics of initiation and propagation of blood clotting and tested these predictions with human blood plasma by using microfluidics. We discovered that both initiation and propagation of clotting are regulated by a threshold response to the concentration of activators of clotting, and that clotting is sensitive to the spatial localization of stimuli. To understand the dynamics of patterning of the Drosophila embryo, we used microfluidics to perturb the environment around a developing embryo and observe the effects of this perturbation on the expression of Hunchback, a protein whose localization is essential to proper development. We found that the mechanism that is responsible for Hunchback positioning is asymmetric, time-dependent, and more complex than previously proposed by studies of individual reactions.
Overall, these approaches provide strategies for simplifying, modeling, and probing complex networks without sacrificing the functionality of the network. Such network-level strategies may be most useful for understanding systems with nonlinear interactions where spatial dynamics is essential for function. In addition, microfluidics provides an opportunity to investigate the mechanisms responsible for robust functioning of complex networks. By creating nonideal, stressful, and perturbed environments, microfluidic experiments could reveal the function of pathways thought to be nonessential under ideal conditions.
doi:10.1021/ar700174g
PMCID: PMC2593841  PMID: 18217723
12.  An integer optimization algorithm for robust identification of non-linear gene regulatory networks 
BMC Systems Biology  2012;6:119.
Background
Reverse engineering gene networks and identifying regulatory interactions are integral to understanding cellular decision-making processes. Advances in high-throughput experimental techniques have enabled innovative data-driven analysis of gene regulatory networks. However, the inherent noise associated with biological systems requires numerous experimental replicates for reliable conclusions. Furthermore, robust algorithms that directly exploit basic biological traits are few. Such algorithms are expected to be efficient in their performance and robust in their prediction.
Results
We have developed a network identification algorithm to accurately infer both the topology and strength of regulatory interactions from time series gene expression data in the presence of significant experimental noise and non-linear behavior. In this novel formalism, we have addressed data variability in biological systems by integrating network identification with the bootstrap resampling technique, hence predicting robust interactions from limited experimental replicates subject to noise. Furthermore, we have incorporated non-linearity in gene dynamics using the S-system formulation. The basic network identification formulation exploits the sparsity of biological interactions. To that end, the identification algorithm is formulated as an integer-programming problem by introducing binary variables for each network component. The objective function minimizes the number of network connections subject to the constraint of maximal agreement between the experimental and predicted gene dynamics. The developed algorithm is validated using both in silico and experimental data-sets. These studies show that the algorithm can accurately predict the topology and connection strength of the in silico networks, as quantified by high precision and recall, and small discrepancy between the actual and predicted kinetic parameters. Furthermore, in both the in silico and experimental case studies, the predicted gene expression profiles are in very close agreement with the dynamics of the input data.
Conclusions
Our integer programming algorithm effectively utilizes bootstrapping to identify robust gene regulatory networks from noisy, non-linear time-series gene expression data. With significant noise and non-linearities being inherent to biological systems, the present formalism, with the incorporation of network sparsity, is extremely relevant to gene regulatory networks, and while the formulation has been validated against in silico and E. coli data, it can be applied to any biological system.
doi:10.1186/1752-0509-6-119
PMCID: PMC3444924  PMID: 22937832
Gene regulatory networks; Non-linear dynamics; S-system; Robust network identification; Bootstrapping; Integer programming; Optimization algorithm
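The S-system formulation named in the abstract above is a standard power-law model in which each gene has one production term and one degradation term. The sketch below evaluates its right-hand side for a toy two-gene system. The function name and all parameter values are illustrative assumptions, and the integer-programming and bootstrap steps of the published algorithm are not reproduced.

```python
from math import prod

def s_system_rhs(x, alpha, beta, g, h):
    """Evaluate dx_i/dt = alpha_i * prod_j x_j**g[i][j] - beta_i * prod_j x_j**h[i][j].

    x:     current expression levels (positive floats)
    alpha: production rate constants
    beta:  degradation rate constants
    g, h:  kinetic-order matrices; a zero entry means "no regulatory effect"
    """
    rates = []
    for i in range(len(x)):
        production = alpha[i] * prod(xj ** gij for xj, gij in zip(x, g[i]))
        degradation = beta[i] * prod(xj ** hij for xj, hij in zip(x, h[i]))
        rates.append(production - degradation)
    return rates

# Toy two-gene system with made-up parameters.
toy_rates = s_system_rhs(
    x=[2.0, 3.0],
    alpha=[1.0, 1.0],
    beta=[1.0, 1.0],
    g=[[0, 1], [1, 0]],  # each gene's production depends on the other gene
    h=[[1, 0], [0, 1]],  # each gene's degradation depends on its own level
)
```

The non-zero entries of `g` and `h` play the role of network edges, which is why a sparsity objective over them amounts to selecting a small set of regulatory connections.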
13.  Inference of Gene Regulatory Networks with Sparse Structural Equation Models Exploiting Genetic Perturbations 
PLoS Computational Biology  2013;9(5):e1003068.
Integrating genetic perturbations with gene expression data not only improves the accuracy of regulatory network topology inference, but also enables learning of causal regulatory relations between genes. Although a number of methods have been developed to integrate both types of data, the need for efficient and powerful algorithms remains. In this paper, sparse structural equation models (SEMs) are employed to integrate both gene expression data and cis-expression quantitative trait loci (cis-eQTL), for modeling gene regulatory networks in accordance with biological evidence about genes regulating or being regulated by a small number of genes. A systematic inference method named sparsity-aware maximum likelihood (SML) is developed for SEM estimation. Using simulated directed acyclic or cyclic networks, the SML performance is compared with that of two state-of-the-art algorithms: the adaptive Lasso (AL) based scheme, and the QTL-directed dependency graph (QDG) method. Computer simulations demonstrate that the novel SML algorithm offers significantly better performance than the AL-based and QDG algorithms across all sample sizes from 100 to 1,000, in terms of detection power and false discovery rate, in all the cases tested, which include acyclic or cyclic networks of 10, 30 and 300 genes. The SML method is further applied to infer a network of 39 human genes that are related to the immune function and are chosen to have a reliable eQTL per gene. The resulting network consists of 9 genes and 13 edges. Most of the edges represent interactions reasonably expected from experimental evidence, while the remaining edges may indicate new interactions. The sparse SEM and efficient SML algorithm provide an effective means of exploiting both gene expression and perturbation data to infer gene regulatory networks. An open-source computer program implementing the SML algorithm is freely available upon request.
Author Summary
Deciphering the structure of gene regulatory networks is crucial for understanding gene functions and cellular dynamics, as well as system-level modeling of individual genes and cellular functions. Computational methods exploiting gene expression and other types of data generated from high-throughput experiments provide an efficient and low-cost means of inferring gene networks. Sparse structural equation models are employed to: i) integrate both gene expression and genetic perturbation data for inference of gene networks; and, ii) develop an efficient sparsity-aware inference algorithm. Computer simulations corroborate that the novel algorithm markedly outperforms state-of-the-art alternatives. The algorithm is further applied to infer a real human gene network unveiling possible interactions between several genes. Since gene networks can be perturbed not only by genetic variations but also by other means such as gene copy number changes, gene knockdown or controlled gene over-expression, this paper's method can be applied to a number of practical scenarios.
doi:10.1371/journal.pcbi.1003068
PMCID: PMC3662697  PMID: 23717196
14.  Metabolic Constraint-Based Refinement of Transcriptional Regulatory Networks 
PLoS Computational Biology  2013;9(12):e1003370.
There is a strong need for computational frameworks that integrate different biological processes and data-types to unravel cellular regulation. Current efforts to reconstruct transcriptional regulatory networks (TRNs) focus primarily on proximal data such as gene co-expression and transcription factor (TF) binding. While such approaches enable rapid reconstruction of TRNs, the overwhelming combinatorics of possible networks limits identification of mechanistic regulatory interactions. Utilizing growth phenotypes and systems-level constraints to inform regulatory network reconstruction is an unmet challenge. We present our approach Gene Expression and Metabolism Integrated for Network Inference (GEMINI) that links a compendium of candidate regulatory interactions with the metabolic network to predict their systems-level effect on growth phenotypes. We then compare predictions with experimental phenotype data to select phenotype-consistent regulatory interactions. GEMINI makes use of the observation that only a small fraction of regulatory network states are compatible with a viable metabolic network, and outputs a regulatory network that is simultaneously consistent with the input genome-scale metabolic network model, gene expression data, and TF knockout phenotypes. GEMINI preferentially recalls gold-standard interactions (p-value = 10^−172), significantly better than using gene expression alone. We applied GEMINI to create an integrated metabolic-regulatory network model for Saccharomyces cerevisiae involving 25,000 regulatory interactions controlling 1597 metabolic reactions. The model quantitatively predicts TF knockout phenotypes in new conditions (p-value = 10^−14) and revealed potential condition-specific regulatory mechanisms.
Our results suggest that a metabolic constraint-based approach can be successfully used to help reconstruct TRNs from high-throughput data, and highlight the potential of using a biochemically-detailed mechanistic framework to integrate and reconcile inconsistencies across different data-types. The algorithm and associated data are available at https://sourceforge.net/projects/gemini-data/
Author Summary
Cellular networks, such as metabolic and transcriptional regulatory networks (TRNs), do not operate independently but work together in unison to determine cellular phenotypes. Further, the phenotype and architecture of one network constrains the topology of other networks. Hence, it is critical to study network components and interactions in the context of the entire cell. Typically, efforts to reconstruct TRNs focus only on immediately proximal data such as gene co-expression and transcription factor (TF)-binding. Herein, we take a different strategy by linking candidate TRNs with the metabolic network to predict systems-level responses such as growth phenotypes of TF knockout strains, and compare predictions with experimental phenotype data to select amongst the candidate TRNs. Our approach goes beyond traditional data integration approaches for network inference and refinement by using a predictive network model (metabolism) to refine another network model (regulation) – thus providing an alternative avenue to this area of research. Understanding how the networks function together in a cell will pave the way for synthetic biology and has a wide-range of applications in biotechnology, drug discovery and diagnostics. Further we demonstrate how metabolic models can integrate and reconcile inconsistencies across different data-types.
doi:10.1371/journal.pcbi.1003370
PMCID: PMC3857774  PMID: 24348226
15.  Identifying Biological Network Structure, Predicting Network Behavior, and Classifying Network State With High Dimensional Model Representation (HDMR) 
PLoS ONE  2012;7(6):e37664.
This work presents an adapted Random Sampling - High Dimensional Model Representation (RS-HDMR) algorithm for synergistically addressing three key problems in network biology: (1) identifying the structure of biological networks from multivariate data, (2) predicting network response under previously unsampled conditions, and (3) inferring experimental perturbations based on the observed network state. RS-HDMR is a multivariate regression method that decomposes network interactions into a hierarchy of non-linear component functions. Sensitivity analysis based on these functions provides a clear physical and statistical interpretation of the underlying network structure. The advantages of RS-HDMR include efficient extraction of nonlinear and cooperative network relationships without resorting to discretization, prediction of network behavior without mechanistic modeling, robustness to data noise, and favorable scalability of the sampling requirement with respect to network size. As a proof-of-principle study, RS-HDMR was applied to experimental data measuring the single-cell response of a protein-protein signaling network to various experimental perturbations. A comparison to network structure identified in the literature and through other inference methods, including Bayesian and mutual-information based algorithms, suggests that RS-HDMR can successfully reveal a network structure with a low false positive rate while still capturing non-linear and cooperative interactions. RS-HDMR identified several higher-order network interactions that correspond to known feedback regulations among multiple network species and that were unidentified by other network inference methods. Furthermore, RS-HDMR has a better ability to predict network response under unsampled conditions in this application than the best statistical inference algorithm presented in the recent DREAM3 signaling-prediction competition. 
RS-HDMR can discern and predict differences in network state that arise from sources ranging from intrinsic cell-cell variability to altered experimental conditions, such as when drug perturbations are introduced. This ability ultimately allows RS-HDMR to accurately classify the experimental conditions of a given sample based on its observed network state.
doi:10.1371/journal.pone.0037664
PMCID: PMC3377689  PMID: 22723838
16.  Trimming of mammalian transcriptional networks using network component analysis 
BMC Bioinformatics  2010;11:511.
Background
Network Component Analysis (NCA) has been used to deduce the activities of transcription factors (TFs) from gene expression data and the TF-gene binding relationship. However, the TF-gene interaction varies in different environmental conditions and tissues, but such information is rarely available and cannot be predicted simply by motif analysis. Thus, it is beneficial to identify key TF-gene interactions under the experimental condition based on transcriptome data. Such information would be useful in identifying key regulatory pathways and gene markers of TFs in further studies.
Results
We developed an algorithm to trim network connectivity such that the important regulatory interactions between the TFs and the genes were retained and the regulatory signals were deduced. Theoretical studies demonstrated that the regulatory signals were accurately reconstructed even in the case where only three independent transcriptome datasets were available. At least 80% of the main target genes were correctly predicted in the extreme condition of high noise level and small number of datasets. Our algorithm was tested with transcriptome data taken from mice under rapamycin treatment. The initial network topology from the literature contains 70 TFs, 778 genes, and 1423 edges between the TFs and genes. Our method retained 1074 edges (i.e. 75% of the original edge number) and identified 17 TFs as being significantly perturbed under the experimental condition. Twelve of these TFs are involved in MAPK signaling or myeloid leukemia pathways defined in the KEGG database, or are known to physically interact with each other. Additionally, four of these TFs, which are Hif1a, Cebpb, Nfkb1, and Atf1, are known targets of rapamycin. Furthermore, the trimmed network was able to predict Eno1 as an important target of Hif1a; this key interaction could not be detected without trimming the regulatory network.
Conclusions
The advantage of our new algorithm, relative to the original NCA, is that our algorithm can identify the important TF-gene interactions. Identifying the important TF-gene interactions is crucial for understanding the roles of pleiotropic global regulators, such as p53. Also, our algorithm has been developed to overcome NCA's inability to analyze large networks where multiple TFs regulate a single gene. Thus, our algorithm extends the applicability of NCA to the realm of mammalian regulatory network analysis.
doi:10.1186/1471-2105-11-511
PMCID: PMC2967563  PMID: 20942926
17.  Division of labor by dual feedback regulators controls JAK2/STAT5 signaling over broad ligand range 
Quantitative analysis of time-resolved data in primary erythroid progenitor cells reveals that a dual negative transcriptional feedback mechanism underlies the ability of STAT5 to respond to the broad spectrum of physiologically relevant Epo concentrations.
A mathematical dual feedback model of the Epo-induced JAK2/STAT5 signaling pathway was calibrated with extensive time-resolved quantitative data sets from immunoblotting, mass spectrometry and qRT–PCR experiments in primary erythroid progenitor cells. We show that the amount of nuclear phosphorylated STAT5 integrated for 60 min post Epo stimulation directly correlates with the fraction of surviving cells 24 h later. CIS and SOCS3 were identified as the most relevant transcriptional feedback regulators of JAK2/STAT5 signaling in primary erythroid progenitor cells. Applying the model, we revealed that CIS-mediated inhibitory effects are most important at low ligand concentrations, whereas SOCS3 inhibition is more effective at high ligand doses. The distinct modes of inhibition of CIS and SOCS3 at various Epo concentrations provide a strategy for achieving control of JAK2/STAT5 signaling over the entire range of physiological Epo concentrations.
Cells interpret information encoded by extracellular stimuli through the activation of intracellular signaling networks and translate this information into cellular decisions. A prime example for a system that is exposed to extremely variable ligand concentrations is the erythroid lineage. The key regulator Erythropoietin (Epo) facilitates continuous renewal of erythrocytes at low basal levels but also secures compensation in case of, e.g., blood loss through an up to 1000-fold increase in hormone concentration. The Epo receptor (EpoR) is expressed on erythroid progenitor cells at the colony forming unit erythroid (CFU-E) stage. Stimulation of these cells with Epo leads to rapid but transient activation of receptor and JAK2 phosphorylation followed by phosphorylation of the latent transcription factor STAT5. Although STAT5 is known to be an essential regulator of survival and differentiation of erythroid progenitor cells, a quantitative link between the dynamic properties of STAT5 signaling and survival decisions remained unknown. STAT5-mediated responses in CFU-E cells are modulated by multiple attenuation mechanisms that operate on different time scales. Fast-acting mechanisms such as depletion of Epo by rapid receptor turnover and recruitment of the phosphatase SHP-1 control the initial signal amplitude at the receptor level. Transcriptional feedback regulators such as suppressor of cytokine signaling (SOCS) family members CIS and SOCS3 operate at a slower time scale. Despite the ample knowledge of the individual components involved, only little is known about the specific contributions of these regulators in controlling dynamic properties of STAT5 in response to a broad range of input signals. Therefore, dynamic pathway modeling is required to understand the complex regulatory network of feedback regulators.
To address these questions, we established a dual negative feedback model of JAK2/STAT5 signaling in primary erythroid progenitor cells isolated from mouse fetal livers. We provide a large data set of JAK2/STAT5 signaling dynamics employing quantitative immunoblotting, mass spectrometry and quantitative RT–PCR measured under different perturbation conditions to calibrate our model (Figure 3). The structure of our model was constructed to comprise the minimal number of parameters necessary to explain the data. Thereby, we aimed at a model with fully identifiable parameters that are essential to obtain high predictive power. Parameter identifiability was analyzed by the profile likelihood approach. Applying this method, we could establish a dual negative feedback model of JAK2-STAT5 signaling with structurally and in most cases practically identifiable parameters.
A major bottle-neck in combining signal transduction events with cellular phenotypes is the discrepancy in the time scale and stimuli concentrations that are applied in the different experiments. The sensitivity of biochemical assays to determine phosphorylation events within minutes or hours after stimulation is usually lower than the threshold of sensitivity in assays to determine the physiological response after one or more days. Facilitated by the model, we were able to compute the integrated response of JAK2/STAT5 signaling components for experimentally unaddressable Epo concentrations. Our results demonstrate that the integrated response of pSTAT5 in the nucleus accurately correlates with the experimentally determined survival of CFU-E cells. This provides a quantitative link of the dependency of primary CFU-E cells on pSTAT5 activation dynamics. By correlation analysis, we could identify the early signaling phase (⩽1 h) of STAT5 to be the most predictive for the fraction of surviving cells, which was determined ∼24 h later. Thus, we hypothesize that as a general principle in apoptotic decisions, ligand concentrations translated into kinetic-encoded information of early signaling events downstream of receptors can be predictive for survival decisions 24 h later.
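The "integrated response" referred to above is an area under a time course. As a minimal sketch (not the authors' model), the integral over the first 60 min can be approximated with the trapezoidal rule. The function name and the sample values below are hypothetical and chosen only for illustration; they are not data from the study.

```python
def integrated_response(times_min, signal):
    """Trapezoidal-rule integral of a sampled signal over its time grid."""
    if len(times_min) != len(signal) or len(times_min) < 2:
        raise ValueError("need at least two matching (time, signal) samples")
    area = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]
        area += 0.5 * (signal[i] + signal[i - 1]) * dt
    return area


# Hypothetical, made-up nuclear pSTAT5 time course (arbitrary units), 0-60 min.
times = [0, 10, 20, 30, 40, 50, 60]
pstat5 = [0.0, 8.0, 6.0, 4.0, 3.0, 2.5, 2.0]
auc_60 = integrated_response(times, pstat5)
```

Repeating this for several Epo doses would give one integrated value per dose, which could then be compared with the measured survival fraction.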
After the first hour of stimulation, it is important to constrain signaling to a residual steady-state level. Constitutive phosphorylation of the JAK2/STAT5 pathway has a crucial role in the onset of polycythemia vera (PV), a disease associated with Epo-independent erythroid differentiation. The two identified transcriptional feedback proteins, CIS and SOCS3, are responsible for adjusting the phosphorylation level of STAT5 after 1 h of stimulation. Since the Epo input signal can vary over a broad range of ligand concentrations, we asked how CIS and SOCS3 can facilitate control of STAT5 long-term phosphorylation levels over the entire range of physiologically relevant hormone concentrations. By using model simulations, we revealed that the two feedbacks are most effective at different Epo concentration ranges. As predicted by our mathematical model, the major role of CIS in modulating STAT5 phosphorylation levels is at low, basal Epo concentrations, whereas SOCS3 is essential to control the STAT5 phosphorylation levels at high Epo doses (Figure 6). As a potential molecular mechanism of this dose-dependent inhibitory effect, we could identify the quantity of pJAK2 relative to pEpoR that increases with higher Epo concentrations. Since SOCS3 can inhibit JAK2 directly via its KIR domain to attenuate downstream STAT5 activation, SOCS3 becomes more effective with the relative increase of JAK2 activation. Hence, CIS and SOCS3 act in a concerted manner to ensure tight regulation of STAT5 responses over the broad physiological range of Epo concentrations.
In summary, our mathematical approach provided new insights into the specific function of feedback regulation in STAT5-mediated life or death decisions of primary erythroid cells. We dissected the roles of the transcriptionally induced proteins CIS and SOCS3 that operate as dual feedback with divided function thereby facilitating the control of STAT5 activation levels over the entire range of physiological Epo concentrations. The detailed understanding of the molecular processes and control distribution of Epo-induced JAK/STAT signaling can be further applied to gain insights into alterations promoting malignant hematopoietic diseases.
Cellular signal transduction is governed by multiple feedback mechanisms to elicit robust cellular decisions. The specific contributions of individual feedback regulators, however, remain unclear. Based on extensive time-resolved data sets in primary erythroid progenitor cells, we established a dynamic pathway model to dissect the roles of the two transcriptional negative feedback regulators of the suppressor of cytokine signaling (SOCS) family, CIS and SOCS3, in JAK2/STAT5 signaling. Facilitated by the model, we calculated the STAT5 response for experimentally unobservable Epo concentrations and provide a quantitative link between cell survival and the integrated response of STAT5 in the nucleus. Model predictions show that the two feedbacks CIS and SOCS3 are most effective at different ligand concentration ranges due to their distinct inhibitory mechanisms. This divided function of dual feedback regulation enables control of STAT5 responses for Epo concentrations that can vary 1000-fold in vivo. Our modeling approach reveals dose-dependent feedback control as key property to regulate STAT5-mediated survival decisions over a broad range of ligand concentrations.
doi:10.1038/msb.2011.50
PMCID: PMC3159971  PMID: 21772264
apoptosis; erythropoietin; mathematical modeling; negative feedback; SOCS
18.  Bayesian Orthogonal Least Squares (BOLS) algorithm for reverse engineering of gene regulatory networks 
BMC Bioinformatics  2007;8:251.
Background
Reverse engineering a gene regulatory network with a large number of genes and a limited number of experimental data points is a computationally challenging task. In particular, reverse engineering using linear systems is an underdetermined and ill-conditioned problem, i.e. the amount of microarray data is limited and the solution is very sensitive to noise in the data. Therefore, the reverse engineering of gene regulatory networks with a large number of genes and a limited number of data points requires a rigorous optimization algorithm.
Results
This study presents a novel algorithm for reverse engineering with linear systems. The proposed algorithm is a combination of the orthogonal least squares, second order derivative for network pruning, and Bayesian model comparison. In this study, the entire network is decomposed into a set of small networks that are defined as unit networks. The algorithm provides each unit network with P(D|Hi), which is used as confidence level. The unit network with higher P(D|Hi) has a higher confidence such that the unit network is correctly elucidated. Thus, the proposed algorithm is able to locate true positive interactions using P(D|Hi), which is a unique property of the proposed algorithm.
The algorithm is evaluated with synthetic and Saccharomyces cerevisiae expression data using the dynamic Bayesian network. With synthetic data, it is shown that the performance of the algorithm depends on the number of genes, noise level, and the number of data points. With yeast expression data, it is shown that there is a remarkable number of known physical or genetic events among all interactions elucidated by the proposed algorithm.
The performance of the algorithm is compared with the Sparse Bayesian Learning algorithm using both synthetic and Saccharomyces cerevisiae expression data sets. The comparison experiments show that the algorithm produces sparser solutions with fewer false positives than the Sparse Bayesian Learning algorithm.
Conclusion
From our evaluation experiments, we draw the following conclusions: 1) Simulation results show that the algorithm can be used to elucidate gene regulatory networks using a limited number of experimental data points. 2) Simulation results also show that the algorithm is able to handle noisy data. 3) The experiment with yeast expression data shows that the proposed algorithm reliably elucidates known physical or genetic events. 4) The comparison experiments show that the algorithm performs more efficiently than the Sparse Bayesian Learning algorithm with noisy and limited data.
doi:10.1186/1471-2105-8-251
PMCID: PMC1959566  PMID: 17626641
19.  Single-gene tuning of Caulobacter cell cycle period and noise, swarming motility, and surface adhesion 
We established that the sensor histidine kinase DivJ has an important role in the regulation of C. crescentus cell cycle period and noise. This was accomplished by designing and conducting single-cell experiments to probe the dependence of cell cycle noise on divJ expression and constructing a simplified cell cycle model that captures the dependence of cell cycle noise on DivJ with molecular details. In addition to its role in regulating the cell cycle, DivJ also affects polar cell development in C. crescentus, regulating swarming motility and surface adhesion. We propose that pleiotropic control of polar cell development by the DivJ–DivK–PleC signaling pathway underlies divJ-dependent tuning of cell swarming and adhesion behaviors. We have integrated the study of single-cell fluorescence dynamics with a kinetic model simulation to provide direct quantitative evidence that the DivJ histidine kinase is localized to the cell pole through a dynamic diffusion-and-capture mechanism during the C. crescentus cell cycle.
Temporally-coordinated localization of various structural and signaling proteins is critical for proper cell cycle regulation and polar cell development in the bacterium, Caulobacter crescentus. Included among these dynamically-localized regulatory proteins is the sensor histidine kinase, DivJ (Wheeler and Shapiro, 1999). Co-localized with DivJ in the early stalked phase is the phosphorylated response regulator DivK∼P (Jacobs et al, 2001), and the protease ClpXP (McGrath et al, 2006), which degrades the master cell cycle regulator, CtrA (Jenal and Fuchs, 1998). Recent single-cell measurements of surface attached C. crescentus cells have revealed an intriguing role for DivJ in the control of noise in cell division period (Siegal-Gaskins and Crosson, 2008). The noise of the cell cycle increases significantly upon disruption of the divJ gene, with a relatively small accompanying increase in the mean cell cycle time. The deterministic nature of the existing cell cycle models (Li et al, 2008, 2009; Shen et al, 2008) cannot explain the measured increase in cell cycle period and noise in a divJ null strain. Moreover, mechanistic descriptions of how DivJ and its signaling partners are localized and how these proteins underlie the control of polar cell development and cell adhesion in C. crescentus remain immature.
The single-cell experiments and analysis presented herein reveal that C. crescentus cell cycle period and noise can be tuned by DivJ (Figure 2). Specifically, in the case of low (or no) divJ expression the cell cycle is perturbed, and this is quantified by way of the (measured) noise in the cell cycle period. The level of noise is readily controlled through regulated expression of the divJ gene (Figure 2B). A simplified protein interaction network of stalked C. crescentus cell cycle regulation involving minimal components (CtrA, CtrA∼P, DivK, DivK∼P, and DivJ) was constructed to explore such tunability at the molecular level. The agreement of our model with our (and other) experiments suggests this simplified protein regulatory network is sufficient to explain the major features of the C. crescentus cell cycle. Indeed, stochastic simulations of this model using the Gillespie method (Gillespie, 1976) establish the importance of robust DivJ-mediated phosphorylation of its cognate receiver protein, DivK, in regulating the variance of cell cycle oscillations. Increased variability in the concentration of DivK∼P at the single cell level under divJ depletion subsequently leads to increased noise in the regulation of CtrA phosphorylation and degradation. Our experiments and simulations provide evidence that the steady state level of DivK∼P at the single-cell level (as maintained by DivJ) is essential in maintaining regular timing of the cell division period in C. crescentus.
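The Gillespie method mentioned above can be illustrated on a much smaller system than the network in the paper. The sketch below assumes a single reversible reaction, DivK <-> DivK∼P, with a phosphorylation rate constant standing in for DivJ activity. The function name, rate constants and molecule count are hypothetical, and this is not the authors' cell cycle model.

```python
import random


def gillespie_two_state(n_total, k_phos, k_dephos, t_end, seed=0):
    """Gillespie direct method for DivK <-> DivK~P (toy model).

    Returns a list of (time, number_of_phosphorylated_molecules) pairs.
    """
    rng = random.Random(seed)
    t, n_p = 0.0, 0
    trajectory = [(t, n_p)]
    while True:
        a_phos = k_phos * (n_total - n_p)  # propensity of phosphorylation
        a_dephos = k_dephos * n_p          # propensity of dephosphorylation
        a_total = a_phos + a_dephos
        if a_total == 0.0:
            break                          # no reaction can fire
        t += rng.expovariate(a_total)      # exponential waiting time
        if t > t_end:
            break
        # Choose which reaction fires, weighted by its propensity.
        if rng.random() * a_total < a_phos:
            n_p += 1
        else:
            n_p -= 1
        trajectory.append((t, n_p))
    return trajectory


# Hypothetical parameters, for illustration only.
traj = gillespie_two_state(n_total=100, k_phos=1.0, k_dephos=1.0, t_end=10.0)
```

Each pass through the loop draws one waiting time and one reaction, so repeated runs with different seeds give different trajectories. Their spread can be compared across values of `k_phos` as a crude analogue of varying divJ expression.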
In addition to its role in regulating cell cycle, divJ expression also affects polar cell development in C. crescentus. Specifically, the capacity of swarmer cells to adhere to a glass surface is suppressed at high levels of divJ expression. The effect of elevated divJ expression on the adhesive capacity of the cell is reflected in a reduced rate of two-dimensional biofilm formation. This effect is quantitatively captured by our mathematical model that relates single-cell surface adhesion physiology and biofilm formation dynamics. This result, and our observation that divJ expression tunes swarming motility in semi-solid growth medium, suggests a model in which increased DivJ concentration in the swarmer compartment (due to constitutive overexpression) ultimately results in improper development of polar organelles that are required for adhesion of swarming motility.
Despite the appreciated significance of protein localization for bacterial physiological functions, the molecular mechanism of how polar protein localization is achieved has only been tested in a few cases (Shapiro et al, 2002; Thanbichler and Shapiro, 2008). Mechanisms such as the polar insertion model and diffusion-and-capture have been proposed but the community's knowledge is limited to very few examples (Charles et al, 2001; Rudner et al, 2002). We provide direct evidence from experiments and simulations that the DivJ histidine kinase becomes localized to the cell pole through a dynamic diffusion-and-capture mechanism during the C. crescentus cell cycle (Figure 7). We show that a kinetic model based on a Langmuir adsorption/desorption relationship (Figure 7D) is sufficient to explain the time evolution of the single cell fluorescence time traces (Figure 7C and E) and allows establishing quantitative correspondences between the simulated dynamics and experimentally determined DivJ–EGFP dynamics. This localization mechanism is consistent with a diffusion-and-capture model. In short, the model posits that proteins are randomly distributed and are freely diffusing until they are captured at the site where they ultimately reside (Rudner et al, 2002; Shapiro et al, 2002; Bardy and Maddock, 2007). With a diffusion-and-capture pathway, it has been argued that proteins can be adsorbed either dynamically or statically (Shapiro et al, 2009). Our analysis of DivJ–EGFP in single cells supports a dynamic diffusion-and-capture mechanism for DivJ localization.
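For orientation, a generic Langmuir adsorption/desorption relationship can be written as follows. This is the textbook form, not necessarily the exact equations behind Figure 7D. The symbols are assumptions introduced here: θ is the fraction of polar binding sites occupied, k_on is a pseudo-first-order capture rate that absorbs the concentration of freely diffusing protein, and k_off is the release rate.

```latex
\frac{d\theta}{dt} = k_{\mathrm{on}}\,(1-\theta) - k_{\mathrm{off}}\,\theta,
\qquad \theta(0) = 0
\quad\Longrightarrow\quad
\theta(t) = \frac{k_{\mathrm{on}}}{k_{\mathrm{on}}+k_{\mathrm{off}}}
\left(1 - e^{-(k_{\mathrm{on}}+k_{\mathrm{off}})\,t}\right)
```

In this form a nonzero k_off corresponds to dynamic capture, where bound protein keeps exchanging with the diffusing pool, while k_off = 0 corresponds to static capture.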
Sensor histidine kinases underlie the regulation of a range of physiological processes in bacterial cells, from chemotaxis to cell division. In the gram-negative bacterium Caulobacter crescentus, the membrane-bound histidine kinase, DivJ, is a polar-localized regulator of cell cycle progression and development. We show that DivJ localizes to the cell pole through a dynamic diffusion and capture mechanism rather than by active localization. Analysis of single C. crescentus cells in microfluidic culture demonstrates that controlled expression of divJ permits facile tuning of both the mean and noise of the cell division period. Simulations of the cell cycle that use a simplified protein interaction network capture previously measured oscillatory protein profiles, and recapitulate the experimental observation that deletion of divJ increases the cell cycle period and noise. We further demonstrate that surface adhesion and swarming motility of C. crescentus in semi-solid media can also be tuned by divJ expression. We propose a model in which pleiotropic control of polar cell development by the DivJ–DivK–PleC signaling pathway underlies divJ-dependent tuning of cell swarming and adhesion behaviors.
doi:10.1038/msb.2010.95
PMCID: PMC3018171  PMID: 21179017
cell cycle; histidine kinase; protein interaction network; protein localization; single cell
20.  An algebra-based method for inferring gene regulatory networks 
BMC Systems Biology  2014;8:37.
Background
The inference of gene regulatory networks (GRNs) from experimental observations is at the heart of systems biology. This includes the inference of both the network topology and its dynamics. While there are many algorithms available to infer the network topology from experimental data, less emphasis has been placed on methods that infer network dynamics. Furthermore, since the network inference problem is typically underdetermined, it is essential to have the option of incorporating into the inference process, prior knowledge about the network, along with an effective description of the search space of dynamic models. Finally, it is also important to have an understanding of how a given inference method is affected by experimental and other noise in the data used.
Results
This paper contains a novel inference algorithm using the algebraic framework of Boolean polynomial dynamical systems (BPDS), meeting all these requirements. The algorithm takes as input time series data, including those from network perturbations, such as knock-out mutant strains and RNAi experiments. It allows for the incorporation of prior biological knowledge while being robust to significant levels of noise in the data used for inference. It uses an evolutionary algorithm for local optimization with an encoding of the mathematical models as BPDS. The BPDS framework allows an effective representation of the search space for algebraic dynamic models that improves computational performance. The algorithm is validated with both simulated and experimental microarray expression profile data. Robustness to noise is tested using a published mathematical model of the segment polarity gene network in Drosophila melanogaster. Benchmarking of the algorithm is done by comparison with a spectrum of state-of-the-art network inference methods on data from the synthetic IRMA network to demonstrate that our method has good precision and recall for the network reconstruction task, while also predicting several of the dynamic patterns present in the network.
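The idea behind a Boolean polynomial dynamical system is that each Boolean update rule can be rewritten as a polynomial over the two-element field F2 = {0, 1}. The sketch below shows that encoding on a hypothetical three-gene network. The rules and function names are invented for illustration and are unrelated to the networks analyzed in the paper or to its evolutionary search.

```python
# Boolean operations expressed as polynomials over F2 (arithmetic mod 2).
def f_not(x):
    return (1 + x) % 2          # NOT x  ->  1 + x


def f_and(x, y):
    return (x * y) % 2          # x AND y  ->  x*y


def f_or(x, y):
    return (x + y + x * y) % 2  # x OR y  ->  x + y + x*y


def step(state):
    """One synchronous update of a hypothetical 3-gene network."""
    x1, x2, x3 = state
    return (
        f_not(x3),      # x1' = 1 + x3
        f_and(x1, x3),  # x2' = x1*x3
        f_or(x1, x2),   # x3' = x1 + x2 + x1*x2
    )


def trajectory(state, n_steps):
    """Return the list of states visited, including the initial one."""
    states = [state]
    for _ in range(n_steps):
        state = step(state)
        states.append(state)
    return states


path = trajectory((0, 0, 0), 5)
```

With these particular rules, the state (0, 0, 0) returns to itself after five updates, so the toy network has a limit cycle of length five.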
Conclusions
Boolean polynomial dynamical systems provide a powerful modeling framework for the reverse engineering of gene regulatory networks that enables a rich mathematical structure on the model search space. A C++ implementation of the method, distributed under the LGPL license, is available, together with the source code, at http://www.paola-vera-licona.net/Software/EARevEng/REACT.html.
doi:10.1186/1752-0509-8-37
PMCID: PMC4022379  PMID: 24669835
Reverse-engineering; network inference; Boolean networks; molecular networks; gene regulatory networks; polynomial dynamical systems; algebraic dynamic models; evolutionary computation; DNA microarray data; time series data; data noise
21.  Reverse engineering a hierarchical regulatory network downstream of oncogenic KRAS 
Systematic RNA interference perturbations within ovarian cancer cells reveal a hierarchically organized transcription factor network downstream of the oncogenic RAS pathway. Modules within the network are shown to control distinct aspects of cell growth and migration.
Cellular transformation by KRAS oncogenes results in the upregulation of a multitude of transcription factors and a general deregulation of the transcriptome. To exploit the network organization of selected transcriptional regulators responding to chronic RAS pathway activation, we used an integrated strategy combining experimental perturbation of transcription factor and signalling kinase expression with a reverse-engineering approach based on modular response analysis (MRA). The network shows strong modularity, high connectivity and hierarchical organization. The network hierarchy is reflected in distinct phenotypic consequences of perturbation within modules that separately control cellular proliferation and anchorage independence.
RAS mutations are highly relevant for progression and therapy response of human tumours, but the genetic network that ultimately executes the oncogenic effects is poorly understood. Here, we used a reverse-engineering approach in an ovarian cancer model to reconstruct KRAS oncogene-dependent cytoplasmic and transcriptional networks from perturbation experiments based on gene silencing and pathway inhibitor treatments. We measured mRNA and protein levels in manipulated cells by microarray, RT–PCR and western blot analysis, respectively. The reconstructed model revealed complex interactions among the transcriptional and cytoplasmic components, some of which were confirmed by double perturbation experiments. Interestingly, the transcription factors decomposed into two hierarchically arranged groups. To validate the model predictions, we analysed growth parameters and transcriptional deregulation in the KRAS-transformed epithelial cells. As predicted by the model, we found two functional groups among the selected transcription factors. The experiments thus confirmed the predicted hierarchical transcription factor regulation and showed that the hierarchy manifests itself in downstream gene expression patterns and phenotype.
doi:10.1038/msb.2012.32
PMCID: PMC3421447  PMID: 22864383
cancer systems biology; modular response analysis; oncogenes; ovarian carcinoma model; signal transduction
22.  Systematic evaluation of objective functions for predicting intracellular fluxes in Escherichia coli 
The in vivo distribution of metabolic fluxes in Escherichia coli can be predicted from optimality principles. At least two different sets of optimality principles govern the operation of the metabolic network under different environmental conditions. Metabolism during unlimited growth on glucose in batch culture is best described by the nonlinear maximization of ATP yield per unit of flux.
Based on a long history of biochemical and lately genomic research, metabolic networks, in particular microbial ones, are among the best characterized cellular networks. Most components (genes, proteins and metabolites) and their interactions are known. This topological knowledge of the reaction stoichiometry allows the construction of metabolic models up to the genome scale (Price et al, 2004). Experimentally, sophisticated 13C-tracer-based methodologies were developed that enable tracking of the intracellular flux traffic through the reaction network (Sauer, 2006). With the accumulation of such experimental flux data, the question arises why a particular distribution of flux within the network is realized and not one of many alternatives.
Here, we address the question whether the intracellular flux state can be predicted from optimality principles, with the underlying rationale that evolution might have optimized metabolic operation toward particular objectives or combinations of multiple objectives. For this purpose, we performed a systematic and rigorous comparison between computational flux predictions and available experimental flux data (Emmerling et al, 2002; Perrenoud and Sauer, 2005; Nanchen et al, 2006) under six different environmental conditions for the model bacterium E. coli. For computational flux predictions, we used a constraint-based modeling approach that requires a stoichiometric model of metabolism (Stelling, 2004). More specifically, we employed flux balance analysis (FBA) where objective functions are defined that represent optimality principles of network operation (Price et al, 2004). This approach has been applied successfully to predict gene deletion lethality (Edwards and Palsson, 2000a, b; Forster et al, 2003; Kuepfer et al, 2005), network capacities and feasible network states (Edwards 2001, Ibarra 2002), but in only few cases to predict the intracellular flux state (Beard et al, 2002; Holzhütter, 2004).
While different objective functions were proposed for different biological systems (Holzhütter, 2004; Price et al, 2004; Knorr et al, 2006), by far the most common assumption is that microbial cells maximize their growth. To address this issue more generally, we evaluated the accuracy of FBA-based flux predictions for 11 linear and nonlinear objective functions that were combined with eight adjustable constraints. For this purpose, we constructed a highly interconnected stoichiometric network model with 98 reactions and 60 metabolites of E. coli central carbon metabolism. Based on mathematical analyses, the overall model could be reduced to a set of 10 reactions that summarize the actual systemic degree of freedom.
As a quantitative measure of how accurately the experimental data are predicted, we defined predictive fidelity as a single value to quantify the overall deviation between in silico and in vivo fluxes. By comparing all in silico predictions to 13C-based in vivo fluxes, we show that prediction of intracellular steady-state fluxes from network stoichiometry alone is, within limits, possible. An unexpected key result is that no further assumptions on network operation in the form of additional and potentially artificial constraints are necessary, provided the appropriate objective function is chosen for a given condition.
While no single objective was able to describe the flux states under all six conditions, we identified two sets of objectives for biologically meaningful predictions without the need for further constraints. For unlimited growth on glucose in aerobic or nitrate-respiring batch cultures, we find that the most accurate and robust results are obtained with the nonlinear maximization of ATP yield per flux unit (Figure 1). Under nutrient scarcity in glucose- or ammonium-limited continuous cultures, in contrast, linear maximization of the overall ATP or biomass yields achieved the highest predictive accuracy.
Since these identified optimality principles describe the system behavior without preconditioning of the network through further constraints, they reflect, to some extent, the evolutionary selection of metabolic network regulation that realizes the various flux states. For conditions of nutrient scarcity, the maximization of energy or biomass yield objective is consistent with the generally observed physiology (Russell and Cook, 1995). The meaning of the maximization of ATP yield per flux unit objective for unlimited growth, however, is less obvious. Generally, it selects for small networks with yet high, albeit suboptimal ATP formation, which has three biological consequences. Firstly, resources are economically allocated since expenditures for enzyme synthesis are, on average, greater for longer pathways. Secondly, suboptimal ATP yields dissipate more energy and thus enable higher catabolic rates. Thirdly, at a constant catabolic rate, a small network results in shorter residence times of substrate molecules until they generate ATP. The relative contribution of these consequences to the evolution of network regulation is unclear, but simultaneous optimization for ATP yield and catabolic rate under this optimality principle identifies a trade-off between the contradicting objectives of maximum overall ATP yield and maximum rate of ATP formation (Pfeiffer et al, 2001).
To what extent can optimality principles describe the operation of metabolic networks? By explicitly considering experimental errors and in silico alternate optima in flux balance analysis, we systematically evaluate the capacity of 11 objective functions combined with eight adjustable constraints to predict 13C-determined in vivo fluxes in Escherichia coli under six environmental conditions. While no single objective describes the flux states under all conditions, we identified two sets of objectives for biologically meaningful predictions without the need for further, potentially artificial constraints. Unlimited growth on glucose in oxygen- or nitrate-respiring batch cultures is best described by nonlinear maximization of the ATP yield per flux unit. Under nutrient scarcity in continuous cultures, in contrast, linear maximization of the overall ATP or biomass yields achieved the highest predictive accuracy. Since these particular objectives predict the system behavior without preconditioning of the network structure, the identified optimality principles reflect, to some extent, the evolutionary selection of metabolic network regulation that realizes the various flux states.
doi:10.1038/msb4100162
PMCID: PMC1949037  PMID: 17625511
13C-flux; evolution; flux balance analysis; metabolic network; network optimality
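The contrast between the two objective families described in the abstract above can be sketched on a toy example. This assumes that the linear objective is ATP produced per unit of glucose taken up, and that "ATP yield per flux unit" is the ATP-producing flux divided by the sum of squared fluxes over all reactions. Both flux distributions, their numbers, and the function names are hypothetical illustrations, not values or definitions from the study, and no linear program is solved here.

```python
def atp_yield(fluxes):
    """Linear objective: ATP-producing flux per unit of glucose uptake."""
    return fluxes["atp"] / fluxes["glucose_uptake"]


def atp_per_flux_unit(fluxes):
    """Nonlinear objective (assumed form): ATP flux over the sum of squared fluxes."""
    return fluxes["atp"] / sum(v * v for v in fluxes.values())


# Hypothetical candidate flux states, normalized to one unit of glucose uptake.
respiratory = {"glucose_uptake": 1.0, "glycolysis": 2.0, "tca": 2.0,
               "respiration": 4.0, "atp": 10.0}
overflow = {"glucose_uptake": 1.0, "glycolysis": 2.0,
            "acetate_secretion": 2.0, "atp": 4.0}

candidates = {"respiratory": respiratory, "overflow": overflow}

# Pick the best candidate under each objective.
best_linear = max(candidates, key=lambda name: atp_yield(candidates[name]))
best_nonlinear = max(candidates, key=lambda name: atp_per_flux_unit(candidates[name]))
```

In this toy setting the linear objective favors the high-yield respiratory state, while the nonlinear objective favors the smaller overflow network with a lower ATP yield. This mirrors the qualitative point that the per-flux-unit objective selects small networks with suboptimal ATP formation.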
23.  Computational Modeling and Analysis of Insulin Induced Eukaryotic Translation Initiation 
PLoS Computational Biology  2011;7(11):e1002263.
Insulin, the primary hormone regulating the level of glucose in the bloodstream, modulates a variety of cellular and enzymatic processes in normal and diseased cells. Insulin signals are processed by a complex network of biochemical interactions which ultimately induce gene expression programs or other processes such as translation initiation. Surprisingly, despite the wealth of literature on insulin signaling, the relative importance of the components linking insulin with translation initiation remains unclear. We addressed this question by developing and interrogating a family of mathematical models of insulin induced translation initiation. The insulin network was modeled using mass-action kinetics within an ordinary differential equation (ODE) framework. A family of model parameters was estimated, starting from an initial best fit parameter set, using 24 experimental data sets taken from literature. The residuals between model simulations and each of the experimental constraints were simultaneously minimized using multiobjective optimization. Interrogation of the model population, using sensitivity and robustness analysis, identified an insulin-dependent switch that controlled translation initiation. Our analysis suggested that without insulin, a balance between the pro-initiation activity of the GTP-binding protein Rheb and anti-initiation activity of PTEN controlled basal initiation. On the other hand, in the presence of insulin a combination of PI3K and Rheb activity controlled inducible initiation, where PI3K was only critical in the presence of insulin. Other well known regulatory mechanisms governing insulin action, for example IRS-1 negative feedback, modulated the relative importance of PI3K and Rheb but did not fundamentally change the signal flow.
Author Summary
Insulin is a hormone produced by the body that regulates uptake of glucose from the bloodstream. The cellular response to insulin is governed by a complex network of intracellular interactions that ultimately influence cell growth and metabolism. Because of its central role in physiology, insulin signaling has been extensively studied. Yet despite this wealth of research, the relative importance of components in insulin signaling remains unclear. Mechanistic computer simulations have been shown to provide insight into the function of complex systems, such as insulin signaling. In this work we constructed and interrogated a mathematical computer simulation of insulin signaling to better understand the important components of the insulin signaling network. We determined the most important network components and identified network perturbations that can induce dramatic shifts in cellular phenotype. Our results offer an in-depth analysis of the insulin signaling pathway and provide a unique paradigm towards understanding how malfunctions in insulin signaling can result in numerous disease states.
doi:10.1371/journal.pcbi.1002263
PMCID: PMC3213178  PMID: 22102801
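The phrase "mass-action kinetics within an ODE framework" in the abstract above can be made concrete with a minimal sketch. It assumes a single reversible binding reaction L + R <-> C with rate kf·[L]·[R] - kr·[C], integrated with a fixed-step forward Euler scheme. The species, rate constants, and the function name `simulate_binding` are hypothetical and far simpler than the insulin network described in the study.

```python
def simulate_binding(l0, r0, c0, kf, kr, dt=1e-3, t_end=20.0):
    """Integrate L + R <-> C under mass-action kinetics with forward Euler."""
    ligand, receptor, complex_ = l0, r0, c0
    for _ in range(int(round(t_end / dt))):
        # Net forward rate: association minus dissociation.
        rate = kf * ligand * receptor - kr * complex_
        ligand -= rate * dt
        receptor -= rate * dt
        complex_ += rate * dt
    return ligand, receptor, complex_


# Hypothetical initial concentrations and rate constants (arbitrary units).
ligand_end, receptor_end, complex_end = simulate_binding(
    l0=1.0, r0=1.0, c0=0.0, kf=1.0, kr=1.0)

# Analytical equilibrium for these values: (1 - C)^2 = C, so C = (3 - sqrt(5)) / 2.
expected_complex = (3.0 - 5.0 ** 0.5) / 2.0
```

After integration, `complex_end` settles at `expected_complex`, and the total ligand `ligand_end + complex_end` stays at its initial value. A realistic signaling model would use many coupled reactions and an adaptive ODE solver; fixed-step Euler is chosen here only for brevity.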
24.  Brain Rhythms Reveal a Hierarchical Network Organization 
PLoS Computational Biology  2011;7(10):e1002207.
Recordings of ongoing neural activity with EEG and MEG exhibit oscillations of specific frequencies over a non-oscillatory background. The oscillations appear in the power spectrum as a collection of frequency bands that are evenly spaced on a logarithmic scale, thereby preventing mutual entrainment and cross-talk. Over the last few years, experimental, computational and theoretical studies have made substantial progress on our understanding of the biophysical mechanisms underlying the generation of network oscillations and their interactions, with emphasis on the role of neuronal synchronization. In this paper we ask a very different question. Rather than investigating how brain rhythms emerge, or whether they are necessary for neural function, we focus on what they tell us about functional brain connectivity. We hypothesized that if we were able to construct abstract networks, or “virtual brains”, whose dynamics were similar to EEG/MEG recordings, those networks would share structural features among themselves, and also with real brains. Applying mathematical techniques for inverse problems, we have reverse-engineered network architectures that generate characteristic dynamics of actual brains, including spindles and sharp waves, which appear in the power spectrum as frequency bands superimposed on a non-oscillatory background dominated by low frequencies. We show that all reconstructed networks display similar topological features (e.g. structural motifs) and dynamics. We have also reverse-engineered putative diseased brains (epileptic and schizophrenic), in which the oscillatory activity is altered in different ways, as reported in clinical studies. These reconstructed networks show consistent alterations of functional connectivity and dynamics. 
In particular, we show that the complexity of the network, quantified as proposed by Tononi, Sporns and Edelman, is a good indicator of brain fitness, since virtual brains modeling diseased states display lower complexity than virtual brains modeling normal neural function. We finally discuss the implications of our results for the neurobiology of health and disease.
Author Summary
The fact that the brain generates weak but measurable electromagnetic waves has intrigued neuroscientists for over a century. Even more remarkable is the fact that these oscillations of brain activity correlate with the state of awareness and that their frequency and amplitude display reproducible features that are different in health and disease. In healthy conditions, oscillations of different frequency bands tend to minimize interference. This is similar to radio stations avoiding overlap between frequency bands to ensure clear transmission. During epileptic seizures, mutual entrainment, analogous to interference of radio signals, has also been reported. Moreover, in schizophrenia and autism, there is a loss of higher frequency oscillations. Most studies thus far have focused on the mechanisms generating the oscillations, as well as on their functional relevance. In contrast, our study focuses on what brain rhythms tell us about the functional network organization of the brain. Using a reverse-engineering approach, we construct abstract networks (virtual brains) that display oscillations of actual brains. These networks reveal that brain rhythms reflect different levels of hierarchical organization in health and disease. We predict specific alterations in brain connectivity in the aforementioned diseases and explain how clinicians and experimentalists can test our predictions.
doi:10.1371/journal.pcbi.1002207
PMCID: PMC3192826  PMID: 22022251
25.  Reverse Engineering Validation using a Benchmark Synthetic Gene Circuit in Human Cells 
ACS synthetic biology  2013;2(5):255-262.
Multi-component biological networks are often understood incompletely, in large part due to the lack of reliable and robust methodologies for network reverse engineering and characterization. As a consequence, developing automated and rigorously validated methodologies for unraveling the complexity of biomolecular networks in human cells remains a central challenge to life scientists and engineers. Today, when it comes to experimental and analytical requirements, there exists a great deal of diversity in reverse engineering methods, which renders the independent validation and comparison of their predictive capabilities difficult. In this work we introduce an experimental platform customized for the development and verification of reverse engineering and pathway characterization algorithms in mammalian cells. Specifically, we stably integrate a synthetic gene network in human kidney cells and use it as a benchmark for validating reverse engineering methodologies. The network, which is orthogonal to endogenous cellular signaling, contains a small set of regulatory interactions that can be used to quantify the reconstruction performance. By performing successive perturbations to each modular component of the network and comparing protein and RNA measurements, we study the conditions under which we can reliably reconstruct the causal relationships of the integrated synthetic network.
doi:10.1021/sb300093y
PMCID: PMC3716858  PMID: 23654266
Reverse Engineering; Benchmark Synthetic Circuits; Human Cells; Modular Response Analysis
