Related Articles

1.  Rapidly exploring structural and dynamic properties of signaling networks using PathwayOracle 
BMC Systems Biology  2008;2:76.
In systems biology the experimentalist is presented with a selection of software for analyzing dynamic properties of signaling networks. These tools either assume that the network is in steady-state or require highly parameterized models of the network of interest. For biologists interested in assessing how signal propagates through a network under specific conditions, the first class of methods does not provide sufficiently detailed results and the second class requires models which may not be easily and accurately constructed. A tool that is able to characterize the dynamics of a signaling network using an unparameterized model of the network would allow biologists to quickly obtain insights into a signaling network's behavior.
We introduce PathwayOracle, an integrated suite of software tools for computationally inferring and analyzing structural and dynamic properties of a signaling network. The feature which differentiates PathwayOracle from other tools is a method that can predict the response of a signaling network to various experimental conditions and stimuli using only the connectivity of the signaling network. Thus signaling models are relatively easy to build. The method allows for tracking signal flow in a network and comparison of signal flows under different experimental conditions. In addition, PathwayOracle includes tools for the enumeration and visualization of coherent and incoherent signaling paths between proteins, and for experimental analysis – loading and superimposing experimental data, such as microarray intensities, on the network model.
PathwayOracle provides an integrated environment in which both structural and dynamic analysis of a signaling network can be quickly conducted and visualized alongside experimental results. Because the analyses rely only on signaling network connectivity, predictions can be made quickly from network models that are relatively easy to construct. The application has been developed in Python and is designed to be easily extensible by groups interested in adding new or extending existing features. PathwayOracle is freely available for download and use.
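The path enumeration mentioned above can be illustrated with a minimal sketch. It is not PathwayOracle's code or API: the network, the node names, and the functions `signed_paths` and `is_coherent` are hypothetical. It also assumes that a set of paths is "coherent" when all paths between two proteins carry the same net sign, a definition the abstract does not spell out.

```python
from typing import Dict, List, Tuple

# Hypothetical toy network: each edge carries a sign,
# +1 for activation and -1 for inhibition.
Edges = Dict[str, List[Tuple[str, int]]]


def signed_paths(edges: Edges, source: str, target: str):
    """Return every simple path from source to target with its net sign."""
    results = []

    def walk(node, path, sign):
        if node == target:
            results.append((path, sign))
            return
        for nxt, edge_sign in edges.get(node, []):
            if nxt not in path:  # keep paths simple (no repeated nodes)
                walk(nxt, path + [nxt], sign * edge_sign)

    walk(source, [source], 1)
    return results


def is_coherent(paths) -> bool:
    """Assumed definition: all enumerated paths share the same net sign."""
    return len({sign for _, sign in paths}) <= 1


toy = {
    "A": [("B", 1), ("C", 1)],
    "B": [("D", 1)],
    "C": [("D", -1)],
}
paths = signed_paths(toy, "A", "D")
for path, sign in paths:
    print(" -> ".join(path), "activating" if sign > 0 else "inhibiting")
print("coherent:", is_coherent(paths))
```

In this toy network, A reaches D through one activating and one inhibiting route, so the pair is reported as incoherent under the assumed definition.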
PMCID: PMC2527501  PMID: 18713463
2.  Modularization of biochemical networks based on classification of Petri net t-invariants 
BMC Bioinformatics  2008;9:90.
Structural analysis of biochemical networks is a growing field in bioinformatics and systems biology. The availability of an increasing amount of biological data from molecular biological networks promises a deeper understanding but confronts researchers with the problem of combinatorial explosion. The amount of qualitative network data is growing much faster than the amount of quantitative data, such as enzyme kinetics. In many cases it is even impossible to measure quantitative data because of limitations of experimental methods, or for ethical reasons. Thus, a huge amount of qualitative data, such as interaction data, is available, but it was not sufficiently used for modeling purposes, until now. New approaches have been developed, but the complexity of data often limits the application of many of the methods. Biochemical Petri nets make it possible to explore static and dynamic qualitative system properties. One Petri net approach is model validation based on the computation of the system's invariant properties, focusing on t-invariants. T-invariants correspond to subnetworks, which describe the basic system behavior.
With increasing system complexity, the basic behavior can only be expressed by a huge number of t-invariants. According to our validation criteria for biochemical Petri nets, the necessary verification of the biological meaning, by interpreting each subnetwork (t-invariant) manually, is not possible anymore. Thus, an automated, biologically meaningful classification would be helpful in analyzing t-invariants, and supporting the understanding of the basic behavior of the considered biological system.
Here, we introduce a new approach to automatically classify t-invariants to cope with network complexity. We apply clustering techniques such as UPGMA, Complete Linkage, Single Linkage, and Neighbor Joining in combination with different distance measures to get biologically meaningful clusters (t-clusters), which can be interpreted as modules. To find the optimal number of t-clusters to consider for interpretation, the cluster validity measure, Silhouette Width, is applied.
We considered two different case studies as examples: a small signal transduction pathway (pheromone response pathway in Saccharomyces cerevisiae) and a medium-sized gene regulatory network (gene regulation of Duchenne muscular dystrophy). We automatically classified the t-invariants into functionally distinct t-clusters, which could be interpreted biologically as functional modules in the network. We found differences in the suitability of the various distance measures as well as the clustering methods. In terms of a biologically meaningful classification of t-invariants, the best results are obtained using the Tanimoto distance measure. Considering clustering methods, the obtained results suggest that UPGMA and Complete Linkage are suitable for clustering t-invariants with respect to the biological interpretability.
We propose a new approach for the biological classification of Petri net t-invariants based on cluster analysis. Due to the biologically meaningful data reduction and structuring of network processes, large sets of t-invariants can be evaluated, allowing for model validation of qualitative biochemical Petri nets. This approach can also be applied to elementary mode analysis.
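A minimal sketch of the clustering idea, under stated assumptions: each t-invariant is reduced to its support (the set of transitions it uses), the Tanimoto distance is computed on those sets, and UPGMA is implemented as plain average linkage. The transition names and the fixed distance cut-off are hypothetical; the abstract selects the number of clusters with the Silhouette Width rather than a threshold.

```python
from itertools import combinations


def tanimoto_distance(a: frozenset, b: frozenset) -> float:
    """1 - |a & b| / |a | b| on the supports of two t-invariants."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def upgma(items, threshold):
    """Average-linkage agglomerative clustering.

    Merging stops once the two closest clusters are farther apart
    than `threshold` (a simplification of choosing the cut by
    Silhouette Width). Returns clusters as lists of item indices.
    """
    clusters = [[i] for i in range(len(items))]

    def avg_dist(c1, c2):
        total = sum(tanimoto_distance(items[i], items[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (x, y), d = min(
            (((p, q), avg_dist(clusters[p], clusters[q]))
             for p, q in combinations(range(len(clusters)), 2)),
            key=lambda pair_dist: pair_dist[1],
        )
        if d > threshold:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return clusters


# Hypothetical supports of four t-invariants.
t_invariants = [
    frozenset({"t1", "t2", "t3"}),
    frozenset({"t1", "t2", "t4"}),
    frozenset({"t5", "t6"}),
    frozenset({"t5", "t6", "t7"}),
]
t_clusters = upgma(t_invariants, threshold=0.6)
print(sorted(sorted(c) for c in t_clusters))
```

Invariants that share most of their transitions end up in the same t-cluster, which is the property that makes the clusters readable as candidate modules.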
PMCID: PMC2277402  PMID: 18257938
3.  Structural and functional protein network analyses predict novel signaling functions for rhodopsin 
Proteomic analyses, literature mining, and structural data were combined to generate an extensive signaling network linked to the visual G protein-coupled receptor rhodopsin. Network analysis suggests novel signaling routes to cytoskeleton dynamics and vesicular trafficking.
Using a shotgun proteomic approach, we identified the protein inventory of the light sensing outer segment of the mammalian photoreceptor. These data, combined with literature mining, structural modeling, and computational analysis, offer a comprehensive view of signal transduction downstream of the visual G protein-coupled receptor rhodopsin. The network suggests novel signaling branches downstream of rhodopsin to cytoskeleton dynamics and vesicular trafficking. The network serves as a basis for elucidating physiological principles of photoreceptor function and suggests potential disease-associated proteins.
Photoreceptor cells are neurons capable of converting light into electrical signals. The rod outer segment (ROS) region of the photoreceptor cells is a cellular structure made of a stack of around 800 closed membrane disks loaded with rhodopsin (Liang et al, 2003; Nickell et al, 2007). In disc membranes, rhodopsin arranges itself into paracrystalline dimer arrays, enabling optimal association with the heterotrimeric G protein transducin as well as additional regulatory components (Ciarkowski et al, 2005). Disruption of these highly regulated structures and processes by germline mutations is the cause of severe blinding diseases such as retinitis pigmentosa, macular degeneration, or congenital stationary night blindness (Berger et al, 2010).
Traditionally, signal transduction networks have been studied by combining biochemical and genetic experiments addressing the relations among a small number of components. More recently, large throughput experiments using different techniques like two hybrid or co-immunoprecipitation coupled to mass spectrometry have added a new level of complexity (Ito et al, 2001; Gavin et al, 2002, 2006; Ho et al, 2002; Rual et al, 2005; Stelzl et al, 2005). However, in these studies, space, time, and the fact that many interactions detected for a particular protein are not compatible, are not taken into consideration. Structural information can help discriminate between direct and indirect interactions and more importantly it can determine if two or more predicted partners of any given protein or complex can simultaneously bind a target or rather compete for the same interaction surface (Kim et al, 2006).
In this work, we build a functional and dynamic interaction network centered on rhodopsin on a systems level, using six steps: In step 1, we experimentally identified the proteomic inventory of the porcine ROS, and we compared our data set with a recent proteomic study from bovine ROS (Kwok et al, 2008). The union of the two data sets was defined as the ‘initial experimental ROS proteome'. After removal of contaminants and applying filtering methods, a ‘core ROS proteome', consisting of 355 proteins, was defined.
In step 2, proteins of the core ROS proteome were assigned to six functional modules: (1) vision, signaling, transporters, and channels; (2) outer segment structure and morphogenesis; (3) housekeeping; (4) cytoskeleton and polarity; (5) vesicles formation and trafficking, and (6) metabolism.
In step 3, a protein-protein interaction network was constructed based on literature mining. Since the experimental evidence for most of the interactions was co-immunoprecipitation or pull-down experiments, and since many of the edges in the network are supported by a single piece of experimental evidence, often derived from high-throughput approaches, we refer to this network as the ‘fuzzy ROS interactome'. Structural information was used to predict binary interactions, based on the finding that similar domain pairs are likely to interact in a similar way (‘nature repeats itself') (Aloy and Russell, 2002). To increase the confidence in the resulting network, edges supported by a single piece of evidence not coming from yeast two-hybrid experiments were removed, the exception being interactions where the evidence was the existence of a three-dimensional structure of the complex itself, or of a highly homologous complex. This curated static network (‘high-confidence ROS interactome') comprises 660 edges linking the majority of the nodes. By considering only edges supported by at least one piece of evidence of direct binary interaction, we end up with a ‘high-confidence binary ROS interactome'. We next extended the published core pathway (Dell'Orco et al, 2009) using evidence from our high-confidence network. We find several new direct binary links to different cellular functional processes (Figure 4): the active rhodopsin interacts with Rac1 and the GTP form of Rho. There is also a connection between active rhodopsin and Arf4, as well as PDEδ with Rab13 and the GTP-bound form of Arl3 that links the vision cycle to vesicle trafficking and structure. We see a connection between PDEδ with prenyl-modified proteins, such as several small GTPases, as well as with rhodopsin kinase. Further, our network reveals several direct binary connections between Ca2+-regulated proteins and cytoskeleton proteins; these are CaMK2A with actinin, calmodulin with GAP43 and S1008, and PKC with 14-3-3 family members.
In step 4, part of the network was experimentally validated using three different approaches to identify physical protein associations that would occur under physiological conditions: (i) co-segregation/co-sedimentation experiments, (ii) immunoprecipitations combined with mass spectrometry and/or subsequent immunoblotting, and (iii) utilizing the glycosylated N-terminus of rhodopsin to isolate its associated protein partners by Concanavalin A affinity purification. In total, 60 co-purification and co-elution experiments supported interactions that were already in our literature network, and new evidence from 175 co-IP experiments in this work was added. Next, we aimed to provide additional independent experimental confirmation for two of the novel networks and functional links proposed based on the network analysis: (i) the proposed complex between Rac1, RhoA, CRMP-2, tubulin, and ROCK II in ROS was investigated by culturing retinal explants in the presence of a ROCK II-specific inhibitor (Figure 6). While morphology of the retinas treated with ROCK II inhibitor appeared normal, immunohistochemistry analyses revealed several alterations on the protein level. (ii) We supported the hypothesis that PDEδ could function as a GDI for Rac1 in ROS, by demonstrating that PDEδ and Rac1 colocalize in ROS and that PDEδ could dissociate Rac1 from ROS membranes in vitro.
In step 5, we use structural information to distinguish between mutually compatible (‘AND') or excluded (‘XOR') interactions. This enables breaking a network of nodes and edges into functional machines or sub-networks/modules. In the vision branch, both ‘AND' and ‘XOR' gates synergize. This may allow dynamic tuning of light and dark states. However, all connections from the vision module to other modules are ‘XOR' connections suggesting that competition, in connection with local protein concentration changes, could be important for transmitting signals from the core vision module.
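The ‘AND'/‘XOR' distinction can be sketched in a deliberately simplified form. The assumption here is that two partners of the same protein are mutually exclusive when their binding interfaces share residues, and compatible otherwise; the study itself works from three-dimensional structural data, which this toy does not model. The partner names, residue numbers, and the function `classify_partners` are hypothetical.

```python
from itertools import combinations


def classify_partners(interfaces):
    """Label each pair of partners of one hub protein.

    `interfaces` maps a partner name to the set of hub residues it
    contacts. Overlapping interfaces give 'XOR' (the partners compete
    for the same surface); disjoint interfaces give 'AND' (they can
    bind simultaneously).
    """
    labels = {}
    for (p, res_p), (q, res_q) in combinations(sorted(interfaces.items()), 2):
        labels[(p, q)] = "XOR" if res_p & res_q else "AND"
    return labels


# Hypothetical contact residues on a single hub protein.
hub_interfaces = {
    "partner_A": {10, 11, 12, 40},
    "partner_B": {40, 41, 42},
    "partner_C": {90, 91},
}
gates = classify_partners(hub_interfaces)
for pair, gate in gates.items():
    print(pair, gate)
```

Grouping edges by these labels is one way to see how a static interactome could be split into sub-networks whose members can or cannot coexist on the same protein.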
In the last step, we map and functionally characterize the known mutations that produce blindness.
In summary, this represents the first comprehensive, dynamic, and integrative rhodopsin signaling network, which can be the basis for integrating and mapping newly discovered disease mutants, to guide protein or signaling branch-specific therapies.
Orchestration of signaling, photoreceptor structural integrity, and maintenance needed for mammalian vision remain enigmatic. By integrating three proteomic data sets, literature mining, computational analyses, and structural information, we have generated a multiscale signal transduction network linked to the visual G protein-coupled receptor (GPCR) rhodopsin, the major protein component of rod outer segments. This network was complemented by domain decomposition of protein–protein interactions and then qualified for mutually exclusive or mutually compatible interactions and ternary complex formation using structural data. The resulting information not only offers a comprehensive view of signal transduction induced by this GPCR but also suggests novel signaling routes to cytoskeleton dynamics and vesicular trafficking, predicting an important level of regulation through small GTPases. Further, it demonstrates a specific disease susceptibility of the core visual pathway due to the uniqueness of its components present mainly in the eye. As a comprehensive multiscale network, it can serve as a basis to elucidate the physiological principles of photoreceptor function, identify potential disease-associated genes and proteins, and guide the development of therapies that target specific branches of the signaling pathway.
PMCID: PMC3261702  PMID: 22108793
protein interaction network; rhodopsin signaling; structural modeling
4.  Modeling Integrated Cellular Machinery Using Hybrid Petri-Boolean Networks 
PLoS Computational Biology  2013;9(11):e1003306.
The behavior and phenotypic changes of cells are governed by a cellular circuitry that represents a set of biochemical reactions. Based on biological functions, this circuitry is divided into three types of networks, each encoding for a major biological process: signal transduction, transcription regulation, and metabolism. This division has generally enabled taming computational complexity dealing with the entire system, allowed for using modeling techniques that are specific to each of the components, and achieved separation of the different time scales at which reactions in each of the three networks occur. Nonetheless, with this division comes loss of information and power needed to elucidate certain cellular phenomena. Within the cell, these three types of networks work in tandem, and each produces signals and/or substances that are used by the others to process information and operate normally. Therefore, computational techniques for modeling integrated cellular machinery are needed. In this work, we propose an integrated hybrid model (IHM) that combines Petri nets and Boolean networks to model integrated cellular networks. Coupled with a stochastic simulation mechanism, the model simulates the dynamics of the integrated network, and can be perturbed to generate testable hypotheses. Our model is qualitative and is mostly built upon knowledge from the literature and requires fine-tuning of very few parameters. We validated our model on two systems: the transcriptional regulation of glucose metabolism in human cells, and cellular osmoregulation in S. cerevisiae. The model produced results that are in very good agreement with experimental data, and produced valid hypotheses. The abstract nature of our model and the ease of its construction make it a very good candidate for modeling integrated networks from qualitative data. The results it produces can guide the practitioner to zoom into components and interconnections and investigate them using more detailed mathematical models.
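To make the idea of coupling the two formalisms concrete, here is a toy sketch that is not the IHM itself. A Boolean node (a gene) reads a Petri net place, and a Petri net transition is enabled only while that gene is on. The place names, the gating rule, and the function `step` are hypothetical, and the stochastic firing described above is replaced by a deterministic update so the run is reproducible.

```python
def step(marking, gene_on):
    """One synchronous update of a tiny Petri-Boolean hybrid.

    Boolean layer: the gene is on when the 'signal' place holds a token.
    Petri layer: a gene-gated transition moves one token from
    'substrate' to 'product' when it is enabled.
    """
    gene_on = marking["signal"] > 0
    if gene_on and marking["substrate"] > 0:
        marking = dict(
            marking,
            substrate=marking["substrate"] - 1,
            product=marking["product"] + 1,
        )
    return marking, gene_on


def simulate(marking, steps):
    gene_on = False
    for _ in range(steps):
        marking, gene_on = step(marking, gene_on)
    return marking, gene_on


with_signal, gene_a = simulate({"signal": 1, "substrate": 2, "product": 0}, steps=3)
without_signal, gene_b = simulate({"signal": 0, "substrate": 2, "product": 0}, steps=3)
print(with_signal, gene_a)
print(without_signal, gene_b)
```

Removing the signal token plays the role of a perturbation: the gene stays off and no product accumulates, which is the kind of qualitative prediction such a model is meant to produce.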
Author Summary
Within the cell of an organism, three networks—signaling, transcriptional, and metabolic—are always at work to determine the response of the cell to signals from its environment, and consequently, its fate. Evidence from experimental studies is painting a picture of complex crosstalk among these networks. Thus, while a wide array of computational techniques exist for analyzing each of these network types, there is clear need for new modeling techniques that allow for simultaneously analyzing integrated networks, which combine elements from all three networks. Here, we provide a step towards achieving this task by combining two population modeling techniques—Petri nets and Boolean networks—to produce an integrated hybrid model. We demonstrate the accuracy and utility of this model on two biological systems: transcriptional regulation of glucose metabolism in human cells, and cellular osmoregulation in yeast.
PMCID: PMC3820535  PMID: 24244124
5.  Dynamic interaction networks in a hierarchically organized tissue 
We have integrated gene expression profiling with database and literature mining, mechanistic modeling, and cell culture experiments to identify intercellular and intracellular networks regulating blood stem cell self-renewal. Blood stem cell fate in vitro is regulated non-autonomously by a coupled positive–negative intercellular feedback circuit, composed of megakaryocyte-derived stimulatory growth factors (VEGF, PDGF, EGF, and serotonin) versus monocyte-derived inhibitory factors (CCL3, CCL4, CXCL10, TGFB2, and TNFSF9). The antagonistic signals converge in a core intracellular network focused around PI3K, Raf, PLC, and Akt. Model simulations enable functional classification of the novel endogenous ligands and signaling molecules.
Intercellular (between cell) communication networks are required to maintain homeostasis and coordinate regenerative and developmental cues in multicellular organisms. Despite the recognized importance of intercellular networks in regulating adult stem and progenitor cell fate, the specific cell populations involved, and the underlying molecular mechanisms are largely undefined. Although a limited number of studies have applied novel bioinformatic approaches to unravel intercellular signaling in other cell systems (Frankenstein et al, 2006), a comprehensive analysis of intercellular communication in a stem cell-derived, hierarchical tissue network has yet to be reported.
As a model system to explore intercellular communication networks in a hierarchically organized tissue, we cultured human umbilical cord blood (UCB)-derived stem and progenitor cells in defined, minimal cytokine-supplemented liquid culture (Madlambayan et al, 2006). To systematically explore the molecular and cellular dynamics underlying primitive progenitor growth and differentiation, gene expression profiles of primitive (lineage negative; Lin−) and mature (lineage positive; Lin+) populations were generated during phases of stem cell expansion versus depletion. Parallel phenotypic and subproteomic experiments validated that mRNA expression correlated with complex measures of proteome activity (protein secretion and cell surface expression). Using a curated list of secreted ligand–receptor interactions and published expression profiles of purified mature blood populations, we implemented a novel algorithm to reconstruct the intercellular signaling networks established between stem cells and multi-lineage progeny in vitro. By correlating differential expression patterns with stem cell growth, we predict cell populations, pathways, and secreted ligands associated with stem cell self-renewal and differentiation (Figure 3A).
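The network reconstruction step can be illustrated with a minimal sketch. It is not the authors' algorithm: it only assumes that an intercellular edge is drawn whenever one population expresses a ligand and another expresses the matching receptor. The receptor names, the expression sets, and the function `intercellular_edges` are hypothetical placeholders.

```python
def intercellular_edges(ligand_receptor_pairs, expressed):
    """Connect sender and receiver populations through ligand-receptor pairs.

    `expressed` maps a cell population to the set of genes it expresses.
    Returns (sender, ligand, receptor, receiver) tuples.
    """
    edges = []
    for ligand, receptor in ligand_receptor_pairs:
        for sender, sender_genes in expressed.items():
            if ligand not in sender_genes:
                continue
            for receiver, receiver_genes in expressed.items():
                if receptor in receiver_genes:
                    edges.append((sender, ligand, receptor, receiver))
    return edges


# Hypothetical curated pairs and expression calls.
pairs = [("VEGF", "VEGF_receptor"), ("CCL3", "CCL3_receptor")]
expressed = {
    "megakaryocyte": {"VEGF"},
    "monocyte": {"CCL3"},
    "progenitor": {"VEGF_receptor", "CCL3_receptor"},
}
network = intercellular_edges(pairs, expressed)
for edge in network:
    print(edge)
```

Correlating such edges with stem cell growth, as the paragraph above describes, would require expression profiles across culture phases, which this sketch leaves out.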
We then tested the correlative predictions in a series of cell culture experiments. UCB progenitor cell cultures were supplemented with saturating amounts of 18 putative regulatory ligands, or cocultured with purified mature blood lineages (megakaryocytes, monocytes, and erythrocytes), and analyzed for effects on total cell, progenitor, and primitive progenitor growth. At the primitive progenitor level, 3/5 novel predicted stimulatory ligands (EGF, PDGFB, and VEGF) displayed significant positive effects, 5/7 predicted inhibitory factors (CCL3, CCL4, CXCL10, TNFSF9, and TGFB2) displayed negative effects, whereas only 1/5 non-correlated ligands (CXCL7) displayed an effect. Also consistent with predictions from gene expression data, megakaryocytes and monocytes were found to stimulate and inhibit primitive progenitor growth, respectively, and these effects were attributable to differential secretome profiles of stimulatory versus inhibitory ligands.
Cellular responses to external stimuli, particularly in heterogeneous and dynamic cell populations, represent complex functions of multiple cell fate decisions acting both directly and indirectly on the target (stem cell) populations. Experimentally distinguishing the mode of action of cytokines is thus a difficult task. To address this we used our previously published interactive model of hematopoiesis (Kirouac et al, 2009) to classify experimentally identified regulatory ligands into one of four distinct functional categories based on their differential effects on cell population growth. TGFB2 was classified as a proliferation inhibitor, CCL4, CXCL10, SPARC, and TNFSF9 as self-renewal inhibitors, CCL3 a proliferation stimulator, and EGF, VEGF, and PDGFB as self-renewal stimulators.
Stem and progenitor cells exposed to combinatorial extracellular signals must propagate this information through intracellular molecular networks, and respond appropriately by modifying cell fate decisions. To explore how our experimentally identified positive and negative regulatory signals are integrated at the intracellular level, we constructed a blood stem cell self-renewal signaling network through extensive literature curation and protein–protein interaction (PPI) network mapping. We find that signal transduction pathways activated by the various stimulatory and inhibitory ligands converge on a limited set of molecular control nodes, forming a core subnetwork enriched for known regulators of self-renewal (Figure 6A). To experimentally test the intracellular signaling molecules computationally predicted as regulators of stem cell self-renewal, we obtained five small molecule antagonists against the kinases Phosphatidylinositol 3-kinase (PI3K), Raf, Akt, Phospholipase C (PLC), and MEK1. Liquid cultures were supplemented with the five molecules individually, and resultant cell population outputs compared against model simulations to deconvolute the functional effects on proliferation (and survival) versus self-renewal. This analysis classifies inhibition of PI3K and Raf activity as selectively targeting self-renewal, PLC as selectively targeting survival, and Akt as selectively targeting proliferation; MEK inhibition appears non-specific for these processes.
This represents the first systematic characterization of how cell fate decisions are regulated non-autonomously through lineage-specific interactions with differentiated progeny. The complex intercellular communication networks can be approximated as an antagonistic positive–negative feedback circuit, wherein progenitor expansion is modulated by a balance of megakaryocyte-derived stimulatory factors (EGF, PDGF, VEGF, and possibly serotonin) versus monocyte-derived inhibitory factors (CCL3, CCL4, CXCL10, TGFB2, and TNFSF9). This complex milieu of endogenous regulatory signals is integrated and processed within a core intracellular signaling network, resulting in modulation of cell-level kinetic parameters (proliferation, survival, and self-renewal). We reconstruct a stem cell associated intracellular network, and identify PI3K, Raf, Akt, and PLC as functionally distinct signal integration nodes, linking extracellular and intracellular signaling. These findings lay the groundwork for novel strategies to control blood stem cell self-renewal in vitro and in vivo.
Intercellular (between cell) communication networks maintain homeostasis and coordinate regenerative and developmental cues in multicellular organisms. Despite the importance of intercellular networks in stem cell biology, their rules, structure and molecular components are poorly understood. Herein, we describe the structure and dynamics of intercellular and intracellular networks in a stem cell derived, hierarchically organized tissue using experimental and theoretical analyses of cultured human umbilical cord blood progenitors. By integrating high-throughput molecular profiling, database and literature mining, mechanistic modeling, and cell culture experiments, we show that secreted factor-mediated intercellular communication networks regulate blood stem cell fate decisions. In particular, self-renewal is modulated by a coupled positive–negative intercellular feedback circuit composed of megakaryocyte-derived stimulatory growth factors (VEGF, PDGF, EGF, and serotonin) versus monocyte-derived inhibitory factors (CCL3, CCL4, CXCL10, TGFB2, and TNFSF9). We reconstruct a stem cell intracellular network, and identify PI3K, Raf, Akt, and PLC as functionally distinct signal integration nodes, linking extracellular and intracellular signaling. This represents the first systematic characterization of how stem cell fate decisions are regulated non-autonomously through lineage-specific interactions with differentiated progeny.
PMCID: PMC2990637  PMID: 20924352
cellular networks; hematopoiesis; intercellular signaling; self-renewal; stem cells
6.  Computational modeling with forward and reverse engineering links signaling network and genomic regulatory responses: NF-κB signaling-induced gene expression responses in inflammation 
BMC Bioinformatics  2010;11:308.
Signal transduction is the major mechanism through which cells transmit external stimuli to evoke intracellular biochemical responses. Diverse cellular stimuli create a wide variety of transcription factor activities through signal transduction pathways, resulting in different gene expression patterns. Understanding the relationship between external stimuli and the corresponding cellular responses, as well as the subsequent effects on downstream genes, is a major challenge in systems biology. Thus, a systematic approach is needed to integrate experimental data and theoretical hypotheses to identify the physiological consequences of environmental stimuli.
We proposed a systematic approach that combines forward and reverse engineering to link the signal transduction cascade with the gene responses. To demonstrate the feasibility of our strategy, we focused on linking the NF-κB signaling pathway with the inflammatory gene regulatory responses because NF-κB has long been recognized to play a crucial role in inflammation. We first utilized forward engineering (Hybrid Functional Petri Nets) to construct the NF-κB signaling pathway and reverse engineering (Network Components Analysis) to build a gene regulatory network (GRN). Then, we demonstrated that the corresponding IKK profiles can be identified in the GRN and are consistent with the experimental validation of the IKK kinase assay. We found that the time-lapse gene expression of several cytokines and chemokines (TNF-α, IL-1, IL-6, CXCL1, CXCL2 and CCL3) is concordant with the NF-κB activity profile, and these genes have stronger influence strength within the GRN. Such regulatory effects have highlighted the crucial roles of NF-κB signaling in the acute inflammatory response and enhance our understanding of the systemic inflammatory response syndrome.
We successfully identified and distinguished the corresponding signaling profiles among three microarray datasets with different stimuli strengths. In our model, the crucial genes of the NF-κB regulatory network were also identified to reflect the biological consequences of inflammation. With the experimental validation, our strategy is thus an effective solution to decipher cross-talk effects when attempting to integrate new kinetic parameters from other signal transduction pathways. The strategy also provides new insight for systems biology modeling to link any signal transduction pathways with the responses of downstream genes of interest.
PMCID: PMC2889938  PMID: 20529327
7.  Plato's Cave Algorithm: Inferring Functional Signaling Networks from Early Gene Expression Shadows 
PLoS Computational Biology  2010;6(6):e1000828.
Improving the ability to reverse engineer biochemical networks is a major goal of systems biology. Lesions in signaling networks lead to alterations in gene expression, which in principle should allow network reconstruction. However, the information about the activity levels of signaling proteins conveyed in overall gene expression is limited by the complexity of gene expression dynamics and of regulatory network topology. Two observations provide the basis for overcoming this limitation: (a) genes induced without de-novo protein synthesis (early genes) show a linear accumulation of product in the first hour after the change in the cell's state; (b) the signaling components in the network largely function in the linear range of their stimulus-response curves. Therefore, unlike most genes or most time points, expression profiles of early genes at an early time point provide direct biochemical assays that represent the activity levels of upstream signaling components. Such expression data provide the basis for an efficient algorithm (Plato's Cave algorithm; PLACA) to reverse engineer functional signaling networks. Unlike conventional reverse engineering algorithms that use steady state values, PLACA uses stimulated early gene expression measurements associated with systematic perturbations of signaling components, without measuring the signaling components themselves. Besides the reverse engineered network, PLACA also identifies the genes detecting the functional interaction, thereby facilitating validation of the predicted functional network. Using simulated datasets, the algorithm is shown to be robust to experimental noise. Using experimental data obtained from gonadotropes, PLACA reverse engineered the interaction network of six perturbed signaling components. The network recapitulated many known interactions and identified novel functional interactions that were validated by further experiment. PLACA uses the results of experiments that are feasible for any signaling network to predict the functional topology of the network and to identify novel relationships.
Author Summary
Elucidating the biochemical interactions in living cells is essential to understanding their behavior under various external conditions. Some of these interactions occur between signaling components with many active states, and their activity levels may be difficult to measure directly. However, most methods to reverse engineer interaction networks rely on measuring gene activity at steady state under various cellular stimuli. Such gene measurements therefore ignore the intermediate effects of signaling components, and cannot reliably convey the interactions between the signaling components themselves. We propose using the changes in activity of early genes shortly after the stimulus to infer the functional interactions between the unmeasured signaling components. The change in expression in such genes at these times is directly and linearly affected by the signaling components, since there is insufficient time for other genes to be transcribed and interfere with the early genes' expression. We present an algorithm that uses such measurements to reverse engineer the functional interaction network between signaling components, and also provides a means for testing these predictions. The algorithm therefore uses feasible experiments to reconstruct functional networks. We applied the algorithm to experimental measurements and uncovered known interactions, as well as novel interactions that were then confirmed experimentally.
PMCID: PMC2891706  PMID: 20585619
8.  A methodology for the structural and functional analysis of signaling and regulatory networks 
BMC Bioinformatics  2006;7:56.
Structural analysis of cellular interaction networks contributes to a deeper understanding of network-wide interdependencies, causal relationships, and basic functional capabilities. While the structural analysis of metabolic networks is a well-established field, similar methodologies have been scarcely developed and applied to signaling and regulatory networks.
We propose formalisms and methods, relying on adapted and partially newly introduced approaches, which facilitate a structural analysis of signaling and regulatory networks with focus on functional aspects. We use two different formalisms to represent and analyze interaction networks: interaction graphs and (logical) interaction hypergraphs. We show that, in interaction graphs, the determination of feedback cycles and of all the signaling paths between any pair of species is equivalent to the computation of elementary modes known from metabolic networks. Knowledge on the set of signaling paths and feedback loops facilitates the computation of intervention strategies and the classification of compounds into activators, inhibitors, ambivalent factors, and non-affecting factors with respect to a certain species. In some cases, qualitative effects induced by perturbations can be unambiguously predicted from the network scheme. Interaction graphs however, are not able to capture AND relationships which do frequently occur in interaction networks. The consequent logical concatenation of all the arcs pointing into a species leads to Boolean networks. For a Boolean representation of cellular interaction networks we propose a formalism based on logical (or signed) interaction hypergraphs, which facilitates in particular a logical steady state analysis (LSSA). LSSA enables studies on the logical processing of signals and the identification of optimal intervention points (targets) in cellular networks. LSSA also reveals network regions whose parametrization and initial states are crucial for the dynamic behavior.
We have implemented these methods in our software tool CellNetAnalyzer (successor of FluxAnalyzer) and illustrate their applicability using a logical model of T-Cell receptor signaling providing non-intuitive results regarding feedback loops, essential elements, and (logical) signal processing upon different stimuli.
The methods and formalisms we propose herein are another step towards the comprehensive functional analysis of cellular interaction networks. Their potential, shown on a realistic T-cell signaling model, makes them a promising tool.
PMCID: PMC1458363  PMID: 16464248
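The interaction-graph analysis described in entry 8 can be illustrated with a small sketch. The code below is not taken from CellNetAnalyzer; it is a minimal Python illustration, assuming a signed directed graph given as (source, target, sign) triples, of how signaling paths between two species might be enumerated and how a species might then be labeled as activator, inhibitor, ambivalent, or non-affecting with respect to a target. The function names and the toy network are hypothetical, and the classification here looks at paths only, whereas the abstract notes that feedback loops also play a role.

```python
from collections import defaultdict


def signed_paths(edges, source, target):
    """Enumerate simple paths from source to target as (path, sign) pairs.

    edges: iterable of (u, v, sign), sign = +1 (activation) or -1 (inhibition).
    The sign of a path is the product of the signs of its edges.
    """
    graph = defaultdict(list)
    for u, v, sign in edges:
        graph[u].append((v, sign))

    results = []

    def walk(node, path, sign):
        if node == target and len(path) > 1:
            results.append((list(path), sign))
            return
        for nxt, edge_sign in graph[node]:
            if nxt not in path:  # keep paths simple (no repeated nodes)
                path.append(nxt)
                walk(nxt, path, sign * edge_sign)
                path.pop()

    walk(source, [source], 1)
    return results


def classify(edges, source, target):
    """Label source by the signs of its paths to target (feedback loops ignored)."""
    signs = {sign for _, sign in signed_paths(edges, source, target)}
    if not signs:
        return "non-affecting"
    if signs == {1}:
        return "activator"
    if signs == {-1}:
        return "inhibitor"
    return "ambivalent"


# Hypothetical toy network, not drawn from any of the listed articles.
toy_edges = [
    ("A", "B", 1),
    ("B", "C", 1),
    ("A", "D", 1),
    ("D", "C", -1),
    ("E", "A", -1),
]

print(classify(toy_edges, "A", "C"))  # two paths with opposite signs
print(classify(toy_edges, "B", "C"))
print(classify(toy_edges, "C", "A"))
```

Exhaustive path enumeration is exponential in the worst case, so a sketch like this is only practical for small graphs.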
9.  Structural Analysis to Determine the Core of Hypoxia Response Network 
PLoS ONE  2010;5(1):e8600.
The advent of sophisticated molecular biology techniques allows the structure of complex biological networks to be deduced. However, networks tend to be huge and impose computational challenges on traditional mathematical analysis due to their high dimension and lack of reliable kinetic data. To overcome this problem, complex biological networks are decomposed into modules that are assumed to capture essential aspects of the full network's dynamics. The question that begs for an answer is how to identify the core that is representative of a network's dynamics, its function and robustness. One of the powerful methods to probe into the structure of a network is Petri net analysis. Petri nets support network visualization and execution. They are also equipped with sound mathematical and formal reasoning based on which a network can be decomposed into modules. The structural analysis provides insight into the robustness and facilitates the identification of fragile nodes. The application of these techniques to a previously proposed hypoxia control network reveals three functional modules responsible for degrading the hypoxia-inducible factor (HIF). Interestingly, the structural analysis identifies superfluous network parts and suggests that the reversibility of the reactions is not important for the essential functionality. The core network is determined to be the union of the three reduced individual modules. The structural analysis results are confirmed by numerical integration of the differential equations induced by the individual modules as well as their composition. The structural analysis also leads to a coarse network structure highlighting the structural principles inherent in the three functional modules. Importantly, our analysis identifies the fragile node in this robust network without which the switch-like behavior is shown to be completely absent.
PMCID: PMC2808224  PMID: 20098728
10.  Elementary signaling modes predict the essentiality of signal transduction network components 
BMC Systems Biology  2011;5:44.
Understanding how signals propagate through signaling pathways and networks is a central goal in systems biology. Quantitative dynamic models help to achieve this understanding, but are difficult to construct and validate because of the scarcity of known mechanistic details and kinetic parameters. Structural and qualitative analysis is emerging as a feasible and useful alternative for interpreting signal transduction.
In this work, we present an integrative computational method for evaluating the essentiality of components in signaling networks. This approach expands an existing signaling network to a richer representation that incorporates the positive or negative nature of interactions and the synergistic behaviors among multiple components. Our method simulates both knockout and constitutive activation of components as node disruptions, and takes into account the possible cascading effects of a node's disruption. We introduce the concept of elementary signaling mode (ESM), as the minimal set of nodes that can perform signal transduction independently. Our method ranks the importance of signaling components by the effects of their perturbation on the ESMs of the network. Validation on several signaling networks describing the immune response of mammals to bacteria, guard cell abscisic acid signaling in plants, and T cell receptor signaling shows that this method can effectively uncover the essentiality of components mediating a signal transduction process and results in strong agreement with the results of Boolean (logical) dynamic models and experimental observations.
This integrative method is an efficient procedure for exploratory analysis of large signaling and regulatory networks where dynamic modeling or experimental tests are impractical. Its results serve as testable predictions, provide insights into signal transduction and regulatory mechanisms and can guide targeted computational or experimental follow-up studies. The source codes for the algorithms developed in this study can be found at
PMCID: PMC3070649  PMID: 21426566
11.  Steady-State Kinetic Modeling Constrains Cellular Resting States and Dynamic Behavior 
PLoS Computational Biology  2009;5(3):e1000298.
A defining characteristic of living cells is the ability to respond dynamically to external stimuli while maintaining homeostasis under resting conditions. Capturing both of these features in a single kinetic model is difficult because the model must be able to reproduce both behaviors using the same set of molecular components. Here, we show how combining small, well-defined steady-state networks provides an efficient means of constructing large-scale kinetic models that exhibit realistic resting and dynamic behaviors. By requiring each kinetic module to be homeostatic (at steady state under resting conditions), the method proceeds by (i) computing steady-state solutions to a system of ordinary differential equations for each module, (ii) applying principal component analysis to each set of solutions to capture the steady-state solution space of each module network, and (iii) combining optimal search directions from all modules to form a global steady-state space that is searched for accurate simulation of the time-dependent behavior of the whole system upon perturbation. Importantly, this stepwise approach retains the nonlinear rate expressions that govern each reaction in the system and enforces constraints on the range of allowable concentration states for the full-scale model. These constraints not only reduce the computational cost of fitting experimental time-series data but can also provide insight into limitations on system concentrations and architecture. To demonstrate application of the method, we show how small kinetic perturbations in a modular model of platelet P2Y1 signaling can cause widespread compensatory effects on cellular resting states.
Author Summary
Cells respond to extracellular signals through a complex coordination of interacting molecular components. Computational models can serve as powerful tools for prediction and analysis of signaling systems, but constructing large models typically requires extensive experimental datasets and computation. To facilitate the construction of complex signaling models, we present a strategy in which the models are built in a stepwise fashion, beginning with small “resting” networks that are combined to form larger models with complex time-dependent behaviors. Interestingly, we found that only a minor fraction of potential model configurations were compatible with resting behavior in an example signaling system. These reduced sets of configurations were used to limit the search for more complicated solutions that also captured the dynamic behavior of the system. Using an example model constructed by this approach, we show how a cell's resting behavior adjusts to changes in the kinetic rate processes of the system. This strategy offers a general and biologically intuitive framework for building large-scale kinetic models of steady-state cellular systems and their dynamics.
PMCID: PMC2637974  PMID: 19266013
12.  Division of labor by dual feedback regulators controls JAK2/STAT5 signaling over broad ligand range 
Quantitative analysis of time-resolved data in primary erythroid progenitor cells reveals that a dual negative transcriptional feedback mechanism underlies the ability of STAT5 to respond to the broad spectrum of physiologically relevant Epo concentrations.
A mathematical dual feedback model of the Epo-induced JAK2/STAT5 signaling pathway was calibrated with extensive time-resolved quantitative data sets from immunoblotting, mass spectrometry and qRT–PCR experiments in primary erythroid progenitor cells. We show that the amount of nuclear phosphorylated STAT5 integrated for 60 min post Epo stimulation directly correlates with the fraction of surviving cells 24 h later. CIS and SOCS3 were identified as the most relevant transcriptional feedback regulators of JAK2/STAT5 signaling in primary erythroid progenitor cells. Applying the model, we revealed that CIS-mediated inhibitory effects are most important at low ligand concentrations, whereas SOCS3 inhibition is more effective at high ligand doses. The distinct modes of inhibition of CIS and SOCS3 at various Epo concentrations provide a strategy for achieving control of JAK2/STAT5 signaling over the entire range of physiological Epo concentrations.
Cells interpret information encoded by extracellular stimuli through the activation of intracellular signaling networks and translate this information into cellular decisions. A prime example for a system that is exposed to extremely variable ligand concentrations is the erythroid lineage. The key regulator Erythropoietin (Epo) facilitates continuous renewal of erythrocytes at low basal levels but also secures compensation in case of, e.g., blood loss through an up to 1000-fold increase in hormone concentration. The Epo receptor (EpoR) is expressed on erythroid progenitor cells at the colony forming unit erythroid (CFU-E) stage. Stimulation of these cells with Epo leads to rapid but transient activation of receptor and JAK2 phosphorylation followed by phosphorylation of the latent transcription factor STAT5. Although STAT5 is known to be an essential regulator of survival and differentiation of erythroid progenitor cells, a quantitative link between the dynamic properties of STAT5 signaling and survival decisions remained unknown. STAT5-mediated responses in CFU-E cells are modulated by multiple attenuation mechanisms that operate on different time scales. Fast-acting mechanisms such as depletion of Epo by rapid receptor turnover and recruitment of the phosphatase SHP-1 control the initial signal amplitude at the receptor level. Transcriptional feedback regulators such as suppressor of cytokine signaling (SOCS) family members CIS and SOCS3 operate at a slower time scale. Despite the ample knowledge of the individual components involved, only little is known about the specific contributions of these regulators in controlling dynamic properties of STAT5 in response to a broad range of input signals. Therefore, dynamic pathway modeling is required to understand the complex regulatory network of feedback regulators.
To address these questions, we established a dual negative feedback model of JAK2/STAT5 signaling in primary erythroid progenitor cells isolated from mouse fetal livers. We provide a large data set of JAK2/STAT5 signaling dynamics employing quantitative immunoblotting, mass spectrometry and quantitative RT–PCR measured under different perturbation conditions to calibrate our model (Figure 3). The structure of our model was constructed to comprise the minimal number of parameters necessary to explain the data. Thereby, we aimed at a model with fully identifiable parameters that are essential to obtain high predictive power. Parameter identifiability was analyzed by the profile likelihood approach. Applying this method, we could establish a dual negative feedback model of JAK2-STAT5 signaling with structurally and in most cases practically identifiable parameters.
A major bottleneck in combining signal transduction events with cellular phenotypes is the discrepancy in the time scale and stimuli concentrations that are applied in the different experiments. The sensitivity of biochemical assays to determine phosphorylation events within minutes or hours after stimulation is usually lower than the threshold of sensitivity in assays to determine the physiological response after one or more days. Facilitated by the model, we were able to compute the integrated response of JAK2/STAT5 signaling components for experimentally unaddressable Epo concentrations. Our results demonstrate that the integrated response of pSTAT5 in the nucleus accurately correlates with the experimentally determined survival of CFU-E cells. This provides a quantitative link of the dependency of primary CFU-E cells on pSTAT5 activation dynamics. By correlation analysis, we could identify the early signaling phase (⩽1 h) of STAT5 to be the most predictive for the fraction of surviving cells, which was determined ∼24 h later. Thus, we hypothesize that as a general principle in apoptotic decisions, ligand concentrations translated into kinetic-encoded information of early signaling events downstream of receptors can be predictive for survival decisions 24 h later.
After the first hour of stimulation, it is important to constrain signaling to a residual steady-state level. Constitutive phosphorylation of the JAK2/STAT5 pathway has a crucial role in the onset of polycythemia vera (PV), a disease associated with Epo-independent erythroid differentiation. The two identified transcriptional feedback proteins, CIS and SOCS3, are responsible for adjusting the phosphorylation level of STAT5 after 1 h of stimulation. Since the Epo input signal can vary over a broad range of ligand concentrations, we asked how CIS and SOCS3 can facilitate control of STAT5 long-term phosphorylation levels over the entire range of physiologically relevant hormone concentrations. By using model simulations, we revealed that the two feedbacks are most effective at different Epo concentration ranges. Our mathematical model predicts that the major role of CIS in modulating STAT5 phosphorylation levels is at low, basal Epo concentrations, whereas SOCS3 is essential to control the STAT5 phosphorylation levels at high Epo doses (Figure 6). As a potential molecular mechanism of this dose-dependent inhibitory effect, we could identify the quantity of pJAK2 relative to pEpoR that increases with higher Epo concentrations. Since SOCS3 can inhibit JAK2 directly via its KIR domain to attenuate downstream STAT5 activation, SOCS3 becomes more effective with the relative increase of JAK2 activation. Hence, CIS and SOCS3 act in a concerted manner to ensure tight regulation of STAT5 responses over the broad physiological range of Epo concentrations.
In summary, our mathematical approach provided new insights into the specific function of feedback regulation in STAT5-mediated life or death decisions of primary erythroid cells. We dissected the roles of the transcriptionally induced proteins CIS and SOCS3 that operate as dual feedback with divided function thereby facilitating the control of STAT5 activation levels over the entire range of physiological Epo concentrations. The detailed understanding of the molecular processes and control distribution of Epo-induced JAK/STAT signaling can be further applied to gain insights into alterations promoting malignant hematopoietic diseases.
Cellular signal transduction is governed by multiple feedback mechanisms to elicit robust cellular decisions. The specific contributions of individual feedback regulators, however, remain unclear. Based on extensive time-resolved data sets in primary erythroid progenitor cells, we established a dynamic pathway model to dissect the roles of the two transcriptional negative feedback regulators of the suppressor of cytokine signaling (SOCS) family, CIS and SOCS3, in JAK2/STAT5 signaling. Facilitated by the model, we calculated the STAT5 response for experimentally unobservable Epo concentrations and provide a quantitative link between cell survival and the integrated response of STAT5 in the nucleus. Model predictions show that the two feedbacks CIS and SOCS3 are most effective at different ligand concentration ranges due to their distinct inhibitory mechanisms. This divided function of dual feedback regulation enables control of STAT5 responses for Epo concentrations that can vary 1000-fold in vivo. Our modeling approach reveals dose-dependent feedback control as a key property to regulate STAT5-mediated survival decisions over a broad range of ligand concentrations.
PMCID: PMC3159971  PMID: 21772264
apoptosis; erythropoietin; mathematical modeling; negative feedback; SOCS
13.  A Dynamic Analysis of IRS-PKR Signaling in Liver Cells: A Discrete Modeling Approach 
PLoS ONE  2009;4(12):e8040.
A major challenge in systems biology is to develop a detailed dynamic understanding of the functions and behaviors in a particular cellular system, which depends on the elements and their inter-relationships in a specific network. Computational modeling plays an integral part in the study of network dynamics and uncovering the underlying mechanisms. Here we propose a systematic approach that incorporates discrete dynamic modeling and experimental data to reconstruct a phenotype-specific network of cell signaling. A dynamic analysis of the insulin signaling system in liver cells provides a proof-of-concept application of the proposed methodology. Our group recently identified that double-stranded RNA-dependent protein kinase (PKR) plays an important role in the insulin signaling network. The dynamic behavior of the insulin signaling network is tuned by a variety of feedback pathways, many of which have the potential to cross talk with PKR. Given the complexity of insulin signaling, it is inefficient to experimentally test all possible interactions in the network to determine which pathways are functioning in our cell system. Our discrete dynamic model provides an in silico model framework that integrates potential interactions and assesses the contributions of the various interactions on the dynamic behavior of the signaling network. Simulations with the model generated testable hypotheses on the response of the network upon perturbation, which were experimentally evaluated to identify the pathways that function in our particular liver cell system. The modeling in combination with the experimental results enhanced our understanding of the insulin signaling dynamics and aided in generating a context-specific signaling network.
PMCID: PMC2779448  PMID: 19956598
14.  Detecting and Removing Inconsistencies between Experimental Data and Signaling Network Topologies Using Integer Linear Programming on Interaction Graphs 
PLoS Computational Biology  2013;9(9):e1003204.
Cross-referencing experimental data with our current knowledge of signaling network topologies is one central goal of mathematical modeling of cellular signal transduction networks. We present a new methodology for data-driven interrogation and training of signaling networks. While most published methods for signaling network inference operate on Bayesian, Boolean, or ODE models, our approach uses integer linear programming (ILP) on interaction graphs to encode constraints on the qualitative behavior of the nodes. These constraints are posed by the network topology and their formulation as ILP allows us to predict the possible qualitative changes (up, down, no effect) of the activation levels of the nodes for a given stimulus. We provide four basic operations to detect and remove inconsistencies between measurements and predicted behavior: (i) find a topology-consistent explanation for responses of signaling nodes measured in a stimulus-response experiment (if none exists, find the closest explanation); (ii) determine a minimal set of nodes that need to be corrected to make an inconsistent scenario consistent; (iii) determine the optimal subgraph of the given network topology which can best reflect measurements from a set of experimental scenarios; (iv) find possibly missing edges that would improve the consistency of the graph with respect to a set of experimental scenarios the most. We demonstrate the applicability of the proposed approach by interrogating a manually curated interaction graph model of EGFR/ErbB signaling against a library of high-throughput phosphoproteomic data measured in primary hepatocytes. Our methods detect interactions that are likely to be inactive in hepatocytes and provide suggestions for new interactions that, if included, would significantly improve the goodness of fit. Our framework is highly flexible and the underlying model requires only easily accessible biological knowledge. 
All related algorithms were implemented in a freely available toolbox SigNetTrainer making it an appealing approach for various applications.
Author Summary
Cellular signal transduction is orchestrated by communication networks of signaling proteins commonly depicted on signaling pathway maps. However, each cell type may have distinct variants of signaling pathways, and wiring diagrams are often altered in disease states. The identification of truly active signaling topologies based on experimental data is therefore one key challenge in systems biology of cellular signaling. We present a new framework for training signaling networks based on interaction graphs (IG). In contrast to complex modeling formalisms, IG capture merely the known positive and negative edges between the components. This basic information, however, already sets hard constraints on the possible qualitative behaviors of the nodes when perturbing the network. Our approach uses Integer Linear Programming to encode these constraints and to predict the possible changes (down, neutral, up) of the activation levels of the involved players for a given experiment. Based on this formulation we developed several algorithms for detecting and removing inconsistencies between measurements and network topology. Demonstrated by EGFR/ErbB signaling in hepatocytes, our approach delivers direct conclusions on edges that are likely inactive or missing relative to canonical pathway maps. Such information drives the further elucidation of signaling network topologies under normal and pathological phenotypes.
PMCID: PMC3764019  PMID: 24039561
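The idea in entry 14 that an interaction graph alone constrains the qualitative changes (down, no effect, up) of its nodes can be illustrated on a toy scale. The sketch below is not SigNetTrainer and does not use integer linear programming; it swaps in brute-force enumeration, which is only feasible for a handful of nodes, and it assumes a simplified consistency rule chosen for illustration: a nonzero change of a node must be explained by at least one incoming edge, and a node may stay unchanged only if it receives no influence or opposing influences. The function names, the rule, and the toy network are all hypothetical.

```python
from itertools import product


def consistent_labelings(nodes, edges, fixed):
    """Return all labelings in {-1, 0, +1} that satisfy the simplified rule.

    nodes: list of node names.
    edges: iterable of (u, v, sign), sign = +1 or -1.
    fixed: dict of externally set changes (e.g. the applied stimulus).
    """
    free = [n for n in nodes if n not in fixed]
    incoming = {n: [] for n in nodes}
    for u, v, sign in edges:
        incoming[v].append((u, sign))

    def node_ok(n, x):
        # Nonzero influences arriving at n under labeling x.
        influences = {sign * x[u] for u, sign in incoming[n]} - {0}
        if x[n] == 0:
            # Unchanged: either nothing arrives, or opposing signals arrive.
            return len(influences) != 1
        return x[n] in influences

    solutions = []
    for values in product((-1, 0, 1), repeat=len(free)):
        x = {**fixed, **dict(zip(free, values))}
        if all(node_ok(n, x) for n in free):
            solutions.append(x)
    return solutions


def explains(solutions, measured):
    """True if some consistent labeling agrees with every measured node."""
    return any(all(x[n] == v for n, v in measured.items()) for x in solutions)


# Hypothetical toy network: stimulus S activates A and inhibits B; both feed C.
nodes = ["S", "A", "B", "C"]
edges = [("S", "A", 1), ("S", "B", -1), ("A", "C", 1), ("B", "C", 1)]
solutions = consistent_labelings(nodes, edges, fixed={"S": 1})

possible_C = {x["C"] for x in solutions}   # C is undetermined by topology alone
print(sorted(possible_C))
print(explains(solutions, {"A": 1, "C": -1}))
print(explains(solutions, {"A": -1}))      # contradicts the S -> A activation
```

On graphs with cycles this simplified rule can admit self-sustaining labelings, so the sketch should be read as an illustration of the constraint idea rather than as the published formulation.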
15.  The Logic of EGFR/ErbB Signaling: Theoretical Properties and Analysis of High-Throughput Data 
PLoS Computational Biology  2009;5(8):e1000438.
The epidermal growth factor receptor (EGFR) signaling pathway is probably the best-studied receptor system in mammalian cells, and it also has become a popular example for employing mathematical modeling to cellular signaling networks. Dynamic models have the highest explanatory and predictive potential; however, the lack of kinetic information restricts current models of EGFR signaling to smaller sub-networks. This work aims to provide a large-scale qualitative model that comprises the main and also the side routes of EGFR/ErbB signaling and that still enables one to derive important functional properties and predictions. Using a recently introduced logical modeling framework, we first examined general topological properties and the qualitative stimulus-response behavior of the network. With species equivalence classes, we introduce a new technique for logical networks that reveals sets of nodes strongly coupled in their behavior. We also analyzed a model variant which explicitly accounts for uncertainties regarding the logical combination of signals in the model. The predictive power of this model is still high, indicating highly redundant sub-structures in the network. Finally, one key advance of this work is the introduction of new techniques for assessing high-throughput data with logical models (and their underlying interaction graph). By employing these techniques for phospho-proteomic data from primary hepatocytes and the HepG2 cell line, we demonstrate that our approach enables one to uncover inconsistencies between experimental results and our current qualitative knowledge and to generate new hypotheses and conclusions. Our results strongly suggest that the Rac/Cdc42 induced p38 and JNK cascades are independent of PI3K in both primary hepatocytes and HepG2. Furthermore, we detected that the activation of JNK in response to neuregulin follows a PI3K-dependent signaling pathway.
Author Summary
The epidermal growth factor receptor (EGFR) signaling pathway is arguably the best-characterized receptor system in mammalian cells and has become a prime example for mathematical modeling of cellular signal transduction. Most of these models are constructed to describe dynamic and quantitative events but, due to the lack of precise kinetic information, focus only on certain regions of the network. Qualitative modeling approaches relying on the network structure provide a suitable way to deal with large-scale networks as a whole. Here, we constructed a comprehensive qualitative model of the EGFR/ErbB signaling pathway with more than 200 interactions reflecting our current state of knowledge. A theoretical analysis revealed important topological and functional properties of the network such as qualitative stimulus-response behavior and redundant sub-structures. Subsequently, we demonstrate how this qualitative model can be used to assess high-throughput data leading to new biological insights: comparing qualitative predictions (such as expected “ups” and “downs” of activation levels) of our model with experimental data from primary human hepatocytes and from the liver cancer cell line HepG2, we uncovered inconsistencies between measurements and model structure. These discrepancies lead to modifications in the EGFR/ErbB signaling network relevant at least for liver biology.
PMCID: PMC2710522  PMID: 19662154
16.  Multiple Model-Informed Open-Loop Control of Uncertain Intracellular Signaling Dynamics 
PLoS Computational Biology  2014;10(4):e1003546.
Computational approaches to tune the activation of intracellular signal transduction pathways both predictably and selectively will enable researchers to explore and interrogate cell biology with unprecedented precision. Techniques to control complex nonlinear systems typically involve the application of control theory to a descriptive mathematical model. For cellular processes, however, measurement assays tend to be too time consuming for real-time feedback control and models offer rough approximations of the biological reality, thus limiting their utility when considered in isolation. We overcome these problems by combining nonlinear model predictive control with a novel adaptive weighting algorithm that blends predictions from multiple models to derive a compromise open-loop control sequence. The proposed strategy uses weight maps to inform the controller of the tendency for models to differ in their ability to accurately reproduce the system dynamics under different experimental perturbations (i.e. control inputs). These maps, which characterize the changing model likelihoods over the admissible control input space, are constructed using preexisting experimental data and used to produce a model-based open-loop control framework. In effect, the proposed method designs a sequence of control inputs that force the signaling dynamics along a predefined temporal response without measurement feedback while mitigating the effects of model uncertainty. We demonstrate this technique on the well-known Erk/MAPK signaling pathway in T cells. In silico assessment demonstrates that this approach successfully reduces target tracking error by 52% or better when compared with single model-based controllers and non-adaptive multiple model-based controllers. In vitro implementation of the proposed approach in Jurkat cells confirms a 63% reduction in tracking error when compared with the best of the single-model controllers. 
This study provides an experimentally-corroborated control methodology that utilizes the knowledge encoded within multiple mathematical models of intracellular signaling to design control inputs that effectively direct cell behavior in open-loop.
Author Summary
Most cell behavior arises as a response to external forces. Signals from the extracellular environment are passed to the cell's nucleus through a complex network of interacting proteins. Perturbing these pathways can change the strength or outcome of the signals, which could be used to treat or prevent a pathological response. While manipulating these networks can be achieved using a variety of methods, the ability to do so predictably over time would provide an unprecedented level of control over cell behavior and could lead to new therapeutic design and research tools in medicine and systems biology. Hence, we propose a practical computational framework to aid in the design of experimental perturbations to force cell signaling dynamics to follow a predefined response. Our approach represents a novel merger of model-based control and information theory to blend the predictions from multiple mathematical models into a meaningful compromise solution. We verify through simulation and experimentation that this solution produces excellent agreement between the cell readouts and several predefined trajectories, even in the presence of significant modeling uncertainty and without measurement feedback. By combining elements of information and control theory, our approach will help advance the best practices in model-based control applications for medicine.
PMCID: PMC3983080  PMID: 24722333
17.  Network modeling of the transcriptional effects of copy number aberrations in glioblastoma 
DNA copy number aberrations (CNAs) are a characteristic feature of cancer genomes. In this work, Rebecka Jörnsten, Sven Nelander and colleagues combine network modeling and experimental methods to analyze the systems-level effects of CNAs in glioblastoma.
We introduce a modeling approach termed EPoC (Endogenous Perturbation analysis of Cancer), enabling the construction of global, gene-level models that causally connect gene copy number with expression in glioblastoma. On the basis of the resulting model, we predict genes that are likely to be disease-driving and validate selected predictions experimentally. We also demonstrate that further analysis of the network model by sparse singular value decomposition allows stratification of patients with glioblastoma into short-term and long-term survivors, introducing decomposed network models as a useful principle for biomarker discovery. Finally, in systematic comparisons, we demonstrate that EPoC is computationally efficient and yields more consistent results than mRNA-only methods, standard eQTL methods, and two recent multivariate methods for genotype–mRNA coupling.
Gains and losses of chromosomal material (DNA copy number aberrations; CNAs) are a characteristic feature of cancer genomes. At the level of a single locus, it is well known that increased copy number (gene amplification) typically leads to increased gene expression, whereas decreased copy number (gene deletion) leads to decreased gene expression (Pollack et al, 2002; Lee et al, 2008; Nilsson et al, 2008). However, CNAs also affect the expression of genes located outside the amplified/deleted region itself via indirect mechanisms. To fully understand the action of CNAs, it is therefore necessary to analyze their action in a network context. Toward this goal, improved computational approaches will be important, if not essential.
To determine the global effects on transcription of CNAs in the brain tumor glioblastoma, we develop EPoC (Endogenous Perturbation analysis of Cancer), a computational technique capable of inferring sparse, causal network models by combining genome-wide, paired CNA- and mRNA-level data. EPoC aims to detect disease-driving copy number aberrations and their effect on target mRNA expression, and stratify patients into long-term and short-term survivors. Technically, EPoC relates CNA perturbations to mRNA responses by matrix equations, derived from a steady-state approximation of the transcriptional network. Patient prognostic scores are obtained from singular value decompositions of the network matrix. The models are constructed by solving a large-scale, regularized regression problem.
We apply EPoC to glioblastoma data from The Cancer Genome Atlas (TCGA) consortium (186 patients). The identified CNA-driven network comprises 10 672 genes, and contains a number of copy number-altered genes that control multiple downstream genes. Highly connected hub genes include well-known oncogenes and tumor suppressor genes that are frequently deleted or amplified in glioblastoma, including EGFR, PDGFRA, CDKN2A and CDKN2B, confirming a clear association between these aberrations and transcriptional variability of these brain tumors. In addition, we identify a number of hub genes that have previously not been associated with glioblastoma, including interferon alpha 1 (IFNA1), myeloid/lymphoid or mixed-lineage leukemia translocated to 10 (MLLT10, a well-known leukemia gene), glutamate decarboxylase 2 (GAD2), a postulated glutamate receptor (GPR158) and Necdin (NDN). Furthermore, we demonstrate that the network model contains useful information on downstream target genes (including stem cell regulators), and possible drug targets.
We proceed to explore the validity of a small network region experimentally. Introducing experimental perturbations of NDN and other targets in four glioblastoma cell lines (T98G, U-87MG, U-343MG and U-373MG), we confirm several predicted mechanisms. We also demonstrate that the TCGA glioblastoma patients can be stratified into long-term and short-term survivors, using our proposed prognostic scores derived from a singular value decomposition of the network model. Finally, we compare EPoC to existing methods for mRNA network analysis and expression quantitative trait locus (eQTL) methods, and demonstrate that EPoC produces more consistent models between technically independent glioblastoma data sets, and that the EPoC models exhibit better overlap with known protein–protein interaction networks and pathway maps.
In summary, we conclude that large-scale integrative modeling reveals mechanistically and prognostically informative networks in human glioblastoma. Our approach operates at the gene level and our data support that individual hub genes can be identified in practice. Very large aberrations, however, cannot be fully resolved by the current modeling strategy.
DNA copy number aberrations (CNAs) are a hallmark of cancer genomes. However, little is known about how such changes affect global gene expression. We develop a modeling framework, EPoC (Endogenous Perturbation analysis of Cancer), to (1) detect disease-driving CNAs and their effect on target mRNA expression, and to (2) stratify cancer patients into long- and short-term survivors. Our method constructs causal network models of gene expression by combining genome-wide DNA- and RNA-level data. Prognostic scores are obtained from a singular value decomposition of the networks. By applying EPoC to glioblastoma data from The Cancer Genome Atlas consortium, we demonstrate that the resulting network models contain known disease-relevant hub genes, reveal interesting candidate hubs, and uncover predictors of patient survival. Targeted validations in four glioblastoma cell lines support selected predictions, and implicate the p53-interacting protein Necdin in suppressing glioblastoma cell growth. We conclude that large-scale network modeling of the effects of CNAs on gene expression may provide insights into the biology of human cancer. Free software in MATLAB and R is provided.
PMCID: PMC3101951  PMID: 21525872
cancer biology; cancer genomics; glioblastoma
18.  Dynamic simulation of regulatory networks using SQUAD 
BMC Bioinformatics  2007;8:462.
The ambition of most molecular biologists is to understand the intricate network of molecular interactions that control biological systems. As scientists uncover the components and the connectivity of these networks, it becomes possible to study their dynamical behavior as a whole and to discover the specific role of each component. Since the behavior of a network is by no means intuitive, computational models are needed to understand that behavior and to make predictions about it. Unfortunately, most current computational models describe small networks because kinetic data are scarce. To overcome this problem, we previously published a methodology to convert a signaling network into a dynamical system, even in the total absence of kinetic information. In this paper we present a software implementation of this methodology.
We developed SQUAD, a software package for the dynamic simulation of signaling networks using the standardized qualitative dynamical systems approach. SQUAD converts the network into a discrete dynamical system, and it uses a binary decision diagram algorithm to identify all the steady states of the system. Then, the software creates a continuous dynamical system and locates its steady states, which lie near the steady states of the discrete system. The software supports simulations of the continuous system and allows several parameters to be modified. Importantly, SQUAD includes a framework for perturbing networks in a manner similar to what is performed in experimental laboratory protocols, for example by activating receptors or knocking out molecular components. Using this software we have been able to successfully reproduce the behavior of the regulatory network implicated in T-helper cell differentiation.
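The discrete step described above can be illustrated with a minimal sketch. It assumes a Boolean network given as update rules and finds steady states by brute-force enumeration, which is a simpler substitute for the binary decision diagram algorithm that SQUAD uses and only scales to small networks. The function name `steady_states` and the two-node `toggle` network are hypothetical and are not taken from SQUAD.

```python
from itertools import product

def steady_states(rules):
    """Return all states s with F(s) == s for a synchronous Boolean network.

    rules maps each node name to a function that takes the current state
    (a dict of node -> bool) and returns that node's next value.
    Brute force over 2**n states; not the BDD method used by SQUAD.
    """
    names = sorted(rules)
    found = []
    for values in product([False, True], repeat=len(names)):
        state = dict(zip(names, values))
        next_state = {name: rules[name](state) for name in names}
        if next_state == state:
            found.append(state)
    return found

# Hypothetical toy network: two nodes that inhibit each other.
toggle = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
}
fixed = steady_states(toggle)

# A knockout perturbation is modeled by clamping a node to False.
knockout_a = dict(toggle, A=lambda s: False)
fixed_after_knockout = steady_states(knockout_a)
```

In this toy network the mutual inhibition leaves two steady states (one node on, the other off), and clamping `A` to off leaves only the state in which `B` is on, which mirrors the kind of knockout perturbation described in the abstract.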
The simulation of regulatory networks aims at predicting the behavior of a whole system when subject to stimuli, such as drugs, or at determining the role of specific components within the network. The predictions can then be used to interpret and/or drive laboratory experiments. SQUAD provides a user-friendly graphical interface, accessible to both computational and experimental biologists, for the fast qualitative simulation of large regulatory networks for which kinetic data are not necessarily available.
PMCID: PMC2238325  PMID: 18039375
19.  Metabolic network reconstruction of Chlamydomonas offers insight into light-driven algal metabolism 
A comprehensive genome-scale metabolic network of Chlamydomonas reinhardtii, including a detailed account of light-driven metabolism, is reconstructed and validated. The model provides a new resource for research of C. reinhardtii metabolism and in algal biotechnology.
The genome-scale metabolic network of Chlamydomonas reinhardtii (iRC1080) was reconstructed, accounting for >32% of the estimated metabolic genes encoded in the genome, and including extensive details of lipid metabolic pathways. This is the first metabolic network to explicitly account for stoichiometry and wavelengths of metabolic photon usage, providing a new resource for research of C. reinhardtii metabolism and developments in algal biotechnology. Metabolic functional annotation and the largest transcript verification of a metabolic network to date was performed, at least partially verifying >90% of the transcripts accounted for in iRC1080. Analysis of the network supports hypotheses concerning the evolution of latent lipid pathways in C. reinhardtii, including very long-chain polyunsaturated fatty acid and ceramide synthesis pathways. A novel approach for modeling light-driven metabolism was developed that accounts for both light source intensity and spectral quality of emitted light. The constructs resulting from this approach, termed prism reactions, were shown to significantly improve the accuracy of model predictions, and their use was demonstrated for evaluation of light source efficiency and design.
Algae have garnered significant interest in recent years, especially for their potential application in biofuel production. The hallmark, model eukaryotic microalgae Chlamydomonas reinhardtii has been widely used to study photosynthesis, cell motility and phototaxis, cell wall biogenesis, and other fundamental cellular processes (Harris, 2001). Characterizing algal metabolism is key to engineering production strains and understanding photobiological phenomena. Based on extensive literature on C. reinhardtii metabolism, its genome sequence (Merchant et al, 2007), and gene functional annotation, we have reconstructed and experimentally validated the genome-scale metabolic network for this alga, iRC1080, the first network to account for detailed photon absorption permitting growth simulations under different light sources. iRC1080 accounts for 1080 genes, associated with 2190 reactions and 1068 unique metabolites and encompasses 83 subsystems distributed across 10 cellular compartments (Figure 1A). Its >32% coverage of estimated metabolic genes is a tremendous expansion over previous algal reconstructions (Boyle and Morgan, 2009; Manichaikul et al, 2009). The lipid metabolic pathways of iRC1080 are considerably expanded relative to existing networks, and chemical properties of all metabolites in these pathways are accounted for explicitly, providing sufficient detail to completely specify all individual molecular species: backbone molecule and stereochemical numbering of acyl-chain positions; acyl-chain length; and number, position, and cis–trans stereoisomerism of carbon–carbon double bonds. Such detail in lipid metabolism will be critical for model-driven metabolic engineering efforts.
We experimentally verified transcripts accounted for in the network under permissive growth conditions, detecting >90% of tested transcript models (Figure 1B) and providing validating evidence for the contents of iRC1080. We also analyzed the extent of transcript verification by specific metabolic subsystems. Some subsystems stood out as more poorly verified, including chloroplast and mitochondrial transport systems and sphingolipid metabolism, all of which exhibited <80% of transcripts detected, reflecting incomplete characterization of compartmental transporters and supporting a hypothesis of latent pathway evolution for ceramide synthesis in C. reinhardtii. Additional lines of evidence from the reconstruction effort similarly support this hypothesis including lack of ceramide synthetase and other annotation gaps downstream in sphingolipid metabolism. A similar hypothesis of latent pathway evolution was established for very long-chain fatty acids (VLCFAs) and their polyunsaturated analogs (VLCPUFAs) (Figure 1C), owing to the absence of this class of lipids in previous experimental measurements, lack of a candidate VLCFA elongase in the functional annotation, and additional downstream annotation gaps in arachidonic acid metabolism.
The network provides a detailed account of metabolic photon absorption by light-driven reactions, including photosystems I and II, light-dependent protochlorophyllide oxidoreductase, provitamin D3 photoconversion to vitamin D3, and rhodopsin photoisomerase; this network accounting permits the precise modeling of light-dependent metabolism. iRC1080 accounts for effective light spectral ranges through analysis of biochemical activity spectra (Figure 3A), either reaction activity or absorbance at varying light wavelengths. Defining effective spectral ranges associated with each photon-utilizing reaction enabled our network to model growth under different light sources via stoichiometric representation of the spectral composition of emitted light, termed prism reactions. Coefficients for different photon wavelengths in a prism reaction correspond to the ratios of photon flux in the defined effective spectral ranges to the total emitted photon flux from a given light source (Figure 3B). This approach distinguishes the amount of emitted photons that drive different metabolic reactions. We created prism reactions for most light sources that have been used in published studies for algal and plant growth including solar light, various light bulbs, and LEDs. We also included regulatory effects, resulting from lighting conditions insofar as published studies enabled. Light and dark conditions have been shown to affect metabolic enzyme activity in C. reinhardtii on multiple levels: transcriptional regulation, chloroplast RNA degradation, translational regulation, and thioredoxin-mediated enzyme regulation. Through application of our light model and prism reactions, we were able to closely recapitulate experimental growth measurements under solar, incandescent, and red LED lights. Through unbiased sampling, we were able to establish the tremendous statistical significance of the accuracy of growth predictions achievable through implementation of prism reactions. 
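The definition of prism reaction coefficients given above (photon flux within each effective spectral range divided by the total emitted photon flux) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `prism_coefficients`, the reaction labels, the spectral ranges and the flux values are all made up for the example and are not values from iRC1080.

```python
def prism_coefficients(spectrum, effective_ranges):
    """Compute one coefficient per photon-utilizing reaction.

    spectrum: dict mapping wavelength (nm) -> emitted photon flux.
    effective_ranges: dict mapping reaction name -> (low_nm, high_nm),
        inclusive bounds of the range in which the reaction uses photons.
    Each coefficient is flux inside the range divided by total flux.
    """
    total_flux = sum(spectrum.values())
    if total_flux <= 0:
        raise ValueError("light source emits no photons")
    coefficients = {}
    for reaction, (low_nm, high_nm) in effective_ranges.items():
        in_range = sum(
            flux for wavelength, flux in spectrum.items()
            if low_nm <= wavelength <= high_nm
        )
        coefficients[reaction] = in_range / total_flux
    return coefficients

# Hypothetical light source and ranges (illustrative numbers only).
toy_spectrum = {450: 10.0, 550: 20.0, 680: 50.0, 700: 20.0}
toy_ranges = {"reaction_A": (670, 690), "reaction_B": (695, 705)}
toy_coefficients = prism_coefficients(toy_spectrum, toy_ranges)
```

For this toy source, half of the emitted photons fall in the range assigned to `reaction_A` and one fifth in the range assigned to `reaction_B`; photons outside every range drive no reaction, which is how the approach distinguishes usable from unusable light.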
Finally, application of the photosynthetic model was demonstrated prospectively to evaluate light utilization efficiency under different light sources. The results suggest that, of the existing light sources, red LEDs provide the greatest efficiency, about three times as efficient as sunlight. Extending this analysis, the model was applied to design a maximally efficient LED spectrum for algal growth. The result was a 677-nm peak LED spectrum with a total incident photon flux of 360 μE/m²/s, suggesting that for the simple objective of maximizing growth efficiency, LED technology has already reached an effective theoretical optimum.
In summary, the C. reinhardtii metabolic network iRC1080 that we have reconstructed offers insight into the basic biology of this species and may be employed prospectively for genetic engineering design and light source design relevant to algal biotechnology. iRC1080 was used to analyze lipid metabolism and generate novel hypotheses about the evolution of latent pathways. The predictive capacity of metabolic models developed from iRC1080 was demonstrated in simulating mutant phenotypes and in evaluation of light source efficiency. Our network provides a broad knowledgebase of the biochemistry and genomics underlying global metabolism of a photoautotroph, and our modeling approach for light-driven metabolism exemplifies how integration of largely unvisited data types, such as physicochemical environmental parameters, can expand the diversity of applications of metabolic networks.
Metabolic network reconstruction encompasses existing knowledge about an organism's metabolism and genome annotation, providing a platform for omics data analysis and phenotype prediction. The model alga Chlamydomonas reinhardtii is employed to study diverse biological processes from photosynthesis to phototaxis. Recent heightened interest in this species results from an international movement to develop algal biofuels. Integrating biological and optical data, we reconstructed a genome-scale metabolic network for this alga and devised a novel light-modeling approach that enables quantitative growth prediction for a given light source, resolving wavelength and photon flux. We experimentally verified transcripts accounted for in the network and physiologically validated model function through simulation and generation of new experimental growth data, providing high confidence in network contents and predictive applications. The network offers insight into algal metabolism and potential for genetic engineering and efficient light source design, a pioneering resource for studying light-driven metabolism and quantitative systems biology.
PMCID: PMC3202792  PMID: 21811229
Chlamydomonas reinhardtii; lipid metabolism; metabolic engineering; photobioreactor
20.  Model-Free Reconstruction of Excitatory Neuronal Connectivity from Calcium Imaging Signals 
PLoS Computational Biology  2012;8(8):e1002653.
A systematic assessment of global neural network connectivity through direct electrophysiological assays has remained technically infeasible, even in simpler systems like dissociated neuronal cultures. We introduce an improved algorithmic approach based on Transfer Entropy to reconstruct structural connectivity from network activity monitored through calcium imaging. We focus in this study on the inference of excitatory synaptic links. Based on information theory, our method requires no prior assumptions on the statistics of neuronal firing and neuronal connections. The performance of our algorithm is benchmarked on surrogate time series of calcium fluorescence generated by the simulated dynamics of a network with known ground-truth topology. We find that the functional network topology revealed by Transfer Entropy depends qualitatively on the time-dependent dynamic state of the network (bursting or non-bursting). Thus by conditioning with respect to the global mean activity, we improve the performance of our method. This allows us to focus the analysis to specific dynamical regimes of the network in which the inferred functional connectivity is shaped by monosynaptic excitatory connections, rather than by collective synchrony. Our method can discriminate between actual causal influences between neurons and spurious non-causal correlations due to light scattering artifacts, which inherently affect the quality of fluorescence imaging. Compared to other reconstruction strategies such as cross-correlation or Granger Causality methods, our method based on improved Transfer Entropy is remarkably more accurate. In particular, it provides a good estimation of the excitatory network clustering coefficient, allowing for discrimination between weakly and strongly clustered topologies. 
Finally, we demonstrate the applicability of our method to analyses of real recordings of in vitro disinhibited cortical cultures where we suggest that excitatory connections are characterized by an elevated level of clustering compared to a random graph (although not extreme) and can be markedly non-local.
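For readers unfamiliar with Transfer Entropy, the following minimal sketch shows the basic plug-in estimate for discrete time series with a history length of one step. It is the textbook form only; the generalized measure described above, including the conditioning on global mean activity, is not reproduced here. The function name `transfer_entropy` and the synthetic test signals are assumptions made for illustration.

```python
import random
from collections import Counter
from math import log2

def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum over (y1, y0, x0) of p(y1, y0, x0)
         * log2( p(y1 | y0, x0) / p(y1 | y0) ),
    where y1 = target[t+1], y0 = target[t], x0 = source[t].
    """
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    n = sum(triples.values())
    count_y0_x0 = Counter()
    count_y1_y0 = Counter()
    count_y0 = Counter()
    for (y1, y0, x0), c in triples.items():
        count_y0_x0[(y0, x0)] += c
        count_y1_y0[(y1, y0)] += c
        count_y0[y0] += c
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_given_both = c / count_y0_x0[(y0, x0)]
        p_given_self = count_y1_y0[(y1, y0)] / count_y0[y0]
        te += (c / n) * log2(p_given_both / p_given_self)
    return te

# Synthetic check: y copies x with a one-step delay, so x drives y.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]
te_x_to_y = transfer_entropy(x, y)
te_y_to_x = transfer_entropy(y, x)
```

In this synthetic case the estimate from `x` to `y` is close to one bit, while the reverse direction is close to zero, which is the asymmetry that allows directed links to be inferred. Real calcium fluorescence signals would first need to be discretized, a step this sketch leaves out.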
Author Summary
Unraveling the general organizing principles of connectivity in neural circuits is a crucial step towards understanding brain function. However, even the simpler task of assessing the global excitatory connectivity of a culture in vitro, where neurons form self-organized networks in the absence of external stimuli, remains challenging. Neuronal cultures undergo spontaneous switching between episodes of synchronous bursting and quieter inter-burst periods. We introduce here a novel algorithm which aims at inferring the connectivity of neuronal cultures from calcium fluorescence recordings of their network dynamics. To achieve this goal, we develop a suitable generalization of Transfer Entropy, an information-theoretic measure of causal influences between time series. Unlike previous algorithmic approaches to reconstruction, Transfer Entropy is data-driven and does not rely on specific assumptions about neuronal firing statistics or network topology. We generate simulated calcium signals from networks with controlled ground-truth topology and purely excitatory interactions and show that, by restricting the analysis to inter-burst periods, Transfer Entropy robustly achieves a good reconstruction performance for disparate network connectivities. Finally, we apply our method to real data and find evidence of non-random features in cultured networks, such as the existence of highly connected hub excitatory neurons and of an elevated (but not extreme) level of clustering.
PMCID: PMC3426566  PMID: 22927808
21.  Modeling Reveals Bistability and Low-Pass Filtering in the Network Module Determining Blood Stem Cell Fate 
PLoS Computational Biology  2010;6(5):e1000771.
Combinatorial regulation of gene expression is ubiquitous in eukaryotes with multiple inputs converging on regulatory control elements. The dynamic properties of these elements determine the functionality of genetic networks regulating differentiation and development. Here we propose a method to quantitatively characterize the regulatory output of distant enhancers with a biophysical approach that recursively determines free energies of protein-protein and protein-DNA interactions from experimental analysis of transcriptional reporter libraries. We apply this method to model the Scl-Gata2-Fli1 triad—a network module important for cell fate specification of hematopoietic stem cells. We show that this triad module is inherently bistable with irreversible transitions in response to physiologically relevant signals such as Notch, Bmp4 and Gata1 and we use the model to predict the sensitivity of the network to mutations. We also show that the triad acts as a low-pass filter by switching between steady states only in response to signals that persist for longer than a minimum duration threshold. We have found that the auto-regulation loops connecting the slow-degrading Scl to Gata2 and Fli1 are crucial for this low-pass filtering property. Taken together our analysis not only reveals new insights into hematopoietic stem cell regulatory network functionality but also provides a novel and widely applicable strategy to incorporate experimental measurements into dynamical network models.
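The combination of bistability and low-pass filtering described above can be illustrated with a generic toy model. The sketch below is not the Scl-Gata2-Fli1 model: it assumes a single variable with Hill-type positive autoregulation, first-order decay and a transient input pulse, integrated with the forward Euler method. The function name `simulate_switch` and all parameter values are hypothetical.

```python
def simulate_switch(pulse_duration, pulse_strength=0.5, t_end=40.0, dt=0.01):
    """Integrate dx/dt = signal(t) + 2*x**4 / (1 + x**4) - x from x = 0.

    The signal equals pulse_strength while t < pulse_duration, else 0.
    Without signal the system has stable states at x = 0 and x of about
    1.84, separated by an unstable state at x = 1.
    Returns x at time t_end.
    """
    x = 0.0
    steps = int(round(t_end / dt))
    for step in range(steps):
        t = step * dt
        signal = pulse_strength if t < pulse_duration else 0.0
        feedback = 2.0 * x**4 / (1.0 + x**4)
        x += dt * (signal + feedback - x)
    return x

# A brief pulse is filtered out; a sustained pulse flips the switch.
after_short_pulse = simulate_switch(pulse_duration=1.0)
after_long_pulse = simulate_switch(pulse_duration=10.0)
```

With these assumed parameters, the short pulse ends before the state crosses the unstable threshold, so the system relaxes back to the low state. The long pulse carries the state past the threshold, and it remains in the high state after the signal is removed, giving an irreversible transition of the kind the abstract reports.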
Author Summary
Hematopoiesis—blood cell development—has long served as a model for study of cellular differentiation and its control by underlying gene regulatory networks. The Scl-Gata2-Fli1 triad is a network module essential for the development of hematopoietic stem cells but its mechanistic role is not well understood. The transcription factors Scl, Gata2 and Fli1 act in combination to upregulate transcription of each other via distal enhancer site binding. Similar network architectures are essential in other multipotent cell lines. We propose a method that uses experimental results to circumvent the difficulties of mathematically modeling the combinatorial regulation of this triad module. Using this dynamical model we show that the triad exhibits robust bistable behavior. Environmental signals can irreversibly switch the triad between stable states in a manner that reflects the unidirectional switching in the formation and subsequent differentiation of hematopoietic stem cells. We also show that the triad makes reliable decisions in noisy environments by only switching in response to transient signals that persist longer than the threshold duration. These results suggest that the Scl-Gata2-Fli1 module possibly functions as a control switch for hematopoietic stem cell development. The proposed method can be extended for quantitative characterization of other combinatorial gene regulatory modules.
PMCID: PMC2865510  PMID: 20463872
22.  Linking Proteomic and Transcriptional Data through the Interactome and Epigenome Reveals a Map of Oncogene-induced Signaling 
PLoS Computational Biology  2013;9(2):e1002887.
Cellular signal transduction generally involves cascades of post-translational protein modifications that rapidly catalyze changes in protein-DNA interactions and gene expression. High-throughput measurements are improving our ability to study each of these stages individually, but do not capture the connections between them. Here we present an approach for building a network of physical links among these data that can be used to prioritize targets for pharmacological intervention. Our method recovers the critical missing links between proteomic and transcriptional data by relating changes in chromatin accessibility to changes in expression and then uses these links to connect proteomic and transcriptome data. We applied our approach to integrate epigenomic, phosphoproteomic and transcriptome changes induced by the variant III mutation of the epidermal growth factor receptor (EGFRvIII) in a cell line model of glioblastoma multiforme (GBM). To test the relevance of the network, we used small molecules to target highly connected nodes implicated by the network model that were not detected by the experimental data in isolation and we found that a large fraction of these agents alter cell viability. Among these are two compounds, ICG-001, targeting CREB binding protein (CREBBP), and PKF118–310, targeting β-catenin (CTNNB1), which have not been tested previously for effectiveness against GBM. At the level of transcriptional regulation, we used chromatin immunoprecipitation sequencing (ChIP-Seq) to experimentally determine the genome-wide binding locations of p300, a transcriptional co-regulator highly connected in the network. Analysis of p300 target genes suggested its role in tumorigenesis. We propose that this general method, in which experimental measurements are used as constraints for building regulatory networks from the interactome while taking into account noise and missing data, should be applicable to a wide range of high-throughput datasets.
Author Summary
The ways in which cells respond to changes in their environment are controlled by networks of physical links among the proteins and genes. The initial signal of a change in conditions rapidly passes through these networks from the cytoplasm to the nucleus, where it can lead to long-term alterations in cellular behavior by controlling the expression of genes. These cascades of signaling events underlie many normal biological processes. As a result, being able to map out how these networks change in disease can provide critical insights for new approaches to treatment. We present a computational method for reconstructing these networks by finding links between the rapid short-term changes in proteins and the longer-term changes in gene regulation. This method brings together systematic measurements of protein signaling, genome organization and transcription in the context of protein-protein and protein-DNA interactions. When used to analyze datasets from an oncogene-expressing cell line model of human glioblastoma, our approach identifies key nodes that affect cell survival and functional transcriptional regulators.
PMCID: PMC3567149  PMID: 23408876
23.  Time-dependent structural transformation analysis to high-level Petri net model with active state transition diagram 
BMC Systems Biology  2010;4:39.
With the accumulation of in silico data obtained by simulating large-scale biological networks, research interest is growing in elucidating how living organisms function over time at the cellular level.
Investigating the dynamic features of current computational models promises a deeper understanding of complex cellular processes. This leads us to develop a method that utilizes structural properties of the model over all simulation time steps. Further, user-friendly overviews of dynamic behaviors can greatly help in understanding the variations of system mechanisms.
We propose a novel method for constructing and analyzing a so-called active state transition diagram (ASTD) by using time-course simulation data of a high-level Petri net. Our method includes two new algorithms. The first algorithm extracts a series of subnets (called temporal subnets) reflecting biological components contributing to the dynamics, while retaining positive mathematical qualities. The second one creates an ASTD composed of unique temporal subnets. ASTD provides users with concise information allowing them to grasp and trace how a key regulatory subnet and/or a network changes with time. The applicability of our method is demonstrated by the analysis of the underlying model for circadian rhythms in Drosophila.
Building ASTD is a useful means to convert a hybrid model dealing with discrete, continuous and more complicated events to finite time-dependent states. Based on ASTD, various analytical approaches can be applied to obtain new insights into not only systematic mechanisms but also dynamics.
PMCID: PMC2855528  PMID: 20356411
24.  Metabolic Constraint-Based Refinement of Transcriptional Regulatory Networks 
PLoS Computational Biology  2013;9(12):e1003370.
There is a strong need for computational frameworks that integrate different biological processes and data-types to unravel cellular regulation. Current efforts to reconstruct transcriptional regulatory networks (TRNs) focus primarily on proximal data such as gene co-expression and transcription factor (TF) binding. While such approaches enable rapid reconstruction of TRNs, the overwhelming combinatorics of possible networks limits identification of mechanistic regulatory interactions. Utilizing growth phenotypes and systems-level constraints to inform regulatory network reconstruction is an unmet challenge. We present our approach Gene Expression and Metabolism Integrated for Network Inference (GEMINI) that links a compendium of candidate regulatory interactions with the metabolic network to predict their systems-level effect on growth phenotypes. We then compare predictions with experimental phenotype data to select phenotype-consistent regulatory interactions. GEMINI makes use of the observation that only a small fraction of regulatory network states are compatible with a viable metabolic network, and outputs a regulatory network that is simultaneously consistent with the input genome-scale metabolic network model, gene expression data, and TF knockout phenotypes. GEMINI preferentially recalls gold-standard interactions (p-value = 10^−172), significantly better than using gene expression alone. We applied GEMINI to create an integrated metabolic-regulatory network model for Saccharomyces cerevisiae involving 25,000 regulatory interactions controlling 1597 metabolic reactions. The model quantitatively predicts TF knockout phenotypes in new conditions (p-value = 10^−14) and revealed potential condition-specific regulatory mechanisms.
Our results suggest that a metabolic constraint-based approach can be successfully used to help reconstruct TRNs from high-throughput data, and highlight the potential of using a biochemically-detailed mechanistic framework to integrate and reconcile inconsistencies across different data-types. The algorithm and associated data are available at
Author Summary
Cellular networks, such as metabolic and transcriptional regulatory networks (TRNs), do not operate independently but work in unison to determine cellular phenotypes. Further, the phenotype and architecture of one network constrain the topology of other networks. Hence, it is critical to study network components and interactions in the context of the entire cell. Typically, efforts to reconstruct TRNs focus only on immediately proximal data such as gene co-expression and transcription factor (TF) binding. Herein, we take a different strategy by linking candidate TRNs with the metabolic network to predict systems-level responses such as growth phenotypes of TF knockout strains, and compare predictions with experimental phenotype data to select amongst the candidate TRNs. Our approach goes beyond traditional data integration approaches for network inference and refinement by using a predictive network model (metabolism) to refine another network model (regulation), thus providing an alternative avenue to this area of research. Understanding how the networks function together in a cell will pave the way for synthetic biology and has a wide range of applications in biotechnology, drug discovery and diagnostics. Further, we demonstrate how metabolic models can integrate and reconcile inconsistencies across different data-types.
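The selection step described in this entry (predict TF knockout phenotypes from candidate interactions, then keep only the interactions consistent with observed phenotypes) can be illustrated with a toy sketch. This is not the GEMINI algorithm: the genome-scale metabolic model is replaced by a hypothetical rule ("growth fails if any essential gene is switched off"), and every gene, TF, and function name below is invented for illustration.

```python
def predict_growth(off_genes, essential_genes):
    """Toy stand-in for a metabolic model (assumption): growth iff no essential gene is off."""
    return not (off_genes & essential_genes)


def refine_interactions(candidates, essential_genes, observed_viable):
    """Drop candidate TF->gene interactions that force a lethal prediction
    for a TF knockout that is experimentally viable."""
    kept = set(candidates)
    for tf, viable in observed_viable.items():
        # Assume the TF activates its targets, so a knockout switches them off.
        targets = {gene for (t, gene) in kept if t == tf}
        if viable and not predict_growth(targets, essential_genes):
            # Prediction contradicts the data: remove the offending interactions.
            kept -= {(tf, gene) for gene in targets & essential_genes}
    return kept


# Hypothetical candidate network and phenotype data.
candidates = {("TF1", "geneA"), ("TF1", "geneB"), ("TF2", "geneC")}
essential = {"geneB", "geneC"}
observed = {"TF1": True, "TF2": False}  # True = knockout strain grows

refined = refine_interactions(candidates, essential, observed)
```

Here the TF1 knockout is observed to grow, so the candidate interaction TF1 -> geneB, which would predict lethality, is discarded, while the TF2 interaction agrees with its lethal phenotype and is kept.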
PMCID: PMC3857774  PMID: 24348226
25.  Identifying cancer biomarkers by network-constrained support vector machines 
BMC Systems Biology  2011;5:161.
One of the major goals in gene and protein expression profiling of cancer is to identify biomarkers and build classification models for prediction of disease prognosis or treatment response. Many traditional statistical methods, based on microarray gene expression data alone and individual genes' discriminatory power, often fail to identify biologically meaningful biomarkers thus resulting in poor prediction performance across data sets. Nonetheless, the variables in multivariable classifiers should synergistically interact to produce more effective classifiers than individual biomarkers.
We developed an integrated approach, namely network-constrained support vector machine (netSVM), for cancer biomarker identification with an improved prediction performance. The netSVM approach is specifically designed for network biomarker identification by integrating gene expression data and protein-protein interaction data. We first evaluated the effectiveness of netSVM using simulation studies, demonstrating its improved performance over state-of-the-art network-based methods and gene-based methods for network biomarker identification. We then applied the netSVM approach to two breast cancer data sets to identify prognostic signatures for prediction of breast cancer metastasis. The experimental results show that: (1) network biomarkers identified by netSVM are highly enriched in biological pathways associated with cancer progression; (2) prediction performance is much improved when tested across different data sets. Specifically, many genes related to apoptosis, cell cycle, and cell proliferation, which are hallmark signatures of breast cancer metastasis, were identified by the netSVM approach. More importantly, several novel hub genes, biologically important with many interactions in the PPI network but often showing little change in expression as compared with their downstream genes, were also identified as network biomarkers; the genes were enriched in signaling pathways such as the TGF-beta, MAPK, and JAK-STAT signaling pathways. These signaling pathways may provide new insight into the underlying mechanism of breast cancer metastasis.
We have developed a network-based approach for cancer biomarker identification, netSVM, resulting in an improved prediction performance with network biomarkers. We have applied the netSVM approach to breast cancer gene expression data to predict metastasis in patients. Network biomarkers identified by netSVM reveal potential signaling pathways associated with breast cancer metastasis, and help improve the prediction performance across independent data sets.
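The abstract does not give the netSVM objective. As a rough illustration of how a network constraint can enter a linear SVM, the sketch below assumes a hinge loss plus a graph-Laplacian smoothness penalty over PPI edges, which is one common formulation and not necessarily the authors'. The function names, the `lam` weight, and the toy data are all hypothetical.

```python
def hinge_loss(w, b, X, y):
    """Sum of hinge losses max(0, 1 - y * (w.x + b)) over all samples."""
    total = 0.0
    for xi, yi in zip(X, y):
        score = sum(wk * xk for wk, xk in zip(w, xi)) + b
        total += max(0.0, 1.0 - yi * score)
    return total


def laplacian_penalty(w, edges):
    """w^T L w for an unweighted graph: sum of (w_i - w_j)^2 over edges (i, j)."""
    return sum((w[i] - w[j]) ** 2 for i, j in edges)


def objective(w, b, X, y, edges, lam):
    """Assumed network-constrained objective: hinge loss + lam * Laplacian penalty."""
    return hinge_loss(w, b, X, y) + lam * laplacian_penalty(w, edges)


# Toy data: three genes, two samples; genes 0 and 1 interact in the (hypothetical) PPI network.
X = [[1.0, 1.0, 0.0], [-1.0, -1.0, 0.0]]
y = [1, -1]
edges = [(0, 1)]

w_smooth = [0.5, 0.5, 0.0]  # equal weight on the two interacting genes
w_rough = [1.0, 0.0, 0.0]   # all weight on a single gene

obj_smooth = objective(w_smooth, 0.0, X, y, edges, lam=0.5)
obj_rough = objective(w_rough, 0.0, X, y, edges, lam=0.5)
```

Both weight vectors classify the toy samples with zero hinge loss, but the penalty favors the one that spreads weight across the interacting genes. Under this assumed formulation, that is how connected genes, including hubs with modest expression changes, could be selected together.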
PMCID: PMC3214162  PMID: 21992556
