|Home | About | Journals | Submit | Contact Us | Français|
Biological systems can be modeled as networks of interacting components across multiple scales. A central problem in computational systems biology is to identify those critical components and the rules that define their interactions and give rise to the emergent behavior of a host response. In this paper we will discuss two fundamental problems related to the construction of transcription factor networks and the identification of networks of functional modules describing disease progression. We focus on inflammation as a key physiological response of clinical and translational importance.
Almost 40 years ago a pioneering symposium was held at Case Western University to assess past developments and future potential of systems approaches in biology. Eloquently Mesarovic presents two important roles systems theory could play in biology: (i) to develop general systems models that can be used as “the first step toward arriving at a more detailed representation of the biological system”, and (ii) to provide “a basis for communication between different fields since the formal concepts of behavior (adaptation, evolution, robustness etc.) are defined in a precise manner and in setting of minimal mathematical structure reflecting, therefore, the minimal degree of special features of the real-life system from which the formal concept has been abstracted” (Mesarovic 1968).
Since then, Systems Biology, loosely defined as the systematic study of complex interactions in biological systems, has emerged as a new and exciting discipline (Kitano 2002; Kitano 2002). The Chemical Engineering community in general, and the Process Systems Engineering group in particular, have made significant contributions by proposing innovative use of ideas, theories, algorithms and tools developed over the years for the analysis of complex process systems to biological systems. The contributions from the research groups of Professors Doyle, Floudas, Hatzimanikatis, Henson, Ierapetritou, Kaznessis, Mantzaris, Maranas, Pallson, Parker, Sahinidis, Stephanopoulos are too numerous to list individually here. This list is by no means complete, and we wish to apologize to those who were unintentionally left out. However, the main point we wish to convey is that systems engineering, be it through either modeling or optimization, has contributed to, both, the development of novel algorithms as well as advancing our fundamental understanding of biological systems.
Central to the analysis of biological systems, and to the work of the researchers mentioned earlier, is the concept of the “network” defined as an interconnected group of systems (Barabasi and Oltvai 2004). Networks are, potentially, characterized by a critical property characteristic of complexity: emergence. In the context of a biological system the implication is that the macroscopic response (phenotype) of a system is the result of propagating information, in the form of disturbances, across an intricate web of interacting modules (Vazquez, Dobrin et al. 2004). However, in biological systems a form of “nested complexity” emerges where networks of interactions form a complexity pyramid (Oltvai and Barabasi 2002). At the lowest level, molecular components of a cell, such as genes, RNA, proteins, and metabolites, are interacting. The interactions define elementary building blocks organized into pathways and regulatory motifs, which in turn are integrated, through appropriate interactions, intro interacting modules that eventually give rise to an organism’s response. The emergent behavior of a biological system, whether it relates to the control of the expression of a single gene (Babu, Luscombe et al. 2004) or the manifestation of a disease (Calvano, Xiao et al. 2005) is the result of the coordinated action of network elements. As such deciphering the connectivity and the dynamics of emerging network architectures becomes a critical question in the analysis of biological systems.
In this paper we will discuss systems-based approaches that aim at exploring the emergence of interaction networks at the (low) level of interaction of transcription factors and the (high) level of interacting signaling and regulation components that give rise to an overall systemic response. We focus our analysis on a critical physiological response, namely, inflammation.
Biological systems dynamically adapt and evolve driven by an intricate machinery that integrates external signals and activates internal mechanisms through a complex web of interacting transcription factors. Cellular systems can therefore be considered as non-linear dynamical systems that exhibit emerging behaviors and are guided by, yet to be determined, regualtory mechanisms (Huang, Eichler et al. 2005). Quantification of these mechanisms will provide a major impetus to scientific research as it would provide a rational basis for designing and optimizing desired responses. Reverse engineering regulatory networks based on high-throughput gene expression measurements is an active area of research (Liang, Fuhrman et al. 1998; D’Haeseleer, Liang et al. 2000; Chua, Robinson et al. 2004; Timothy S. Gardner 2005). However, despite the advent of large-scale transcriptional studies, single perturbation studies at low temporal resolution measurements do not reveal the inherent complexity of cellular dynamics. This is due to our limited ability to understand and control the internal transcriptional machinery directing cells toward target phenotypes (Levine and Davidson 2005). A novel and unique microdevice, the Living Cell Array (LCA) has been proposed to overcome some of these difficulties (Wieder, King et al. 2005). The Living Cell Array is a micro-fluidics device which utilizes cells transfected with artificially constructed reporter plasmids (King, Wang et al. 2007). These reporter plasmids consist of a minimal promoter and 4 repeats of a transcription factor’s consensus sequence as identified via the TRANSFAC database(Matys, Fricke et al. 2003), and an unstable green fluorescent protein (GFP) constructed. Therefore, the activation of a given transcription factor is correlated with the fluorescence of the given cell. In this experimental context, the activation of a given transcription factor (TF) is performed by utilizing a soluble factor which is known to activate that transcription factor. A detailed analysis of a number of critical inflammation specific transcription factors was therefore performed and the activity of the corresponding transcription factor recorded at high temporal resolution (Thompson, King et al. 2004).
Given the artificial construction of the reporter genes, the direct effects of a given activator/transcription factor is clear. It is less clear however what the effects of indirect activation are. Under all of the different activation conditions, all of the reporter genes appear to be activated to a certain extent. The primary question is therefore, what the indirect links are. It may be possible to isolate transcription factors which are tightly coupled, where the activation of one transcription factor causes the activation of a second transcription factor, or which are complementary i.e. the activation of one system can be accomplished via the activation of any one in a set of transcription factors. This essentially allows for the identification of the mechanism behind the crosstalk and addresses issues such as why blocking a specific regulator does not always lead to the blocking of a given cellular response. By construction, the LCA monitors changes in activities of transcription factors in response to constant infusion of soluble signals. By recording the expression of known gene targets of these TFs one can effectively evaluate the actual activity of the corresponding TF. The goal of the analysis would be to identify potential links in activity of TFs and eventually establish a network of interaction based on these known responses.
The promoter regions for these genes, i.e., the location where TFs bind, were constructed in such a manner where the direct activation of a given transcription factor will occur by its corresponding soluble factor simulus, Table 1. However, in spite of this design, it was found that there was significant cross talk, for instance the activation of the reporter gene for STAT3 is activated by by its specific soluble factor IL6, but also by TNF-alpha. The hypothesis being explored is that the nonspecific activation of the reporter gene can occur via a secondary mechanism, i.e. the activation of a given transcription factor may be due to the upstream regulation of another transcription factor. It is hypothesized that if there is this clear link between the activities of two transcription factors, then there should be evidence of co-expression indicating a significant link. Therefore, if the reporter gene is highly co-expressed over a range of different conditions, then it would suggest that there is a definite link between the two transcription factors in terms of their activation.
However, while the activities of the transcription factors may be co-expressed under many conditions, it is also hypothesized that they will not be co-expressed under all conditions; otherwise the two transcription factors would essentially be redundant. While the raw data of the LCA dataset represents a three dimensional dataset with genes, conditions, and time representing the three separate dimensions, the fact that we are looking for co-expressed transcription factor activity means that the problem can be simplified into two dimension. This is done by clustering the temporal profiles under each condition such that transcription factors with similar activities are assigned to the same integer or cluster. After this step has been performed, in identifying transcription factors that are correlated under some, but not all conditions, the next step is to identify the factors that are co-expressed under a maximal number of conditions.
The need to identify transcription factors that are co-expressed under a maximal number of conditions leads to an interesting problem defined as bi-clustering. In bi-clustering, we are as interested in identifying related conditions as we are in identifying related genes, or transcription factors. However, one of the issues associated with bi-clustering is the fact that the problem itself has been determined to be NP-Hard(Cheng and Church 2000). Thus, due to the computational complexity of the problem, most of the algorithms that have been developed to solve bi-clustering are either limited due to constraints which they impose upon the solution(Madeira and Oliveira 2005; Prelic, Bleuler et al. 2006), or use heuristics which do not arrive at globally optimal solutions(Cheng and Church 2000; Yoon, Nardini et al. 2005).
Aside from the use of heuristics that do not yields globally optimal solutions, the most common constraint that is normally imposed by the different methods is either the lack of overlapping bi-clusters(Cheng and Church 2000), or in the cases where they are allowed, to limit the structure of the overlaps(Kluger, Basri et al. 2003). Without overlapping bi-clusters, the resultant solution essentially returns a set of independent cliques which runs contrary to the notion that biological networks are highly interconnected(Zhu, Gerstein et al. 2007). What is needed is a method to isolate not only a single bi-cluster, but also to isolate them in such a manner, such that arbitrarily overlapping biclusters can be identified. The problem that has bedeviled the isolation of arbitrarily overlapping bi-clusters where one must remove redundant overlapping biclusters. Furthermore, finding all over-lapping biclusters requires a method that can efficiently solve the NP-hard problem rather than the reliance upon heuristics.
To address this question, one would need to evaluate all partially overlapping biclusters in a rigorous and consistent way and subsequently combine the results to form a network. We have recently reported a biclustering algorithm (Yang, Foteinou et al. 2007) which addresses this specific problem by making use of a mixed-integer linear optimization as described in (1)
In short, at every iteration one seeks to identify the maximum number of conditions (k) to which N TFs can be assigned, i.e., the problem is solved parametrically in N. Because we deal with long time-series, representing TF activities over time for each perturbation experiment, we proposed in (Yang, Foteinou et al. 2007) the symbolic transformation of each time course and the assignment to it of a unique identified. The D(i,k) denotes the “experimental data” in that it denotes the symbol that has been assigned to each TF profile activity. Thus (1a) defines the objective maximizing the number of conditions, (1b) sets the number of TF assigned to each bicluster, (1c–d) search for TF that share similar profile under different conditions.
Once a bi-cluster is identified, the problem is resolved for the same number of conditions with the inclusion of appropriate cuts that exclude bi-clusters which are subsets of previous ones. This condition is modeled as in (1e), see Figure 1. As earlier mentioned, while the formulation in (1) is able to obtain a set of bi-clusters for a given number of conditions, it can be expanded to find all bi-clusters by solving the problem parametrically. We have demonstrated (Yang, Foteinou et al. 2007) that this approach was able to identify a complete set of direct and indirect interactions which formed the basis for creating a direct graph of interacting TFs, Figure 2. From the bi-clustering result and the associated bipartite network, various interesting interactions had been found. For instance, it was found that while HSE did not have a specific activator under the experimental conditions, it showed significant co-expression and activation from a variety of different activators such as TNF-α and Dexamethasone (Dex). The activation of the Heat Shock Element normally occurs in temperature above 35 degrees, and yet it was activated under the administrations of Dexamethasone, IL-6, and Interferon Gamma. While, HSE was not directly stimulated in the experiment, phenomenon such asthe possible transduction of the HSE by Interferon Gamma have been previously identified (Saile, Eisenbach et al. 2004).. On the other hand, the activation of the system by Dexamethasone may be more of an artifact of the poor data quality. This may be due primarily to the down-regulatory effects associated with Dexamethasone upon most mediators of inflammation in which the manifest of repression upon a baseline of no activation shows primarily the effects of noise. Thus, the correlation of Dexamethasone may be an artifact of the data.
The primary salient characteristic of Figure 2 is the presence of loops such as those that involve IL6 IFN-γ, and IFN-γ and Dexamethasone. The presence of these loops gives a possible mechanism by which both IFN-γ and Dexamethasone are responsible for changing the way an organism responds to inflammatory cytokines, as well as suggesting that there may be a mechanism for involving a tolerance phenomenon. This effect may be mediated through the transcription of the glucocorticosteroid receptor or the Interferon Gamma receptor which is present in the cell(Sanceau, Merlin et al. 1992; Rakasz, Gal et al. 1993). Other identified feedback loops such as those that involve IL6→TNF-α (Moeniralam, Bemelman et al. 1997), glucocorticosteorids→IL6 (Barber, Coyle et al. 1993; Takeda, Kurachi et al. 1998), and IL6→IFN-γ (McLoughlin, Witowski et al. 2003) are evident in Figure 2. We make the additional hypothesis that the feedback loop IL6→TNF-α is mediated through the activity of IFN-γ which has not been directly established. However, it has been established that IFN-γ illustrates non-trivial effects on STAT3 and TNF-α (Raponi, Ghezzi et al. 1997; Kaur, Kim et al. 2003) making it a possible candidate as the hub which mediates feedback activity. This hypothesis shows that the value of the LCA/Biclustering lies not only in the validation of previously identified links, but also as a method for generating new testable hypotheses. Additionally, we assert that a bi-clustering algorithm which was both globally optimal as well allowing for the arbitrary overlapping of bi-clusters is necessary.
A widely used assumption is that the dynamics of TF networks can be approximated as a system of ordinary differential equations with constant coefficients(Gardner, di Bernardo et al. 2003; Dasika, Gupta et al. 2004; Guthke, Moller et al. 2005). While there are undoubtedly significant nonlinear effects present within biological system, given the lack of sufficient conditions in the experimental data, this approximation is used to keep the problem well defined. However, many times, this simple assumption still yields an ill-posed problem in that there are more variables than equations. In response to this, NIR constrains the number of connections possible for each gene or transcription factor to the number of conditions measured(Gardner, di Bernardo et al. 2003), whereas the method proposed by Guthke et al. utilizes Singular Value Decomposition (SVD) to reduce the number of genes whose profiles need to be reconstructed. Examining these techniques it is clear that an ideal technique would require the ability to assess the system under different experimental stimulations as well with high temporal resolution. Fortunately, the LCA is able to satisfy both of these constraints.
The key advantage of utilizing the LCA is that for each transcription factor measured, it is reasonably straightforward to add one or more conditions such that the system is fully defined. This is because each condition represents the stimulation of the system with a stimulatory soluble factor. It is important to note that in both our formulation as well as the experimental system, multiple combinations of soluble factors can be utilized as separate conditions. Therefore, we do not need to make any simplifying assumptions as to the complexity of the network. Furthermore, given the time resolution, the derivative of each transcription factor’s activity level can be accurately estimated via smoothing splines (Rice and Rosenblatt 1983). However, in addition to eliminating the need to place constraints upon the overall complexity of the system, the high temporal resolution allows us to incorporate additional complexities into the system. Specifically, it allows us to consider the possibility that the interaction strengths between two different transcription factors may change over time to reflect other factors that can alter transcription factor activity aside from concentration. These factors may involve mechanisms such as Michalis-Menten interactions associated with binding events(Hemberg and Barahona 2007). Because the input stimulus into the Living Cell Array corresponds to a constant infusion of an activating soluble factor, we hypothesize that the response of various simple mechanisms within the cell will have clearly recognizable features shown below.
The overall model associated with this hypothesis is given in (3):
where D represents the data, D′ represents the calculated derivative of the data after smoothing, A represents the time varying interaction strength between the different factors (to be determined), β represents a matrix which indicates which transcription factors a given soluble factor influences, and s represents the strength of this influence. The index variables i, i′ represents the response of a given factor, j represents the individual condition being modeled, and t represents time.
However, because we not only have to establish what the time varying interaction strengths of the connectivity matrix A are, we also have to establish the underlying connectivity structure, we have formulated the deconvolution of the dynamics as a mixed-integer linear optimization problem. This provides the flexibility for us to either incorporate the network architecture obtained from the previous step, or to identify a structure of a given complexity with the minimal reconstruction error. The formulation is depicted in Eq (4).
Effectively, we minimize appropriate slack variables denoting the deviation of the theoretical model of (3) from the experimental data. Given that the interaction coefficient A is a function of time, we capture the details of the temporal dynamics. The remaining of the constraints established the validity of the networks. As such (4d–e) establish that if a connection is not present the interaction strength between factors i and i′ should be 0, whereas (4f–g) establish that each TF interacts with at least one factor. The latter constraints can easily be relaxed. The final constraint controls the expected complexity of the model by setting a limit on the destined number of interactions. More details are discussed in (Yang, Yarmush et al. 2009)
Utilizing the following formulation, it is possible for us to reconstruct the dynamics associated with the connectivity structure solved in the prior bi-clustering step. The identified dynamics yield some interesting insights as to the overall mechanisms that work in conjunction with the architecture. In this result, the interactions of AP1 were discounted because the factor did not appear to be connected under the solution which were obtained via the bi-clustering formulation Figure 3. Combining this numerical result it is possible to draw interesting hypotheses how the different transcription factors interact. We predict that the response of NFkB to external stimulation appears to have a significant lag event perhaps due to a rate limiting dimerization event, the loss of GRE activity over time points to a tolerance mechanism coupled with the clear down-regulation of NFkB by GRE, and possible oscillatory effects associated with ISRE may be due to its central role in the feedback loop. One of the most interesting observations from these dynamics is that most of the transcription factors appear to exhibit a significant level of tolerance under constant stimulation suggesting in all of our cases we are observing evidence of receptor saturation in response to continued stimuli. Additionally, the fact that such dynamics were visible even in such a small case suggest that the same framework would yield useful insights in a more comprehensive system in which all of the interacting transcription factors were measured.
While there are differences in the connectivity structure as well the solved dynamics, there are some interesting similarities. For instance, the responses of NFkB to direct stimulation as well as the response of GRE to direct stimulation contain very similar profiles. Furthermore, while some of the dynamics appear different such as the response of NFkB to the stimulation of GRE, they differ in an interesting and coherent manner. In the bi-clustering formulation which does not consider AP1 as part of the network, the response of NFkB to GRE stimulation appears to be a mirror of GRE under direct stimulation. However, when we consider the response of AP1, this response changes from a mirror of GRE stimulation to one that suggests the presence of an additional feedback component. This suggests to us that while the glucocorticosteroid receptor can be shown to stimulate NFkB, there is an element in the dynamics which AP1 may play a significant role. Therefore, it is hypothesized that AP1 may play a role in regulating NFkB’s response to GRE stimulation.
In this work, we have demonstrated how to identify and quantify network interactions establishing critical relations between transcription factors. Such models will enable the characterization of the gene expression process and therefore establish important network interactions at the level of the cell. We will next consider network interactions at the level of the host.
Bacterial infection, trauma, surgery and biological stresses in general, induce an acute inflammatory response, characterized by a cascade of events during which multiple cell types are deployed in order to locate pathogens, recruit other cells and eventually eliminate the offenders and restore homeostasis. Under normal circumstances, the inflammatory response is activated and once the pathogens are cleared, reparative processes begin and the response then abates (Laroux 2004). However, in some cases anti-inflammatory processes fail, and an amplified runaway inflammation turns what is normally a beneficial reparative process into a detrimental physiological state characterized by systemic inflammation. The hypermetabolic state is characterized by significant alterations in the utilization of amino acids, glucose and fatty acids, leading to increased resting energy expenditure, a negative nitrogen balance, hyperglycemia and hyperlactatemia. This results in net muscle protein catabolism with extensive amino acid deamination and oxidation, as well as “futile cycling” of substrates such as glucose and fatty acids (Demling and Seigne 2000).
Depending on the severity of the injury and success of the treatment, hypermetabolism and other changes associated with the systemic inflammatory response can progress to multiple organ dysfunction syndrome and sepsis, characterized by significant morbidity and mortality rates. Despite the growing understanding regarding the cellular and molecular mechanisms of systemic inflammatory response syndrome (Tetta, Fonsato et al. 2005) and the numerous animal and human studies that have been undertaken, not many effective therapies exist and only a few drugs are known to reduce mortality compared with controls in clinical trials, albeit at low rates (Bernard, Vincent et al. 2001) and the complexity of the response has made therapeutic strategies elusive(Klaitman and Almog 2003; Riedemann, Guo et al. 2003; Kerschen, Fernandez et al. 2007). Among these, anti-inflammatory corticosteroid-based therapies have seen a recent resurgence in intensive care units (Arzt, Sauer et al. 1994; Meduri, Headley et al. 1998; Annane, Sebille et al. 2002; Annane, Bellissant et al. 2004; Annane, Bellissant et al. 2004). The difficulty in altering the clinical course of critical patients by targeting molecular mediators and therapeutic targets is well known (Marshall 2003). However, the intricacies in translating basic research to clinical practice is recognized as a challenge that needs to be overcome in order to successfully transfer the information from the pre-clinical to the clinical stage (Marshall 2005; Marshall, Deitch et al. 2005). As a result there is a growing interest in deciphering the complexities of the disease in an effort to control and eventually eliminate its detrimental implications. In order to address these pressing issues we have undertaken an integrated approach that aims at approaching the problem from different angles and at different scales. The unifying hypothesis is that the observed response is the outcome of the orchestrated interactions of critical modules in the form a network.
Inflammation is known to be controlled at the gene expression level (Saklatvala, Dean et al. 2003). Establishing and analyzing the complex networks of transcription factors that regulate the expression of inflammatory genes is of critical importance. Thus controlling, i.e., suppressing the expression of inflammatory genes has been identified as a promising therapy and can lead to the development of novel anti-inflammatory drugs (Barnes 2006). A critical enabler in that respect would be to identify the role of the putative networks of regulators of the expression of inflammatory genes.
The nature of the response has lead researchers to the realization that mathematical models of inflammation might provide rational leads for the development of strategies that promote the resolution of the response and the eventual establishment of homeostasis (Seely and Christou 2000). The modeling approaches fall broadly in two categories: those based on explicit dynamic (Day, Rubin et al. 2006; Reynolds, Rubin et al. 2006) and agent based models (An 2004). The potential for studying such complex phenomena in a model-based manner opens the possibility for generating and exploring simultaneously multiple hypotheses for deciphering complex modes of action and the possibility for proposing combination therapies. However, it is rather questionable whether isolated elements of the response best characterize complex responses. Therefore, these approaches require the careful answer to two critical questions: (a) what constitutes an underlying dynamic response, and (b) what is an appropriate inflammation model. Therefore, the host inflammatory response can be considered as the emergence response of a network of interacting elementary response elements (signaling and regulatory).
The resurgence of methods that enable the analysis of such questions is largely enabled by the tremendous advances in monitoring changes at the cellular level driven primarily by developments in measuring gene expression at gene-wide scale. With the technology maturing, what started as a an attempt to classify patterns (Golub, Slonim et al. 1999) of gene expression has evolved into sophisticated analyses providing semi-mechanistic pharmacogenomic models (Jin, Almon et al. 2003).
Recently, there is a growing interest in reducing the complexity of the inflammatory response into a set of key components that are considered to play a critical biological role in the dynamics of the host response when exposed to various stressors such as infection, trauma, hemorrhage shock e.t.c. (Chow, Clermont et al. 2005; Lagoa, Bartels et al. 2006). Thus there is emphasis on reducing the complexity of the models of inflammatory response by identifying a limited number of time-dependent interactions of key elements that are highly sensitive to specific modes of initiation and modulation of the response. Such an approach is necessary in system-level disease processes, like sepsis (Vodovotz, Clermont et al. 2007). A number of excellent prior studies (Kumar, Clermont et al. 2004; Chow, Clermont et al. 2005; Day, Rubin et al. 2006; Reynolds, Rubin et al. 2006; Vodovotz, Chow et al. 2006) have placed significant emphasis on simulating inflammation based on the kinetics of well accepted constituents of the acute inflammatory response. One of the key features of these models is the a priori postulation of certain components that are consistent with biological knowledge to play a major role in triggering the inflammatory response; thus, their computational integration can provide us with significant insight of how such components behave over time empowering their translational application as predictive controls in clinical settings.
However, one of the big challenges is the systematic identification of such representative biological features that can sufficiently represent the complex dynamics of a system. As such, a critical question which emerges is whether we can we identify a representative set of intrinsic responses that emerge from the dynamic evolution of the inflammatory response. This requires the decomposition of the non-linear dynamics of inflammation into an elementary set that can serve as the surrogate for predicting the collective behavior of the system. A possible answer to this issue can be identified through the analysis of gene expression data which aim at monitoring the dynamics of the host response to an inflammatory agent. Therefore, given a high-throughput assay e.g. DNA microarrays, we are interested in extracting the essential transcriptional dynamics of an endotoxin induced inflammatory response and furthemore building an in silico model of inflammation which integrates this reduced set of the essential elements in order to predict the behavior of the entire system through the interplay of its constituent elements. Decomposing the intrinsic dynamics of the entire system into a reduced set of responses enables us to both project and understand the complex dynamics of the system by studying the properties of its essential dynamic parts. Given that the activation of the innate immune system in response to an inflammatory stimulus involves the interaction between the extracellular signals with crucial signaling receptor that drives downstream a signal transduction cascade that leads to a transcriptional effect, we explore the development of an in silico model that aims at coupling extracellular signals with the essential transcriptional responses through a receptor mediated response model.
The data analyzed in this section was generated by the Inflammation and Host Response to Injury Large Scale Collaborative Project funded by the USPHS, U54 GM621119 (Calvano, Xiao et al. 2005; Cobb, Mindrinos et al. 2005). Human subjects were injected intravenously with either endotoxin (CC-RE, lot 2) at a dose of 2-ng/kg body weight or 0.9% sodium chloride (placebo treated subjects). Blood samples were collected before endotoxin infusion (0hr) and 2, 4, 6, 9 and 24 hours after injection as well as for the placebo treated subjects. Cellular RNA was isolated from the leukocyte pellets and a total of 44,924 probe sets on the Hu133A and Hu133B arrays were hybridized and analyzed thus generating the expression measurements of thousands of genes that are activated/or repressed in response to endotoxin.
The administration of a low-dose of endotoxin (LPS) to human subjects elicits the complex dynamics of a transcriptional response altering the expression level of numerous genes. We are interested in unraveling a critical set of “informative” temporal responses that are characterized as the “blueprints” of the orchestrated dynamics of the perturbed biological system. In doing so we hypothesize that there is a definite underlying mechanism that describes the emerging dynamic inflammatory response and capturing the essential inflammatory responses might serve as surrogates for the dynamic evolution of the host response due to endotoxin stimulus. Based on our prior work, we first apply a micro-clustering approach, which is based on a symbolic transformation of time series data which assigns a unique integer identifier (hash value) to each expression motif (Yang, Maguire et al. 2007). Having assigned the temporal expression profiles to distinct motifs, the next task is to select expression motifs that would appear to be highly non-random. Having identified the statistically significant expression motifs from the initial large set of micro-clusters we need to identify a discriminating set of critical temporal shapes that best characterizes the intrinsic dynamic response of the system. Due to global nature of the transcriptional measurements and the fact that we do not a priori select a limited set of responsive genes, the entirety of the transcriptional response is expected to exhibited a rather Gaussian type of response with not clear defining responses (Vemula, Berthiaume et al. 2004). We define the TS of the system as the overall distribution of expression values at a specific time point aiming by quantifying the deviation of the system at each time point versus a baseline distribution (t=0hr) applying a Kolmogorov – Smirnov test (Lampariello 2000). Given the aforementioned metric we are interested in identifying the minimum number of expression motifs which characterize the maximum deviation of the Transcriptional State of the system. This selection problem defines is a combinatorial optimization problem for which we apply a stochastic optimization algorithm, based on simulated annealing (Kirkpatrick, Gelatt et al. 1983).
We identify three critical expression motifs enriched in critical and relevant biological pathways: (i) Early up-regulation response (Pro-inflammatory component, P). Genes in this major temporal class are important in Cytokine – Cytokine receptor interaction as well as in Toll like receptor signaling pathway crucial in activating transcription factors that act synergistically with proinflammatory transcription factors such as members of NFkB/RelA family; (ii) Late up-regulation response (Anti – inflammatory component, A): Genes in this functional class participate in the JAK-STAT cascade which is essential to regulate the expression of target genes that counter - react the inflammatory response. In addition to this, it is emphasized (Murray 2007) that a STAT pathway from a receptor signaling system is a major determinant of key regulatory systems including feedback loops such as SOCS induction which subsequently suppresses the early induced cytokine signaling and essential activators for IL10 signaling (Brightbill, Plevy et al. 2000). Moreover, we identified the late increased expression of IL10RB which is assumed to be indicative of the IL10 signaling cascade; and (iii) Down-regulation response (Energetic component, E): The down-regulated essential response is characterized by a set of genes, which are mainly involved in the cellular bio-energetic processes. In addition to this, a large set of genes, which are essential to Ribosome biogenesis and assembly (RPL/RPS family) are repressed coupled with those genes, which participate in protein synthesis machinery, Oxidative phosphorylation and Pyruvate metabolism. Endotoxin induced inflammation causes the dysregulation of leukocyte bioenergetics and persistent decrease in mitochondrial activity leads to reduced cellular metabolism with subsequent decline in organ function (Singer, De Santis et al. 2004). A restoration of organ function should be associated with an increase in bioenergetics and metabolic activity (Brealey, Brand et al. 2002) and we are assuming that a persistent shut down of these genes might lead to multiple organ dysfunction. These transcriptional responses effectively decompose the overall dynamic and present the constitutive elements of the overall response. They correspond to the cellular signatures in response to LPS administration and manifest the integrated systemic response. Notionally, Figure 4 depicts the essential hypothetical elements of the transcriptional response induced upon recognition of LPS. Appropriate receptors (Toll-like 4, TLR) recognize the pathogen and as a result an intricate cascade of events is initiated which activates appropriate signaling cascades converging in the activation of transcription factors, such a Nuclear Factor kB (NFkB), which modulate the expression of a large family of inflammation-specific transcription factors.
As an external stimulus LPS interacts with its signaling receptor (TLR4) to induce a signal transduction cascade that will ultimately trigger essential signaling modules for the activation of pro–inflammatory transcription factors. Such a transcriptional effect can be modeled applying the basic principle of an Indirect Response Model (IDR) (Krzyzanski and Jusko 1997). The inflammatory stimulus (LPS) is described by a non-linear logistic based function with growth rate klps, 1 and an elimination rate klps, 2 (Zwietering, Jongenburger et al. 1990). In human subjects the endotoxin is cleared within the first 2 hrs of post – LPS administration (Greisman, Hornick et al. 1969). The dynamic profile of the mRNA, R is modeled using an indirect response differential equation characterized by a production rate (Kin,mRNA,R) and a degradation rate (Kout,mRNA,R). The measured mRNA, R is characterized by an up-regulation for the first 4 hrs post-LPS administration and it returns to baseline. As a result the two parameters are estimated so that we can best fit the available mRNA, R. The surface free receptor (TLR4) is characterized by the kinetic parameters k1 and k2 that are associated with the binding interaction between the receptor and the ligand (LPS). These parameters are fixed based on literature values (Shin, Lee et al. 2007) so that to correspond to a low value of disassociation constant (KD); however, we do not have available data associated with the rate of translation ksyn of the mRNA, R to the corresponding surface protein that describes the dynamic evolution of synthesis of new receptors; hence this parameter is estimated so that the dynamic profile of the surface free receptor to be qualitatively a down- regulated one; based on the premise that under the inflammatory stimulus the surface free receptors are occupied. The equilibrium (LPSR) complex is characterized by the binding parameters k1 and k2 as well as by the k3 parameter that shows the rate of formation of the activated signaling DR*. The activated signaling complex (DR*) is proportional to the formed equilibrium complex with a rate k3 and it decays with rate k4,. Moreover, we are assuming that the essential anti – inflammatory component will indirectly regulate the activated intracellular signaling incorporating such a negative effect with a feedback to the production rate of DR*. In addition to this, the non-linear Hill-type of function serves the purpose of a bistable behavior of the system (Xiong and Ferrell 2003). Such a bistability is essential characteristic of the non – linear dynamics of inflammation as it is suggested from various animal studies that an increase in the dose of the inflammatory stimulus can be responsible for an overwhelming inflammatory response. Mathematically such a switch in the stable state of the system can be achieved using positive feedback loops (Tyson, Chen et al. 2003). What is more, the functional form of the activated signaling (DR*) allows us to model an improper (uncontrolled) TLR4 signaling even though the inflammatory stimulus (LPS) has been completely eliminated from the system (Feterowski, Emmanuilidis et al. 2003). At the transcriptional response level the convoluted activated signal (DR*) indirectly stimulates the production rate of the essential pro-inflammatory response (P) which quantitatively is expressed by the linear function (HP, DR*). We are also assuming that the energetic response variable will be responsible for more inflammation (HP, E) (Protti and Singer 2007). The essential anti-inflammatory signaling component is assumed to inhibit the production rate of the pro-inflammatory transcriptional signature. The essential anti-inflammatory signal (A) is stimulated by the activated pro-inflammatory response (HA, P) as well as by the other inflammatory component which is the energetic response (HA, E) and it decays with rate Kout,A. The energetic variable (E) is stimulated by the pro-inflammatory response (P) and we are also assuming that the crucial anti-inflammatory component (A) counter-regulates both the inflammatory components which are the pro-inflammation and the energetic response of the system. We recently demonstrated (Foteinou, Calvano et al. 2007) that this indirect model properly captures the onset and resolution of inflammation but it also predicts a number of responses outside the range of parameter estimation. This justifies the fundamental assumptions that established the functional relationships between the individual components that define the architecture and interactions of the constitutive disease model.
One of the key assumptions underpinning our modeling effort is that intracellular signaling cascades activating inflammation-specific transcriptional responses can be mathematically approximated by the lumped variable DR*. In order to introduce a finer level of detail in our computational model of inflammation we wish to deconvolute and interpret mechanistically the combined signal DR*. In the original model, DR* represent the event activating the transcription of the proinflammatory response (P) which in turn initiates the inflammatory response. As such, DR* is the signal activating, i.e., transcriptionally regulating, the expression of the pro-inflammatory genes. Thus, the mechanistic equivalent of DR* would be the signaling cascade that activates pro-inflammatory transcription factors controlling the expression of the pro-inflammatory genes. Although a large family of transcription factors is known to be involved in inflammation, we focus on a particular family, NFkB, for two reasons. First, the nuclear factor kB family is known to be a major player in the inflammatory response (Saklatvala, Dean et al. 2003) and as such it has been widely studied as a major contributor. Second, the fact the NFkB plays an important role has led to the development of numerous, independent, modeling approaches in order to quantify the expected response of its signaling cascade (Hoffmann, Levchenko et al. 2002). Therefore, we introduce the NFkB signal transduction cascade as the prototypical module for initiating and controlling the expression of pro-inflammatory genes. Numerous signaling molecules and reactions participate in the NFkB signaling pathway (Hoffmann, Levchenko et al. 2002). However, sensitivity analysis (Ihekwaba, Broomhead et al. 2004) demonstrated that the activity of NFkB is maximally modulated by a reduced set of basis signaling molecules (IKK, IKBa and NFkB). As such (Krishna, Jensen et al. 2006) proposed a minimal model of NFkB that accounts for the propensity of oscillations in the dynamic behavior of NF-kB activity. However, instead of simulating the kinase activity as a constant parameter and incorporating saturation degradation rates as discussed in (Krishna, Jensen et al. 2006), we propose to model IKK as a transient signal. Thus, the cellular surface complex (LPSR) induces the activation of kinase activity (IKK) with a rate k3, while being eliminated with a rate k4. As it previously state, the non-linear function of Hill-type, is an essential functional form in order to achieve a bistability response in the dynamics of the probed system (Rifkind 1967; Lehmann, Freudenberg et al. 1987; Tschaikowsky, Schmidt et al. 1998; Kerschen, Fernandez et al. 2007). In chronic inflammatory diseases several cytokines might be responsible for perpetuating and amplifying the inflammatory reaction through the critical node (IKK) (Barnes and Karin 1997). Therefore, we simulate such an interaction by the presence of a positive feedback loop in the kinetics of kinase (IKK) activity. Assuming that NFkBn serves as a percentage of its total cytoplasmic concentration the term (1-NFkBn) denotes the available free cytoplasmic concentration of NF-kB and herein the nuclear concentration (NFkBn) and nuclear activity are used interchangeably. The import rate of cytoplasmic NF-kB into the nucleus depends on the availability of its free cytoplasmic concentration (1- NFkBn) stimulated by the kinase activity (IKK). However, its degradation rate depends on the presence of its primary inhibitor (IkBa) as the latter retrieves nuclear concentrations of NFkB by forming an inactive complex in the cytoplasmic region (Carmody and Chen 2007). The dynamics of the gene transcript of IKBa (mRNA, IKBa) are characterized by a zero order production rate (Kin,IkBa) and a first order degradation rate (Kout,IkBa) which is stimulated by NFkB (Barnes and Karin 1997). The protein inhibitor IkBa, is the product of translation of its gene transcript (mRNA,IkBa) and it degrades at a rate kI,2 which is stimulated by the kinase activity (IKK). Based on the premise that IkBa forms a complex with the available cytoplasmic NF-kB mathematically we expressed is as the product (1-NFkBn) IkBa. From the modeling point of view, in order to achieve a zero steady state for the protein inhibitor IkBa we need the additional negative term –kI,1. Moreover, at the transcriptional response level, instead of assuming the active signaling complex, DR* manifests the effect of LPS on the cellular response level, we assume that the nuclear activity of NF-kB (NFkBn) serves as the “active signal” that indirectly stimulates the production rate of the essential pro-inflammatory response (P).
In our model, the activation of NF-kB signaling module serves as the representative signaling controller of the pro-inflammatory genetic switch underpinning the manifestation of transcriptional responses. An inadequate control of its transcriptional activity is associated with the culmination of a hyperinflammatory response making it a desired therapeutic target. Anti-inflammatory drugs such as corticosteroids play a critical role in modulating the progression of inflammation and significant prior research efforts have attempted to elucidate the mechanisms driving corticosteroid activity (Jusko 1994; DuBois, Xu et al. 1995; Xu, Sun et al. 1995; Sun, DuBois et al. 1998; Almon, DuBois et al. 2002; Almon, Dubois et al. 2005; Almon, Lai et al. 2005; Almon, DuBois et al. 2007) Such studies simulate the pharmacogenomic effect of glucocorticoids at the transcriptional level taking their mechanistic (signaling) action into account (Ramakrishnan, DuBois et al. 2002; Jusko, DuBois et al. 2005). In an attempt to demonstrate the capability of our model to generate a behavior via interacting modules, we opt to integrate the regulatory signaling information with the anti-inflammatory mechanism of corticosteroids, as putative controllers of inflammation. As such, we will explore means of modulating the activity of NFkB through the use of corticosteroids which would allow us to perform computational tests that perturb the trajectory of the non-linear inflammatory signal.
The corticosteroid intervention envelope consists of a set of elementary interactions that involve: (i) the binding of the corticosteroid drug (D) to its cytosolic receptor (GR), (ii) the subsequent formation of the drug-receptor complex (DR), (iii) the translocation of the cytosolic complex to the nucleus (DR(N)) while a portion of nuclear receptor (DR(N)) is recycled and finally (iv) the auto-regulation of the gene transcript of the glucocorticoid receptor (Rm). All the interacting components and modules that constitute the NFkB dependent physicochemical model of inflammation are shown in Figure 5 together with their quantitative representation. Corticosteroids manifest their anti-inflammatory properties by various mechanisms and due to our inability to model all the possible mediators that may be affected by steroids, we will explore their effect towards up-regulation of critical anti-inflammatory proteins including IkBa (Auphan, DiDonato et al. 1995) and IL-10 (Barber, Coyle et al. 1993).
The proposed integrated model of systemic inflammation prior to any intervention is characterized by the dynamic state of eleven (11) variables that seek to describe the propagation of LPS to the transcriptional response level incorporating biological information in the form of regulatory signaling. In our computational model the host restores homeostasis without any external perturbation and a self-limited inflammatory response involves the successful elimination of the inflammatory stimulus within the first 2hr post-endotoxin administration while followed by a subsequent resolution within 24hr, Figure 6.
Standard parameter estimation techniques are applied in order to evaluate appropriately model parameters reproducing the available experimental data. These data include transcriptional profiles of endotoxin receptor (mRNA,R), the primary NFkB inhibitor (mRNA,IkBa) as well as signatures that reflect essential biological processes such as pro-inflammation (P), anti-inflammation (A) and cellular energetics (E). The dynamic profiles of all the elements that constitute the NFkB dependent host response model are presented in Figure 6. Regarding the parameters associated with the active corticosteroid envelope, prior studies (Jin, Almon et al. 2003) provide values for these kinetics in an attempt to simulate in rat liver the effect of a single intravenous administration of corticosteroids. Driven by the premise that in our model the pharmacokinetics of the drug has not been calibrated, we opt to maintain the same values as outlined in Jin et al. However, of critical importance in mathematical modeling are validation strategies that establish a communication link between the model and the real-world process. Therefore, the appropriateness of the proposed model is assessed by performing computational tests that not only reproduce available data, but rather qualitatively predict and modulate uncontrolled responses, Figure 7.
The pre-exposure of the host to controlled levels of inflammatory agents affects the eventual fate of the response. It has been observed that repeated doses of endotoxin insult might lead to a less vigorous innate immune response, a phenomenon known as endotoxin tolerance. A “rapid” tolerance scenario can be induced when the system is pre-exposed to a low endotoxin challenge for between 3–6hr, Figure 7(A). Such preconditioning results in an attenuation of the inflammatory response characterized by a less vigorous immune response coupled with the decreased peak level of the pro-inflammatory response. Prior experimental studies (van der Poll 1996) have documented that concentrations of the particular pro-inflammatory mediator (TNF-a) were decreased profoundly ex vivo at 3hr – 6 hr after in vivo endotoxin administration; while by 24 hrs the endotoxin tolerance had completely resolved. However, the magnitude and timing of pre-exposure of repeated doses of endotoxin are key determinants for discriminating between tolerance and potentiation effects. As such, the successive administration of low doses of LPS may perturb system’s homeostasis towards the progression of an unresolved inflammatory response, Figure 7(B). In Figure 7(C) we simulate the situation in which the initial levels of endotoxin are increased as this would probably constitute the most obvious irreversible disturbance. As such we observe that when the concentration of LPS is strong enough the response does not abate. Such a computational result validates the general concept that it is the host response to endotoxin rather than the stimulus itself that yields the progression of an uncontrolled inflammatory response. Since we have demonstrated the ability of our model to simulate the trajectory of an unconstrained inflammatory response, the potential of our model is also demonstrated through systematic perturbations that intend to modulate the inflammatory response through corticosteroid based intervention strategies. For example, regardless of the implications of high LPS concentration, the pre-exposure of the system into hypercortisolemia “reprograms” the dynamics of the system in favor of a balanced immune response, Figure 7(D). In order to simulate such perturbation we assumed that the active drug signal, DR(N) favors the transcriptional synthesis of IkBa (mRNA,IkBa). However, qualitatively similar behavior is observed if the DR(N) signal potentiates humoral anti-inflammatory mechanisms such as IL-10 signaling (A model component). In particular the system is pre-exposed to corticosteroids for 6hr in a continuous infusion and the intrinsic dynamics of the system are effectively modulated towards reversibility in the progression rate. Clinically a preoperative administration of corticosteroids is further discussed for alleviating surgical stress (Sato, Koeda et al. 2002) placing emphasis on intervention strategies that target the inflammatory response at an early dynamic stage (e.g. transcriptional level). In addition to this, due to the physiological role of steroids in the immune system researchers put significant effort in understanding more about the cytokine dynamics under hypercortisolemia (Richardson, Rhyne et al. 1989; Hawes, Rock et al. 1992; Barber, Coyle et al. 1993; Barnes and Karin 1997; Bornstein and Briegel 2003; Keh, Boehnke et al. 2003). These studies have focused on elucidating the in vivo responses to endotoxin (LPS) when there is an exposure of human subjects to hypercortisolemia for various durations of time. Thus, in (Barber, Coyle et al. 1993) normal human subjects are exposed to glucocorticoid infusion concurrent with and before the endotoxin challenge. Experimental measurements of cytokines and hemodynamic parameters suggest the integral role of hypercortisolemia in modulating the cytokine network when administered few hours, e.g. 6hr, before the main endotoxin challenge. Qualitatively, our in silico results lie in general agreement with prior experimental studies thus paving the way for improving the working feedback loop between “dry” and “wet” experiments.
In this paper we discussed the potential of systems-based approaches to develop appropriate network models. We presented two such models in the context of the inflammatory response. The first was related to the development of networks of interacting transcription factors and the second model was related to the development of a multi-scale model of interacting modules of the host inflammatory response. We demonstrated how data analysis, coupled with optimization can yield significant insights and enable the generation of testable hypotheses. In concluding this short review we would like to argue that possibly a very significant, and often overlooked, success of systems-based research is that through the universal language of mathematics and the opportunity of formalizing and quantifying, albeit with significant simplifications often times, abstract concepts of complex physiological phenomena, it has managed to establish communication bridges and made it acceptable to bring together scientists from a variety of fields with a common goal: to develop a better understanding of a physiological condition. However, these efforts are just a mere beginning.
This review aimed at just revealing some of the interesting problems and identify some of the fascinating opportunities and challenges for systems research in biology and physiology. Despite numerous supportive preclinical studies, most generated hypotheses related to the management and treatment of human inflammation have failed clinical testing(Lowry and Calvano 2008). Even the improved capacity to acquire quantitative data in a clinical setting has generally failed to improve outcomes in acutely ill patients. These failures were often attributed to invoking the single variable assumption in a clinical scenario. It has been argued that prediction of the behavior of complex diseases derived from local insights may be impossible(Clermont, Bartels et al. 2004). A systems-oriented mathematical modeling approach, as a means of dynamic knowledge representation, offers a promising possibility of improving the interpretation of quantitative, patient-specific information and help to better target therapy(An 2008; An, Faeder et al. 2008; Vodovotz, Csete et al. 2008; Foteinou, Calvano et al. 2008 (accepted for publication)). However, such models are typically complex and nonlinear, which precludes the identification of unique parameters and states of the model that best represent available data(Zenker, Rubin et al. 2007). It has been argued, however, that the ill-posedness of the inverse problem in quantitative physiology is not merely a technical obstacle, but rather reflects clinical reality and, when addressed adequately in the solution process, provides a novel link between mathematically described physiological knowledge and the clinical concept of differential diagnoses. A thorough account of the state of the art in computational models of inflammation were presented at the 2007 International Conference on Complexity in the Acute Illness (ICCAI) and advances summarized in(Vodovotz, Constantine et al. 2009).
We wish to acknowledge the invaluable input of our collaborators M.L. Yarmush (Biomedical Engineering, Rutgers University & Center for Engineering in Medicine), R.R. Almon and W.J. Jusko (Biological Sciences, SUNY Buffalo), S. F. Lowry and S.E. Calvano (Department of Surgery, UMDNJ) and F. Berthiaume (Harvard Medical School). Finally, we would like to acknowledge financial support from NIH under GM082974, NSF under grant 0519563, EPA under grant GAD R 832721-010, and a Busch Biomedical Research Award.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.