Home | About | Journals | Submit | Contact Us | Français |

**|**Stat Appl Genet Mol Biol**|**PMC3215431

Formats

Article sections

Authors

Related links

Stat Appl Genet Mol Biol. 2011 January 1; 10(1): 47.

Published online 2011 October 5. doi: 10.2202/1544-6115.1727

PMCID: PMC3215431

Anthony Almudevar, University of Rochester;

Copyright ©2011 De Gruyter. All rights reserved.

This article has been cited by other articles in PMC.

Gene perturbation experiments are commonly used for the reconstruction of gene regulatory networks. Typical experimental methodology imposes persistent changes on the network. The resulting data must therefore be interpreted as a steady state from an altered gene regulatory network, rather than a direct observation of the original network. In this article an implicit modeling methodology is proposed in which the unperturbed network of interest is scored by first modeling the persistent perturbation, then predicting the steady state, which may then be compared to the observed data. This results in a many-to-one inverse problem, so a computational Bayesian approach is used to assess model uncertainty.

The methodology is first demonstrated on a number of synthetic networks. It is shown that the Bayesian approach correctly assigns high posterior probability to the network structure and steady state behavior. Further, it is demonstrated that where uncertainty of model features is indicated, the uncertainty may be accurately resolved with further perturbation experiments. The methodology is then applied to the modeling of a gene regulatory network using perturbation data from nine genes which have been shown to respond synergistically to known oncogenic mutations. A hypothetical model emerges which conforms to reported regulatory properties of these genes. Furthermore, the Bayesian methodology is shown to be consistent in the sense that multiple randomized applications of the fitting algorithm converge to an approximately common posterior density on the space of models. Such consistency is generally not feasible for algorithms which report only single models. We conclude that fully Bayesian methods, coupled with models which accurately account for experimental constraints, are a suitable tool for the inference of gene regulatory networks, in terms of accuracy, estimation of model uncertainty, and experimental design.

The reconstruction of gene regulatory networks (GRN) using gene expression data has in recent years been established as a viable and important methodology in systems biology. Typically, GRNs are represented as graphical models, in which a gene graph defines gene interactions, which are often annotated by a functional or stochastic description of the interactions (see Shmulevich and Dougherty (2007) or Koller and Friedman (2009) for recent comprehensive surveys of such models).

An important distinction can be made between static and dynamic models. The former usually take the form of connectivity graphs in which edges indicate proximity on a functional pathway. These include relevance networks and Bayesian networks. Alternatively, a dynamic model is able to predict gene state transitions in time. Although nominally a static model, Bayesian networks can be adapted to dynamic modeling by introducing a time index (termed *dynamic Bayesian networks*). These are usually fit using time course data (Sachs, Perez, Pe’er, Lauffenburger, and Nolan, 2005), but the application of this model to steady-state data has also been described in Shmulevich and Dougherty (2007). Another commonly used dynamic model is the Boolean network (BoN). Gene expression estimates are first discretized, usually (but not always) into expressed (one) and unexpressed (zero) values. For BoNs the gene graph is a directed graph annotated by a direct functional relationship between the parents and offspring. This defines an operator *A* on the product space of gene expression estimates. Despite the apparent simplicity of this model, it has proven to be a powerful tool in systems biology (Kauffman, 1969), particularly when the objective is to predict network response to alternative stimuli. A BoN is usually fit using gene perturbation experiments, in which the level of one or several selected genes is experimentally altered and the subsequent change in expression is measured for genes of interest. It should be noted that evidence has been compiled that discretizing gene expression estimates need not result in significant loss of information and may conform well to actual cell behavior (Shmulevich and Zhang, 2002, Zhou, Wang, and Dougherty, 2003). The ability to reduce the dynamic properties of a GRN to a single operator proves to be crucial to the methodologies proposed here, and dictates our choice of the BoN model.

BoNs are particularly suitable for research in cancer GRNs aiming at the identification of targets for intervention strategies downstream of oncogenic mutations. Identification of such targets is notoriously difficult and unpredictable, in part because cell regulation is inherently complex. It has also become clear that cancer progression is associated with profound changes in the genetic networks that control the functioning of the cell. The dynamic properties of the BoN permit the modeling of network structure able to predict such catastrophic alterations of gene regulatory control.

Malignant cell transformation is a highly cooperative process that requires the interaction of a few oncogenic mutations that cause substantial reorganization of many cell features (Hanahan and Weinberg, 2000). This involves synergistic modulation of downstream signaling circuitry at multiple levels of regulation, including gene expression (Lloyd, Obermuller, Staddon, Barth, McMahon, and Land, 1997, Sewing, Wiseman, Lloyd, and Land, 1997, Xia and Land, 2007, McMurray, Sampson, Compitello, Kinsey, Newman, Smith, Chen, Klebanov, Salzman, Yakovlev, and H., 2008).

Genes that show a synergistic response to the combined action of two cooperating oncogenic mutations (e.g. activated Ras and loss-of-function p53) are referred to as *cooperation response genes* (CRG) (McMurray et al., 2008). Notably, resetting the expression of such CRGs to levels found in normal parental cells by individual gene perturbation experiments revealed that a large proportion of these CRGs (14 out of 24 tested) are critical to the malignant state in murine and human cancer cells. In contrast, perturbations of genes responding in a non-synergistic manner had a similar effect in only one out of 14 cases (McMurray et al., 2008). CRGs represent a set of 95 annotated cellular genes, many of which are involved in the regulation of cell signaling, transcription, apoptosis and metabolism, and represent key control points in many facets of cancer cell behavior (McMurray et al., 2008). CRGs are thus potentially critical nodes in gene networks underlying the malignant phenotype, providing an attractive rationale to explain why several features of cancer cells emerge simultaneously out of the interaction of a few genetic lesions (Xia and Land, 2007, McMurray et al., 2008).

We introduce three statistical and modeling issues which the proposed methodology will address:

- Many experiments measure only the
*steady state*of a GRN – interpretable as the state reached after the indefinite operation of a dynamic system. A goodness-of-fit measure should therefore be based on the ability of a model to predict an observed steady state. - Many perturbations are
*persistent*– they are artificially maintained at a given level regardless of the tendency of the cells’ GRN. This implies that the observed steady state is a property of a GRN related to, but different from, the one of interest. A GRN reconstruction methodology must model this effect correctly. - For various reasons, there are likely to be a large number of putative GRNs which can be explained by the experimental data. This need not preclude useful inference as long as model uncertainty is explicitly considered. Under this condition, Bayesian methods may be used to assign confidence levels to model features of interest.

We will argue that these issues may be addressed by incorporating BoN models into a computational Bayesian approach (Besag, Green, Higdon, and Mengersen, 1995) that constructs a posterior density on the full model space, from which a simulated sample is used to form inferential statements.

Before introducing our proposed methodology, we present a brief overview of BoNs. We begin with a formal definition of BoNs, introducing relevant concepts then extend the definition to ternary networks. We conclude with an overview of BoN model inference, using current approaches to motivate our method of inference.

Formally, a BoN of order *n* is completely defined by an operator *A* : on state space = * ^{n}*. Let denote the set of all such operators. The state

Although the formal definition of a BoN does not require a specific graph, this type of model is usefully regarded as a type of graphical model because many of its important properties are naturally expressed in terms of graphs. Let *G* = (*V*, *E*) be an order *n* directed graph with vertices *V* = {1,…,*n*} and edges *E*, and let denote the set of all such graphs. An edge from *j* directed to *i* is indicated *j* → *i* *E*, and this edge implies that *j* is a *parent* of *i* and *i* is a *child* of *j*. Next, suppose for a given operator *A*, for each *i* there is a subset *S _{i}*

Therefore, there are two complementary representations of a BoN. As described above, the BoN is characterized solely in terms of operator *A*, from which the minimal graph *G* may be constructed. Alternatively, as a graphical model we are given a graph *G* which then constrains the choice of *A*. If *S _{i}* denotes the set of all parent nodes of

The ability to model dynamic characteristics of a GRN is an important feature of the BoN. We define an iterative process *x*(*t*) = *A ^{t}*(

**Definition 1** *We say x* *is a* fixed point *if x* = *A*(*x*).

**Definition 2** *A finite sequence x*(1),…,*x*(*m*) *is a* cycle *if* (*i*) *each state of the sequence is unique;* (*ii*) *x*(*t* + 1) = *A*(*x*(*t*)) *for t* = 1,…,*m;* (*iii*) *x*(1) = *A*(*x*(*m*)).

A cycle of length 1 is a fixed point. By definition, an attractor is equivalent to a cycle, and we denote the set of all attractors by . A state which is not a member of any attractor is called a *transient* state. Any process either remains entirely in a single attractor, or passes through a finite number of transient states, then remains indefinitely in a single attractor. For an attractor *F*, the *basin of attraction* is defined as the set of states *B*(*F*) from which a process can reach *F*. Conversely, a process beginning in state *x* can reach only one attractor, denoted *F*(*x*). If *x* is not transient, then *F*(*x*) identifies the attractor containing *x*.

Thus, the identification of a BoN model of a GRN permits a characterization of those network states which are transient, and the attractor states which represent the cells’ long-term behavior. An attractor may be interpreted as a meta-stable state, so that the BoN is able to model transitions between such states. This in turn leads to the identification of those perturbations capable of altering cell behavior. Evidence for the existence of attractors in actual GRNs that can be modeled by BoNs is reported in Huang and Ingber (2000).

The definition of a BoN can be extended to accommodate multiple gene expression outcomes. One can replace outcomes = {0,1} with more than two ordered outcomes ′ = {0, 1,…,*k* − 1}, with transition functions now mapping (′)* ^{n}* to ′. There are a number of ways to extend a BoN. One approach is to quantize the gene expression levels using a

A *ternary network* is created by setting = {−1, 0, 1}. This is suitable for classifying the outcomes of perturbation experiments when gene expression profiles are available for an unperturbed control sample. Then gene expression estimates from perturbation experiments can be classified as under-expressed, invariant, or over-expressed (−1,0,1 respectively) by comparison with the control (Kim, Dougherty, Bittner, Chen, Sivakumar, Meltzer, and Trent, 2000).

Inference of BoN models of GRNs usually proceeds using gene perturbation experiments, in which expression levels of specific genes are experimentally controlled and the resulting changes in the remaining genes measured. After statistical adjustment, *q* perturbation experiments yield observed state vectors *Y* = (*y*_{1},…,*y _{q}*),

There are two potential types of perturbations – *persistent* and *transient*. A persistent perturbation, denoted *e* = (*i*, *s*), forces gene *i* to remain indefinitely in state *s*. In the case of gene perturbations, this can be accomplished by gene knockdown by stable short hairpin RNAs (shRNA) or enforced mRNA production from gene delivery vectors, depending upon the desired final set point. In contrast, a transient perturbation forces gene *i* to state *s* temporarily, after which experimental control ceases and the gene has the potential to change to a different state. In genetics, primarily persistent perturbations are used to investigate GRNs because they allow for greater stability and control of gene expression necessary to uncover phenotypes in long term biological assays.

However, it is important to note that under persistent perturbations the cells under observation are different from the cells which are ultimately of interest, since persistent perturbations alter the cells’ GRN. Define a single persistent perturbation experiment by the list of perturbations to which the cell is simultaneously subject, *E* = (*e*_{1},..,*e _{m}*). Let

A direct approach to model inference is to identify transition functions *f*_{1},…, *f _{n}* which are logically consistent with the data

In steady state experiments, it is the attractors which are observed, rather than specific Boolean functions. Therefore, it is natural to see the problem as a search for BoNs constrained by a given set of attractors. An enumerative approach to this problem was considered in Pal, Ivanov, Datta, and Dougherty (2005). Here we propose a model based approach, noting that the ability of an operator *A* to predict an observed steady state may be directly evaluated by calculating iterations *A ^{t}*, provided an initial state may be defined (assumptions leading to this condition will be proposed). An important advantage of this approach is that persistent perturbations may be modeled using iterations of the appropriate operator

In Pal et al. (2005) it was shown that the search space to be considered for the enumeration of attractor-constrained BoNs is extremely large (≥ 10^{17} in a four node example), so that the experimentally observed attractors will not likely be sufficient for resolution to a manageably small number of plausible models. The problem is compounded by the exponentially increasing size of the model space. As argued in Danks, Glymour, and Spirtes (2003) and Wimberly, Danks, Glymour, and Chu (2009), even when structural parsimony constraints are imposed on *A* (as in Ideker, Thorsson, Ranish, Christmas, Buhler, Eng, Bumgarner, Goodlett, Aebersold, and Hood (2001)), the number of perturbation experiments needed to resolve model uncertainty is too large to be practical even for small networks.

We take the view that the inability of the data *Y* to resolve the identity of the true model should not preclude useful inference. It may be that specific features of the network, pertaining to attractor structure or pathway relationships, can be reported with a measurable degree of confidence. We therefore propose a computational Bayesian approach involving the construction of a posterior density on the model space . Bayesian approaches to the inference of network structure have been applied to BoNs in Zhou, Wang, Pal, Ivanov, Bittner, and Dougherty (2004) and to large scale pedigrees in Almudevar (2007). In addition, this Bayesian approach can be used for the design of future experiments which most efficiently reduce model uncertainty (see Ideker et al. (2000), Almudevar and Salzman (2005)).

To illustrate our approach, the proposed methodology will be applied to three synthetic networks designed to highlight the strengths and limitations of using perturbation experiments to reconstruct GRNs. Furthermore, we apply our algorithm to a subset of 9 CRGs reported in McMurray et al. (2008) to demonstrate its ability to uncover potentially interesting biological features of cancer GRNs.

We now present three assumptions on which the inference procedure will be based.

- (A1) The wild-type cell is governed by operator
*A*in the sense that from any initial state*x*, the cell will reach a steady state defined by*F*(*x*). Upon reaching steady state, gene expression can be predicted from*F*(*x*). - (A2) Under perturbation experiment
*E*, (A1) holds for operator*A*._{E} - (A3) There exists a single state vector
*X*^{0}which is a fixed point for any model*A*not subject to persistent perturbations.

Assumption (A1) justifies the use of BoNs for modeling GRNs. It essentially states that the wild type cell possesses at least one attractor, which appears in the laboratory as a steady state with respect to gene expression, representing a form of cell stability. Furthermore, there may be more than one attractor, and an accurately fit model will be able to predict to which attractor any given gene expression state will transition and also which perturbations will force the cell to transition from one attractor to another. It is this feature that makes BoNs suitable for modeling cancer progression as a transition between meta-stable states. See Shmulevich and Dougherty (2007, Sec 2.2) for a good discussion of this issue. Assumption (A2) justifies the modeling approach proposed here and is a conjecture which is subject to experimental validation.

Assumption (A3) carries important implications for ternary networks. We assume that an outcome 0 represents an unperturbed (or wildtype) condition, and that under given experimental conditions the cell will not leave this condition. In the CRG network application described below, experimental gene perturbations are calibrated relative to a control level, represented by outcome 0. Over- or under-expression is defined as a gene expression estimate significantly above or below the control level. Note that neither outcome 0 or −1 implies that a gene is not expressed. We define state vector *X*^{0} = (0,…,0), which represents the control cell (in which genes are expressed at control levels). Then (A3) states that *X*^{0} is itself a fixed point.

We now present an inference procedure based on the assumptions described above. We are given persistent perturbation experiments, *E*_{1},…,*E _{q}*, and corresponding steady state observations,

To score the model, we need a distance *d* : × [0, ∞) such that *d*(*F* | *y*) measures the compatibility of attractor *F* with steady state *y*. The distance must account for the fact that *y* may be a strict reduction of *F*. As will be discussed below, several distances are feasible.

Let
${X}_{k}^{0}$ be equal to the state vector *X*^{0} defined in assumption (A3), modified so that for each perturbation *e* = (*i*,*s*) in experiment *E _{k}* component

$$D(A|Y)=\sum _{k=1}^{q}d({F}_{{E}_{k}}({X}_{k}^{0})|{y}_{k}).$$

(1)

It is important to note that to calculate *D*(*A* | *Y*) we do not need to calculate the entire attractor structure of the operator *A*_{Ek} implied in (1). It suffices to calculate the single attractor
${F}_{{E}_{k}}({X}_{k}^{0})$, offering a considerable computational advantage. This is easily done by iterating *A*_{Ek} from state
${X}_{k}^{0}$ until a cycle is observed.

We next consider the problem of defining a score *d*(*F* | *x*) for attractor *F* and state vector *x* . When an attractor *F* is a fixed point, then *D* is a distance between two elements *x′*, *x*″ . Two possible forms are the Hamming distance and *L*_{1} distance, given respectively as:

$$\begin{array}{l}{d}_{H}({x}^{\prime},{x}^{\u2033})=\sum _{i=1}^{n}I\{{x}_{i}^{\prime}\ne {x}_{i}^{\u2033}\},\\ {d}_{1}({x}^{\prime},{x}^{\u2033})=\sum _{i=1}^{n}|{x}_{i}^{\prime}-{x}_{i}^{\u2033}|.\end{array}$$

When *F* is a cycle, one possibility is to reduce *F* to a vector and apply *d _{H}* or

$${d}_{M}(F|x)={d}_{1}(\overline{F},x),$$

noting that *d _{H}* is not applicable here, since we may have / . This score must be carefully interpreted. Notice in the given example that the second cycle component alternates between −1 and 1, so the corresponding component of is 0. This may be appropriate when gene expression measurements are aggregated over large numbers of cells. While this is a possibility, it would be important to confirm the existence of this effect experimentally.

Another approach is to assume that cycles containing both −1 and 1 in a single component are unlikely, if not impossible. If the attractor component contains such a cycle, it is assumed that a steady state prediction cannot be made and the component contributes a nonzero amount to the score. If a cycle component for gene *i* contains 0 and 1 (or 0 and −1), we assume that in the steady state observation the gene will be over-expressed (or under-expressed). The cycle can be summarized for each component by the set of nonzero outcomes observed. The score *d _{T}* (

We now describe two related tasks: (1) minimizing *D*(*A* | *Y*) over *A* , and (2) sampling from a posterior density on representing models which are plausible given the data. As will be demonstrated below, even when one can find *A* such that *D*(*A* | *Y*) = 0, there may be many *A* which satisfy this. If it were feasible, a reasonable approach would be to constructively enumerate the solution set {*A* *: D*(*A* | *Y*) = 0}. If all, or almost all, models in this set shared a common model feature (for example, genes 1 and 2 jointly regulate gene 3), then that feature could then be assigned a high confidence. Of course, such an enumeration will not generally be feasible, but we may use instead techniques from computational Bayesian methodology to approximate this type of enumeration.

It is important to note that we have no natural probabilistic models with which to construct a posterior distribution (as would exist for Bayesian network models), but we may define a density weighted by *D*(*A* | *Y*) of the form:

$${\varphi}_{\tau}(A)\propto \text{exp}(-D(A|Y)/\tau ),$$

for any positive constant *τ*. Suppose *D** = min_{A}*D*(*A* | *Y*). We wish to find *τ** for which *P*{*D*(*A* | *Y*) = *D**} ≈ 1 under *ϕ*_{τ*} (allowing the probability to be slightly less than 1 would be equalivalent to accepting some error tolerance). Then a sample from *ϕ*_{τ*} is equivalent to an approximation of solution set {*D*(*A* | *Y*) = *D**}, since all models in this set have the same density under *ϕ*_{τ*}. Simulated annealing (SA) techniques can be used to determine *τ**, following which a simulated sample from *ϕ*_{τ*} can be generated using MCMC methodologies. Details of our implementation of this procedure are given in Appendix I.

Inferential statements about the true model may be constructed using *Bayesian averaging*. Let *S* be a statement about the model, for example, ‘gene *a* regulates gene *b*’. Then *S* may be formally defined as the subset of all models in for which *S* is true. The probability of *S* under density *ϕ*_{τ*} is analagous to a posterior probability obtainable from a Bayesian model and is interpretable as a level of confidence for the truth of *S* (we therefore use the terminology of Bayesian inference in our discussion). This probability can be directly estimated using a simulated sample from *ϕ*_{τ*}. A posterior probability of 1 or 0 is interpretable as certainty that statement *S* is true or false, while probabilities in (0,1) mean that the truth of *S* is compatible with, but not necessarily implied by, the data. This interpretation will be useful in the design of further experiments, of which the goal would become the conversion of posterior probabilities from values near 0.5 to 0 or 1.

The SA algorithm used here is based on one proposed in Aarts and Korst (1989). Details are given in Appendix I, but we briefly describe here the general structure. The algorithm proceeds in stages, with stage *m* consisting of a sample simulation of fixed length *N _{s}* from density

In this section the methodology will be demonstrated using a set of synthetic network models and gene perturbation data for 9 CRGs reported in McMurray et al. (2008). The primary motivation for this demonstration will be to validate two ideas:

- We expect many network models to be compatible with a given data set; therefore, a random search algorithm cannot be expected to be consistent, that is, varying the random seed or the initial point should yield differing models. Alternatively, if the goal is to estimate a Bayesian posterior density, we should expect the algorithm to be consistent.
- A Bayesian posterior density is correct if a model feature assigned a marginal posterior probability close to 0 or 1 is absent or present in the true model, respectively. A posterior probability closer to 0.5 indicates that the data is not sufficiently informative to resolve the absence or presence of that feature. This type of uncertainty should decrease with additional experimental data.

We present a detailed analysis of a synthetic network using the following steps:

- define a ternary network,
*A*(transition functions and topology), - define a list of persistent perturbations,
*E*_{1},…,*E*, with initial states, ${X}_{1}^{0},\dots ,{X}_{q}^{0}$,_{q} - determine the attractor, ${y}_{k}={F}_{{E}_{k}}({X}_{k}^{0})$ for each
*k*, then construct input steady state data,*Y*= (*y*_{1},…,*y*), and_{k} - derive an estimate,
*Â*of*A*, and construct a posterior density on the model space.

The simulated annealing algorithm was applied to the data *Y* following which a posterior density was sampled, as described in the Methods Section. Parameter settings are described in Appendix I. In this example, we assume we can observe *Y* without error, so only one replication of the algorithm is presented. Issues raised by the possibility of error in *Y* are considered in the CRG example presented in Section 3.2.

Figure 1 shows the topology of the first synthetic network we considered. We define two types of transition functions mapping → : *equality*, *f _{i}*(

$$\begin{array}{c}{f}_{1}(x)=\sim {x}_{4},{f}_{2}(x)={x}_{1},{f}_{3}(x)={x}_{2},{f}_{4}(x)={x}_{3}\\ {f}_{5}(x)={x}_{6},{f}_{6}(x)={x}_{5},{f}_{7}(x)=0,{f}_{8}(x)=0.\end{array}$$

The transition function for gene 9 can be best represented as a table of values based on the status of genes 7 and 8 (Table 2). Attractors resulting from transient single perturbations of each gene are summarized in Table 3. Because genes 7–9 are not part of a cycle, any transient perturbation would yield a temporary change in gene expression and then reversion back to the null attractor, **0**. In Table 4 columns represent the steady states resulting from the persistent single perturbations (indicated in bold type). In a real experiment, these values would be generated by comparing the gene expression estimates from the perturbed samples to control samples.

Gene regulatory network for Example 1. Increased (decreased) gene expression is represented by arrows (squares).

Cyclic attractors resulting from transient perturbation of single genes. Null attractors, and genes within attractors remaining at outcome 0 are not listed.

Observed steady state gene expression after persistent perturbations of each gene to outcome indicated in bold.

Our proposed model inference algorithm was first applied to the data in columns 1–9 of Table 4. Table 5 displays the marginal posterior probabilities for each potential edge in the graph. If we report each edge with a marginal posterior probability > 0.5, the true network topology is reproduced exactly. It must be noted, however, that a summary of the marginal edge probabilities is not necessarily an exact representation of a network topology. This would be the case if all probabilities were essentially 0 or 1, but when probabilities closer to 0.5 are present, there may be important statistical dependencies among edge events implied by the posterior density *ϕ*_{τ*}. This would happen, for example, if two edges have marginal probabilities close to 0.5, but occur together in any single network under *ϕ*_{τ*} only with very small probability.

Marginal posterior probabilities for graph edges (Example 1). Rows are offspring and columns are parents. By considering all edges with probability > 0.5 (in boldface), we are able to reconstruct the network topology exactly.

Next, the attractors associated with the persistent perturbations are all correctly identified with probabilities >0.87. However, although the true attractors associated with transient perturbations of genes 5–9 are similarly assigned high posterior probability (>0.86), those associated with genes 1–4 have very low posterior probability (<0.05). We note that the true attractor for transient perturbations of genes 1–4 is quite complex and would require each connection on the circuit to be present (see Figure 1). While the marginal posterior probability for each edge is quite high (minimum 0.86, see Table 5), there exist a variety of attractors represented in the posterior density. The highest probability attractor is the null with probability ≈ 0.4. This is the attractor that would occur if each edge in the circuit were not present (which must happen with a probability of at least 1–0.86 = 0.14), but may also occur if the transition functions are not correctly estimated. An important feature of the proposed methodology is that attractor densities for the correctly resolved perturbations have much lower variability. Overall, we conclude that the topology of the network has been correctly identified, along with the attractors for transient perturbations of genes 5–9, but there is uncertainty regarding the attractors for transient perturbations of genes 1–4.

Naturally, this leads to the question of whether our method can be used to predict future perturbations that are best able to resolve model uncertainty, as in Ideker et al. (2000) or Almudevar and Salzman (2005). Our observations suggest that new perturbations of genes 1–4 would be the best choice. Accordingly, the model was refit after negative perturbation of genes 1–4 (columns 10–13 of Table 4). This resulted in a high posterior probabilitity (>0.84) being assigned to the correct attractor for all of the transient perturbations. This suggests that our methodolgy can be used to design future experiments that reduce uncertainty in the attractors.

Two additional synthetic networks were developed to further assess our methodology in the same manner as Example 1. The network topologies for Examples 2 and 3 are illustrated in Figure 2. Note that Example 3 has the same topology as Example 2; however, the transition functions differ for genes 2 and 5, introducing negative regulation. We will see that this small change greatly affects model inference.

Gene regulatory networks for (a) Example 2 and (b) Example 3. Increased (decreased) gene expression is represented by arrows (squares).

The only ambiguity in the graphical representation of these networks is in the regulation of gene 2. The transition functions for gene 2 in Examples 2 and 3 are given in tabular form in Table 6. Attractors resulting from transient single perturbations are shown in Table 7 for Example 2 and in Table 8 for Example 3. Note that transient perturbations of genes 4, 8, and 9 result in the null attractor in both examples. In the same manner as Example 1, we treat the matrices of steady state estimates from 9 single persistent over-expression perturbations as the input to our inference algorithm.

Transition function table for Gene 2 based on the values of Genes 1 and 3 in (a) Example 2 and (b) Example 3.

Cyclic attractors resulting from transient perturbation of single genes in Example 2. Null attractors, and genes within attractors remaining at outcome 0 are not listed.

Cyclic attractors resulting from transient perturbation of single genes in Example 3. Null attractors, and genes within attractors remaining at outcome 0 are not listed.

By assuming the existence of an edge when the posterior probability is at least 0.5 (the convention adopted in Section 3.1.1), we find that the topology of Example 3 is estimated correctly; however, the same is not true in Example 2. The existence of negative regulation in Example 3 gives additional information regarding topology, relative to purely positive regulation.

We now consider estimation of attractors from transient perturbations. For both examples, whenever the null attractor was correct, it was identified as such with a minimum posterior probability of 0.896. Among the other true attractors only those of Example 2 for genes 1–3 were identified with probabilities greater than 0.5 (0.803, 0.829, and 0.828). For genes 5–7 in Example 2, the true attractors were given the highest posterior probabilities, but these were less than 0.5 (0.415, 0.414, and 0.414). The true attractors in Example 3 for genes 1–3 and 5–7 cycle between 1 and −1 for all active genes (see Table 8) and are likely to be challenging to estimate. However, it is important to note that in all cases, whenever the true attractor does not have the highest probability the posterior density is noticeably more variable than when it does.

In this section we report on an application of the proposed methodology to the modeling of presumed interactions between 9 CRGs. Here, interest is in deducing how CRG perturbations will alter a control GRN. Therefore, we use a ternary network with = {−1,0,1} where 0 represents gene expression in the unperturbed GRN and −1 and 1 represent significant deviations from control. We consider a GRN based on the following 9 CRGs: HoxC13, Id2, Id4, Jag2, Notch3, Sfrp2, Wnt9a, Zfp385, Plac8.

Table 9 gives the observed steady states for 9 single perturbations involving the 9 CRGs (perturbations of all but Plac8 were over-expression). See Appendix II for details regarding experimental design. The direct perturbation effects are summarized in the connectivity diagram in Figure 3, in which an edge leading from gene *a* to *b* is included if there is statistical evidence that a perturbation of *a* changes the expression level of *b*. This is equivalent to the *accessibility graph* described in methodology developed in Wagner (2001, 2004). This type of model can be expected to contain redundant edges, since an edge will be included from *a* to all genes downstream of *a*, whereas the causal structure could be represented using fewer edges which model sequential, or transitive, causality. Wagner proposes replacing the *accessibility graph* with an *adjacency graph*, defined as the most parsimonious graph which preserves the pathway structure of the accessibility graph (effectively removing *shortcuts*). However, such a model is not feasible in an application in which special interest is placed on the discovery of self sustaining cyclical attractors representing alternative meta-stable states.

Connectivity graph derived from the 9 CRG model using a threshold of 20 edges. Merging or splitting of edges is indicated by the solid circles. Increased gene expression or gene repression are represented by arrows or bars, respectively.

Steady state summaries for the 20 edge model selected by the thresholding method described in Appendix II. The implied edges are shown in Figure 4(a). Persistent perturbations are indicated in bold type.

Table 10 shows marginal edge probabilities *p*(*j* → *i*) obtained by averaging 10 full posterior densities estimated by our proposed algorithm (see below for more detail). All probabilities *p*(*j* → *i*) ≥ 0.6 are highlighted, which are represented by a connectivity graph shown in Figure 4(a). We note that the edges are not functionally annotated (as in Figure 3), and should be regarded as a summary of many networks, rather than a single network model, with edges representing connections of some form present in most models sampled from a posterior density. Many of the general features of the connectivity graph of Figure 3 are preserved in Figure 4a, in particular a pathway structure originating in HoxC13, then passing through Id2, then through a set of three genes Zfp386, Jag2, Notch3, and terminating in a circuit comprising Wnt9a, Sfrp2 and Id4. There is also a potential for feedback from this final circuit to the previous triplet.

A connectivity graph for the 9 CRG model (a) constructed using edges with estimated marginal posterior probability *p*(*j* → *i*) ≥ 0.6 and (b) derived from a ternary network model fit. In both plots merging or splitting of edges is indicated **...**

Marginal edge posterior probabilities *p*(*j* → *i*) obtained by consolidating 10 separate posterior densities. The *Null* parent signifies no parent.

Lastly, we examine the complete ternary network. We note that the network topology (Figure 4b) shows a similar partition into highly connected circuits (Zfp385/Jag2/Notch3 and Wnt9a/Sfrp2/Id4) as can been seen in Figures 3 and and4a.4a. The remaining genes play no important role. In general, the networks shown in Figures 4a and and4b4b are very similar (edges which are not in common are indicated in gray).

The circuit Wnt9a/Sfrp2/Id4 forms an attractor which alternates over-expression of Sfrp2 with under-expression of Wnt9a and Id4. One advantage of the ternary network model is that it predicts those perturbations which are able to force a transition from one meta-stable state to another. Thus we note that Id4 does not induce feedback, so that the self sustaining behavior is due to Wnt9a and Sfrp2 alone. This circuit can be induced by setting Wnt9a or Sfrp2 individually to their attractor levels. Alternatively it can be induced by an over-expression of Jag2, which forces under-expression of Zfp385 and subsequently Wnt9a, which suffices to induce the cycle. This is predicted by applying operator *A* iteratively to an initial state characterized by over-expression of Jag2 (see Table 11).

Table gives a sample pathway induced by iterating unperturbed operator *A* from initial state (0,0,0,1,0,0,0,0,0) until the Wnt9a/Sfrp2/Id4 attractor is reached (iterations 4–6).

The network contains an interesting robustness property. If Wnt9a is over-expressed, or Sfrp2 is under-expressed, contrary to the predicted attractor, a regulatory action on the part of the Zfp385/Jag2/Notch3 circuit is induced which forces a return to the attractor (see Table 12).

Table gives a sample pathway induced by iterating unperturbed operator *A* from initial state (0,0,0,0,0,0,1,0,0) until the Wnt9a/Sfrp2/Id4 attractor is reached (iterations 8–10).

We emphasize that this network is only one of many plausible networks. The advantage of Bayesian averaging is that any feature can be assigned a posterior probability. We may then reject or accept features with probabilities close to zero or one. Further experimentation can then resolve features possessing greater uncertainty.

In spite of any modeling uncertainty, a coherent GRN emerges, with a circuit involving genes Wnt9a, Sfrp2, Id4 predicted as a possible meta-stable state. There is some support in the literature for this model. These genes are known to regulate cell proliferation and differentiation (Lasorella, Uo, and Iavarone, 2001, Suzuki, Watkins, Jair, Schuebel, Markowitz, Chen, Pretlow, Yang, Akiyama, Van Engeland, Toyota, Tokino, Hinoda, Imai, Herman, and Baylin, 2004, Xiang, Lin, Zhang, Tan, and Lu, 2008) and have been associated with promoting or inhibiting tumor activity (McMurray et al., 2008). In addition, consistent with their proposed role as regulators in the observed network, each of the down-regulated CRGs – Jag2, Notch, and Zfp385 – behaves as a strong inhibitor of the cancer phenotype upon resetting its expression to levels found in normal parental cells (McMurray et al., 2008). In the present analysis, these genes appear to play a regulatory role with respect to the Wnt9a/Sfrp2/Id4 circuit. Consistent with this prediction, interactions between Notch and Wnt signaling have been observed in the regulation of stem cell proliferation and cell fate choice (Crosnier, Stamataki, and Lewis, 2006).

To test the model, the algorithm was run with 30 replicates for each score type, using the parameter values described in Appendix I. We refer to scores using *d _{M}*(

Analysis of fit replications using score type 1. **(a)** Boxplots of score values *D*(*A* | *Y*) grouped by *N*_{s} parameter. **(b)** Plot of extreme cycle count against score *D*(*A* | *Y*). A running mean smoother with bandwidth 0.5 is superimposed. **(c)** A running mean smoother **...**

A different form of overfitting involves the appearance of extreme cycles in which expression of a single gene alternates between −1 and 1. Conceivably, such cycles introduce enough complexity to permit closer fitting of the steady states. This effect may be especially relevant for score type 1, since if the outcomes within a cycle are equally frequent then the predicted steady state value will be 0, which is the value of most observed outcomes in the data. For each of the 90 score type 1 fits, the number of extreme cycles was calculated (each of 72 predicted outcomes may have a distinct extreme cycle). A scatter plot of extreme cycle count against score *D*(*A* | *Y*) is shown in Figure 5b, with a running mean smoother superimposed. In Figure 5c the proportion of fits with > 0 extreme cycles is estimated as a function of *D*(*A* | *Y*) using a running mean smoother. The tendency for the model to fit extreme cycles with greater frequency as *D*(*A* | *Y*) decreases is evident. From Table 1, when using score type 2 extreme cycles are permitted, but are always penalized. Figure 6 shows histograms of extreme cycle count grouped by score type (70 fits using score type 1, *N _{s}* > 2,000, 30 fits using score type 2,

Histograms of extreme cycle count grouped by score type. For score type 1, 70 replications using *N*_{s} = 10,000, 25,000, 100,000 are used, for score type 2, 30 replications using *N*_{s} = 100,000.

For each score type 2 fit the minimum attained score was 4. Conceivably, this minimum represents the lowest attainable score with indegree bound *K* = 2, although with no available analytical theory, this can be verified only by an exhaustive model search. We also expect many solutions to achieve exactly this score.

Table 1 defines a method of summarizing attractors as vectors of single components. Each component of the summary vector is determined by observing the sequence of that component’s outcomes in the attractor cycle. The attractor summary component is set to 0 if all outcomes in the sequence are 0, set to 1 if a 1 occurs at least once but −1 does not occur, set to −1 if a −1 occurs at least once but 1

Table 1 defines a method of summarizing attractors as vectors of single components. Each component of the summary vector is determined by observing the sequence of that component’s outcomes in the attractor cycle. The attractor summary component is set to 0 if all outcomes in the sequence are 0, set to 1 if a 1 occurs at least once but −1 does not occur, set to −1 if a −1 occurs at least once but 1 does not occur, and set to 9 if both 1 and −1 occur.

It is important to emphasize that analysis involving operator *A* differs substantially from that involving perturbation operators *A*_{Ek}. The attractors associated with *A* are not observed, but represent the behavior of the unperturbed cell, and would therefore be of greater interest in a general study of the behavior of the cell, forming the basis of an *in silico* model of the cells’ GRN. In addition, the steady states resulting from any new perturbation experiment *E*′ can be predicted computationally using *A*_{E′}.

We examined the predicted attractor summaries for operator *A* for a variety of initial states. We use initial states corresponding to experiments *E*_{1},…,*E*_{9}, except that the initial state is not forced to be persistent. For example, corresponding to *E*_{1}, a single forced persistent over-expression of *HoxC*13, we set initial state *x*_{0} = (1,0,0,0,0,0,0,0,0), then observe iterates *A ^{m}*(

Under the methodology defined in Section 2.3 a SA algorithm is first used to determine *τ**, which defines the posterior density *ϕ*_{τ*} used to estimate posterior probabilities. Some flexibility can be introduced by using posterior density *ϕ*_{rτ*} for some positive factor *r*. For our analysis the posterior density was sampled for the first 10 model fits using the reported model fit as the initial point of the sampler, using *r* 1,3. Our primary concern will be with consistent convergence. While we do not expect the algorithm to converge consistently to a single model, we should expect consistent convergence to a single posterior density, to within some acceptable approximation.

We first summarize the graph structure. Using Bayesian averaging the posterior probability that *j* is a parent of *i*, *p*(*j* → *i*), was estimated for each posterior density. A *no parent* state was included. These probabilities appear to be consistent across model fits (Figures 7 and and8).8). Under consistent convergence, these values should be found close to the identity. If we compile all absolute differences | *p*(*j* → *i*) − *p′* (*j* → *i*) | where *p* and *p′* denote distinct posterior densities, for each pair, we find 97%, 98% and 2% no greater than 0.1, no greater than 0.25 and greater than 0.95, respectively, for *r* = 1. The corresponding percentages for *r* = 3 are 87%, 95% and 0%. The numbers, along with the plots themselves, suggest a straightforward consequence of varying *r* – lower values yield greater overall consistency but a greater number of large differences. This tendency was also observed by the authors for other model features. It is suggestive of the bias/variance trade-off common in many forms of complex statistical modeling. In the our analyses, we use the value *r* = 3, as the more cautious choice.

Scatterplots of *p*(*j* → *i*) for posterior density pairs (1,2),(2,3),…,(9,10), using temperature *τ**. Identity and +/− 0.1 boundaries are indicated.

Scatterplots of *p*(*j* → *i*) for posterior density pairs (1,2),(2,3),…,(9,10), using temperature 3*τ**. Identity and +/− 0.1 boundaries are indicated.

We noted above that when examining multiple fits, there was considerable variation in the attractor structure, but that only 3 attractors appeared consistently in most fits. Interestingly, these rarer attractors have little influence on their associated posterior densities. For example, the attractor (1,0,−1,1,9,9,−1,1,1) appears in one model fit, but has negligible posterior probability in the associated posterior density. This is generally true of all the rarer attractors. This is an important feature of the methodology, since it identifies which features of a given model are likely to be spurious, without the need for extensive model fitting replications.

Finally, we briefly raise a technical consideration. Recall that when calculating the score *D*(*A* | *Y*) the operators *A*_{Ek} are iterated from a given initial state until an attractor is reached. This means that there may be table entries which are never used. For example, in the table for Zfp385 it is predicted that the simultaneous under-expression of Jag2 and Notch3 will induce the over-expression of Zfp385. However, that particular combination of outcome levels for Jag2 and Notch3 was not encountered during the scoring of *A*. Although assigning values to each table entry is needed to conduct a random search, and possibly to perform further Bayesian averaging, it is still appropriate to flag these entries, since they have no effect on the score or on any of the resulting attractors or pathway structure of the perturbed operators.

We have proposed a method for the modeling of GRNs as Boolean networks using gene perturbation data. The method accounts for the dynamic structure of gene perturbation experiments and, through the use of computational Bayesian methodology, permits the direct evaluation of model uncertainty. This in turn permits more useful inference statements concerning GRN features without requiring resolution to a single model, and will permit the rational and efficient design of further experiments.

The model was applied to gene perturbation data associated with 9 CRGs (McMurray et al., 2008). The method was consistent in the sense that given randomized initial points, the algorithm always converged to an approximately common posterior density. An examination of the results yielded an intuitively plausible ensemble of networks, consistent with previously reported observations. The model was also tested using a number of synthetic networks, and proved to be accurate in the sense that model features reporting posterior probabilities close to 1 were consistently accurate. In addition, it was demonstrated that experiments which resolve model uncertainty could be identified by a careful examination of the posterior density.

While significant in their own right, these results demonstrate the potential of the proposed methodology to uncover novel gene regulatory networks and to better understand the behavior of known ones. As genomic research turns away from independent tests of differential gene expression and tackles the complexity of gene networks, methods such as the one proposed here provide a crucial tool in advancing our understanding. Work is currently underway to extend the proposed methodology to larger and more complex networks and to investigate other potential gene networks.

In this appendix we describe in detail the Markov Chain Monte Carlo (MCMC) methodology used to search for low scoring models. Some care is needed here because the model is defined both by the network topology and the transition functions.

Altering an operator in some way may or may not alter its minimal graph. We now introduce some notation and definitions which will help consider this question. Assume contains a level 0. For convenience, given *x* * ^{I}*, for integer

Let * _{I}* be the set of mappings

**Definition 3** *Suppose f* _{I}*for I* ≥ 1*. Suppose S* {1,…,*I*}, |*S*| = *d* ≥ 1*. Then f′* _{I−d} *is a* reduction *(in S) of f if f* (*x*) = *f′* (*H*^{−S}(*x*)) *for all x* ^{I}*. We say fis* non-reducible *if it has no reduction. Also, a reduction f′ of f is the* maximal reduction *if it is non-reducible. Conversely, an element f′* _{I+|S|} *is an* extension *(in the components S) of f* _{I} *if f′* (*x*) = *f* (*H*^{−S}(*x*)) *for x* ^{I+|S|}.

Let
${\mathcal{F}}_{1}^{\mathit{nr}}$ be the class of all non-reducible mappings in * _{I}*. If

Depending on outcome set , and any other assumptions, restrictions may be placed on model space . In the example described below we impose the following restrictions. Let *G*(*A*) be the minimal graph of *A*, with parent set *S _{i}* for the

We take the steady state vector defined in Assumption (A3) to be *X*^{0} = , and this in turn requires that the transition function *f _{i}* = 0 whenever all input arguments are 0, or when

We denote by * the subset of which satisfies these constraints.

Since an exhaustive search over * will not be feasible, we will employ a Markov chain Monte Carlo (MCMC) search, involving the simulation of a Markov chain on state space *. We use score *D*(*A* | *Y*), and additionally require a neighborhood (*A*) * for each *A* *. Given current state *A*′, a *proposal* state *A*″ is chosen randomly from (*A*′) which becomes the subsequent state with a probability depending on the resulting change in score, otherwise the Markov chain remains at *A′*. The efficiency of a MCMC usually depends greatly on the choice of (*A*) and the proposal distribution in general (see, for example, Chib and Greenberg (1995)). Ideally, selection of the proposal state does not result in too large a change in score.

We will use the simulated annealing algorithm (SAA) described in Aarts and Korst (1989) (see also Almudevar (2003) for a brief summary of the algorithm). Consider distributions of the form:

$${\varphi}_{t}(A)\propto \text{exp}(-D(A|Y)/\tau )$$

(2)

where *τ* is a temperature parameter. The algorithm is structured as a sequence of stages. In the *m*th stage a sample of models from steady state distribution *ϕ*_{τm} (*A*) is simulated using the Metropolis-Hastings algorithm (Hastings, 1970).

Here, *τ _{m}* is a specific stage

$${\tau}_{m+1}=\frac{{\tau}_{m}}{1+\frac{{\tau}_{m}\hspace{0.17em}\text{ln}(1+\delta )}{3{\sigma}_{m}}},$$

(3)

which ensures a constant change in the variational distance of the steady state distribution towards the minimum.

The algorithm as described in Almudevar (2003) relies on 5 parameters. The initial acceptance rate *χ*_{0} of the MCMC sampler determines the initial stage 1 temperature *τ*_{1}, and *σ*_{0} is the steady state standard deviation corresponding to *χ*_{0}. At *χ*_{0} = 1 the MCMC is equivalent to a random walk. The recommended practice is to set the value close to, but less than, 1. The change in variational distance is *δ*. The convergence tolerance is , and the algorithm terminates when *μ _{m}*/

A number of modifications were made to the SA algorithm as described in Almudevar (2003), permitting wider searches before convergence occurs. First, *σ _{m}* in (3) is replaced by an exponential smoother
${\sigma}_{m}^{\rho}=\rho {\sigma}_{m}+(1-\rho ){\sigma}_{m-1}^{\rho}$, for parameter

We also extend the neighborhood (*A*) to the *product neighborhood*:

$${\mathcal{N}}_{j}(A)=\underset{{A}^{\prime}\in {\mathcal{N}}_{j-1}(A)}{\cup}\mathcal{N}({A}^{\prime});j>1,$$

where _{1}(*A*) = (*A*), which permits a wider search within each stage. A mixture can then be defined, selecting *A′* from _{j}(*A*) with probability
${p}_{j}^{n}$.

Finally, we permit an adaptive truncation of stage sample length *N _{s}*. For integer
${N}_{s}^{tr}$ we end a stage when the number of occurrences of

It should be noted that there are a number of alternative methods for large-scale combinatorial optimization problems, for example, TABU search or genetic algorithms (Aarts and J.K., 1997). However, our interest will also be in the estimation of posterior densities on the model space, for which the use of a MCMC sampler is a natural option. The use of the SA algorithm will allow the integration of the optimization and the posterior density sampling phases of the modeling process in a way not possible with other optimization methods.

The neighborhood (*A*) is generated by defining a class of small modifications of *A*. Let *S _{i}* be the parent sets of the minimal graph

- (O1) Add one parent to a parent set
*S*._{i} - (O2) Randomly change function
*f*._{i} - (O3) Determine if a maximal reduction exists for
*f*. If so, adjust_{i}*f*and_{i}*S*accordingly._{i}

Note that the parent set of node *i* can be reduced by operation (O3), but not elsewhere. This ensures that any such modification is minimal. Operation (O2) is represented here by the following algorithm, which accepts as input *f* * ^{I}*.

**Perturb(f):**

- Randomly select
*y*– .^{I} - Set
*f′*=*f* - If
*f*(*y*) ≠ 0 set*f′*(*y*) = 0. - If
*f*(*y*) = 0 set*f′*(*y*) to a randomly chosen value from − 0. - Return
*f′*.

We note that in *Perturb*(*f*), transitions between outcomes 1 and −1 are avoided, in order to minimize the impact of the transition. Given model *A* the algorithm *Proposal*(*A*), defined below, generates proposal state *A*′. Informally, a node is chosen randomly, then one of two sets of operations is chosen at random. In the first, a parent is added to the parent set, then the node function is first extended, then perturbed. In the second, the node function is perturbed, then reduced, if possible. The algorithm requires parameter *K* (indegree bound), and probabilities *α*_{0},…,*α _{K}*. We require

**Proposal(A):**

- 1. Select node
*i**V*at random, retrieve*f*,_{i}*S*from model_{i}*A*. Go to step 2a with probability*α*, and go to step 3a with probability 1 −_{k}*α*, where_{k}*k*=*I*._{i} - 2a. Randomly select
*v*from*V*− {*i*} −*S*._{i} - 2b. Set
*S*′=_{i}*S*{_{i}*v*}. - 2c. Determine
*f*, the extension of^{e}*f*in the_{i}*j*th component, where*v*is the*j*th component of*S*′._{i} - 2d. Set
*f′*=_{i}*Perturb*(*f*).^{e} - 2e. Go to step 4.
- 3a. Set
*f′*=_{i}*Perturb*(*f*)._{i} - 3b. Determine if a maximal reduction in
*E*≠ exists for*f*′. If not, set_{i}*S′*=_{i}*S*, then go to step 4._{i} - 3c. Set
*S*′=_{i}*S*−_{i}*E*, and replace*f′*with it’s maximal reduction._{i} - 4. Return model
*A*′, equivalent to model*A*, with*i*th node components*f*,_{i}*S*replaced by_{i}*f′*,_{i}*S′*._{i}

In each case the parameters for the SA algorithm were set to
${N}_{s}={N}_{s}^{\mathit{tr}}=2,000$, with *χ*_{0} = 0.9, *δ* = 0.1, = 0.001, *N _{}* = 3. The initial sample, used to determine the initial temperature, is of length

We set a cap of *K* = 2 parents. To select *N _{s}* it is necessary to estimate the size of (

$$|\mathcal{N}(A)|\le {n}_{0}{N}_{1}+{n}_{1}{N}_{1}+{n}_{2}{N}_{2}\le \overline{N}=16{n}^{2}-29n$$

where *n _{i}* is the number of genes with

For the remaining parameters we take *χ*_{0} = 0.9, *δ* = 0.01, = 0.001, *N _{}* = 25. The initial sample, used to determine the initial temperature, is of length 2000. We use a mixture of neighborhoods (

The data for the 9 CRG model is of the following form. A control cell is taken to be a normal parental cell after an oncogenic transformation. Persistent perturbations were performed by individually readjusting expression of each of the 9 CRGs to levels found prior to oncogenic transformation (amounting to controlled under-expression for Plac8, and over-expression for the remainder). Gene expression profiles were obtained using four biological replications, paired with four control replicates. For each response/experiment pair involving the 9 CRGs a two-sample Mann-Whitney test was performed, resulting in a p-value indicating the statistical significance of differential expression. This results in a 9 by 9 matrix *B* of p-values, indicating the degree to which expression levels of the row gene are altered due to the perturbation of the column gene. Each element is annotated by a −1 or 1 indicating under- or over-expression relative to control levels. Diagonal elements of *B* are set to 1 (or to −1 for Plac8).

This form of data introduces the problem of selecting a p-value threshold *ρ*. A gene graph is constructed by selecting all matrix elements no greater than *ρ*, then adding an edge directed from the column gene to the row gene for each selected element. The value *ρ* may be selected using a multiple testing procedure (MTP) (Dudoit and van der Laan, 2008), although an *ad hoc* value such as *ρ* = 0.01 is often used in the literature. We will use a method proposed in Almudevar (2009) in which the hierarchal sequence of graphs obtained by varying *ρ* is first constructed, then each graph is tested against a null model with equal numbers of edges using an information theoretic score which is sensitive to general graph features, such as hub or chain structure, resulting in a sequence of graph structure p-values. The resulting sequence is shown in Figure 9. As is reported in the literature, the edge distribution of actual cellular networks tends to follow a power law distribution (Wagner, 2002), hence adopting a Erdös-Renyi random graph model (uniformly distributed edges) as the null model provides a statistical basis for detecting network structure.

The methodology may be used to select a threshold *ρ*, and also to provide a global significance level for the existence of general network structure, independent of the threshold selection. In Almudevar (2009) the value *ρ* was estimated to be the minimum graph structure p-value recorded (in this example, the 16th edge in Figure 9. Although examples in Almudevar (2009) suggest that this minimum selection rule is reasonably accurate, no formal description of its properties is currently available. We note that in Figure 9 the graph structure p-values rapidly decrease early in the sequence, then after the 20th edge an apparent steady state is reached. We therefore set the threshold at 20 edges rather than 16, noting that subsequent modeling will provide further inference regarding network structure.

The global test for network structure proposed in Almudevar (2009) yields a significance level of *P* = 0.029. It is also interesting to note that the 20 edge threshold is *ρ* = 0.177 (that is, the p-value of rank 20 in *B*). It is unlikely that any standard MTP would select a threshold this high. The threshold defines a 20 edge connectivity graph, shown in Figure 3, as well as the 9 experimental steady states, shown in Table 10, which represents the input data for the fitting algorithm.

Anthony Almudevar, University of Rochester.

Matthew N McCall, University of Rochester.

Helene McMurray, University of Rochester.

Hartmut Land, University of Rochester.

- Aarts E, LJK . Local Search in Combinatorial Optimization. Princeton, NJ: Princeton University Press; 1997.
- Aarts E, Korst J. Simulated Annealing and Boltzmann Machines. John Wiley & sons; 1989.
- Aarts E, Korst J, van Laarhoven P. In: Local Search in Combinatorial Optimization. Aarts E, Lenstra JK, editors. Princeton, NJ: Princeton University Press; 1997. pp. 91–120. chapter Simulated Annealing.
- Almudevar A. “A simulated annealing algorithm for maximum likelihood pedigree reconstruction,” Theoretical Population Biology. 2003;63:63–75. doi: 10.1016/S0040-5809(02)00048-5. [PubMed] [Cross Ref]
- Almudevar A. “A graphical approach to relatedness inference” Theoretical Population Biology. 2007;71:213–229. doi: 10.1016/j.tpb.2006.10.005. [PMC free article] [PubMed] [Cross Ref]
- Almudevar A. “Selection of statistical thresholds in graphical models,” EURASIP Journal on Bioinformatics and Systems Biology. 2009;2009:13. doi: 10.1155/2009/878013. [PMC free article] [PubMed] [Cross Ref]
- Almudevar A, Salzman P. “Using a Bayesian posterior density in the design of perturbation experiments for network reconstruction,” IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2005;2:223–229.
- Besag J, Green P, Higdon D, Mengersen K. “Bayesian computation and stochastic systems,” Statist. Sci. 1995;10:3–41. doi: 10.1214/ss/1177010123. [Cross Ref]
- Boros E, Ibaraki T, Makino K. “Error-free and best-fit extensions of partially defined boolean functions” Inform Comput. 1998;140:254–283. doi: 10.1006/inco.1997.2687. [Cross Ref]
- Chib S, Greenberg E. “Understanding the metropolis-hastings algorithm,” The American Statistician. 1995;49:327–335. doi: 10.2307/2684568. [Cross Ref]
- Crosnier C, Stamataki D, Lewis J. “Organizing cell renewal in the intestine: stem cells, signals and combinatorial control” Nature Reviews Genetics. 2006;7:349–359. doi: 10.1038/nrg1840. [PubMed] [Cross Ref]
- Danks D, Glymour C, Spirtes P. “The computational and experimental complexity of gene perturbations for regulatory network search.” In: Hsu W, Joehanes R, Page C, editors. Proceedings of IJCAI-2003 Workshop on Learning Graphical Models for Computational Genomics. 2003. pp. 22–31.
- Dudoit S, van der Laan MJ. Multiple Testing Procedures with Applications to Genomics. New York: Springer; 2008. [Cross Ref]
- Hanahan D, Weinberg RA. “The hallmarks of cancer” Cell. 2000;100:57–70. doi: 10.1016/S0092-8674(00)81683-9. [PubMed] [Cross Ref]
- Hastings WK. “Monte Carlo sampling methods using Markov chains and their applications,” Biometrika. 1970;57:97–109. doi: 10.1093/biomet/57.1.97. URL http://biomet.oxfordjournals.org/cgi/content/abstract/57/1/97. [Cross Ref]
- Huang S, Ingber DE. “Shape-dependent control of cell growth, differentiation, and apoptosis: switching between attractors in cell regulatory networks,” Exp Cell Res. 2000;261:91–103. doi: 10.1006/excr.2000.5044. [PubMed] [Cross Ref]
- Ideker TE, Thorsson V, Karp RM. “Discovery of regulatory interactions through perturbation: Inference and experimental design,” Pacific Symposium on Biocomputing. 2000;5:302–313. [PubMed]
- Ideker TE, Thorsson V, Ranish JA, Christmas R, Buhler J, Eng JK, Bumgarner R, Goodlett DR, Aebersold R, Hood L. “Integrated genomic and proteomic analyses of a systematically perturbed metabolic network,” Science. 2001;292:929–934. doi: 10.1126/science.292.5518.929. [PubMed] [Cross Ref]
- Kauffman S. “Metabolic stability and epigenesis in randomly constructed genetic nets.” J Theor Biol. 1969;22:437–467. doi: 10.1016/0022-5193(69)90015-0. [PubMed] [Cross Ref]
- Kim S, Dougherty ER, Bittner ML, Chen Y, Sivakumar K, Meltzer P, Trent JM. “General nonlinear framework for the analysis of gene interaction via multivariate expression arrays,” Journal of Biomedical Optics. 2000;5:411–424. doi: 10.1117/1.1289142. [PubMed] [Cross Ref]
- Koller D, Friedman N. Probabilistic Graphical Models: Principles and Techniques. Cambridge, MA: MIT Press; 2009.
- Lahdesmaki H, Shmulevich I, Yli-Harja O. “On learning gene regulatory networks under the boolean network model” Machine Learning. 2003;52:147–167. doi: 10.1023/A:1023905711304. [Cross Ref]
- Lasorella A, Uo T, Iavarone A. “Id proteins at the cross-road of development and cancer” Oncogene. 2001;20:8326–8333. doi: 10.1038/sj.onc.1205093. [PubMed] [Cross Ref]
- Lloyd AC, Obermuller F, Staddon S, Barth CF, McMahon M, Land H. “Cooperating oncogenes converge to regulate cyclin/cdk complexes” Genes Dev. 1997;11:663–677. doi: 10.1101/gad.11.5.663. [PubMed] [Cross Ref]
- McMurray H, Sampson E, Compitello G, Kinsey C, Newman L, Smith B, Chen S, Klebanov L, Salzman P, Yakovlev A, LH “Synergistic response to oncogenic mutations defines gene class critical to cancer phenotype,” Nature. 2008;453:1112–1116. doi: 10.1038/nature06973. [PMC free article] [PubMed] [Cross Ref]
- Mircean C, Tabus I, Astola J, Kobayashi T, Shiku H, Yamaguchi M, Shmulevich I, Zhang W. Quantization and similarity measure selection for discrimination of lymphoma subtypes under k-nearest neighbor classification. Citeseer; 2003.
- Pal R, Ivanov I, Datta A, Dougherty ER. “Generating boolean networks with a prescribed attractor structure,” Bioinformatics. 2005;21:4021–4025. doi: 10.1093/bioinformatics/bti664. [PubMed] [Cross Ref]
- Sachs K, Perez O, Pe’er D, Lauffenburger DA, Nolan GP. “Causal protein-signaling networks derived from multiparameter single-cell data,” Science. 2005;308:523–529. doi: 10.1126/science.1105809. [PubMed] [Cross Ref]
- Sewing A, Wiseman B, Lloyd AC, Land H. “High-intensity raf signal causes cell cycle arrest mediated by p21cip1” Mol Cell Biol. 1997;17:5588–5597. [PMC free article] [PubMed]
- Shmulevich I, Dougherty ER. Genomic Signal Processing. Princeton, NJ: Princeton University Press; 2007.
- Shmulevich I, Zhang W. “Binary analysis and optimization-based normalization of gene expression data” Bioinformatics. 2002;18:555–565. doi: 10.1093/bioinformatics/18.4.555. [PubMed] [Cross Ref]
- Suzuki H, Watkins D, Jair K, Schuebel K, Markowitz S, Chen W, Pretlow T, Yang B, Akiyama Y, Van Engeland M, Toyota M, Tokino T, Hinoda Y, Imai K, Herman J, Baylin S. “Epigenetic inactivation of sfrp genes allows constitutive wnt signaling in colorectal cancer.” Nat Genet. 2004;36:417–422. doi: 10.1038/ng1330. [PubMed] [Cross Ref]
- Wagner A. “How to construct a large genetic network from
*n*perturbations in fewer than*n*^{2}easy steps” Bioinformatics. 2001;17:1183–1197. doi: 10.1093/bioinformatics/17.12.1183. [PubMed] [Cross Ref] - Wagner A. “Estimating coarse gene network structure from large-scale gene perturbation data,” Genome Research. 2002;12:309–315. doi: 10.1101/gr.193902. [PubMed] [Cross Ref]
- Wagner A. “Reconstructing pathways in large genetic networks from genetic perturbations” Journal of Computational Biology. 2004;11:53–60. doi: 10.1089/106652704773416885. [PubMed] [Cross Ref]
- Wimberly F, Danks D, Glymour C, Chu T. Computational methodologies in gene regulatory networks. Hershey, PA: IGI Global Publishing; 2009.
- Xia M, Land H. “Tumor suppressor p53 restricts ras stimulation of rhoa and cancer cell motility” Nat Struct Mol Biol. 2007;14:215–223. doi: 10.1038/nsmb1208. [PubMed] [Cross Ref]
- Xiang Y, Lin G, Zhang Q, Tan Y, Lu G. “Knocking down wnt9a mrna levels increases cellular proliferation.” Mol Biol Rep. 2008;35:73–79. doi: 10.1007/s11033-007-9055-9. [PubMed] [Cross Ref]
- Zhou X, Wang X, Dougherty ER. “Binarization of microarray data on the basis of a mixture model,” Molecular Cancer Therapeutics. 2003;2:679–684. [PubMed]
- Zhou X, Wang X, Pal R, Ivanov I, Bittner M, Dougherty ER. “A Bayesian connectivity-based approach to constructing probabilistic gene regulatory networks,” Bioinformatics. 2004;20:2918–2927. doi: 10.1093/bioinformatics/bth318. URL http://bioinformatics.oxfordjournals.org/content/20/17/2918.abstract. [PubMed] [Cross Ref]

Articles from Statistical Applications in Genetics and Molecular Biology are provided here courtesy of **Berkeley Electronic Press**

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |