Home | About | Journals | Submit | Contact Us | Français |

**|**PLoS ONE**|**v.3(6); 2008**|**PMC2423472

Formats

Article sections

Authors

Related links

PLoS ONE. 2008; 3(6): e2456.

Published online 2008 June 18. doi: 10.1371/journal.pone.0002456

PMCID: PMC2423472

Enrique Balleza,^{1} Elena R. Alvarez-Buylla,^{2} Alvaro Chaos,^{2} Stuart Kauffman,^{3} Ilya Shmulevich,^{4} and Maximino Aldana^{1,}^{*}

Sui Huang, Editor^{}

Children's Hospital Boston, United States of America

* E-mail: xm.manu.sif@xam

Conceived and designed the experiments: SK MA EB IS. Performed the experiments: EA. Analyzed the data: MA EB EA AC. Contributed reagents/materials/analysis tools: EB EA AC IS. Wrote the paper: MA EB.

Received 2008 February 8; Accepted 2008 April 14.

Copyright Balleza et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

This article has been cited by other articles in PMC.

The coordinated expression of the different genes in an organism is essential to sustain functionality under the random external perturbations to which the organism might be subjected. To cope with such external variability, the global dynamics of the genetic network must possess two central properties. (a) It must be robust enough as to guarantee stability under a broad range of external conditions, and (b) it must be flexible enough to recognize and integrate specific external signals that may help the organism to change and adapt to different environments. This compromise between robustness and adaptability has been observed in dynamical systems operating at the brink of a phase transition between order and chaos. Such systems are termed *critical*. Thus, criticality, a precise, measurable, and well characterized property of dynamical systems, makes it possible for robustness and adaptability to coexist in living organisms. In this work we investigate the dynamical properties of the gene transcription networks reported for *S. cerevisiae*, *E. coli*, and *B. subtilis*, as well as the network of segment polarity genes of *D. melanogaster*, and the network of flower development of *A. thaliana*. We use hundreds of microarray experiments to infer the nature of the regulatory interactions among genes, and implement these data into the Boolean models of the genetic networks. Our results show that, to the best of the current experimental data available, the five networks under study indeed operate close to criticality. The generality of this result suggests that criticality at the genetic level might constitute a fundamental evolutionary mechanism that generates the great diversity of dynamically robust living forms that we observe around us.

There is evidence that many complex dynamical systems found in nature are critical; namely, they operate close to a phase transition between two different dynamical regimes [1]. Avalanches [2], atmospheric phenomena [3], [4], financial markets [5], [6], earthquakes [7], [8], granular matter [9], and the brain [10]–[12], are typical examples. Critical systems exhibit remarkable properties which would be difficult to explain without the assumption of criticality. For instance, they can integrate, process and transfer information faster and more reliably than non critical systems [13]. Or they can detect and respond to external stimuli whose intensities span several orders of magnitude, like the brain [11]. These remarkable properties are mainly a consequence of the long-range correlations that emerge close to the critical point, producing collective behaviors and coordinated responses of the entire system. Thus, criticality confers on the system the ability to collectively respond and adapt to an often rapidly changing environment.

In the context of genetic regulatory networks (GRN), which are recognized as the main component in charge of cellular control [14], recent theoretical studies have shown that robustness and adaptability, two central properties of living organisms [15]–[20], exist simultaneously with the highest probability only in GRN operating at or close to criticality [21]. Thus, criticality is a property that can help us understand how the coordinate expression of the different genes in an organism is achieved under external perturbations, either to sustain cell functionality or to generate new phenotypes in order for the organisms to change and adapt to new environments [15]–[21]. Therefore, it is important to determine whether the GRN of real organisms are dynamically critical. Although some attempts have recently been made in order to answer this question [22]–[25], the definite answer has remained elusive for several decades. Here we present direct evidence that the GRN of five different organisms indeed exhibit critical dynamics. We do so by simulating in the computer the avalanche of perturbations in the gene expression profile of the genetic networks of these organisms. This allows us to compute the Derrida mapping *M*(*x*) for the five networks under consideration [26]. We will formally introduce the Derrida map in a further section. For the time being, it suffices to say that *M*(*x*) relates the size *x*(*t*) of the perturbation avalanche at time *t*, with the size *x*(*t*+1) of the avalanche at the next time step *t*+1. In other words: *x*(*t*+1)=*M*(*x*(*t*)). Therefore, *M*(*x*) contains all the information of the perturbation dynamics and can be used to directly measure the dynamical regime in which the network operates. Using this technique, we show that the dynamics of these avalanches are critical within numerical accuracy for the five different organisms studied.

However, in computing *M*(*x*) for the large networks of *E. coli*, *S. cerevisiae* and *B. subtilis*, we face the problem that the overwhelming majority of the regulatory functions (also termed *regulatory phrases*) that determine the *combined* effect of the regulators on their target genes are unknown. To circumvent this difficulty, we used random functions to model the dynamics of these networks. Although random, these functions were constructed in accordance with the fraction of positive regulatory phrases inferred from real gene expression profiles. Thus, the internal structure of the regulatory functions that we used for *E. coli*, *S. cerevisiae* and *B. subtilis* is statistically consistent with the one observed in microarray experiments.

Another important aspect that determines the dynamical regime in which the network operates (ordered, critical or chaotic) is the fraction of canalizing functions [27]–[31], which will be defined in a further section. Intuitively, these functions take into account the existence of dominant regulators such that, when present, override the effect of the other regulators. From the microarray experiments that we analyze it was not feasible to infer the fraction of canalizing functions present in the regulatory networks of *E. coli*, *S. cerevisiae* and *B. subtilis*. However, for these networks we varied in our simulations the fraction of canalizing functions around the statistically expected values. Interestingly, we observe a significant robustness of the critical dynamics under the addition or elimination of canalizing functions. This suggests that the critical behavior observed in the dynamics of the genetic networks of the organisms under study, is mainly produced by the network architecture rather than by the specific nature of the regulatory functions.

In the sections that follow we first present the Boolean network model that we use to implement the dynamics of the genetic networks, and the well known mean-field results that predict the existence of a phase transition from ordered to chaotic dynamics in this model. Then, we go beyond the mean-field theory by implementing the Boolean approach in the networks of real organisms, and show that in all the cases the Derrida map *M*(*x*) is consistent with critical dynamics. We then analyze how this map changes under the addition and removal of canalizing functions. In the last section we summarize and discuss our results.

Several models have been proposed to analyze the dynamics of GRN [32]–[35]. Although the details of the dynamics might change from one description (e.g. continuous) to another (e.g. discrete), we expect the general properties of the dynamics, such as criticality, to be model independent. In fact, recent work shows that continuous and discrete descriptions of GRN exhibit similar dynamical properties under very general conditions [36]. Here we use the Boolean approach [37]–[39], in which every gene is represented by a discrete variable *g* that can take two values: *g*=1 if the gene is expressed and *g*=0 otherwise. The genome is thus represented by a set of *N* binary variables, *g _{1}*,

(1)

where are the *k _{n}* regulators of

In this section we present the mean-field theory results that show the existence of a dynamical phase transition from ordered to chaotic dynamics in the Boolean network model [26], [38], [44]. This allows us to introduce the tools that we use to characterize the dynamical regime in which the network operates. Although the phase transition was first obtained within the context of the mean-field approximation, we will show in the next section the remarkable result that the phase transition also occurs, almost identically, in networks with realistic topologies, for which the mean-field assumption does not necessarily apply. The phase transition is characterized by the temporal evolution of the Hamming distance *x*(*t*) between two different dynamical trajectories produced by two slightly different initial conditions. From a biological point of view, the Hamming distance *x*(*t*) is the average normalized size at time *t* of the avalanche of perturbations in the gene expression profile, produced by the perturbation (e.g. gene knockout or gene over expression) of a small fraction *x*(0) of genes at time *t*=0. The temporal evolution of *x*(*t*) is given by a dynamical mapping *x*(*t*+1)=*M*(*x*(*t*)) which relates the size of the perturbation avalanche at two consecutive time steps [26], [38], [44]. Given an initial perturbation *x*(0) at time *t*=0, successive iterations of this mapping will eventually converge to a stable fixed point , which is the final size of the perturbation avalanche. The value *x*_{∞} is the order parameter that characterizes the dynamical regime in which the network operates. Thus, if *x*_{∞}=0 (ordered regime), all the initial perturbations die out over time. On the contrary, if *x*_{∞}>0 (chaotic regime), the initial perturbation of even a small fraction of genes propagates across the entire system, finally altering the expression of a finite fraction *x*_{∞} of genes in the genome. It turns out that *M*(*x*) is a continuous convex monotonically increasing function of *x* (in the mean field theory *M*(*x*) is a polynomial), with the property that *M*(0)=0 and *M*(1)<1. Therefore, there is only one parameter that controls the phase transition, the so-called average network sensitivity *S*, which is given by *S*=[*dM*(*x*)/*dx*]_{x}_{=0}. If *S*<1 then the only fixed point is *x*_{∞}=0, whereas if *S*>1 then *x*_{∞}>0. The phase transition occurs at *S*=1, for which the fixed point *x*_{∞}=0 is only marginally stable. In general, *S* depends on *p* and on the topology of the network. The dynamical mapping *M*(*x*) contains all the information about the dynamical regime in which the network operates. This is true even if *M*(*x*) cannot be obtained through a mean-field computation, which is the case for real networks. In the Supporting Information (Text S1) we provide a Java applet with the animation of the perturbation dynamics in networks with homogeneous random topology.

The mean-field theory that predicts the existence of the phase transition controlled by the average network sensitivity *S*, is based on the assumption that all the genes in the network are statistically independent and statistically equivalent [26], [44]. However, this assumption is certainly not true for real GRN, due to the existence of global regulators which correlate the expression of a large fraction of genes. Indeed, recent large-scale analysis [21] of the transcription regulatory networks of *E. coli*, *S. cerevisiae*, and *B. subtilis* indicate that these networks exhibit a Poisson-like input topology and a scale-free output topology (see Figure 1a). The output scale-free topology correlates the expression of a large fraction of the genes, and the assumption of statistical independence is not longer satisfied. Therefore, we do not expect the mean-field theory to be applicable for real networks because of their topological characteristics. Nonetheless, the phase transition predicted by the mean-field theory is identical to the one observed in randomly constructed Boolean networks with topologies statistically equivalent to the ones observed in real GRN. This is shown for the first time in Figure 1b. This result is quite remarkable, for we know that in many other systems the phase transition strongly depends on the network topology, and can even disappear for topologies that induce strong correlations between the elements (such as the scale-free topology, [45]).

To determine the dynamical regime in which the genetic network of a real organism operates, we have to compute the dynamical mapping *M*(*x*) directly from experimental data, without any mean-field assumptions. The actual form of *M*(*x*) depends on both the network topology and the particular set of Boolean functions. We computed numerically *M*(*x*) for five genetic networks: The network of flower morphogenesis in *A. thaliana* (15 genes; [41]); the network of segment polarity genes in *D. melanogaster* (60 genes; [42]); and the gene transcription networks of *E. coli* (1481 genes; [46]), *S. cerevisiae* (3459 genes; [47]), and *B. subtillis* (830 genes; [48]). In the first two cases the Boolean functions are already known. However, the overwhelming majority of regulatory phrases for the gene transcription networks of *E. coli*, *S. cerevisiae* and *B. subtilis* are still unknown. Due to this lack of information, to implement the Boolean dynamics on the GRN of these three organisms we used biased random Boolean functions generated with a gene expression probability *p* inferred from microarray experiments. (In the next section we show that the map *M*(x) does not change significantly for networks with a large fraction of canalizing functions.) *Given* the network topology, *p* can be estimated from microarray experiments by standard Bayessian parametric inference with two states (see Methods). Using 223 microarrays to sample the gene expression space in *S. cerevisiae* [49], 152 microarrays for *E. coli* [49]–[51], and 69 microarrays for *B. subtillis* [52], we inferred several regulatory phrases for each of these organisms. All the experiments used for this analysis are listed in the Supporting Information accompanying this article (Text S1). We only used regulatory phrases for which the average of the *a posteriori* probability distribution was greater than 70% (activatory phrase) or smaller than 30% (inhibitory phrase). The inference technique is explained below in the Materials and Methods section. With this technique, we obtained the following gene expression probabilities: *p*=0.576±0.038 for *E. coli* (from 264 inferred phrases), *p*=0.495±0.055 for *S. cerevisiae* (from 196 phrases), and *p*=0.0531±0.035 for *B. subtilis* (from 307 phrases). We then constructed random Boolean functions with internal bias given by these probabilities.

We report in Figure 2 the Derrida curves (i.e. the graphs of *M*(*x*)) for the five genetic networks under consideration. It is apparent from this figure that in all five cases the Derrida curves arrive almost tangent to the identity close to the origin. The above is consistent with critical behavior at the genetic level. A polynomial regression analysis allows us to estimate the average network sensitivity *S*=[*dM*(*x*)/*dx*]_{x}_{=0} by computing the slope at the origin of the best-fit polynomial with a degree equal to the maximum number of regulators per gene in each case (this is the polynomial predicted by the mean-field theory). In all five cases the Regression Sum of Squares is below 10^{−4}, and the average network sensitivities obtained in this way are: *S*=1.028 for *E. coli*; *S*=1.036 for *S. cerevisiae*; *S*=0.826 for *B. subtilis*; *S*=0.914 for *D. melanogaster*; and *S*=1.127 for *A. thaliana*. Within numerical accuracy, these sensitivities show that the dynamics of these networks are very close to criticality. Note that this is particularly true for the two largest and most complete networks of *E. coli* and *S. cerevisiae*.

We have used random Boolean functions in our simulations because the overwhelming majority of the logical rules for *E. coli*, *S. cerevisiae* and *B. subtilis*, are still unknown. The caveat is that real biological functions are not random. It has been pointed out that canalizing functions are more realistic from a biological point of view [27], [28]. A formal definition of canalizing functions can be found in Ref. [31]. Here, it suffices to define a canalizing function as follows. Let *F*(*g*_{1}, *g*_{2}, …, *g _{k}*) be a Boolean function of

It is known that the amount of canalizing functions in the system can change the dynamical regime in which the network operates [29], [30]. Therefore, it is important to determine if the dynamics of the large networks of *E. coli*, *S. cerevisiae* and *B. subtilis* are still critical when more realistic Boolean functions are used. However, from the microarray experiments that we analyzed it is impossible to know the fraction of canalizing functions present in these organisms, or if such functions have one, two or more canalizing inputs. For instance, according to the *regulonDB*, which gives the topology of the transcription regulatory network of *E. coli* [46], there are at least 817 genes with two or more regulators. These 817 genes are all regulated by a subset of 160 genes. Therefore, to determine whether or not the Boolean functions associated with these genes are canalizing, one would have to analyze microarray experiments probing at least 2^{160} different configurations for these genes. This is impossible for many reasons, but mainly because not all the 2^{160} configurations of the regulators are biologically attainable. (Consider for instance the configuration 00000….00 in which all the regulators are “off”, or the configuration 11111…11 in which all the regulators are “on”. Clearly, these two configurations are not attainable under any biological condition—without killing the organism.) Therefore, the 152 microarrays used to sample the gene expression space in *E. coli* represent a very tiny fraction of all the possible configurations necessary to determine the whole set of Boolean functions. However, these 152 experiments are absolutely relevant because they represent the *observable and biologically attainable* gene expression configurations of the organism. Thus, it might be irrelevant if the whole Boolean function of a given gene is canalizing because neither us nor the organism are sampling its whole set of (mathematically possible) configurations. For this reason, we do believe that the important quantity is the *observed* gene expression probability for the biologically attainable configurations.

Nonetheless, Random Boolean functions already contain canalizing functions. Table 2 gives the probability *P _{c}*(

From Tables 2 and and33 we obtain that, if the Boolean functions for *E. coli* are generated at random, just by chance about functions out of 1482 (70%) would be canalizing on at least one input. If we do not consider the 615 genes with only one regulator (because for such genes the Boolean function is trivially canalizing), and the 50 genes with no inputs, then there are 817 genes with two or more regulators. From the data listed in Tables 2 and and33 one obtains that, for the genes with *K*≥2, about out of 817 of the randomly generated functions would be canalizing just by chance. It is important to note that these 421 canalizing functions mostly come from the genes with *K*=2 and *K*=3. Thus, combining the results presented in Tables 2 and and33 we obtain that the fraction *f _{c}* of canalizing functions present just by chance for the genes with

Therefore, in the simulations with random Boolean functions there is already a large fraction of canalizing functions. To further investigate the effect that canalizing functions might have on the dynamics of the genetic networks, we have added and removed canalizing functions to and from the ones already present just by chance. Our results indicate that the critical behavior observed in the genetic dynamics still exists even when the fraction of canalizing functions substantially deviates from the value expected for fully random functions.

Figure 3 shows the map *M*(*x*) for the *E. coli* network for several values of the fraction *f _{c}* of genes with

Analogously, to decrease the value of *f _{c}* we removed a fraction

From Figure 3 it is clear that the Derrida map is less robust to the removal than to the addition of canalizing functions. However, in the region 0.4≤*f _{c}*≤0.66 the Derrida map does not seem to change substantially.

To indicate the significance of the above results we present in Figure 4 the map *M*(*x*) for homogeneous random networks with different fractions *f _{c}* of canalizing functions. In these networks each gene has exactly

These results strongly suggest that the critical behavior observed in the dynamics of the genetic networks of real organisms, and its robustness to changes in the fraction of canalizing functions, are most probably due to the network architecture rather than to the logical rules that regulate the expression of the genes. If, for instance, there were a large fraction of genes with 5 inputs in the *E. coli* network, then the presence of canalizing functions would certainly affect the dynamical regime in which the network operates. But given that most of the genes have 1, 2 or 3 inputs, there is already a large fraction of canalizing functions just by chance. Adding more does not substantially affect the dynamical behavior.

Our results show that, to the best of the current experimental data available, and within numerical accuracy, the Boolean dynamics of the GRN of five organisms from four different kingdoms are critical. We have used two small well documented networks for specific patterning processes in plants (*A. thaliana*) and animals (*D. melanogaster*), for which both the topology and the Boolean functions are known and correspond to thoroughly studied processes at the molecular level. We also tested larger GRN for unicellular organisms (*E. coli*, *B. subtilis* and *S. cerevisiae*) in which the logical rules are not known. Thus, in the absence of other knowledge, we used random regulatory phrases constructed with gene expression probabilities inferred from microarray experiments. How ever, the results are essentially the same when the fraction of canalizing functions with the same gene expression probabilities varies considerably. The fact that all these GRN, constructed with completely different approaches, for distinct organisms and of different sizes exhibit critical Boolean dynamics is of outmost interest and strongly suggest that this might be a generic characteristic with far reaching consequences. For there is evidence that criticality confers clear evolutionary advantages to living organisms, because it is only close to criticality that robustness and adaptability can coexist. It remains an open problem to determine how criticality has emerged throughout evolution, i.e. to devise biologically relevant models of network growth that generate critical dynamics. Such models must take into account not only the evolution of the network topology, but also the emergence of the regulatory phrases through which the genes interact. If this critical behavior is corroborated as more and better experimental data become available, and with more detailed dynamic models, criticality at the genetic level may become a fundamental evolutionary mechanism that renders the stability and diversity that we observe in living organisms.

We computed the phase transition displayed in Fig. 2(b) by implementing the Boolean dynamics in networks with scale-free output topology and Poisson input topology. Such networks are easily generated in the computer by firs assigning to each gene *g _{n}* its number of outputs

Once we know the regulators of every gene in the network, the Boolean functions are assigned as follows. A Boolean function of *k* inputs has 2* ^{k}* values, one for each of the 2

For *E. coli* all available microarray experiments in the Stanford Microarray Database (SMD, see http://genome-www5.stanford.edu) were incorporated in the inference algorithm, except by experiments that increased the number of false positives (see Assessing Inference Success below); 107 experiments were selected in total from SMD. Also, 45 microarray experiments were incorporated from Gosset et al, 2004 and Zhang et al, 2005. For *B. subtilis* all the available microarray experiments in the KEGG Expression Database (www.genome.jp/kegg/expression) were incorporated except experiments that increased the number of false positives; 69 experiments in total. For *S. cerevisiae* basically all the data form the three experimenters were retrieved from SMD. The name of the experimenter and the number of microarrays are as follows: Gasch, 138; DeRisi, 29; Spellman, 56. Only the cell cycle experiments from Spellman were rejected because they do not conform to the hypothesis of the Bayesian inference algorithm, which requires the absence of oscillating variables. See the Supporting Information (Text S1) for a complete relation of ID's of the incorporated experiments for the three organisms.

The data were retrieved from SMD as Log Ratios (base 2). These data were already background corrected and mean normalized by SMD itself. Only features with no flags were selected. Data from (Gosset et al, 2004; Zhang et al, 2005) were background corrected and normalized by Affymetrix Microarray Suite 5.0. Log Ratios (base 2) were then calculated between the wild type experiments and the mutants with and without glucose. The data from the KEGG Expression Database were already normalized when retrieved. They were only background corrected by us; Log ratios (base 2) were obtained.

Only Log Ratios greater than a certain threshold *T*_{0} were considered (see Assessing Inference Success). With the filtered data we performed standard Bayesian parametric inference with variables of two states, inhibited (repressed) and induced (activated). The Equivalent Sample Size was set equal to 4. The induction or repression of a gene were established if the average of the *a posteriori* distribution was equal to or greater than 70% for induction, and equal to or lower than 30% for repression. A detailed example of the inference algorithm is presented below.

The nature of the regulation (activation or repression) of many of the established regulations in the networks of *E. coli* and *B. subtilis* is reported in the corresponding databases (Salgado et al, 2006; Makita et al, 2004). From this information it is possible to generate regulatory phrases only for those genes with one regulator. Thus we compared the one-regulator phrases already reported with our inferred one-regulator phrases. The inference success was established as the percentage of coincidences between our inferred phrases and the reported ones. The inference success increases as the threshold *T*_{0} increases. However, the total number of phrases that can be inferred diminishes as *T*_{0} increases. A good compromise between low percentage of false positives (high inference success) and a good statistics was achieved at *T*_{0}=1.50 for *E. coli*, and *T*_{0}=1.30 for *B. subtilis*. From these results we found that a good threshold for *S. cerevisiae*, for which the nature of the regulations is not reported, is *T*_{0}=1.50. Note that a threshold of 1.5 in Log_{2} Ratios is equivalent to almost a three fold change in expression intensities.

We illustrate the inference technique to obtain the value of the parameter *p* from microarray experiments with a specific example. Suppose that gene A is regulated by genes B and C, and that we want to determine the regulatory phrases, i.e. how A changes its expression due to the joint combinatorial changes of B and C. In order to do so, we need a set of microarray experiments in different conditions that have been already normalized and their background noise subtracted. We use the data shown in Table 4 for the gene expression level of three genes (log ratios base 2), obtained from two color spotted microarrays. In this table, “Cond_{i}” refers to one microarray experiment in the i^{th} condition (not necessarily all the conditions have to be different). For instance, Cond_{1} may be the wild type condition, whereas Cond_{2} may be a condition in which a global regulator has been knocked out. In general, the level of expression of a given gene is different from one condition to another. Namely, there may be over expressions or under expressions across the different conditions. To filter out only the biggest changes in the level of expression of the genes we use a threshold *T*_{0}, and indicate the positive changes (those greater than the threshold) with an arrow pointing upwards. Analogously, we indicate the negative changes (those lower than the threshold) with an arrow pointing downwards. When no change is detected we use the symbol (—). Using the data shown in Table 4 and setting the threshold value to *T*_{0}=1.5, we get the discrete representation shown in Table 5.

By counting how many times A changes for the different combinations of B and C given in Table 5, we obtain the data shown in Table 6. Some table entries are equal to zero, indicating that there are no data for that particular combination of B and C. Note that we have considered all the cases with three symbols (↓,—,↑), and two regulators (B and C). Now we reduce Table 6 by considering only the entries where a change can be detected (i.e., we eliminate all entries with “—”). This gives the results displayed in Table 7. For reasons that will be clear in a moment, it is necessary to add one unit of *a priori* evidence to every entry of Table 7. After this unit has been added, we obtain Table 8. Observe from this last table that, for the first combinatorial change of B and C (first row), there are three evidences where the level of expression of A decreases and one in which it increases. If these were all the evidences, we could say that from 100% of the cases (3+1), in 75% of them the expression level of A increases and in 25% of them it decreases. Thus, the *a priori* evidence that we have added has the effect of changing the importance the *a posteriori* evidence. There are several ways of distributing the *a priori* evidence; in this work we have used an Equivalent Sample Size *S*=4, which means that the 4 units of *a priori* evidence are distributed equally among all the possibilities.

Finally, given the *a posteriori* evidence, in order to decide whether the expression level of A increases or decreases we use a second threshold *T*_{1}. If the percentage of *a posteriori* evidence is greater or equal than *T*_{1}, then a regulatory phrase has been established. Using a threshold *T*_{1}=75% we obtain the results shown in Table 9. We have used the symbol “—” to state that the evidence does not support a decision. To infer the regulatory phrases from the microarray experiments analyzed in this work we used the value *T*_{1}=75%.

Once a set of phrases has been successfully established, we determine the probability of gene expression as the quotient of activatory phrases between the total of inferred phrases. In the case shown in Table 9, two phrases have been successfully established, one activatory and the other inhibitory. Thus, the estimated probability of gene expression in this example would be *p*=½.

To validate our methodology for inferring regulatory phrases we use the nature of the regulation already reported in the data bases for the case of simple regulations, namely, when only one regulator determines the expression of the regulated gene (gene A is only regulated by gene B). For example, in *E. coli* the gene *alaW* is regulated (positively) only by *fis*. In the Regulon Data Base this regulation is represented as:

The plus sign at the end indicates that *alaW* is positively retulated by *fis*. The table with the two phrases follows immediately:

*fis* *alaW*

↓ ↓

↑ ↑

Note that this information in the RegulonDB cannot be used to validate our methodology for genes with more than one regulator. This is because the regulation is combinatorial. To validate our methodology we compared only the inferred phrases for simple regulations with the ones reported in the RegulonDB or the DBTBS. Figure 5 shows the inference success (the fraction of regulatory phrases form simple regulations that matched the phrases obtained from the curated databases) as a function of the threshold *T*_{0}. In the same graphs we have plotted the total number of inferred phrases (including the phrases from simple regulations). As we can see, the inference becomes better as the threshold *T*_{0} increases. However, the total number of inferred phrases decreases with the threshold. In order to have a good statistics (more than 250 inferred phrases) and a high inference success (around 90%), in this work we have chosen *T*_{0}=1.5 in *E. coli* and *T*_{0}=1.3 in *B. subtilis*.

Java Applet of the dynamics and list of ID numbers for the microarray experiments used in this work.

(0.05 MB DOC)

Click here for additional data file.^{(50K, doc)}

We thank the reviewers of this work for their useful comments and suggestions.

**Competing Interests: **The authors have declared that no competing interests exist.

**Funding: **This work was partially supported by the National Institute of Health grants GM070600 (I.S. and S.A.K.), GM072855 (I.S.), and P50 GM076547 (I.S.), CONACyT grants P47836-F (M.A.), CO1.41848/A-1, CO1.0538/A-1 and CO1.0435.B-1 (E.R.A.B.), and PAPIIT-UNAM grants IN112407-3 (M.A.) and IN230002 (E.R.A.B.). E.B. and A.C. acknowledge CONACyT for Ph.D. scholarships.

1. Hohenberg PC, Halperin BI. Theory of dynamic critical phenomena. Rev Mod Phys. 1977;49:435–479.

2. Turcotte DL. Self-organized criticality. Rep Prog Phys. 1999;62:1377–1429.

3. Peters O, Neelin JD. Critical phenomena in atmospheric precipitation. Nat Phys. 2006;2:393–396.

4. Arakawa A. Scaling tropical rain. Nat Phys. 2006;2:373–374.

5. Lux T, Marchesi M. Scaling and criticality in a stochastic multi-agent model of a financial market. Nature. 1999;397:498–500.

6. Mantegna RN, Stanley HE. Scaling behavior in the dynamics of an economic index. Nature. 1995;376:46–49.

7. Heimpel M. Critical behavior and the evolution of fault strength during earthquake cycles. Nature. 1997;388:865–868.

8. Scholz CH. Earthquakes and friction laws. Nature. 1998;391:37–42.

9. Ostojic S, Somfai E, Nienhuis B. Scale invariance and universality of force networks in static granular matter. Nature. 2006;439:828–830. [PubMed]

10. Chialvo DR. Are our senses critical? Nat Phys. 2006;2:301–302.

11. Kinouchi O, Copelli M. Optimal dynamical range of excitable networks at criticality. Nat Phys. 2006;2:348–352.

12. Werner G. Metastability, criticality and phase transitions in brain and its models. BioSystems. 2007 doi:10.1016/j.biosystems.2006.12.001. [PubMed]

13. Sethna JP, Dahmen K A, Myers CR. Crackling noise. Nature. 2001;410:242–250. [PubMed]

14. Kioussis D. Kissing chromosomes. Nature. 2005;435:579–480. [PubMed]

15. Levine M, Tjian R. Transcription regulation and animal diversity. Nature. 2003;424:147–151. [PubMed]

16. Kitano H. Biological robustness. Nat Rev Genet. 2004;5:826–837. [PubMed]

17. de Visser J, Hermisson J, Wagner GP, Meyers LA, Bagheri-Chaichian H, et al. Perspective: evolution and detection of genetic robustness. Evolution. 2003;57:1959–1972. [PubMed]

18. Kirshner M, Gerhart J. Evolvability. Proc Natl Acad Sci U S A. 1998;95:8420–8427. [PubMed]

19. Nehaniv C. Evolvability. BioSystems. 2003;69:77–81. [PubMed]

20. Wagner A. Robustness and evolvability in living systems. Oxford University Press; 2007.

21. Aldana M, Balleza E, Kauffman S, Resendis O. Robustness and evolvability in gene regulatory networks. J Theor Biol. 2007;245:433–448. [PubMed]

22. Serra R, Villani M, Semeria A. Genetic network models and statistical properties of gene expression data in knock-out experiments. J Theor Biol. 2004;227:149–157. [PubMed]

23. Serra R, Villani M, Graudenzi A, Kauffman S. Why a simple model of genetic regulatory networks describes the distribution of avalanches in gene expression data. J Theor Biol. 2007;246:449–460. [PubMed]

24. Shmulevich I, Kauffman S, Aldana M. Eukariotic cells are dynamically oredred or critical but not chaotic. Proc Natl Acad Sci U S A. 2005;102:13439–13444. [PubMed]

25. Nykter M, Price ND, Aldana M, Ramsey SA, Kauffman SA, Hood LE, Yli-Harja O, Shmulevich I. Gene expression dynamics in the macrophage exhibit criticality. Proc. Natl. Acad. Sci. U.S.A. 2008;105(6):1897–1900. [PubMed]

26. Derrida B, Stauffer D. Phase transitions in two dimensional Kauffman cellular automata. Europhys Lett. 1986;2:739–745.

27. Harris S, Sawhill B, Wuensche A, Kauffman S. A model of transcriptional regulatory networks based on biases observed in the regulatory rules. Complexity. 2002;7:23–40.

28. Shmulevich I, Lähdesmäki H, Dougherty ER, Astola J, Zhang W. The role of certain Post classes in Boolean network models of genetic networks. Proc. Natl. Acad. Sci. U.S.A. 2003;100 (19):10734–10739. [PubMed]

29. Nikolajewa S, Friedel M, Wilhelm T. Boolean networks with biologically relevant rules show order behavior. BioSystems. 2007;90:40–47. [PubMed]

30. Karlsson F, Hörnquist M. Order and chaos in Boolean gene networks depends on the mean fraction of canalizing functions. Physica A. 2007;384:747–575.

31. Just W, Shmulevich I, Konvalina J. The number and probability of canalizing functions. Physica D. 2004;197:211–221.

32. Bower JM, Bolouri H, editors. Computational Modeling of Genetic and Biochemical Networks. The MIT Press; 2001.

33. Smolen P, Baxter DA, Byrne JH. Modeling transcriptional control in gene networks - methods, recent results, and future directions. Bull Math Biol. 2000;62:247–292. [PubMed]

34. Hasty J, McMillen D, Isaacs F, Collins JJ. Computational studies of gene regulatory networks: In Numero molecular biology. Nat Rev Genet. 2001;2:268–279. [PubMed]

35. Bornholdt S. Less is more in modeling large genetic networks. Science. 2005;310:449–451. [PubMed]

36. Chaves M, Sontag ED, Albert R. Methods of robustness analysis for Boolean models of gene control networks. IIE Proc-Syst Biol. 2006;153:154–167. [PubMed]

37. Kauffman S. Metabolic stability and epigenesis in randomly constructed genetic nets. J Theor Biol. 1969;22:437–467. [PubMed]

38. Aldana M, Coppersmith S, Kadanoff LP. Boolean dynamics with random couplings. In: Kaplan E, Marsden JE, Sreenivasan KR, editors. Perspectives and Problems in Nonlinear Science. Springer NY; 2003. pp. 23–89.

39. Kauffman S. At Home in the Universe. Oxford University Press; 1995.

40. Mendoza L, Alvarez-Buylla ER. Genetic regulation of root hair development in Arabidopsis thaliana: a network model. J Theor Biol. 2000;204:311–326. [PubMed]

41. Espinosa-Soto C, Padilla-Longoria P, Alvarez-Buylla ER. A gene regulatory network model for cell-fate determination during Arabidopsis thaliana flower development that is robust and recovers experimental gene expression profiles. The Plant Cell. 2004;16:2923–2939. [PubMed]

42. Albert R, Othmer HG. The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in Drosophila melanogaster. J Theor Biol. 2002;223:1–18. [PubMed]

43. Huang S, Ingber DE. Shape-dependent control of cell growth, differentiation, and apoptosis: Switching between attractors in cell regulatory networks. Exp Cell Res. 2000;262:91–103. [PubMed]

44. Aldana M. Boolean dynamics of networks with scale-free topology. Physica D. 2003;185:45–66.

45. Pastor-Satorras R, Vespignani V. Epidemis spreading in scale-free networks. Phys Rev Lett. 2001;86:3200–3203. [PubMed]

46. Salgado H, Gamma-Castro S, Martínez-Antonio A, Diaz-Peredo E, Sanchez-Solano F, et al. RegulonDB (version 5.0): Escherichia coli K-12 transcriptional regulatory network, operon organization, and growth conditions. Nucleic Acids Res. 2006;34 (Database issue):D394–397. [PMC free article] [PubMed]

47. Luscombe NM, Babu MM, Yu H, Snyder M, Teichmann SA, et al. Genomic analysis of regulatory network dynamics reveals large topological changes. Nature. 2004;431:308–312. [PubMed]

48. Makita Y, Nakao M, Ogasawara N, Nakai K. DBTBS: database of transcriptional regulation in Bacillus subtilis and its contribution to comparative genomics. Nucleic Acids Res. 2004;32:D75–77. [PMC free article] [PubMed]

49. Stanford Microarray Database (http://genome-www5.stanford.edu)

50. Gosset G, Zhang Z, Nayyar S, Cuevas WA, Saier MH., Jr Transcriptome analysis of crp-dependent catabolite control of gene expression in Escherichia coli. J Bacteriology. 2004;186:3516–3524. [PMC free article] [PubMed]

51. Zhang Z, Gosset G, Barabote R, González CG, Cuevas WA, et al. Functional interactions between the carbon and iron utilization regulators, crp and fur, in Escherichia coli. J Bacteriology. 2005;187:980–990. [PMC free article] [PubMed]

52. KEGG Expression Database (www.genome.jp/kegg/expression)

Articles from PLoS ONE are provided here courtesy of **Public Library of Science**

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |