Home | About | Journals | Submit | Contact Us | Français |

**|**Gene Regul Syst Bio**|**v.5; 2011**|**PMC3080741

Formats

Article sections

- Abstract
- Introduction
- Gene Expression Paradigm
- Some Technicalities
- Regime of Self-Sustained Oscillations
- Criticality
- Stiff Excitation and Oscillations
- Gene Expression Initiation
- Discussion
- Summary
- References

Authors

Related links

Gene Regul Syst Bio. 2011; 5: 27–40.

Published online 2011 March 22. doi: 10.4137/GRSB.S6804

PMCID: PMC3080741

National Cancer Institute, EPN 3108, 6130 Executive Blvd., Rockville, Maryland 20892, USA

Corresponding author email: sr212a/at/nih.gov

Copyright © the author(s), publisher and licensee Libertas Academica Ltd.

This is an open access article. Unrestricted non-commercial use is permitted provided the original work is properly cited.

This article has been cited by other articles in PMC.

In this paper, a new model of self-organized criticality is introduced. This model, called the *gene expression paradigm*, is motivated by the problem of gene expression initiation in the newly-born daughter cells after mitosis. The model is fundamentally different in dynamics and properties from the well known *sand-pile paradigm*. Simulation experiments demonstrate that a critical total number of proteins exists below which transcription is impossible. Above this critical threshold, the system enters the regime of self-sustained oscillations with standard deviations and periods proportional to the genes’ complexities with *probability one*. The borderline between these two regimes is very sharp. Importantly, such a self-organization emerges without any deterministic feedback loops or external supervision, and is a result of completely random redistribution of proteins between inactive genes. Given the size of the genome, the domain of self-organized oscillatory motion is also limited by the genes’ maximal complexities. Below the critical complexity, all the regimes of self-organized oscillations are self-similar and largely independent of the genes’ complexities. Above the level of critical complexity, the whole-genome transcription is impossible. Again, the borderline between the domains of oscillations and quiescence is very sharp. The gene expression paradigm is an example of cellular automata with the domain of application potentially far beyond its biological context. The model seems to be simple enough for staging an experiment for verification of its remarkable properties.

Immediately after cell division, the DNA molecules in daughter cells are largely “naked”, that is, they lack the regulatory elements which are necessary for launching the gene expression machinery. As in the classical problem of the chicken and the egg, genes are waiting for transcription factors to initiate transcription, but the gene expression machinery should be launched first in order to produce them. Figuratively speaking, after the major disruption caused by mitosis, the cell is in the state of turmoil and uncertainty regarding initiation of gene expression. No deterministic process to trigger gene expression in the newly-born daughter cells is yet known. In^{1} we have briefly discussed a conceivable scenario of gene expression initiation. Due to random partitioning of mRNAs during mitosis,^{2}^{,}^{3} it is possible that at least some of them are inherited in quantities sufficient for starting the *translation* of some proteins. Selective growth of protein concentrations, in turn, increases the probability that some genes receive an adequate collection of transcription factors (TFs) for producing the first burst of mRNA. This first burst triggers a positive feedback loop and initiates an *avalanche* of transcription events randomly propagating throughout the entire genome. Qualitatively, this process resembles the *avalanches* considered in the theory of self-organized criticality. ^{4} However, without deeper conceptual modeling, it is difficult to envision whether or not some kind of asymptotic regime may be reached in this process, how this asymptotic regime will be structured, and what are the major systemic parameters governing the process.

In this work, we propose a simple model which is intended to elucidate these issues. Despite its simplicity, the properties of the model are rather remarkable. Conceptually, this is a *paradigm model*. This means that the model is not intended to describe the process of gene expression initiation in its entirety and with much realistic biological detail; instead, it directs attention to some of its fundamental facets. Within this paper, we will refer to this model as the *gene expression paradigm*. By invoking such terminology, we draw a parallel with the celebrated *sand-pile paradigm* in the theory of self-organized criticality. As is well known, the sand-pile paradigm reflects the most fundamental properties of sand-piles without incorporating much realistic detail. This is because the physical properties of sand grains and the specifics of their interaction are too numerous to be included into a grossly simplified model. One true value of this paradigm is that it introduces the notion of *criticality* as a new type of dynamic stability and provides deep insight into the origins of scaling laws known in many domains of science.^{5}^{,}^{6} In this capacity of a *paradigm* (that is, a framework of thinking about certain class of phenomena), this model has important applications far beyond the sand-pile dynamics.^{7} In its logical and mathematical essence, the sand-pile model is a cellular automaton with very simple rules of updating. Despite this simplicity, the model is very rich in properties, its behavior is very complex, and it would be hardly possible to describe such behavior by other mathematical tools in equivalently parsimonious manner. Likewise, the gene expression paradigm is in essence a cellular automaton with simple rules of updating which elucidates some core issues in gene expression dynamics. Similar to the sand-pile paradigm, there is also a hope that, due to its logical simplicity and mathematical generality, the gene expression paradigm may be applied to many problems far beyond the initial biological context.

Let us imagine a set of sites, which we will call *genes*, *g*_{1:}* _{N}*, organized as a one-dimensional string called a

(1)

Due to random variations of {*L _{i}*}, each with variance

The majority of computations presented below have been performed with the size of genome *N* = 1000, unless stated otherwise. For brevity, the gene-specific numbers of TFs which are necessary for sparking transcription, t_{1:}* _{N}*, are called

We first consider a typical case when the *average complexity*, = 30, and the coefficient of variation of the thresholds,
. Figure 1 shows the sample distribution of thresholds corresponding to this choice. It is worth noting that the majority of the thresholds lie between 10 and 50 with a negligible number of thresholds too close to zero. In this computation, the degree of completeness is set to *λ* = 0.6. This means that from the total *N* = 30,000 TFs which are required for completely filling all the regulatory sites, only 18,000 are actually available. This lack of proteins makes it impossible for all the genes to transcribe simultaneously. Figure 2 depicts the temporal evolution of gene-specific levels of TFs (hereafter, we will call these numbers *loads*). For the purpose of this demonstration, four genes have been selected at random from the totality of 1000 genes. As stated above, initial loads are distributed multinomially; hence, due to random fluctuations of initial loads and presence of the genes with comparatively low thresholds, a certain number of genes may be fully equipped for transcription from the very beginning. Subsequent behavior of the loads has a number of notable features. After a certain initial period when the proteins randomly travel from one site to another, some *additional loads* sooner or later reach the threshold level, thus producing transcriptions of the corresponding genes. After that, according to the rules of the gene expression paradigm, these regulatory sites are emptied, all the TFs are released for random repossession between all other genes, and the process begins anew. The most surprising result is that, despite purely random redistribution of newly available proteins between the non-empty regulatory sites, *strict periodicity emerges for all the genes*, each with its own autocorrelation structure. Generally, the oscillatory motions of different genes *are not* mutually correlated. Figure 3 shows examples of auto-and cross-correlation structure for a pair of genes selected at random. As seen from this graph, the autocorrelation structures of the genes are markedly different. In order to assign a *global measure* of cross-correlations to the entire set of 1000 individual time-series, we have computed the average of absolute values of non-diagonal correlations; in this case it was found to be 0.23.

Sample distribution of the gene-specific thresholds (ie, numbers of TFs required for transcription initiation) for = 30 and *r* = 0.3.

Examples of temporal dynamics of gene-specific transcription factors for four genes selected at random from the genome of size 1000.

Auto- and cross-correlations for two genes selected at random. Individual autocorrelations are markedly different. Cross-correlations are small.

It is easy to envision from Figure 2 a connection between the periods of oscillations and thresholds of corresponding genes: for those genes with larger thresholds, the number of cycles necessary to reach the transcription status are greater; hence, their periods are longer. In order to quantify this dependence, we have parameterized the autocorrelation structures of individual time series using autoregressive stochastic sequences. Namely, the time series of loads, *x _{i}* (

(2)

where
are autoregressive coefficients, *ε _{i}*(

(3)

where *t _{p}* is the characteristic time of

Left panel: dependence of quasi-periods of oscillations on the genes’ complexities (thresholds). Right panel: the same for standard deviations of fluctuations. This picture shows that the genes with greater complexities oscillate slower and with **...**

Figure 5 depicts the overall dynamics of the system in terms of the number of simultaneously active genes. As seen from this picture, after a relatively rapid “take-off” consisting of about 15 cycles, the system enters a self-organized mode with an almost constant level of activity with ~300 genes (out of 1000) transcribing simultaneously. Note that about 9000 TFs are engaged in this activity which is a half of all the TFs available in the system. While these 300 genes are in the transcription mode, the remaining 700 genes are busy redistributing the remaining 9000 proteins among one another.

If the total number of proteins available in the system is not sufficiently large then the probability of simultaneous transcription initiation by a large number of genes may be too small to launch the whole-genome gene expression. Remarkably, the borderline between the regime of whole-genome oscillatory motion and the regime of quiescence is very sharp, in the essence, critical. Figure 6 illustrates this premise. The solid line corresponds to the super-critical parameter, *λ*_{+} = 0.549, whereas the dashed line corresponds to the sub-critical parameter, *λ*_{–} = 0.547. If one runs the simulation repeatedly for *λ* = *λ*_{+} then one would *always* end up with the whole-genome oscillatory transcription with identical asymptotic levels of activity. For the reference, we provide the time course of activity above the critical region, that is, for *λ* = 0.56 (dotted line). In contrast, if the simulation is run repeatedly with *λ* = *λ*_{–}, then the asymptotic limit will *always* be complete quiescence. The critical value of *λ* lies somewhere in the interval [*λ*_{–}, *λ*_{+}], and our best estimate for it is *λ _{c}* = 0.5748. The behavior of the system with

Example of criticality for the genome of size 1000 and average complexity 30. Critical value of degree of completeness, *λ*, is 0.5748.

Time series of loads for five genes selected at random from 1000. Despite wide variations in behavior during the transition period, the regime of self-sustained oscillations emerges almost simultaneously for all genes.

Heat map of time courses of 50 genes selected at random from 1000. Despite widely varying behaviors during the transition period, the regime of self-sustained oscillations emerges simultaneously for all the genes.

The heat map in Figure 9 provides a deeper insight into the dynamics of transition to the regime of self-sustained oscillation. In this picture, the bottom half depicts the timelines of loads for 25 genes with the smallest thresholds (ie, the simplest genes) while the top half depicts the timelines of loads for 25 genes with the largest thresholds (the most complex genes). As seen from this picture, the paths of these two groups of genes towards their corresponding asymptotic regimes are markedly different. However, these diametrically opposite groups of genes enter the regime of self-sustained oscillations almost simultaneously. This tells us that we are dealing here with the *whole-genome self-synchronization phenomenon emerging without any external organizing forces or deterministically introduced feedback loops.*

Comparison of transitions to asymptotic oscillatory mode for the 25 simplest genes (minimal thresholds) with those for the most complex genes (maximal thresholds). The thresholds are given in ascending order on the left side of the graph with the smallest **...**

The position of the critical region, [*λ*_{–}, *λ*_{+}] on the scale of *λ* depends on the width of the distribution of the thresholds. When the variability of the thresholds increases, the borderline between the regimes of quiescence and oscillations moves down towards smaller values of *λ _{c}*. In a biological context this means that the greater the diversity of genes’ complexities, the easier it is to find some low-threshold genes which initiate the whole-genome self-organization. To illustrate this premise, we have simulated the case with

Time course of transcriptional activity for wider distribution of the thresholds (coefficient of variation *ζ* = 0.75). Transition period is longer than for the case *ζ* = 0.3, but asymptotic oscillatory mode is achievable anyways.

Heat maps of time courses for the 25 simplest (bottom half) and 25 of the most complex genes (top half) for the coefficient of variation *ζ* = 0.75.

There exists another type of criticality in the gene expression paradigm. For a given genome size, there exists a certain *maximal complexity* of genes above which the transcription initiation becomes impossible. Similarly, the borderline between the full-fledged transcription and complete quiescence is amazingly thin. As an example, the solid line in Figure 12 depicts the time course of transcriptional activity when the genome size *N* = 500 and average threshold is = 100. It shows that under these conditions the whole-genome oscillation does exist and that the asymptotic level of activity is approximately the same as in *the entire region* of smaller thresholds (ie, for much simpler genes). As a reference, we provide the time course of activity for = 30 (dotted line). However, when the genes’ complexity is just one step above the critical value, ie, = 101, the gene expression excitation is no longer possible, periodic or otherwise (dashed line).

When *λ* is *well above* the critical value, and the variations in genes’ thresholds are small (say, *ζ* = 0.1), an interesting phenomenon of almost simultaneous excitation and highly synchronized oscillations occurs. As seen from Figure 13, for *λ* = 0.9 the whole-genome oscillatory regime is established almost immediately upon starting the process. This oscillatory mode features not only oscillations of loads of individual genes, but also the periodic oscillations of the *total number* of active genes. Another notable feature of this regime transpires in Figure 14. All the individual oscillations are tightly synchronized and have identical periods of oscillations, regardless of their individual complexities. During the onset of the oscillatory regime, the genes with the highest thresholds are just one step behind those with the smallest thresholds. Oscillations are synchronized not only between each other but also with the total number of active genes. In this oscillatory mode, a majority of all the periods reach their minimal possible value 2. One may refer to this regime as a *regime of stiff oscillations*. In this regime, the total number of proteins available in the system is so large that each gene is forced to reside in only one of two possible stages, that is, to be either overloaded or transcribed.

Stiff oscillations for high degree of completeness 0.9 and narrow distribution of genes’ complexities (coefficient of variation 0.1).

As mentioned in Introduction, in^{1} the author has made a conjecture that the whole-genome gene expression starts with those genes which require the smallest number of TFs, and propagates upward on the scale of complexity as an avalanche-like process. The simulations within the gene expression paradigm corroborate this premise. Figure 15 shows how the complexities of *active* genes change with time. First, the transcription follows a narrow path involving a small fraction of all the genes, and these genes are predominantly the ones of relatively low complexity. Complexity of the transcribing genes, as well as their total number, gradually increase. Upon reaching some critical state the number and complexity grows explosively, encompassing the entire genome.

The *gene expression paradigm* that we have introduced in this paper emerged from our long-standing interest in the fundamentals of nonlinear dynamics of genetic regulatory networks.^{1}^{,}^{10}^{–}^{16} However, given the popularity and importance of the *sand-pile paradigm* in the science of self-organized criticality, it makes some sense to formulate the gene expression paradigm in terms of sand dynamics as well, thus emphasizing its simplicity and generality. Let us imagine a large number of sand pits and an equal number of children standing with shovels next to each of them. Suppose that each pit has a certain limiting capacity. The game goes as follows. When the content of a pit reaches its limiting capacity, the child responsible for it randomly shovels the sand from his pit to *all other pits* and does it until his pit is emptied. In this process, some other pits may become overloaded, and the corresponding child takes care about his pit in the same manner. A remarkable property of this game is that shortly after starting the process, the content of each pit enters the quasi-periodic mode, with the amplitudes and periods of oscillations being proportional to the pits’ limiting capacities. It is easy to notice the fundamental difference of this process with the sand-pile paradigm. In the latter, the excess of sand is randomly redistributed between the *neighboring sites*, thus causing an *avalanche*. In our model, the sand is randomly redistributed between *all the sites*, thus causing a somewhat opposite phenomenon of the *self-organized oscillations*. As often happens in the dynamics of cellular automata,^{17} this simple modification of the rules of updating leads to a completely different behavior.

In a sense, the model we have considered is *one-dimensional*. To wit, we have assumed that all the proteins are equivalent in their roles as TFs; hence, only their total quantity was of importance. Perhaps a more realistic approach would require reformulation of the entire problem in terms of very high-dimensional configuration space in which each TF plays a unique role and a certain *assortment* of them would be necessary to start a transcription. Such a move towards realism, albeit intriguing, would probably mean a death sentence to the model due to tremendous computational difficulties.

In natural sciences, simplified theoretical models of complex phenomena are often called *toy* models. This terminology explicitly highlights the fact that a toy model is not intended to serve as a comprehensive theory of the phenomenon; rather, it attempts to provide a description of its certain core elements. Obviously, it would be a fallacy to attack a toy model on the basis that it is unable to provide a realistic description of the phenomenon in its entirety. In biology, a contrast between the theoretical toy models and the functionality of complex biological entities is tremendous. In this sense, the theoretical considerations offered in this paper do not intend to provide a comprehensive framework for description of cellular dynamics in its entirety. It only focuses on one fundamental property of gene expression machinery, that is, on the possibility of self-organized self-sustained oscillations without any deterministic mechanism behind it. In the literature, there is no lack of experimental evidence and theoretical considerations regarding oscillations in gene expression. Thus, in the *Atlas of Cellular Oscillations*^{18} published as early as in 1979, more than 450 experimental sources were cited featuring various types of intracellular oscillations. Rhythms in gene expression have been discussed in much detail in more recent reviews.^{19}^{,}^{20} Numerous theoretical models have been developed to explain this class of phenomena.^{21}^{,}^{22} An attempt to demonstrate the possibility of *whole-genome oscillations* has been undertaken through introducing the model of *artificial genome*.^{23} Similar to the gene expression paradigm, the *artificial genome* may be also seen as a version of cellular automata but with much more complex rules of updating. The gene expression paradigm vividly demonstrates that behind all the intricacies of gene expression machinery there exist simple, general and inescapable rules that lead to the whole-genome self-sustained oscillations.

We have introduced a new model of self-organized criticality called the *gene expression paradigm*. We have shown that in this model a critical total number of proteins exist below which transcription is impossible. Above the critical threshold, the system enters the regime of self-sustained stochastic oscillations with standard deviations and periods proportional to the genes’ thresholds with *probability one*. Importantly, such a self-organization appears without any deterministic feedback loops or external supervision.

Given the size of the genome, the domain of self-organized oscillatory motion is also limited by the maximal genes’ complexities. Below the critical complexity, the regimes of self-organized oscillations are self-similar and largely independent of the genes’ complexities. Above the level of critical complexity, the whole-genome transcription is impossible. The borderline between the domains of oscillations and quiescence is very sharp.

The gene expression paradigm is an example of cellular automata with the domain of application potentially far beyond its biological context. The model seems to be simple enough for staging an experiment for verification of its remarkable properties.

For a general autoregressive process of order *f*

the autocorrelation structure may be presented as

where *s*_{1:}* _{f}* are the (complex) roots of the characteristic equation of autoregression and

where

The roots *s*_{1:}* _{f}* may be found per solution of the difference equation with respect to

with *Â* being a shift operator: *Âε*(*t*) = *ε*(*t* − 1). For the third order process, when one root, *s*_{1}, is real and two roots, *s*_{2:3} = *ρ* exp(±*iθ*), are complex-conjugate

**Disclosure**

This manuscript has been read and approved by the author. This paper is unique and is not under consideration by any other publication and has not been published elsewhere. The author and peer reviewers of this paper report no conflicts of interest. The author confirms that he has permission to reproduce any copyrighted material.

1. Rosenfeld S. Stochastic cooperativity in non-linear dynamics of genetic regulatory networks. Math Biosci. 2007;210:121. [PubMed]

2. McAdams HH, Arkin A. Stochastic mechanisms in gene expression. Proc Natl Acad Sci U S A. 1997;94:814. [PubMed]

3. McAdams HH, Arkin A. It’s a noisy business! Genetic regulation at the nanomolar scale. Trends Genet. 1999;15:65. [PubMed]

4. Turcotte DL. Self-organized criticality. Rep Prog Phys. 1999;62:1377.

5. Bak P. How Nature Works The Science of Self-Organized Criticaliy. Springer Verlag; 1996.

6. Scott A. Encyclopedia of Nonlinear Science, Taylor & Francis Books; Abington, UK: 2005.

7. Turcotte DL, Rundle JB. Self-organized complexity in the physical, biological, and social sciences. Proc Natl Acad Sci U S A. 2002;99(Suppl 1):2463. [PubMed]

8. Johnson NL, Kotz S, Balakrishnan N. Continuous Univariate Distributions. Wiley-Interscience; 1994.

9. Jenkins GM, Watts DG. Spectral Analysis and Its Applications. Holden Day; 1968.

10. Rosenfeld S. A nonlinear continuous stochastic model for genetic regulatory networks. Caveats for microarray data analysis. 2006:28–34. ALIFE-X.

11. Rosenfeld S. Stochastic Oscillations in Genetic Regulatory Networks. Applications to Microarray Experiment. In: Rizzi A, Vichi M, editors. Physica-Verlag. 2006. pp. 1609–18. COMPSTAT-2006.

12. Rosenfeld S. Stochastic Oscillations in Genetic Regulatory Networks, EURASIP. Journal of Bioinformatics and Systems Biology. 2006:1. [PMC free article] [PubMed]

13. Rosenfeld S. Why do high-dimensional networks seem to be stable?-A reflection on stochasticity of dynamically unstable nonlinear systems. In: Gauges R, Kummer U, Pahle J, Willy P, editors. Fifth Workshop on Computation of Biochemical Pathways and Genetic Networks; University of Heidelberg, Heidelberg. 2008. pp. 101–12.

14. Rosenfeld S. Origins of Stochasticity and Burstiness in High-Dimensional Biochemical Networks, EURASIP. Journal of Bioinformatics and Systems Biology. 2009 [PMC free article] [PubMed]

15. Rosenfeld S. Characteristics of transcriptional activity in nonlinear dynamics of genetic regulatory networks. Gene Regul Syst Bio. 2009;3:159. [PMC free article] [PubMed]

16. Rosenfeld S. Patterns of stochastic behavior in dynamically unstable high-dimensional biochemical networks. Gene Regul Syst Bio. 2009;3:1. [PMC free article] [PubMed]

17. Wolfram S. 2002. A New Kind of Science, Wolfram Media.

18. Rapp PE. An Atlas of Cellular Oscillations. Jounal of Experimental Biology. 1979;81:281. [PubMed]

19. Nitabach MN, Blau J. Cellular Clockwork. Nature Genetics. 2002;32:559. [PubMed]

20. Schriber U, Naef F. Cellular oscillations: rhytmic gene expression and metabolism. Current Opinion in Cell Biology. 2005;17:223. [PubMed]

21. Goldbeter A. Complex oscillatory phenomena, including multiple oscillations, in regulated biochemical systems. Biomed Biochim Acta. 1985;44:881. [PubMed]

22. Goldbeter A. Computational approaches to cellular rhythms. Nature. 2002;420:238. [PubMed]

23. Reil T. Dynamics of Gene Expression in an Artificial Genome—Implications for Biological and Artificial Ontogeny, Advances in Artificial Life. 5th European Conference; 1999. pp. 457–66. ECAL’99.

Articles from Gene Regulation and Systems Biology are provided here courtesy of **Libertas Academica**

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's Canada Institute for Scientific and Technical Information in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |