
Nature. Author manuscript; available in PMC 2011 March 9.


PMCID: PMC2996232

NIHMSID: NIHMS220662

Correspondence and requests for materials should be addressed to G.V. (gv@eng.cam.ac.uk) or J.P. (johan_paulsson@hms.harvard.edu).

Users may view, print, copy, download, and text- and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms

Negative feedback is common in biological processes and can increase a system’s stability to internal and external perturbations. But at the molecular level, control loops always involve signaling steps with finite rates for random births and deaths of individual molecules. By developing mathematical tools that merge control and information theory with physical chemistry we show that seemingly mild constraints on these rates place severe limits on the ability to suppress molecular fluctuations. Specifically, the minimum standard deviation in abundances decreases with the quartic root of the number of signaling events, making it extraordinarily expensive to increase accuracy. Our results are formulated in terms of experimental observables, and existing data show that cells use brute force when noise suppression is essential, e.g. transcribing regulatory genes 10,000s of times per cell cycle. The theory challenges conventional beliefs about biochemical accuracy and presents an approach to rigorously analyze poorly characterized biological systems.

Life in the cell is a complex battle between randomizing and correcting statistical forces: births and deaths of individual molecules create spontaneous fluctuations in abundances^{1,2,3,4} – noise – while many control circuits have evolved to eliminate, tolerate or exploit the noise^{5,6,7,8}. The net outcome is difficult to predict because each control circuit in turn consists of probabilistic chemical reactions. For example, negative feedback loops can compensate for changes in abundances by adjusting the rates of synthesis or degradation^{7}, but such adjustments are only certain to suppress noise if the individual deviations immediately and surely affect the rates^{5}. By contrast, even the simplest transcriptional autorepression involves gene activation, transcription and translation, introducing intermediate probabilistic events that can randomize or destabilize control. Negative feedback may thus either suppress or amplify fluctuations depending on the exact mechanisms, reaction steps and parameters^{9} – details that are difficult to characterize at the single cell level and that differ greatly from system to system. This raises a fundamental question: to what extent is biological noise inevitable and to what extent can it be controlled? Could evolution simply favor networks – however elaborate or ingeniously designed – that enable cells to homeostatically suppress any disadvantageous noise, or does the nature of the mechanisms impose inherent constraints that cannot be overcome?

To address this question without oversimplifying or guessing at the complexity of cells, we consider a chemical species X_{1} that affects the production of a second species X_{2}, which in turn indirectly controls the production of X_{1} via an arbitrarily complicated reaction network with any number of components, nonlinear reaction rates, or spatial effects (Fig. 1). For generality, we only specify three of the chemical events of the larger network:

$$x_1 \xrightarrow{\;u\;} x_1 + 1 \;\;(i), \qquad x_1 \xrightarrow{\;x_1/\tau_1\;} x_1 - 1 \;\;(ii), \qquad x_2 \xrightarrow{\;f(x_1)\;} x_2 + 1 \;\;(iii) \qquad\qquad (1)$$

where *x*_{1} and *x*_{2} are numbers of molecules per cell, the birth and death rates are probabilistic reaction intensities, *τ*_{1} is the average lifetime of X_{1} molecules, *f* is a specified rate function, and the unspecified control network allows *u* to be dynamically and arbitrarily set by the full time history of X_{2} values. Death events for X_{2} are omitted because the results we derive rigorously hold for all types and rates of X_{2} degradation mechanisms, as long as they do not depend on X_{1}. The generality of *u* and *f* allows X_{1} to represent many different biological species: an mRNA with X_{2} as the corresponding protein, a protein with X_{2} as either its own mRNA or an mRNA downstream in the control pathway, an enzyme with X_{2} as a product, or a self-replicating DNA with X_{2} as a replication control molecule.

The arbitrary birth rate *u* represents a hypothetical ‘control demon’ that knows everything about past and present values of *x*_{2} and uses this information to minimize the variance in *x*_{1}. This corresponds to an optimal reaction network capable of any type of time-integration, frequency-based control, spatially extended dynamics, or other exotic actions. The sole restriction is that the control system depends on *x*_{1} only via reaction (*iii*), an example of a common chemical signaling relay where a concentration determines a rate. Because individual X_{2} birth events are probabilistic, some information about X_{1} is then inevitably and irrecoverably lost and the current value of X_{1} cannot be perfectly inferred from the X_{2} time-series. Specifically, the number of X_{2} birth events in a short time period is on average proportional to *f*(*x*_{1}), with a statistical uncertainty that depends on the average number of events. If *x*_{1} remained constant, the uncertainty could be arbitrarily reduced by integrating over a longer time, but because it keeps changing randomly on a time scale set by *τ*_{1}, integration can only help so much. The problem is thus equivalent to determining the strength of a weak light source by counting photons: each photon emission is probabilistic, and if the light waxes and wanes, counts from the past carry little information about the current strength. The otherwise omniscient control demon thus cannot know the exact state of the component it is trying to control.
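The photon-counting analogy can be made concrete with a small simulation (a sketch with hypothetical parameters, not an analysis from the paper): a fluctuating signal is estimated from Poisson counts of signaling events, and the estimation error is minimized at an intermediate integration window on the order of *τ*_{1} – short windows are count-noise limited, while long windows average over stale values.

```python
import numpy as np

# Illustrative sketch (parameters are hypothetical): estimate a fluctuating
# signal x1(t) from Poisson counts whose rate is alpha * x1.  Because x1
# relaxes on time-scale tau1, counts much older than tau1 carry little
# information about its current value, so integrating longer stops helping.

rng = np.random.default_rng(1)
tau1, mean_x1, alpha = 1.0, 50.0, 20.0
dt, T = 0.01, 500.0
n = int(T / dt)

# Ornstein-Uhlenbeck approximation of birth-death fluctuations in x1,
# with Poisson-like stationary variance equal to <x1>
x = np.empty(n)
x[0] = mean_x1
for i in range(1, n):
    x[i] = (x[i - 1] + (mean_x1 - x[i - 1]) * dt / tau1
            + np.sqrt(2.0 * mean_x1 * dt / tau1) * rng.normal())

counts = rng.poisson(np.maximum(x, 0.0) * alpha * dt)  # signaling events

# Estimate x1(t) by averaging counts over a trailing window of width w;
# the intermediate window wins
errs = {}
for w in (0.05, 0.5, 5.0):
    k = int(w / dt)
    est = np.convolve(counts, np.ones(k))[:n] / (k * alpha * dt)
    errs[w] = np.sqrt(np.mean((est[k:] - x[k:]) ** 2))
    print(f"window = {w:4.2f} tau1: rms estimation error = {errs[w]:.2f}")
```

The optimal window reflects the trade-off in the text: more counts reduce the Poisson uncertainty, but only counts from within roughly one lifetime of X_{1} are informative about its current state.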

We then quantify how finite signaling rates restrict noise suppression, without linearizing or otherwise approximating the control systems, by analytically deriving a feedback-invariant upper limit on the mutual information^{10} between X_{1} and X_{2} – an information-theoretic entropic measure for how much knowing one variable reduces uncertainty about another – and derive lower bounds on variances in terms of this limit. We use a continuous stochastic differential equation for the dynamics of species X_{1}, an approximation that makes it easier to extend the results to more contexts and processes, but keep the signaling and control processes discrete. After considerable dust has settled, this theory (summarized in Box 1 and detailed in the Supplementary Information, SI) allows us to calculate fundamental lower bounds on variances.

Statistical uncertainties and dependencies are often measured by variances and correlation coefficients, but both uncertainty and dependence can also be defined purely in terms of probabilities (*p _{i}*), without considering the actual states of the system. The Shannon entropy^{27} *H* = −Σ_{i} *p _{i}* ln *p _{i}* measures the uncertainty encoded in a probability distribution, and the mutual information *I*(*X*_{1}; *X*_{2}) = *H*(*X*_{1}) − *H*(*X*_{1}|*X*_{2}) measures how much observing one variable reduces the uncertainty about the other.

First, because imperfectly estimating the state of a system fundamentally restricts the ability to control it (SI), there is a hard bound on variances whenever there is incomplete mutual information between the signal *X*_{2} and the controlled variable *X*_{1}. We quantify the bound by means of Pinsker’s nonanticipatory epsilon entropy^{28}, a rarely utilized information-theoretic concept that exploits the fact that the transmission of information in a feedback system must occur in real time. This shows (SI) how an upper bound on the mutual information *I*(*X*_{1}; *X*_{2}) – i.e. a limited Shannon capacity in the channel from *X*_{1} to *X*_{2} – imposes a lower bound on the mean squared estimation error *E*(*X*_{1} − *X̂*_{1})^{2}, where the ‘estimator’ *X̂*_{1} is an arbitrary function of the discrete signal *X*_{2} time series and the *X*_{1} dynamics at equilibrium is described by a stochastic differential equation. Since the capacity of the molecular channels we consider is not increased by feedback, this results in a lower limit on the variance of *X*_{1}, in terms of the channel capacity *C*, that holds for arbitrary feedback control laws:

$$\sigma_1^2 \;\ge\; \frac{\langle x_1 \rangle}{1 + 2\,C\tau_1}.$$

Second, the Shannon capacity is potentially unlimited when information is sent over point process ‘Poisson channels’^{29} – channels in which the input *x*_{1}(*t*) modulates the jump intensity *f*(*x*_{1}(*t*)) of an observed counting process – as in stochastic reaction networks where a controlled variable affects the rate of a probabilistic signaling event. However, infinite capacity requires that the rate *f*(*x*_{1}) is unrestricted and thus that *X*_{1} is unrestricted – contrary to the purpose of control. Here we consider two types of restrictions. First, if the rate has an upper limit *f*_{max} it follows^{30} that *C* = *K*<*f*> where *K* = ln(*f*_{max}/<*f*>). The channel capacity then equals the average intensity multiplied by the natural logarithm of the effective dynamic range *f*_{max}/<*f*>, and the noise bound follows

$$\frac{\sigma_1^2}{\langle x_1 \rangle^2} \;\ge\; \frac{1}{N_1\,(1 + 2KN_2)}.$$

This allows for any nonlinear function *f*(*x*_{1}) but, for specific functions, restricting the variance in *x*_{1} can further reduce the capacity. For example, we analytically show that the capacity of the generic Poisson channel subject to mean and variance constraints follows

$$C \;\le\; \frac{\sigma_f^2}{2\,\langle f \rangle}.$$

Having less noise in *x*_{1} will reduce the variance in *f* and thereby make it harder to transmit the information that is fundamentally required to reduce noise. Combining this expression for the channel capacity with the feedback limit above reveals hard limits beyond which no improvements can be made: any further reduction in the variance would require a higher mutual information, which is impossible to achieve without instead increasing the variance. When *f* is linear in *x*_{1} this produces the result in Eq. (2). Analogous calculations allow us to derive capacity and noise results when *f* is a Hill function, or for processes with bursts, extrinsic noise, parallel channels, and cascades (SI). Finite channel capacities are the only fundamental constraints considered here, so at infinite capacity perfect noise suppression is possible by construction.
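The rate-limited capacity *C* = <*f*>·ln(*f*_{max}/<*f*>) quoted above is easy to evaluate directly; the numbers below are purely illustrative, but they show why the dynamic range is a weak lever compared with the event rate:

```python
import math

def poisson_capacity(f_mean, f_max):
    """Capacity (nats per unit time) of a Poisson channel with intensity <= f_max,
    using the rate-limited formula C = <f> * ln(f_max / <f>) quoted in the text."""
    return f_mean * math.log(f_max / f_mean)

# Raising f_max 10-fold only adds ln(10) nats per event, while raising the
# average event rate 10-fold scales the capacity almost proportionally.
print(poisson_capacity(10.0, 100.0))    # 10 * ln(10)
print(poisson_capacity(10.0, 1000.0))   # 10x dynamic range
print(poisson_capacity(100.0, 1000.0))  # 10x event rate
```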

When the rate of making X_{2} is proportional to X_{1}, *f* =*αx*_{1}, for example when X_{1} is a template or enzyme producing X_{2}, the hard lower bound on the (squared) relative standard deviation created by the loss of information follows:

$$\frac{\sigma_1^2}{\langle x_1 \rangle^2} \;\ge\; \frac{1}{N_1} \cdot \frac{2}{1 + \sqrt{1 + 4N_2/N_1}} \qquad\qquad (2)$$

where <…> denotes population averages and *N*_{1} = <*u*>*τ*_{1} = <*x*_{1}> and *N*_{2} = *α*<*x*_{1}>*τ*_{1} are the numbers of birth events of X_{1} and X_{2} made on average during time *τ*_{1}. Thus no control network can significantly reduce noise when the signal X_{2} is made less frequently than the controlled component. When the signal is made more frequently than the controlled component, the minimal relative standard deviation (square root of Eq. (2)) at most decreases with the *quartic* root of the number of signal birth events. Reducing the standard deviation of X_{1} 10-fold thus requires that the signal X_{2} is made at least 10,000 times more frequently. This makes it hard to achieve high precision, and practically impossible to achieve extreme precision, even for the slowest changing X_{1} in the cell where the signals X_{2} may be faster in comparison.
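Eq. (2) can be checked numerically. The sketch below assumes the bound in the form σ₁²/⟨x₁⟩² ≥ (1/N₁)·2/(1 + √(1 + 4N₂/N₁)), which reproduces both limits described in the text: the Poisson floor for infrequent signaling and the quartic-root scaling for frequent signaling.

```python
import math

def min_cv(N1, N2):
    """Minimal relative standard deviation of x1, assuming the bound
    sigma1^2/<x1>^2 >= (1/N1) * 2 / (1 + sqrt(1 + 4*N2/N1))."""
    return math.sqrt((2.0 / N1) / (1.0 + math.sqrt(1.0 + 4.0 * N2 / N1)))

N1 = 100.0
for N2 in (1e2, 1e4, 1e6, 1e10):
    print(f"N2 = {N2:.0e}: min std/mean = {min_cv(N1, N2):.4f}")

# For N2 >> N1 the minimum scales as (N1*N2)**-0.25, so a 10-fold
# reduction in standard deviation needs ~10^4-fold more signaling events:
print(round(min_cv(N1, 1e6) / min_cv(N1, 1e10), 1))
```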

Systems with nonlinear amplification before the infrequent signaling step are also subject to bounds. For arbitrary nonlinear encoding where *f* is an arbitrary functional of the whole *x*_{1} time history – corresponding to a second control demon between X_{1} and X_{2} – the quartic root limit turns into a type of square root limit (Box 1 and SI). However, gene regulatory functions typically saturate at full activation or leak at full repression, as described by a generalized Hill function with maximal rate *v* and binding constants *K*_{1} < *K*_{2} (SI). Here X_{1} may be an activator or repressor, and X_{2} an mRNA encoding either X_{1} or a downstream protein. Without linearizing *f* or restricting the control demon, an extension of the methods above (SI) reveals similar quartic root bounds as in Eq. (2), with the difference that *N*_{2} is replaced by *γN*_{2,max}, where *γ* is on the order of one in a wide range of biologically relevant parameters (SI), and *N*_{2,max} = *vτ*_{1} = *N*_{2}·*v*/<*f*>. Cells can then produce far fewer signal molecules without reducing the information transfer, depending on the maximal rate increase *v*/<*f*>, but the quartic root effect still strongly dampens the impact on the noise limit. If X_{2} is an mRNA, *N*_{2,max} is also limited because transcription events tend to be relatively rare even for fully expressed genes.

Many biological systems show much greater fluctuations due to upstream sources of noise, or sudden ‘bursts’ of synthesis^{4,11,12}. If X_{1} molecules are made or degraded in bursts (size *b*_{1}, averaged over births and deaths) there is much more noise to suppress, and if signal molecules X_{2} are produced in bursts (size *b*_{2}) each independent burst only counts as a single signaling event in terms of the Shannon information transfer, and:

$$\frac{\sigma_1^2}{\langle x_1 \rangle^2} \;\ge\; \frac{b_1}{N_1} \cdot \frac{2}{1 + \sqrt{1 + 4\,(N_2/b_2)/(N_1/b_1)}} \qquad\qquad (3)$$

The effective average number of molecules or events is thus reduced by the size of the burst, which can increase the noise limits greatly in many biological systems. The effect of slower upstream fluctuations in turn depends on their time-scales, how they affect the system, and whether or not the control system can monitor the source of such noise directly. If noise in the X_{1} birth rate is extrinsic to X_{1} but not directly accessible by the controller, the predicted noise suppression limits can follow similar quartic root principles for both fast and slow extrinsic noise, while for intermediate time-scales the power-law is between 3/8 and ¼ (SI, and Fig 2).
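A quick numerical reading of the burst correction, assuming (as stated above) that bursts simply rescale the effective event numbers *N*_{1} → *N*_{1}/*b*_{1} and *N*_{2} → *N*_{2}/*b*_{2} in the bound of Eq. (2):

```python
import math

def min_cv(N1, N2, b1=1.0, b2=1.0):
    """Minimal relative std dev of x1 with bursts, assuming each burst
    counts as one effective event: N1 -> N1/b1, N2 -> N2/b2."""
    n1, n2 = N1 / b1, N2 / b2
    return math.sqrt((2.0 / n1) / (1.0 + math.sqrt(1.0 + 4.0 * n2 / n1)))

# Bursts of 10 in the signal waste most of the 10,000 transcription events:
print(f"b2 = 1:  min std/mean = {min_cv(100, 1e4):.3f}")
print(f"b2 = 10: min std/mean = {min_cv(100, 1e4, b2=10):.3f}")
# Bursts in X1 itself raise the amount of noise that must be suppressed:
print(f"b1 = 5:  min std/mean = {min_cv(100, 1e4, b1=5):.3f}")
```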

Signaling in the cell typically involves numerous components that change in probabilistic events with finite rates. Information about upstream states is then progressively lost at each step, much like a game of ‘broken telephone’ where messages are imperfectly whispered from person to person. If each signaling component X_{i+1} decays exponentially and is produced at rate *α _{i}x_{i}*, an extension of the theory (SI) shows that if a control demon monitors the last component of the cascade in order to control the production of X_{1}, then:

(4)

where *N _{j}* is the number of birth events (or bursts, as in Eq. (3)) of species X_{j} made on average during time *τ*_{1}.

The rapid loss of information in cascades also suggests another trade-off: effective control requires a combination of appropriately nonlinear responses and small information losses, but nonlinear amplification in turn requires multiple chemical reactions with a loss of information at each step. The actual bounds may thus be much more restrictive than predicted above, where assuming Hill functions or arbitrary control networks conceals this trade-off. One of the greatest challenges in the cell may be to generate appropriately nonlinear reaction rates without losing too much information along the way.

Parallel signal and control systems can instead improve noise suppression, since each signaling pathway contributes independent information about the upstream state. However, for a given total number of signaling events, parallel control cannot possibly reduce noise below the limits above: the loss of information is determined only by the total frequency of the signaling events, not their physical nature. The analyses above in fact implicitly allow for arbitrarily parallel control with *f* interpreted as the total rate of making control molecules affected directly by X_{1} (SI).

The results above paint a grim picture for suppression of molecular noise. At first glance this seems contradicted by a wealth of biological counterexamples: molecules are often present in low numbers, signaling cascades where one component affects the rates of another are ubiquitous, and yet many processes are extremely precise. How is this possible if the limits apply universally? First, the transmission of chemical information is not fundamentally limited by the number of molecules present at any given time, but by the number of chemical events integrated over the time-scale of control (i.e., by *N*_{2} rather than <*x*_{2}> above). Second, most processes that have been studied quantitatively in single cells do in fact show large variation, and the anecdotal view of cells as microscopic-yet-precise largely comes from a few central processes where cells can afford a very high number of chemical events at each step, often using post-translational signaling cascades. Just like gravity places energetic and mechanistic constraints on flight but does not confine all organisms to the surface of the earth, the rapid loss of information in chemical networks places hard constraints on molecular control circuits but does not make any level of precision inherently impossible.

It can also be tempting to dismiss physical constraints simply because life seems fine despite them. For example, many cellular processes operate with a great deal of stochastic variation, and central pathways seem able to achieve sufficiently high precision. But such arguments are almost circular. The existence of flight does not make gravity irrelevant, nor do winged creatures simply fly ‘sufficiently well’. The challenges are instead to understand the trade-offs involved: what performance is selectively advantageous given the associated costs, and how small a fitness difference is still selectively relevant?

To illustrate the biological consequences of imperfect signaling we consider systems that must suppress noise for survival and must relay signals through gene expression, where chemical information is lost due to infrequent activation, transcription, and translation. The best characterized examples are the homeostatic copy number control mechanisms of bacterial plasmids that reduce the risk of plasmid loss at cell division. These have been described much like the example above with X_{1} as plasmids and X_{2} as plasmid-expressed inhibitors^{5}, except that plasmids self-replicate with rate *u*(*t*)*x*_{1} and therefore are bound by the quartic root limit for all values of *N*_{1} and *N*_{2} (SI, Fig. 2). To identify the mechanistic constraints when X_{1} production is directly inhibited by X_{2}, rather than by a control demon that is infinitely fast and that delivers the optimal response to every perturbation, we consider a closed toy model:

$$x_1 \xrightarrow{\;x_1 u(x_2)\;} x_1 + 1, \qquad x_1 \xrightarrow{\;x_1/\tau_1\;} x_1 - 1, \qquad x_2 \xrightarrow{\;x_1 \beta(x_2)\;} x_2 + 1, \qquad x_2 \xrightarrow{\;r(x_2)\;} x_2 - 1 \qquad\qquad (5)$$

where X_{1} degradation is a proxy for partitioning at cell division, and the rate of making X_{2} is proportional to X_{1} because each plasmid copy encodes a gene for X_{2}. We then use the logarithmic gains^{6,14} *H*_{12} = −∂ln *u*/∂ln *x*_{2} and *H*_{22}, the corresponding logarithmic sensitivity of the ratio of the X_{2} death and birth rates to *x*_{2}, to quantify the percentage responses in rates to percentage changes in levels without specifying the exact rate functions. Parameter *H*_{12} is similar to a Hill coefficient of inhibition, and *H*_{22} determines how X_{2} affects its own rates, increasing when it is negatively auto-regulated and decreasing when it is degraded by saturated enzymes. The ratio *H*_{12}/*H*_{22} is thus a total gain, corresponding to the eventual percentage response in *u* to a percentage change in *x*_{1}. With *τ*_{2} as the average lifetime of X_{2} molecules, stationary fluctuation-dissipation approximations^{6,15} (linearizing responses, SI) then give:

(6)

where the limit holds for all *H _{ij}* and all time-scales *τ*_{2}/*τ*_{1}.
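For a concrete rate function, the logarithmic gains can be evaluated numerically. The sketch below uses a hypothetical Hill-type inhibition u(x₂) = v/(1 + (x₂/K)ʰ) – an illustrative choice, not a mechanism from the paper – and shows how *H*_{12} rises from near zero to the Hill coefficient *h* as inhibition strengthens:

```python
import math

def log_gain(f, x, eps=1e-6):
    """Numerical logarithmic gain d ln f / d ln x via central differences."""
    return (math.log(f(x * (1 + eps))) - math.log(f(x * (1 - eps)))) / (2 * eps)

# Hypothetical Hill-type inhibition (illustrative parameters):
v, K, h = 10.0, 50.0, 2.0

def u(x2):
    return v / (1.0 + (x2 / K) ** h)

# H12 = -d ln u / d ln x2: ~0 far below K, h/2 at K, approaching h above K
for x2 in (5.0, 50.0, 500.0):
    print(f"x2 = {x2:5.1f}: H12 = {-log_gain(u, x2):.3f}")
```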

Several recent studies have generalized control-theoretic notions^{19,20} or applied them to biology^{21,22}. Others have demonstrated physical limits on the accuracy of cellular signaling^{13,23,24,25}, for example using fluctuation-dissipation approximations to predict estimation errors associated with a constant number of diffusing molecules hitting a biological sensor^{26}. Interestingly, the latter show that the minimal relative error decreases with the square root of the number of events, regardless of detection mechanism. Some studies have also analyzed the information transfer capacity of open-loop molecular systems^{25}, or extracted valuable insights from Gaussian small-noise approximations. Here we extend these works by developing exact mathematical methods for arbitrarily complex and nonlinear real-time feedback control of a dynamic process of noisy synthesis and degradation. In such systems, the minimal error decreases with the *quartic* root of the integer number of signaling events, making a decent job 16 times harder than a half-decent job. This perhaps explains why there is so much biochemical noise – correcting it would just be too costly – but also constrains other aspects of life in the cell. For example, the noise levels may increase or decrease along signaling cascades, depending on the kinetic details at each step, but information about upstream states is always progressively and irreversibly lost. Though it is tempting to believe that large reaction networks are capable of almost anything if the rates are suitably nonlinear, the opposite perspective may thus be more appropriate: having more steps where one component affects the rates of another creates more opportunities for losing information and fundamentally prevents more types of behaviors.
While awaiting the detailed models that predict what single cells actually do – which require every probabilistic chemical step to be well characterized – fusing control and information theory with stochastic kinetics thus provides a useful starting point: predicting what cells cannot do.

This research was supported by the BBSRC under grant BB/C008073/1, by the National Science Foundation Grants DMS-074876-0 and CAREER 0720056, and by grants GM081563-02 and GM068763-06 from the National Institutes of Health.

Supplementary Information is linked to the online version of the paper at www.nature.com/nature

**Author contributions** The three authors (I.L., G.V., and J.P.) contributed equally, and all conceived the study, derived the equations, and wrote the paper.

**Author information** Reprints and permissions information is available at npg.nature.com/reprints. The authors declare no competing financial interests.

1. Ozbudak EM, Thattai M, Kurtser I, Grossman AD, van Oudenaarden A. Regulation of noise in the expression of a single gene. Nature Genetics. 2002;31:69–73. [PubMed]

2. Elowitz MB, Levine AJ, Siggia ED, Swain PS. Stochastic gene expression in a single cell. Science. 2002;297:1183–1186. [PubMed]

3. Newman JR, et al. Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature. 2006;441:840–846. [PubMed]

4. Golding I, Paulsson J, Zawilski SM, Cox EC. Real-time kinetics of gene activity in individual bacteria. Cell. 2005;123:1025–1036. [PubMed]

5. Paulsson J, Ehrenberg M. Noise in a minimal regulatory network: Plasmid copy number control. Quarterly Reviews of Biophysics. 2001;34:1–59. [PubMed]

6. Paulsson J. Summing up the noise in gene networks. Nature. 2004;427:415–418. [PubMed]

7. Dublanche Y, Michalodimitrakis K, Kuemmerer N, Foglierini M, Serrano L. Noise in transcription negative feedback loops: simulation and experimental analysis. Molecular Systems Biology. 2006:E1–E12. [PMC free article] [PubMed]

8. Barkai N, Shilo BZ. Variability and robustness in biomolecular systems. Mol Cell. 2007;28:755–760. [PubMed]

9. Maxwell JC. On governors. Proceedings of the Royal Society of London. 1868;16:270–283.

10. Cover TM, Thomas JA. Elements of Information Theory. 2nd ed. John Wiley & Sons; 1991.

11. Pedraza JM, Paulsson J. Effects of molecular memory and bursting on fluctuations in gene expression. Science. 2008;319:339–343. [PubMed]

12. Cai L, Friedman N, Xie XS. Stochastic protein expression in individual cells at the single molecule level. Nature. 2006;440:358–362. [PubMed]

13. Tkacik G, Callan CG, Jr, Bialek W. Information flow and optimization in transcriptional regulation. Proc Natl Acad Sci U S A. 2008;105:12265–12270. [PubMed]

14. Savageau MA. Parameter sensitivity as a criterion for evaluating and comparing the performance of biochemical systems. Nature. 1971;229:542–544. [PubMed]

15. Keizer J. Statistical Thermodynamics of Nonequilibrium Processes. Springer; 1987.

16. Singh A, Hespanha JP. Optimal feedback strength for noise suppression in autoregulatory gene networks. Biophys J. 2009;96:4013–4023. [PubMed]

17. Korobkova EA, Emonet T, Park H, Cluzel P. Hidden stochastic nature of a single bacterial motor. Phys Rev Lett. 2006;96:058105. [PubMed]

18. Doan T, Mendez A, Detwiler PB, Chen J, Rieke F. Multiple phosphorylation sites confer reproducibility of the rod’s single-photon responses. Science. 2006;313:530–533. [PubMed]

19. Martins NC, Dahleh MA, Doyle JC. Fundamental Limitations of Disturbance Attenuation in the Presence of Side Information. IEEE Transactions on Automatic Control. 2007;52:56–66.

20. Martins NC, Dahleh MA. Feedback Control in the Presence of Noisy Channels: “Bode-Like” Fundamental Limitations of Performance. IEEE Transactions on Automatic Control. 2008;52:1604–1615.

21. El-Samad H, Kurata H, Doyle JC, Gross CA, Khammash M. Surviving heat shock: control strategies for robustness and performance. Proc Natl Acad Sci U S A. 2005;102:2736–2741. [PubMed]

22. Yi TM, Huang Y, Simon MI, Doyle J. Robust perfect adaptation in bacterial chemotaxis through integral feedback control. Proc Natl Acad Sci U S A. 2000;97:4649–4653. [PubMed]

23. Bialek W, Setayeshgar S. Cooperativity, sensitivity, and noise in biochemical signaling. Phys Rev Lett. 2008;100:258101. [PubMed]

24. Gregor T, Tank DW, Wieschaus EF, Bialek W. Probing the limits to positional information. Cell. 2007;130:153–164. [PMC free article] [PubMed]

25. Walczak AM, Mugler A, Wiggins CH. A stochastic spectral analysis of transcriptional regulatory cascades. Proc Natl Acad Sci U S A. 2009;106:6529–6534. [PubMed]

26. Bialek W, Setayeshgar S. Physical limits to biochemical signaling. Proc Natl Acad Sci U S A. 2005;102:10040–10045. [PubMed]

27. Shannon CE. A Mathematical Theory of Communication. Bell System Technical Journal. 1948;27:379–423. 623–656.

28. Gorbunov AK, Pinsker MS. Nonanticipatory and prognostic epsilon entropies and message generation rates. Problems of Information Transmission. 1973;9:184–191.

29. Kabanov Y. The capacity of a channel of the Poisson type. Theory of Probability and its Applications. 1978;23:143–147.

30. Davis MHA. Capacity and cut-off rate for Poisson type channels. IEEE Transactions on Information Theory. 1978;26:710–715.

31. Tomizawa J. Control of ColE1 plasmid replication: binding of RNA I to RNA II and inhibition of primer formation. Cell. 1986;47:89–97. [PubMed]

32. Das N, et al. Multiple homeostatic mechanisms in the control of P1 plasmid replication. Proc Natl Acad Sci U S A. 2005;102:2856–2861. [PubMed]
