Negative feedback is common in biological processes and can increase a system’s stability to internal and external perturbations. But at the molecular level, control loops always involve signaling steps with finite rates for random births and deaths of individual molecules. By developing mathematical tools that merge control and information theory with physical chemistry we show that seemingly mild constraints on these rates place severe limits on the ability to suppress molecular fluctuations. Specifically, the minimum standard deviation in abundances decreases with the quartic root of the number of signaling events, making it extraordinarily expensive to increase accuracy. Our results are formulated in terms of experimental observables, and existing data show that cells use brute force when noise suppression is essential, e.g. transcribing regulatory genes 10,000s of times per cell cycle. The theory challenges conventional beliefs about biochemical accuracy and presents an approach to rigorously analyze poorly characterized biological systems.
Life in the cell is a complex battle between randomizing and correcting statistical forces: births and deaths of individual molecules create spontaneous fluctuations in abundances1,2,3,4 – noise – while many control circuits have evolved to eliminate, tolerate or exploit the noise5,6,7,8. The net outcome is difficult to predict because each control circuit in turn consists of probabilistic chemical reactions. For example, negative feedback loops can compensate for changes in abundances by adjusting the rates of synthesis or degradation7, but such adjustments are only certain to suppress noise if the individual deviations immediately and surely affect the rates5. Even the simplest transcriptional autorepression by contrast involves gene activation, transcription and translation, introducing intermediate probabilistic events that can randomize or destabilize control. Negative feedback may thus either suppress or amplify fluctuations depending on the exact mechanisms, reaction steps and parameters9 – details that are difficult to characterize at the single cell level and that differ greatly from system to system. This raises a fundamental question: to what extent is biological noise inevitable and to what extent can it be controlled? Could evolution simply favor networks – however elaborate or ingeniously designed – that enable cells to homeostatically suppress any disadvantageous noise, or does the nature of the mechanisms impose inherent constraints that cannot be overcome?
To address this question without oversimplifying or guessing at the complexity of cells, we consider a chemical species X1 that affects the production of a second species X2, which in turn indirectly controls the production of X1 via an arbitrarily complicated reaction network with any number of components, nonlinear reaction rates, or spatial effects (Fig. 1). For generality, we only specify three of the chemical events of the larger network:
$$\text{(i)}\;\; x_1 \xrightarrow{\;u(x_2(-\infty,t))\;} x_1+1, \qquad \text{(ii)}\;\; x_1 \xrightarrow{\;x_1/\tau_1\;} x_1-1, \qquad \text{(iii)}\;\; x_2 \xrightarrow{\;f(x_1)\;} x_2+1 \qquad (1)$$
where x1 and x2 are numbers of molecules per cell, the birth and death rates are probabilistic reaction intensities, τ1 is the average lifetime of X1 molecules, f is a specified rate function, and the unspecified control network allows u to be dynamically and arbitrarily set by the full time history of X2 values. Death events for X2 are omitted because the results we derive rigorously hold for all types and rates of X2 degradation mechanisms, as long as they do not depend on X1. The generality of u and f allows X1 to represent many different biological species: an mRNA with X2 as the corresponding protein, a protein with X2 as either its own mRNA or an mRNA downstream in the control pathway, an enzyme with X2 as a product, or a self-replicating DNA with X2 as a replication control molecule.
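As a baseline for the three events just described, the following is a minimal sketch of a Gillespie-type stochastic simulation. It makes assumptions that are not in the text: the control input u is held constant (no feedback), the signaling rate is taken as f(x1) = alpha*x1, and the function name and all parameter values are hypothetical. Under these assumptions X1 is a plain birth-death process, so its variance should be close to its mean, which is the uncontrolled noise level that any feedback scheme would try to improve on.

```python
import random


def simulate_open_loop(u=20.0, tau1=1.0, alpha=5.0, t_end=2000.0, seed=0):
    """Simulate X1 birth, X1 death and X2 birth with a constant u.

    Assumptions (illustrative only): u is constant, f(x1) = alpha * x1.
    Returns (time-averaged mean of x1, time-averaged variance of x1,
    number of X2 birth events per unit time).
    """
    rng = random.Random(seed)
    t = 0.0
    x1 = int(u * tau1)          # start near the expected mean
    x2_births = 0
    sum_x = 0.0                 # time-weighted sum of x1
    sum_x2 = 0.0                # time-weighted sum of x1**2
    while True:
        # Rates of X1 birth, X1 death, and X2 birth (the signaling event).
        rates = (u, x1 / tau1, alpha * x1)
        total = sum(rates)
        wait = rng.expovariate(total)
        done = t + wait >= t_end
        dt = (t_end - t) if done else wait
        sum_x += x1 * dt
        sum_x2 += x1 * x1 * dt
        if done:
            break
        t += dt
        # Choose which event fires, in proportion to its rate.
        r = rng.random() * total
        if r < rates[0]:
            x1 += 1
        elif r < rates[0] + rates[1]:
            x1 -= 1
        else:
            x2_births += 1
    mean = sum_x / t_end
    var = sum_x2 / t_end - mean ** 2
    return mean, var, x2_births / t_end
```

With the default values, the mean of x1 should be close to u*tau1 and roughly alpha*u*tau1 signaling events occur per lifetime tau1. A feedback rule would replace the constant first rate with a function of the X2 birth history.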
The arbitrary birth rate u represents a hypothetical ‘control demon’ that knows everything about past and present values of x2 and uses this information to minimize the variance in x1. This corresponds to an optimal reaction network capable of any type of time-integration, frequency-based control, spatially extended dynamics, or other exotic actions. The sole restriction is that the control system depends on x1 only via reaction (iii), an example of a common chemical signaling relay where a concentration determines a rate. Because individual X2 birth events are probabilistic, some information about X1 is then inevitably and irrecoverably lost and the current value of X1 cannot be perfectly inferred from the X2 time-series. Specifically, the number of X2 birth events in a short time period is on average proportional to f(x1), with a statistical uncertainty that depends on the average number of events. If x1 remained constant, the uncertainty could be arbitrarily reduced by integrating over a longer time, but because it keeps changing randomly on a time scale set by τ1, integration can only help so much. The problem is thus equivalent to determining the strength of a weak light source by counting photons: each photon emission is probabilistic, and if the light waxes and wanes, counts from the past carry little information about the current strength. The otherwise omniscient control demon thus cannot know the exact state of the component it is trying to control.
We then quantify how finite signaling rates restrict noise suppression, without linearizing or otherwise approximating the control systems, by analytically deriving a feedback-invariant upper limit on the mutual information10 between X1 and X2 – an information-theoretic entropic measure for how much knowing one variable reduces uncertainty about another – and derive lower bounds on variances in terms of this limit. We use a continuous stochastic differential equation for the dynamics of species X1, an approximation that makes it easier to extend the results to more contexts and processes, but keep the signaling and control processes discrete. After considerable dust has settled, this theory (summarized in Box 1 and detailed in the Supplementary Information, SI) allows us to calculate fundamental lower bounds on variances.
Statistical uncertainties and dependencies are often measured by variances and correlation coefficients, but both uncertainty and dependence can also be defined purely in terms of probabilities ($p_i$), without considering the actual states of the system. The Shannon entropy $H(X) = -\sum_i p_i \log p_i$ measures inherent uncertainty rather than how different the outcomes are, and the mutual information between random variables $I(X_1; X_2) = H(X_1) - H(X_1|X_2)$ measures how much knowing one variable reduces entropic uncertainty in another, regardless of how their outcomes may correlate10,27. Despite the fundamental differences between these measures, however, there are several points of contact that can be used to predict limits on stochastic behavior.
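The two definitions above can be made concrete with a small sketch for discrete distributions. The function names and the example table are illustrative assumptions, and natural logarithms are used so that results are in nats. The mutual information is computed through the equivalent identity I(X1; X2) = H(X1) + H(X2) - H(X1, X2).

```python
import math


def entropy(p):
    """Shannon entropy -sum(p_i * log p_i) in nats; zero-probability terms are skipped."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def mutual_information(joint):
    """Mutual information of a joint probability table joint[i][j] = P(X1=i, X2=j)."""
    p1 = [sum(row) for row in joint]            # marginal of X1
    p2 = [sum(col) for col in zip(*joint)]      # marginal of X2
    p12 = [p for row in joint for p in row]     # flattened joint distribution
    return entropy(p1) + entropy(p2) - entropy(p12)


# Hypothetical example: X1 takes values -1, 0, 1 with equal probability and
# X2 = X1**2. The two variables are uncorrelated, yet X2 carries information
# about X1, so the mutual information is positive.
uncorrelated_but_dependent = [
    [0.0, 1 / 3],   # X1 = -1 -> X2 = 1
    [1 / 3, 0.0],   # X1 =  0 -> X2 = 0
    [0.0, 1 / 3],   # X1 = +1 -> X2 = 1
]
```

The example table illustrates the point that mutual information measures dependence "regardless of how the outcomes may correlate": the correlation coefficient is zero while the mutual information equals the entropy of X2.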
First, because imperfectly estimating the state of a system fundamentally restricts the ability to control it (SI), there is a hard bound on variances whenever there is incomplete mutual information between the signal X2 and the controlled variable X1. We quantify the bound by means of Pinsker’s nonanticipatory epsilon entropy28, a rarely utilized information-theoretic concept that exploits the fact that the transmission of information in a feedback system must occur in real time. This shows (SI) how an upper bound on the mutual information $I(X_1; X_2)$, i.e. a limited Shannon capacity in the channel from X1 to X2, imposes a lower bound on the mean squared estimation error $E(X_1 - \hat{X}_1)^2$, where the ‘estimator’ $\hat{X}_1$ is an arbitrary function of the discrete signal X2 time series and the X1 dynamics at equilibrium are described by a stochastic differential equation. Since the capacity of the molecular channels we consider is not increased by feedback, this results in a lower limit on the variance of X1, in terms of the channel capacity C, that holds for arbitrary feedback control laws.
Second, the Shannon capacity is potentially unlimited when information is sent over point process ‘Poisson channels’29, as in stochastic reaction networks where a controlled variable affects the rate of a probabilistic signaling event. However, infinite capacity requires that the rate f(x1) is unrestricted and thus that X1 is unrestricted, contrary to the purpose of control. Here we consider two types of restrictions. First, if the rate has an upper limit fmax it follows30 that C = K<f> where K = log(fmax/<f>). The channel capacity then equals the average intensity multiplied by the natural logarithm of the effective dynamic range fmax/<f>, and a corresponding noise bound follows. This allows for any nonlinear function f(x1) but, for specific functions, restricting the variance in x1 can further reduce the capacity. For example, we analytically derive an expression for the capacity of the generic Poisson channel subject to mean and variance constraints. Having less noise in x1 will reduce the variance in f and thereby make it harder to transmit the information that is fundamentally required to reduce noise. Combining this expression for the channel capacity with the feedback limit above reveals hard limits beyond which no improvements can be made: any further reduction in the variance would require a higher mutual information, which is impossible to achieve without instead increasing the variance. When f is linear in x1 this produces the result in Eq. (2). Analogous calculations allow us to derive capacity and noise results when f is a Hill function, or for processes with bursts, extrinsic noise, parallel channels, and cascades (SI). Finite channel capacities are the only fundamental constraints considered here, so at infinite capacity perfect noise suppression is possible by construction.
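The capacity expression for a rate with an upper limit, C = K<f> with K = log(fmax/<f>), can be evaluated directly. The sketch below simply encodes that expression as it is stated in the text; the function name, argument names and input checks are illustrative assumptions, and the result is in nats per unit time because the natural logarithm is used.

```python
import math


def capped_rate_capacity(f_mean, f_max):
    """Capacity C = <f> * ln(f_max / <f>) of a rate-limited Poisson channel.

    f_mean is the average signaling intensity <f> and f_max its upper limit,
    both in events per unit time. The formula is taken from the text as stated.
    """
    if not 0 < f_mean <= f_max:
        raise ValueError("require 0 < f_mean <= f_max")
    return f_mean * math.log(f_max / f_mean)
```

Because the dynamic range enters only logarithmically, doubling fmax at fixed <f> adds just <f>·ln 2 to the capacity, whereas the capacity grows in direct proportion to the average number of signaling events at a fixed ratio fmax/<f>.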
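When the rate of making X2 is proportional to X1, f = αx1, for example when X1 is a template or enzyme producing X2, the hard lower bound on the (squared) relative standard deviation created by the loss of information follows:
$$\frac{\sigma_1^2}{\langle x_1\rangle^2} \;\ge\; \frac{1}{\langle x_1\rangle}\times\frac{2}{1+\sqrt{1+4N_2/N_1}} \qquad (2)$$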
where <…> denotes population averages and N1 = <u>τ1 = <x1> and N2 = α<x1>τ1 are the numbers of birth events of X1 and X2 made on average during time τ1. Thus no control network can significantly reduce noise when the signal X2 is made less frequently than the controlled component. When the signal is made more frequently than the controlled component, the minimal relative standard deviation (square root of Eq. (2)) at most decreases with the quartic root of the number of signal birth events. Reducing the standard deviation of X1 10-fold thus requires that the signal X2 is made at least 10,000 times more frequently. This makes it hard to achieve high precision, and practically impossible to achieve extreme precision, even for the slowest changing X1 in the cell where the signals X2 may be faster in comparison.
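The scaling described above can be checked numerically. The sketch below assumes that the bound in Eq. (2) takes the form (1/<x1>) × 2/(1 + sqrt(1 + 4 N2/N1)) with <x1> = N1; this form is an assumption of the example, chosen because it reproduces both limits stated in the text (no noise reduction when N2 is much smaller than N1, and a quartic root dependence on N2 when N2 is much larger). The function name is hypothetical.

```python
import math


def min_relative_variance(n1, n2):
    """Assumed lower bound on the squared relative standard deviation of X1.

    n1: average number of X1 birth events per lifetime tau1 (equal to <x1>).
    n2: average number of X2 birth events per lifetime tau1.
    """
    return (1.0 / n1) * 2.0 / (1.0 + math.sqrt(1.0 + 4.0 * n2 / n1))


def min_relative_std(n1, n2):
    """Square root of the assumed bound: minimal relative standard deviation."""
    return math.sqrt(min_relative_variance(n1, n2))


# Without any signaling the bound reduces to the Poisson value 1/<x1>.
poisson_limit = min_relative_variance(100.0, 0.0)

# A 10,000-fold increase in signaling events buys at most about a 10-fold
# reduction in the minimal relative standard deviation.
improvement = min_relative_std(100.0, 1e6) / min_relative_std(100.0, 1e10)
```

In this example `improvement` comes out just below 10, which illustrates why high precision is so expensive in terms of signaling events.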
Systems with nonlinear amplification before the infrequent signaling step are also subject to bounds. For arbitrary nonlinear encoding where f is an arbitrary functional of the whole x1 time history (corresponding to a second control demon between X1 and X2) the quartic root limit turns into a type of square root limit (Box 1 and SI). However, gene regulatory functions typically saturate at full activation or leak at full repression, as in the generalized Hill function $f = v\,(x_1^h + K_1)/(x_1^h + K_2)$ with K1<K2. Here X1 may be an activator or repressor, and X2 an mRNA encoding either X1 or a downstream protein. Without linearizing f or restricting the control demon, an extension of the methods above (SI) reveals quartic root bounds similar to those in Eq. (2), with the difference that N2 is replaced by γN2,max where γ is on the order of one in a wide range of biologically relevant parameters (SI), and N2,max = vτ1 = N2 v/<f>. Cells can then produce far fewer signal molecules without reducing the information transfer, depending on the maximal rate increase v/<f>, but the quartic root effect still strongly dampens the impact on the noise limit. If X2 is an mRNA, N2,max is also limited because transcription events tend to be relatively rare even for fully expressed genes.
Many biological systems show much greater fluctuations due to upstream sources of noise, or sudden ‘bursts’ of synthesis4,11,12. If X1 molecules are made or degraded in bursts (size b1, averaged over births and deaths) there is much more noise to suppress, and if signal molecules X2 are produced in bursts (size b2) each independent burst only counts as a single signaling event in terms of the Shannon information transfer.
The effective average number of molecules or events is thus reduced by the size of the burst, which can increase the noise limits greatly in many biological systems. The effect of slower upstream fluctuations in turn depends on their time-scales, how they affect the system, and whether or not the control system can monitor the source of such noise directly. If noise in the X1 birth rate is extrinsic to X1 but not directly accessible by the controller, the predicted noise suppression limits can follow similar quartic root principles for both fast and slow extrinsic noise, while for intermediate time-scales the power-law exponent lies between 1/4 and 3/8 (SI and Fig. 2).
Signaling in the cell typically involves numerous components that change in probabilistic events with finite rates. Information about upstream states is then progressively lost at each step much like a game of ‘broken telephone’ where messages are imperfectly whispered from person to person. If each signaling component $X_{i+1}$ decays exponentially and is produced at rate $\alpha_i x_i$, an extension of the theory (SI) shows that if a control demon monitors $X_{n+1}$ and controls X1, N2 above is replaced by
$$N_{\mathrm{eff}} = \left(\sum_{j=2}^{n+1}\frac{1}{N_j}\right)^{-1}$$
where Nj is the average number of birth events (or bursts, as in Eq. (3)) of species j during time period τ1. Information transfer in cascades is thus limited by the components made in the lowest numbers, and because the total average number of birth events over the n steps obeys Ntot ≥ n²Neff, a five-step linear cascade requires at least 25 times more birth events to maintain the same capacity to suppress noise as a single-step mechanism. This effect of information loss is superficially similar to noise propagation, where variation in inputs causes variation in outputs, but though both effects reflect the probabilistic nature of infrequent reactions, the governing principles are very different. In fact, the mechanisms for preventing noise propagation (such as time-averaging or kinetic robustness to upstream changes6) cause a greater loss of information, while mechanisms that minimize information losses (such as all-or-nothing nonlinear effects13) instead amplify noise. Large variation in signaling intermediates is thus not necessarily a sign of reduced precision but could reflect strategies to minimize information loss, which in turn allows tighter control of downstream components.
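The cost of cascades can be illustrated with a short calculation. The sketch assumes that the effective number of signaling events combines harmonically, 1/Neff = Σ 1/Nj over the signaling steps; this form is an assumption of the example, chosen because it is consistent with the inequality Ntot ≥ n²Neff and with the statement that the rarest components dominate. The function name and the numbers are hypothetical.

```python
def effective_events(births_per_step):
    """Assumed effective event number: harmonic combination of the per-step N_j."""
    return 1.0 / sum(1.0 / n for n in births_per_step)


# Five-step cascade with 100 birth events per step during time tau1.
balanced = [100.0] * 5
n_eff_balanced = effective_events(balanced)

# The same total number of events spread unevenly over the five steps:
# the rarest component dominates and the effective number drops further.
unbalanced = [10.0, 40.0, 100.0, 150.0, 200.0]
n_eff_unbalanced = effective_events(unbalanced)
```

In the balanced case the 500 events in total yield an effective number of about 20, the factor of n² = 25 quoted in the text. Any uneven split of the same total gives a smaller effective number, so equal allocation across steps is the best case under this assumption.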
The rapid loss of information in cascades also suggests another trade-off: effective control requires a combination of appropriately nonlinear responses and small information losses, but nonlinear amplification in turn requires multiple chemical reactions with a loss of information at each step. The actual bounds may thus be much more restrictive than predicted above, where assuming Hill functions or arbitrary control networks conceals this trade-off. One of the greatest challenges in the cell may be to generate appropriately nonlinear reaction rates without losing too much information along the way.
Parallel signal and control systems can instead improve noise suppression, since each signaling pathway contributes independent information about the upstream state. However, for a given total number of signaling events, parallel control cannot possibly reduce noise below the limits above: the loss of information is determined only by the total frequency of the signaling events, not their physical nature. The analyses above in fact implicitly allow for arbitrarily parallel control with f interpreted as the total rate of making control molecules affected directly by X1 (SI).
The results above paint a grim picture for suppression of molecular noise. At first glance this seems contradicted by a wealth of biological counterexamples: molecules are often present in low numbers, signaling cascades where one component affects the rates of another are ubiquitous, and yet many processes are extremely precise. How is this possible if the limits apply universally? First, the transmission of chemical information is not fundamentally limited by the number of molecules present at any given time, but by the number of chemical events integrated over the time-scale of control (i.e., by N2 rather than <x2> above). Second, most processes that have been studied quantitatively in single cells do in fact show large variation, and the anecdotal view of cells as microscopic-yet-precise largely comes from a few central processes where cells can afford a very high number of chemical events at each step, often using post-translational signaling cascades. Just like gravity places energetic and mechanistic constraints on flight but does not confine all organisms to the surface of the earth, the rapid loss of information in chemical networks places hard constraints on molecular control circuits but does not make any level of precision inherently impossible.
It can also be tempting to dismiss physical constraints simply because life seems fine despite them. For example, many cellular processes operate with a great deal of stochastic variation, and central pathways seem able to achieve sufficiently high precision. But such arguments are almost circular. The existence of flight does not make gravity irrelevant, nor do winged creatures simply fly sufficiently well. The challenges are instead to understand the trade-offs involved: what performances are selectively advantageous given the associated costs, and how small fitness differences are selectively relevant?
To illustrate the biological consequences of imperfect signaling we consider systems that must suppress noise for survival and must relay signals through gene expression, where chemical information is lost due to infrequent activation, transcription, and translation. The best characterized examples are the homeostatic copy number control mechanisms of bacterial plasmids that reduce the risk of plasmid loss at cell division. These have been described much like the example above with X1 as plasmids and X2 as plasmid-expressed inhibitors5, except that plasmids self-replicate with rate u(t)x1 and therefore are bound by the quartic root limit for all values of N1 and N2 (SI, Fig. 2). To identify the mechanistic constraints when X1 production is directly inhibited by X2, rather than by a control demon that is infinitely fast and that delivers the optimal response to every perturbation, we consider a closed toy model:
where X1 degradation is a proxy for partitioning at cell division, and the rate of making X2 is proportional to X1 because each plasmid copy encodes a gene for X2. We then use the logarithmic gains6,14 $H_{12} = -\partial \ln u/\partial \ln x_2$ and $H_{22}$ to quantify the percentage responses in rates to percentage changes in levels without specifying the exact rate functions. Parameter H12 is similar to a Hill coefficient of inhibition, and H22 determines how X2 affects its own rates, increasing when it is negatively auto-regulated and decreasing when it is degraded by saturated enzymes. The ratio H12/H22 is thus a total gain, corresponding to the eventual percentage response in u to a percentage change in x1. With τ2 as the average lifetime of X2 molecules, stationary fluctuation-dissipation approximations6,15 (linearizing responses, SI) then give:
where the limit holds for all Hij and τi (SI). This reflects a classic trade-off in control theory: higher total gain suppresses spontaneous fluctuations in X1 but amplifies the transmitted fluctuations from X2 to X1. Numerical analysis confirms that even a Hill-type inhibition function u can get close to the limit (not shown), and thus that direct inhibition can do almost as well as a control demon. However, the parameter requirements can be extreme: the signal molecules must be very short-lived, and the optimal gain may be so high that introducing any delays or ‘extrinsic’ fluctuations6,16 would destabilize the dynamics. Regardless of the inhibition control network, plasmids thus need to express inhibitors at extraordinarily high rates, and generate strongly nonlinear feedback responses without introducing signaling cascades. Most plasmids indeed take these strategies to the extreme, for example transcribing control genes tens of thousands of times per cell cycle using several gene copies and some of the strongest promoters known. Some plasmids also eliminate many of the cascade steps inherent in gene expression, using small regulatory RNAs, and still create highly nonlinear responses using proofreading-type mechanisms (Fig. 3, left). Others partially avoid indirect control by ensuring that the plasmid copies themselves prevent each other’s replication (Fig. 3, right), or suppress noise without closing control loops17,18 by changing the Poisson nature of the X1 and X2 chemical events (Eq. (1)). Though such schemes may have limited effects on variances11, some plasmids seem to take advantage of them5.
Several recent studies have generalized control-theoretic notions19,20 or applied them to biology21,22. Others have demonstrated physical limits on the accuracy of cellular signaling13,23,24,25, for example using fluctuation-dissipation approximations to predict estimation errors associated with a constant number of diffusing molecules hitting a biological sensor26. Interestingly, the latter show that the minimal relative error decreases with the square root of the number of events, regardless of detection mechanism. Some studies have also analyzed the information transfer capacity of open-loop molecular systems25, or extracted valuable insights from Gaussian small-noise approximations. Here we extend these works by developing exact mathematical methods for arbitrarily complex and nonlinear real-time feedback control of a dynamic process of noisy synthesis and degradation. In such systems, the minimal error decreases with the quartic root of the integer number of signaling events, making a decent job 16 times harder than a half-decent job. This perhaps explains why there is so much biochemical noise – correcting it would just be too costly – but also constrains other aspects of life in the cell. For example, the noise levels may increase or decrease along signaling cascades, depending on the kinetic details at each step, but information about upstream states is always progressively and irreversibly lost. Though it is tempting to believe that large reaction networks are capable of almost anything if the rates are suitably nonlinear, the opposite perspective may thus be more appropriate: having more steps where one component affects the rates of another creates more opportunities for losing information and fundamentally prevents more types of behaviors. 
While awaiting the detailed models that predict what single cells actually do – which require every probabilistic chemical step to be well characterized – fusing control and information theory with stochastic kinetics thus provides a useful starting point: predicting what cells cannot do.
This research was supported by the BBSRC under grant BB/C008073/1, by the National Science Foundation Grants DMS-074876-0 and CAREER 0720056, and by grants GM081563-02 and GM068763-06 from the National Institutes of Health.
Author contributions: The three authors (I.L., G.V., and J.P.) contributed equally, and all conceived the study, derived the equations, and wrote the paper.
Author information: Reprints and permissions information is available at npg.nature.com/reprints. The authors declare no competing financial interests.