Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2859978

Formats

Article sections

Authors

Related links

Nature. Author manuscript; available in PMC 2010 April 27.

Published in final edited form as:

PMCID: PMC2859978

NIHMSID: NIHMS184849

Correspondence and requests for materials should be addressed to J.G. (Email: ude.dravrah.smh@ymerej).

Reversible phosphorylation on serine, threonine and tyrosine is the most widely studied posttranslational modification of proteins^{1}^{,}^{2}. The number of phosphorylated sites on a protein (*n*) shows a significant increase from prokaryotes, with *n*≤7 sites, to eukaryotes, with examples having *n*≥150 sites^{3}. Multisite phosphorylation has many roles^{4}^{,}^{5} and site conservation indicates that increasing numbers of sites cannot be due merely to promiscuous phosphorylation. A substrate with *n* sites has an exponential number (2^{n}) of phospho-forms and individual phospho-forms may have distinct biological effects^{6}^{,}^{7}. The distribution of these phospho-forms and how this distribution is regulated have remained unknown. Here we show that, when kinase and phosphatase act in opposition on a multisite substrate, the system can exhibit distinct stable phospho-form distributions at steady state and that the maximum number of such distributions increases with *n*. Whereas some stable distributions are focused on a single phospho-form, others are more diffuse, giving the phosphoproteome the potential to behave as a fluid regulatory network able to encode information and flexibly respond to varying demands. Such plasticity may underlie complex information processing in eukaryotic cells^{8} and suggests a functional advantage in having many sites. Our results follow from the unusual geometry of the steady-state phospho-form concentrations, which we show to constitute a rational algebraic curve, irrespective of *n*. We thereby reduce the complexity of calculating steady states from simulating 3×2^{n} differential equations to solving two algebraic equations, while treating parameters symbolically. We anticipate that these methods can be extended to systems with multiple substrates and multiple enzymes catalysing different modifications, as found in posttranslational modification ‘codes’^{9} such as the histone code^{10}^{,}^{11}. Whereas simulations struggle with exponentially increasing molecular complexity, mathematical methods of the kind developed here can provide a new language in which to articulate the principles of cellular information processing^{12}.

A major difficulty in studying multisite phosphorylation from a systems perspective has been the lack of information regarding when different sites on the same protein are simultaneously phosphorylated^{13}. Sites may be intricately dependent on each other^{14} and both the number and the position of phosphates can affect biological outcome^{6}^{,}^{7}. The phospho-form distribution—the relative stoichiometries of each of the 2^{n} phospho-forms—is thus the appropriate measure of phosphorylation state for a multisite substrate. Such distributions are starting to be measured^{15}^{,}^{16}, prompted by interest in posttranslational modification (PTM) codes. A theoretical understanding will improve our ability to interpret such data, uncover biological principles and design appropriate experiments. Although we focus on phosphorylation, we hope to lay a foundation for analysing multiple PTMs.

Figure 1 summarizes a general model of multisite phosphorylation. A substrate *S*, with *n* sites, is acted on by a kinase *E* and a phosphatase *F*. Substrates may have multiple kinases and phosphatases *in vivo* but a single enzyme often addresses multiple sites and we focus here on the minimal enzymatic machinery needed for any *n*. Each enzyme may act distributively (Fig. 1a) as well as processively (Fig. 1b) using a standard biochemical scheme (Fig. 1c) and arbitrary preferences for site order. These assumptions are more general than in previous models^{17}^{-}^{22}: by choosing the details appropriately, any kinase, phosphatase and substrate system can be represented. ATP is assumed to be recharged by some process that is not modelled. It is therefore not treated as a variable but its effect is absorbed into the site-specific parameters (${a}_{u}^{X},{b}_{u}^{X},{c}_{u,v}^{X}$ defined in Fig. 1c). Phosphorylation and dephosphorylation are assumed to take place on a fast time scale in comparison to synthesis and degradation of the component proteins. The model is, therefore, effectively closed: there is no flux of material through it and the total amounts of substrate, *S*_{tot}, and enzymes, *E*_{tot} and *F*_{tot}, remain constant at all times. With mass-action kinetics, these assumptions give rise to 3×2^{n} nonlinear differential equations for the state variables (Fig. 1d).

With limited information on site-specific parameters, numerical simulations can be undertaken for randomly selected parameter values in an attempt to discern typical behaviours. However, doing so in a state space of dimension 3×2^{n} rapidly becomes intractable as *n* increases. Here we introduce a new method of analysis, which allows strong conclusions to be drawn about steady states without having to specify parameter values in advance. Experimental evidence indicates that biological systems attain quasi-steady states *in vivo*^{23}^{,}^{24}, including systems in which multisite phosphorylation has a significant role^{25}. While the steady-state assumption must be confirmed in each experimental context, it has been widely used in modelling multisite phosphorylation^{17}^{-}^{22}.

At steady state, the second equation in Fig. 1d yields an expression for *XS _{u}* in terms of

The linearized equations for the *S _{u}* can be solved by Gaussian elimination, which works as well over the coefficients (

$${S}_{u}={S}_{0}{r}_{u}\left(t\right)$$

(1)

where *S*_{0} is the unphosphorylated phospho-form and *r _{u}*(

If numerical values are to be given to the site-specific parameters, it is important to know that the rational functions *r _{u}*(

Equation (1) implies that the steady-state phospho-forms can be described through a single auxiliary variable, *t*, so that they form a one-dimensional geometric object, or ‘curve’. Despite increasing numbers of sites, and the exponentially increasing size of the model, the steady-state phospho-forms always remain a curve, providing the basis for an exponential reduction in complexity. What changes with *n* is the extent to which the curve undulates, which determines how many steady states can co-exist for given amounts of substrate and enzymes (see below). Not all curves can be described by rational functions; those that can are of considerable geometric interest, as explored in an earlier paper^{26} and discussed further in the Supplementary Information.

The upshot of equation (1) is that, at steady state, the 3×2^{n} state variables are determined by *S*_{0}, *E* and *F*. Because the substrate and enzyme totals remain constant, we can formulate three equations for these three unknowns. The equation for *S*_{0} can be solved directly in terms of *S*_{tot}, leaving a pair of equations (defined in the Supplementary Information):

$${\Phi}_{1}(E,F)={E}_{\mathrm{tot}}\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}{\Phi}_{2}(E,F)={F}_{\mathrm{tot}}$$

(2)

which determine the steady-state *E* and *F* values corresponding to any given substrate and enzyme totals. Equation (2) exactly characterizes the steady states of the model (Supplementary Information). To find steady states, it is no longer necessary to numerically simulate 3×2^{n} differential equations; this can be done by only solving two algebraic equations. The complexity arising from the dynamics has been distilled away. This exponential reduction of complexity is the key to what follows.

Figure 2a shows an example with four sites and five steady states. (We assume sequentiality, as in Fig. 3a, but merely for convenience.) Only stable states are detected experimentally, or found by numerical simulation, and this example is tristable; the corresponding stable phospho-form distributions are markedly distinct. Whereas distributions 1 and 3 are each focused on a single phospho-form, distribution 2 is broader. With multiple stable states a system can encode many outcomes, or several bits of information, enabling complex information encoding and processing^{8}. Such multistability is believed to underlie cellular differentiation and other decisions^{8}^{,}^{27} but experimental examples have, so far, only demonstrated bistability^{23}^{-}^{25}. Bistability in two-site phosphorylation was previously shown by modelling^{20}.

It remained unknown how the number of stable states depends on *n*. We found that multistability tends to occur when substrate is in excess over enzymes (as in Fig. 2a). In this approximation, the two equations for *E* and *F* in equation (2) can be reduced to one polynomial equation for *t=E/F* (Supplementary Information):

$$P\left(t\right)={\alpha}_{N}{t}^{N}+{\alpha}_{N-1}{t}^{N-1}+\cdots +{\alpha}_{1}t+{\alpha}_{0}=0$$

(3)

where *a _{i}* (

The advantage of equation (3) over equation (2) is that we can readily construct solutions of the former. If *n* is even and we choose any *n*+1 distinct positive numbers, then we can always find a model for which the corresponding *P*(*t*) has these numbers as solutions (Supplementary Information). Moreover, *S*_{tot} can be chosen arbitrarily, so that the approximation of equation (2) by equation (3) is as accurate as desired. It follows that the corresponding model has *n*+1 steady states. The example in Fig. 2a meets this bound for *n*=4. If *n* is odd the same can be achieved for any set of *n* distinct positive numbers (Supplementary Information). Note that odd numbers of steady states are constructed in both cases.

This is also what we find in simulations. Furthermore, if the steady states are ordered by their corresponding *E/F* values then unstable states always occur between stable ones (Supplementary Information), as in Fig. 2a. Hence, if there are 2*k*+1 steady states, *k*+1 of them are stable. It follows that a model with *n* sites can have as many as (*n*+2)/2 stable states (*x* being the greatest integer not greater than *x*). We see that the tristability in Fig. 2a is only the tip of the iceberg: the maximum number of stable states increases with increasing numbers of sites.

Experimental detection of stable states requires an understanding of how they arise dynamically. Figure 3b considers the inter-conversion of *S*_{0} and *S*_{1}, for the sequential system in Fig. 3a. An informal argument based on the Michaelis–Menten approximation suggests that, for suitable parameter values, if the system is started entirely in *S*_{0} then substrate can remain trapped predominantly in that state, as in distribution 1 in Fig. 2a. A similar argument applies to trapping of *S*_{4}, as in distribution 3. Simulations show that all three stable distributions can be reached by starting from suitable mixtures of *S*_{0} and *S*_{4} (Fig. 2b).

*In vivo*, the most likely way that a phosphorylation system is regulated is by modulating its enzymes. Because the enzymes have been assumed to be in their active states, this corresponds to altering *E*_{tot} or *F*_{tot}. Figure 4a shows, for the tristable system in Fig. 2, that changes in *E*_{tot} can switch the system between stable states 1 and 3. It is possible that more complex modulations, involving both *E*_{tot} and *F*_{tot}, could access all three stable states or that different parameter values could facilitate additional switching capability. Figure 4b shows another option, in which changes in *E*_{tot} can switch between three stable states even though there is only a narrow window of *E*_{tot} values for which three stable states coexist. In other words, the system may not need to exhibit robust tristability to have access to three stable states.

DNA provides a static, structural mechanism for encoding information at a capacity of 2 bits per base pair. The idea that PTMs provide a dynamic mechanism for information encoding has been broadly influential^{10}^{,}^{11} but the mechanistic details remain a matter of debate^{9}^{,}^{28} and no estimate of information capacities has emerged for any such code. Our result provides the first demonstration of a PTM mechanism that can, in principle, encode an arbitrary amount of information, along with an estimate of its information capacity. If natural selection has found such a capability useful, that may help account for the emergence of large numbers of phosphorylation sites.

This work was supported in part by the NIH under grant R01-GM081578. We thank A. Manrai for scientific discussions, R. Ward for editorial help and HMS RITG for support with cluster computing. We acknowledge the encouragement of the late Stephen Thomson (1946–2006) and Charles Gunawardena (1929–2007).

**Supplementary Information** is linked to the online version of the paper at www.nature.com/nature.

Reprints and permissions information is available at www.nature.com/reprints.

1. Walsh CT. Posttranslational Modification of Proteins. Roberts and Company; 2006.

2. Cohen P. The role of reversible protein phosphorylation in health and disease. Eur. J. Biochem. 2001;268:5001–5010. [PubMed]

3. Gnad F, et al. PHOSIDA (phosphorylation site database): management, structural and evolutionary investigation, and prediction of phosphosites. Genome Biol. 2007;8:R250. [PMC free article] [PubMed]

4. Cohen P. The regulation of protein function by multisite phosphorylation — a 25 year update. Trends Biochem. Sci. 2000;25:596–601. [PubMed]

5. Holmberg CI, Tran SEF, Eriksson JE, Sistonen L. Multisite phosphorylation provides sophisticated regulation of transcription factors. Trends Biochem. Sci. 2002;27:619–627. [PubMed]

6. Wu RC, et al. Selective phosphorylations of the SRC–3/AIB1 coactivator integrate genomic responses to multiple cellular signaling pathways. Mol. Cell. 2004;15:937–949. [PubMed]

7. Park K-S, Mohapatra DP, Misonou H, Trimmer JS. Graded regulation of the Kv2.1 potassium channel by variable phosphorylation. Science. 2006;313:976–979. [PubMed]

8. Nurse P. Life, logic and information. Nature. 2008;454:424–426. [PubMed]

9. Sims RJ, Reinberg D. Is there a code embedded in proteins that is based on posttranslational modification? Nature Rev. Mol. Cell Biol. 2008;9:815–820. [PubMed]

10. Jenuwein T, Allis CD. Translating the histone code. Science. 2001;293:1074–1080. [PubMed]

11. Turner B. Cellular memory and the histone code. Cell. 2002;111:285–291. [PubMed]

12. Cohen JE. Mathematics is biology's next microscope, only better; biology is mathematics' next physics, only better. PLoS Biol. 2004;2:e439. [PMC free article] [PubMed]

13. Hunter T. The age of crosstalk: phosphorylation, ubiquitination and beyond. Mol. Cell. 2007;28:730–738. [PubMed]

14. Ferrarese A, et al. Chemical dissection of the APC repeat 3 multistep phosphorylation by the concerted action of protein kinases CK1 and GSK3. Biochemistry. 2007;46:11902–11910. [PubMed]

15. Phanstiel D, et al. Mass spectrometry identifies and quantifies 74 unique histone H4 isoforms in differentiating human embryonic stem cells. Proc. Natl Acad. Sci. USA. 2008;105:4093–4098. [PubMed]

16. Pesavento JJ, Bullock CR, LeDuc RD, Mizzen CA, Kelleher NL. Combinatorial modification of human histone H4 quantitated by two-dimensional liquid chromatography coupled with top down mass spectrometry. J. Biol. Chem. 2008;283:14927–14937. [PubMed]

17. Goldbeter A, Koshland DE. An amplified sensitivity arising from covalent modification in biological systems. Proc. Natl Acad. Sci. USA. 1981;78:6840–6844. [PubMed]

18. Lisman JE. A mechanism for memory storage insensitive to molecular turnover: a bistable autophosphorylating kinase. Proc. Natl Acad. Sci. USA. 1985;82:3055–3057. [PubMed]

19. Salazar C, Höfer T. Allosteric regulation of the transcription factor NFAT1 by multiple phosphorylation sites: a mathematical analysis. J. Mol. Biol. 2003;327:31–45. [PubMed]

20. Markevich NI, Hoek JB, Kholodenko BN. Signalling switches and bistability arising from multisite phosphorylation in protein kinase cascades. J. Cell Biol. 2004;164:353–359. [PMC free article] [PubMed]

21. Gunawardena J. Multisite protein phosphorylation makes a good threshold but can be a poor switch. Proc. Natl Acad. Sci. USA. 2005;102:14617–14622. [PubMed]

22. Kim SY, Ferrell JE. Substrate competition as a source of ultrasensitivity in the inactivation of Wee1. Cell. 2007;128:1133–1145. [PubMed]

23. Ozbudak EM, Thattai M, Lim HN, Shraiman BI, van Oudenaarden A. Multistability in the lactose utilization network of *Escherichia coli*. Nature. 2004;427:737–740. [PubMed]

24. Sha W, et al. Hysteresis drives cell-cycle transitions in *Xenopus laevis* egg extracts. Proc. Natl Acad. Sci. USA. 2003;100:975–980. [PubMed]

25. Ferrell JE, Jr, Machleder EM. The biochemical basis of an all-or-none cell fate switch in *Xenopus* oocytes. Science. 1998;280:895–898. [PubMed]

26. Manrai A, Gunawardena J. The geometry of multisite phosphorylation. Biophys. J. 2008;95:5533–5543. [PubMed]

27. Monod J, Jacob F. General conclusions: teleonomic mechanisms in cellular metabolism, growth and differentiation. Cold Spring Harb. Symp. Quant. Biol. 1961;26:389–401. [PubMed]

28. Berger SL. The complex language of chromatin regulation during transcription. Nature. 2007;447:407–411. [PubMed]

29. Huang C-YF, Ferrell JE. Ultrasensitivity in the mitogen-activated protein kinase cascade. Proc. Natl Acad. Sci. USA. 1996;93:10078–10083. [PubMed]

30. Cornish-Bowden A. Fundamentals of Enzyme Kinetics. 2nd edn Portland Press; 1995. pp. 23–28.

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |