The phosphorylation and dephosphorylation of proteins by kinases and phosphatases constitute an essential regulatory network in eukaryotic cells. This network supports the flow of information from sensors through signaling systems to effector molecules, and ultimately drives the phenotype and function of cells, tissues, and organisms. Dysregulation of this process has severe consequences and is one of the main factors in the emergence and progression of diseases, including cancer. Thus, major efforts have been invested in developing specific inhibitors that modulate the activity of individual kinases or phosphatases; however, it has been difficult to assess how such pharmacological interventions would affect the cellular signaling network as a whole. Here, we used label-free, quantitative phosphoproteomics in a systematically perturbed model organism (Saccharomyces cerevisiae) to determine the relationships between 97 kinases, 27 phosphatases, and more than 1000 phosphoproteins. We identified 8814 regulated phosphorylation events, describing the first system-wide protein phosphorylation network in vivo. Our results show that, at steady state, inactivation of most kinases and phosphatases affected large parts of the phosphorylation-modulated signal transduction machinery, and not only the immediate downstream targets. The observed cellular growth phenotype was often well maintained despite the perturbations, arguing for considerable robustness in the system. Our results serve to constrain future models of cellular signaling and reinforce the idea that simple linear representations of signaling pathways might be insufficient for drug development and for describing organismal homeostasis.
Protein kinases, and, to a lesser extent, protein phosphatases, are attractive drug targets (1–5); however, although their respective catalytic activities are well characterized, their functions in vivo remain relatively poorly understood. Despite extensive in vitro (6), in silico (7), or indirect in vivo assays (8), our knowledge of the global relationships between kinases, phosphatases, and their substrates remains fragmented (2). Even less is known about the more downstream, indirect consequences of kinase activity, making rational selection of suitable candidates for therapeutic interventions difficult; consequently, many promising kinase inhibitors are ultimately retired from development (9).
One promising approach for closing this knowledge gap is the organism-wide, quantitative assessment of all phosphorylated proteins, comparing phosphorylation status in wild-type cells to that in cells that have undergone systematic perturbations of their kinases or phosphatases. Progress in phosphoproteomics technology has brought this goal within reach by enabling the reproducible quantification of thousands of phosphorylation sites in a single study (10–12). Although the throughput is not yet sufficient to systematically address all 518 protein kinases and 147 protein phosphatases in human cells (13, 14), simpler organisms, such as yeast, can be addressed. Yeast in particular is frequently used as a model to study human diseases (15), including cancer, mitochondrial diseases, and even neurological disorders caused by protein misfolding (16, 17). Although some signaling systems, such as the apoptotic machinery, are absent in yeast, other parts of its signaling network display substantial similarities to those in human cells (18, 19). Of the 161 kinases and phosphatases in yeast, 136 are conserved in humans at more than 30% amino acid sequence identity (table S1), and some human signaling proteins can even replace their yeast counterparts (20). Here, we used a combination of phosphoproteomics measurements and computational methods (11) to detect and quantify the system-wide responses in the yeast phosphoproteome upon deletion or inhibition of most of its kinases and phosphatases.
We developed an integrated experimental and computational strategy for high-throughput comparative phosphoproteomic analysis in Saccharomyces cerevisiae (Fig. 1), which consisted of the following steps. First, we systematically perturbed the kinase-substrate and phosphatase-substrate networks by selecting gene deletion mutants of the nonessential kinases or phosphatases or, for some essential kinases, by generating mutants inhibitable by cell-permeable drugs, which are referred to as “analog-sensitive” kinase strains (21). To minimize compensatory mutations that might accumulate over time in the gene deletion strains, we freshly prepared all mutant strains. To enable a statistical characterization of our observations, we always grew, processed, and measured each perturbed strain in three independent replicates, together with three replicates of wild-type control cells. Phosphopeptides were isolated from each sample (22, 23) and submitted to high-performance mass spectrometry to generate liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) phosphoproteome maps. The triplicate phosphoproteome maps generated from each perturbed or wild-type cell sample were annotated with the amino acid sequences of the detected phosphopeptide features and were aligned with the algorithm SuperHirn (24), which was followed by additional postprocessing (see Supplementary Materials for details). The statistical significance of observed changes in the perturbed states was then computed for each phosphopeptide with the Corra software suite (25).
We assessed the reliability of our measurements and computational data processing at two levels. First, we assessed the confidence of the phosphopeptide identifications generated by database searching, and second, we assessed the reproducibility of detecting quantitative phosphopeptide differences between wild-type and mutant strains. For the first check, and to determine the reliability of our phosphopeptide identifications from the peptide fragment ion spectra, we performed statistical analyses with the PeptideProphet tool (26) and a decoy database strategy (27). From these analyses, we found that a PeptideProphet probability cutoff of 0.9 corresponded to a false discovery rate (FDR) of ~0.038 (3.8%) (table S2), which confirms that our chosen cutoff of 0.9 yielded an acceptably low degree of incorrect peptide identifications, in particular because most phosphopeptides were identified repeatedly in the context of this extensive study.
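The decoy-based estimate described above can be illustrated with a minimal sketch. It assumes that each peptide-spectrum match is available as a (probability, is_decoy) pair and that the decoy database has the same size as the target database; the record layout, the function name `decoy_fdr`, and the simple decoys/targets estimator are illustrative assumptions, not necessarily the exact procedure of (27).

```python
def decoy_fdr(psms, cutoff):
    """Estimate the FDR among target hits at or above a probability cutoff.

    psms: iterable of (probability, is_decoy) pairs (hypothetical layout).
    Assumption: decoy hits above the cutoff approximate the number of
    false target hits, so FDR is estimated as decoys / targets.
    """
    psms = list(psms)
    targets = sum(1 for p, is_decoy in psms if p >= cutoff and not is_decoy)
    decoys = sum(1 for p, is_decoy in psms if p >= cutoff and is_decoy)
    if targets == 0:
        return 0.0  # no accepted target hits, so nothing to be wrong about
    return decoys / targets


# Toy input, invented for illustration only.
toy_psms = [(0.95, False)] * 50 + [(0.92, True)] * 2 + [(0.50, True)] * 10
toy_fdr = decoy_fdr(toy_psms, cutoff=0.9)
```

Scanning `cutoff` over a range of values with such a function is one way to relate a probability cutoff to an estimated FDR.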
We then used the statistical tool Corra (25), which supports an empirical Bayesian alternative to the t test (28). The test improves the reliability of conclusions in cases of large-scale testing. For each phosphopeptide feature, the test provided a P value of the observed differences between wild-type and mutant replicates. The P values were further corrected for multiple testing according to the Benjamini and Hochberg procedure (29) (see the Supplementary Materials). After this quantitative analysis step, we chose an FDR threshold of 0.015 in conjunction with a minimum fold-change requirement of log2 >1.5, both of which had to be met before we would consider any phosphopeptide as reproducibly regulated. At this threshold, nine comparisons between wild-type and lowest-impact kinase mutants resulted in only a single or no phosphopeptide being designated as regulated, which verified the validity of our selected criteria. On the basis of these results, we concluded that our applied cutoffs ensured that, despite a high sensitivity (fig. S1), only a minimal amount of noise entered our analyses and that we achieved high reproducibility in the observed regulatory events.
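The correction and thresholding steps can be sketched as follows. This is a minimal illustration, not the Corra implementation: it takes the P values as given (the empirical Bayesian test itself is not reproduced), applies the Benjamini and Hochberg adjustment, and then requires both cutoffs to be met. Reading the fold-change criterion as an absolute log2 fold change greater than 1.5 is an assumption, as are the function names.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the adjusted
    # values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_regulated(p_values, log2_fold_changes, fdr=0.015, min_abs_log2fc=1.5):
    """Flag features that pass both the FDR and the fold-change cutoff.

    Assumption: the fold-change criterion is |log2 fold change| > 1.5.
    """
    q_values = benjamini_hochberg(p_values)
    return [
        q <= fdr and abs(fc) > min_abs_log2fc
        for q, fc in zip(q_values, log2_fold_changes)
    ]


# Toy values, invented for illustration only.
toy_p = [0.001, 0.02, 0.03, 0.5]
toy_fc = [2.0, 3.0, -2.0, 0.1]
toy_q = benjamini_hochberg(toy_p)
toy_calls = call_regulated(toy_p, toy_fc)
```

Requiring both criteria means that a large fold change with weak statistical support, or a well-supported but small change, is not counted as regulated.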
Overall, we attempted the analysis of 161 mutant strains of yeast. Of these, 37 strains could not be analyzed because they were not viable, not inhibitable, or otherwise not amenable to our procedure (table S1). In total, we generated quantitative data for 116 gene deletion mutants and for an additional 8 strains in which analog-sensitive kinases were pharmacologically inhibited (table S1). Together, this corresponds to coverage of 78% of the theoretical kinase and phosphatase space in yeast and covered 77% of those enzymes that show sequence conservation with human enzymes (table S1). A matrix and a network generated from these data related the observed changes in the abundance of a phosphopeptide (measured in triplicate) to the corresponding kinase or phosphatase deletion (Fig. 2 and fig. S2). The matrix contains 8814 reproducible changes in peptide abundance that mapped to 1026 phosphoproteins that were clustered according to the coregulation of the phosphopeptides (tables S3 and S4). Of note, an additional 7550 phosphopeptides were consistently identified but did not exhibit a substantial change in abundance under any of the perturbations tested.
Finally, the cellular abundance distribution of detected phosphoproteins (regulated and unregulated) was roughly similar to that of the total yeast proteome; however, the complete phosphoproteome was still not covered (fig. S3), because under our chosen growth conditions, many phosphorylation sites would not be phosphorylated, and because our experimental pipeline had several biases, among them that only tryptic peptides with a mass/charge ratio (m/z) suitable for LC-MS/MS analysis (30) could be identified. Nevertheless, the observed phosphorylation sites covered a reasonably large fraction of the phosphoproteome, and therefore an existing bias should not impair our conclusions (31).
Because kinases and phosphatases are components of complex, interconnected signaling networks, we fully expected to observe a number of indirect, downstream responses, that is, phosphopeptides whose abundance would change despite their not being a direct molecular target of the kinase or phosphatase in question. Indeed, we found that such events seemed to strongly outnumber direct kinase-substrate interactions, as argued by the following observations. First, we determined for each kinase or phosphatase the number of phosphopeptides whose responses showed the expected directionality (that is, reductions in abundance in the case of kinase deletions and increases in abundance in the case of phosphatase deletions). In general, the number of phosphopeptides that responded in the expected directionality was roughly similar to that of phosphopeptides that responded with “inverted” directionality (Fig. 2 and fig. S4). Exceptions to this finding were analog-sensitive kinases that were inhibited over the short term; for example, in the case of Cdc28, about 76% of the phosphopeptides were regulated in the expected directionality. No difference in the direction of regulation was observed between nonessential kinases or phosphatases (fig. S4). Second, we conservatively assumed that phosphopeptides that changed in abundance in only a single deletion strain might be direct molecular targets of the kinase or phosphatase in question. By this measure, we found that, at most, 32% of the observed regulatory events might have been direct for kinases (that is, that the events mapped to just a single kinase), whereas in the case of phosphatases this number was 53%. The data sets generated by the short-term inhibition of the analog-sensitive kinases showed a higher fraction of potential direct targets (44%) than did the permanent deletion strains.
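The first two measures can be sketched in a few lines. The example assumes that each regulatory event is recorded as an (enzyme, phosphopeptide, log2 fold change) triple; this layout, the function name, and the enzyme names in the toy input are hypothetical.

```python
from collections import defaultdict


def summarize_events(events, enzyme_type):
    """Return (fraction with expected directionality, fraction unique).

    events: list of (enzyme, phosphopeptide, log2_fold_change) triples.
    enzyme_type: dict mapping enzyme name to "kinase" or "phosphatase".
    Expected directionality: decrease for a kinase perturbation,
    increase for a phosphatase perturbation.
    """
    if not events:
        return 0.0, 0.0
    enzymes_per_peptide = defaultdict(set)
    expected = 0
    for enzyme, peptide, log2fc in events:
        enzymes_per_peptide[peptide].add(enzyme)
        is_kinase = enzyme_type[enzyme] == "kinase"
        if (is_kinase and log2fc < 0) or (not is_kinase and log2fc > 0):
            expected += 1
    # Events whose phosphopeptide responds to exactly one perturbation are
    # treated as potential direct targets (an upper bound).
    unique = sum(
        1 for _, peptide, _ in events if len(enzymes_per_peptide[peptide]) == 1
    )
    return expected / len(events), unique / len(events)


# Toy input with hypothetical enzyme and peptide names.
toy_types = {"kinA": "kinase", "ppB": "phosphatase"}
toy_events = [
    ("kinA", "pep1", -2.0),
    ("kinA", "pep2", 1.8),
    ("ppB", "pep2", 2.1),
    ("ppB", "pep3", -1.7),
]
toy_expected, toy_unique = summarize_events(toy_events, toy_types)
```

Because a phosphopeptide that responds to a single perturbation is not necessarily a direct substrate, the second fraction is best read as an upper bound on the share of direct events.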
Third, we tested the overlap of our data with various previously established reference protein-protein interactions in yeast (32–35), such as the STRING database (tables S5 and S6). We observed that the overlap of our data with these direct interactions was small (table S5). This is consistent with the long-held notion that kinase-substrate interactions are too weak and transient to be detectable by typical affinity purification–based protein interaction screens. Reassuringly, however, first, the overlap of the heavily studied kinase Cdc28 with our data set on the level of regulated phosphoproteins was high, showing a 43% overlap with the study of Ubersax et al. (36) and a 76% overlap with the study of Holt et al. (10) (on the phosphorylation site level, the overlap was 46%). Second, all other phosphorylation events that did overlap showed substantial enrichments for the expected directionality. Likewise, we observed substantial enrichment of confirmed interactions, in particular for those phosphopeptides that responded only in a single perturbation (table S7). This indicates that our data included a sizeable fraction of direct enzyme-target interactions; however, from all three tests, we can conclude that indeed a large majority of our observed events were indirect consequences of the deletion. Not a single kinase showed exclusively direct effects, indicating that a focused modulation of a pathway (branch) without system-wide adaptations might not be possible with a single drug.
As is the case in prolonged pharmacological intervention, our genetic kinase-deletion approach gave the cells ample time to accommodate (and potentially compensate for) the loss of kinase activity. This should not only have led to downstream, indirect consequences on the phosphoproteome, but could have also entailed subsequent changes in gene expression and the amounts of proteins produced. To assess the extent of this effect, we measured not only abundance changes in the phosphoproteome but also abundance changes of the proteins themselves, by observing unphosphorylated peptides in a subset of 16 kinase deletion strains. The kinases selected for this test ranged from those that had a small effect on the phosphoproteome to those that had a large effect. The data indicated that for a total of 467 regulated phosphopeptides that matched to 118 proteins covered in this analysis, 79% of the proteins remained unchanged in abundance, and, in a single case, the directionality of the phosphopeptide regulation was opposite to the protein abundance change (figs. S5 and S6). In 21% of the cases in which a phosphopeptide was regulated, we also observed a change in protein abundance in the same direction.
We also performed additional orthogonal, but more indirect, analyses based on the coregulation or antiregulation of phosphorylation sites on the same protein, which we found in more than half of the phosphoproteins. We reasoned that a synchronous change with a similar amplitude and directionality of such phosphopeptides would indicate an abundance change of the corresponding protein. In contrast, a discordant abundance change of the phosphopeptides from such proteins would indicate a change in phosphorylation site occupancy. These data (fig. S7) can be summarized as follows: For about 25% of the observed events, only a single regulated phosphopeptide was detected on the entire length of the phosphoprotein, impeding this type of analysis. The remainder of events fell into three classes: In 49% of the remaining cases, at least two phosphopeptides originating from the same protein were observed to be regulated, and these exhibited identical directionality. In contrast, in 5% of events, the changes were of opposing directionality; the latter pattern was not consistent with a simple protein abundance change. Of note, in a large part of the data, that is, in 46% of cases, a phosphopeptide that had substantially changed in abundance was detected with at least one other phosphopeptide on the same protein, but the other phosphopeptides were not observed to be regulated. The latter two categories indicate that for most events detected in this study, changes in the abundance of a phosphopeptide could not be explained by changes in protein abundance alone.
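The classification logic described above can be sketched as follows, assuming that, for one protein in one perturbation, each detected phosphopeptide is encoded as +1 (increased), -1 (decreased), or 0 (detected but not regulated). The encoding, the class labels, and the choice to label a protein by its regulated peptides when unregulated ones are also present are simplifying assumptions.

```python
def classify_protein(states):
    """Classify the phosphopeptides of one protein in one perturbation.

    states: list with one entry per detected phosphopeptide:
            +1 (up), -1 (down), or 0 (detected, not regulated).
    """
    regulated = [s for s in states if s != 0]
    if not regulated:
        return "unregulated"
    if len(states) == 1:
        # Only one phosphopeptide on the protein: not informative.
        return "single_peptide"
    if len(regulated) == 1:
        # Other phosphopeptides were detected but did not change.
        return "partner_unregulated"
    if len(set(regulated)) == 1:
        # Compatible with a change in protein abundance.
        return "same_direction"
    # Not explained by a simple protein abundance change.
    return "opposing_direction"


# Toy inputs covering each class.
toy_classes = [
    classify_protein(states)
    for states in ([1], [1, 1], [1, -1], [-1, 0], [0, 0])
]
```

Only the "same_direction" class is compatible with a pure protein abundance change, and even there a coordinated change in site occupancy cannot be excluded.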
The number of phosphopeptides that were affected by the deletion of a given kinase or phosphatase varied considerably (Fig. 2). Therefore, we (i) quantified the impact of each kinase or phosphatase on the phosphoproteome under the growth conditions tested, (ii) assessed whether the kinases and phosphatases were associated with different biological processes according to their effect on the phosphoproteome, and (iii) determined which biological processes were affected by each kinase and phosphatase.
We first computed the fraction of phosphopeptides that were affected by a given kinase or phosphatase relative to the total number of phosphopeptides that were affected by the kinases and phosphatases (Fig. 3A and table S8). We observed that the deletion of 22% of the kinases and phosphatases that we tested resulted in fewer than 10 perturbed phosphopeptides each; therefore, we considered these deletions to have had minimal effects on the fraction of the phosphoproteome detected in this study. These included kinases important in cellular stress response mechanisms, such as Mrk1 (37) and Gcn2 (38). In contrast, for 78% of the kinase and phosphatase deletion strains, distinct changes in the phosphoproteome could be detected. The kinases with the largest effects on the phosphoproteome were Ctk1 (39), a kinase with key roles in the regulation of transcription and translation, and Psk2, which is involved in sugar flux and translational regulation (40). These data show that the loss of most kinases or phosphatases indeed perturbed large parts of the signaling network.
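As a minimal sketch of this impact measure, assume a mapping from each enzyme to its number of regulated phosphopeptides and a separately supplied total of affected phosphopeptides. The function name, the input layout, and the enzyme names and counts in the toy input are hypothetical; the cutoff of 10 phosphopeptides follows the text.

```python
def impact_fractions(regulated_counts, total_affected, minimal_cutoff=10):
    """Return per-enzyme impact fractions and the minimal-impact enzymes.

    regulated_counts: dict mapping enzyme name to the number of
                      phosphopeptides regulated in its perturbation.
    total_affected: total number of phosphopeptides affected across all
                    perturbations (supplied by the caller).
    """
    fractions = {
        enzyme: count / total_affected
        for enzyme, count in regulated_counts.items()
    }
    minimal = sorted(
        enzyme
        for enzyme, count in regulated_counts.items()
        if count < minimal_cutoff
    )
    return fractions, minimal


# Toy counts with hypothetical enzyme names.
toy_counts = {"kinA": 120, "kinB": 4, "ppC": 36}
toy_fractions, toy_minimal = impact_fractions(toy_counts, total_affected=160)
```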
We next determined the distribution of biological processes represented by the phosphoproteins affected by the lower-impact (bottom half) and higher-impact (top half) kinases and phosphatases, respectively. We found that the enzymes with the smallest effect showed a strong enrichment in processes associated with mitogen-activated protein kinase (MAPK) cascade signaling [“MAPKKK (MAPK kinase kinase) cascade,” P = 3.9 × 10^-10; “response to pheromone,” P = 4.2 × 10^-6], whereas the enzymes with the largest effects showed a strong enrichment in processes related to the mitotic cell cycle (“interphase of mitotic cell cycle,” P = 3.1 × 10^-9; “mitotic cell cycle,” P = 1.4 × 10^-6) (tables S9 and S10). These data showed that under the tested conditions, even stress- or mating-related kinases showed a measurable impact on the phosphoproteome, albeit lower than that of growth- and cell cycle–related kinases or phosphatases. Lastly, we also computed those biological processes that were enriched among the responders of each individual kinase or phosphatase. We found that 575 biological processes were enriched (Fig. 3B and table S11), an average of five processes for each active kinase or phosphatase. The most frequently enriched functions were “endocytosis” (39 times) and “cell morphogenesis” (38 times). Together, these data illustrate that the effects of most kinases and phosphatases on the signal transduction network, and thereby on controlled biological processes, were broad, perhaps broader than expected (2).
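An enrichment P value of this kind is commonly obtained from a one-sided hypergeometric test. The sketch below shows that calculation under the assumption that such a test is appropriate here; the text does not state which test was used, and the function name and the toy numbers are illustrative.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric P value, P(X >= hits).

    population: number of background proteins.
    annotated: background proteins annotated with the process.
    sample: number of responder proteins for the enzyme(s).
    hits: responder proteins annotated with the process.
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers, chosen only so that the result is easy to check by hand:
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40 / 120.
toy_p_value = hypergeom_enrichment_p(population=10, annotated=4, sample=3, hits=2)
```

When many processes are tested for each enzyme, the resulting P values would also need a multiple-testing correction.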
We next tested the phenotypic consequences of deletion of kinases and phosphatases, which are particularly relevant with regard to the effects (and side effects) of potential drugs that inhibit kinases or phosphatases. For each deletion strain, we assessed changes in growth speed (41) and morphological features (table S8) (42). Although 97 of the deletion strains showed reproducible responses in the phosphorylation network, only 9 mutants showed a strong effect on growth speed, and the total was 23 if strong changes in morphological features were also included (Fig. 3A). Conversely, 11 of the 27 kinases and phosphatases that had an undetectable, or only minimal, effect on the section of the phosphoproteome measured in this study showed a phenotype, among them the kinase Elm1 (43), which showed a strong morphological phenotype. Although many strong morphological phenotypes were indeed observed in mutants that showed a strong change in the phosphoproteome, the results were nevertheless surprising because they indicated that strong phenotypes were not necessarily reflected in the status of the phosphoproteome, as exemplified by Elm1 and other enzymes. Perhaps, in some cases, compensatory effects (visible at the level of the phosphoproteome) were precisely what prevented the occurrence of strong phenotypic consequences, as suggested by the lack of correlation between the growth phenotypes and the changes in the phosphoproteome. This observation is particularly relevant because, first, cancer cells might display in some regards increased compensatory power, and second, kinase inhibitors that are specific for a target in vivo might not necessarily result in a cellular phenotype.
Our study delineates the responses of the system-wide cellular phosphorylation network upon systematic inactivation of individual kinases or phosphatases. Because the phosphorylation network is one of the main cellular backbones for the processing of information and the implementation of cellular responses, it is highly dynamic. Our measured behavior is only a single snapshot of a large number of possible outcomes, which were constrained by the growth and experimental conditions that we chose.
The first surprising observation that we made was that 7550 phosphopeptides were consistently identified but did not show a substantial amount of regulation. This may be due to, first, our cutoffs being conservative; thus, many putative regulatory events may not have been reproducible or strong enough to be deemed substantial. Second, 22% of the kinase and phosphatase mutants could not be analyzed, mainly because the corresponding genes are essential for cellular viability. Perhaps their essentiality is at least partly due to a generally higher impact on the phosphoproteome, as indicated recently (10), or because their substrates need to be phosphorylated constitutively. Third, in yeast, a large number of paralogous kinase isoforms exist (for example, Tpk1, Tpk2, and Tpk3). Given this, it is reasonable to expect some overlap or redundancy in substrates, which could lead to a considerable number of phosphorylation sites that would appear unregulated as long as only one of the paralogous duplicates was deleted. Fourth, the yeast populations that we analyzed consisted in a strict sense of many mixed subpopulations (for example, cells in different cell cycle states), and it can be assumed that an identical phosphorylation site can become phosphorylated by different kinases during the cell cycle. Therefore, analyzing deletions of single kinases or phosphatases would only manifest in slight, if any, regulation for such sites; for example, a cell cycle phase–specific regulation is masked by all cells that are not in that particular phase at any given time point. Fifth, we also analyzed whether the regulated and nonregulated phosphopeptides fell into different protein abundance classes (for example, the nonregulated are of low abundance and therefore regulation is more difficult to observe), but this was not the case. Overall, it is likely that all five possible explanations contribute to the observed result.
Another finding of this study was the unexpectedly strong dominance of indirect effects (as opposed to direct molecular target effects), which often did not result in a strong cellular phenotype. To some extent, this observation fits with a view of signaling networks having to be highly flexible and redundant to respond to an ever-changing environment while maintaining stable cellular states (44). This constrains the architecture of the system, as described by the “law of requisite variety” (45, 46), a fundamental law in systems control theory. It states that stable systems have to encode a number of control states that is higher than or equal to the number of states to be controlled. Because the space of “environmental states” is enormous for each cell, the cellular “control variable space” must be of equal or greater size. The combinatorial possibilities of the phosphoproteome seem to ideally fulfill this demand (44).
An alternative explanation for this observation might also be found in the theory of Neutral Evolution (47). It is possible that only a small number of the observed phosphorylation events are actually relevant for the function and survival of the cell, whereas most phosphorylation events would simply have no effect, or at least have no negative effect, on the cell. As a result, such phosphorylation sites would not be counterselected during evolution. The data generated in this study do not, by themselves, support or refute this hypothesis. Finally, the low correlation between phenotype and the degree of change in the phosphoproteome may have been affected by the growth conditions chosen here, the lack of sensitivity of the phenotypic assays, or the possibility that the phosphoproteomics data were not sampled deeply enough to find such correlations.
In addition to revealing insights into the architecture of cellular signaling, our data set also describes the proteome-wide functional states of yeast cells; this might be useful for determining diagnostic markers for stress conditions, functional states of key pathways, or the activity of a given kinase or phosphatase. These markers could be used in conjunction with targeted proteomics approaches to not only study basic biological processes but also determine how a given pharmacological intervention would affect the cellular signaling network.
Targeted proteomics methods not only allow the cellular information flux to be observed under many conditions and at high throughput, but also make it possible to determine, for all phosphorylation sites, whether an observed change is a “true” regulation event or simply the result of a change in protein abundance (48–50). This is because both the phosphopeptide and several proteotypic peptides corresponding to the protein can be relatively or absolutely quantified, thus determining the phosphorylation site occupancy and regulation. Overall, our data provide global starting points, and constraints, toward understanding the complexity of phosphorylation regulation in yeast and other organisms. In the future, the results should be complemented by similar data for specific cellular conditions, time courses, or small-molecule interventions, thereby sharpening, step by step, our view of the events in the phosphorylation network. The ensuing insights into general design rules and motifs in cellular information processing will be essential for our ability to develop kinase-based drugs in an informed way.
The generated LC-MS/MS phosphoproteome maps (table S2), an overview of the generated data (table S12), and the statistical methods used for their analysis are explained in detail in the Supplementary Materials. We have made available all kinase/phosphatase-responder relations in a user-friendly way in the recently described PhosphoPep database (30, 51) (http://www.phosphopep.org). All yeast strains used here can be supplied upon request in a 96-well plate format (table S13).
Materials and Methods
Fig. S1. Power of the analysis approach.
Fig. S2. Topological properties of the protein phosphorylation network.
Fig. S3. Abundance distribution of responder phosphoproteins (proteins that contain “regulated” phosphopeptides).
Fig. S4. Ratio of phosphopeptides that are reduced or increased in abundance.
Fig. S5. Regulation of phosphopeptides versus regulation of protein abundance.
Fig. S6. Regulation of phosphopeptides versus regulation of protein abundance.
Fig. S7. Regulation of phosphopeptides that map to the same protein.
Table S1. List of enzymes.
Table S2. False discovery rate of peptide identification and specificity of phosphopeptide enrichment for each analyzed phosphorylation pattern.
Table S3. Information on phosphopeptides and phosphoproteins.
Table S4. Significant coregulation of kinases and phosphatases.
Table S5. Overlap of data from this study with other data sets.
Table S6. Confirmed STRING interactions.
Table S7. Overlap of possible direct targets with other data sets.
Table S8. Effects of each kinase and phosphatase on the phosphoproteome.
Table S9. Enrichment of biological processes among the low-impact kinases (bottom half).
Table S10. Enrichment of biological processes among the high-impact kinases (top half).
Table S11. GO terms.
Table S12. Overview of the entire data set.
Table S13. Information on yeast strains.