Developments in the field of phosphoproteomics have been fueled by the need to monitor many different phosphoproteins simultaneously within the signaling networks that coordinate responses to changes in the cellular environment. This article presents a brief review of phosphoproteomics with an emphasis on the biological insights that have been derived so far.
Although many biochemical mechanisms are involved in cellular signaling, reversible phosphorylation of serine, threonine, and tyrosine residues is the one most commonly used in mammalian cells. Protein kinases are one of the largest gene families in humans and mice, accounting for 1.7% of the human genome [1,2], and up to 30% of all proteins may be phosphorylated. Traditional biochemical and genetic analyses of phosphoproteins, and of the kinases and phosphatases that modify them, have provided a wealth of information about signaling pathways. These approaches, which typically focus on one protein at a time, are, however, not readily amenable to understanding the complexity of protein phosphorylation or how individual phosphoproteins function in the context of signaling networks. The availability of genome databases and advancements in analytical technology, especially mass spectrometry, have made it possible to study many phosphoproteins and phosphorylation sites at once. The term 'phosphoproteomics' describes a sub-discipline of proteomics that is focused on deriving a comprehensive view of the extent and dynamics of protein phosphorylation. While phosphoproteomics will greatly expand knowledge about the numbers and types of phosphoproteins, its greatest promise is the rapid analysis of entire phosphorylation-based signaling networks.
Current methods for analysis of the phosphoproteome rely heavily on mass spectrometry and 'phosphospecific' enrichment techniques. Emerging technologies that are likely to have important impacts on phosphoproteomics include protein and antibody microarrays, and fluorescence-based single-cell analysis. While these methods have the potential for high sensitivity and high throughput, they require prior knowledge of particular phosphoprotein targets. In contrast, mass-spectrometry-based approaches both allow large-scale analysis and provide the ability to discover new phosphoproteins. The speed, selectivity, and sensitivity of mass spectrometry also provide important advantages over biochemical methods for the analysis of protein phosphorylation [7-9]. Because many phosphoproteins, especially signaling intermediates, are low-abundance proteins phosphorylated at sub-stoichiometric levels, a considerable amount of effort has been devoted to the development of phosphospecific enrichment methods that are compatible with, or directly coupled to, mass spectrometry. These methodological approaches have been described in a number of recent reviews [7,8,10-13], and current methods are summarized in Table 1.
Phosphoproteomics is a rapidly moving field. For example, advances in mass spectrometry, including the use of Fourier transform ion cyclotron resonance instruments, have recently been applied to improve the sensitivity and accuracy of phosphoproteomic experiments. It is likely that additional technological improvements will occur over the next few years. A recent, and very important, advance has been the incorporation of quantitative mass spectrometry methods into phosphoproteomics. Information about the dynamics of protein phosphorylation is often more informative than efforts directed solely at expanding the 'parts list' of signaling proteins. Identification of proteins or phosphorylation sites that change in response to receptor activation validates them as important components in signaling through that receptor.
Quantitative methods for mass spectrometry-based phosphoproteomics rely on the use of heavy isotopes and fall into three general categories: in vitro labeling of phosphoamino acids, in vitro labeling of proteins and peptides, and in vivo metabolic labeling. The basic principle of all three involves labeling peptides from one sample (control cells, for example) with a heavy isotope. This sample is then mixed with an unlabeled sample (from stimulated cells, for example) and the two are analyzed simultaneously. The ability of mass spectrometers to resolve the normal and isotopically labeled versions of the same peptide allows direct comparison of the amount of peptide in each sample. If the labeled peptide is a phosphopeptide, this method can be used to determine changes in the level of phosphorylation.
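The comparison described above reduces to a ratio of two peak intensities for the same peptide. The following is a minimal sketch, assuming (as in the example in the text) that the control sample carries the heavy label and the stimulated sample is unlabeled; the function names and the intensity values are hypothetical and are not taken from any of the cited studies.

```python
import math


def stimulated_to_control_ratio(light_intensity: float, heavy_intensity: float) -> float:
    """Ratio of the unlabeled (stimulated) peak to the heavy (control) peak.

    Assumption: control cells are heavy-labeled and stimulated cells are
    unlabeled, following the example given in the text.
    """
    if light_intensity < 0 or heavy_intensity <= 0:
        raise ValueError("intensities must be non-negative, and the heavy peak non-zero")
    return light_intensity / heavy_intensity


def log2_fold_change(ratio: float) -> float:
    """Express a ratio on a log2 scale so that increases and decreases are symmetric."""
    return math.log2(ratio)


# Hypothetical peak intensities (arbitrary units) for two peptides.
peak_pairs = {
    "phosphopeptide_A": (4.0e6, 1.0e6),  # (light, heavy)
    "peptide_B": (2.0e6, 2.0e6),
}

for name, (light, heavy) in peak_pairs.items():
    ratio = stimulated_to_control_ratio(light, heavy)
    print(name, ratio, log2_fold_change(ratio))
```

A ratio above 1 for a phosphopeptide, together with a ratio near 1 for unmodified peptides from the same protein, would point to a change in phosphorylation and not in protein abundance.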
Several methods for in vitro labeling of phosphoamino acids with isotopically tagged moieties have been reported (for a list of methods discussed here, see Table 2). Phosphoprotein isotope-coded affinity tag (PhIAT) involves the introduction of two isotopic forms of biotin-tagged dithiols into phosphoserine and phosphothreonine residues. Phosphoprotein isotope-coded solid-phase tag (PhIST) involves the simultaneous capture and labeling of phosphopeptides using solid-phase reagents. A third method, β-elimination/Michael addition with DTT (BEMAD), utilizes incorporation of normal or deuterated dithiothreitol followed by enrichment of labeled peptides by thiol chromatography [17,18]. All these methods utilize β-elimination/Michael addition chemistry to derivatize the phosphorylated amino acids; but this derivatization method is limited in that it cannot modify phosphotyrosine residues and also modifies sites of O-linked addition of N-acetylglucosamine.
In vitro methods for labeling peptides at sites other than phosphoamino acids include isotope-coded affinity tagging (ICAT), which labels cysteines with biotin derivatives that allow affinity enrichment of the labeled peptides. Although limited to the analysis of phosphopeptides that also contain a cysteine residue, ICAT has been used to study phosphorylation of the epidermal growth factor (EGF) receptor. The iTRAQ method (commercially available from Applied Biosystems, Foster City, USA) involves isotopic labeling of amine groups, allowing uniform labeling of all peptides in a sample. The iTRAQ method has been coupled with phosphotyrosine peptide enrichment and immobilized metal affinity chromatography (IMAC) to study the dynamics of phosphotyrosine-mediated EGF receptor signaling.
The third, and most widely used, method for isotopically labeling peptides involves metabolic labeling of cultured cells with amino acids (usually lysine and arginine) containing heavy isotopes. Stable isotope labeling by amino acids in cell culture (SILAC) is used in conjunction with mass spectrometry for the measurement of changes in protein expression and post-translational modifications (Figure 1). SILAC has been coupled with the use of anti-phosphotyrosine antibodies to study tyrosine phosphorylation following activation of the receptors for EGF [23,24], fibroblast growth factor, and insulin. SILAC has been coupled with IMAC to analyze the signaling involved in the yeast pheromone response and in a targeted way to study the temporal phosphorylation of the β2-adrenergic receptor and the ERK/p90rsk protein kinase signaling pathway.
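To illustrate how metabolic labeling separates the two versions of a peptide in the mass spectrometer, the sketch below computes the expected mass and m/z offsets between a light peptide and its heavy counterpart. It assumes one common labeling choice, 13C6-lysine and 13C6,15N4-arginine; the text does not specify which isotopes the cited studies used, and the peptide sequence is hypothetical.

```python
# Monoisotopic mass differences in daltons between heavy and light residues.
# Assumption: 13C6-lysine (+6.020129 Da) and 13C6,15N4-arginine (+10.008269 Da).
HEAVY_RESIDUE_SHIFT = {"K": 6.020129, "R": 10.008269}


def silac_mass_shift(peptide: str) -> float:
    """Total mass difference between the heavy and light forms of a peptide."""
    return sum(HEAVY_RESIDUE_SHIFT.get(residue, 0.0) for residue in peptide)


def silac_mz_shift(peptide: str, charge: int) -> float:
    """Spacing between the light and heavy peaks on the m/z axis for a given charge state."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return silac_mass_shift(peptide) / charge


# Hypothetical tryptic peptide ending in lysine, observed as a doubly charged ion.
peptide = "ELVISK"
print(silac_mass_shift(peptide))   # mass offset in Da
print(silac_mz_shift(peptide, 2))  # offset on the m/z axis
```

Because trypsin cleaves after lysine and arginine, labeling these two residues means that nearly every tryptic peptide carries at least one label, which is one reason they are the usual choice.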
Much of the current literature on phosphoproteomics is devoted to methods development. Technological advancements have expanded the capabilities of phosphoproteomics, but its ultimate impact depends on the biological meaning that can be generated from phosphoproteomic data. Even though its potential is just now being realized, a number of phosphoproteomic studies have already provided important new insights into cellular signaling.
A major application of phosphoproteomics has been the discovery of phosphoproteins not previously known to be involved in cellular signaling and the discovery of new phosphorylation sites in known signaling proteins. As enzymatic catalysis by protein kinases is the only known physiological mechanism for phosphorylation of serine, threonine, and tyrosine residues, and as most kinases are involved in cell signaling, any phosphorylated protein is potentially involved in signal transduction. Thus, there have been a number of large-scale screens for phosphoproteins and phosphorylation sites. Studies involving the enrichment of phosphopeptides followed by the identification of phosphorylation sites by tandem mass spectrometry have been carried out with yeast, human sperm, human T cells, murine B-lymphoma cells, human Jurkat cells, murine 3T3 cells, and human cancer cell lines. This type of study provides four kinds of new information: a list of known phosphoproteins that are expressed in a particular cell type; the identification of novel phosphorylation sites in previously known phosphoproteins; details of the phosphorylation of known proteins that have not previously been shown to be phosphorylated; and the identification of the phosphorylation of completely novel proteins.
Large-scale phosphoprotein and phosphopeptide screens like those described above [29-33] have been very effective in identifying large numbers of phosphoproteins and phosphorylation sites. The utility of this information is, however, often limited by a lack of biological context, especially for phosphoproteins with no known function. A targeted approach to examining phosphoproteins in specific subcellular organelles helps to overcome this limitation. Examples include phosphoproteomic analysis of HeLa cell nuclear proteins, murine synaptosomes, murine postsynaptic densities (the area of a postsynaptic membrane where neurotransmitter receptors and ion channels are clustered), and Arabidopsis plasma membranes. The underlying assumption in these targeted experiments is that the phosphorylation events identified play a role in signaling pathways that regulate the function of that organelle. So far, this assumption seems to hold quite well, as many of the phosphorylation sites that were identified were in proteins known to be associated with those organelles. Other approaches have targeted cells in defined states, including the identification of phosphorylation sites in proteins from capacitated human sperm and phosphotyrosine proteins from activated T cells or chronic myelogenous leukemia cells treated with Gleevec, an inhibitor of the BCR-Abl protein kinase.
Additional functional insights derived from large-scale phosphoproteomic experiments are estimates of the contributions of different protein kinases to overall protein phosphorylation. Phosphopeptide sequences identified in phosphoproteomic screens have been analyzed with bioinformatic software tools that predict consensus phosphorylation sites for a variety of protein kinases (Table 3). Analysis of the large dataset (2,002 phosphorylation sites) from growing HeLa cell nuclear proteins with the Scansite program showed that kinases that target serine and threonine residues followed by proline residues (for example, cyclin-dependent kinases, Cdks, and mitogen-activated protein (MAP) kinases) and acidophilic kinases, such as casein kinases, accounted for 77% of the total sites. There were no tyrosine phosphorylation sites detected in this study, consistent with the low levels of this modification (<1%) thought to be present in most cells. A relative paucity of tyrosine phosphorylation sites was also observed in phosphoproteomic analyses of the WEHI-231 B-lymphoma cell line and Arabidopsis plasma membranes. Bioinformatic analysis of 289 synaptosomal phosphorylation sites with Scansite and NetPhosK led to the conclusion that a small number of protein kinases phosphorylate many synaptic proteins and that each synaptic phosphoprotein is phosphorylated by many kinases. This analysis allowed the construction of a kinase-substrate network map that could be superimposed onto the protein-protein interaction network of the N-methyl D-aspartate (NMDA) receptor complex (one of the receptors for the neurotransmitter glutamate). This network model predicts an important role for proximity and scaffolding proteins in mediating signaling through the complex.
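The idea of sorting phosphorylation sites by kinase class can be illustrated with plain pattern matching. The sketch below is a deliberate simplification and is not how Scansite or NetPhosK score sites: it assumes that a proline-directed site is a serine or threonine immediately followed by proline, and that an acidophilic site is a serine or threonine with aspartate or glutamate three residues downstream (a casein kinase II-like pattern). The sequences and function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def classify_site(sequence: str, index: int) -> str:
    """Assign a phosphorylated residue to a coarse kinase class by simple motif rules."""
    if not 0 <= index < len(sequence):
        raise IndexError("site index lies outside the sequence")
    if sequence[index] not in ("S", "T"):
        return "other"
    # Residues at positions +1 and +3 relative to the site, if present.
    plus1 = sequence[index + 1] if index + 1 < len(sequence) else ""
    plus3 = sequence[index + 3] if index + 3 < len(sequence) else ""
    if plus1 == "P":
        return "proline-directed"
    if plus3 in ("D", "E"):
        return "acidophilic"
    return "other"


def fraction_by_class(sites: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Fraction of sites falling into each class."""
    counts = Counter(classify_site(seq, idx) for seq, idx in sites)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical sequence windows, each with the 0-based index of the phosphorylated residue.
example_sites = [("AASPKK", 2), ("AASAAEA", 2), ("AASAAA", 2), ("AATPAA", 2)]
print(fraction_by_class(example_sites))
```

Summing the proline-directed and acidophilic fractions over a real dataset would give a figure comparable in spirit to the 77% reported for the HeLa nuclear sites, although the published analysis relied on Scansite's own scoring and not on these rules.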
The advent of phosphoproteomic methods for the large-scale identification of phosphorylation sites has generated a critical need for easy and effective ways to disseminate information derived from phosphoproteomic studies to the signal transduction research community. Datasets obtained in large-scale screens are likely to include information about phosphorylation sites in proteins that are the main focus of individual laboratories. Without easy ways to search the large datasets, it is difficult for such laboratories to take advantage of the information. Fortunately, there are several ongoing efforts to incorporate phosphorylation-site information derived from large-scale phosphoproteomics experiments into searchable databases. These efforts include the Phosphosite database, the Swiss-Prot database [42,43], and the Phospho.ELM database (see Table 3).
The combination of quantitative mass spectrometry and phosphoproteomics has generated powerful technologies for studying cellular signaling. The phosphoproteomic screening approaches described here have provided new insights into the complexity of protein phosphorylation. The relevance of individual phosphorylation sites detected in these studies to particular signaling pathways is, however, often unknown. The ability to monitor changes in phosphorylation that occur in response to the perturbation of a signaling pathway allows identification of phosphoproteins relevant to that pathway. Phosphospecific antibodies are very sensitive and useful probes for analyzing changes in phosphorylation of specific sites. They do, however, require prior knowledge of the phosphorylation site and it is challenging to monitor large numbers of phosphorylation sites simultaneously. Quantitative phosphoproteomics has a sensitivity that approaches that of phosphospecific antibodies but can be used to identify novel phosphoproteins and to monitor hundreds or thousands of individual phosphorylation sites in a single experiment.
Quantitative proteomics methods have been used in a targeted way to monitor phosphorylation of individual proteins. A combination of SILAC and IMAC was used to analyze agonist-induced phosphorylation of the β2-adrenergic receptor. The simultaneous monitoring of multiple sites allowed identification of the relevant in vivo phosphorylation sites and the discovery that different agonists (for example, isoproterenol and dopamine) induce differential phosphorylation of individual sites. SILAC was also used to monitor in vivo the kinetics of EGF-induced phosphorylation of six phosphotyrosine residues in the EGF receptor. The results showed that the kinetics of phosphorylation of the tyrosine residues correlated with the preferential association of the receptor with individual binding partners, such as growth factor receptor bound protein 2 (Grb2) and Src homology 2 domain-containing transforming protein (Shc).
A second major application of quantitative phosphoproteomics has been in studying the dynamics of phosphorylation and the assembly of signaling complexes. A combination of SILAC and anti-phosphotyrosine immunoprecipitation was used to examine phosphotyrosine-dependent signaling networks induced by EGF stimulation of HeLa cells. Of the 202 proteins detected, which were either phosphotyrosine proteins or proteins that co-precipitated with phosphotyrosine proteins, the levels of 81 were elevated by 1.5-fold or more following EGF stimulation. In addition to monitoring the activation of tyrosine phosphorylation, these experiments detected and quantitated proteins that associate with phosphotyrosine proteins through Src homology 2 (SH2) domains and other binding motifs. For example, temporal changes in the phosphorylation of the EGF receptor correlated with the co-precipitation of proteins known to interact with it, such as Grb2 and Shc. While nearly all of the proteins known to be associated with EGF receptor signaling were identified in these experiments, many additional proteins that were not previously known to be associated with this pathway were also identified. For example, the time-dependent recruitment of a set of RNA-binding proteins suggested a novel role for EGF receptor signaling in mRNA processing and transport. Six novel EGF-dependent proteins with no known function were also identified in these experiments. Quantitative mass spectrometry was used to compare the time courses of their association with the anti-phosphotyrosine complex with the time course of EGF receptor activation; this comparison allowed the assignment of functions for these proteins in early, membrane-proximal events or in later events such as cytoskeletal reorganization or endosomal trafficking.
This report provides the first example of how quantitative phosphoproteomics can yield unprecedented amounts of information about cellular signaling from a single set of experiments.
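The two analytical steps in the EGF study, selecting proteins whose levels rose by 1.5-fold or more and comparing time courses to separate early from late events, can be sketched as follows. Only the 1.5-fold threshold comes from the text; the protein names, ratios, and time points (in minutes) are hypothetical, and the peak-time rule is an assumed simplification of the time-course comparison.

```python
from typing import Dict, List


def elevated_proteins(ratios: Dict[str, float], threshold: float = 1.5) -> List[str]:
    """Names of proteins whose stimulated/unstimulated ratio meets the threshold."""
    return sorted(name for name, ratio in ratios.items() if ratio >= threshold)


def time_of_peak(profile: Dict[float, float]) -> float:
    """Time point at which a protein's ratio is highest (first one wins on ties)."""
    return max(profile, key=profile.get)


# Hypothetical ratios after stimulation.
ratios = {"protein_A": 3.2, "protein_B": 1.1, "protein_C": 1.5}
print(elevated_proteins(ratios))

# Hypothetical time courses: minutes after stimulation -> ratio.
early_profile = {0: 1.0, 1: 4.0, 5: 2.5, 10: 1.5, 20: 1.2}
late_profile = {0: 1.0, 1: 1.1, 5: 1.8, 10: 2.9, 20: 3.5}
print(time_of_peak(early_profile), time_of_peak(late_profile))
```

A protein whose profile peaks together with receptor activation would be a candidate for a membrane-proximal role, whereas one that peaks later would fit the later events mentioned in the text.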
A second example of the application of quantitative phosphoproteomics to the analysis of the dynamics of signaling complexes involves the ERK/p90rsk protein kinase signaling cassette. This study utilized immunoprecipitation of epitope-tagged versions of extracellular signal-regulated protein kinase (ERK), p90 ribosomal S6 kinase (p90rsk), and their targets, the tumor suppressors TSC1 and TSC2, to profile phosphorylation of multiple sites on these proteins simultaneously following either EGF stimulation or treatment with the protein kinase C (PKC) activator phorbol myristate acetate (PMA). New results from this study include the discovery of a novel phosphorylation site in p90rsk and eight novel phosphorylation sites in TSC1 and TSC2. Selective kinase inhibitors were used to show that phosphorylation of one of the novel sites in TSC2 was dependent on the activation of PKC but independent of ERK. The results demonstrated the existence of a previously uncharacterized PKC-dependent pathway for the regulation of TSC2.
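The inhibitor logic used to assign a site to a pathway can be written down as a simple rule: a site depends on a kinase if stimulation induces its phosphorylation and the inhibitor of that kinase removes the induction. The sketch below assumes a 1.5-fold cutoff for "induced", which is an illustrative choice and not a value reported for this study; all ratios are hypothetical.

```python
def depends_on(ratio_stimulated: float, ratio_with_inhibitor: float, min_fold: float = 1.5) -> bool:
    """True if stimulation induces the site and the inhibitor abolishes the induction.

    Ratios are stimulated/unstimulated phosphopeptide levels. The min_fold
    cutoff is an assumption made for illustration.
    """
    induced = ratio_stimulated >= min_fold
    blocked = ratio_with_inhibitor < min_fold
    return induced and blocked


# Hypothetical measurements for one site after PMA treatment.
pma_alone = 4.0
pma_plus_pkc_inhibitor = 1.0
pma_plus_erk_pathway_inhibitor = 3.8

print("PKC-dependent:", depends_on(pma_alone, pma_plus_pkc_inhibitor))
print("ERK-dependent:", depends_on(pma_alone, pma_plus_erk_pathway_inhibitor))
```

With these made-up numbers the site would be scored as PKC-dependent and ERK-independent, the same pattern the text describes for one of the novel TSC2 sites.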
Quantitative phosphoproteomic methods have also been used for a large-scale analysis of yeast phosphoproteins altered in response to activation of the mating pheromone-response pathway. SILAC labeling was coupled with phosphopeptide enrichment to identify 729 phosphorylation sites in 503 proteins. Of these, 139 sites were altered following pheromone treatment. Large numbers of phosphopeptides were derived from proteins known to be involved in the pheromone signaling pathway, including the pheromone receptor, components of the MAP kinase pathway, transcription factors, proteins involved in cell polarization, and proteins that participate in the assembly of mating projections. The identification of a set of pheromone-regulated phosphorylations on RNA-processing and -transport proteins suggested that the pheromone pathway has a previously unknown role in regulating mRNA metabolism. Overall, this study provided an unprecedented comprehensive dataset that quantifies pheromone-dependent changes in the phosphorylation of large numbers of individual phosphorylation sites. This included the identification of many proteins not previously known to be phosphorylated, as well as the identification of novel phosphorylation sites present in previously characterized phosphoproteins. The methods used are applicable to the study of cellular signaling in a wide variety of cell types and signaling paradigms. The large number of individual phosphorylation events that can be quantitatively interrogated using this approach will provide a powerful method for analyzing signaling networks at the systems biology level.
In conclusion, mass-spectrometry-based phosphoproteomics has already made significant contributions to our understanding of cellular signaling. The incorporation of technological advances in mass spectrometry and the application of novel protein and peptide enrichment techniques will increase the sensitivity and accuracy of detecting phosphoproteins and identifying phosphorylation sites. Continued application of quantitative methods will enhance the usefulness of data derived from phosphoproteomic experiments. Owing to the importance of reversible phosphorylation as a signal transduction mechanism, phosphoproteomics is likely to play an increasingly valuable role in the study of cellular signaling. This will be especially true as research in cellular signaling begins to grapple with understanding context-dependent signaling in living cells. Large-scale phosphoproteomic methods have the potential to monitor information flow through large portions of signaling networks that ultimately control the overall response of a cell to changes in its environment.