More than 300 different types of protein post-translational modifications (PTMs) have been described, many of which are known to have pivotal roles in cellular physiology and disease. Nevertheless, only a handful of PTMs have been extensively investigated at the proteome level. Knowledge of protein substrates and their PTM sites is key to dissection of PTM-mediated cellular processes. The past several years have seen tremendous progress in developing MS-based proteomics technologies for global PTM analysis, including numerous studies of yeast and other microbes. Modification-specific enrichment techniques combined with advanced MS/MS methods and computational data analysis have revealed a surprisingly large extent of PTMs in proteins, including multi-site, cooperative modifications in individual proteins. We review some of the current strategies employed for enrichment and detection of PTMs in modification-specific proteomics.
Post-translational modification (PTM) represents an important mechanism for diversifying and regulating the cellular proteome. In this review, PTM refers to a chemical event that converts a ribosomally coded amino acid residue into a non-standard amino acid residue by an enzymatic reaction. The identification of protein substrates and their PTM sites is fundamental to the biochemical dissection of PTM pathways (for example, the identification of enzymes that catalyse PTM), to studies of the role of PTM in the function of substrate proteins, to the establishment of substrate–enzyme(s) relationships, and to providing insights into the possible regulation of cellular physiology by PTM.
Protein lysine acetylation provides a good example of the critical role of substrate identification in the functional characterization of a PTM-mediated pathway. Lysine acetylation was initially identified in histones in the 1960s. Demonstration of its first non-histone substrate protein, p53, in 1997 stimulated extensive studies of the roles of lysine acetylation in transcriptional regulation. Identification of diverse substrates in both cytosolic and mitochondrial fractions opened new avenues for functional studies of lysine acetylation in energy metabolism, signal transduction, and mitochondrial regulation.
Conventionally, PTM substrates have been identified by laborious biochemical approaches, including in vitro PTM reaction assays using radioactive isotope-labeled substrates, Western blot analysis, and more recently peptide and protein arrays [4, 5] (Table 1). While useful, these methods suffer from various shortcomings. For example, the radioisotopes of carbon and hydrogen (14C or 3H, as used in the case of protein methylation and acetylation) are rather weak emitters, which makes it difficult to detect the corresponding modified proteins efficiently. Antibody-based Western blot analysis has been successful for identifying candidate substrate proteins for certain types of PTM, such as tyrosine phosphorylation. However, the small size of the structural motifs of other common PTMs (for example, protein methylation and acetylation) makes it difficult to generate pan-specific antibodies, which recognize PTM peptides/proteins independently of their surrounding sequences, with good affinity for routine Western blotting.
Another valid approach for identifying protein substrates is based on the specificity of PTM-specific enzymes. For example, in vitro screens have been carried out using peptide or protein arrays to identify sequence motifs for a protein lysine methyltransferase and for protein kinases. Nevertheless, PTM substrate candidates identified by these approaches require further validation by MS analysis of the purified endogenous proteins. In summary, despite technical advances in the past few decades, more efficient and sensitive bioanalytical technologies are needed to address key bottlenecks in identifying PTM substrate proteins, in mapping PTM sites, and in investigating in vivo PTM dynamics.
During the past decade, MS-based proteomics has been shown to be a powerful technique for proteome-wide identification of PTM substrates and mapping of PTM sites. Such studies typically involve four steps (Fig. 1). First, the protein lysate of interest is proteolytically digested, usually by a specific protease, such as trypsin. Second, the resulting proteolytic peptides are subjected to enrichment, using a suitable method, to separate the PTM peptides of interest from the rest of the proteolytic peptides. Third, the isolated PTM peptides are analyzed by nano-HPLC/MS/MS for peptide identification and precise localization of PTM sites. Finally, the peptide candidates are further evaluated by a manual or an automated verification method to ensure the accuracy and statistical significance of the identification. In addition, a separation step can be included in the procedure to separate either proteins (before the proteolytic digestion) or peptides (after the proteolytic digestion) into multiple fractions to reduce sample complexity.
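The first step of this workflow is often mirrored computationally when building the list of peptides a search is expected to find. The following Python sketch illustrates in silico tryptic digestion. It assumes the commonly used approximation of trypsin specificity (cleavage C-terminal to Lys or Arg, but not when the next residue is Pro); the function name `tryptic_digest` and the `missed_cleavages` parameter are hypothetical names chosen for this illustration and are not taken from any particular software package.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from an in silico tryptic digest of `sequence`.

    Assumed rule: cleave after K or R unless the next residue is P.
    Peptides spanning up to `missed_cleavages` uncleaved sites are included.
    """
    # Pass 1: cut the sequence into fully cleaved fragments.
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder

    # Pass 2: join neighbouring fragments to model missed cleavages.
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


if __name__ == "__main__":
    print(tryptic_digest("AKRPGKSTR"))
    print(tryptic_digest("AKRPGKSTR", missed_cleavages=1))
```

Allowing missed cleavages matters in practice for PTM work, because a modified lysine (for example, an acetylated one) is generally not expected to be cleaved by trypsin, so the modified peptide may span what would otherwise be a cleavage site.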
High sensitivity is desirable in PTM proteomics to detect substrate proteins that exist in low abundance in cells. The detection sensitivity of a PTM proteomics screen depends on four factors: (i) yield of affinity enrichment, (ii) level of contamination from irrelevant peptides, (iii) sensitivity of the HPLC/MS/MS system, and (iv) complexity of the peptide mixture. The PTM peptides are present in an ocean of non-PTM peptides and may be present in low stoichiometry. Accordingly, without enrichment, mass spectrometric analysis detects PTM peptides inefficiently. Despite advances in the sensitivity of HPLC/MS systems and the development of more powerful algorithms for protein sequence database searching, the lack of efficient procedures for enrichment of PTM peptides has become a major bottleneck for PTM proteomic research.
Here, we review existing MS-based proteomics strategies for global PTM analysis, with a focus on enrichment methods for PTM peptides. We also discuss future challenges for comprehensive PTM analysis. Readers interested in general information about PTMs, mapping PTM sites in proteins and PTM quantification by MS are referred to several recent review articles [8–13].
Before PTM peptides are enriched, the protein lysate of interest is typically prepared from cultured cells and/or tissues, and subsequently proteolytically digested. In some cases, cellular organelles and/or a protein complex are first isolated and then subjected to proteolytic digestion and PTM analysis. Key issues in this step include preventing artificial PTM reactions in vitro, increasing detection sensitivity by reducing the complexity of the protein sample, and preparing a protein sample that is biologically relevant.
To reduce the sample complexity, proteins can be resolved into fractions using various methods, including organelle separation, and fractionation of proteins and proteolytic peptides by electrophoretic techniques or multi-dimensional chromatography. A combination of these fractionation methods can further simplify peptide mixtures. Protein or peptide fractionation can not only reduce the complexity, but can also enhance the yield of affinity enrichment by raising the peptide concentration in the peptide mixture and reducing competitive inhibition (for example, in the case of antibody affinity purification).
A protein can be located in multiple cellular organelles and protein complexes; it is therefore highly likely that the protein will have a different spatio-temporal PTM profile and stoichiometry depending on the molecular and cellular context. A well-established example is the core histone proteins, whose PTMs differ between heterochromatin and euchromatin. To gain biologically relevant PTM information, the protein of interest should be isolated from a biologically relevant cellular environment. For example, the EGF receptor can be located in either the plasma membrane or nuclei. The protein should therefore be independently isolated from both cellular locations for PTM analysis.
Antibodies are widely used for the detection of PTMs in a protein by Western blot analysis. Immunoisolation of antigenic peptides using an antibody was combined with MS for identification of epitope peptides more than a decade ago. An obvious extension is to use pan-PTM antibodies to isolate peptides bearing the PTM of interest. Isolation of PTM peptides from tryptic digests by immunoaffinity purification using a pan-PTM antibody should be simpler than isolating the corresponding proteins, because the PTM of interest is always exposed, and less non-specific binding is likely to occur for peptides than for proteins. This approach has been successfully used for global analysis of protein lysine acetylation [3, 16], arginine methylation, tyrosine nitration, and tyrosine phosphorylation [19, 20]. In addition to pan-specific antibodies, antibodies specific to PTM motifs in peptides have been used to identify kinase substrates and quantify their PTM changes. The same approach should be applicable for identifying novel substrates where a consensus PTM motif exists for a PTM regulatory enzyme of interest.
However, high-quality antibodies are not always available for PTMs of interest. For example, because acetyllysine (AcLys) and methyllysine are small, it is not easy to generate high-affinity antibodies against them, and, not surprisingly, the available antibodies have low binding affinity. Nevertheless, the success of proteomic analysis of lysine acetylation and arginine methylation suggests that this approach is likely to be applicable to most, if not all, PTMs, provided that the PTM moiety of interest is antigenic and has a reasonable size that is larger than a methyl or acetyl group.
A key issue for antibody-based affinity enrichment is quality control. In Western blot analysis, the signals detected by the antibody should be competitively abolished using a peptide library bearing a fixed PTM residue, or by a protein, such as BSA containing the PTM residues (generated through an in vitro chemical reaction). Those antibodies that can detect PTM-specific signals in a Western blot are likely to be suitable for affinity enrichment for proteomics screening.
It is not usually difficult to develop an antibody or to use a chemical/metabolic labeling method to isolate peptides bearing a PTM that has a large chemical moiety (see below). However, it can be a challenge to isolate PTM peptides in which the PTM induces a small change (for example, dehydration or monomethylation). Additionally, it can be difficult to generate antibodies against certain poorly antigenic PTMs. Nevertheless, new technologies are emerging for the generation of non-antibody affinity reagents that can be selected from randomized oligomer libraries, in which combinatorial binding sites are associated with a non-antibody scaffold (for example, methyllysine-binding domain) [22, 23]. Success of such technologies is likely to have a significant impact on PTM detection and proteomics studies.
A range of chemical methods has been developed to tag PTMs, including in vitro chemical reactions and in vivo metabolic labeling. Azide, due to its small size and bioorthogonal nature, has been used for metabolic labeling of PTMs. The resulting chemically labeled PTM proteins can be subsequently conjugated to an affinity linker, such as biotin [24, 25]. This method was used successfully for the identification of protein farnesylation, O-GlcNAc modifications [24, 27], palmitoylation and myristoylation.
A PTM can also be converted, in vitro, into a tractable site for affinity labeling for the purpose of affinity enrichment. For example, β-elimination of O-phosphorylated residues (pSer and pThr) or O-GlcNAcylated Ser/Thr residues permits the introduction of a chemical with an affinity tag (for example, biotin or a fluorous affinity tag). The introduced affinity tag enables the enrichment of PTM peptides of interest. Likewise, nitrotyrosine and S-nitrosylation modified residues have been derivatized with a chemical tag containing biotin for detection and affinity isolation [33, 34]. Similarly, chemical approaches to convert the S-palmitoyl group into a tractable tag for affinity enrichment and subsequent HPLC/MS/MS analysis were demonstrated, and revealed numerous palmitoylated proteins in yeast and in rat neuronal synapses.
Complex glycosylated proteins have been isolated by affinity enrichment methods using either lectin or chemical derivatization. Many proteomics studies of glycoproteins based on lectin affinity enrichment have been reported. In particular, combinations of different types of lectins seem promising for comprehensive analysis of the N-glycosylated proteome [38–41]. A major caveat of lectin-based enrichment is the low and variable binding affinity between lectins and glycoproteins. Recently, hydrophilic interaction liquid chromatography (HILIC) based methods for isolation of glycopeptides have emerged and proved their utility when combined with MS/MS.
Chemical derivatization involving oxidation of the carbohydrate side chain and conjugation of glycopeptides to hydrazide resin provides an alternative approach to enrich glycosylated peptides (Fig. 2). The isolated glycopeptides can be subsequently released by treatment with the glycan-specific enzyme PNGase F and then identified by MS [43, 44]. This approach has been used to identify and quantify glycoproteins that are associated with the plasma membrane, tissues and bodily fluids [44, 45].
While useful, chemical methods for introducing an affinity tag should be used with care (Table 2). Sample loss can be a problem as chemical derivatization reactions can be inefficient, and they often produce unwanted side products. Accordingly, a procedure involving multiple reactions is not desirable, unless all the chemical reaction steps are highly controlled and sample losses are minimized.
In addition to the attachment of small chemical moieties, PTMs by conjugation of a polypeptide, such as ubiquitin or ubiquitin-like polypeptides, onto proteins have also been described. Global studies of these kinds of PTMs have been performed in various species, including yeast. Typically, an epitope tag, such as a histidine tag or HA tag, is introduced into the PTM moiety (for example, at the N-terminus of ubiquitin) to facilitate affinity purification using nickel beads (for histidine tag) or an anti-epitope antibody [46–50]. A single-step purification generally yields a high proportion of non-specifically bound proteins (for example, proteins with multiple histidine residues in a short peptide sequence) along with conjugated proteins. This problem can be significantly alleviated by tandem purification using two affinity tags (for example, histidine tag and FLAG tag) for sequential enrichment.
The most successful analytical strategies for enriching phosphopeptides take advantage of the unique chemical characteristics of the phosphate group; that is, its negative charge and ability to interact with ion exchange beads and to participate in coordinate covalent bonding with immobilized metal ions. IMAC using Fe(III) was initially used to isolate phosphopeptides from a protein of interest for mapping phosphorylation sites. It was then extended to isolate phosphopeptides from a protein mixture. The method was used in the first few successful phosphoproteomics studies [52, 53]. The IMAC procedure has been continuously improved [54–56] and can be combined with other separation technologies. For example, peptide separation prior to IMAC has the advantage of improving enrichment efficiency, specificity and dynamic range. Accordingly, both strong anion exchange chromatography and strong cation exchange chromatography (SCX) have been used to prefractionate protein lysates prior to IMAC for phosphoproteomics studies [57–59]. Phosphoproteomic analysis of yeast (S. cerevisiae, S. pombe) under various conditions has been highly successful by using such methods [60–62] (Table 2).
In addition to IMAC beads, a titanium dioxide (TiO2)-based solid matrix has proven highly efficient and specific for enriching phosphopeptides [63–65]. As demonstrated by several recent phosphoproteomics studies [65, 66], TiO2 has been easier to implement and has proved to be more robust than IMAC for the analysis of complex protein samples.
Other methods have been used for enriching phosphopeptides. SCX, which takes advantage of charge-state differences between tryptic phosphopeptides and tryptic unphosphorylated peptides at low pH, was used to enrich phosphopeptides. Other metal oxides, such as those of zirconium and aluminum, have also been explored for enriching phosphopeptides [69–71], but have not yet been widely used. Interestingly, the combined use of IMAC and TiO2 (called SIMAC) seems able to separate mono-phosphorylated peptides from multi-phosphorylated counterparts, thereby reducing sample complexity and leading to higher sensitivity. These two enrichment techniques were also shown to be complementary in a study of the S. pombe phosphoproteome. Combinations of either IMAC or TiO2 with other complementary fractionation techniques, such as HILIC and peptide IEF [74–76], may further improve our ability to pursue comprehensive phosphoproteome analysis in the future.
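The charge-state argument behind SCX-based enrichment can be made concrete with a back-of-the-envelope calculation. The Python sketch below rests on simplifying assumptions that are not spelled out in the text: at a low pH (around 2.7), carboxyl groups are taken as protonated and therefore neutral, the N-terminal amine and each Lys, Arg, or His side chain contribute +1, and each phosphate group contributes −1. The function name `approx_charge_at_low_ph` is hypothetical, and the result is only a rough estimate, not a pKa-based calculation.

```python
def approx_charge_at_low_ph(peptide, n_phospho=0):
    """Rough net solution charge of a peptide near pH 2.7.

    Assumptions (illustrative only): acidic side chains and the C-terminus
    are neutral, the N-terminus and each K/R/H carry +1, and each phosphate
    group carries -1.
    """
    basic_residues = sum(peptide.count(residue) for residue in "KRH")
    return 1 + basic_residues - n_phospho  # the leading 1 is the free N-terminus


if __name__ == "__main__":
    peptide = "SAMPLEK"  # made-up tryptic peptide ending in Lys
    print(approx_charge_at_low_ph(peptide))               # unmodified form
    print(approx_charge_at_low_ph(peptide, n_phospho=1))  # singly phosphorylated form
```

Under these assumptions, a typical tryptic peptide with a single C-terminal Lys or Arg carries a net charge of about +2, while its singly phosphorylated form carries about +1, so the phosphopeptide is expected to be retained less strongly on an SCX column.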
Modification-specific enzymes are attractive reagents for analysis of PTM substrates. This is nicely demonstrated by a modification-specific workflow for proteome-wide analysis of glycosylphosphatidylinositol-anchored proteins (GPI-APs) (Fig. 2). GPI-APs represent a special class of glycoproteins that are located at the extracellular surface of the plasma membrane. Modification-specific analysis of GPI-APs is facilitated by the availability of phospholipases that are highly specific for cleavage of the phosphatidylinositol linkage that tethers the protein to the cell surface. Using this method, GPI-APs from human and plant plasma membrane fractions were specifically released for subsequent proteomic analysis by LC-MS/MS [77, 78].
A chemo-enzymatic method was used to tag O-GlcNAc-modified proteins. This method uses an engineered galactosyltransferase enzyme to selectively label O-GlcNAc-modified proteins with a ketone-containing galactose analog, followed by a conjugation reaction with an aminooxy biotin compound, which attaches the biotin tag.
An MS/MS dataset generated from HPLC/MS/MS analysis of the enriched PTM peptides is subjected to protein sequence database searching to identify PTM peptide sequences and PTM sites, and to quantification. High sensitivity of an HPLC/MS system and a long HPLC elution gradient will facilitate identification of more PTM peptides. In addition, accurate identification and quantification of PTMs are of paramount importance.
Despite previous efforts to improve the accuracy of protein sequence database searches, the false identification of PTM peptides and their sites continues to be a problem, especially in the context of large-scale PTM analysis, or when multiple PTMs are included in the protein sequence database search. Given that datasets of PTM substrates and their PTM sites are likely to be widely used by the biomedical research community, it is highly important to ensure high MS data quality for accurate annotation of PTM sites.
Toward this goal, PTM sites should be precisely mapped and all the major peaks in MS/MS spectra of PTM peptides should be assigned. For some types of PTM, exact location of a PTM site is critical. For example, protein methylation (identified by a mass shift of +14 Da) can be present on several amino acid residues (for example, Arg, Lys, Asp, Glu, Asn, Cys, His, and Ser). Similar challenges exist for protein acetylation (at Lys, Ser, Thr, and Tyr) and for protein phosphorylation (at Ser, Thr, Tyr, and His). The existing protein database search algorithms often have difficulty in distinguishing those PTM isomers that have adjacent PTM sites. Although laborious and not often practical, manual verification is always useful to precisely locate the PTM sites with high confidence. Accordingly, for biology researchers who use the dataset, it is critical to carefully examine the MS/MS data of interest, by using objective scoring and evaluation criteria and by working closely with a mass spectrometrist to ensure the accuracy of PTM identification before investing significant effort in functional studies. The fact that some PTMs, such as pSer, pThr, and O-GlcNAc, are labile and may (partially) decompose during LC-MS/MS analysis further complicates the situation and challenges data interpretation.
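The localization problem described above can be illustrated by enumerating the positional isomers that a search algorithm has to distinguish. The Python sketch below uses the candidate residues listed in this section together with standard monoisotopic mass shifts (about 14.016 Da for methylation, 42.011 Da for acetylation, and 79.966 Da for phosphorylation). The names `candidate_site_assignments` and `total_mass_shift` are hypothetical and serve only this example; it does not score spectra or reproduce any specific search engine.

```python
from itertools import combinations

# Standard monoisotopic mass shifts in Da.
MASS_SHIFT = {"methyl": 14.01565, "acetyl": 42.01057, "phospho": 79.96633}

# Candidate residues per PTM, as listed in the text above.
TARGET_RESIDUES = {"methyl": "RKDENCHS", "acetyl": "KSTY", "phospho": "STYH"}


def candidate_site_assignments(peptide, ptm, n_sites=1):
    """List every way to place `n_sites` copies of `ptm` on `peptide`.

    Positions are 1-based. All returned assignments give the same precursor
    mass, so only fragment ions can tell them apart.
    """
    positions = [i + 1 for i, residue in enumerate(peptide)
                 if residue in TARGET_RESIDUES[ptm]]
    return list(combinations(positions, n_sites))


def total_mass_shift(ptm, n_sites=1):
    """Precursor mass increase in Da for `n_sites` copies of `ptm`."""
    return n_sites * MASS_SHIFT[ptm]


if __name__ == "__main__":
    peptide = "ASTPYK"  # made-up peptide for illustration
    print(candidate_site_assignments(peptide, "phospho"))
    print(candidate_site_assignments(peptide, "phospho", n_sites=2))
    print(round(total_mass_shift("phospho", 2), 5))
```

For this made-up peptide, a single phosphate has three possible positions (Ser2, Thr3, Tyr5), and the Ser2 and Thr3 isomers differ only in fragment ions arising from cleavage between those two adjacent residues, which is why adjacent sites are particularly hard to resolve.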
Dynamic studies of a PTM can provide insights into the role of the PTM in a biological process, and can identify substrates of a PTM regulatory enzyme. Integration of enrichment methods for PTM peptides with quantitative MS provides the means for determining changes of a PTM on a global scale. Quantitative PTM proteomics can be achieved using methods with or without stable isotope labels (reviewed in [13, 80]). Label-free quantification is primarily based on the intensities of the MS signals generated from HPLC/MS analysis. Labeling of peptides with stable isotopes can be achieved either in vitro or in vivo. Isotopic labeling methods offer the advantage of higher accuracy, while label-free quantification techniques enjoy more flexibility in sample source (for example, multiple samples from cultured cells or animal tissues).
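As a minimal illustration of intensity-based, label-free comparison, the Python sketch below computes log2 fold changes for PTM peptides detected in two conditions. The input format (a dictionary mapping a peptide identifier to a summed MS signal intensity), the function name `log2_fold_changes`, and the `min_intensity` cut-off are assumptions made for this example; real workflows would additionally need normalization between runs, replicate handling, and statistical testing, none of which is shown here.

```python
import math


def log2_fold_changes(control, treated, min_intensity=1.0):
    """Compare MS signal intensities of PTM peptides between two conditions.

    `control` and `treated` map a peptide identifier to its intensity.
    Only peptides quantified in both conditions, with intensities at or above
    `min_intensity`, are reported.
    """
    changes = {}
    for peptide in control.keys() & treated.keys():
        c, t = control[peptide], treated[peptide]
        if c >= min_intensity and t >= min_intensity:
            changes[peptide] = math.log2(t / c)
    return changes


if __name__ == "__main__":
    # Made-up intensities for two hypothetical phosphopeptides.
    control = {"pepA_pS5": 100.0, "pepB_pT2": 50.0}
    treated = {"pepA_pS5": 400.0, "pepC_pY9": 10.0}
    print(log2_fold_changes(control, treated))
```

Peptides seen in only one condition are dropped here for simplicity, although in a PTM study such on/off events may be among the most interesting and would normally be inspected separately.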
Studies in histones and p53 demonstrate that multiple PTMs can be present in proteins and that PTMs operate in a cooperative fashion. However, the vast majority of cellular proteins have not been carefully examined for the possible existence of multiple PTMs. Previous studies in PTM proteomics have focused on a particular type of PTM. Accordingly, those proteins bearing more than a few different PTMs have not been examined at a global scale. Recent advances in non-restrictive sequence alignment algorithms provide a possible means for identifying all types of PTMs in a protein [81–84]. Identification of PTM peptides with more than one type of PTM in a short proteolytic peptide is likely to reveal important insights into the principles that govern PTM cross-talk, such as synergy or inhibition [85, 86]. Recent advances in MS fragmentation techniques – electron capture dissociation  and electron transfer dissociation  – facilitate efficient fragmentation of large peptides and small protein domains. Thus, it is becoming feasible to study multi-site and cooperative PTMs within small protein domains.
In summary, tremendous progress has been made in the past several years in developing PTM-specific enrichment methods and MS-based proteomics technologies for PTM analysis in prokaryotes and eukaryotes, including yeasts. Yeast, i.e. Saccharomyces cerevisiae, remains one of the favorite model organisms for developing and implementing quantitative proteomics technologies for comprehensive mapping of PTMs. Given the high number of PTMs and high abundance of some of the common ones (for example, phosphorylation, ubiquitination, and lysine acetylation), PTMs likely constitute the most complex and delicate regulatory networks in eukaryotic cells.
The future challenges for PTM proteomics include achieving higher sensitivity and a wider dynamic range of protein abundance for detection of low-abundance PTM events, improving the accuracy of PTM identification and localization, developing robust quantitative methods for studying PTM dynamics in tissues, and developing methods for analysis of all types of PTMs in proteins. In addition, the emerging importance of multisite, cooperative PTM events in proteins necessitates the development of novel strategies for full-spectrum PTM identification not only in proteolytic peptides, but also in regulatory protein domains and intact proteins.
We thank Yoshida Oda, Hui Zhang, and members of the Zhao and Jensen laboratories for critically reading this paper and for helpful suggestions. This work was supported by NIH grants to Y.Z. and by grants from the Lundbeck Foundation and the Danish Research Agency to O.N.J. We apologize to the colleagues whose work could not be cited in this review paper due to space limitations.
The authors have declared no conflict of interest.