Post-translational modification (PTM) represents an important mechanism for diversifying and regulating the cellular proteome. In this review, PTM refers to a chemical event that converts a ribosomally coded amino acid residue into a non-standard amino acid residue by an enzymatic reaction. The identification of protein substrates and their PTM sites is fundamental to the biochemical dissection of PTM pathways (for example, the identification of enzymes that catalyse PTM), to studies of the role of PTM in the function of substrate proteins, to the establishment of substrate–enzyme relationships, and to insights into the possible regulation of cellular physiology by PTM.
Protein lysine acetylation illustrates the critical role of substrate identification in the functional characterization of a PTM-mediated pathway. Lysine acetylation was initially identified in histones in the 1960s [1]. Demonstration of its first non-histone substrate protein, p53, in 1997 [2] stimulated extensive studies of the roles of lysine acetylation in transcriptional regulation. Identification of diverse substrates in both cytosolic and mitochondrial fractions [3] opened new avenues for functional studies of lysine acetylation in energy metabolism, signal transduction, and mitochondrial regulation.
Conventionally, PTM substrates have been identified by laborious biochemical approaches, including in vitro PTM reaction assays using radioactive isotope-labeled substrates, Western blot analysis, and more recently peptide and protein arrays [4, 5] (Table 1). While useful, these methods suffer from various shortcomings. For example, the radioisotopes of carbon and hydrogen (¹⁴C or ³H, as used for protein methylation and acetylation) are rather weak emitters, which makes it difficult to detect the corresponding modified proteins efficiently. Antibody-based Western blot analysis has been successful for identifying candidate substrate proteins for certain types of PTM, such as tyrosine phosphorylation. However, the small size of the structural motifs of other common PTMs (for example, protein methylation and acetylation) makes it difficult to generate pan-specific antibodies, which recognize PTM peptides or proteins independent of their surrounding sequences, with sufficient affinity for routine Western blotting.
| Table 1. Techniques for detection and identification of PTM substrates |
Another valid approach for identifying protein substrates is based on the specificity of PTM-specific enzymes. For example, in vitro screens have been carried out using peptide or protein arrays to identify sequence motifs for a protein lysine methyltransferase [6] and for protein kinases [4]. Nevertheless, PTM substrate candidates identified by these approaches require further validation by MS analysis of the purified endogenous proteins. In summary, despite technical advances in the past few decades, more efficient and sensitive bioanalytical technologies are needed to address key bottlenecks in identifying PTM substrate proteins, mapping PTM sites, and investigating in vivo PTM dynamics.
During the past decade, MS-based proteomics has been shown to be a powerful technique for proteome-wide identification of PTM substrates and mapping of PTM sites. Such studies typically involve four steps. First, the protein lysate of interest is proteolytically digested, usually by a specific protease, such as trypsin. Second, the resulting proteolytic peptides are subjected to enrichment, using a suitable method, to separate the PTM peptides of interest from the rest of the proteolytic peptides. Third, the isolated PTM peptides are analyzed by nano-HPLC/MS/MS for peptide identification and precise localization of PTM sites. Finally, the peptide candidates are further evaluated by a manual or an automated verification method to ensure the accuracy and statistical significance of the identification [7]. In addition, a separation step can be included in the procedure to separate either proteins (before the proteolytic digestion) or peptides (after the proteolytic digestion) into multiple fractions to reduce sample complexity.
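The first two steps of this workflow can be illustrated in silico. The following is a minimal sketch, not a description of any published pipeline: it assumes the commonly used simplified trypsin rule (cleavage C-terminal to lysine or arginine unless the next residue is proline) and ignores missed cleavages. The function names, the example sequence, and the modified position are hypothetical and chosen only for illustration.

```python
def tryptic_digest(sequence):
    """Return (start_index, peptide) pairs for a simplified tryptic digest.

    Assumed rule: cleave after K or R unless the next residue is P.
    Missed cleavages are not modeled.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append((start, sequence[start:i + 1]))
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal peptide that does not end in K or R.
        peptides.append((start, sequence[start:]))
    return peptides


def enrich(peptides, modified_sites):
    """Idealized enrichment: keep peptides that span a modified position.

    `modified_sites` holds 0-based positions in the parent protein
    (a hypothetical input; real enrichment is neither complete nor clean).
    """
    return [
        (start, pep)
        for start, pep in peptides
        if any(start <= site < start + len(pep) for site in modified_sites)
    ]


protein = "MKAASPLKGSR"           # hypothetical sequence
digest = tryptic_digest(protein)
ptm_peptides = enrich(digest, modified_sites={4})
print(digest)
print(ptm_peptides)
```

In this toy case the digest yields three peptides, and only the one spanning position 4 survives the idealized enrichment. Real enrichment recovers only part of the PTM peptides and carries over some unmodified ones, which is why the later MS/MS identification and verification steps remain necessary.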
High sensitivity is desirable in PTM proteomics to detect substrate proteins that exist in low abundance in cells. The detection sensitivity of a PTM proteomics screen depends on four factors: (i) yield of affinity enrichment, (ii) level of contamination from irrelevant peptides, (iii) sensitivity of the HPLC/MS/MS system, and (iv) complexity of the peptide mixture. The PTM peptides are present in an ocean of non-PTM peptides and may be present in low stoichiometry. Accordingly, without enrichment, mass spectrometric analysis detects PTM peptides inefficiently. Despite advances in the sensitivity of HPLC/MS systems and the development of more powerful algorithms for protein sequence database searching, the lack of efficient procedures for enrichment of PTM peptides has become a major bottleneck for PTM proteomic research.
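A back-of-the-envelope calculation shows how factors (i) and (ii) interact. The sketch below assumes a deliberately simple model in which a fraction `enrichment_yield` of PTM peptides is recovered and a fraction `carryover` of non-PTM peptides contaminates the enriched sample. The model, the parameter names, and all numbers are illustrative assumptions, not measured values.

```python
def ptm_fraction(n_ptm, n_background, enrichment_yield=1.0, carryover=1.0):
    """Fraction of peptides in the analyzed sample that carry the PTM.

    With the defaults (everything retained) this is the fraction before
    enrichment. All inputs are hypothetical quantities in arbitrary units.
    """
    ptm = enrichment_yield * n_ptm
    background = carryover * n_background
    return ptm / (ptm + background)


# Hypothetical mixture: 1 PTM peptide per 999 unmodified peptides.
before = ptm_fraction(1, 999)
# Assumed enrichment: 50% yield, 0.1% carryover of unmodified peptides.
after = ptm_fraction(1, 999, enrichment_yield=0.5, carryover=0.001)
print(f"before: {before:.4f}, after: {after:.4f}")
```

Under these assumed numbers, the PTM peptides rise from 0.1% of the mixture to roughly one third of it, even though half of them are lost during enrichment. This is why both the yield and the contamination level matter for detection sensitivity.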
Here, we review existing MS-based proteomics strategies for global PTM analysis, with a focus on enrichment methods for PTM peptides. We also discuss future challenges for comprehensive PTM analysis. Readers interested in general information about PTMs, mapping PTM sites in proteins, and PTM quantification by MS are referred to several recent review articles [8–13].