Protein phosphorylation is a fundamental regulatory mechanism in many cellular processes, and aberrant perturbation of phosphorylation has been implicated in various human diseases. Kinases and their cognate inhibitors are regarded as hotspots for drug development. Emerging tools that enable system-wide quantitative profiling of the phosphoproteome therefore offer a powerful impetus for unveiling novel signaling pathways, drug targets, and biomarkers for diseases of interest. This review highlights recent advances in phosphoproteomics, the current state of the art of the technologies, and the challenges and future perspectives of this research area. Finally, some exemplary applications of phosphoproteomics in diabetes research are highlighted.
Protein phosphorylation is a post-translational modification (PTM) on serine, threonine, and tyrosine residues that plays an essential role in signal transduction across many cellular processes, including cell proliferation, differentiation, apoptosis, and metabolism. The dynamics of phosphorylation and the network of targets are delicately controlled and coordinated by the reciprocal actions of arrays of protein kinases and phosphatases in response to stimuli. In terms of biological significance, the negatively charged phosphate group introduced by phosphorylation can attract a positively charged amino acid residue, or repel a negatively charged one, in close proximity. The resulting change in protein conformation in turn modifies the protein’s physical and chemical properties and functions, such as catalytic activity, interaction partners, and stability. Accordingly, aberrant protein phosphorylation is involved in many human diseases, such as Alzheimer’s disease [2,3], cancer [4–6], cardiovascular disease [7–9], and diabetes [10–12]. Owing to the biological impact of phosphorylation in the pathogenesis of human diseases, protein kinases and their cognate inhibitors have been a central focus of drug development [13–15]. Examples include the tyrosine kinase inhibitor imatinib (Gleevec®) for treating chronic myelogenous leukemia [16,17], and inhibitors targeting the lipid kinase PI3K for solid tumors [18–20], the serine/threonine kinase BRAF for melanoma [21–23], the receptor tyrosine kinase EGFR for lung cancer [24,25], and the serine/threonine kinase mTOR for renal tumors.
Knowledge of the complexity of phosphorylation-mediated signaling networks has advanced greatly in the last decade, largely due to the emerging field of phosphoproteomics. Phosphoproteomics has become an indispensable technology for biomedical research, enabling quantitative profiling of site-specific phosphorylation across many different biological conditions with extensive coverage of the phosphoproteome. The ability to identify >10,000 phosphorylation sites and to quantify their phosphorylation levels on such a broad scale allows investigators to systematically identify aberrant signaling activation, pathways, and networks underlying disease conditions, thus providing a new level of mechanistic understanding of disease pathogenesis and potential therapeutic targets or biomarkers. Herein, we review recent technological advances in phosphoproteomics with an emphasis on quantitative approaches, including both global and targeted quantitative phosphoproteomics, highlight the current state of the art of the field, and discuss some recent applications to diabetes.
For large-scale profiling of the phosphoproteome, mass spectrometry (MS) has become the dominant tool due to its sensitivity and unique ability to identify site-specific PTMs. Traditionally, phosphoproteomics analysis by MS has faced great challenges because of the low abundance of signaling proteins and the low stoichiometry of phosphorylation events. However, recent advances in affinity enrichment of phosphopeptides, liquid chromatography (LC) separations, and MS instrumentation have largely overcome these challenges, enabling large-scale, proteome-wide profiling of protein phosphorylation. The general workflow of MS-based quantitative phosphoproteomics is illustrated in Figure 1. While the detailed workflow varies depending on the specific application, it typically consists of four major steps: 1) protein extraction and enzymatic digestion, 2) isobaric labeling using reagents such as tandem mass tags (TMT) or isobaric tags for relative and absolute quantification (iTRAQ) to enable multiplexed relative quantification, 3) enrichment of phosphopeptides by affinity chromatography, and 4) LC-MS/MS analysis. To further enhance the coverage of the phosphoproteome, LC fractionation strategies are often applied either prior to or after phosphopeptide enrichment. Besides isobaric labeling, label-free or other stable isotope labeling approaches such as stable isotope labeling by amino acids in cell culture (SILAC) can also be incorporated to achieve quantitative measurements of protein phosphorylation [31,32].
Since phosphorylation is a labile PTM, preserving the integrity of the in vivo phosphorylation status is critical. It is therefore essential to include appropriate phosphatase inhibitors during cell lysis and protein extraction. In shotgun proteomics, proteins are typically digested into peptides using trypsin. Other enzymes such as endoproteinase Lys-C or Glu-C can also be applied to enhance the coverage of the phosphoproteome owing to their complementary cleavage specificities. For example, in one study the integration of both Glu-C and trypsin digestion resulted in the identification of 8,507 phosphorylation sites compared to only 4,647 phosphorylation sites by trypsin alone. Wisniewski et al. demonstrated that the consecutive use of Lys-C and trypsin enhanced both protein and phosphorylation site identification by 40%. Although digestion with multiple enzymes increases sequence coverage, its drawback is the need for additional sample material and MS instrument time. This drawback can be partially alleviated by performing consecutive proteolytic digestions with filter-aided sample preparation (FASP), in which the filter unit serves as an enzyme reactor. In this way, different populations of peptides can be obtained from a single sample without the need for additional input material.
Elucidation of signaling networks requires quantification of the dynamic changes in protein phosphorylation. In principle, most quantitative strategies are applicable to both global proteomics and phosphoproteomics. Although recent advances in the robustness and reproducibility of LC-MS platforms have enabled label-free approaches to be more commonly employed in quantitative proteomics, the majority of quantitative phosphoproteomics studies to date are based on stable isotope labeling approaches. Among these, SILAC and isobaric labeling strategies are commonly employed in phosphoproteomics. In terms of quantification accuracy, SILAC typically performs better than isobaric labeling because the label is introduced further upstream (i.e., during cell culture) rather than at the peptide level as for iTRAQ or TMT. Nonetheless, isobaric reagents offer the advantages of quantitative analysis of multiple samples simultaneously (i.e., multiplexing), which is particularly useful for monitoring a biological system over multiple time points, and of universal applicability to all types of samples.
The currently available commercial isobaric reagents TMT and iTRAQ offer 4-, 6-, 8-, and 10-plex labeling and quantification for both global proteomics and phosphoproteomics, which provides great flexibility in experimental design for specific applications. Sample multiplexing also greatly increases the overall sample throughput of phosphoproteomics analysis, especially when multi-dimensional LC separations are employed to enhance coverage. Higher-energy collisional dissociation (HCD) performed on the newer generation of Orbitrap mass spectrometers, such as the Orbitrap Velos or Q-Exactive, has become the primary approach for analyzing isobaric-labeled samples because it produces excellent-quality MS/MS data along with low m/z reporter ions for both identification and quantification. One potential caveat of isobaric labeling-based quantification is that the isolation window for selected precursors in the first-stage MS, which is typically 3 Thomson (Th), potentially includes ions of multiple peptides, and such interference can skew the quantification results of the identified peptides. Extensive multi-dimensional LC separations can at least partially alleviate this problem. A triple-stage MS (MS3) strategy was reported to almost completely eliminate interference, but at the expense of sensitivity. More recently, McAlister et al. described a MultiNotch MS3 method on an Orbitrap Fusion instrument that utilizes synchronous precursor selection for co-isolating and co-fragmenting multiple MS2 fragment ions to enhance the overall sensitivity.
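To make the interference caveat concrete, the following minimal sketch shows how a co-isolated peptide that does not change between samples pulls an observed reporter ion ratio toward 1. The function name and all intensity values are hypothetical and chosen only for illustration; they are not data from the cited studies.

```python
def observed_ratio(target_a, target_b, interference_a=0.0, interference_b=0.0):
    """Reporter ion ratio (channel A / channel B) when a co-isolated
    peptide adds its own reporter signal to both channels."""
    return (target_a + interference_a) / (target_b + interference_b)


# Hypothetical target phosphopeptide with a true 4-fold change.
true_ratio = observed_ratio(400.0, 100.0)

# Same peptide, co-isolated with an unchanging background peptide that
# contributes 100 intensity units to each reporter channel.
compressed_ratio = observed_ratio(400.0, 100.0, 100.0, 100.0)

print(f"true ratio: {true_ratio:.2f}, observed ratio: {compressed_ratio:.2f}")
```

Under these assumed values the 4-fold change is reported as 2.5-fold, which is the kind of distortion that additional fractionation or MS3-based acquisition is intended to reduce.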
In addition to labeling strategies, label-free quantification is also commonly applied in phosphoproteomics. Several software tools and strategies have been reported for robust measurement of phosphopeptide levels across samples. For example, Schilling et al. demonstrated the use of MS1 extracted ion chromatograms in Skyline for quantification of phosphorylation. Xue et al. introduced a library-assisted extracted ion chromatogram (LAXIC) approach, which uses a synthetic peptide library as internal standards for normalization. Cox et al. reported the generic MaxLFQ approach on the MaxQuant computational platform, which is applicable to phosphoproteomics. However, one primary limitation of label-free quantification is its heavy reliance on the reproducibility of sample processing and instrument performance.
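The idea behind MS1-based label-free quantification can be sketched as integrating the extracted ion chromatogram (XIC) of the same phosphopeptide in two runs and comparing the areas. This is a simplified illustration and not the algorithm used by Skyline, LAXIC, or MaxLFQ; the function name and the chromatogram values are hypothetical.

```python
def xic_area(times, intensities):
    """Integrate an extracted ion chromatogram with the trapezoidal rule."""
    if len(times) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += 0.5 * (intensities[i] + intensities[i - 1]) * width
    return area


# Hypothetical MS1 traces of one phosphopeptide in two LC-MS runs
# (retention time in minutes, intensity in arbitrary units).
rt = [30.0, 30.1, 30.2, 30.3, 30.4]
control = [0.0, 100.0, 200.0, 100.0, 0.0]
treated = [0.0, 200.0, 400.0, 200.0, 0.0]

fold_change = xic_area(rt, treated) / xic_area(rt, control)
print(f"treated/control fold change: {fold_change:.2f}")
```

Because the two areas come from separate runs, any drift in sample handling, retention time, or instrument response feeds directly into the ratio, which is why reproducibility is the limiting factor noted above.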
Due to the generally low abundance of phosphopeptides, efficient enrichment of phosphorylated serine (pSer), threonine (pThr), and tyrosine (pTyr) containing peptides is a key step in phosphoproteomics analysis. Various affinity enrichment strategies based on either specific antibodies or chemical resins have been developed for effective isolation of phosphopeptides from complex mixtures. Anti-pTyr antibody-based immunoaffinity approaches have become the primary method for profiling the tyrosine phosphoproteome. For global enrichment of pSer/pThr/pTyr peptides, immobilized metal affinity chromatography (IMAC) and metal oxide affinity chromatography (MOAC) with TiO2 have become the two most popular methods.
Tyrosine phosphorylation is a prominent component of intracellular signaling involved in receptor tyrosine kinase activation and is essential for proliferation, differentiation, survival, and metabolism [46,47]. Specific anti-pTyr immunoaffinity strategies have proven highly successful in the comprehensive mapping of tyrosine phosphorylation [43,48–51]. However, phosphorylated tyrosine residues constitute only a very small fraction of total cellular protein phosphorylation, with an estimated relative abundance of 1800:200:1 for pSer/pThr/pTyr in vertebrate cells. Therefore, a relatively large amount of protein (at the milligram level) is often required to achieve adequate coverage of the tyrosine phosphoproteome. Recently, Boersema et al. demonstrated the feasibility of in-depth quantitative profiling of the pTyr proteome using immunoaffinity enrichment coupled with stable isotope dimethyl labeling, where more than 1,100 unique phosphopeptides were identified from 4 mg of starting material using single-dimensional LC-MS/MS.
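As a quick worked calculation, the 1800:200:1 estimate can be converted into fractional abundances to show why dedicated pTyr enrichment and milligram-scale input are needed. The variable names below are illustrative only.

```python
# Estimated relative abundance of phosphorylated residues in vertebrate
# cells, as quoted in the text (pSer : pThr : pTyr = 1800 : 200 : 1).
relative_abundance = {"pSer": 1800, "pThr": 200, "pTyr": 1}

total = sum(relative_abundance.values())
fractions = {residue: count / total for residue, count in relative_abundance.items()}

for residue, fraction in fractions.items():
    print(f"{residue}: {fraction * 100:.2f}% of phosphorylated residues")
```

On this estimate, pTyr accounts for roughly 0.05% of phosphorylation events, about one in two thousand.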
The IMAC approach was first introduced in 1986. It leverages positively charged metal ions such as Fe3+, Ga3+, Ti4+, and Zr4+ [55–58], which are immobilized on resins to capture negatively charged phosphopeptides. The most commonly used resins are coated with iminodiacetic acid (IDA) or nitrilotriacetic acid (NTA). Despite being one of the most extensively applied techniques for phosphopeptide enrichment, IMAC has several caveats: low tolerance towards buffers or salts in biological samples, and compromised specificity due to nonspecific binding of acidic peptides. Initially, O-methylesterification was applied to derivatize the carboxylic acid groups of acidic amino acid residues and the peptide C-terminus in order to enhance the specificity of IMAC. A disadvantage of the esterification procedure is the occurrence of side reaction products (partial hydrolysis of peptides, deamidation of asparagine and glutamine residues) that can increase sample complexity.
Over the years, commercially available IMAC resins have been improved in both specificity and robustness through modification of either the solid support or the chelating linker. For example, the availability of magnetic Ni-NTA agarose resins has enabled high-throughput automated isolation of phosphopeptides in 96-well format with relatively high specificity. By coupling with different fractionation strategies, IMAC has been successfully applied in many large-scale studies to achieve extensive in-depth profiling of the global phosphoproteome, where >10,000 phosphorylation sites were often identified [61,62]. Moreover, different metal ions can offer distinct enrichment efficiency and binding selectivity. For example, a sequential Ga3+- and Fe3+-IMAC enrichment was reported to significantly enhance the coverage of the phosphoproteome compared to a single IMAC enrichment.
A number of metal oxides or hydroxides, including TiO2, Ga2O3, ZrO2, Fe3O4, Nb2O3, SnO2, HfO2, Ta2O5, and Al(OH)3, have been reported as matrices for enriching phosphopeptides based on complex formation between metal oxides and phosphopeptides. TiO2 is by far the most widely employed owing to its remarkable sensitivity and selectivity in phosphopeptide enrichment. In 2004, Pinkse et al. described the first implementation of TiO2 as a pre-column in tandem with LC-MS/MS for enrichment of phosphopeptides. One main limitation of MOAC is its specificity, due to potential binding of acidic peptides. In 2005, Larsen et al. described an approach to improve the selectivity towards phosphopeptides by loading the sample in a 2,5-dihydroxybenzoic acid (DHB) solution containing acetonitrile and TFA, so that DHB competes with acidic peptides for binding to the TiO2 beads, thus improving the overall specificity of phosphopeptide enrichment. Phosphopeptides were then eluted with an alkaline ammonium hydroxide solution at pH 10.5. More recently, aliphatic hydroxy acid modifiers such as lactic acid were proposed to improve the selectivity and capacity of TiO2 towards phosphorylated peptides. TiO2-based MOAC enrichment has been broadly applied in large-scale phosphoproteomics studies. For example, Paulo et al. were able to quantify 10,562 phosphorylation events in mouse kidney, liver, and pancreas tissues treated with MEK inhibitors using MOAC coupled with 10-plex TMT labeling.
IMAC and MOAC have been reported to be complementary in that IMAC favors multiply phosphorylated peptides while MOAC favors mono-phosphorylated peptides. This complementary nature has led to the integration of both techniques into a single workflow termed sequential elution from IMAC (SIMAC). Briefly, this strategy involves multistage elution from IMAC: 1) after binding, the mono-phosphorylated peptides are first eluted from IMAC under acidic conditions (1% TFA, 20% acetonitrile at pH 1.0), and 2) the multiply phosphorylated peptides are subsequently eluted from the same IMAC resin at pH 11.30 (ammonia water). Both the IMAC flow-through and the 1% TFA acidic eluate are then subjected to TiO2 chromatography to enrich the mono-phosphorylated peptides. Both Fe3+- and Ga3+-IMAC have been reported in the SIMAC strategy, and in one report a nearly 2-fold increase in phosphopeptide identification was observed from lysates of human mesenchymal stem cells compared to TiO2 alone. The SIMAC strategy has been successfully applied in several large-scale phosphoproteomics studies [70,71].
In addition, polymer-based metal-ion affinity capture (PolyMAC) has emerged as a novel chemical strategy based on water-soluble, globular dendrimers multifunctionalized with metal ions (e.g., Ti4+, Fe3+) for capturing phosphate groups [72,73]. Compared with TiO2, PolyMAC displayed excellent reproducibility, selectivity, and high recovery of phosphopeptides from complex mixtures.
While the final enriched phosphopeptide samples are typically analyzed by LC-MS/MS, additional orthogonal LC fractionation strategies are often applied either prior to or after the enrichment of phosphopeptides to enhance the overall coverage. Traditionally, strong cation exchange (SCX) was widely adopted as the first-dimension fractionation; however, SCX generally suffers from low separation resolution and potential sample loss due to the need for additional desalting. Recent efforts have converged on alternatives to SCX that not only offer better separation but also mitigate sample loss. High-pH (HpH) reversed-phase LC and electrostatic repulsion hydrophilic interaction chromatography (ERLIC) or hydrophilic interaction chromatography (HILIC) have been commonly applied for pre-fractionation prior to phosphopeptide enrichment. However, if the total protein amount is limited, capillary-based LC fractionation is often applied after phosphopeptide enrichment to enhance the coverage of the phosphoproteome.
HpH reversed-phase LC (RPLC) typically operates at either pH 10 using ammonium formate buffer or pH 7.5 using TEAB buffer. In RPLC, it was observed that a change in the pH of the mobile phase from low to high dramatically alters the overall charge distribution of a peptide and its ion-pairing interactions with the LC stationary phase, thereby offering a distinct and at least partially orthogonal separation. HpH RPLC offers much higher separation resolution and better sample recovery than traditional SCX, which makes it an ideal first-dimension fractionation strategy when coupled with low-pH LC-MS/MS. Wang et al. first demonstrated the offline coupling of high- and low-pH RPLC for global proteome profiling using a so-called concatenation strategy, which combines the early, middle, and late eluting fractions from HpH RPLC to make more effective use of the partial orthogonality of the 2D separation. Offline HpH RPLC fractionation is now routinely applied prior to phosphopeptide enrichment by IMAC or MOAC to achieve extensive coverage of the phosphoproteome, where the identification of >30,000 phosphopeptides is often reported in a given study [61,77,78].
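The concatenation idea can be sketched as a simple pooling scheme in which each final pool receives fractions spaced evenly across the high-pH gradient, so that every pool contains early, middle, and late eluting material. The function name and the choice of 96 fractions pooled into 24 samples are assumptions made for illustration; the source does not specify these numbers.

```python
def concatenate_fractions(n_fractions, n_pools):
    """Assign fraction indices (0-based, in elution order) to pools by
    sending every n_pools-th fraction to the same pool."""
    if n_pools < 1 or n_fractions < n_pools:
        raise ValueError("need at least one pool and one fraction per pool")
    pools = [[] for _ in range(n_pools)]
    for fraction in range(n_fractions):
        pools[fraction % n_pools].append(fraction)
    return pools


# Hypothetical layout: 96 collected HpH fractions combined into 24 pools.
pools = concatenate_fractions(96, 24)
print(pools[0])   # fractions combined into the first pool
print(pools[23])  # fractions combined into the last pool
```

Because peptides that co-elute at high pH are spread across the low-pH gradient, pooling distant fractions keeps each second-dimension run well populated without multiplying the number of LC-MS/MS analyses.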
ERLIC and HILIC are other promising orthogonal LC-based fractionation approaches for coupling with RPLC in phosphoproteomics [79,80]. HILIC is a high-resolution separation technique in which the primary interaction between a peptide and the neutral, hydrophilic stationary phase is hydrogen bonding. In HILIC, retention increases with increasing polarity (hydrophilicity) of the peptide, opposite to the trend observed in RPLC. In contrast, ERLIC is a mixed-mode chromatography that combines the properties of HILIC and anion exchange, typically operated with a high percentage of organic solvent (e.g., 70% acetonitrile) at low pH. In the absence of the HILIC mode, phosphopeptides are electrostatically repelled from the positively charged stationary phase at low pH. However, in the presence of a high organic solvent content, hydrophilic interaction dominates and outweighs the anion exchange interaction. As a result, phosphopeptides are less repelled by (i.e., more strongly retained on) the column than their non-phosphorylated counterparts. In a comparison of ERLIC versus SCX-IMAC in human A431 epidermal cells, ERLIC and SCX were found to be complementary in phosphopeptide identification.
Analysis of phosphopeptides by MS/MS has traditionally been challenging on ion trap instruments using collision-induced dissociation (CID), since the phosphate groups on serine and threonine residues are very labile and a 98 Da neutral loss of phosphoric acid occurs preferentially over peptide backbone fragmentation. The inefficient backbone fragmentation often results in MS/MS spectra that are essentially devoid of sequence information, which hampers phosphopeptide identification and phosphorylation site assignment even with tedious manual examination of the spectra. The introduction of electron transfer dissociation (ETD) partially alleviates this challenge by fragmenting the peptide backbone with little loss of the phosphate moiety on most fragment ions (complementary ions of types c and z) [84,85]. ETD is very useful for fragmentation of peptides containing labile modifications, such as phosphopeptides, because the resulting fragments retain their phosphate group, enabling direct identification of the phosphorylation site. Drawbacks of ETD include its relatively low duty cycle due to the long reaction time compared to CID, and its low fragmentation efficiency for doubly charged species, which constitute the majority of peptides in a tryptic digest. While ETD is complementary and especially valuable for site determination, these drawbacks prevent it from being broadly applied in large-scale phosphoproteome profiling. More recently, HCD has become more broadly available on the newer generation of Orbitrap instruments (e.g., Orbitrap Velos and Q-Exactive). HCD deposits higher levels of energy into the precursor peptide ion on a shorter timescale, thus promoting much more prominent backbone fragmentation [87,88]. Recently, a new fragmentation scheme combining ETD and HCD, termed EThcD and originally introduced by Frese et al., has enabled dual series of both b/y- and c/z-type fragment ions in a single spectrum. EThcD, now available on the Orbitrap Fusion instrument, provides much richer fragmentation data, resulting in substantially enhanced phosphopeptide identification and site determination rates.
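The neutral loss described above appears in a CID spectrum as a peak shifted from the precursor by the mass of phosphoric acid divided by the charge state. The sketch below computes that position; the function name and the example precursor are hypothetical, and the constant is the monoisotopic mass of H3PO4.

```python
H3PO4_MONOISOTOPIC = 97.976896  # Da, monoisotopic mass of phosphoric acid


def neutral_loss_mz(precursor_mz, charge, loss_mass=H3PO4_MONOISOTOPIC):
    """m/z of a precursor ion after losing a neutral molecule.

    The charge is unchanged by a neutral loss, so the m/z shift is the
    lost mass divided by the charge state.
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return precursor_mz - loss_mass / charge


# Hypothetical doubly charged phosphopeptide precursor at m/z 800.0.
print(f"{neutral_loss_mz(800.0, 2):.4f}")
```

For a doubly charged precursor the shift is therefore about 49 m/z units, and for a triply charged precursor about 32.7, which is how the dominant neutral-loss peak is usually recognized in ion trap CID spectra.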
For phosphorylation site assignment, different probability-based algorithms have been developed to measure the probability of correct phosphorylation site localization based on the presence and intensity of site-determining ions in MS/MS spectra. These include PTM score, Ascore, PhosphoScore, PhosphoScan, and PhosCalc for MS/MS spectra acquired using CID, and SLoMo, Phosphinator, and PhosphoRS for spectra acquired using both CID and ETD fragmentation modes.
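To illustrate what a site-determining ion is, the sketch below computes singly charged b-ion m/z values for a peptide phosphorylated at one of two candidate residues and reports which b-ions differ between the two placements. It is not an implementation of any of the scoring tools named above; the peptide ASPTK, the function names, and the tolerance are hypothetical, and the mass table covers only the residues in this example.

```python
# Monoisotopic residue masses (Da) for the residues used in this example only.
RESIDUE_MASS = {"A": 71.03711, "S": 87.03203, "P": 97.05276,
                "T": 101.04768, "K": 128.09496}
PHOSPHO = 79.96633  # Da, HPO3 added by phosphorylation
PROTON = 1.00728    # Da


def b_ions(peptide, phosphosite):
    """Singly charged b-ion m/z values (b1 to b(n-1)) for a peptide
    phosphorylated at the 0-based residue index `phosphosite`."""
    mz_values = []
    running = PROTON
    for index, residue in enumerate(peptide[:-1]):
        running += RESIDUE_MASS[residue]
        if index == phosphosite:
            running += PHOSPHO
        mz_values.append(running)
    return mz_values


def site_determining_b_ions(peptide, site_a, site_b, tol=0.01):
    """Return the b-ion numbers whose m/z differs between the two placements."""
    ions_a = b_ions(peptide, site_a)
    ions_b = b_ions(peptide, site_b)
    return [number + 1 for number, (a, b) in enumerate(zip(ions_a, ions_b))
            if abs(a - b) > tol]


# Hypothetical peptide ASPTK: phosphate on Ser (index 1) or Thr (index 3)?
determining = site_determining_b_ions("ASPTK", 1, 3)
print(determining)
```

Only fragments that contain one candidate residue but not the other can distinguish the two placements; localization algorithms build their probabilities on whether such ions are observed in the spectrum.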
In addition to global phosphoproteomics, there is increasing interest in pursuing accurate quantitative analyses of phosphorylation dynamics for a specific set of protein targets from particular signaling pathways or networks. In principle, both antibody-based and targeted MS approaches can be used to measure phosphorylation levels of specific proteins. Herein we briefly review antibody-based multiplex immunoassays, protein microarray, flow cytometry, and targeted MS approaches for targeted measurements of protein phosphorylation.
Multiplex immunoassays can simultaneously detect or quantify multiple analytes via an immunological reaction. These assays are carried out in two major formats: 1) planar array-based assays, also known as enzyme-linked immunosorbent assays (ELISA), for example, Meso Scale Discovery (MSD); and 2) microbead-based assays, for instance, Luminex xMAP and FlowCytomix. In the MSD platform, a carbon electrode plate surface is used instead of polystyrene owing to its 10-fold greater binding capacity for the phospho-antibodies. In each well of a 96-well plate, up to ten carbon electrodes can be immobilized, and each electrode is conjugated with a specific phospho-antibody. This enables 10-plex capability (i.e., up to 10 different phosphoproteins can be detected and analyzed per sample concurrently). In the xMAP platform, a polystyrene microsphere core is employed as the stationary matrix. Each batch of microspheres is coated and encoded with red and infrared fluorophores mixed at a distinct ratio, and is then coupled with the respective fluorescent R-phycoerythrin-labeled phospho-antibody. Since a very large number of fluorophore combinations is available for assigning distinct batches of microspheres to different phospho-antibodies, dozens to hundreds of phosphoprotein targets can be analyzed per sample at the same time [100,101] (Figure 2A). This innovation tremendously extends multiplexing capability, which is a critical merit for application to large-scale clinical studies [102–105].
The protein microarray, or protein chip, is another form of multiplex immunoassay, carried out on a glass slide or membrane. Specifically, nitrocellulose-coated slides, nylon, or silanized silica are employed owing to their high binding affinity for proteins. Proteome Profiler Antibody Arrays (R&D Systems), PathScan, and the RayBioTech Phosphorylation Assay are examples of commercially available microarrays. Protein microarrays can be subdivided into two groups based on the molecules being immobilized. In forward-phase protein microarrays (FPPAs), or antibody microarrays, phospho-antibodies are immobilized on the slide, whereas in reverse-phase protein microarrays (RPPAs), the samples of interest are spotted on the surface. With regard to multiplexing, each array can probe a specific phosphoprotein target across hundreds of samples in a single run [107,108] (Figure 2B). Moreover, protein microarrays are compatible with a broad spectrum of clinical samples, including biopsies, laser capture microdissected material, and multiple biological fluids.
Flow cytometry is another technology that couples cell-sorting capability with the multiplexing capability of the fluorescent cell barcoding system to generate cell-type-specific phosphoproteomic fingerprints of a heterogeneous cell population [109,110]. More recently, multiplexed mass cytometry was developed by coupling flow cytometry, mass-tag cellular barcoding of antibodies, and mass spectrometry to offer single-cell analyses of many proteins, including phosphoproteins [111,112].
Besides immunoassays, targeted quantification of site-specific phosphorylation dynamics for specific signaling proteins or pathways using selected reaction monitoring (SRM), parallel reaction monitoring (PRM), or data-independent acquisition (DIA) has become an emerging technique [113–117]. While there have been significant advances in antibody-based assays in the format of multiplexed immunoassays or protein microarrays, their major limitations are limited specificity and, often, limited reproducibility in analyte detection due to non-specific binding, as well as the lack of site-specific phosphorylation information. In contrast, targeted mass spectrometry based on SRM (also called multiple reaction monitoring, MRM) or PRM provides highly specific, site-specific, and reproducible measurements of phosphorylation levels on targeted proteins. The general concept of targeted MS quantification of site-specific phosphorylation is illustrated in Figure 2C. Heavy-isotope-labeled phosphopeptides for the phosphorylation sites of interest are synthesized and spiked into the peptide samples. The samples containing internal standards can be subjected to phosphopeptide enrichment using IMAC or MOAC prior to LC-SRM analysis. Both the dynamics of site-specific phosphorylation and the phosphorylation stoichiometry can be accurately quantified. Direct quantification of phosphorylation dynamics without affinity enrichment is also feasible by applying alternative, more sensitive targeted proteomics workflows. Several recent studies have demonstrated targeted quantification of phosphorylation for entire signaling pathways, such as the DNA damage pathway and the PI3K-mTOR/MAPK pathway.
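The quantification step with heavy-isotope-labeled standards can be sketched as follows: the endogenous (light) amount is the light-to-heavy peak area ratio multiplied by the spiked amount, and stoichiometry is the phosphorylated amount divided by the sum of the phosphorylated and non-phosphorylated forms. The function names and all numbers are hypothetical. The sketch also assumes that the non-phosphorylated counterpart peptide is quantified with its own heavy standard and that each light/heavy pair behaves identically during sample handling.

```python
def amount_from_heavy_standard(light_area, heavy_area, heavy_spiked_fmol):
    """Endogenous peptide amount (fmol) from the light/heavy peak area
    ratio and the known amount of spiked heavy standard."""
    if heavy_area <= 0:
        raise ValueError("heavy standard peak area must be positive")
    return light_area / heavy_area * heavy_spiked_fmol


def stoichiometry(phospho_fmol, nonphospho_fmol):
    """Fraction of the site that is phosphorylated."""
    total = phospho_fmol + nonphospho_fmol
    if total <= 0:
        raise ValueError("total peptide amount must be positive")
    return phospho_fmol / total


# Hypothetical SRM peak areas, with 50 fmol of each heavy standard spiked in.
phospho = amount_from_heavy_standard(2.0e5, 1.0e6, 50.0)
nonphospho = amount_from_heavy_standard(8.0e5, 1.0e6, 50.0)
occupancy = stoichiometry(phospho, nonphospho)
print(f"phospho: {phospho:.1f} fmol, occupancy: {occupancy:.0%}")
```

With these assumed values the site is 20% occupied; tracking the same calculation across conditions gives the phosphorylation dynamics referred to above.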
Given the technological advances that now enable comprehensive quantitative profiling of the phosphoproteome, we have seen an explosion of both phosphoproteomics data and literature. Table 1 summarizes some representative studies and their workflows, which reflect the current state of the art of phosphoproteomics in terms of depth of coverage, sensitivity, and quantification dynamics. In many of these global studies, an enrichment specificity of >90% was typically obtained. By coupling a phosphopeptide enrichment technique such as TiO2-MOAC or Ti4+-based IMAC with single-dimensional LC-MS/MS, ~10,000 phosphorylation sites were identified and quantified from mouse liver tissue or Jurkat T-cells. Moreover, phosphoproteome coverage could be further enhanced by incorporating different proteolytic enzymes such as AspN, chymotrypsin, GluC, trypsin, and LysC, with which 18,430 phosphosites were identified.
To achieve more in-depth coverage, multi-dimensional separations are often coupled with phosphopeptide enrichment when the starting sample amount is not limiting. For example, SCX fractionation has been coupled with IMAC or TiO2 enrichment as well as anti-pTyr IP, followed by LC-MS/MS, where >30,000 phosphosites were identified [62,122]. More recently, the integration of isobaric labeling (e.g., iTRAQ) with HpH RPLC fractionation followed by IMAC enrichment and LC-MS/MS has become an increasingly popular quantitative phosphoproteomics workflow, which typically provides quantification of 20–40K phosphosites [61,78]. With this workflow, 4 to 10 biological samples can be multiplexed for quantification, depending on the type of iTRAQ or TMT reagent used. One caveat of all the current in-depth phosphoproteome profiling workflows is the requirement for at least several mg of protein per sample. Recently, several workflows have been developed that start with relatively small amounts of protein. For example, Ficarro et al. reported an online nanoflow 3D-LC-MS/MS workflow integrating HpH, strong anion exchange, and low-pH RPLC to achieve the identification of 7,700 unique phosphopeptides from a total of only 400 μg peptides from mouse CD8+ T cells. Engholm-Keller et al. reported the integration of TiO2-based pre-fractionation followed by SIMAC fractionation and capillary HILIC fractionation, which resulted in an average of ~6,600 unique phosphopeptides from 300 μg peptides/condition in a duplex dimethyl labeling experiment. The most sensitive phosphoproteomics work reported to date is the coupling of hydroxy acid-modified MOAC with a miniaturized analytical column (25 μm I.D. capillary LC column) for successful identification of ~1000 phosphosites from 10,000 human cancer cells.
Besides global phosphoproteomics, targeted quantification of hundreds of phosphorylation sites from specific pathways has also become feasible. For example, De Graaf et al. reported Ti4+-IMAC-SRM based quantification of 89 phosphosites from the PI3K-mTOR/MAPK pathway, and Kennedy et al. reported IMAC-SRM based quantification of 107 phosphosites from the DNA repair pathway. The IMAC-SRM workflow typically requires only ~200 μg peptides per sample due to the overall higher sensitivity of LC-SRM based targeted measurements compared to conventional LC-MS/MS global measurements.
Besides the technological advances, we are also witnessing an explosion in the amount of data on phosphorylation and other PTMs. The publicly available data on site-specific phosphorylation and other PTMs represent a hugely valuable resource for the research community at large. For instance, a number of comprehensive database resources of phosphosites, as well as sites for other PTMs, have been made available to the public during the past decade. These include Phospho.ELM, PhosphoSitePlus [126,127], Phosida, SysPTM, and dbPTM. SysPTM and dbPTM are curated databases of different types of site-specific PTMs compiled from public resources. PhosphoSitePlus is currently the most comprehensive database on key types of PTMs for human and mouse, containing over 330,000 non-redundant PTMs, including phospho, acetyl, ubiquityl, and methyl groups.
Phosphoproteomics has become an indispensable technology for biomedical research. Herein, we highlight some recent applications in diabetes, an important research area where the study of protein phosphorylation is pivotal and playing an increasing role. Protein phosphorylation is essential in the orchestration of pancreatic β-cell function, including β-cell proliferation, apoptosis, and insulin secretion, as well as in the insulin resistance of peripheral tissues. The challenge in applying phosphoproteomics to diabetes research lies in the very limited availability of pancreatic islet samples, especially human samples. To date, most studies have been carried out with cell lines or samples from mouse models. For example, Engholm-Keller et al. identified ~6,600 unique phosphopeptides from an insulin-secreting rat INS-cell line using the SIMAC-HILIC approach. Li et al. reported the phosphoproteome dynamics during glucose-stimulated insulin secretion (GSIS) in rat pancreatic islets by combining TiO2 enrichment with SILAC and LC-MS/MS. Their work represents the first study focusing on phosphorylation dynamics during GSIS in islets. Despite the very small starting protein amounts (20–47 μg) from each rat, they managed to identify 8,539 phosphosites from 2,487 proteins, revealing phosphorylation responses in many different pathways, such as insulin secretion related pathways, cytoskeleton dynamics, protein processing in the ER and Golgi, transcription, and translation. Ca2+-dependent kinases such as Camk2b, the L-type Ca2+ channel-activity-related protein Rem2, and the Ca2+ transporters Atp2b1 and Slc24a2 were among the proteins regulated by short-term high glucose stimulation. This confirms the essential role of Ca2+ in exocytosis.
In another recent study of insulin signaling by Zhang et al., 3,876 phosphorylation sites were identified and a large number of potential new substrates of protein phosphatase 1 regulatory subunit 12A (PPP1R12A) were revealed by quantitative phosphoproteomics. A total of 698 phosphosites were responsive to PPP1R12A knockdown at both basal and insulin-stimulated states, of which 295 were implicated in insulin signaling-relevant pathways such as insulin receptor signaling, mTOR signaling, and ERK/MAPK signaling. This study unveils new mechanistic evidence for the role of PPP1R12A in skeletal muscle insulin signaling, and its implications for skeletal muscle insulin resistance and type 2 diabetes. More recently, El Ouaamari et al. reported SerpinB1 as a novel factor regulating pancreatic β-cell proliferation through the canonical growth factor pathways, based on quantitative phosphoproteomics data obtained by applying TMT-IMAC-LC-MS/MS to mouse islets and on Western blotting validation.
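To put the site counts from the Zhang et al. study in proportion, the following minimal Python sketch converts them to percentages. It assumes that the 698 responsive phosphosites are a subset of the 3,876 identified sites, which the text implies but does not state explicitly. The variable names and the `percent` helper are illustrative only.

```python
# Counts quoted in the text for the PPP1R12A knockdown study.
identified_sites = 3876   # phosphorylation sites identified
responsive_sites = 698    # sites responsive to PPP1R12A knockdown (assumed subset of identified)
pathway_sites = 295       # responsive sites implicated in insulin signaling-relevant pathways


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


print(f"Responsive among identified sites: {percent(responsive_sites, identified_sites):.1f}%")
print(f"Pathway-implicated among responsive sites: {percent(pathway_sites, responsive_sites):.1f}%")
```

Under that assumption, a minority of the identified sites responded to the knockdown, and somewhat less than half of those responders mapped to insulin signaling-relevant pathways.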
Type 1 diabetes (T1D) involves an autoimmune attack directed specifically at the pancreatic β-cells and mediated by autoreactive T lymphocytes. Iwai et al. tested the hypothesis that the downstream signaling cascade of the antigen-specific T cell receptor (TCR) is a key factor in orchestrating the T cell response by carrying out quantitative phosphoproteomics of primary CD4+ T cells derived from diabetes-prone NOD versus diabetes-resistant B6.H2g7 mice. With anti-pTyr IP and IMAC along with iTRAQ labeling, they were able to quantify 77 tyrosine phosphorylation events from 54 unique proteins downstream of TCR stimulation. Moreover, differentially regulated phosphorylation sites on proteins including TXK, CD5, PAG1, and ZAP-70 highlight the key signaling molecules involved in CD4+ T cell activation in NOD mice. In another study related to the complications of T1D, quantitative phosphoproteomics was applied to examine the mechanism of complications using induced pluripotent stem cells (iPSC) from patients with longstanding T1D, which allowed the observation of many key phosphorylation events in the DNA repair pathway related to T1D complications.
The recent advances in the field of phosphoproteomics provide an unprecedented capability and exciting opportunities for the systems-wide study of signaling pathways and their regulation in various biological systems, in order to gain a fundamental understanding of cellular processes such as cell proliferation and apoptosis, and of disease pathogenesis. Powerful tools for studying phosphorylation dynamics across a large number of phosphorylation sites (e.g., >10,000) are extremely valuable in identifying novel pathways, new drug targets and/or mechanistic biomarkers for the disease of interest. The last decade witnessed an explosion of phosphoproteomics applications. We anticipate that the impact will only accelerate now that current phosphoproteomics workflows have been widely disseminated to many different laboratories around the world, based on standard enrichment protocols such as IMAC and MOAC and commercially available instrumentation for separations and MS measurements. We also anticipate that the integration of global quantitative phosphoproteomics for discovery with targeted phosphoproteomics for specific pathways will become a more prevalent approach in future applications.
In spite of the advances recapitulated above and the great potential for applications, the phosphoproteomics field in general still faces a number of challenges for broad application in biological and clinical research. First, most phosphoproteomics workflows require a relatively large amount of protein (e.g., mg or more of protein is required for in-depth profiling). However, many clinical applications have only very limited amounts of sample available. For example, it is difficult to obtain even ~100 μg of protein from laser-capture microdissection (LCM) samples or pancreatic islet samples. Future advances in microscale phosphoproteomics will be important to enable this area of clinical applications. Second, although many recent studies have reported very impressive coverage of the phosphoproteome (e.g., >30,000 phosphosites) using extensive fractionation [61,62,122], these workflows offer low throughput for sample analyses, which is not suitable for large-scale profiling of clinical samples. Further advances in MS instrumentation and separations will likely lead to enhanced throughput in the near future. Third, there is also a bioinformatics challenge in identifying functionally important phosphorylation sites and regulatory networks within the extensive coverage of the whole phosphoproteome. In this direction, improvements in bioinformatics tools along with more accurate quantification, especially of the dynamics of phosphorylation stoichiometry [115,136], will enable more effective identification of novel functional sites and regulatory networks. Finally, the targeted MS-based quantification of PTMs, including phosphorylation, is still a relatively new area, which offers significant advantages in terms of quantification accuracy, precision and reproducibility.
We envision that significant efforts will be dedicated to enhancing the sensitivity and multiplexing of targeted quantification of PTMs, to enable broad application of this technology for the accurate quantification of potentially multiple types of PTMs, such as phosphorylation, acetylation, ubiquitination, and redox modifications in specific pathways, thus allowing the study of signaling crosstalk through different PTMs.
While phosphoproteomics has been broadly applied in many biological applications, there are still many hurdles to applying the technologies to specific types of applications, as discussed above. In the next 5 years, we anticipate that phosphoproteomics technologies will become more mature, with further advances in sensitivity, throughput, reproducibility, and accuracy of quantification enabling even broader applications in both normal physiology (e.g., the effect of physical activity) and disease pathogenesis. Specifically, sensitivity for handling microscale samples will potentially be a key technological advance, enabling phosphoproteome profiling of extremely small clinical samples such as LCM-tissue samples. Further advances in multiplexed quantification (20-plex or more) and in overall sample throughput will likely be achieved through the much faster scanning rates of MS instrumentation and advanced separations. Such advances should enable in-depth quantitative profiling of the tissue phosphoproteome in large-scale clinical studies such as those performed for the global proteome. Moreover, we anticipate significant advances in MS-based targeted quantification of phosphorylation and other types of PTMs in terms of sensitivity, multiplexing capacity, and throughput, to enable detailed time course studies or large-scale population studies of specific pathways and signaling networks. With these advances, phosphoproteomics studies will become more integrative for both discovery and functional verification, and will be more routinely applied in broad areas of biomedical research.
This work is partially supported by National Institutes of Health grants UC4 DK104167 (to WJ Qian) and P41 GM103493 (to RD Smith).
Declaration of interest
The authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.
Papers of special note have been highlighted as:
* of interest
** of considerable interest