Pathways linking oncogenic mutations to increased proliferative or migratory capacity are poorly characterized, yet provide potential targets for therapeutic intervention. As tyrosine phosphorylation signaling networks are known to mediate proliferation and migration, and frequently go awry in cancers, a comprehensive understanding of these networks in normal and diseased states is warranted. To this end, recent advances in mass spectrometry, protein microarrays, and computational algorithms provide insight into various aspects of the network, including phosphotyrosine identification, analysis of kinase/phosphatase substrates, and phosphorylation-mediated protein-protein interactions. Here we detail technological advances underlying these systems-level approaches and give examples of their application. By combining multiple approaches, it is now possible to quantify changes in the phosphotyrosine signaling network with various oncogenic mutations, thereby unveiling novel therapeutic targets.
Numerous cellular responses are regulated by reversible phosphorylation of serine, threonine, and tyrosine residues. While tyrosine phosphorylation accounts for less than 1% of the phosphoproteome, it plays a disproportionately large role in disease, as nearly 50% of the 90 human tyrosine kinases are implicated in cancer. Aberrant regulation of tyrosine phosphorylation by gain- or loss-of-function mutations to kinases or phosphatases can generate alterations in cellular signaling that contribute to oncogenic malignancies. In addition to direct effects on the network associated with mutation of tyrosine kinases and phosphatases, the oncogenic potential of other mutations (e.g. NF1, PTEN) is driven, in part, through alteration of tyrosine phosphorylation-mediated signaling networks. Quantitative phosphoproteomic analysis can reveal the individual phosphorylation sites affected by each oncogenic mutation, but linking this information to functional consequence remains challenging, mostly due to our incomplete knowledge of signaling network connectivity and signal transduction through the cell. To gain a more comprehensive understanding of the tyrosine phosphorylation signaling network, the following information is required: 1) identification and quantification of tyrosine phosphorylation sites, including dynamic regulation of these sites under different conditions, 2) identification of kinases and phosphatases responsible for modification of these sites, and 3) analysis of phosphorylation-dependent protein-protein interactions. This review will discuss the most promising recent technological advances in multiple assays that have begun to address each of these aspects, as well as the limitations and most appropriate applications of these assays. The combination of several assays will be required to generate a comprehensive map of phosphotyrosine networks.
Identification of tyrosine phosphorylation sites has traditionally been performed on a protein-by-protein basis, but advancements in mass spectrometry-based approaches have now enabled rapid, unbiased identification and quantification of hundreds of tyrosine phosphorylation sites. Specifically, enrichment based on peptide immunoprecipitation with anti-phosphotyrosine antibodies followed by immobilized metal affinity chromatography (IMAC) has now overcome the principal challenge in identification and quantification of tyrosine phosphorylation: the low level of these signals in a large background of serine/threonine phosphorylation [3,4]. Furthermore, accurate quantification of tyrosine phosphorylation across multiple biological samples is now available by encoding samples with stable isotope labels such as stable isotope labeling with amino acids in cell culture (SILAC) or isobaric tag for relative and absolute quantitation (iTRAQ). As the name implies, SILAC labeling occurs through culturing cells in media containing stable-isotope encoded amino acids which are incorporated into proteins during turnover and cell division. As demonstrated in Figure 1a, cell lysates from different conditions, cultured in different SILAC-labeled media, are mixed, and peptides resulting from proteolytic digestion of the labeled proteins can then be analyzed by liquid chromatography tandem mass spectrometry (LC-MS/MS). Since peptides containing different stable-isotope encoded amino acids differ in mass, quantification of SILAC samples occurs by comparing the peak height (or, more accurately, the chromatographic elution profile) for a given peptide found in each sample. In comparison to SILAC, iTRAQ labeling occurs later in the process, with labeling occurring through nucleophilic reaction of N-termini and lysine ε-amines with the iTRAQ reagent, an N-hydroxy succinimide ester. For iTRAQ, up to 8 different conditions may be labeled, mixed, and analyzed by LC-MS/MS. 
As the iTRAQ reagent is isobaric, a given peptide present in all samples and labeled with each of the 8 isoforms of iTRAQ will still appear as a single peak (with associated isotope envelope) in the full scan mass spectrum (Figure 1b). Quantification occurs from the MS/MS spectrum, by comparing peak areas of the iTRAQ marker ions. Since SILAC labeling occurs during cell culture, quantification errors associated with differential sample processing are minimized. At present the SILAC strategy is limited to simultaneous analysis of three samples, while iTRAQ is capable of analyzing 8 samples in a single analysis. iTRAQ labeling is inherently applicable to a broad range of samples, as labeling occurs post-cell lysis, while SILAC is primarily limited to cell culture, although stable-isotope labeling of a variety of model organisms has recently been reported. Both of these quantification strategies have additional advantages and disadvantages which have been extensively described; in general, either technique provides more accurate quantification than stable isotope label-free methods.
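The reporter-ion quantification step described above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow of any cited study: the function name, the choice of reference channel, and the peak-area values are all hypothetical, and the channel labels simply follow the commonly used 8-plex reporter ion masses. Isotope-impurity correction and normalization across samples, which practical workflows typically include, are omitted.

```python
def reporter_ratios(peak_areas, reference="113"):
    """Divide each reporter-ion peak area by that of the reference channel."""
    ref = peak_areas[reference]
    if ref <= 0:
        raise ValueError("reference channel has no signal")
    return {channel: area / ref for channel, area in peak_areas.items()}

# Hypothetical reporter-ion peak areas from the MS/MS spectrum of one
# phosphopeptide, one entry per iTRAQ channel (i.e. per biological condition).
spectrum = {"113": 1000.0, "114": 1500.0, "115": 2000.0, "116": 500.0,
            "117": 1000.0, "118": 3000.0, "119": 250.0, "121": 4000.0}

ratios = reporter_ratios(spectrum)
```

Each ratio expresses the abundance of the phosphopeptide in one condition relative to the reference condition, which is the quantity compared across samples.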
Even with these improvements in methodology, one of the fundamental limitations of all mass spectrometric analyses is the accuracy of peptide identification and post-translational modification site assignments. Despite advancements in searching algorithms, Chen and colleagues have demonstrated that there is no predetermined cut-off score or p value that can distinguish correct from incorrect peptide identification, leading to the conclusion that all tandem mass spectra should be manually validated to decrease false positive identifications.
To move from quantitative catalogs of phosphorylation sites to functional association, incorporation of phenotypic measurements has led to correlation of phosphorylation events with downstream cellular responses [8,9]. This integrative approach has been used to define differential tyrosine phosphorylation signaling downstream of ErbB receptors upon epidermal growth factor (EGF) or heregulin (HRG) stimulation in ErbB2-overexpressing human mammary epithelial cells (HMEC). In this study, computational analysis was used to correlate changes in tyrosine phosphorylation with quantitative cellular responses of cell proliferation and migration, providing a rank-ordered list of the phosphorylation sites most strongly correlated with each phenotype and thereby enabling follow-on validation studies for novel associations between phosphorylation and phenotype. In a separate study, quantitative analysis of tyrosine phosphorylation in glioblastoma cells expressing titrated levels of the EGFRvIII mutant demonstrated altered utilization of the EGFR signaling pathway by the EGFRvIII mutant, leading to the identification of cross-talk between EGFRvIII and the c-Met receptor tyrosine kinase. Combinatorial inhibition of these receptors led to decreased cell viability, thus illustrating the utility of combining phosphoproteomic and phenotypic assays in the identification of novel cancer therapeutics [12,13].
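To make the idea of a rank-ordered list concrete, the sketch below scores each phosphorylation site by the Pearson correlation between its quantified level and a phenotypic readout across conditions, then sorts by absolute correlation. Pearson correlation is used here only as a simple stand-in; it is not claimed to be the computational analysis used in the studies above. The site names and all values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def rank_sites(site_levels, phenotype):
    """Return (site, r) pairs sorted by decreasing absolute correlation."""
    scored = [(site, pearson(levels, phenotype))
              for site, levels in site_levels.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical relative phosphorylation levels across four conditions.
site_levels = {
    "site_A": [2.0, 4.0, 6.0, 8.0],
    "site_B": [1.0, 3.0, 2.0, 4.0],
    "site_C": [3.0, 1.0, 4.0, 2.0],
}
# Hypothetical phenotypic measurement (e.g. migration) in the same conditions.
phenotype = [1.0, 2.0, 3.0, 4.0]

ranked = rank_sites(site_levels, phenotype)
```

Sites at the top of such a list are candidates for follow-on validation; a strong correlation by itself does not establish a functional link.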
Innovations in microarray technology have also enabled quantitative screening of tyrosine phosphorylation. In antibody microarrays, an array is imprinted with antibodies and incubated with cell lysate, and bound proteins are detected and quantified via a secondary antibody, mimicking a sandwich ELISA. Antibody microarrays have been used to analyze protein expression and tyrosine phosphorylation of EGFR and ErbB2 signaling networks in breast cancer cell lines following EGF stimulation. Although the throughput is high and the sample requirements are minimal, assays are performed with antibodies recognizing a given phosphorylation site and are therefore limited to analysis of known tyrosine phosphorylation sites. Reverse-phase or lysate microarrays utilize a different format in which cell lysates are arrayed on a microarray and probed with pan- or phosphorylation-specific antibodies. This approach has been applied to multiple biological samples ranging from cultured cell lysates to tumor tissue lysates. For both of these microarray formats, the applicability to a wide range of signaling networks and the quality of the data are limited by antibody specificity. However, the accuracy of the antibody microarray tends to be greater than that of the reverse-phase microarray, and it is less reliant on highly specific phosphorylation-specific antibodies due to the combination of two antibodies recognizing different epitopes on a given protein. Lack of specificity of phosphorylation-specific antibodies is a major concern, as highlighted in a recent study of the ErbB signaling network in which phosphorylation-specific antibodies were tested for their ability to provide consistent data when used in the reverse-phase microarray or in a 1-dimensional immunoblot against the same sample. Of the 61 antibodies tested, only 4 were consistent between the two formats, and only 12 antibodies gave similar trends when microarrays were compared to immunoblots.
This poor specificity can severely limit the accurate interpretation of reverse-phase phosphorylation microarrays: although the general trends tend to be correct, in many cases non-specific binding generates incorrect data, both qualitatively and quantitatively. As with gene expression microarrays, alternate assays (e.g. ELISA, mass spectrometry) are recommended to validate the results from reverse-phase phosphorylation microarrays. On a positive note, with improved antibodies targeting more phosphorylation sites in the network, microarray-based analysis of signaling networks may eventually provide a higher-throughput, lower-sample consumption alternative to MS-based phosphoproteomics for evaluating phosphorylation changes caused by oncogenic mutations.
The equilibrium of tyrosine phosphorylation is tightly regulated by the antagonistic actions of kinases and phosphatases, and many oncogenic mutations disrupt this critical balance [18,19]. In many cases, it is not obvious how increased signaling at one node in the network affects the other nodes, or how this translates to oncogenesis. Identification of substrates and enzymatic activities of kinases and phosphatases under different biological conditions would provide insight into this process, but this identification has proven challenging for several reasons, including low-abundance substrates, complex feedback loops in the network, and overlapping substrate specificities.
Conventional in vitro kinase assays using recombinant proteins to identify kinase substrates are not only arduous, as each kinase-substrate pair must be evaluated individually, but may also be physiologically irrelevant. To bypass this process and identify endogenous substrates in an unbiased manner, Shokat and colleagues engineered analog-specific (AS) kinases containing ‘gate-keeper’ mutations to expand the ATP-binding pocket, thereby enabling the utilization of bio-orthogonal ATP analogs (Figure 2a). When combined with an ATPγS analog, AS-kinases enable direct tagging of substrates with thiophosphate. Purification of the substrates via a catch-and-release strategy followed by mass spectrometry allows for substrate and phosphorylation site identification in an unbiased, highly sensitive, and high-throughput manner [21,22]. Several AS-kinase substrate screens have been performed with yeast and murine cells, and a recent screen using AS-Cdk1, a serine/threonine kinase involved in mitotic regulation, has been performed in human cells [21,23–25]. In this latter study, nearly one third of the phosphorylation sites identified did not occur in a Cdk consensus sequence, indicating either that Cdk1 can phosphorylate non-consensus motif substrates, or that non-AS-kinases in the cell lysate can use the ATPγS analog. We have recently shown that wild-type kinases can utilize the ATPγS analog; by coupling stable isotope-tagging to AS-kinase substrate determination, it is possible to quantify off-target utilization of the ATPγS analog and determine true substrates for the selected kinase [Carlson et al., unpublished]. In the future, this combination of stable-isotope labeling, AS-kinases, and ATPγS analogs will be used to discover unknown substrates of tyrosine kinases that are overexpressed or have increased activity in cancer, such as c-Src or Abl.
These results will be critical in developing a better understanding of the connectivity and topology of these highly complex signaling networks.
Numerous computational techniques have been developed for predicting kinase-substrate relationships. Among these are methods such as Scansite, which primarily rely on consensus sequence motifs recognized by the active site of the kinase, and NetworKIN, which also incorporates co-localization, co-expression, and scaffolding by using the probabilistic protein association network, STRING [26,27]. By integrating contextual information and consensus motif sequence, the prediction accuracy of NetworKIN was double that of consensus motif sequence alone. NetworKIN has also been shown to correctly predict known and novel phosphorylation sites involved in the DNA-damage response network. Although this example is very promising, results from purely computational techniques such as NetworKIN are often incorrect due to inaccurate data sources: many of the automated curation algorithms incorrectly identify protein-protein interactions, and consensus motifs have typically been generated from in vitro kinase assays with peptide libraries, conditions which are likely not replicated in cells. Therefore, it is important to note that although these computational methods can make reliable predictions, these predictions need to be validated using conventional techniques such as biochemical assays and in vivo kinase-substrate studies.
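The simplest form of consensus-motif prediction mentioned above can be illustrated as a pattern search over a protein sequence. The sketch below is a deliberate simplification: the motif (a tyrosine followed by any two residues and a proline), the sequence, and the function name are hypothetical and chosen only for illustration, and tools such as Scansite use graded scoring rather than an all-or-nothing pattern match.

```python
import re

def scan_motif(sequence, motif):
    """Return 1-based positions where the motif starts in the sequence.

    The motif is a regular expression whose first character is assumed to
    be the phospho-acceptor residue. A lookahead is used so that
    overlapping matches are also reported.
    """
    pattern = re.compile(f"(?=({motif}))")
    return [match.start() + 1 for match in pattern.finditer(sequence)]

# Hypothetical protein fragment and hypothetical consensus motif.
sequence = "MAYEVPSKYTAPLLY"
candidate_sites = scan_motif(sequence, "Y..P")
```

A match indicates only that the local sequence is compatible with the motif; as noted above, such predictions still require experimental validation.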
Most phosphatase substrates have been identified using substrate-trapping approaches with catalytically inactive protein tyrosine phosphatase (PTP) mutants that form stable interactions with their substrates. Similar to in vitro kinase assays, the traditional phosphatase substrate-trapping assay is limited to one phosphatase-substrate pair at a time, as detection is typically based on immunoblotting with commercially available antibodies. Unfortunately, phosphatases have not been as extensively studied as kinases, and the knowledge of phosphatase substrates lags behind that of kinases. Nevertheless, techniques to analyze phosphatase substrates in a comprehensive and systematic manner are emerging [18,30]. Recently, mass spectrometry has been applied to the substrate-trapping technique, allowing for an unbiased analysis of PTP substrates (Figure 2b) [31,32]. Microarrays have also been used to characterize PTP substrates. In a recent study, phosphatase substrate preference was analyzed by comparing two PTP substrate-trapping mutants simultaneously: each PTP mutant was labeled with a distinct fluorescent tag and used in a microarray imprinted with known phosphotyrosine peptides. Although dissociation constants were measured and used to evaluate substrate affinity, the in vitro nature of this approach may not be reflective of in vivo selectivity and enzyme-substrate kinetics.
Tyrosine phosphorylation can mediate cellular signaling by affecting protein activity, stability, and localization, and by modulating protein-protein interactions through Src homology 2 (SH2) and phosphotyrosine binding (PTB) domains. Although many protein-protein interactions have been uncovered by co-immunoprecipitation experiments, this method does not work well for transient or weak interactions, and does not provide quantitative information regarding affinities of these interactions. Protein microarrays provide a high-throughput solution to identification of phosphotyrosine-mediated protein-protein interactions, but can be severely limited by the non-physiological context of the assay. In one of the few quantitative applications of this technique, protein microarrays were used to identify SH2 and PTB binding partners of the phosphorylated tyrosines of the ErbB family. In this study, protein microarrays comprising nearly every SH2 and PTB domain were probed with fluorescently labeled, tyrosine-phosphorylated peptides from the ErbB family. By varying the concentration of each peptide and quantifying the fraction bound to the microarray, saturation binding curves and dissociation constants were generated for each interaction. This analysis led to the intriguing finding that overexpression increases the promiscuity of protein-protein interactions for EGFR and ErbB2, but not ErbB3, and that this increase in binding may be related to the oncogenic potential of these overexpressed proteins. Thus far SH2/PTB protein microarrays have been used to evaluate tyrosine phosphorylation sites on receptor tyrosine kinases [35–38], but this technique can easily be applied to identify and evaluate binding partners of other tyrosine phosphorylation sites.
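As a rough illustration of how a dissociation constant can be read from a saturation binding curve, the sketch below assumes a simple one-site model in which the fraction bound is f = [L] / (Kd + [L]), with [L] the peptide concentration and f already normalized to saturation. Both the model and the linearized least-squares estimator (from Kd · f = [L] · (1 − f)) are assumptions made for this example, not the fitting procedure of the cited study, and the concentrations are hypothetical. For noisy measurements a nonlinear fit is generally preferable.

```python
def fraction_bound(conc, kd):
    """One-site saturation binding: fraction of domain occupied by peptide."""
    return conc / (kd + conc)

def estimate_kd(concs, fractions):
    """Least-squares Kd estimate from the linearized form kd * f = c * (1 - f)."""
    numerator = sum(f * c * (1.0 - f) for c, f in zip(concs, fractions))
    denominator = sum(f * f for f in fractions)
    return numerator / denominator

# Hypothetical peptide concentrations (arbitrary units) and noise-free
# fractions bound simulated with an assumed Kd of 2.0.
concs = [0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
true_kd = 2.0
fractions = [fraction_bound(c, true_kd) for c in concs]

kd_hat = estimate_kd(concs, fractions)
```

With noise-free input the estimator recovers the simulated Kd; repeating the fit for every peptide-domain pair on the array yields an affinity for each interaction.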
SH2 domain binding assays provide another platform to profile the global tyrosine phosphorylation state and establish network topology. In one format, SH2 domains have been used to probe far-western blots, identifying protein molecular weights capable of binding the probe, and providing signatures of changes in tyrosine phosphorylation across biological samples. Although this assay is not capable of directly determining the site of tyrosine phosphorylation bound to the SH2 domain, this information may be accessible by including data obtained by quantitative mass spectrometry-based tyrosine phosphoproteomics of the same samples. A high-throughput, quantitative, reverse-phase version of the SH2 binding assay, the Rosette assay, features immobilized protein samples on nitrocellulose in a 96-well plate. In this assay, each well can be tested with a recombinant SH2 domain probe; detection occurs with chemiluminescence and densitometric quantification. Rosette-assay analysis of whole cell lysates from v-Abl, v-Src, and v-Fps transformed NIH-3T3 cells generated distinctive quantitative patterns of SH2 domain binding. These patterns might correlate with therapeutic sensitivity, but additional validation needs to be performed.
As oncogenic mutations frequently result in alterations in tyrosine phosphorylation, defining and understanding tyrosine phosphorylation networks is paramount. To this end, recent methodological advancements in mass spectrometry enable the unbiased identification and quantification of tyrosine phosphorylation. Future directions in the field will apply multiple analytical techniques, several of which have been described here, to generate complementary data describing additional aspects of the signaling network, including topology and connectivity (Figure 3). One crucial aspect of understanding the biological impact of network activation, heretofore largely disregarded, is to combine the data from these analyses with phenotypic data describing the biological system under a variety of conditions to enable the extraction of key regulatory nodes in the network controlling cellular biological activity. The more comprehensive understanding of tyrosine phosphorylation networks that emerges from these combined analyses will help to elucidate how oncogenic mutations perturb the network and lead to aberrant biological response.
Application of additional molecular and cellular quantitative analytical techniques to a given biological system, while providing additional insight into the network, incurs significant data validation and data integration challenges. Many of the above techniques are plagued by false positive (and false negative) results due to the in vitro nature of the technique and/or non-specificity of reagents and detection. Perhaps the most effective solution to these data quality issues involves confirmation of results from one technique through analysis with a second technique. For instance, data gathered on reverse-phase microarrays could be confirmed by phospho-MS, and phospho-MS results could be confirmed by quantitative ELISA. Integration of disparate data types at the cellular and molecular level requires computational algorithms such as partial least-squares regression (PLSR) or machine learning to identify molecular signatures that are most highly correlated with selected cellular measurements. Several examples of PLSR-based integration of signaling and phenotypic data have been published over the past five years. It is anticipated that multiple other computational techniques will be applied to this task in the near future, especially given the complexity of the proteomic, transcriptional, and genomic data that is currently being generated to characterize the molecular characteristics of given systems.
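To show what PLSR-based integration of signaling and phenotypic data involves at its core, the following sketch computes the first latent component of a single-response PLS model (PLS1) in plain Python. It is a minimal sketch under stated assumptions: only one component is extracted, there is no scaling or cross-validation, and the data matrix (rows are conditions, columns are phosphorylation sites) and phenotype values are hypothetical. Published analyses typically rely on established statistical software and multiple components.

```python
from math import sqrt

def pls1_first_component(X, y):
    """Return (weights, coefficients) for a one-component PLS1 model.

    X: list of rows (conditions) by columns (phosphorylation sites).
    y: one phenotypic measurement per condition.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Mean-center the signaling data and the phenotype.
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [value - y_mean for value in y]
    # Weight vector: unit direction in site space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sqrt(sum(value * value for value in w))
    w = [value / norm for value in w]
    # Scores: projection of each condition onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress the centered phenotype on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return w, [wj * q for wj in w]

# Hypothetical data: four conditions, three phosphorylation sites.
X = [[1.0, 1.0, 3.0],
     [2.0, 3.0, 1.0],
     [3.0, 2.0, 4.0],
     [4.0, 4.0, 2.0]]
y = [1.0, 2.0, 3.0, 4.0]  # hypothetical phenotype, e.g. proliferation

weights, coefficients = pls1_first_component(X, y)
```

In this toy data set the first site covaries most strongly with the phenotype and therefore receives the largest weight, while the third site, which is uncorrelated with the phenotype, receives a weight of zero. Weights of this kind are one way to nominate the molecular signatures most associated with a cellular measurement.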
The analytical and computational techniques described here should be applicable to a wide variety of biological systems and can be used in the future to elucidate the framework of other, non-phosphorylation-mediated signaling networks, including those based on acetylation and ubiquitination. Consequently, a combined systems approach to understanding multiple levels of signal transduction can lead to a much broader understanding of the interaction of regulatory networks. Eventually, these data will lead to the identification of novel cancer therapeutic targets and therapeutics that maximize efficacy while minimizing deleterious side effects.
We would like to thank members of the Forest White lab for helpful discussions regarding this review. This work was supported by National Cancer Institute Grant CA118705.