We have demonstrated the use of PLSR to characterize the relative importance of tyrosine phosphorylation events for cell migration and proliferation in two human mammary epithelial cell lines with varying HER2 expression levels under both EGF and HRG treatment. In addition, we have identified an important subset of molecules from our original large signaling dataset to serve as a network gauge for the prediction of migration and proliferation (). Our results both highlight previously identified elements in the HER2 signaling network and suggest new pathways and targets critically implicated in HER2-mediated signaling and its effect on migration and proliferation.
Scores plot analysis () helped generate global intuition as to how different combinations of ligand and receptor expression activated the phosphotyrosine signaling network. We related these changes back to the original measurements through the use of inner products, generating lists of proteins correlated with any given ligand or receptor transition. Because the lists are derived after applying PLSR, the proteins highlighted have already been identified as important for the description of changes in cellular behavior. This procedure represents an improvement over traditional analysis of large mass spectrometry datasets (usually fold-change analysis) and is, to our knowledge, the first use of an approach based on inner products to extract understanding from PLSR-based biological models. Our lists (–) show that a particular behavior may be controlled through different network signaling strategies depending on cellular input. For instance, when EGF treatment replaces HRG in 24H cells, migration is stimulated through a different set of molecules than those used to elevate migration when HER2 levels are increased.
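The inner-product step described above can be illustrated with a minimal sketch. It assumes each measured site has a loading vector on the model's components and that a ligand or receptor transition is summarized as a direction between two points in scores space; the exact computation used in this work is not restated here, and the function name, site names, and numbers are all hypothetical.

```python
import math

def rank_by_projection(loadings, start_score, end_score, top=3):
    """Rank variables by the inner product of their loading vector with the
    unit vector pointing from start_score to end_score in scores space."""
    direction = [e - s for s, e in zip(start_score, end_score)]
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("start and end scores coincide; no direction defined")
    unit = [d / norm for d in direction]
    projections = {
        name: sum(w * u for w, u in zip(vec, unit))
        for name, vec in loadings.items()
    }
    # Largest positive projection first: most aligned with the transition.
    return sorted(projections.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Hypothetical two-component loadings for four phosphorylation sites.
loadings = {
    "site_A": [0.8, 0.1],
    "site_B": [0.1, 0.9],
    "site_C": [-0.6, 0.2],
    "site_D": [0.5, 0.5],
}
# Hypothetical mean scores for one cell line under two treatments.
hrg_score = [0.0, 0.0]
egf_score = [2.0, 0.0]

ranked = rank_by_projection(loadings, hrg_score, egf_score, top=2)
print(ranked)
```

Sites with large positive projections move with the transition, while large negative projections (here `site_C`) move against it; a real analysis would likely inspect both ends of the ranking.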
The reduction of the mass spectrometry dataset to nine highly informative phosphorylation sites on six proteins suggests elements of network architecture that likely control migration and proliferation, namely endocytosis and signaling through PIP3- and PI3K-mediated pathways. Three of the six highly informative proteins, TfR, SHIP-2, and ACK, are linked to endocytosis [24, 30, 35]. The tight connection between endocytic regulation and the signaling networks governing cell migration and proliferation has been documented, most powerfully in a recent study using RNA interference against the human kinome [42]. The results of this study indicate that more kinases than previously appreciated are involved in endocytosis and, taken together with other recent efforts, implicate endocytosis as a high-level regulator and sensor of cell-signaling networks [42, 43]. Endocytosis can occur via many different mechanisms, principally clathrin-mediated endocytosis and caveolar/raft-mediated endocytosis, with each mechanism regulating different sets of kinases and cell behaviors [42, 43]. TfR endocytosis, rather than EGFR endocytosis, may have been identified as highly informative because EGFR internalization is mediated by both clathrin-mediated endocytosis and caveolar/raft-mediated endocytosis after treatment with high amounts of EGF, whereas TfR is thought to internalize independently of RCE [28]. The dynamic and quantitative resolution in our signaling assay was most likely critical for the capture of endocytic events, as endocytosis strongly regulates both signal duration and intensity. Furthermore, although our assay did not measure spatial distribution, endocytic information may have served as a proxy for it, further explaining its presence in the reduced model. Signaling through PI3K and PIP3 affects both commonly recognized downstream targets, such as protein kinase B, and important distinct pathways such as those containing ERK and p53 [44]. A recent mapping of the complete ErbB signaling network reveals PIP3 and its upstream kinase PI3K as highly informative nodes upon which a large fraction of signaling information converges [45]. Not surprisingly, then, we identify four proteins in our network gauge that interact with or are downstream of PIP3 or PI3K: Shc, SHIP-2, TfR, and SCF38 [22, 37, 46]. Thus, model reduction not only identifies a network gauge, but also suggests salient elements of the signaling network.
The PLSR model's ability to predict levels of proliferation and migration in 24H cells given only data from parental cells indicates that, although signals drastically change as we move from parental to 24H cells, the cell decides upon levels of migration and proliferation according to the same “rules.” These rules are nonintuitive but amount to the calculation of behavior according to the regression equation given by the PLSR model. Identification of conserved algorithms used to control behavior across cell types highlights the potential to predict a priori how changes in signaling will affect cell behavior and gives insight into conserved themes for cellular decision-making processes. Thus, the linear mapping of phospho-proteomic data onto cellular phenotype identified a key set of signals descriptive and predictive of phenotype in breast epithelial cells. It also identified subsets of signals that govern phenotype under either ligand or receptor perturbation, and in that process revealed new hypotheses about HER2-mediated signaling events. Of course, these hypotheses need to be tested through further focused molecular and biochemical work. Nevertheless, the modeling approach we introduce here is a powerful first step toward understanding signaling networks and the behaviors they control.
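The idea of fitting a regression equation on one cell line and applying it unchanged to another can be sketched with a toy example. The code below is a simplified single-response, single-component PLS regression written from scratch, not the model used in this work; the data are synthetic, and the names `fit_pls1_one_component`, `predict`, `parental_X`, and `her2_high_x` are hypothetical.

```python
import math

def fit_pls1_one_component(X, y):
    """Fit a one-component PLS regression for a single response.
    Returns (x_mean, y_mean, coef) defining y = y_mean + (x - x_mean) . coef."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in signal space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0:
        raise ValueError("X and y are uncorrelated; no component can be fit")
    w = [v / norm for v in w]
    # Scores: projection of each sample onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]
    # Inner regression of the response on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    # With a single component the regression coefficients reduce to w * q.
    coef = [wj * q for wj in w]
    return x_mean, y_mean, coef

def predict(model, x):
    """Apply the fitted regression equation to a new signal vector."""
    x_mean, y_mean, coef = model
    return y_mean + sum((xj - mj) * bj for xj, mj, bj in zip(x, x_mean, coef))

# Synthetic "parental" data: three signals driven by one latent activity level,
# with the response following response = 1 + 2 * level.
pattern = [1.0, 2.0, -1.0]
levels = [0.0, 1.0, 2.0, 3.0]
parental_X = [[c * p for p in pattern] for c in levels]
parental_y = [1.0 + 2.0 * c for c in levels]

model = fit_pls1_one_component(parental_X, parental_y)

# Synthetic "high-HER2" sample: much stronger signals, same underlying rule.
her2_high_x = [10.0 * p for p in pattern]
predicted = predict(model, her2_high_x)
print(predicted)
```

In this toy setting the signals in the second sample lie far outside the training range, yet the prediction stays correct because the same linear rule holds; with real data that extrapolation is exactly the assumption under test.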