Characterizing the extent and logic of signaling networks is essential to understanding specificity in such physiological and pathophysiological contexts as cell fate decisions and mechanisms of oncogenesis and resistance to chemotherapy. Cell-based RNA interference (RNAi) screens enable the inference of large numbers of genes that regulate signaling pathways, but these screens cannot provide network structure directly. We describe an integrated network around the canonical receptor tyrosine kinase (RTK)–Ras–extracellular signal–regulated kinase (ERK) signaling pathway, generated by combining parallel genome-wide RNAi screens with protein-protein interaction (PPI) mapping by tandem affinity purification–mass spectrometry. We found that only a small fraction of the total number of PPI or RNAi screen hits was isolated under all conditions tested and that most of these represented the known canonical pathway components, suggesting that much of the core canonical ERK pathway is known. Because most of the newly identified regulators are likely cell type– and RTK-specific, our analysis provides a resource for understanding how output through this clinically relevant pathway is regulated in different contexts. We report in vivo roles for several of the previously unknown regulators, including CG10289 and PpV, the Drosophila orthologs of two components of the serine/threonine–protein phosphatase 6 complex; the Drosophila ortholog of TepIV, a glycophosphatidylinositol-linked protein mutated in human cancers; CG6453, a noncatalytic subunit of glucosidase II; and Rtf1, a histone methyltransferase.
Intracellular signaling mediated by growth factor–stimulated receptor tyrosine kinases (RTKs), such as those activated by insulin or epidermal growth factor (EGF), acting through Ras to extracellular signal–regulated kinases (ERKs) is required for metazoan development and physiology. Mutations in genes encoding components of this conserved signaling network, the RTK-Ras-ERK pathway, have been repeatedly identified as drivers in multiple malignancies. Understanding the hierarchical relationships among pathway regulators can have profound clinical significance, as exemplified by Kras genotype in determining responsiveness to inhibitors of the EGF receptor (EGFR) (1).
A complete understanding of cell signaling through this pathway requires identification of (i) all components of the system, (ii) the quantitative contribution of these components to various signaling outputs, and (iii) the hierarchical relationships, including physical connections, between these components. Systematic functional genetic approaches, such as genome-wide RNA interference (RNAi) screening used to identify previously unknown signaling genes, are inferential in that they do not distinguish between direct and indirect effects. Large-scale protein-protein interaction (PPI) mapping complements genetic studies by revealing physical associations, but fails to reveal the function of interacting proteins or the functional consequences of the interactions. Separate such “systems-level” functional genomic and interactome studies in the past few years have revealed that signaling is likely propagated within large networks of hundreds of proteins and thus have challenged linear cascade models derived from traditional reductive approaches (2). However, each systematic screening approach performed separately suffers from inherent technical limitations of the methods used, leading to false negatives and positives, restricting the comprehensiveness of pathway regulator discovery.
We have previously described an antibody-based, genome-wide RNAi screen assay for ERK activity in Drosophila cells after insulin stimulus (3). This assay relies on an antibody that recognizes phosphorylated Drosophila ERK (dpERK). We showed specific examples from secondary screens of a small subset of genes that were required downstream of insulin receptor (InR), but not of the EGFR, for activation of ERK in particular cell types, suggesting that many potential components of this pathway may have been missed by a single primary screen (3). Although multiple RTKs can signal through Ras to ERK, their output is context-dependent despite the apparent similarity in signal propagation through the core pathway (4–6).
A combined systematic approach using complementary functional genomic and interactome technologies would be more likely to uncover direct regulators and more completely describe the landscape of a signaling pathway (7). We performed multiple genome-wide RNAi screens in parallel to generating a tandem affinity purification–mass spectrometry (TAP-MS)–based PPI network surrounding the canonical pathway components of the RTK-Ras-ERK signaling pathway, using data from cells responding to insulin or EGF. Although we identified several previously unknown pathway regulators, the functional genomic and interactome data sets suggest that much of the core canonical pathway is complete.
To comprehensively discover genes that regulate ERK signaling output and to identify other specificity-generating proteins, we conducted four systematic, cell-based RNAi screens for regulators of EGF-stimulated ERK activation in two stable Drosophila cell lines expressing EGFR, S2R+mtDER, and Kc167mtDER (fig. S1, A and B). These four screens combined with our two previously published screens performed with S2R+ cells that were unstimulated (baseline) or stimulated with insulin (4) interrogated >20,000 double-stranded RNAs (dsRNAs) targeting roughly 14,000 Drosophila genes. We compared all six primary screens, divided into three groups by stimulus (insulin, EGF) and cell line (S2R+, Kc) (Fig. 1A). These screens uncovered 2677 annotated genes, in addition to 756 unannotated predicted genes (Fig. 1A and table S1). As expected, these genes include most of the known canonical pathway-associated genes (table S5). We identified both EIF4AIII and mago (table S1) as positive regulators in our RNAi screen in Kc cells and these two genes were also found in an RNAi screen for regulators of the mitogen-activated protein kinase (MAPK) pathway in Drosophila S2 cells (8).
Gene Ontology (GO) annotation of the hits from the RNAi screens showed expected enrichment for processes controlled by RTK-Ras-ERK signaling, including tracheal development, photoreceptor differentiation, imaginal disc morphogenesis, and hematopoiesis; genes controlling mitosis, neuronal differentiation, cell motility, female gamete generation, and SUMO (small ubiquitin-like modifier protein) binding were also enriched in the hits from the RNAi screens (table S2). The hits from the RNAi screens were also significantly enriched for proteins conserved in humans and implicated in a human disease (P < 3.5 × 10⁻⁹ and 9.8 × 10⁻⁴, respectively), implying that many of the newly identified regulators are also involved in mammalian MAPK signaling. Human orthologs had stronger RNAi scores on average (P < 0.001), suggesting that genes with more central roles in the pathway have been conserved.
We observed distinct subsets of genes isolated in the primary RNAi screens under specific cell or RTK-stimulus contexts (fig. S1C). We were also able to identify genes that were common to both cell types under both stimulus conditions (Fig. 1B). These genes were quantitatively stronger regulators than the remaining hits (fig. S1D). Our systematic screens permitted global observation of the processes regulating specificity; compared to all hits from the RNAi screens, those identified in the insulin screen were enriched for cytoskeletal genes and cell cycle processes (P < 1.3 × 10⁻⁶ and 0.03, respectively), whereas transcriptional and peptidase activities were enriched in the EGF screen in Kc cells (P < 4 × 10⁻⁴ and 0.02, respectively).
Distinct subsets of genes were specific to insulin or EGF signaling in either cell type or were regulated by insulin or EGF in both cell types (table S3). Signaling downstream of the InR activates both ERK and Akt signaling pathways; we confirmed that genes encoding components of the Akt-Tor pathway, including InR itself, PTEN, Akt, Tor, and gig (Tsc2), were insulin-specific regulators of ERK. This insulin-specific regulation of ERK and the Akt-Tor pathway is likely mediated through feedback from S6 kinase to InR (9). (Throughout the text, where different from the Drosophila gene or protein names, mammalian common names or abbreviations of the proteins are shown after the names or abbreviations for these components in Drosophila.) Other genes specifically associated with InR signaling included PRL-1, encoding a phosphatase that can transform cells (10); the kinase-encoding gene Tak1; and CG9468 and CG5346, which are genes predicted to encode proteins with α-mannosidase and iron oxygenase activities, respectively. Genes specifically associated with EGF signaling included EGFR itself, and those encoding several components potentially involved in receptor localization, or down-regulation, or both, including Snap, encoding a protein required for vesicular transport; CG7324, encoding a Rab guanosine triphosphatase (GTPase)–activating protein; and RSG7, encoding a putative G protein [heterotrimeric guanosine triphosphate (GTP)–binding protein] γ subunit that also interacts with Snapin, a component of the SNARE complex (11). Because these genes were associated with EGF signaling but not insulin signaling, this suggests that these are required for EGFR but not InR localization.
Many of the previously unknown regulators identified in the RNAi screens may act indirectly through general cellular processes or through multiple levels of transcriptional feedback. Furthermore, RNAi screens suffer from off-target effects even after computational filtering and use of multiple RNAi reagents for each gene (12). PPI mapping provides an orthogonal representation of network regulators compared to functional genomic approaches because it reveals physical associations. Although large-scale yeast two-hybrid (Y2H) screening can reveal potential PPIs with high accuracy (13) and has been performed on a large scale for MAPK-related proteins (14), Y2H cannot detect interactions that may rely on regulatory posttranslational modifications that occur in endogenous signaling contexts. Large-scale TAP-MS has been used to discover PPIs, most comprehensively in yeast (15–17) and in human cells in “pathway-oriented” mapping of tumor necrosis factor-α (TNFα) signaling (18), Wnt signaling (3), and autophagy (19).
We used TAP-MS to capture the dynamic “mini-interactome” surrounding 15 well-recognized, conserved canonical components of the RTK-Ras-ERK pathway: InR, PDGF (platelet-derived growth factor)– and VEGF (vascular endothelial cell growth factor) receptor-related (PVR), EGFR, the adaptors Drk (Grb2) and Dos (Gab), the GTPase Ras85D, the Ras GTP exchange factor Sos, the cytoplasmic tyrosine kinase Src42A, the GTPase-activating protein Gap1, the phosphatase Csw (Shp2), the MAPK kinase kinase Phl (Raf), the MAPK kinase Dsor1 [mitogen-activated or extracellular signal-regulated protein kinase kinase (MEK)], the scaffolds Ksr and Cnk, and the MAPK Rl (ERK). These 15 proteins served as the “baits” in the affinity purification assay. The proteins and a control were expressed in S2R+ cells with TAP vectors (20) and lysates prepared at baseline (unstimulated cells) or after stimulation with insulin or EGF. Two or more biological replicates were performed for each bait and condition. Interacting proteins were determined by TAP and microcapillary liquid chromatography-tandem MS (LC-MS/MS). A total of 54,339 peptides were identified representing 12,208 proteins, encompassing an unfiltered network of 5009 interactions among 1188 individual proteins (table S4). Among the most abundant proteins identified in replicate pull-downs and absent in control preparations were other known RTK-Ras-ERK canonical proteins. A network based on the observed interactions among these canonical proteins recapitulates many of the known RTK-Ras-ERK signaling pathway interactions (Fig. 1C), validating the sensitivity of our TAP-MS approach in robustly identifying pathway interactors.
Raw TAP-MS data often contain sticky proteins found in control preparations. To provide a ranked list of the most specific pathway interactors by filtering out these sticky proteins, we applied the Significance Analysis of Interactome (SAINT) method to our PPI data set (15). Using a SAINT cutoff of 0.83 and false discovery rate (FDR) of 7.2%, we generated a filtered PPI network of 386 interactions among 249 proteins surrounding the canonical components of the RTK-Ras-ERK signaling pathway (Fig. 2 and table S4). We evaluated our PPI network by comparing it with various literature-derived physical interaction networks (fig. S2, A and B). For this network comparison, we generated a master physical interaction network (MasterNet) composed of five different types of networks (see Materials and Methods). Our filtered network is significantly overrepresented in the MasterNet, with 29% overlap, compared to 17% for the excluded proteins; the canonical network has a 97% overlap with MasterNet. SAINT scores were highly correlated with appearance in literature data sets, implying that the PPI network as filtered by SAINT represents high-confidence pathway interactors (fig. S2C). Of the literature-derived networks, appearance in the Drosophila binary PPI data set most closely correlated with higher SAINT scores (fig. S2D).
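The filtering and literature-comparison steps described above can be illustrated with a minimal sketch. It assumes each interaction is available as a (bait, prey, SAINT score) tuple and that overlap is counted per undirected interaction pair; the text does not specify whether overlap with MasterNet was tallied per interaction or per protein, so that choice is an assumption, as are the function names and the placeholder identifiers in the example. Only the 0.83 cutoff is taken from the text.

```python
SAINT_CUTOFF = 0.83  # cutoff stated in the text


def filter_interactions(scored, cutoff=SAINT_CUTOFF):
    """Keep (bait, prey) pairs whose SAINT score meets the cutoff."""
    return [(bait, prey) for bait, prey, score in scored if score >= cutoff]


def overlap_fraction(pairs, reference):
    """Fraction of pairs that also appear in a reference network.

    Pairs are treated as undirected, so (a, b) matches (b, a).
    """
    if not pairs:
        return 0.0
    ref = {frozenset(p) for p in reference}
    shared = sum(1 for p in pairs if frozenset(p) in ref)
    return shared / len(pairs)


# Placeholder data for illustration only; not values from the study.
scored = [
    ("baitA", "prey1", 0.95),
    ("baitA", "prey2", 0.90),
    ("baitB", "prey3", 0.85),
    ("baitB", "prey4", 0.40),  # below cutoff, removed by the filter
]
reference_network = [("prey1", "baitA")]

kept = filter_interactions(scored)
overlap = overlap_fraction(kept, reference_network)
```

Applying the same `overlap_fraction` to the excluded pairs would give the comparison value for the proteins removed by the filter.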
Using traditional coimmunoprecipitation techniques and quantitative Western blotting, we corroborated selected previously unknown interactions (fig. S3). Among these, we verified an ERK interaction with the cyclin-dependent kinase cdc2c (CDK2), as reported for mammalian cells (21), implying that ERK can directly regulate the cell cycle through this interaction. Many of the proteins that interacted with multiple RTKs were adaptors (table S4). A notable exception was CG10916, which was one of the few common interactors of multiple RTKs (InR, PVR, and EGFR) that was not an adaptor (fig. S3, A and B). Thus, individual RTKs likely recruit distinct complexes during signaling and may compete for a common set of canonical interactors. As a negative regulator of ERK activation and a predicted RING domain–containing protein, CG10916 may be involved in receptor degradation or down-regulation of RTKs. We also found that some interactions below our conservative SAINT threshold of 0.83 could be verified by coimmunoprecipitation (fig. S3C), suggesting that the true size of the network may be larger than the cutoff we chose.
On the basis of GO classifications, we found that the filtered PPI network was enriched in genes encoding regulators of Ras signaling, signaling by the RTKs Sevenless and Torso, and R7 photoreceptor differentiation, all processes known to involve ERK activation, and also those encoding proteins associated with mitosis, the cytoskeleton, axis specification, oogenesis, kinase activity, and SUMO binding (table S2). Compared to the total filtered network, proteins interacting with Drk (Grb2) were enriched for GO terms associated with epithelium development and cell fate (P < 0.02 for both), but otherwise individual bait networks were representative of the entire network. As with the RNAi hits, our filtered PPI network was enriched for genes conserved in humans and in human diseases (P < 5.4 × 10⁻¹⁶ and P < 4.6 × 10⁻³, respectively).
Feedback regulation is a mechanism of ensuring pathway robustness (22). Several studies have examined the transcriptional responses to RTK-Ras-ERK signaling stimulation or perturbation in vivo (23–25). We culled genes in these studies responsive to pathway modulation and overlaid them with our PPI data set. We found that the expression of 25% of the genes for these interactors was changed in response to pathway modulation, a significantly enriched proportion (P < 2.4 × 10⁻⁹; table S4 and Fig. 3A). These genes are strong candidates encoding mediators of feedback regulation of RTK-Ras-ERK signaling. Among these were several ribosomal genes (for example, RpL6, RpL23A, RpL27, RpS18, and RpS30) that exhibited reduced expression in response to pathway activation (Fig. 3A) and that were isolated as negative regulators in the RNAi screens, implying feedback amplification through inhibition of translational repression. These genes also had negatively correlated gene expression with their canonical pathway interactors in published gene expression studies (Fig. 3B).
During assembly of the RTK-Ras-ERK interactome, we identified complexes under baseline, insulin-stimulated, and EGF-stimulated conditions to find pathway interactors and to study the dynamics of complex assembly and disassembly, using quantitative label-free proteomics (26). Previous systematic evaluation of dynamics in interactomes has been limited to individual proteins; for example, one study identified dynamic interactors of ERK (27). Using the SAINT scores at baseline and stimulated conditions, we assembled interactomes of proteins with a high probability of a dynamic interaction with the canonical baits in response to insulin (Fig. 4A) or EGF stimulation (Fig. 4B). We observed several expected interaction dynamics, including the association of subunits of phosphatidylinositol 3-kinase (PI3K) with InR after insulin stimulus, which likely occurs through the adaptor Chico (IRS) and association of the adaptor Drk (Grb2) with EGFR after EGF stimulus (table S4). Our global analysis showed that proteins that interacted with the adaptor Dos were more likely to associate than dissociate under insulin stimulus, whereas those that interacted with Drk (Grb2) did not significantly change on the basis of SAINT probabilities. EGFR interactors dissociated when cells were stimulated with insulin. Upon EGF stimulus, interactors with Cnk, Dsor1, Gap1, and Ksr all preferentially dissociated, whereas Phl (Raf) interactors associated (Fig. 4B).
We overlaid the functional genomic data from our six systematic RNAi screens for ERK activation with the TAP-MS network structural data (Fig. 2). Nearly half of the proteins (119) of the filtered PPI network were encoded by genes that scored in the RNAi screens, which represented a significant enrichment over the genome for regulators of this pathway (19%; P < 7 × 10⁻²⁵) and was an overlap higher than achieved with a more directed RNAi screening of TNFα pathway interactors (18). Thirty-two percent (38 of 119) of the interacting proteins were isolated from RNAi screens in both cell types and after both stimuli (Fig. 4C), whereas if all of the hits from all of the RNAi screens were counted, then only 8% were isolated from both cell types and stimuli.
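An enrichment of this kind can be checked with a one-sided hypergeometric test, sketched below. The statistical test used for the reported P value is not stated in this section, so the hypergeometric model is an assumption. The population size (14,000, from the "roughly 14,000" genes targeted) and the number of genome-wide hits (2660, taken as 19% of 14,000) are approximations chosen for illustration. The result should therefore not be expected to reproduce the reported value exactly.

```python
from fractions import Fraction
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """Return P(X >= observed) for a hypergeometric draw without replacement.

    population: total genes, successes: genes scoring in the RNAi screens,
    draws: proteins in the filtered PPI network, observed: network proteins
    that also scored in the RNAi screens.
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    # Exact integer arithmetic, converted to float only at the end.
    return float(Fraction(tail, total))


# Approximate inputs (assumptions, see lead-in): 14,000 genes, 19% hit rate,
# 249 network proteins of which 119 scored in the RNAi screens.
p_value = hypergeom_upper_tail(14000, 2660, 249, 119)
```

Exact integer arithmetic is used here instead of floating-point binomial coefficients because the tail probability is far smaller than the rounding error that intermediate floats would introduce.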
Together, our RNAi and PPI experiments identified hundreds of previously unknown RTK-Ras-ERK regulators, as well as a core network of genes that were identified with both methods. Because visualization, navigation, and comprehension of complex networks of interacting proteins with functional data can be challenging, we provide our resource of RTK-Ras-ERK interactome and functional genomic data as browsable data files and in Cytoscape format, a graph layout and querying tool (28). However, given the widespread importance of this pathway and to make the integrated network interactive and widely accessible, we also provide access to the data with the Interaction Map (IM) Browser, an online network visualization tool for interactive, dynamic visualization of PPIs (29). Because integration of multiple data sources improves the specificity and reliability of individual high-throughput data, we merged our data with the Drosophila Interactions Database (http://www.droidb.org), which contains previously determined PPIs from Y2H and other studies, a wealth of Drosophila genetic interactions, and predicted conserved interactions, or interologs, from yeast, worms, and humans (30). With these tools, the RTK-Ras-ERK network can be searched, filtered, and overlaid with multiple genomic data sets.
RTK signaling to ERKs regulates diverse processes during Drosophila development. Among these, phenotypic alterations in the Drosophila eye and wing are the most easily scored, because Ras activity promotes cell growth, cell proliferation, cell survival, and differentiation into vein tissue downstream of EGFR activity. Because most of our newly identified pathway-associated genes do not have known alleles, we tested for phenotypes by expressing RNAi hairpins in Drosophila, which can faithfully recapitulate known phenotypes (31, 32). We tested for phenotypes of multiple genes isolated in our screens by expressing hairpins from a library created for transgenic RNAi, or in a few cases by complementary DNA (cDNA) overexpression, in the developing wing disc (Fig. 5, fig. S4, Table 1, and table S5). Of the 84 genes tested, 48 (57%) had a phenotype in the wing. Consistent with systematic PPI analyses in yeast (13), we found that proteins with a high degree (“hubs”) in MasterNet were no more likely than proteins with a lower degree to result in a wing phenotype.
We found that even genes that were identified both in RNAi screens and in the PPI network were no more likely than genes isolated from each individually to score in wing phenotypes. One of the genes that were positive in both the functional genomic screen and the interaction screen was CG6453, which encodes a noncatalytic subunit of glucosidase II. The interaction between the CG6453 protein and Raf had a high SAINT score, and coimmunoprecipitation experiments confirmed this interaction (fig. S3A). In the S2R+ EGFR RNAi screen, this gene was a negative regulator, and we demonstrated that its depletion by RNAi resulted in a growth and patterning defect (ectopic wing vein material) in the wing, which is consistent with negative regulation of the pathway (Fig. 5A). Although genes encoding TepIV, the Drosophila homolog of a glycophosphatidylinositol (GPI)–linked protein that is mutated in human cancers, and components of the protein phosphatase PPP6 complex, its catalytic subunit PpV and regulatory subunit CG10289, were not found in the RNAi screens, these proteins were positive in the interaction screen. We confirmed their interactions with pathway components by coimmunoprecipitation (fig. S3A) and demonstrated that their knockdown produced in vivo phenotypes (Fig. 5, B and C). TepIV interacted with Ksr and, despite not scoring in our RNAi screens and having a weak RNAi phenotype in cells, nevertheless modified the RasN17 phenotype, consistent with a role as a positive regulator (Fig. 5B). PpV and CG10289 interacted with each other and Raf, and PpV depletion resulted in a growth defect in the wing (Fig. 5C). Finally, Rtf1, a histone methyltransferase, was a weak interactor with multiple pathway components and was filtered out of the final PPI network because of its SAINT score. However, the gene encoding this protein was identified as a negative regulator in our RNAi screens, and we confirmed an in vivo phenotype associated with increased dpERK (indicating increased activity) in the wing (Fig. 5D), showing that Rtf1 is a bona fide regulator of ERK activation.
Dissection of oncogenic signaling pathways with functional genomics and proteomics approaches facilitates understanding dynamic information processing and how these pathways may be disrupted by mutations or targeted therapeutically (26). By combining multiple, parallel genome-wide RNAi screens and TAP-MS interactome screens, we have assembled an integrated network of RTK-Ras-ERK signaling with both PPI interactions and functional information obtained in the same signaling environment. This network provides a resource for subsequent hypothesis-driven, mechanistic investigation of hundreds of conserved regulators.
Because high-throughput data sets are individually susceptible to multiple sources of technical and biological noise, confidence in subsets of any given “omics” data set can be increased by overlapping contrasting experimental approaches. Most integrative efforts up to now have queried data sets generated under disparate conditions and even different organisms. We found that only a small fraction of the hits from interactome or functional screening were isolated under all conditions tested, and most of these represented known “canonical” pathway components. Many of the hits that were identified from each method individually also showed evidence of activity in vivo. Comparing our studies to other studies of MAPK regulators suggests that the complete landscape of proteins regulating RTK-Ras-ERK signaling under specific conditions is likely to be larger than the conservative overlapping network that we describe. In comparison to a Y2H screen for MAPK pathway interactors, where >600 interactions were identified (14), only 54 proteins overlapped with our network, 30 (56%) of which also were positive in our RNAi screen, including the proton transporter ATPsyn-β (ATP5B), which was a negative regulator in our RNAi screens. Of the 31 proteins from a study of dynamic ERK interactors that overlapped with our filtered data set (27), 22 were encoded by genes positive in our RNAi screens, but only one, heat shock protein 60 (HspD1), was pulled down by ERK itself in our study. However, another 16 proteins interacted with Raf and 8 interacted with Dsor1 (MEK). By considering the Raf-MEK-ERK cassette as a whole, the number of overlapping interactions increased. Although these comparisons are limited by the differences in Y2H and TAP-MS techniques, the population of regulators that can be identified is probably highly technique- and condition-specific, and this work should be seen as a “first pass” at identifying the universe of proteins regulating the output of this pathway.
We used PPI mapping and functional genomic methods to identify several previously unknown regulators that also exhibited in vivo roles in RTK-Ras-ERK signaling. Translation of cell culture regulators to in vivo phenotypes is challenging due to lack of knowledge of the correct tissue in which to test for activity. Because many of the newly identified regulators are likely cell type– and RTK-specific, we were unable to identify phenotypes in the wing disc for many of these regulators. A large number of genes positive in the RNAi screens were not identified in the PPI network, either because of false negatives or because the encoded proteins modulate activity of the pathway indirectly. A prime example of this latter category is Rtf1, a histone methyltransferase whose knockdown enhanced ERK activation in vivo. Rtf1 enhances Notch pathway activity (33), and the Notch pathway can inhibit ERK activity (34); thus, Rtf1 may be a key mediator of Notch-ERK crosstalk. In contrast, we identified another protein phosphatase 2A (PP2A) family member, the PPP6 ortholog PpV and its regulatory subunit CG10289, as interacting with Raf, but did not identify the genes encoding these proteins in our RNAi screens. In mammals, PPP6 components can interact with the inhibitor of nuclear factor κB IκBε (18, 35) and regulate the cell cycle in normal and pathological contexts. The role of the Ser-Thr phosphatase PP2A in the Ras pathway has been principally described as a positive regulator through dephosphorylation of Ser259 on Raf and Ser392 on Ksr (numbering is based on human proteins), inducing 14-3-3 protein dissociation (36); PPP6 may play a similar role in Raf activation in specific in vivo contexts. CG6453, a noncatalytic subunit of glucosidase II, was identified in both the interaction screen and the RNAi screens, indicating that it is a high-confidence interactor. Although its mechanism of regulating MAPK output remains unknown, it is consistent with the growing recognition that metabolic and other genes previously thought to have “housekeeping” roles can have specific functions in signaling (37, 38). Finally, despite its interaction with intracellular Ksr, TepIV has homology with CD109, a GPI-linked cell surface marker of T cells, endothelial cells, and activated platelets, that contains a protease inhibitor α2 macroglobulin domain (39); CD109 is mutated in 7% of colorectal cancers (40) and may thus affect ERK output in these cancers. As more human cancers are characterized through ongoing large-scale next-generation sequencing, our data set of regulators of RTK-Ras-ERK signaling will provide a resource for understanding the potential mechanistic contribution of somatic mutations to cancer development.
Primary screening procedures were performed as published previously (4, 41). We derived an S2R+ cell line expressing DER (EGFR) from a metallothionein promoter (S2R+mtDER) also expressing cyan fluorescent protein (CFP)–tagged Dsor1 (MEK) and yellow fluorescent protein (YFP)–tagged Rl (ERK) (4). We confirmed ERK activation after secreted Spitz (sSpitz) (EGF in mammals) stimulus of both endogenous and tagged ERK by Western blotting and in high-throughput format, and confirmed assay sensitivity, using dsRNAs targeting canonical components of the RTK-Ras-ERK pathway. For primary screening in Kc167 cells, we used our previously described cell line Kc167 expressing DER (EGFR) from a metallothionein promoter (Kc mtDER) (4) and modified the high-throughput assay with our Alexa 647–conjugated dpERK antibody normalized to DAPI (4′,6-diamidino-2-phenylindole) staining of nuclei to quantify ERK activity. Cells were stimulated with conditioned media containing sSpitz for 10 or 30 min. We performed secondary screens as described (41), using S2R+ and Kc cell lines with insulin (25 μg/ml) or sSpitz-containing conditioned media. Briefly, cell lines were seeded in plates prepopulated with resynthesized dsRNA amplicons identified from the primary screen as InR- or EGFR-specific. After stimulation, cells were fixed and stained for dpERK as previously described. Primary screen hits were prefiltered for computationally predicted off-target effects, which is generally sufficient to reduce off-target noise to below assay noise (4); however, any individual dsRNA should be treated with caution until validated with multiple amplicons (42). A Z score threshold of ±1.5 was used as the primary screen cutoff; each Z score is an average of replicate screens under each condition. Full data sets and dsRNA sequence information are available at the Drosophila RNAi Screening Center (DRSC) Web site (http://www.flyrnai.org).
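The primary screen cutoff can be expressed as a short sketch: replicate Z scores for each dsRNA are averaged, and amplicons whose mean passes ±1.5 are called as hits. The sign convention is an assumption (a knockdown that lowers dpERK gives a negative Z score and marks a positive regulator, and vice versa), as are the function name and the placeholder amplicon identifiers. Only the ±1.5 threshold and the averaging of replicates come from the text.

```python
from statistics import mean

Z_CUTOFF = 1.5  # primary screen threshold stated in the text


def call_hits(replicate_z, cutoff=Z_CUTOFF):
    """Average replicate Z scores per amplicon and classify hits.

    Assumed convention: mean Z <= -cutoff means knockdown reduced dpERK
    (positive regulator); mean Z >= +cutoff means knockdown increased
    dpERK (negative regulator). Amplicons in between are not reported.
    """
    calls = {}
    for amplicon, scores in replicate_z.items():
        z = mean(scores)
        if z <= -cutoff:
            calls[amplicon] = "positive regulator"
        elif z >= cutoff:
            calls[amplicon] = "negative regulator"
    return calls


# Placeholder replicate scores for illustration only.
example = {
    "amplicon_A": [-2.1, -1.7],  # mean -1.90, passes the negative cutoff
    "amplicon_B": [1.9, 1.4],    # mean 1.65, passes the positive cutoff
    "amplicon_C": [0.3, -0.2],   # mean 0.05, not a hit
}
hits = call_hits(example)
```

Averaging before thresholding means a single strong replicate cannot produce a hit on its own unless the other replicate is close to the cutoff.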
TAP expression vectors permitting low-level expression of tagged components in stable Drosophila cell lines using the metallothionein promoter have been previously described (20). For the bait proteins, we cloned InR, PVR, EGFR, Drk (Grb2), Dos (Gab), Sos, Src42A, Gap1, Csw (Shp2), Ras85D, Phl (Raf), Dsor1 (MEK), Ksr, Cnk, and Rl (ERK) into the C-terminal tag TAP vector and created stable cell lines for each, as well as a control cell line for subtracting nonspecific interactors or contaminants. All cell lines except InR-TAP also expressed EGFR from an uninduced metallothionein promoter (resulting in minimal low-level expression) for induction with sSpitz (EGF). Cells (1 × 10⁹ to 2 × 10⁹) induced with 140 μM CuSO4 overnight were used for each lysis at the given condition. Cells were lysed as described (20) and in-solution TAP was performed essentially as described (43), with the exception of final washes and elution, which were performed in ammonium bicarbonate buffer without detergent for LC-MS/MS analysis. At least two biological replicates were performed for each bait and condition.
Several micrograms of TAP immunoprecipitation from each bait condition were reduced with 10 mM dithiothreitol at 55°C, alkylated with 55 mM iodoacetamide at room temperature, and then digested overnight with 2.5 μg of modified trypsin (Promega) at pH 8.3 (50 mM ammonium bicarbonate) in a total of 200 μl. The digest was stopped with 5% trifluoroacetic acid (TFA) and cleaned of buffer and debris with a C18 ZipTip (Millipore). Thirty-five microliters of aqueous high-performance liquid chromatography (HPLC) A buffer was added to the C18 ZipTip elution (50% acetonitrile/0.1% TFA) and was dried to 10 μl to concentrate the sample and remove organic content.
A 5-μl aliquot was injected onto the microcapillary LC-MS/MS system for sequencing. The microcapillary LC-MS/MS setup consisted of a 75-μm inside diameter (ID) × 10-cm length microcapillary column (New Objective Inc.) self-packed with Magic C18 (Michrom Bioresources) and operated at a flow rate of 300 nl/min by means of a splitless EASY-nLC system (Thermo Fisher Scientific). The HPLC gradient was 3 to 38% B over 60 min followed by a 7-min wash at 95% B. The column was preequilibrated with A buffer for 15 min at 0% B before the runs (A: 99% water/0.9% acetonitrile/0.1% acetic acid; B: 99% acetonitrile/0.9% water/0.1% acetic acid). The microcapillary LC system was coupled directly to an LTQ Orbitrap XL mass spectrometer (Thermo Fisher Scientific) operated in positive ion mode for data-dependent acquisitions (DDAs) [Top 5: one Fourier transform (FT) survey scan followed by five scans of peptide fragmentation (MS/MS) in the ion trap by collision-induced dissociation (CID) using helium gas]. The spray tip voltage was 2.8 kV and capillary voltage was 35 V. A single microscan with a maximum inject time of 400 ms was used for the FT-MS scan in the Orbitrap, and 110 ms was used for the MS/MS scans in the ion trap. Typically, between 3000 and 6000 MS/MS spectra were collected per run. A total of 94 LC-MS/MS runs were collected for this study over a 6-month period. All LC-MS/MS runs were separated by at least one blank run to prevent column carryover. Raw MS/MS spectra are available by request and are deposited in Proteome Commons as data set 76892.
All collected MS/MS fragmentation spectra were searched against the reversed dmel-all-translation protein database (FlyBase Consortium) version 5.4 (41,644 protein entries, January 2008) using the Sequest search engine in Proteomics Browser Software (Thermo Fisher Scientific). Differential posttranslational modifications, including deamidation of QN (glutamine and asparagine) (+0.989 dalton) and oxidation of methionine (+15.9949 daltons), common in vitro modifications that occur during sample processing, were included in the database searches. From Sequest, protein groups containing at least two unique identified peptides were initially accepted if they were top-ranked matches against the forward (target) dmel-all-translation protein database and had a consensus score greater than or equal to 1.0. Individual peptides that were not part of protein groups were accepted if they matched the target database and passed the following stringent Sequest scoring thresholds: 1+ ions, Xcorr ≥1.9, Sf ≥0.75, P ≥ 1; 2+ ions, Xcorr ≥2.0, Sf ≥0.75, P ≥ 1; 3+ ions, Xcorr ≥2.55, Sf ≥0.75, P ≥ 1. After passing the initial scoring thresholds, all peptide hits not contained in protein groups were then manually inspected to be sure that all b (fragment ions resulting from amide bond breaks from the peptide's N terminus) and y ions (fragment ions resulting from amide bond breaks from the peptide's C terminus) aligned with the assigned sequence using tools (FuzzyIons and GraphMod) in Proteomics Browser Software (Thermo Fisher Scientific). A false discovery rate (FDR) of 1.84% for peptide hits and 0.6% for protein hits was calculated on the basis of the number of reversed database hits above the scoring thresholds.
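The charge-dependent peptide thresholds and the decoy-based FDR estimate can be sketched as follows. The threshold values are those quoted above. The text does not give the exact FDR formula, so the ratio of reversed to forward hits used here is an assumption, as is the rejection of charge states other than 1+ to 3+; the function names are hypothetical.

```python
# Per-charge thresholds quoted in the text: (minimum Xcorr, minimum Sf, minimum P).
THRESHOLDS = {
    1: (1.9, 0.75, 1),
    2: (2.0, 0.75, 1),
    3: (2.55, 0.75, 1),
}

def passes_thresholds(charge, xcorr, sf, p):
    """Return True if a single-peptide hit clears the cutoffs for its charge state.

    Charge states without a listed threshold are rejected (assumption of this sketch).
    """
    if charge not in THRESHOLDS:
        return False
    min_xcorr, min_sf, min_p = THRESHOLDS[charge]
    return xcorr >= min_xcorr and sf >= min_sf and p >= min_p

def decoy_fdr(n_reversed_hits, n_forward_hits):
    """Estimate FDR as reversed (decoy) hits over forward (target) hits above threshold.

    This simple ratio is an assumed estimator; the source does not state its formula.
    """
    if n_forward_hits == 0:
        raise ValueError("no forward hits above threshold")
    return n_reversed_hits / n_forward_hits
```

For example, `passes_thresholds(3, 2.5, 0.8, 1)` is rejected because Xcorr falls below the 2.55 cutoff for 3+ ions, whereas the same scores would pass for a 2+ ion.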
We used the “significance analysis of interactome” (SAINT) algorithm to calculate probability scores for interactions observed by MS. SAINT uses spectral count data and constructs separate distributions for true and false interactions to derive the probability of a bona fide PPI. Because SAINT models spectral counts with a unimodal distribution, we ran the algorithm separately for each condition and combined the scores. Specifically, we assumed that each condition was conditionally independent given the spectral count data and computed the probability that the interaction was true in any condition. For proteins A and B in conditions 1 to n, the combined score is computed as:
$$P(A \leftrightarrow B) = 1 - \prod_{i=1}^{n} \left[ 1 - P(A \leftrightarrow B \mid \text{cond } i) \right]$$
where P(A ↔ B | cond i) is the SAINT score for condition i. Some proteins were not used as baits in all conditions; hence, some interactions that were observed in one condition could not be observed in another. In this case, we used the prior probability of an interaction occurring in that condition as computed by the SAINT algorithm. In the general setting, this would be the probability that a randomly chosen pair of proteins interact, that is, (number of interacting pairs of proteins)/(number of pairs of proteins). In our specific case, we are choosing a pair of proteins from proteins that are observable in MS, so we adjust the ratio accordingly.
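A minimal sketch of this combination rule follows. It assumes the combined score is the probability that the interaction is true in at least one condition, computed as one minus the product of the per-condition complements, and that a condition in which the bait was not tested is represented by `None` and replaced by the prior. The function name and the example values are hypothetical.

```python
from math import prod

def combined_score(per_condition, prior):
    """Probability that an interaction is true in at least one condition.

    per_condition: list of SAINT probabilities, with None where the pair
    could not be observed in that condition (bait not tested).
    prior: prior interaction probability substituted for missing conditions.
    Assumes the conditions are independent given the spectral count data.
    """
    probs = [prior if p is None else p for p in per_condition]
    return 1.0 - prod(1.0 - p for p in probs)

# Hypothetical scores: observed in two conditions, untested in a third.
score = combined_score([0.5, 0.5, None], prior=0.1)
```

Because every factor in the product is at most 1, adding a condition can only raise the combined score, which is why a small prior is used for untested conditions rather than dropping them.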
Additionally, we computed pairwise dynamic difference scores between conditions (the probability that an interaction is true in one condition but not the other), assuming the conditions were conditionally independent given the spectral count data.
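Under the same independence assumption, a dynamic difference score can be sketched as the product of the probability in one condition and the complement in the other. The text does not state whether the score is directional, so this sketch assumes it is ("true in condition i but not in condition j"); the function name is hypothetical.

```python
def dynamic_difference(p_i, p_j):
    """Probability that an interaction is true in condition i but not in condition j.

    p_i and p_j are the SAINT probabilities for the two conditions, assumed
    independent given the spectral count data. Directionality is an assumption.
    """
    return p_i * (1.0 - p_j)

# Hypothetical example: strong in condition i, weak in condition j.
gain = dynamic_difference(0.9, 0.2)
loss = dynamic_difference(0.2, 0.9)
```

The score is asymmetric: swapping the two conditions gives the probability of the opposite change.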
To determine a high-confidence threshold, we constructed a set of true-positive interactions by overlapping our experimental interactions with BioGRID. This list contained 49 interactions between 114 proteins. We formed a true-negative set by taking interactions that were more than three hops away in the BioGRID protein interaction network. Receiver operating characteristic (ROC) curves generated with this gold standard list and with fly binary and fly complex data are shown in fig. S2, A and B. We chose 0.83 as the cutoff to achieve a 7.2% false-positive rate and 26.5% true-positive rate, which is comparable to the results achieved in (15).
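The trade-off behind a score cutoff can be illustrated by computing one point on the ROC curve. This is a sketch that assumes an interaction is called positive when its score is at or above the cutoff; the function name and the toy scores are hypothetical and do not reproduce the published rates.

```python
def roc_point(positive_scores, negative_scores, cutoff):
    """Return (true-positive rate, false-positive rate) at a given score cutoff.

    positive_scores: scores of gold-standard true interactions.
    negative_scores: scores of gold-standard non-interactions.
    A pair is called an interaction when score >= cutoff (assumption).
    """
    tpr = sum(s >= cutoff for s in positive_scores) / len(positive_scores)
    fpr = sum(s >= cutoff for s in negative_scores) / len(negative_scores)
    return tpr, fpr

# Toy scores for illustration only.
toy_pos = [0.90, 0.85, 0.50, 0.20]
toy_neg = [0.90, 0.10, 0.30, 0.40, 0.20]
tpr, fpr = roc_point(toy_pos, toy_neg, cutoff=0.83)
```

Sweeping the cutoff from 1 to 0 and plotting each (FPR, TPR) pair traces the full ROC curve; a stringent cutoff trades sensitivity for a low false-positive rate.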
Filtered binary interactions were graphed using the Cytoscape environment (28). For analysis of feedback regulation, three in vivo microarray studies were collated (23–25). Microarray data from in vivo analysis of mesoderm (24) were reanalyzed to focus on subgroups for the RTK-Ras-ERK pathway only, excluding other pathway data sets.
Human orthologs were predicted with DIOPT, an integrative ortholog prediction tool developed at DRSC (44) (http://www.flyrnai.org/cgi-bin/DRSC_orthologs.pl). The orthologs with the best prediction score, reflecting the number of methods from which the prediction was identified, were selected. Potential human disease-related fly homolog information was obtained from Homophila version 2.1 (45). Gene expression levels were obtained from DRSC (http://www.flyrnai.org/cgi-bin/RNAi_expression_levels.pl), and cell line gene expression data were obtained from the modENCODE project (46). The significance of conserved genes, expressed genes, or disease-related genes was tested by calculating cumulative hypergeometric probability. The enrichment of GO annotations for Molecular Function and Biological Process, as well as Panther pathway annotation, was performed with the online DAVID tool (http://david.abcc.ncifcrf.gov/) (47). Hierarchical clustering and graphing was performed with the MultiExperiment Viewer, Cluster, and Java TreeView programs (48–50).
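The cumulative hypergeometric test used for enrichment can be sketched with the standard library. The sketch assumes the upper tail is the quantity of interest, that is, the probability of seeing at least k annotated genes among the hits by chance; the function and parameter names are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, population, annotated, drawn):
    """P(X >= k) for a hypergeometric draw without replacement.

    population: total number of genes considered.
    annotated:  genes in the population carrying the annotation
                (for example, conserved, expressed, or disease-related).
    drawn:      number of hits.
    k:          annotated genes observed among the hits.
    Using the upper tail for enrichment is an assumption of this sketch.
    """
    upper = min(annotated, drawn)
    favorable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favorable / comb(population, drawn)

# Toy example: 3 annotated genes in a population of 10, 4 hits drawn.
p_at_least_one = hypergeom_upper_tail(1, 10, 3, 4)
```

A small upper-tail probability indicates that the hit list contains more annotated genes than expected from random sampling of the same size.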
MasterNet is a compilation of databases. (i) Fly binary PPI network: This network was constructed by integrating experimentally identified binary PPIs (direct physical interactions) from major PPI databases, such as BioGRID (51), IntAct (52), MINT (53), DIP (54), and DroID (30). The fly binary PPI network consists of 29,325 interactions between 8161 proteins. The PPIs were downloaded from the source databases in PSI-MI format (55), and the gene/protein identifiers were mapped to FlyBase (56) gene identifiers. (ii) Interolog binary PPI network: PPIs were predicted on the basis of experimentally identified binary PPIs for human, mouse, worm, and yeast. (iii) Interolog protein complexes network: PPIs were predicted from experimentally identified protein complexes for human, mouse, worm, and yeast. Both interolog networks were compiled from the BioGRID, IntAct, MINT, DIP, and HPRD (57) databases. The PSI-MI files were downloaded from the source databases, and the experimental identifier from the interaction detection type field was used to sort each PPI as either binary or complex. Using ortholog annotation from the DIOPT database, we mapped 129,090 PPIs between 5954 proteins to fly. (iv) Kinase-substrate network: For each experimentally verified phosphorylation site, the kinase that phosphorylates that site was predicted with the NetPhorest program (58, 59). The program uses probabilistic sequence models of linear motifs to predict kinase-substrate relationships. The fly kinase-substrate network consists of 26,736 interactions between 55 kinases and 2518 substrate proteins. (v) Domain-domain interaction (DDI) network: Known and predicted protein DDIs were extracted from the DOMINE database (60), which includes 26,219 interactions inferred from Protein Data Bank (PDB, http://www.pdb.org) entries and those that are predicted by 13 different computational approaches using Pfam domain definitions.
For network integration, we considered only high-confidence DDIs as defined by DOMINE and those derived from crystal structures.
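The interolog mapping in items (ii) and (iii) can be illustrated with a short sketch: each interaction observed in another species is transferred to every pair of fly orthologs of its two partners. The data structures, function name, and identifiers are hypothetical, and the sketch makes no claim about how DIOPT scores or one-to-many orthologs were actually handled.

```python
from itertools import product

def map_interologs(source_ppis, orthologs):
    """Transfer PPIs from another species to fly through an ortholog table.

    source_ppis: iterable of (gene_a, gene_b) pairs from the source species.
    orthologs:   dict mapping a source gene to a set of fly gene identifiers.
    Returns a set of unordered fly pairs, stored as sorted tuples.
    Pairs with a partner lacking a fly ortholog are dropped.
    """
    fly_pairs = set()
    for gene_a, gene_b in source_ppis:
        fly_a = orthologs.get(gene_a, ())
        fly_b = orthologs.get(gene_b, ())
        for fa, fb in product(fly_a, fly_b):
            fly_pairs.add(tuple(sorted((fa, fb))))
    return fly_pairs

# Hypothetical identifiers, for illustration only.
toy_ppis = [("H1", "H2"), ("H2", "H3")]
toy_orthologs = {"H1": {"F1"}, "H2": {"F2", "F2b"}}
predicted = map_interologs(toy_ppis, toy_orthologs)
```

Storing each pair as a sorted tuple keeps the predicted network undirected, so the same interaction reported in both orientations is counted once.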
All Western blotting and coimmunoprecipitation procedures and antibodies used were previously described (4). Quantification of dpERK and total ERK (used as normalization value) was performed with the LI-COR detection system. Western blotting and coimmunoprecipitation experiments were performed a minimum of two times.
Stocks used for genetic analysis were obtained from Bloomington except where noted. All hemagglutinin-tagged cDNA constructs were cloned by polymerase chain reaction (PCR) cloning with Phusion Polymerase (New England Biolabs) into pUAST. cDNA clones or libraries used as templates were as follows: Dco (LD04938), CG31666 (SD04616), Rack1 (RE74715), CG1884 (cDNA library), and CG31302 (AT04807). Hairpins described in the text were cloned into pWiz as described previously (61) using the following primers: CG7282: CACGCCCAGCTGTCAG, TTCACGTTCTCCAGTTTCTC; CG3878: CAGCTCCGCAGTGCTCGTGT, AGTTGTCGTCGTCGGAGCTC; CG1884: TCGGCTTGGGCACAAAC, AAGGACTTCGCCCTGGAT; and CG17665: GCAGAAGCAATAGCCGAATC, ATTTTCTCATCTGCCGCATC. Other RNAi hairpins were designed with the attP targeted transgenic system for an in vivo RNAi project (“TRiP” lines) as described (31), as well as RNAi lines from Vienna Drosophila RNAi Center and NIG-Fly Japan stock center. Other fly lines are y,w,hsFlp, MS1096-Gal4, UAS Ras1N17, ElpB1/CyO, apterous-Gal4, and UAS-mCD8-GFP/CyO. For dpERK staining, wing discs from third instar larvae were dissected in cold phosphate-buffered saline (PBS), fixed for 15 min in 4% formaldehyde, and washed in PBS + 0.1% Triton. Discs were stained with a rabbit antibody that recognizes dpERK (Cell Signaling). Wings of the indicated genotype were mounted in a 1:1 mixture of Permount and xylenes.
A complete list of the hairpin lines used in this study is given in table S6.
We thank the DRSC and its staff for RNAi screening reagents and advice and current and former members of the Perrimon lab for reagents and discussions. We thank M. Herman for help with in vivo analysis; P. Schmid and N. Palmer for additional data analysis; C. Bakal, M. Kulkarni, and B. Mathey-Prevot for critical manuscript review; and X. Yang for help with MS experiments.
Funding: A.A.F. was supported by the Medical Scientist Training Program. R.L.F. was supported by NIH R01 HG001536. This work was supported in part by R01 DK071982 to N.P. and P01 CA120964 to N.P. and J.M.A. N.P. is an investigator of the Howard Hughes Medical Institute.
Author contributions: A.A.F. and N.P. designed the project. A.A.F., D.Y., R.B., and M.P. performed the experiments. J.M.A. performed the MS experiments. G.T., R.S., A.V., Y.H., P.H., X.S., and B.B. designed and performed statistical analysis. S.P., T.M., and R.L.F. integrated the data into DroID. A.A.F. and N.P. wrote the manuscript.
Competing interests: The authors declare that they have no competing interests.
Data availability: Full data sets and dsRNA sequence information are available at the DRSC Web site (http://www.flyrnai.org). MS data are supplied as peptide reads and values in the Supplementary Materials, and interactions are deposited in DroID (http://www.droidb.org). MS data are available through Proteome Commons data set 76892.
SUPPLEMENTARY MATERIALS www.sciencesignaling.org/cgi/content/full/4/196/rs10/DC1
Cytoscape file for the PPI and RNAi data.
Excel spreadsheets of peptides isolated in each TAP-MS experiment.