Cullin Ring Ligases (CRLs) represent the largest E3 ubiquitin ligase family in eukaryotes and the identification of their substrates is critical to understanding regulation of the proteome. Using genetic and pharmacologic Cullin inactivation coupled with genetic (GPS) and proteomic (QUAINT) assays, we have identified hundreds of proteins whose stabilities or ubiquitylation status are regulated by CRLs. Together, these approaches yielded many known CRL substrates as well as a multitude of previously unknown putative substrates. We demonstrate that one of these, NUSAP1, is an SCFCyclin F substrate during S and G2 phases of the cell cycle and is also degraded in response to DNA damage. This collection of regulated substrates is highly enriched for nodes in protein interaction networks, representing critical connections between regulatory pathways. This demonstrates the broad role of CRL ubiquitylation in all aspects of cellular biology, and provides a set of proteins likely to be key indicators of cellular physiology.
Ubiquitin dependent proteolysis is a major mechanism for post-translational re-organization of the proteome. Ubiquitylation occurs through a cascade of three enzymes termed E1, E2 and E3, with the E3 imparting substrate specificity. Ubiquitylation can alter substrate activity or target it for degradation. We previously identified the SCF, a modular class of E3 ubiquitin ligases that use an interchangeable set of substrate adaptors termed F-box proteins (Bai et al., 1996). The SCF utilizes CUL1 as a scaffold, recruiting the F-box family of substrate specificity factors through an adaptor, SKP1 (Bai et al., 1996; Feldman et al., 1997; Skowyra et al., 1997). CUL1 also binds the RING domain protein RBX1 that in turn recruits an E2 (Ohta et al., 1999; Seol et al., 1999; Skowyra et al., 1999; Tan et al., 1999). The human genome encodes nine cullins (Cul1, 2, 3, 4A, 4B, 5, 7, PARC and APC2), and each utilizes a unique set of substrate specificity factors and adaptors (reviewed in Petroski and Deshaies, 2005; Willems et al., 2004). These ligases are collectively termed Cullin Ring Ligases (CRLs).
With the exception of the APC/C, all CRLs require Neddylation for full activation (reviewed in Deshaies et al., 2010; Skaar and Pagano, 2009). Nedd8 is a small ubiquitin like protein that attaches to substrates using similar E1, E2 and E3 enzymes. Neddylation occurs on the cullin subunit, and allosterically activates ligase activity (Duda et al., 2008; Saha and Deshaies, 2008). The chemical MLN4924 inhibits the Nedd8 E1 enzyme, Nae1 (Brownell et al., 2010; Soucy et al., 2009), inhibiting all CRLs.
While transcriptional regulation of genes is frequently examined, little is known about post-translational regulation of protein abundance, despite the existence of 500 ubiquitin ligases in mammals and many more in plants. Furthermore, analysis of CRL substrates in multiple organisms has revealed that many are critical regulators of their respective pathways and often lie downstream of signal transduction pathways. The global identification of highly regulated CRL substrates should unveil critical regulatory nodes that will represent key intersections between pathways (Benanti et al., 2007; Yen and Elledge, 2008). To approach this problem we recently described a genetic screening platform for identifying proteins with regulated stabilities, termed Global Protein Stability Profiling (GPS). GPS is a fluorescent based reporter system that combines Fluorescent Activated Cell Sorting (FACS) with DNA microarray deconvolution to systematically examine changes in protein stability in live cells (Yen et al., 2008).
An alternative for identifying ubiquitin-regulated proteins is mass spectrometry (MS) based proteomics using SILAC (Stable Isotope Labeling with Amino Acids in Cell Culture) (Ong et al., 2002). This has been limited due to the low abundance of ubiquitinated peptides. Since the last three amino acids of ubiquitin are Arginine-Glycine-Glycine (RGG), tryptic cleavage of ubiquitylated proteins results in the presence of a GG remnant on the ubiquitin modified lysine of tryptic peptides. Antibodies generated against this remnant can specifically recognize this site, allowing for the enrichment of these tryptic peptides from ubiquitylated proteins (Xu et al., 2010).
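The origin of the GG remnant can be illustrated with a minimal in-silico digest. The sketch below assumes a simplified trypsin rule (cleave after K or R unless the next residue is P) and uses only the C-terminal tail of ubiquitin; the function name `trypsin_digest` and the mass constant are illustrative and are not part of the published workflow.

```python
# Monoisotopic mass of one glycine residue (C2H3NO), in daltons.
GLY_RESIDUE_MASS = 57.02146

def trypsin_digest(seq):
    """Cleave after K or R, except when the next residue is P (simplified rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides

# C-terminal tail of ubiquitin; the final glycine is the residue
# that is linked to the substrate lysine.
ub_tail = "LVLRLRGG"
fragments = trypsin_digest(ub_tail)
remnant = fragments[-1]                       # piece left on the substrate lysine
mass_shift = len(remnant) * GLY_RESIDUE_MASS  # mass added to the modified lysine
print(fragments, remnant, round(mass_shift, 4))
```

The last fragment is the di-glycine stub that the remnant antibody recognizes, and the computed mass shift is the value a search engine would need to allow on modified lysines.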
Using the approaches described above, we identified proteins regulated by CRL ligases. This set of nearly 500 proteins is enriched for highly connected hub proteins in interaction networks and represents a group of proteins that play key roles in cellular physiology.
The GPS screening platform utilizes an internally normalized fluorescent-based retroviral reporter system and combines FACS with DNA microarray deconvolution to systematically examine changes in protein stability (Figure 1A) (Yen et al., 2008). GPS vectors express a single transcript encoding DsRed and EGFP-ORFs separated by an internal ribosome entry site (IRES). We constructed a human tissue culture cell library expressing each of the proteins encoded in the human ORFeome Collection, with each cell expressing a single EGFP-ORF. Importantly, the EGFP/DsRed ratio of each cell acts as an indirect reporter for the half life of the expressed ORF. Screening is performed by FACS sorting the library into bins based on the EGFP/DsRed ratio (low to high). Genomic DNA from each bin is harvested and the ORFs are PCR amplified from each bin with common primers targeting the viral backbone. PCR amplified ORFs are fluorescently labeled and hybridized to custom designed DNA microarrays (one microarray per bin) and the intensity of each probe is measured and graphed across the sorted bins. When performed in a comparative manner we can assess changes in protein stability based on changes in the probe distribution intensity across the sorted bins (Figure 1A).
To improve GPS, we designed a lentiviral reporter vector (pGPS-LP) (Figure S1A) that can infect non-dividing cells and has a larger packaging capacity (>10 kb). pGPS-LP contains a PGK-puromycin cassette for selection and a T7 promoter downstream of the ORF that significantly improves the efficiency of PCR. pGPS-LP showed a 50-fold increase in viral titer and a larger packaging capacity (Figure S1B). We constructed an updated GPS library using pGPS-LP and the latest CCSB Human ORFeome Collection, which includes 15,483 human ORFs covering 12,794 genes and found that the updated library has significantly improved preservation of larger ORFs (Figure S1C).
To improve the accuracy of GPS, we designed multiple probes for each ORF with increased stringency (Figure S1D and S1E, Table S1). These microarrays contain 46,000 probes with an average of ~3.3 probes per gene, and 80% of genes have ≥ 2 probes. We also developed a scoring system that considers changes in Protein Stability Index (ΔPSI), hybridization intensity, agreement among multiple probes and percentage of cells with altered EGFP/DsRed ratios after treatment (percent shift).
We utilized MLN4924 to interrogate the roles of CRLs on the proteome using the second generation GPS platform. MLN4924 treatment stabilized known CRL substrates, including the CRL3 target NRF2 and the CRL4 target CDT1 (Figure 1B). 293T cells stably expressing GPS-NRF2, GPS-RBM19 or GPS-CDC34 showed an increased EGFP/DsRed ratio when treated with MLN4924 whereas the GPS-library and GPS-negative controls (GPS-Empty and GPS-RPS2) were unaffected (Figure 1C and Figure S2C). We ultimately screened our GPS library with 1 µM MLN4924 for 4 hours, conditions that do not affect the cell cycle (Figure S2A).
We performed GPS screens on a 293T lentiviral GPS library treated with either DMSO or MLN4924 (Figure 1A). We hybridized PCR amplified ORFs (sorted versus unsorted) onto second generation DNA microarrays, using a single microarray for each sorted bin in each condition (16 total). For each probe we determined the Protein Stability Index (PSI; approximates the statistical mean of the distribution) and graphed the probe distribution across the 8 bins, comparing treated and untreated conditions (see Figure 1A.v). The second generation microarrays showed strong agreement between probes for a single gene. The standard error between probes, for genes with multiple probes, is less than 0.1 for > 90% of genes. Comparison of probe PSI between MLN4924 and DMSO treated samples showed a linear relationship, with an R² value of 0.92 (Figure 1D). When comparing treated and untreated conditions within a single bin, each individual bin showed an R² value ≥0.91. Figure S3 shows three randomly chosen probes for a subset of validated proteins (see below) that are stabilized by MLN4924. The probe distributions across the 8 bins are almost indistinguishable, suggesting strong agreement and low cross-reactivity.
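As a rough illustration of how a PSI and a ΔPSI could be derived from a probe's distribution across the sorted bins, the sketch below treats PSI as the signal-weighted mean bin number. This is one reading of "approximates the statistical mean of the distribution" and is an assumption, as are the function names and the example intensities, which are not data from the screen.

```python
def protein_stability_index(bin_signals):
    """Signal-weighted mean bin number (bins numbered 1..N, low to high EGFP/DsRed)."""
    total = sum(bin_signals)
    if total <= 0:
        raise ValueError("probe has no signal in any bin")
    return sum(i * s for i, s in enumerate(bin_signals, start=1)) / total

def delta_psi(control_signals, treated_signals):
    """Positive values indicate a shift toward higher-ratio (more stable) bins."""
    return (protein_stability_index(treated_signals)
            - protein_stability_index(control_signals))

# Hypothetical probe intensities across 8 bins (illustrative only).
dmso = [5, 30, 40, 15, 5, 3, 1, 1]
mln4924 = [1, 3, 5, 15, 40, 30, 5, 1]
print(round(protein_stability_index(dmso), 2),
      round(protein_stability_index(mln4924), 2),
      round(delta_psi(dmso, mln4924), 2))
```

In this toy case the distribution moves from the low bins to the high bins after treatment, so ΔPSI is positive, which is the direction expected for a protein stabilized by CRL inhibition.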
Probes were ranked according to their ΔPSI and thousands of high priority graphs were visually inspected. Proteins with multiple corresponding probes (when available) that showed a significant positive shift after MLN4924 treatment were considered putative CRL substrates, yielding 244 high priority candidates (Table S2). Importantly, the MLN4924-GPS screen identified a large number of well characterized CRL substrates, including Hypoxia Inducible Factor (HIF), NRF2, CDC25, the CDK inhibitors CDKN1A and CDKN1B (p21/CIP1 and p27/KIP1, respectively), ATF4, CYCLIN D, numerous substrate adaptors including F-box and Kelch-BTB proteins, STAT1, JUN and PDCD4 (Figure 1E). A recent study examining cullin interactors using IP-MS/MS (Bennett et al., 2010) identified specificity factors co-precipitating with specific cullins. Since many specificity factors are ubiquitylated by their cognate ligase, we examined this dataset. Of the 190 unique cullin interacting proteins present in the ORFeome, 96 (50%) showed a positive shift in their probe distribution in the GPS screen, indicating that many were stabilized by MLN4924 treatment and confirming that this screen identified CRL substrates.
To validate the reproducibility of the screen we individually tested ORFs for their response to MLN4924. We individually subcloned 74 unique ORFs into pGPS-LP. Cell lines expressing individual ORFs were treated with MLN4924 or DMSO and analyzed by FACS to assess changes in the EGFP/DsRed ratio. Forty-seven (65%) had an increased ratio after MLN4924 treatment, suggesting CRL-dependent stabilization. To confirm that this reflected an increased abundance of the tagged protein we immunoblotted 18 EGFP tagged ORF cell lines and all showed increased protein levels (Figure 1F). We also examined the endogenous levels of 10 proteins and found that 8 increased in abundance following MLN4924 treatment (Figure S3B). The cumulative results of our extensive validation analysis with MLN4924 are summarized in Table S2 and a subset of FACS validated proteins is shown in Figure S3A.
To further identify substrates of CRL ligases, we employed a peptide IP proteomic strategy. We employed an immunoaffinity reagent specific for tryptic ubiquitin remnants, the PTMScan ubiquitin remnant motif antibody. This antibody specifically recognizes a di-glycine tag that remains on ubiquitylated lysine residues after trypsin digestion of proteins into peptides and enriches them ~1,000 fold from lysates.
To identify ubiquitylation sites that are specifically CRL-dependent, we utilized a quantitative approach based on SILAC-MS that we refer to as QUAINT (Quantitative Ubiquitylation Interrogation). Cells were treated with MG132 alone or with MG132 and MLN4924 (Figure 2A). MG132 was included to capture ubiquitylated substrates that would otherwise be degraded by the proteasome. The 4 h incubation in MLN4924 did not affect the HeLa cell cycle (Figure S2B). Three independent replicates identified 9,957 unique peptides, corresponding to 2,814 proteins, at a false discovery rate of 0.11% (Figure 2B, Table S3). Internal validation of peptide identification was provided by the fact that 5,114 (>50%) peptides overlapped between at least two experiments (Figure 2B). Since MLN4924 leads to CRL inactivation, it reduces the heavy/light ratio (H/L) for peptides that contain lysines ubiquitylated by a CRL. Overall, 1,015 unique peptides were quantitatively reduced more than two-fold in at least one replicate (Figure 2C).
The H/L average and standard deviation for all 5,114 unique peptides appearing in multiple replicates was calculated. We selected 364 peptides displaying an average H/L reduction of 2-fold between multiple replicates and added 448 peptides that were reduced more than two-fold, yet quantified in only one replicate. This corresponds to 812 peptides (from 410 proteins) which we designate as our QUAINT-MLN4924 regulated CRL candidates (Table S3). Importantly, the individual values contributing to the mean H/L ratios were very similar across replicates, as 77% of the 364 peptides were reduced ≥2-fold and 94% trended (reduced ≥1.5-fold) in a second replicate. This yielded many known substrates, including SETD8, NFkB, Cyclin D, CDT1, HIF1A, POLR2A, YBX1, CDC25A, ORC1, β-Catenin and many CRL adaptors. The enrichment for known substrates and the overlap between replicates suggests that QUAINT proteomics has identified a large number of CRL substrates with a high degree of confidence.
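The two-part selection rule described here can be sketched as a small filter. The code assumes that the average is taken over the raw H/L ratios (rather than over log ratios) and that a 2-fold reduction corresponds to H/L ≤ 0.5; both are assumptions, and the peptide names and ratios are made up for illustration.

```python
from statistics import mean

def select_candidates(hl_ratios, fold=2.0):
    """Split peptides into multi-replicate and single-replicate candidates.

    hl_ratios maps a peptide to the list of H/L ratios from the replicates
    in which it was quantified.
    """
    threshold = 1.0 / fold
    multi, single = [], []
    for peptide, ratios in hl_ratios.items():
        if len(ratios) >= 2 and mean(ratios) <= threshold:
            multi.append(peptide)      # average reduction of at least `fold`
        elif len(ratios) == 1 and ratios[0] < threshold:
            single.append(peptide)     # reduced more than `fold`, one replicate only
    return multi, single

# Hypothetical H/L ratios (illustrative only).
example = {
    "pepA": [0.30, 0.45, 0.40],
    "pepB": [0.90, 1.10],
    "pepC": [0.25],
    "pepD": [0.80],
    "pepE": [0.35, 0.95],
}
multi, single = select_candidates(example)
print(multi, single)
```

Applied to the real data, the two lists would correspond to the 364 multi-replicate peptides and the 448 single-replicate peptides, which together give the 812 candidates.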
We compared the overlap between the QUAINT and GPS MLN4924 screens. Of these top 410 QUAINT scoring proteins, 295 (72%) are present in the human ORFeome collection and 108 (37%) had a shifted probe distribution in the MLN4924 GPS screen (p-value < 10⁻⁵⁰). This suggests that at least 37% of the QUAINT regulated proteins represent bona fide CRL regulated proteins. Fusion to GFP and a very high initial stability (pre-MLN4924) can preclude the identification of some substrates in GPS, suggesting that this is likely to be an underestimate.
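The text does not state which test produced the overlap p-value. One common choice for this kind of question is a hypergeometric upper-tail test, sketched below under that assumption. The function name is hypothetical and the numbers in the example are a toy case, not the screen data, because the total number of shifted proteins in the library is not given here.

```python
from math import comb

def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when drawing `draws` items without replacement
    from `population` items, of which `successes` are of the type of interest."""
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total

# Toy example: 20 proteins in a library, 5 of them shifted in one screen,
# 4 scoring in a second screen, 2 of those 4 also shifted.
p = hypergeom_upper_tail(population=20, successes=5, draws=4, observed=2)
print(p)
```

For the comparison in the text, the population would be the ORFeome proteins assayed by GPS, the draws would be the 295 QUAINT proteins present in the library, and the observed overlap would be 108.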
The 108 proteins that overlapped in both screens are depicted in Figure 2D. Known substrates, substrate adaptors, proteins with known interactions with CRLs, and proteins that scored in other GPS screens for CRL substrates (see below) are colored (see Figure Legend). Strikingly, these account for 63 (58%) of the proteins on the QUAINT-GPS overlap list, suggesting that our complementary approaches identified a high confidence list of both known and previously unrealized CRL substrates.
The successful identification of CRL substrates prompted us to interrogate specific ligases with GPS. Following DNA damage, CRL4 causes the rapid degradation of p21CIP1 (CDKN1A), CDT1 and SETD8/SET8 (reviewed in Abbas and Dutta, 2011). The substrate specificity factors for CRL4, termed DCAFs (DDB1 and Cul4-associated Factors), contain a WDxR motif. We co-expressed dominant negative Cul4A and Cul4B (DN-Cul4) to disrupt CRL4 and found that it prevented the destabilization of CDT1 following treatment with UV light, or the UV mimetic 4NQO, confirming its inhibition (Figure 3A). We also generated a GPS-p21CIP1 expressing cell line (CDT1 fusion to EGFP prevents its recognition by CRL4). GPS-p21CIP1 is destabilized following UV and this is blocked by DN-Cul4 (Figure 3B).
Conditions were optimized for the maximum duration of DN-Cul4 treatment that would not affect the overall stability of the library. A 293T–GPS library was treated with DN-Cul4 or a control empty vector expressing virus for 20 h and both conditions were treated with 4NQO for 2 h prior to FACS since the degradation of some CRL4 substrates is triggered by DNA damage. The PSI for probes in DN-Cul4 and control conditions showed a linear relationship (R² = 0.91; Figure S4B), suggesting the screen data is of high quality. After ΔPSI ranking and inspecting probe graphs we identified 279 high priority candidates and successfully validated 113 by individually retesting (37%) using FACS. Twenty of these validated candidates were assayed for stabilization by immunoblot of EGFP tagged proteins after DN-Cul4 and all validated (Table S4). Importantly, reanalysis of the MLN4924-GPS screen graphs for each of the 279 candidates revealed that substrates shifting in both CRL4 and MLN4924 GPS screens validated at a rate of 72% (Figure 3E, Table S4), suggesting that cross-referencing overlapping screens can reduce the false-positive rate of GPS. A subset of validated CRL4 candidate substrates is shown in Figure 3C. High confidence hits scoring in both the MLN4924 and CRL4 GPS screens and validated when re-examined by flow cytometry are depicted in Figure 3E.
To assess cell type specificity, we tested a subset of validated proteins by FACS in additional lines (HeLa and U2OS). Ferritin is the primary iron uptake and storage protein in cells and its dysfunction has been associated with neurodegenerative disease. FTH1 (ferritin heavy chain) scored in both the MLN4924 and CRL4 GPS screens and was validated in 293T and HeLa cells (Figure 3C and Figure S4C). FUT11 is a cytoplasmic fucosyltransferase enzyme that scored in the CRL4 and MLN4924 GPS screens, and was validated in 293T, HeLa and U2OS cells (Figure 3C and Figure S4C). All 10 proteins tested in HeLa and U2OS cells (FUT11, FTH1, KIAA0101/PAF15, TM2D1, ITM2A, KIAA1680, LDHB, MAPK6, FAM53A and MCM7) validated in at least one of those cell types.
Several solute carriers scored in both the MLN4924 and CRL4 GPS screens, including SLC38A2, SLC29A2, SLC17A3 and SLC39A13. SLC38A2 is a sodium dependent amino acid transporter and SLC29A2 is a nucleoside transporter. In addition, LDHB (lactate dehydrogenase B), which catalyzes the inter-conversion of lactate and pyruvate, and NAD and NADH, in the glycolytic pathway, scored in the CRL4 screen and the QUAINT MLN4924 screen, and validated in 293T and HeLa cells (Figure 3C and Figure S4C). Taken together, this suggests a role for CRL4 in regulating various aspects of metabolism and cellular homeostasis.
Numerous nuclear proteins, particularly transcriptional regulators, scored in the CRL4 screen. CCNH, encoding Cyclin H, a component of the CDK-activating Kinase (CAK) and TFIIH and a key regulator of RNA Pol II, scored in both the CRL4 and MLN4924 GPS screens and validated by endogenous immunoblotting after MLN4924 treatment. FAM53A, which was strongly stabilized in both the CRL4 and MLN4924 GPS screens and validated with DN-Cul4, is a nuclear protein involved in dorsal neural tube development (Figure 3C and Figure S4C) (Jun et al., 2002). ETS2, which scored in the CRL4 and MLN4924 GPS screens and validated with DN-Cul4, is a winged helix-turn-helix transcription factor involved in telomere maintenance through hTERT transcriptional regulation, and in maintenance of trophoblast and colonic stem cells (Figure 3C) (Munera et al., 2011; Wen et al., 2007; Xu et al., 2008). HDAC3 interacts with SMRT and N-CoR in a nuclear co-repressor complex and scored and validated in both DN-Cul4 and MLN4924 screens (Figure 3C) (Karagianni and Wong, 2007). Endogenous HDAC3 was also validated by cycloheximide chase following treatment with DN-Cul4 (Figure S4D). INTS3 and INTS4 are components of the Integrator Complex, which binds to the C-terminal domain of RNA polymerase to aid in processing of small nuclear RNAs. INTS3 and INTS4 scored in the MLN4924 and CRL4 GPS screens and INTS3 validated by FACS in 293T cells (Table S4). Together this strongly argues that CRL4 plays an important role in regulating a variety of transcription factors.
The MCM2–7 helicase complex unwinds DNA ahead of the replicative DNA polymerase. MCM2, MCM5 and MCM7 scored in the DN-Cul4 GPS screen and MCM2, MCM3, MCM5 and MCM7 scored in the MLN4924 GPS (Figure S4E). All components showed regulated ubiquitylation in the QUAINT analysis and were validated by DN-Cul4 in 293T cells (Figure 4F and Table S4). GPS-MCM7 also validated in HeLa cells following DN-Cul4 treatment (Figure S4E). While very little is known about potential MCM complex ubiquitylation, an increase in ubiquitin dependent turnover of MCM3 in G1 phase has been observed (Cheng et al., 2002).
CRL3KEAP1 constitutively degrades the oxidative stress response transcription factor NRF2 (Cullinan et al., 2004; Kobayashi et al., 2004). Following oxidative damage, KEAP1 is inhibited, allowing NRF2 to rapidly accumulate and initiate transcription. CRL3 utilizes Kelch-BTB proteins (Bric-a-brac, Tramtrack and Broad) as substrate specificity factors (Xu et al., 2003). To confirm that DN-Cul3 inhibited CRL3, we immunoblotted treated cells for NRF2 (Figure 4A). GPS-NRF2 was stabilized following DN-Cul3 treatment to the same extent as strong oxidative stress induced by TBHQ treatment (tert-Butylhydroquinone; Figure 4C, D).
Comparing the PSI for probes from DN-Cul3 treated and control treated samples revealed an R² value of 0.91. The screen identified the well characterized substrates, NRF2, DAPK1 and DVL1. After ranking probes and inspecting graphs, we identified 188 high priority candidate substrates (Table S5). We cross-referenced the high priority CRL3 candidates against the probe graphs for the MLN4924-GPS screen and identified 88 proteins that overlap in both screens which we predict will validate at a high rate (70–80% based on the results of our SCF and CRL4 screens; Figure 4F and Table S5). This list is enriched for proteins containing the BTB-Kelch fold found in CRL3 specificity factors (Figure 4G). Based on our analysis and validation of the CRL4 and SCF (below) screens and the identification of numerous substrate specificity factors as well as known substrates, we predict that this overlapping list contains many CRL3 substrates.
We previously applied the first generation GPS system to the identification of SCF substrates utilizing DN-Cul1 to inhibit ligase activity (SCF-GPS.1) (Yen and Elledge, 2008). We used the second generation GPS library, with conditions optimized from our first screen, to identify additional substrates (SCF-GPS.2). The 293T–GPS library was treated with either a lentivirus expressing DN-Cul1 or empty vector. Comparison of the PSI for all probes between the two conditions yielded an R² value of 0.95. Validation was performed by individually retesting ORFs under the conditions of the screen and yielded a validation rate of approximately 59% (80 out of 139 high priority candidates tested; Table S6). This was an improvement over SCF-GPS.1, which had a 47% validation rate. In addition, all of the SCF-GPS.1 validated proteins that were individually tested in the second generation lentiviral GPS-LP vector recapitulated stabilization in response to DN-Cul1. Performing this screen with the updated pGPS-LP library validated 67 additional putative SCF substrates not recovered in our original screen (Table S6), such as TRIM9, BZW1, ZNF238, HFM1, MICALL2 and SH3BP5L. TRIM9 interacts with the F-box protein β-TRCP by yeast two-hybrid (Rual et al., 2005) and contains the β-TRCP degron sequence (DSGxxS), strongly suggesting that it is controlled by SCFβ-TRCP.
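A degron motif such as DSGxxS can be located in a protein sequence with a simple pattern scan. The sketch below is illustrative only: the helper name is hypothetical, the toy sequence is made up (it is not the TRIM9 sequence), and the scan ignores the phosphorylation state of the serines, which matters for recognition in cells.

```python
import re

# DSGxxS, where x is any residue; the lookahead also reports overlapping matches.
DEGRON_PATTERN = re.compile(r"(?=(DSG..S))")

def find_degrons(sequence):
    """Return (1-based start position, matched motif) for each DSGxxS occurrence."""
    return [(m.start() + 1, m.group(1)) for m in DEGRON_PATTERN.finditer(sequence)]

# Made-up sequence containing two motif instances (illustrative only).
toy_sequence = "MKADSGIHSGLTAADSGQWSEE"
hits = find_degrons(toy_sequence)
print(hits)
```

A match of this kind is only suggestive; combined with interaction evidence such as the yeast two-hybrid result, it strengthens the case that a candidate is a genuine substrate.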
Validated proteins that overlap between the MLN4924 and SCF GPS screens are shown in Figure 5B. Proteins scoring in both SCF and MLN4924 screens validated at a rate of ~81% (Table S6). Since the identification of a protein in multiple screens is a strong predictor of its likelihood to validate, we have generated a summary table of 472 proteins that either individually validated in GPS or scored in one GPS and a second GPS or QUAINT screen (Table S7).
Since phosphorylation can drive proteolysis, we cross-referenced our overlap lists with phospho-proteomic cell-cycle and DNA damage screens (Dephoure et al., 2008; Matsuoka et al., 2007) and identified NUSAP1, which shows regulated phosphorylation in mitosis and in response to DNA damage. NUSAP1 is a cell cycle regulated microtubule binding protein with roles in chromosome congression and segregation (Raemaekers et al., 2003; Ribbeck et al., 2006; Ribbeck et al., 2007). To identify the specific ligase controlling NUSAP1, we treated cells with MLN4924 or individual dominant negative cullins (DN-Cul). MLN4924 confirmed the CRL-dependency (Figure 6A) and only DN-Cul1 produced a significant increase in NUSAP1 levels (Figure 6B). To identify the F-box protein for NUSAP1, we performed an siRNA screen of all 69 known F-box proteins. U2OS cells were transfected with siRNA pools targeting each of the different F-box proteins and 72 h post transfection, cells were harvested and immunoblotted for NUSAP1. We found that depletion of Cyclin F, the founding member of the F-box family, increased the levels of NUSAP1 (Figure 6C). To date, CP110 is the only known SCFCyclin F substrate (D’Angiolella et al., 2010). To confirm specificity, we tested four independent Cyclin F siRNAs and found Cyclin F depletion inversely correlated with NUSAP1 levels (Figure S5A). To test whether this regulation was post-translational, Cyclin F was depleted from 293T cells expressing GPS-NUSAP1 or GPS-MDH1 (negative control). Cyclin F depletion caused an increase in the EGFP/DsRed ratio of GPS-NUSAP1 cells (Figure 6D), reflecting an increase in EGFP-NUSAP1 stability, relative to the control siRNA.
Since SCFCyclin F ligase activity is cell cycle regulated, we examined NUSAP1 protein levels throughout the cell cycle. NUSAP1 accumulated during S and G2 phase following release from synchronization at the G1/S boundary and was destroyed at the end of mitosis. Its destruction in late mitosis and G1 is similar to that of PAF15 and Cyclin B (Figure 6E). This was expected since NUSAP1 has been reported to be an APC/C substrate (Song and Rape, 2010), similar to PAF15 and Cyclin B (Emanuele et al., 2011; King et al., 1995).
Following release from a double thymidine block, Cyclin B and PAF15 levels were relatively high, increasing ~20% between the time of release and their maximal level achieved in mitosis. NUSAP1 levels were low throughout S and abruptly increased in G2 (Figure 6E, graph). We asked whether Cyclin F controlled NUSAP1 during S and G2. U2OS and HeLa cells treated with Cyclin F siRNA were synchronized at the G1/S boundary and released into the cell cycle. Cyclin F depletion increased the levels of NUSAP1 during S and G2 phase following only 24 h of siRNA depletion (Figure 6F and Figure S5B). Importantly, Cyclin F depletion does not affect cell cycle timing following release (Figure 6C and Figure S5B). To confirm that Cyclin F and NUSAP1 interact, Flag-Cyclin F was immunoprecipitated from HeLa cells. Immunoblotting of the precipitates showed that full length Cyclin F, but not a truncation lacking the Cyclin Box (Cyclin F 1–270), precipitated endogenous NUSAP1 (Figure 6G). We conclude that SCFCyclin F targets NUSAP1 for degradation during S and G2 phases of the cell cycle.
NUSAP1 is phosphorylated on S124 by the ATM/ATR kinases following DNA damage (Matsuoka et al., 2007; Xie et al., 2011). Based on this information, we examined NUSAP1 protein levels following DNA damage with ultraviolet light (UV), the UV mimetic 4NQO and ionizing radiation (IR). We found UV and 4NQO, but not IR, caused NUSAP1 degradation in U2OS, HeLa and 293T cells (Figures 7A, S5C, S5D). Degradation was observed at 1 h following UV treatment, and with as little as 10 J/m² UV. Importantly, cells treated with MG132 or MLN4924 could not effectively degrade NUSAP1, demonstrating both proteasome and CRL dependence (Figure 7B). To map the ligase responsible for NUSAP1 degradation, we employed a panel of dominant negatives targeting each of the cullins. We found that SCF inhibition prevented NUSAP1 degradation (Figure 7C). However, depletion of Cyclin F had no effect on the UV-induced NUSAP1 degradation, indicating it is likely to be a substrate of two distinct SCF ligases (Figure 7D).
NUSAP1 resides at chromosome 15q15.1, a region frequently deleted in a wide variety of cancers. The role of NUSAP1 in microtubule function prompted us to examine the potential sensitivity of NUSAP1 depleted cells to anti-tubulin chemotherapeutics. We found that depletion of NUSAP1 with three independent siRNAs made both U2OS and HCT116 cells highly sensitive to treatment with the anti-tubulin cancer therapeutic taxol relative to siFF treated controls (Figures 7E and S5G) without perturbing the cell cycle distribution (Figure S5E, F). This phenotype was rescued by re-introduction of a NUSAP1 ORF lacking the 3'UTR following depletion with the siRNA targeting the 3'UTR (Figure S5H). We also observed a similar degree of sensitivity to nocodazole, which disrupts the microtubule cytoskeleton through an alternative mechanism, in U2OS and HCT116 cells (Figures 7E and S5G). This suggests that the presence of NUSAP1 makes cells more resistant to the toxic effects of anti-tubulin chemotherapeutics and could explain taxol sensitivity in tumors deleted for NUSAP1.
Functional categorization and domain analysis of proteins identified by QUAINT and GPS are shown in Tables S1 and S2 and confirm the role of CRLs in a wide swath of cellular processes. We performed a similar analysis on the validated substrates from each individual GPS screen. CRL4 regulated proteins were enriched for involvement in nuclear, Golgi and endoplasmic reticulum function, as well as DNA metabolism, replication and repair (Figure 3D). Domain analysis also showed enrichment for WD40 repeat proteins, which serve as substrate adaptors for CRL4, as well as for protein kinases, zinc fingers and others (Figure 3D). The enrichment for replication and repair was expected from the known role of CRL4; however, a role in most other categories has not been previously established for CRL4. Domain and functional category analysis for putative CRL3 and SCF substrates are shown in Figure 4F and Figure 5A, respectively. Domain analysis for putative CRL3 and SCF substrates revealed a significant enrichment for their respective adaptors, Kelch-BTB and F-box proteins. In addition, SCF substrates showed the expected enrichment for proteolysis but an unexpected enrichment for cytoskeleton and cell projection, suggesting a role for the SCF in cell migration. Most importantly, the functional analysis for each ligase is distinct. This suggests that each CRL evolved in a specialized fashion to regulate specific aspects of cellular physiology.
We performed an interaction analysis for proteins that validated for regulation by CRLs in our screens to determine the degree to which these proteins participated in protein interaction networks. The network that emerged from this analysis was analyzed for a property called “betweenness”, which is a statistical measure of a protein's centrality within an interaction network. A higher degree of betweenness indicates a greater degree of inter-connectivity within a network. The CRL candidate list of 472 proteins, which scored in at least two overlapping screens, was mapped onto the most current BioGRID human protein-protein interaction network (Table S7). This analysis demonstrated that CRL regulated proteins show a high degree of betweenness (p-value of 3.96×10⁻¹⁵; Figure S6), indicating that they are highly connected within protein interaction networks. In addition, proteins scoring with greater than a 2-fold change by QUAINT (Table S3) and those that overlapped between the MLN4924 GPS and QUAINT screens (Figure 2D) also showed a high degree of betweenness (p-values of 1.1×10⁻²² and 2.08×10⁻⁹, respectively; Table S7). Graphs demonstrating the increased protein interactions for CRL candidate substrates are shown in Figure S6D.
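To make the notion of betweenness concrete, the sketch below computes unnormalized betweenness centrality on a small undirected network using Brandes' algorithm. It is a minimal illustration, not the analysis pipeline used for the BioGRID network; the node names and the toy network are hypothetical, and the statistical comparison against background proteins is not shown.

```python
from collections import deque

def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph
    given as {node: set_of_neighbors}. Returns unnormalized scores."""
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search recording shortest-path counts and predecessors.
        order = []
        preds = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}
        dist = {v: -1 for v in adjacency}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                centrality[w] += delta[w]
    # Each undirected shortest path was counted once from each endpoint.
    return {v: c / 2.0 for v, c in centrality.items()}

# Toy network: a hub H linked to A, B and C, with D attached to C.
toy_network = {
    "H": {"A", "B", "C"},
    "A": {"H"},
    "B": {"H"},
    "C": {"H", "D"},
    "D": {"C"},
}
scores = betweenness_centrality(toy_network)
print(scores)
```

In the toy network the hub lies on the shortest paths between most pairs of nodes and therefore receives the highest score, which is the property the analysis tests for among CRL regulated proteins.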
Based on these results, we analyzed the individual validation lists for the SCF, CRL4 and MLN4924 GPS screens. As expected, the validated substrate lists all showed a statistically significant degree of betweenness (p-values of 3.7×10⁻³, 3.2×10⁻³ and 1.72×10⁻⁶, respectively; datasets in Table S7). Network diagrams showing the betweenness centrality of putative substrates from these screens are depicted in Figure S6. A sub-network for SCF is also shown in Figure 5C. Thus, the proteins regulated by CRL ligases represent central hubs within networks and pathways. By regulating these critical junctures, we hypothesize that CRLs could have a maximal impact on a particular pathway.
Regulated ubiquitin-mediated proteolysis through E3 ligases is a critical aspect of cellular homeostasis. Functionally, E3 ligases are equivalent to micro-RNAs in the hierarchy of regulated gene expression. Just as micro-RNAs are sequence-specific adaptors that target RNA molecules for destruction to filter the transcriptome, E3 ligases are sequence-specific adaptors that target proteins for destruction to filter the proteome. Their ability to reshape the proteome in response to stimuli is of vital importance to both development and physiological responses in eukaryotes. Thus, it is critically important to identify the substrates of ubiquitin ligases at a systems level.
Here we have applied two emerging techniques, GPS and QUAINT, to identify ubiquitylation substrates of the CRL family of E3s. Of the known SCF substrates (Skaar et al., 2009) in the ORFeome library, approximately 22% were identified by GPS. This is an underestimate of the potential of GPS for substrate identification, since known substrates were identified from a variety of cell types and we performed GPS analysis in only one. Since cell lines vary with respect to the constellation of substrate adaptors and activity of different signaling pathways, complete overlap is not expected. Of the known SCF substrates, QUAINT identified 14%, which is also an underestimate.
QUAINT and GPS have different strengths. GPS measures protein abundance and directly interrogates changes in protein stability independent of their endogenous abundance or expression in a specific cell type. Despite these strengths, GPS suffers from a reliance on an N-terminal GFP fusion, which can affect protein localization and degron activity in certain cases. We are in the process of reconstructing libraries with alternative N- and C-terminal tags to circumvent some of these issues. In addition, expression levels can affect our ability to detect substrates, as evidenced by the fact that some of the validated SCF substrates from the first-generation screen did not rescore in the current screen. GPS currently relies on ORFeome collections that are not sequence verified and contain truncated and mutant proteins. Advances in ORFeome collections and the assembly of a non-redundant, sequence-verified ORFeome collection will enhance the GPS system.
The strength of QUAINT is its quantitative nature and ability to recognize endogenous proteins. It is also able to identify ubiquitylation events, such as mono-ubiquitylation, that do not result in protein degradation. Moreover, it identifies ubiquitylation sites that could provide mechanistic insights and inform substrate mutational analysis. However, QUAINT cannot distinguish whether the ubiquitin modification reflects a change in the entire protein population of a substrate or a small fraction, the biological significance of which is less certain. Furthermore, QUAINT cannot distinguish between ubiquitin, ISG15 and Nedd8 modification because all three modifiers leave a GG-lysine after trypsin proteolysis. In addition, it is possible that ubiquitylated proteins which change transcriptionally in response to stimuli, such as MLN4924, may appear to have altered ubiquitylation that does not reflect a change in ligase activity. QUAINT cannot distinguish between mono-ubiquitylation and poly-ubiquitylation, only the latter of which could affect protein stability. Finally, QUAINT is biased towards more abundant substrates.
Both the GPS and QUAINT strategies identified known CRL substrates missed by the other. For example, QUAINT identified β-CATENIN and CDT1, which are missed by GPS: β-CATENIN because it is not encoded by the current ORFeome collection, and CDT1 because conjugation with GFP interferes with its degron. Conversely, the GPS-MLN4924 screen identified a number of well-characterized substrates missed by our proteomic efforts, including, but not limited to, NRF2, DVL1, PDCD4, CDKN1A, CDKN1B, FBXO5/EMI1 and MCL1. With ongoing ORFeome development and C-terminal tagging strategies, we expect the GPS system to continue to improve. Importantly, the combination of these emerging techniques has given us a very deep snapshot of the regulated protein stability and modification landscapes, one that will only improve in the future.
As a proof of principle, using layered genetic screens, we have identified the precise ligase for one CRL-regulated protein. We discovered that NUSAP1 is a substrate of SCFCyclin F: NUSAP1 and Cyclin F interact with one another, and Cyclin F depletion specifically affects the stability of NUSAP1. Cyclin F has only one other known substrate, CEP110 (D’Angiolella et al., 2010). CEP110 localizes to centrosomes and regulates their duplication cycle (Ou et al., 2002). CEP110, like NUSAP1, is involved in regulating the microtubule cytoskeleton. Since depletion of either CEP110 or NUSAP1 causes chromosome segregation defects, the upstream regulation of these two factors by Cyclin F is likely to be essential for maintaining chromosome stability. NUSAP1 is also destabilized upon treatments that activate the ATR/ATRIP pathway. Importantly, this destabilization is Cul1-dependent but Cyclin F-independent. Since NUSAP1 is also an APC/C substrate, it is likely to be regulated by multiple cullin-based E3 ubiquitin ligases: APC/C, SCFCyclin F, and a third SCF-based ligase.
NUSAP1 is located in a focal deletion region in many tumor subtypes. Because we found that cells depleted of NUSAP1 were sensitive to taxol, tumors with reduced NUSAP1 might be responsive to taxol. Further, if the degradation of NUSAP1 in response to chemotherapeutic agents that cross-link DNA is a general finding, then tumor cells containing NUSAP1 but lacking a functional G2/M checkpoint might be especially sensitive to a combination therapy consisting of the proper class of DNA damaging agent and taxol. These findings may have important implications for cancer therapies based on neddylation inhibition.
Functional category enrichment analysis for individual CRLs showed that each ligase evolved in a specialized manner to control distinct physiological pathways. Moreover, we find that the CRL substrate list is strongly enriched for a property known as betweenness; that is, CRL substrates are over-represented among hubs within interaction maps. Thus, they are much more highly connected than the average protein in the database (akin to a highly connected individual in a social network). Highly connected nodes are positioned to control the flow of information across a network, and their removal will have the highest impact across the network due to their inherent connectivity. In yeast, node genes are three times more likely to be essential than their non-node counterparts (Jeong et al., 2001; Jonsson and Bates, 2006). We have reanalyzed the yeast data with the more complete interaction network in BioGRID and have found that our set of CRL substrates is as highly enriched for betweenness as the set of essential genes in S. cerevisiae (Figure S6D). The biological interpretation of our observation that CRL substrates represent nodes is that cells have evolved mechanisms to regulate the abundance of the most essential components of networks. Since signal transduction pathways control regulated ubiquitylation by CRLs, these pathways turn on or off key nodes to maximally impact cellular physiology. Together, these observations indicate that the CRL substrates we have identified represent a collection of key regulatory proteins. We predict that these proteins are highly enriched for indicators of the physiological state of the cell. This study, together with anticipated future studies, will allow us to gain a systems level understanding of the dynamics of protein stability in the proteome.
A complete description of experimental procedures is available in the supplemental experimental procedures.
Cells were transfected with plasmids using TransIT transfection reagent (Mirus). Retroviruses and lentiviruses were packaged in 293T cells using standard techniques. Cells were transfected with siRNA oligonucleotides using RNAiMAX (Invitrogen).
Flag IPs were performed from 10 cm plates of HeLa cells 24 h after transfection and 3 h after treatment with 5 µM MG132. Cells were lysed in NETN buffer containing protease and phosphatase inhibitors. Clarified, combined nuclear and cytoplasmic lysates were precipitated with Flag-M2 agarose beads for 3 h at 4°C. Beads with immune complexes were washed 4 times and boiled in SDS-PAGE sample buffer.
Cell viability was measured using Cell Titer Glo Reagent (Promega) according to the manufacturer's protocol. For viability assays following NUSAP1 depletion, cells were treated with siRNA for 16–18 h before replating in 24-well plates. Four hours after replating, media was supplemented with taxol or nocodazole and viability was assayed 72 h later. All experiments were performed in triplicate and the data reported are means.
GPS screens were performed essentially as described (Yen and Elledge, 2008). For the MLN4924 screen, cells were treated for 4 h with 1 µM drug. For the SCF, CRL3 and CRL4 screens, cells were treated with lentivirus expressing DN-Cul1, DN-Cul3, or both DN-Cul4A and DN-Cul4B, respectively. Control cells were treated with an empty-vector lentivirus or DMSO. A viral titer of greater than 10 was used for all of the CRL screens.
Microarray probe data were filtered and normalized, and a ΔPSI value was calculated for each probe. For each probe, a graph was drawn comparing the probe distribution across bins for treated and untreated samples (example in Figure 1A, part v). Graphs were rank ordered by ΔPSI, and every graph with a ΔPSI greater than 0.25 was visually inspected, one at a time, to assess the significance of the probe shift.
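The ranking step can be illustrated with a minimal sketch. This section does not define PSI; the sketch assumes that PSI is the signal-weighted mean bin index (bins numbered from 1), as we understand the original GPS description (Yen and Elledge, 2008), and that ΔPSI is the treated PSI minus the untreated PSI. The function names and the example signal values are hypothetical, and the filtering and normalization steps are not modeled.

```python
def psi(bin_signals):
    """Assumed PSI: signal-weighted mean bin index, bins numbered from 1."""
    total = sum(bin_signals)
    if total <= 0:
        raise ValueError("probe has no signal across bins")
    return sum(i * s for i, s in enumerate(bin_signals, start=1)) / total


def delta_psi(treated, untreated):
    """Shift in PSI between treated and untreated samples."""
    return psi(treated) - psi(untreated)


def rank_candidates(probes, threshold=0.25):
    """Return (probe, delta PSI) pairs above the threshold, largest shift first.

    probes maps a probe name to (treated_bins, untreated_bins).
    The 0.25 cutoff is the one stated in the text.
    """
    scored = [(name, delta_psi(t, u)) for name, (t, u) in probes.items()]
    kept = [item for item in scored if item[1] > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical example: probe_a shifts toward the high-abundance bins
# after treatment, probe_b does not move.
example = {
    "probe_a": ([1, 2, 3, 4], [4, 3, 2, 1]),
    "probe_b": ([1, 1, 1, 1], [1, 1, 1, 1]),
}
shortlist = rank_candidates(example)
```

A shortlist of this kind only orders the candidates; in the procedure described above, each graph passing the cutoff was still inspected by eye.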
HeLa cells were grown in DMEM containing heavy or light arginine and lysine, as described (Matsuoka et al., 2007). Cells were treated with 5 µM MG132 (Boston Biochem) and/or 10 µM MLN4924. Cells were harvested and lysed in denaturing buffer. Heavy and light lysates were mixed in a 1:1 ratio, and proteins were digested with trypsin, followed by desalting on a Sep-Pak C18 column (Waters). Peptides were dissolved in IP buffer (50 mM MOPS buffer pH 7.2, 10 mM sodium phosphate, 50 mM NaCl) and incubated with PTMScan Ubiquitin Remnant Antibody Beads (Cell Signaling Technology, Inc). Beads were washed with IP buffer followed by water, and enriched ubiquitylated peptides were eluted with 0.15% TFA, followed by LC-MS/MS analysis using an LTQ Orbitrap Velos.
Protein interactions in Figure 2C were identified using BioGRID, except for SOD1, which has a genetic interaction with Cul1. We used Pfam 25.0 (http://pfam.sanger.ac.uk/) to analyze protein domains. We used DAVID 6.7 to test the gene functions of all lists and used genes in the ORFeome library as a background for gene-annotation enrichment analysis (Huang da et al., 2009).
To determine protein-protein interactions, we used the online database BioGRID 3.1.77 (Stark et al., 2011). The interactions are displayed as a graphical network using the open source software Cytoscape 2.8.1 (Smoot et al., 2011). The betweenness centrality of each gene is a number between 0 and 1. For a given Gene A, if all connections pass through Gene A, its betweenness value is 1; if Gene A is a terminal node in the network, the value is 0. We used the sign test to determine whether the genes in a list have higher betweenness values than the network as a whole. The median of each network is calculated from the sorted list of betweenness values, and the gene list is separated into two sets: (a) genes with values less than or equal to the median and (b) genes with values greater than the median. The p-value is then determined from the binomial distribution.
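The betweenness calculation and the sign test can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline itself: Brandes' algorithm stands in for the values computed in Cytoscape, the network is treated as undirected and unweighted, the test is assumed to be one-sided with a null probability of 0.5, and values tied with the median are counted in the "less than or equal" set. All function names and the toy network are hypothetical.

```python
from collections import deque
from math import comb
from statistics import median


def betweenness(adj):
    """Normalized betweenness (0 to 1) for an undirected, unweighted graph.

    adj maps each node to a list of its neighbors (Brandes' algorithm).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    # Every unordered pair is visited from both ends, so (n-1)(n-2)
    # is the maximum possible raw score in an undirected graph.
    denom = (n - 1) * (n - 2)
    return {v: (bc[v] / denom if denom > 0 else 0.0) for v in adj}


def sign_test(list_values, network_values):
    """One-sided sign test of a gene list against the network median.

    Returns (count above median, list size, p-value).
    """
    med = median(network_values)
    above = sum(1 for x in list_values if x > med)
    n = len(list_values)
    p = sum(comb(n, k) for k in range(above, n + 1)) / 2 ** n
    return above, n, p


# Toy star network: every shortest path between leaves passes through the hub.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
scores = betweenness(star)
result = sign_test([scores["hub"]], list(scores.values()))
```

In the star network the hub scores 1 and each terminal node scores 0, matching the two limiting cases described above. Because ties with the median fall in the lower set, this form of the test is conservative when many proteins share the median value.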
We thank the members of the Elledge lab for helpful discussions. We thank W. Harper for the Cul3 and Cul4 dominant negative clones and F-box siRNA, S. Gygi for access to and assistance with MS/MS analysis software, the technical support of the IMB Bioinformatics Services Core of Academia Sinica, and Michele Pagano for Cyclin F plasmids. We thank Marc Vidal and David Hill for the ORFeome library. MJE is the Philip O’Bryan Montgomery Jr., MD fellow of the Damon Runyon Cancer Research Foundation (DRG-1996-08). AEHE is supported by fellowships from The Jane Coffin Childs Foundation and The American Society for Radiation Oncology. CRT is funded by an EMBO and HFSP fellowship. QX is supported by a fellowship from the American Cancer Society. HCY was supported by NSC grant (NSC100-2628-B-001-014-MY3). This work was supported by an NIH grant to SJE. SJE is an investigator with the Howard Hughes Medical Institute.