The observation of genetic interactions is key to the definition of cellular networks. RNAi has enabled genetic approaches in both cultured mammalian cells (1) and intact animals (6). Large-scale screens of small interfering RNA (siRNA) (10) and shRNA collections (5) have generally adopted a one-by-one approach, interrogating phenotypes in a well-based format. This requires both considerable infrastructure and a substantial investment for each cell line to be screened. Alternatively, shRNA collections can be screened by assaying enrichment from pools, but this limits the range of phenotypes that can be addressed. Our focus was identifying essential genes or synthetically lethal genetic interactions through shRNAs that were selectively depleted from populations. This type of screen holds promise for the discovery of novel targets for cancer therapy and genetically validated combination therapies. Previously, one such screen was reported; however, this tested only ~500 shRNAs in a single pool (17). We therefore sought methods that allow multiplex analysis of phenotypic outputs on a genomic scale.
Pooled libraries drew from our previous collections wherein shRNAs are carried in a backbone derived from miR-30 (18). Combining RNA polymerase II promoters with miR-30–based shRNAs permits efficient suppression even with a single-copy integrant (19). Therefore, pooled shRNAs were transferred from pSM2 (18) to pLMP (19), wherein shRNA expression is driven from the murine stem cell virus long-terminal repeat promoter. Three different pools, containing ~6000, ~10,000, and ~20,000 shRNAs, were constructed to test screening at varying scales and levels of population complexity. Target cell populations were infected such that each cell contained, on average, a single integrated virus, and each individual shRNA occupied ~1000 cells. Three parallel infections generated biological replicate samples. Because our goal was to identify essential genes, genomic DNA was prepared from each replicate at three time points during a simple outgrowth assay.
Fig. 1 Experimental approach. (A) shRNA plasmids were packaged into retroviruses in triplicate and introduced into replicate target cell populations at a multiplicity of ~0.3 to achieve ~1 integrant per cell. Over a 2-week culture period, time points were collected ...
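The infection parameters quoted above can be checked with a short calculation. The sketch below assumes that the number of integrations per cell follows a Poisson distribution with mean equal to the multiplicity of infection; this is a common modeling assumption and is not stated in the text. It also assumes that the figure of ~1000 cells per shRNA refers to infected cells. All function names are illustrative.

```python
import math


def poisson_pmf(k: int, moi: float) -> float:
    """Probability that a cell carries exactly k integrants at a given MOI."""
    return math.exp(-moi) * moi ** k / math.factorial(k)


def infection_summary(moi: float) -> dict:
    """Summarize infection outcomes under the assumed Poisson model."""
    p0 = poisson_pmf(0, moi)          # uninfected cells
    p1 = poisson_pmf(1, moi)          # cells with exactly one integrant
    infected = 1.0 - p0
    return {
        "fraction_infected": infected,
        "single_among_infected": p1 / infected,
        "mean_integrants_among_infected": moi / infected,
    }


def cells_to_infect(n_shrnas: int, cells_per_shrna: int, moi: float) -> int:
    """Cells to expose to virus so that infected cells number
    n_shrnas * cells_per_shrna (assumes uniform shRNA representation)."""
    infected = 1.0 - poisson_pmf(0, moi)
    return math.ceil(n_shrnas * cells_per_shrna / infected)


if __name__ == "__main__":
    summary = infection_summary(0.3)
    for key, value in summary.items():
        print(f"{key}: {value:.3f}")
    # Pool sizes taken from the text; 1000 cells per shRNA as stated there.
    for pool in (6000, 10000, 20000):
        print(pool, cells_to_infect(pool, 1000, 0.3))
```

Under this model, a multiplicity of 0.3 infects roughly 26% of cells, and about 86% of the infected cells carry a single integrant, which is consistent with the stated aim of ~1 integrant per cell.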
Each shRNA cassette contains two unique identifiers: the shRNA itself and a random 60-nucleotide barcode. Barcode sequences were determined for the human shRNA library, and custom, multiplex format microarrays were prepared that contained both barcode and half-hairpin (HH) probes (21). Proviral DNA fragments encompassing both shRNAs and barcodes were amplified from genomic DNA pools and hybridized to arrays in competition with a common reference.
We established a rigorous data analysis pipeline (22) for analyzing pooled shRNA screens. Correlations between biological replicates were high but diminished at later time points, whereas correlations between the reference channels remained unchanged (table S1). Overall, a gene was scored as a candidate if either its barcode or shRNA probe showed greater than 2-fold change with a false discovery rate (FDR) <10%.
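The scoring rule just described can be written down compactly. The details of the actual pipeline are given in reference 22 and are not reproduced here, so the sketch below rests on assumptions: FDR is estimated with the Benjamini-Hochberg procedure, each probe supplies a sample-to-reference ratio and a q-value, and a change in either direction counts toward the 2-fold threshold. The function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def probe_passes(fold_change, q, min_fold=2.0, max_fdr=0.10):
    """fold_change is a positive sample/reference ratio; values below 1
    indicate depletion. The magnitude is taken in either direction."""
    magnitude = max(fold_change, 1.0 / fold_change)
    return magnitude > min_fold and q < max_fdr


def gene_is_candidate(barcode, hairpin, min_fold=2.0, max_fdr=0.10):
    """A gene scores if EITHER its barcode or its half-hairpin probe passes.
    Each argument is a (fold_change, q_value) pair."""
    return (probe_passes(*barcode, min_fold=min_fold, max_fdr=max_fdr)
            or probe_passes(*hairpin, min_fold=min_fold, max_fdr=max_fdr))


if __name__ == "__main__":
    # Made-up p-values, for illustration only.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
    print(gene_is_candidate(barcode=(0.4, 0.05), hairpin=(0.9, 0.5)))
```

Requiring only one of the two probes to pass makes the rule more sensitive than demanding agreement between them, at the cost of admitting candidates supported by a single measurement.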
We began with a pooled analysis of 6000 (6K) shRNAs in MCF-10A and MDA-MB-435. Although enriched gene sets varied considerably, similar numbers and largely overlapping gene sets showed depletion in both cell lines (tables S2 and S3). Among negatively selected shRNAs were many targeting regulators of the cell division cycle (23) (table S3). These included cyclins, cell division cycle (CDC) proteins, E2F family members, minichromosome maintenance deficient genes, proliferating cell nuclear antigen, and RNA polymerase II–associated genes. Additionally, the proteasome (15 of 25 subunits; P = 5.61 × 10⁻⁵) and anaphase-promoting complex/cyclosome (APC/C) (6 of 11 subunits; P = 0.0139) scored as being essential in both cell lines (table S3).
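The text reports P values for the over-representation of proteasome and APC/C subunits but does not name the test or the background gene set. One common choice for this kind of question is a one-sided hypergeometric test, sketched below. The pool size and number of hits in the example are hypothetical placeholders, so the output is not expected to reproduce the published values.

```python
from math import comb


def hypergeom_sf(k: int, M: int, n: int, N: int) -> float:
    """P(X >= k) for a hypergeometric draw.

    M: genes represented in the pool (background)
    n: genes in the background that belong to the complex
    N: genes scored as depleted
    k: depleted genes that belong to the complex
    """
    upper = min(n, N)
    total = comb(M, N)
    favorable = sum(comb(n, i) * comb(M - n, N - i)
                    for i in range(k, upper + 1))
    return favorable / total


if __name__ == "__main__":
    # 15 of 25 subunits comes from the text; M and N are HYPOTHETICAL.
    print(hypergeom_sf(k=15, M=5000, n=25, N=600))
```

Exact integer arithmetic via `math.comb` avoids the rounding problems that arise when very small tail probabilities are computed from floating-point factorials.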
To validate candidates, we constructed a regulated shRNA vector, which linked shRNA and green fluorescent protein expression (fig. S1A). Inducible shRNAs against two APC/C subunits, ANAPC2 and ANAPC4, inhibited the growth of MCF-10A in a manner that correlated with mRNA knockdown (fig. S1A). Similarly, MDA-MB-435 was sensitive to ANAPC2 depletion. Nineteen additional MCF-10A lines were constructed with inducible shRNAs targeting 11 different candidates. Of these, 16 lines exhibited shRNA-dependent growth inhibition (30% to 95%), which correlated with mRNA knockdown in 14 cases. The exceptions were CDC-5L and DKC-1, where growth suppression could be due to off-target effects.
Fig. 2 Validation of genes essential to multiple cell lines. Cell viability assays (bars) were performed on cell lines (MCF-10A or MDA-MB-435) expressing individual candidate shRNAs. Tables below the graphs show the level of target suppression, determined by ...
Among additional candidates were MAD2 and BUBR1, mitotic checkpoint proteins required for regulation of sister chromatid separation (24), and Kinesin-7/CENP-E, a component of the kinetochore (27). MAD2/MAD2L1 and Kinesin-7/CENP-E were validated as being essential in MCF-10A (table S3). CENP-E depletion also inhibited growth in MDA-MB-435 (table S3). Considered together, these studies showed that multiplex RNAi screens successfully identified essential components of cell growth and survival networks.
We also screened higher complexity populations containing 10,000 (10K) or 20,000 (20K) shRNAs. The 10K pool was introduced into MDA-MB-231, T-47D, and ZR-75-1. The most complex pool (20K) was introduced into MCF-10A to allow direct comparison with the 6K screen. In all cases, cell numbers were scaled to maintain a representation of 1000 cells per shRNA. The quality of each screen was similar, with high correlations between biological replicates (table S4). We assessed the consistency of the MCF-10A screens by comparing depleted gene sets for the 6K and 20K pools. FDR thresholds were the same for both data sets (q < 0.1), but the fold-change criterion was relaxed from 2-fold to 1.5-fold for the 20K screen so that similar numbers of candidates were compared. A set of 172 genes (P = 1.123 × 10⁻⁹) overlapped in both data sets, despite some differences in the protocols used to carry out each screen, and most of the validated targets from the 6K screen were found in the overlapping list of essential genes (tables S5 and S6). This suggests that a pool of ~20K shRNAs can be effectively screened.
We next sought to uncover cell line–specific genetic sensitivities that might reflect differences in the genetic constitutions of MCF-10A, MDA-MB-231, MDA-MB-435, and ZR-75-1. Initial comparisons focused on the 6K screens done with MCF-10A and MDA-MB-435. Filtering for shRNAs that had a low FDR (q < 0.1) and at least 2-fold depletion in MCF-10A but no more than 1.2-fold depletion in MDA-MB-435 yielded 35 genes (table S7). This compares to 166 genes that were important for growth in both cell lines and 3 genes that were differentially required in MDA-MB-435 (table S7). Among the candidates required in MCF-10A were two components of P-TEFb, CDK9 and cyclin T2 (28). We verified this differential sensitivity using both conditional shRNA expression and pharmacological inhibition. CDK9 is a DRB-sensitive kinase (28). Although DRB may also target other proteins, MCF-10A cells showed greater sensitivity to its effects than MDA-MB-435 cells.
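The filter used to call genes selectively required in one line (q < 0.1, at least 2-fold depletion in the first line, no more than 1.2-fold in the second) can be sketched as follows. The record layout, the function name, and the example values are assumptions made for illustration; the gene names are placeholders and the numbers are not data from the screens.

```python
from dataclasses import dataclass


@dataclass
class ShrnaResult:
    gene: str
    depletion_a: float  # fold depletion in line A (values > 1 mean depleted)
    q_a: float          # FDR-adjusted q-value for the change in line A
    depletion_b: float  # fold depletion in line B


def selectively_required(results, min_depletion=2.0, max_other=1.2, max_q=0.1):
    """Genes with at least one shRNA that is strongly depleted in line A
    but essentially unchanged in line B."""
    genes = {
        r.gene
        for r in results
        if r.q_a < max_q
        and r.depletion_a >= min_depletion
        and r.depletion_b <= max_other
    }
    return sorted(genes)


if __name__ == "__main__":
    # Placeholder records, for illustration only.
    toy = [
        ShrnaResult("GENE_A", 3.1, 0.02, 1.1),  # selective for line A
        ShrnaResult("GENE_B", 2.8, 0.01, 2.5),  # required in both lines
        ShrnaResult("GENE_C", 2.4, 0.30, 1.0),  # fails the FDR cutoff
    ]
    print(selectively_required(toy))
```

Leaving a gap between the two fold-change cutoffs (2-fold versus 1.2-fold) excludes shRNAs with marginal differences between lines, so the resulting list is conservative.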
We repeated the 6K screens of MCF-10A and MDA-MB-435 cells on the same array platform as the 10K screens of MDA-MB-231, T-47D, and ZR-75-1 cells and integrated the results (table S8). This was possible because more than 90% of the 6K shRNA set was contained within the 10K pool. Clustering of the resulting sensitivities (i.e., by fold-change, considering only shRNAs with q < 0.1) yielded a dendrogram wherein the more normal MCF-10A segregated from the other, more transformed lines. MDA-MB-435 also segregated, perhaps reflecting the observation that it is more closely related, by expression profiling, to melanoma than to breast epithelia. Finally, the remaining lines separated into a group containing T-47D and ZR-75-1, both luminal tumor cell lines, and MDA-MB-231, a basal tumor cell line (29).
Fig. 3 Cross-comparison of straight lethal screens in five different cell lines. (A) A heat map and dendrogram were generated by clustering based on shRNAs that showed depletion in at least one cell line. (B) shRNAs targeting selected complexes or pathways were ...
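The clustering of cell lines by shRNA sensitivity can be illustrated with a small agglomerative procedure. The text does not specify the distance metric or linkage rule, so the sketch below makes two assumptions: distance is one minus the Pearson correlation of log2 fold-change profiles, and clusters are joined by average linkage. The profiles in the example are invented and the line names are placeholders.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles):
    """profiles maps a cell line name to its list of log2 fold changes.
    Returns the merges in order as (cluster_1, cluster_2, distance)."""
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        merges.append((c1, c2, linkage(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


if __name__ == "__main__":
    # Invented log2 fold changes for four shRNAs in three placeholder lines.
    toy = {
        "line_normal": [-3.0, -2.0, -3.0, 0.0],
        "line_tumor_1": [0.0, -1.0, 0.0, -2.0],
        "line_tumor_2": [0.0, -1.2, -0.1, -2.1],
    }
    for left, right, d in average_linkage(toy):
        print(left, right, round(d, 3))
```

The order of the merges corresponds to reading a dendrogram from the leaves upward: lines with similar sensitivity profiles join first, and the most distinct line joins last.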
Viewing this portrait of shRNA sensitivity in more detail revealed a number of pathways and complexes that were differentially required in MCF-10A. These included epidermal growth factor receptor (EGFR), an effect that could be reproduced pharmacologically using the EGFR inhibitor Tarceva (30). DNA methyltransferases also scored either above or close to the threshold (table S8). In accord with these results, MCF-10A cells showed a more than 50-fold greater sensitivity to 5-aza-deoxycytidine, a methyltransferase suicide substrate (31), than the other cell lines. As a final example, numerous proteasome subunits were preferentially depleted from MCF-10A (table S8). These cells showed the greatest sensitivity to a proteasome inhibitor, MG-132 (32). Interestingly, MDA-MB-435 showed an intermediate level of sensitivity to the drug, and this was reflected precisely in their intermediate level of depletion of proteasomal shRNAs during the screen (table S8).
We have validated a highly scalable approach for screening shRNA libraries. Although we used a phenotypic filter reflecting growth and survival, virtually any characteristic that allows separation of phenotypically distinct cells can be applied. We also validated the ability of functional shRNA screening to separate cell lines based on their genetic vulnerabilities in a manner that reflects their already defined characteristics (e.g., immortal versus tumor, basal versus luminal). Although one could attribute selective dependency to culture conditions in some cases, the overwhelming concordance of the shRNAs that affect proliferation and survival across these lines, many of which are cultured identically, strongly argues against this being a pervasive explanation. In all, this approach enables genomewide screens for tumor-specific vulnerabilities to be carried out on large numbers of tumor lines. Moreover, it permits rational searches for lesions that synergize with existing therapeutics to produce a path toward genetically informed combination therapies.