Over the last decade, cell-based RNAi screens have emerged as a powerful research tool. High-content RNAi screening in particular has been successfully used to systematically determine genes that contribute to a wide variety of cellular processes, identify new disease genes, and gain insights into the architecture of signalling networks (Mohr et al, 2010). As with the development of any new technology, RNAi screening has encountered growing pains, but technical improvements have been implemented to reduce false-positive rates due to factors such as off-target effects (Bakal and Perrimon, 2010). However, recent studies have revealed inconsistencies between the phenotypes generated by different siRNAs targeting the same gene (from either the same or different libraries; Collinet et al, 2010) and poor reproducibility of similar screens performed by different laboratories. A striking example is the low overlap (<7%) of hits identified between three published screens that identified host factors required for HIV infection (Bushman et al, 2009). Through a novel experimental and computational analysis, Lucas Pelkmans and colleagues now show that the reproducibility of RNAi screens can be improved when the population heterogeneity of cell lines is considered (Snijder et al, 2012).
It is now clear that genetically identical cells, such as those frequently used in RNAi screens, display heterogeneity in their cellular behaviours (Altschuler and Wu, 2010). Cell-to-cell variations are partly due to the inherent stochasticity of biochemical reactions involving only a small number of reactants, the burst-like nature of transcription, and unequal segregation of mRNA and proteins during cell division (Loewer and Lahav, 2011). In addition, previous work from the Pelkmans group showed that cellular heterogeneity can have a deterministic component that is due to the cell's local environment (e.g. depending on whether a cell is at the edge or in the centre of a colony) (Snijder et al, 2009). It is thus intuitive that if cells in culture are heterogeneous, their response to a particular stimulus or perturbation, such as virus uptake, may also be variable. In fact, the Pelkmans group demonstrated that the heterogeneous nature of virus infection is in large part determined by the population context of the cell (Snijder et al, 2009). These findings underscored the importance of analysing certain phenotypes at the single-cell level instead of using population averages to measure an effect. Using the population average to quantify viral infection in this case would have completely missed the effects of local environment.
In their recent work, the Pelkmans group have generated a dataset comprising 41 different RNAi screens measuring the efficiency of infection for 17 different viruses in two different cell lines (including four strains of one cell line). The authors reasoned that inhibition of genes could affect virus uptake either directly or indirectly. Effects are direct if gene depletion leads to an alteration in the inherent ability of an individual cell to be infected by a virus (e.g. by inhibiting endocytosis). However, gene perturbations that affect population context (e.g. local cell density) may also lead to significant changes in infection rate, which can be considered as an indirect effect. For example, Human Rotavirus 2 (RV) preferentially infects cells on the edges of colonies. If RNAi depletion leads to increased cellular proliferation (e.g. depletion of adenylate kinase 5), proportionally fewer cells will lie at colony edges and the fraction of cells infected by RV will decrease, even though the susceptibility of any individual edge cell is unchanged. Thus, in order to distinguish between direct and indirect effects, the authors measured at the single-cell level the extent of virus uptake as well as 200 cellular features, including the position of cells within colonies. These data were used to computationally disentangle the contribution of direct and indirect effects. Given the size of the dataset and the number of single cells analysed, this study represents a tour de force in RNAi screening from both an experimental and a computational point of view.
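The arithmetic behind such an indirect effect can be illustrated with a toy calculation. The sketch below is not taken from Snijder et al; the function name, the two-category split into edge and centre cells, and all numbers are hypothetical and chosen only to show how a change in population context alone can shift a population-averaged readout.

```python
def population_infection_rate(edge_fraction, p_edge, p_centre):
    """Population-averaged infection rate for a two-context toy model.

    edge_fraction: fraction of cells located at colony edges (hypothetical)
    p_edge:        infection probability of an edge cell (hypothetical)
    p_centre:      infection probability of a centre cell (hypothetical)
    """
    return edge_fraction * p_edge + (1.0 - edge_fraction) * p_centre


# Hypothetical per-cell susceptibilities, identical in both conditions:
# the virus prefers edge cells, and the knockdown does not change that.
P_EDGE = 0.50
P_CENTRE = 0.10

# Control wells: 40% of cells sit at colony edges.
control = population_infection_rate(0.40, P_EDGE, P_CENTRE)

# Knockdown increases proliferation, so only 20% of cells sit at edges.
knockdown = population_infection_rate(0.20, P_EDGE, P_CENTRE)

print(f"control:   {control:.2f}")
print(f"knockdown: {knockdown:.2f}")
```

In this toy model the averaged infection rate falls after knockdown although no individual cell has become harder to infect, which is the situation that a population-average readout would score as a hit.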
The first conclusion of this work is that, as predicted, RNAi-mediated gene knockdown can affect viral infection in both a direct and an indirect manner. By developing a novel computational method that essentially ‘normalises’ the data by considering the population context, Snijder et al were able to categorise all genes in all screens into direct versus indirect effects. This method also revealed the existence of masked effects, where an indirect effect acts in the opposite manner to a direct effect, thus concealing the true role of this gene in virus uptake.
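The idea of normalising by population context can be sketched in a few lines. The following is a deliberately simplified stand-in and should not be read as the authors' method, which models some 200 single-cell features: here each cell carries a single hypothetical context label ('edge' or 'centre'), control wells supply the infection rate expected for each label, and a perturbed well is compared with what its own mix of contexts would predict. All function names and cell counts are assumptions made for illustration.

```python
from collections import defaultdict


def make_cells(context, n, n_infected):
    """Build a list of (context, infected) records; illustrative helper."""
    return [(context, True)] * n_infected + [(context, False)] * (n - n_infected)


def rate_per_context(cells):
    """Infection rate for each context label, estimated from single cells."""
    counts = defaultdict(lambda: [0, 0])  # context -> [infected, total]
    for context, infected in cells:
        counts[context][0] += int(infected)
        counts[context][1] += 1
    return {c: inf / total for c, (inf, total) in counts.items()}


def decompose(control_cells, perturbed_cells):
    """Split the change in infection rate into indirect and direct parts.

    Assumes every context seen in the perturbed well also occurs in controls.
    """
    control_rates = rate_per_context(control_cells)
    baseline = sum(inf for _, inf in control_cells) / len(control_cells)
    observed = sum(inf for _, inf in perturbed_cells) / len(perturbed_cells)
    # Rate predicted from the perturbed well's contexts if each cell
    # behaved like a control cell in the same context.
    expected = sum(control_rates[c] for c, _ in perturbed_cells) / len(perturbed_cells)
    return {
        "indirect": expected - baseline,  # due to altered population context
        "direct": observed - expected,    # due to altered per-cell susceptibility
        "total": observed - baseline,
    }


# Hypothetical control wells: 40 edge cells (20 infected), 60 centre cells (6 infected).
control = make_cells("edge", 40, 20) + make_cells("centre", 60, 6)

# Hypothetical knockdown: more edge cells, but each cell is less susceptible.
masked = make_cells("edge", 60, 24) + make_cells("centre", 40, 2)

effects = decompose(control, masked)
print(effects)
```

With these made-up counts the total change in infection rate is close to zero, yet the decomposition reports a negative direct effect offset by a positive indirect one. This corresponds to the masked case described above, which a population-average readout would not score as a hit.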
Remarkably, by only considering direct effects, reproducibility across RNAi screens was increased. Improved agreement was obtained between phenotypes elicited by different siRNAs targeting the same gene, between hit-lists of host factors required for the uptake of the same virus across different cell lines, and between siRNAs required for the uptake of different viruses in the same cell line. This method also improved the statistical power for hit-scoring by reducing the number of false positives in the dataset. Using these hit-lists, the authors were furthermore able to provide systems-level insights into the host signalling networks that are involved in viral infection.
What are the implications for previous RNAi screens that did not take population context into account? While the methodology developed here is powerful, it will only have an effect when population context has a significant impact on the phenotype under study. The Pelkmans group do extend their analyses beyond viral infection to consider cell size, cellular cholesterol levels and endosome abundance, and find that population context plays a role in these cases as well. However, the true extent of the impact of population context on other phenotypes is unknown. Moreover, the absolute improvement in consistency observed between screens remains relatively modest in most cases. Finally, it still remains to be determined if population context affects single-cell behaviour in cell lines that do not form extensive colonies. Considering the imaging, computational and statistical infrastructure required to implement such methodology, this analytical approach may not be feasible for many research groups. If population context is known or suspected to have an impact on the phenotype being screened for, then single-cell analyses and the implementation of this computational methodology could reduce the number of false positives and improve statistical confidence in the hits. This makes the methodology by Pelkmans, Snijder and colleagues potentially a very powerful tool to improve the quality of future RNAi screens.