The recent development of RNAi-based techniques for protein knockdown in mammalian cells has allowed for unprecedented flexibility in the study of protein function. Currently, large siRNA libraries are available that allow the knockdown of all proteins known to be encoded by the human genome. These libraries have been used to identify the host proteins required for the replication of several clinically important viruses, including HIV, flaviviruses and influenza. This review summarizes the methods used in RNAi-based screening for host factors involved in virus replication, and discusses published examples of such screens.
Viruses, by definition, are obligate parasites. They require host cell proteins and pathways to carry out many of the phases of their lifecycles. Until recently, elucidation of the contribution of individual mammalian host cell proteins to viral replication faced certain limitations. Direct interactions between viral and host cell proteins could be determined by genetic or biochemical techniques, such as the yeast two-hybrid screen or co-immunoprecipitation. Further analysis of these interactions usually required construction of mutant viruses that lack this interaction. Alternatively, studies made use of small-molecule inhibitors of proteins of interest or knockout mouse lines in which the host gene of interest has been deleted. In 2001, RNAi-mediated knockdown of individual mRNAs in mammalian cells was first described, allowing the transient depletion of individual proteins from target cells . Commercially available siRNA libraries targeting all known and predicted mRNA transcripts in humans, and other organisms, have recently made genome-wide screens practicable. Screening for proteins important in viral replication has been among the first uses of these libraries (also reviewed in [2,3]). Of particular interest is the fact that several screens have been performed in cells infected with the same, or related viruses, allowing analysis of the reproducibility and robustness of this technique.
siRNAs are 21–23 nucleotide (nt), dsRNAs containing 2-nt 3′ overhangs. One strand of the duplex is complementary to the target mRNA. This targeting (or guide) siRNA strand is incorporated into a multiprotein complex, termed the RNA-induced silencing complex (RISC). The RISC then uses the guide strand as a sequence-specific probe for targeting mRNAs for endonucleolytic cleavage by the RISC component protein argonaute 2. The depletion of a specific mRNA leads to a temporary reduction (knockdown) of the levels of the mRNA-encoded protein in the cell. A complicating technical issue in the use of siRNAs in large screens is the fact that different siRNA sequences, designed against the same mRNA, will result in variable degrees of target degradation. A rigorous analysis of multiple siRNAs designed against each of two target mRNAs (with each siRNA offset by 2 nt from the previous siRNA) demonstrated that siRNA silencing efficiency varied greatly between siRNA sequences, including adjacent siRNAs that varied by only 2 nt . These data suggest that factors intrinsic to the siRNA sequence have a significant impact on the efficacy of a given siRNA. This study further correlated higher degrees of silencing with several features of an siRNA:
In addition, regions of secondary structure within the mRNA target also influence the ability of a given siRNA to effect target cleavage . Despite the correlates of siRNA functionality described in this paper and elsewhere [6,7], the extent of protein reduction by an individual siRNA requires empirical determination. Furthermore, off-target effects, in which an siRNA may affect the level of unintentionally targeted mRNAs, often through partial sequence complementarity, are well documented [8–10]. Depletion of non-targeted mRNA may result in misinterpretation of the phenotype produced by an individual siRNA. Therefore, large siRNA libraries are typically designed to contain component siRNAs with minimal sequence complementarity to other mRNAs. These libraries are usually composed of multiple siRNAs targeting each specific mRNA, in order to increase the chance that at least one siRNA will produce adequate target degradation. In some cases, the siRNAs are pooled, requiring analysis of each individual from the pool in a secondary screen. Other factors may also affect the ability of a particular siRNA to substantially reduce target protein expression. Proteins with a long half-life may be less susceptible to knockdown, owing to the propensity of the targeted protein to remain in the cell at close to normal levels, even if the encoding mRNA has been significantly diminished. In order to address this problem, siRNA transfected cells are generally assayed 48–72 h post-transfection. Alternatively, multiple siRNA transfections over several days may be used in order to suppress the level of the mRNA for a longer period of time. Since high-throughput screens containing large numbers of siRNAs prevent optimization of each particular siRNA, certain genes may be inefficiently targeted, potentially resulting in low protein knockdown and false negatives.
Efficiency of transfection is another factor that must be considered when performing or interpreting the results of large-scale siRNA-based screens. Although transfection of siRNAs into a given cell type is more efficient than transfection of plasmid DNA, some cell types remain refractory to siRNA transfection, effectively excluding them from use in such screens. In most of the screens for viral replication factors discussed in the following sections, the cells used were derived from easily transfected cell lines, such as HeLa or HEK293. If hard-to-transfect cell types must be used in a particular assay, retroviral vectors designed to express shRNA have been used as an alternative [12,13].
In the siRNA screens performed to date, expression of a viral or reporter protein is usually used as the read-out to assess viral replication. Visualization of infected cells using antibodies directed against viral proteins, or the use of genetically altered viruses expressing green fluorescent protein if available, allows for quantitation of the percentage of cells infected. Alternatively, viruses engineered to contain enzymatic reporter genes (e.g., luciferase) will exhibit enzyme activity proportional to viral protein production. The chosen read-out, as well as the time postinfection that the assay is performed, can influence which aspects of the viral lifecycle are subject to examination. For example, assaying a viral protein expressed early in infection follows the virus through entry, uncoating or protein expression. (Of course, there may be several additional steps preceding translation – in the case of HIV, reverse transcription and proviral integration.) By contrast, detection of viral proteins whose expression first requires replication of the viral genome (i.e., ‘late’ genes) will add replication to the list of viral lifecycle steps assayed. In contrast, infection of siRNA-transfected cells at low multiplicity and analysis of the spread of infection between cells, or the production of infectious progeny, will allow evaluation of the entirety of the viral replication cycle.
For many cellular proteins, a significant reduction in intracellular levels may result in deleterious effects on the cell itself. Therefore, toxicity associated with protein knockdown must be monitored in the transfected cells. For immunofluorescence-based screens, 4′,6-diamidino-2-phenylindole (DAPI) staining of cellular nuclei allows for quantitation of cells in each well, which can be performed in parallel with quantitation of numbers of virus-infected cells. Wells with significantly fewer cells than control wells can then be discarded from further analysis. Alternatively, fluorescent reagents that interact with cellular metabolic pathways, such as alamar blue, can be used to evaluate the overall viability of cells in siRNA-transfected wells.
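The toxicity filter described above can be sketched in a few lines of Python. This is only an illustration: the function name, the well identifiers, the nuclei counts and the 50% cut-off are all assumptions made for the example, and none of them is taken from the screens reviewed here.

```python
from statistics import mean

def flag_toxic_wells(cell_counts, control_counts, min_fraction=0.5):
    """Return the IDs of wells whose nuclei count is below a fraction of the control mean.

    min_fraction is a hypothetical cut-off chosen for illustration only.
    """
    cutoff = min_fraction * mean(control_counts)
    return sorted(well for well, count in cell_counts.items() if count < cutoff)

# Hypothetical DAPI-stained nuclei counts per well.
control_counts = [980, 1020, 1005, 995]          # non-targeting siRNA wells
cell_counts = {"A1": 990, "A2": 410, "A3": 870, "A4": 120}

toxic = flag_toxic_wells(cell_counts, control_counts)
print(toxic)  # wells to exclude before counting infected cells
```

In practice the cut-off would be set from the spread of the control wells on each plate rather than fixed in advance.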
In the studies described below, several methods were used to classify significant effects on viral replication (i.e., ‘hits’ worthy of further study; for a comprehensive review of these techniques, see ). Some of these methods are straight-forward and intuitive, such as the selection of wells displaying a twofold or greater reduction of infected cells compared with controls . By contrast, others require more extensive statistical analysis, such as strictly standardized mean difference [16,17] or analysis of redundant siRNA activity (RSA) [18,19]. The RSA analysis selects targets for which two or more independent siRNAs, targeted to the same mRNA, demonstrate substantial suppression of virus replication. Although this is a robust technique that is less likely than other methods to miss weaker positives, RSA analysis requires that individual siRNA sequences be present in separate wells, and so may not be compatible with libraries in which the siRNAs are arranged in pools . Finally, as Birmingham et al. point out, no matter what statistical methodology is applied to the screening results, extensive secondary screening must be employed to validate the hits, and actual ‘hands-on’ bench science, quite possibly without the benefit of multiwell plates and liquid-handling robots, will be needed to figure out which of the proteins identified in high-throughput screens truly contribute significantly to the viral lifecycle. First, secondary screens must confirm that the siRNAs used in the large screen indeed resulted in knockdown of the intended target. 
Second, experiments designed to elucidate the role of individual host cell proteins will vary depending on the virus and protein of interest, but will likely include identification of the stage of the viral replication cycle that is inhibited by siRNA-mediated knockdown of the protein, examination of proteins interacting with the protein of interest during viral infection, and potential substrates of the protein of interest (viral or cellular), if it possesses enzymatic activity.
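As a rough illustration of two of the hit-selection measures mentioned above, the following Python sketch computes a fold reduction and a strictly standardized mean difference (SSMD) for a single siRNA against control wells. It assumes the two groups of wells are independent, in which case SSMD is the difference of the means divided by the square root of the sum of the variances. The replicate values are invented, and no particular cut-off is implied.

```python
from math import sqrt
from statistics import mean, variance

def fold_reduction(control, treated):
    """Ratio of the control mean to the treated mean (2.0 means a twofold reduction)."""
    return mean(control) / mean(treated)

def ssmd(treated, control):
    """SSMD for two independent groups: mean difference over sqrt(sum of sample variances)."""
    return (mean(treated) - mean(control)) / sqrt(variance(treated) + variance(control))

# Hypothetical percentage of infected cells in replicate wells.
control = [52.0, 48.0, 50.0, 50.0]      # non-targeting siRNA
knockdown = [20.0, 22.0, 18.0, 20.0]    # siRNA against a candidate host factor

fold = fold_reduction(control, knockdown)
score = ssmd(knockdown, control)
print(f"fold reduction = {fold:.2f}, SSMD = {score:.2f}")
```

The fold change ignores well-to-well variability, whereas SSMD scales the effect by it, which is why the two measures can rank the same siRNAs differently.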
With regard to viruses, RNAi-based screens have been used most extensively in the study of HIV. (Table 1 summarizes all of the genome-wide screens discussed in this review). To date, four independent studies have been published identifying host factors important in different phases of the HIV lifecycle [13,17,19,20], as well as a meta-analysis further examining the combined results of three of these studies . Two of these studies performed the screen in HeLa cells expressing the viral co-receptor CD4, although the studies used different siRNA libraries [17,20]. The study by Brass et al. examined p24 Gag expression to examine early phases of HIV replication (i.e., viral entry, reverse transcription, integration and steps up to and including protein translation) . Late-acting host factors involved in viral assembly and egress were identified by the transfer of supernatants of infected, siRNA-transfected cells to fresh reporter cells harboring a Tat-responsive β-galactosidase gene, followed by measurement of β-galactosidase activity (thereby measuring infectious progeny). The initial screen included siRNAs covering more than 20,000 genes. Of these, 273 resulted in a significant (more than two standard deviations from the mean) decrease in p24 or β-galactosidase activity, without overt cellular toxicity. In general, identification in the primary screen of multiple genes that are known to interact or participate in the same pathways can be taken as additional validation of the role of these genes in the viral lifecycle (at least, under the cell culture conditions used in the assay). Consistent with this rule of thumb, Brass et al. identified several components of the nuclear pore, autophagy, retrograde vesicular transport and the mediator complex (an RNA-polymerase II-associated complex involved in activation of transcription) .
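A threshold of two standard deviations from the mean, as used in the screen above, can be sketched as follows. The gene labels and read-out values are hypothetical, and the published analysis may have normalized the data differently (for example, per plate); this is only a minimal version of the idea.

```python
from statistics import mean, stdev

def z_scores(readout):
    """Convert raw read-out values (gene -> value) into z-scores across the screen."""
    values = list(readout.values())
    mu, sd = mean(values), stdev(values)
    return {gene: (value - mu) / sd for gene, value in readout.items()}

# Hypothetical p24 signal (arbitrary units) after knockdown of ten genes.
readout = {
    "GENE_01": 100, "GENE_02": 102, "GENE_03": 98, "GENE_04": 101, "GENE_05": 99,
    "GENE_06": 100, "GENE_07": 103, "GENE_08": 97, "GENE_09": 100, "GENE_10": 20,
}

scores = z_scores(readout)
hits = sorted(gene for gene, z in scores.items() if z < -2.0)  # decrease of more than 2 SD
print(hits)
```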
A second screen, conducted by Zhou et al., used a strategy similar to that of Brass et al. In this screen, a Tat-responsive β-galactosidase reporter was used following primary infection of siRNA-transfected cells (assaying viral entry through proviral integration and Tat expression), as well as to measure progeny infectious virus by culture of infected cell supernatants with fresh Tat-responsive cells. This screen confirmed 232 genes that exerted an effect on HIV replication. Interestingly, only 15 of these genes were among those identified by Brass et al., although this is a greater degree of overlap than would be expected by chance alone. Seven of the 15 overlapping genes were previously identified as playing a role in HIV replication, including the viral receptors CD4 and CXCR4 (CD4, in fact, was used as a positive control to optimize the screening conditions). In addition, several components of the mediator complex identified by Brass et al., as well as other components of this complex, were identified in this screen, further confirming the role of the mediator complex in HIV replication.
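The statement that 15 shared genes exceeds chance can be checked with a simple hypergeometric calculation. The sketch below assumes, purely for illustration, that both screens drew their hits from the same pool of 20,000 genes. The text gives "more than 20,000" for the first screen and no library size for the second, so the result should be read as an order-of-magnitude estimate.

```python
from fractions import Fraction
from math import comb

def expected_overlap(n_a, n_b, n_total):
    """Mean number of shared genes if both hit lists were drawn at random."""
    return n_a * n_b / n_total

def overlap_p_value(k, n_a, n_b, n_total):
    """Hypergeometric probability of sharing at least k genes by chance."""
    total_ways = comb(n_total, n_b)
    tail = sum(
        comb(n_a, i) * comb(n_total - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    return float(Fraction(tail, total_ways))

N_GENES = 20_000   # assumed shared gene universe (illustrative)
expected = expected_overlap(273, 232, N_GENES)
p_value = overlap_p_value(15, 273, 232, N_GENES)
print(f"expected by chance: {expected:.1f} genes; P(overlap >= 15) = {p_value:.1e}")
```

Under these assumptions only about three shared genes would be expected by chance, so an overlap of 15 is statistically unlikely even though it is small in absolute terms.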
A third study by König et al. narrowed the scope of the screen by focusing on early events of the HIV lifecycle. An HIV construct, deleted of the envelope (Env) sequence and possessing a luciferase reporter gene, was pseudotyped with vesicular stomatitis virus G protein, allowing this virus to complete a single round of replication, without producing infectious progeny. Therefore, this study excludes the steps of Env-dependent entry, but addresses the HIV lifecycle from uncoating to protein synthesis. As in the previous screens, the virus was used to infect 293T cells that had been transfected with an siRNA library. In parallel, Moloney murine leukemia virus (a γ-retrovirus), and adeno-associated virus (a parvovirus), each containing a luciferase reporter, were used to infect similarly transfected cells. Luciferase activity was measured at 24 h postinfection and subjected to an extensive filtering process, including RSA analysis (described previously), removal of siRNAs that showed cellular toxicity, removal of factors shared with adeno-associated virus and delineation of enriched functional networks by comparison of data with human protein–protein interaction databases derived from yeast two-hybrid experiments. Application of these criteria and others resulted in a final pool of 295 genes that act specifically on HIV (and murine leukemia virus, which showed considerable similarity). Again, little overlap was observed with the previous studies.
In an elegant meta-analysis, Bushman et al. addressed this surprising lack of overlap by reexamination of the three siRNA-based screens, together with HIV protein–protein interaction data, previously published interactions of HIV with cellular host proteins, genetic studies of HIV-infected patients, as well as data from siRNA screens with other viruses (West Nile virus [WNV] and influenza, discussed in the following sections) . The authors suggest that variation and lack of overlap may be partially attributed to experimental noise, differences in the timing of sampling and differences in filtering criteria used to select hits. This group had performed one of the previously discussed HIV screens , and had data available from a duplicate screen, allowing estimation of experimental variance. Given this variance, it is expected that only 150 of the top 300 hits would be obtained if the screen were repeated using identical experimental conditions. It is then perhaps less surprising that three screens performed under different, albeit similar, conditions produced so little overlap. Owing to the lack of overlap in the siRNA screens, this group looked across the siRNA screens for commonalities in proteins involved in similar cellular processes. Using the Database for Annotation, Visualization, and Integrated Discovery (DAVID)  they identified gene ontology groups significantly enriched in each screen. In general, all three screens were highly enriched for the same gene ontology groups, including the nuclear pore, ubiquitin/proteasome, mediator complex, RNA binding and others. Therefore, analysis of enriched networks appears to be more robust than identification of individual proteins involved in viral replication.
In contrast to these three reports, a fourth, recently published, RNAi-based screen of HIV host factors contained several important differences in methodology. First, this study used Jurkat cells, a T-cell-derived line more closely resembling the natural targets of HIV infection. Second, instead of transfected siRNA pools, shRNAs were introduced into target cells by a lentiviral vector. An RNA polymerase III-dependent promoter in the vector drives expression of the shRNA, which must be further processed by the cellular enzyme dicer before incorporation into the RISC. Furthermore, unlike the previously discussed siRNA transfection-based studies, in which siRNAs (or siRNA pools) targeting individual genes are segregated in 384-well plates, cells in this study were transduced by the lentiviral vectors and subjected to antibiotic selection to develop a mixed population of cell clones, each expressing an shRNA. As might be expected, many shRNAs were not tolerated over the course of prolonged selection, or resulted in enough of a growth disadvantage that these shRNAs were eventually eliminated from the population. Therefore, approximately 9300 clones, or 18% of the original library, were subsequently examined for an effect on HIV infection. Because Jurkat cells die 3–5 days postinfection with HIV, this screen used cell survival as a read-out of inhibition of virus replication. Following infection, RNA was isolated from surviving clones and shRNAs in the clones were identified using a microarray approach. Despite the fact that fewer cellular targets were examined than in the previously discussed siRNA-based screens (owing to the removal of many shRNAs because of toxicity or growth disadvantage), a comparable number of hits (252) were obtained in this screen. Not surprisingly, given the lack of overlap between the first three screens, few genes identified by the shRNA library had been found in any of the siRNA screens – none with Brass et al., and three each with König et al. and Zhou et al. Nevertheless, significant overlap in involvement in cellular signaling pathways and protein complexes was found between proteins identified in this study and those of the other three, including NF-κB, AKT and PPAR-γ signaling, and the nuclear pore complex.
Several recent siRNA-based screens have focused on members of the virus family Flaviviridae. Flaviviruses are single-stranded, positive-sense RNA viruses. Their genome encodes a single polyprotein that is cleaved into functional subunits by host and virus proteases. These subunits include the structural components of the virus, as well as machinery required for replication of the viral RNA. Using the same collection of pooled siRNAs as Brass et al., Krishnan et al. examined the effect of host protein knockdown on a single cycle of WNV replication. In this assay, HeLa cells, transfected with the gene-specific siRNA pools, were infected with WNV at low multiplicity (0.3 plaque-forming units/cell). At 24 h postinfection, cells were fixed and the viral envelope protein (Env) was detected by indirect immunofluorescent staining. The authors suggest that this assay is focused on early stages of infection – viral entry and RNA translation – while excluding replication, assembly and egress. However, translation of viral proteins occurs at much higher levels following replication and amplification of the genomic RNA, so it is likely that RNA replication is also one of the facets of the viral lifecycle assayed by this screen. Setting a threshold of a twofold change compared with controls for selection of hits, the authors identified 305 genes that affect WNV. Knockdown of 283 of these proteins reduced viral gene expression (indicating that the targeted proteins are required for, or enhance, viral replication). These include some previously identified factors, such as components of the vacuolar ATPase, which helps produce the low pH environment of the endosome needed for viral fusion and entry. By contrast, 22 siRNAs enhanced viral gene expression, suggesting that the targeted proteins act to inhibit replication. These include IRF3, a major regulatory component of the cellular response to RNA viruses, including interferon production.
As with the HIV screens, this screen identified several proteins that are present in the same macromolecular complexes, or have a previously documented interaction with other identified host factors. These include proteins involved in intracellular trafficking, endoplasmic reticulum-associated degradation and ubiquitination/proteasomal degradation pathways. The authors investigated the roles of some of the identified proteins and found that CBLL1, a ubiquitin ligase involved in endocytosis of cell surface proteins, is required for entry of WNV into the host cell. They further determined that several components of the proteasome function at a later stage in the virus lifecycle. In agreement with these findings, Gilfoy and Mason (who also contributed to the Krishnan et al. study) identified proteasome subunits in a smaller scale siRNA-based study (encompassing ~5500 genes) using WNV replicons containing a luciferase reporter . These investigators determined that proteasome activity is required for amplification of the viral genome, although the exact mechanism of action still remains to be elucidated.
Krishnan et al. also asked if proteins that were identified as affecting WNV replication are also involved in the replication of a related flavivirus, Dengue virus (DENV). Using the same assay used for WNV, they determined that approximately one third of the factors also appeared to play a role in DENV infection. Interestingly, all the proteins whose silencing increased WNV infection had a similar effect on DENV, suggesting that these proteins are involved in restriction of virus (or, at least, positive-sense RNA virus) growth. This is certainly expected for host proteins for which an antiviral role has already been clearly defined, such as IRF3.
Sessions et al. also conducted a screen for host factors associated with DENV. Given that DENV replicates in a mosquito host, this group took advantage of the availability of siRNA libraries targeting the genome of Drosophila melanogaster. This library had previously been used to identify host factors for Drosophila C virus (DCV) [27,28]. In invertebrates, RNAi does not require 22 bp siRNAs, but can be achieved with long (~700 bp) dsRNAs, derived from sequences within the target mRNA. The dsRNAs are easily introduced into Drosophila S2 cells, requiring only co-incubation with the cells. Another advantage of RNAi screening in Drosophila is the relative absence of redundant genes within the Drosophila genome, making it less likely that knockdown of a single host gene will be compensated for by a related protein. Following a duplicate screen and subsequent rescreen of hits, 116 genes were identified as Drosophila host factors for DENV. In total, 82 of the 116 host factors had identifiable human homologs, allowing a further screen to determine if these genes are required for DENV growth in human cells. Following transfection of Huh-7 cells with siRNAs targeting the human homologs, 42 scored as positive for knockdown of DENV growth.
HCV is also a member of the family Flaviviridae and, therefore, uses a similar replication strategy to WNV and DENV. Approximately 170 million people are infected with HCV worldwide, making this a virus of considerable importance with regard to public health. Therefore, it is not surprising that many groups have used siRNA-based approaches for identification of HCV-associated host factors, with an eye to the development of novel therapeutics [32–37]. To date, however, only one has used a library covering the entire genome. This screen was carried out in Huh-7 cells harboring an HCV luciferase-expressing replicon. The replicon contains neomycin phosphotransferase and luciferase genes at the 5′ end of the genome, followed by the HCV non-structural (NS) proteins NS3, NS4 and NS5. The HCV NS proteins form an active replication complex that maintains the RNA genome in the host cells. Luciferase activity is directly proportional to the copy number of replicon RNA in the cell. Since the replicon is present within the cell at the beginning of the assay, and is unable to produce infectious progeny, this assay examines neither entry nor egress, but focuses on viral protein expression and RNA replication. The initial results yielded 236 hits. This was further pared down to 96 following secondary screening, in which a threshold was set requiring at least two individual siRNAs from the pool to effect reduction of luciferase expression. Little overlap was observed in hits between this screen and other siRNA screens performed earlier [33–35]. Tai et al. observed that several previously identified putative HCV host factors might have been identified in their screen had less stringent selection criteria been used. This may also reflect the possibility that certain weaker targets may be more readily identified in smaller screens. Nevertheless, both Tai et al. and Berger et al. identified a requirement for phosphatidylinositol 4-kinase in HCV replication.
Both groups defined a role for phosphatidylinositol 4-kinase in the formation of the membranous web, an intracellular structure present in HCV-infected cells that is thought to be a necessary structure for the formation of HCV replication complexes .
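The secondary-screen rule described above, in which a gene is retained only if at least two individual siRNAs from its pool reduce the reporter signal, can be sketched as follows. The gene labels, signal values and the 50% reduction cut-off are hypothetical. This simple count is also not the RSA method discussed earlier, which applies a statistical ranking rather than a fixed count.

```python
from collections import defaultdict

def confirm_hits(sirna_results, max_fraction=0.5, min_sirnas=2):
    """Keep genes for which at least min_sirnas individual siRNAs reduce the signal.

    sirna_results: iterable of (gene, signal as a fraction of the control wells).
    max_fraction: hypothetical cut-off; signal at or below it counts as a reduction.
    """
    active = defaultdict(int)
    for gene, signal_fraction in sirna_results:
        if signal_fraction <= max_fraction:
            active[gene] += 1
    return sorted(gene for gene, n in active.items() if n >= min_sirnas)

# Hypothetical deconvoluted pools: four individual siRNAs per gene.
results = [
    ("GENE_A", 0.30), ("GENE_A", 0.40), ("GENE_A", 0.90), ("GENE_A", 1.00),
    ("GENE_B", 0.20), ("GENE_B", 0.95), ("GENE_B", 1.10), ("GENE_B", 0.90),
    ("GENE_C", 0.45), ("GENE_C", 0.50), ("GENE_C", 0.35), ("GENE_C", 0.80),
]

confirmed = confirm_hits(results)
print(confirmed)
```

Requiring agreement between independent siRNAs reduces the chance that a phenotype reflects an off-target effect of a single sequence, at the cost of discarding genes for which only one siRNA in the pool is effective.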
Influenza viruses are negative stranded, multisegmented viruses of the family Orthomyxoviridae. Influenza A represents a seasonal public health threat and periodically has the potential to become a global pandemic, arguably representing the greatest viral public health threat to mankind . Hao et al. used a Drosophila dsRNA library to identify influenza host factors . Since influenza does not normally infect insect cells, a genetically modified virus in which the vesicular stomatitis virus G protein and a luciferase reporter replace the hemagglutinin and neuraminidase coding regions was constructed for this assay. Furthermore, this virus does not release infectious progeny into the culture supernatant of infected Drosophila cells. Therefore, similar to the replicon-based screen, this screen is focused on the middle stages of the virus lifecycle, including the release and nuclear import of the viral ribonucleoprotein complexes, mRNA synthesis and translation. Screening of the dsRNA library identified 110 genes that reduced luciferase expression from this virus, several of which had clear human homologs. These include genes expected to play a role in influenza replication, including a subunit of the v-ATPase, a component of mitochondrial electron transport and an mRNA nuclear export factor. These human homologs were analyzed in human HEK293 cells by knockdown with specific siRNAs. These experiments verified not only reduction of luciferase from the reporter virus in human cells, but also reduction of intact influenza virus.
Although individual host proteins that play highly significant roles in the viral lifecycle will certainly be identified in RNAi-based screens (e.g., CD4 in assays of HIV infection), the fact that several of the screens of the same or similar viruses identified components of the same complexes (e.g., the mediator complex and the nuclear pore) or pathways (e.g., proteasomal degradation) suggests that the strength of RNAi-based screens lies in their ability to identify protein networks and macromolecular complexes involved in viral replication. Nevertheless, this assertion needs to be verified by careful analysis of the networks and complexes in question in smaller screens, under conditions in which the efficacy of each siRNA can be verified. In addition, as demonstrated in several of the papers discussed [19,21], comparison of the siRNA data sets to data sets obtained by other experimental protocols (e.g., ‘interactomes’ derived from two-hybrid or proteomic analysis that define the interactions of host cell and viral proteins) may provide valuable clues as to how individual host proteins exert their effect on viral replication.
As the popularity of this technique increases and more groups add to the ‘public data set’ (especially for viruses that are of significant public health interest), a clearer picture of which cellular processes contribute to the viral lifecycle will emerge. As these leads are further investigated, the mechanistic details of how these proteins function to affect viral replication will be elucidated. This, in turn, may lead to new therapies for the treatment of viral infection.
The author thanks Jessica Smith and David Stein for helpful discussions and critical reading of this manuscript.
Financial & competing interests disclosure
This work was supported by federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, as part of the Pacific Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Disease Research U54 Award AI081680. The author has no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
No writing assistance was utilized in the production of this manuscript.