Short-hairpin RNA (shRNA)-induced RNAi is used for biological discovery and therapeutics. Dicer, whose normal role is to liberate endogenous miRNAs from their precursors, processes shRNAs into different biologically active siRNAs, affecting their efficacy and potential for off-targeting. We found that in cells, Dicer induced imprecise cleavage events around the expected sites based on the previously described 5′/3′-counting rules. These promiscuous non-canonical cleavages were abrogated when the cleavage site was positioned 2 nt from a bulge or loop. Interestingly, we observed that the ~1/3 of mammalian endogenous pre-miRNAs that contained such structures were more precisely processed by Dicer. Implementing a new “loop-counting rule”, we designed potent anti-HCV shRNAs with substantially reduced off-target effects. Our results suggest that Dicer recognizes the loop/bulge structure in addition to the ends of shRNAs/pre-miRNAs for accurate processing. This has important implications for both miRNA processing and future design of shRNAs for RNAi-based genetic screens and therapies.
MicroRNAs (miRNAs), 21–23 nt in length, are responsible for the regulation of at least one-half of all protein-encoding genes in mammals (Friedman et al., 2009). Primary microRNA transcripts (pri-miRNA) are initially transcribed from genome-encoded sequences, and then further processed into pre-miRNA and finally small RNA duplexes [reviewed in (Bartel, 2004; Carthew and Sontheimer, 2009; Liu and Paroo, 2010)]. Based on the thermodynamic stabilities of the duplex ends, one strand of the resulting duplex (the miRNA strand or guide strand) is preferentially loaded into Argonaute proteins, the core component of the RNA-induced silencing complex (RISC) (Hammond et al., 2001; Khvorova et al., 2003; Schwarz et al., 2003). Gene expression is reduced by a process referred to as RNA interference (RNAi) through site-specific cleavage or non-cleavage repression. While knockdown is most efficient when the target sequence has complete complementarity with the small RNA, the major mode of miRNA-induced gene regulation occurs when complementarity is maintained in the first third of the small RNA and target mRNA but mismatches arise in the remainder of the aligned sequence (Gu and Kay, 2010; Huntzinger and Izaurralde, 2011).
The RNAi pathway can be induced to mediate transient sequence-specific gene silencing by directly transfecting chemically synthesized siRNA duplexes into tissues or cells (Elbashir et al., 2001). Alternatively, DNA-based transcriptional templates expressing a small hairpin RNA (shRNA), which are processed into siRNAs, can be used to achieve long-term gene silencing (Brummelkamp et al., 2002; McCaffrey et al., 2002; McManus et al., 2002; Paddison et al., 2002; Zeng et al., 2002). Both approaches have become routine in biological research and are used in novel therapeutic applications to treat various diseases (Kim and Rossi, 2007).
Most of the current transcriptional RNAi approaches developed for therapeutics or biological discovery (e.g. shRNA libraries) utilize polymerase III transcription cassettes because they are relatively simple to construct and provide high levels of expression (Paddison et al., 2004). More recently, regulated and/or cell type specific transcription of shRNA can be accomplished using polymerase II promoters but this requires that the shRNA sequences be embedded within a native or artificial pre-miRNA sequence that may affect the processing and creation of the desired siRNA product (Pan et al., 2012).
RNase III enzymes are crucial in the biogenesis of miRNAs (Kim et al., 2009). In particular, Dicer recognizes the hairpin-shaped pre-miRNA and cuts the terminal-loop to generate a duplex miRNA/miRNA* containing 2nt 3′-overhangs, a classical feature of the RNAi pathway (Gurtan et al., 2012; Zhang et al., 2002; Zhang et al., 2004). The precise processing of Dicer is critical as inaccurate cleavage events generate miRNAs with different seed regions, altering the set of genes a particular miRNA regulates (Lewis et al., 2003). A shifted Dicer cleavage site changing the nucleotide composition of duplex ends can have profound effects on which miRNA strand is loaded into RISC. Dicer is also responsible for processing shRNAs into siRNAs (Siolas et al., 2005). One of the problems and limitations of shRNA based RNAi approaches is the fact that an unpredictable number of various duplex RNAs are generated within a cell (McIntyre et al., 2011), which can limit their effectiveness. Therefore, defining the precise rules for Dicer cleavage will be a great benefit in shRNA design.
It has been demonstrated that Dicer determines its cleavage sites by measuring a fixed distance from either the 3′ end overhang (the 3′ counting rule) (MacRae, 2006; MacRae et al., 2007) or 5′ end phosphate group (the 5′ counting rule) (Park et al., 2011), with the latter being the predominant process in mammalian systems. However, both models were established based on reconstituted non-cellular Dicer cleavage studies and the applicability of these rules for shRNAs or endogenous miRNA processing in vivo has not been clearly delineated.
Here we wanted to establish additional rules governing the processing of endogenous miRNAs and shRNAs by Dicer in living cells. To do this, we focused our attention on pol III driven shRNAs to eliminate potential variables that could be introduced by the additional processing steps (e.g. Drosha) required in pol II based expression systems. Nonetheless, the relatively simplified pol III shRNA expression system provided new insights into Dicer processing that were confirmed by bioinformatic analyses to be operational in endogenous miRNA processing. Most importantly, these new parameters provide a means to design shRNA expression cassettes with enhanced efficacy in gene knockdown studies.
To investigate how Dicer processes pre-miRNA-like substrates in vivo, we designed a U6 promoter-driven shRNA (sh-miR30) consisting of a passenger strand at the 5′ arm, a 9nt loop (from hsa-miR-22) and a 3′ arm guide strand (based on the hsa-miR-30-3p sequence). For efficient Pol III transcription start and termination, the shRNA sequence begins with a guanine (G) and ends with a tract of five thymines (T). The expected transcript is a 24bp-long-stem shRNA with two or more uracils (U) in the 3′ overhang, which is processed by Dicer into a guide/passenger duplex (Figure 1A). Plasmids containing the shRNA expression cassette were transfected into HEK293 cells. Thirty-six hours later, the guide and passenger strands processed from sh-miR30 were separately detected by Northern blot. Interestingly, multiple products varying in length were identified from each strand, indicating heterogeneous Dicer cleavages and/or post-cleavage modifications (Figure 1B). Of note, these small RNA products were absent when we performed the same experiments in Dicer KO ES cells (Calabrese et al., 2007) but were rescued by complementing Dicer expression, confirming their specificity to Dicer processing (Figure S1A).
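The hairpin layout described above (passenger arm, loop, guide arm, Pol III terminator) can be illustrated with a minimal Python sketch. It assumes perfectly complementary arms and a DNA template written 5′ to 3′. The function names, the toy guide sequence and the placeholder loop are hypothetical and are not the sequences used in this study (those are detailed in the figures).

```python
# Minimal sketch of a Pol III shRNA template: passenger + loop + guide + TTTTT.
# All names and sequences here are illustrative assumptions, not the study's constructs.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def build_shrna_template(guide: str, loop: str, terminator: str = "TTTTT") -> str:
    """Assemble a hairpin template with the passenger on the 5' arm and the guide on the 3' arm.

    Assumes the passenger is the exact reverse complement of the guide.
    Raises ValueError if the transcript would not begin with G.
    """
    guide_dna = guide.upper().replace("U", "T")
    loop_dna = loop.upper().replace("U", "T")
    passenger = reverse_complement(guide_dna)
    if not passenger.startswith("G"):
        raise ValueError("template must begin with G for efficient Pol III initiation")
    return passenger + loop_dna + guide_dna + terminator


# Toy example with a 9-nt guide and a 4-nt placeholder loop.
template = build_shrna_template("UACGUUAGC", "AAAA")
print(template)
```

In this sketch the leading G is part of the stem; whether the initiating G is paired in a given construct is a design choice that the sketch does not settle.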
To analyze the Dicer processed cleavage products in detail, we deep sequenced all 18 to 32 nt small RNAs from the transfected cells. Over 750K reads were mapped to the guide strand sequences (with 3 or fewer mismatches) while less than 50K reads were identified from the passenger strand, suggesting the passenger strand is quickly degraded during RISC loading. Although the cleavage products were highly heterogeneous, nearly 90% of the reads that mapped to the 3′ arm (designated guide strand) started at two distinct positions. One group began at the expected cleavage site of Dicer based on the 5′-counting rule, the other (about 18% of all the 3′ arm reads) started 2nt upstream. 5′-RNA extension is an extremely rare event in cells (Seitz et al., 2008), making it unlikely that the latter group was the result of nucleotide addition to the former group. Rather, the results strongly suggest that in addition to the canonical cut, Dicer was also able to make a non-canonical cut. In support of this idea, the majority (~60%) of the 5′ arm reads ended at two corresponding positions predicted by the two-cleavage model. Of note, none of the frequent reads overlapped with the endogenous hsa-miR30-3p, which represented a minor fraction (<0.5%) of the reads and were therefore omitted from our analysis. A similar cleavage pattern was observed when the same experiment was performed in mouse embryonic fibroblast (MEF) cells (Figure S1B), indicating the non-canonical cuts were not limited to human cells, nor were they tissue-specific.
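The start-position percentages quoted above amount to a tally of read 5′ ends. A minimal Python sketch of such a tally follows; it assumes each mapped read has already been reduced to the offset of its 5′ end relative to the expected canonical cleavage site (0 = canonical, negative = upstream). The function name and the toy read counts are hypothetical and do not reproduce our data.

```python
from collections import Counter
from typing import Dict, Iterable


def start_position_fractions(starts: Iterable[int]) -> Dict[int, float]:
    """Return the fraction of reads beginning at each 5' start offset."""
    counts = Counter(starts)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return {pos: n / total for pos, n in sorted(counts.items())}


# Toy input: ten reads, seven at the canonical site, two starting 2 nt upstream.
toy_starts = [0] * 7 + [-2] * 2 + [1]
fractions = start_position_fractions(toy_starts)
for offset, fraction in fractions.items():
    print(offset, round(fraction, 2))
```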
The 3′ but not the 5′ ends of small RNAs are subject to intensive modifications such as trimming and tailing in cells (Ameres et al., 2010; Burroughs et al., 2010; Seitz et al., 2008). This was consistent with our finding that the majority of mismatches in reads mapped to sh-miR30 were located in the last 3nt of the 3′ end while less than 1% of the total reads contained mismatches in the first 3nt at the 5′ end (Figure 1D and Figure S1C). It was unclear if the enriched mismatches at the 3′ end were caused by sequencing errors or 3′ non-templated nucleotide addition. Nonetheless, this clearly indicates that the 5′ end of the 3′ arm strand, but not the 3′ end of the 5′ arm, is the appropriate readout from which to infer the Dicer cleavage pattern, despite the fact that both were generated by the same cleavage events. Thus, we focused our analyses on the 5′ ends of the 3′ arm (guide strand) to further investigate Dicer processing.
Since Dicer is a core component of the RISC loading complex (Chendrimada et al., 2005), we wanted to determine if the non-canonical Dicer cleavage products were able to associate with downstream RISC. To address this directly, we analyzed the RISC-associated small RNAs by deep sequencing after Ago2 immunoprecipitations from cells co-transfected with Flag-tagged Ago2 and sh-miR30. The relative percentages of the Dicer cleavage products were unchanged after Ago2 immunoprecipitation (Figure 2A, Figure S2 A, B), indicating that the small RNAs generated from both non-canonical and canonical Dicer cleavage can associate with the downstream RNAi pathway.
ShRNAs can be designed such that after canonical Dicer cleavage the siRNA contains the appropriate end nucleotides that make one strand more favorable for RISC loading. Our sh-miR30 constructs are designed to ensure that the 3′ arm is processed into the guide strand that is preferentially loaded into RISC. However, non-canonical Dicer cleavage generates siRNAs with different nucleotide composition at the ends, making it possible for the passenger strands to be loaded into RISC and induce off-target effects. To verify this experimentally, we prepared two synthetic siRNAs, si-miR30-can (canonical) and si-miR30-non (non-canonical), to mimic the two predominant products of Dicer cleavage (Figure 2A). Dual-luciferase reporters containing target sequences perfectly complementary to either the guide or passenger strands in their 3′UTR (Figure 2B) were separately co-transfected with each of the siRNAs into HEK293 cells. While both siRNAs induced robust gene repression, si-miR30-non but not si-miR30-can inhibited the expression from the reporter containing a target complementary to the passenger-strand (Figure 2C). Similar results were obtained when we used a reporter containing mismatched target sequences (Figure S2C). As expected, when the transcriptional based sh-miR30 was used as a template to generate siRNAs in similar reporter knockdown experiments, passenger strand mediated off-target effects were observed in human (Figure 2D and Figure S2D) and mouse cells (Figure S2E). Taken together, our results demonstrated that non-canonical Dicer cleavage products generated off-target effects from unintentional loading of the passenger strands.
To generalize our observation, we designed two additional shRNAs, sh-Bantam and sh-Bantam-P, based on the Drosophila miRNA bantam, which has no known homologous sequence in mammals. Multiple 3′ arm small RNA products were generated when these two shRNAs were expressed in human (Figure 3A, B) or mouse cells (Figure S3A). Similar results were obtained when we tested a third shRNA (sh-LSW) containing an artificial sequence in both human (Figure 3A, C) and mouse cells (Figure S3B). The detection of canonical and non-canonical Dicer products containing various nucleotides at the 5′ end suggests that cleavage site selection was not dependent on the sequence of the RNA. Interestingly, non-canonical Dicer cleavage was as efficient as (sh-Bantam and sh-LSW) or even more prevalent than (sh-Bantam-P) canonical cutting, suggesting that non-canonical processing is not a minor event.
To further support our conclusion through use of a different promoter, we compared an shRNA directed against human α-antitrypsin (sh-hAAT-25) under the transcriptional control of either the U6 or H1 promoter (Grimm et al., 2006). Although the expression level of H1-sh-hAAT-25 was weaker than that of U6-sh-hAAT-25 (Figure 3A), the pattern of Dicer processing based on deep sequencing was strikingly similar (Figure 3D and Figure S3C), suggesting that the high expression level of the shRNA substrate was not likely the cause of Dicer’s non-canonical processing. Of note, sh-hAAT-25 contains a different 7nt loop sequence which has been widely used in other studies (McIntyre et al., 2011), indicating the heterogeneous processing was also not limited to certain loop sequences/structures.
Pol III driven transcripts contain a triphosphate group at the 5′ end, which may not be efficiently recognized by Dicer and may interfere with the 5′ counting rule. In addition, the termination of Pol III polymerase leaves variable numbers of uridines at the 3′ end, which may shift the Dicer cleavage sites according to the 3′ counting rule. Therefore, it is possible that the non-canonical cleavage events observed in vivo are unique to Pol III driven shRNAs and can be explained by known rules of Dicer processing. To test this idea, we chemically synthesized the sh-miR30 sequences with a monophosphate group at the 5′ end and two uridines at the 3′ end. After transfection into HEK293 cells, the small RNAs processed from synthetic sh-miR30 were analyzed by Northern blot and deep sequencing. Results from both experiments showed that the processing products of synthetic sh-miR30 were even more diversified in length and start position compared to those of expressed sh-miR30 (Figure 1B, Figure 4A). It was possible that some of the transfected synthetic sh-miR30 was trapped in endosomes and subject to non-specific cleavage. Indeed, only a portion of the small RNAs originating from synthetic sh-miR30 were found to be specific to Dicer processing when we performed the same experiments in Dicer KO ES cells with or without complementing Dicer expression (Figure S1A). Furthermore, after Ago2 immunoprecipitation, only those reads with the same start position as the predominant products of expressed sh-miR30 were enriched, indicating that in vivo Dicer processing is the same between synthetic and expressed sh-miR30 (Figure 4A). As expected, passenger strand mediated off-target effects were also observed with synthetic sh-miR30 (Figure 4B).
Similar observations were made when synthetic sh-miR30 was transfected in mouse cells (Figure S4). Altogether, our results strongly suggest that non-canonical Dicer cleavage is independent of RNA end heterogeneity and an inherent feature of the Dicer processing in vivo.
Since the heterogeneous Dicer cleavage is not sequence specific and not the result of the inevitable heterogeneity of shRNA ends, we elected to investigate whether we could design shRNAs that would be homogeneously processed by Dicer. To do this, we generated a number of different sh-miR30 variants that varied in length and stem structure, and then examined their processing in HEK293 cells. While multiple 3′ arm processing products were detected in all designs by Northern blot (Figure 5A), the individual Dicer cleavage pattern revealed by deep sequencing was distinct (Figure 5B).
Consistent with the 5′ counting rule, Dicer made canonical cleavages in all shRNAs at a position strictly 19bp (passenger) /21bp (guide) away from the 5′ end. Interestingly, while a single mismatch in the middle of the shRNA stem did not affect the position of the canonical cut (sh-miR30-M), placement of an asymmetrical bulge at the 5′ or 3′ arm of the shRNA partially shifted the canonical cuts in opposite directions. Specifically, after a bulge was introduced at the 5′ arm of sh-miR30, the percentage of canonical cuts was reduced (69% to 51%) whereas a new cleavage site (22%) emerged one nucleotide upstream (compare sh-miR30-24B with sh-miR30 in Figure 5B), indicating that a portion of the canonical cuts were shifted towards the loop terminus. Conversely, introducing a 3′ bulge shifted a portion of the canonical cuts towards the open terminus (compare sh-miR30-23B with sh-miR30). Similar observations were also made with sh-miR30-22B and sh-miR30-21B, suggesting asymmetrical bulges should be avoided in order to achieve a distinct canonical cut.
In contrast to the canonical cut, non-canonical cutting was apparently not determined by the distance to the open ends of the shRNAs. Instead, non-canonical cleavages were observed at various positions near the site of the canonical cut without a specific pattern. Interestingly, while the mismatch/bulge at the stem had little impact on the position and/or relative abundance of non-canonical cleavages (compare sh-miR30 with sh-miR30-M, sh-miR30-24B and sh-miR30-23B), the pattern of non-canonical cleavage was substantially altered when the distance between the site of the canonical cut and the loop varied (such as sh-miR30 vs sh-miR30-22). In particular, we found that the non-canonical cleavages were almost completely abrogated when the canonical cut was 2nt from the loop (sh-miR30-21) (Figure 5B). This result was further confirmed when we examined the small RNAs associated with Ago2. Over 98% of the reads mapping to the guide strand (3′ arm) of sh-miR30-21 started at the position of Dicer’s canonical cleavage (Figure S5A). As expected, the bioactivity measurements were corroborative: the homogeneous Dicer processing of sh-miR30-21 resulted in reductions of the passenger strand mediated off-target effects in both human and mouse cells (Figure 2D, Figure S2D, E; compare to sh-miR30).
By moving the loop to the location 2nt away from the expected position of canonical cleavage, we observed near homogenous Dicer processing when additional shRNAs which were previously shown to be heterogeneously processed were tested (Figure S5B). Taken together, our results indicate that the accuracy of Dicer cleavage is determined by the distance between the site of cleavage and upstream loop structure, which we refer to as a “loop-counting rule”.
To evaluate the biological relevance of our findings, we sought to investigate Dicer processing of endogenous miRNAs. In contrast to the overexpressed Pol III driven shRNAs, the majority of the pre-miRNAs are generated by Pol II transcription. Thus, we elected to study the Dicer processing of endogenous pre-miRNAs directly by analyzing the 5′ end of the 3p miRNAs (or miRNA*) that are generated by Dicer cleavage.
In contrast to shRNAs, natural pre-miRNAs have more complex structures, making it difficult to establish the existence of a stem-bulge, internal and/or terminal loop. For example, the RNA structure of the pre-hsa-let-7a miRNA can be drawn with either multiple internal bulges or one large terminal loop. Nonetheless, we found that the distance between the start position of the 3p miRNA/miRNA*, indicative of Dicer cleavage, and the position of the most adjacent non-complementary region (bulge, terminal or internal loop) upstream was not randomly distributed. Rather, 314 out of 970 (32.4%) miRNAs in human, and 209 out of 624 (33.5%) miRNAs in mouse, contained a non-complementary region at a 2 nt distance upstream of the Dicer cleavage site (Figure 6A). This indicates that evolutionary selection may be operative to maintain the relative position of the loop/bulge structure of the pre-miRNA in order to achieve accurate miRNA biogenesis as predicted by the “loop counting rule”.
In addition to the non-random position of the loop in mammalian miRNAs, we predicted based on our “loop counting rule” that these miRNAs would result in precise Dicer cleavage. Therefore, we sought to measure the accuracy of Dicer cleavage of miRNAs in vivo and ask whether it correlated with the relative position of nearby loop/bulge structures. To do this, we used a well-documented sequencing result consisting of over 60 million mouse small RNAs from various tissues and developmental stages (Chiang et al., 2010). The variation in the 5′ end start position was calculated (see methods for detail) for each individual miRNA (or miRNA*). Indeed, the most precise Dicer cleavage, which was inferred by the least variation at the 5′ ends of the 3p miRNA, was observed when such cleavage was 2nt away from a loop/bulge structure (Figure 6B). In contrast, the variation at the 5′ ends of 5p miRNA, which was created by Drosha, did not follow the same pattern (Figure 6C), indicating the precision at the 5′ end was the result of Dicer processing. Taken together, these results demonstrate the “loop-counting rule” obtained with artificial shRNAs is part of the natural role that Dicer plays in generating endogenous miRNAs.
On the basis of our findings, an shRNA with a 21bp-long stem loop should be precisely processed by Dicer and generate fewer RNA species capable of generating off-target effects from the passenger strand. To validate this principle in a relevant preclinical setting, we created 11 shRNAs that varied in stem length from 19 to 29 bp, all of which produced the same guide strand targeting the NS5B region of HCV (sh-HCV-19 to sh-HCV-29). Consistent with our prediction, non-canonical Dicer cuts were observed with all shRNAs except the one with a 21 bp stem (sh-HCV-21) (Figure 7A and Figure S6A). Consistent with what we learned from the processing of pre-miRNAs, introducing an internal loop 2nt away from the expected site of the canonical Dicer cut into sh-HCV-29 (sh-HCV-29iB) shifted the pattern of Dicer cleavage to that obtained with sh-HCV-21, indicating the precision of Dicer cleavage was not specific to the stem length or terminal loop, but a result of an optimal distance to an upstream non-complementary structure.
Despite the difference in stem length or structure, all anti-HCV shRNAs had relatively potent on-target activity (repression from the guide-strand target), with the relative amount of knockdown correlating with the abundance of a mature guide strand (Figure 7B, C). In contrast, the passenger strand mediated off-target activity was not always coupled with the level of passenger strands. Rather, it paralleled the heterogeneity of Dicer processing, with the exception of sh-HCV-19 and sh-HCV-20, which seemed to be poorly processed by Dicer and generated little passenger strand in the first place (Figure 7B). Similar results were also obtained in mouse cells (Figure S6). Together, our results demonstrated that potent shRNAs with improved safety could be achieved by applying the “loop-counting rule” and increasing the accuracy of Dicer processing.
We performed high throughput small RNA sequencing with 30 designed shRNAs to establish differential sites of Dicer cleavage. In contrast to northern blot analysis which is unable to distinguish similar but heterogeneous sequences of the same length, deep sequencing provides a more complete delineation of the cleavage products. These observations allowed us to make accurate predictions on how altering stem length and bulge position affect Dicer cleavage patterns. The consistency of our experimental results between the myriad of shRNAs tested in both mouse and human cells confirmed our ability to design shRNAs that resulted in the creation of a homogenous population of guide-strand RNAs.
Furthermore our finding that these artificial shRNA cleavage guidelines paralleled endogenous mammalian miRNA processing led us to propose the “loop-counting rule” as a critical mechanism for directing bona fide small RNA processing in mammalian cells. The “loop-counting rule” is as follows: Dicer cleaves precisely when it is able to recognize a single-stranded RNA sequence either from the loop region or internal bulge at a fixed distance (two nucleotides) relative to the site of cleavage. Otherwise, Dicer cleavage is not precise leading to a range of Dicer cleavage products with variable 5′ start positions.
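The rule stated above can be expressed as a simple check on a secondary structure. The Python sketch below is illustrative only: it assumes the hairpin is supplied in dot-bracket notation, that `start` is the 0-based index of the 5′ end of the 3p product, and that "distance" means the number of paired nucleotides lying between the nearest upstream unpaired nucleotide and that 5′ end. This counting convention and the function names are assumptions made for the example.

```python
from typing import Optional

OPTIMAL_DISTANCE = 2  # the fixed distance named by the loop-counting rule


def distance_to_upstream_unpaired(structure: str, start: int) -> Optional[int]:
    """Count paired nucleotides between the nearest upstream '.' and position `start`.

    Returns None if no unpaired nucleotide lies upstream of `start`.
    """
    for i in range(start - 1, -1, -1):
        if structure[i] == ".":
            return start - 1 - i
    return None


def follows_loop_counting_rule(structure: str, start: int) -> bool:
    """True when the cleavage site sits at the optimal distance from a loop or bulge."""
    return distance_to_upstream_unpaired(structure, start) == OPTIMAL_DISTANCE


# Toy hairpin: 6-bp stem with a 4-nt terminal loop (indices 6-9 are unpaired).
toy_structure = "((((((....))))))"
print(follows_loop_counting_rule(toy_structure, 12))  # site 2 nt from the loop
print(follows_loop_counting_rule(toy_structure, 13))  # site 3 nt from the loop
```

Because the scan stops at the first unpaired nucleotide it meets, an internal bulge and a terminal loop are treated identically, which matches the way the rule is worded here.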
Our observations raised intriguing questions regarding the mechanism by which Dicer determines its cleavage site. Interestingly, the helicase domain of Drosophila Dicer-1 was shown to be responsible for recognizing the single-stranded loop region of pre-miRNA (Tsutsumi et al., 2011). Furthermore, the recent elucidation of the human Dicer 3D structure indicated that the helicase domain was physically adjacent to the RNase III domain, the catalytic center for Dicer cleavage (Lau et al., 2012; Sawh and Duchaine, 2012). In light of these findings and the data presented here, we propose that the cleavage site is defined in a stepwise process. First, Dicer docks the open ends of the hairpin RNA and feeds the region to be cleaved into the catalytic core as previously elucidated (MacRae, 2006; Park et al., 2011). Then, the cleavage site is secured by a second contact between Dicer and its substrate RNA. Precise cleavage is achieved when an ssRNA region (loop/bulge) is available at the optimal position, where the helicase domain can bind it and in turn stabilize the catalytic center (Figure 7D). To test this idea, we deep sequenced the miRNAs in a Dicer mutant HCT cell line in which the helicase domain was disrupted with a 43-amino-acid in-frame insertion (Cummins et al., 2006). There was a trend for less precise pre-miRNA processing in mutant versus wild-type cells (Figure S7A). Interestingly, only pre-miRNAs containing the optimal 2nt distance between the Dicer cleavage site and an upstream single-stranded region, but not the rest of the pre-miRNAs, showed a statistically significant (P=0.005) reduction in the precision of Dicer cleavage in mutant versus wild-type cells (Figure S7 B, C). This result supports a role for the helicase domain in sensing the region of non-complementarity within the pre-miRNA. Alternatively, Dicer co-factors, such as TRBP, may be responsible for recognizing the loop/bulge structure. However, this scenario is less likely to be true, as the optimal loop position indicated by the “loop-counting rule” is too close to the catalytic center to be accessible by a protein other than Dicer.
This feed-and-clamp model not only provides a molecular explanation for the “loop-counting rule”, but also provides new insights into the requirement of the helicase domain of Drosophila Dicer-2 in processing blunt-end dsRNA substrates (Welker et al., 2011). Given that blunt ends were poorly recognized by the PAZ domain (MacRae et al., 2007), Dicer-2 may have to rely on its helicase domain for more efficient binding and hence more proficient cleavage. Another line of evidence supporting this model comes from the regulatory role of the helicase domain in Dicer processing. While there is a tradeoff between precision and speed, the absence of a helicase domain in human Dicer was shown to increase its catalytic activity (Ma et al., 2008). Interestingly, the Dicer helicase insertion mutation was found to increase the processivity of Dicer for long, stable hairpins, but decreased its processivity for bulged hairpins, suggesting the terminal loop and internal bulge/loop may be sensed differently by the helicase domain (Soifer et al., 2008).
Consistent with a previous report (Starega-Roslan et al., 2011), we found that asymmetrical stem bulges may twist the 3-D folding of substrate shRNA and indirectly affect the position of Dicer cleavage. Therefore, the overall structure of substrate RNA, including the relative position of the loop/bulge to the cleavage site, has a much more profound impact on Dicer processing than previously believed. Further investigation into the structural conservation of pre-miRNAs based on the parameters provided in this study may finally provide an evolutionary meaning for the unique bulge-enriched structure of miRNA precursors.
The studies presented here also have great implications for RNAi technology, particularly the design of shRNAs. Despite the successes, the application of shRNAs is hampered by unwanted off-target effects (Jackson and Linsley, 2010; Kaelin, 2012). A major source of off-targeting is unintentional loading of the passenger strand into RISC. To increase the odds of RISC loading guide strands over passenger strands, we avoided placing the guide strand in the 5p arm for two reasons: 1) The 5p small RNA generated from Pol III driven shRNA carries a triphosphate group at the 5′ end which may interfere with its incorporation into Ago2/RISC. 2) 5p transcripts starting with a guanine (G) or adenine (A) are required for efficient Pol III transcription, making them structurally unfavorable for Ago2/RISC loading. In addition, the 3p guide strand was designed to start with a uracil (U), which not only is the most preferred nucleotide for Ago2 association but also lowers the 5′ end thermodynamic stability, further enhancing its preferential loading into RISC (Frank et al., 2010; Seitz et al., 2011). Indeed, we found this design to be effective when the shRNAs were processed as expected (sh-miR30-21, sh-HCV-21 and sh-HCV-29iB). However, we found that the additional products generated from Dicer’s non-canonical cleavage, even when present in relatively small amounts, could induce robust passenger strand mediated off-target effects, highlighting the importance of reducing heterogeneous processing in an shRNA design. Furthermore, the generation of small RNAs with differing seed region sequences can result in additional guide strand mediated off-target effects through seed region base pairing. Therefore, longer hairpins, especially expressed shRNAs for which chemical modification is unavailable, would suffer from inaccurate processing and should be used with caution. Consistent with this idea, shRNAs with longer stems were generally more toxic when overexpressed in mouse liver (Grimm et al., 2006). More importantly, we have experimentally demonstrated the implementation of the “loop-counting rule” in designing shRNAs free of heterogeneous processing. Overall, our results provide additional guidelines on how to design potent si/shRNAs with minimal off-target effects for biological knockdown of important genes and/or treatment of diseases.
To clone the psiCheck reporters carrying the various targets, both strands of each insert were chemically synthesized, annealed, purified, and inserted between the XhoI and SpeI sites of the psiCheck2 vector (Promega). All the oligo sequences used to generate these reporter plasmids can be found in Table S1. A similar approach was used to generate all the shRNA-expressing plasmids. ShRNA sequences (as detailed in the figures) were directly cloned downstream of the U6 Pol III promoter between the BglII and KpnI sites. Plasmids expressing Flag-tagged human Argonaute2 (Ago2) and human Dicer were obtained from Addgene (www.addgene.org).
HEK293 and Mouse embryonic fibroblast (MEF) cells were grown in Dulbecco’s modified Eagle’s medium (DMEM; Gibco-BRL) with L-glutamine, non-essential amino acids, sodium pyruvate and 10% heat-inactivated fetal bovine serum with antibiotics. All transfection assays were done using Lipofectamine 2000 (Invitrogen) following the manufacturer’s protocol.
One hundred ng of psiCheck reporter plasmid was co-transfected with either 10 ng of shRNA plasmid or synthetic siRNA/shRNA (to a final concentration of 30 nM) into HEK293 or MEF cells in 24-well plates. Thirty-six hours post-transfection, FF-luciferase and RL-luciferase activities were measured using Promega’s dual-luciferase kit (cat E1980) protocol and detected by a Modulus Microplate Luminometer (Turner BioSystems).
HEK293 cells in 10cm dishes were transfected with 5 ug of sh-RNA expressing plasmids. 36hr post transfection, total RNA was isolated using Trizol (Invitrogen) and then electrophoresed on 20% (w/v) acrylamide/7M urea gel. After transfer onto a Hybond-N1 membrane (Amersham Pharmacia Biotech), small RNAs were detected using P32-labeled probes (See Table S1 for sequences).
IP experiments were performed using a slightly modified version of a previously described protocol (Gu et al., 2011). In brief, HEK293 cells in 10-cm dishes were co-transfected with 0.5 ug of plasmids expressing Flag-tagged Ago2 and 4.5 ug of plasmids expressing various shRNAs. Cells were lysed 36hrs post-transfection and incubated with Anti-Flag M2 agarose beads (Sigma #A2220) overnight at 4°C. After three washes with cold IP buffer, Flag-Ago-RNA complexes were eluted with 100 ug/ml 3x Flag peptide (Sigma #F4799) in TBS. RNAs associated with Ago2 were extracted by Trizol and subjected to small RNA deep sequencing.
Small RNA libraries were created using a protocol similar to previous small RNA capture procedures (Lau et al., 2001; Maniar and Fire, 2011). Sequencing reads (36nt) for all libraries were generated using the Illumina Genome Analyzer II (Stanford Functional Genomics Facility). After removing low quality reads, all sequences were sorted based on the 5′ barcodes (four nucleotides). Further, reads without 3′ adaptor sequences or shorter than 18nt were dropped. After removing the 3′ linker and 5′ barcode sequences, the resulting reads were aligned to either the 5′ arm or 3′ arm of shRNA sequences with up to 3 mismatches by bowtie (version 0.12.7) (Langmead et al., 2009) without allowing mapping to the reverse-complement reference strand (option “--norc”).
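The read-filtering steps above can be sketched in a few lines of Python. This is a simplified illustration, not the pipeline used in this study: the adaptor string is a hypothetical placeholder, the adaptor is located by exact matching only, and the 18-nt cutoff is applied to the insert after trimming, which is one possible reading of the description.

```python
from typing import Optional, Tuple

BARCODE_LEN = 4          # 5' barcode length stated in the text
MIN_INSERT_LEN = 18      # shortest insert retained
ADAPTOR = "CTGTAG"       # hypothetical placeholder, not the real 3' linker


def preprocess_read(read: str, adaptor: str = ADAPTOR) -> Optional[Tuple[str, str]]:
    """Split a raw read into (barcode, insert), or return None if it is dropped.

    A read is dropped when the 3' adaptor is not found or the trimmed insert
    is shorter than MIN_INSERT_LEN.
    """
    barcode, rest = read[:BARCODE_LEN], read[BARCODE_LEN:]
    cut = rest.find(adaptor)
    if cut == -1:
        return None  # no 3' adaptor detected
    insert = rest[:cut]
    if len(insert) < MIN_INSERT_LEN:
        return None  # insert too short after trimming
    return barcode, insert


# Toy reads: one kept, one too short, one lacking the adaptor.
kept = preprocess_read("ACGT" + "T" * 20 + ADAPTOR)
too_short = preprocess_read("ACGT" + "T" * 10 + ADAPTOR)
no_adaptor = preprocess_read("ACGT" + "T" * 30)
print(kept, too_short, no_adaptor)
```

The retained inserts would then be grouped by barcode and passed to the aligner.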
All reads were first aligned to human miRNA library sequences (miRBase (Kozomara and Griffiths-Jones, 2011)) by bowtie (Langmead et al., 2009). For each particular miRNA or miRNA* sequence, reads with a 5′ end within 4 nt of the expected position were considered to be small RNAs generated from that locus and were included in the calculation in the next step. With very few exceptions, the expected 5′ end as indicated in miRBase was also the most abundant 5′ end for that miRNA/miRNA* measured in deep sequencing results. The variation of each particular read was calculated as the absolute value of the distance between its 5′ end and the expected end. The variation for a particular miRNA/miRNA* was then calculated by averaging the individual variations, using the relative abundance as weight.
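The variation score described above is an abundance-weighted mean of absolute 5′ end offsets. A minimal Python sketch is given below under two assumptions: reads are summarized as a mapping from 5′ start position to read count, and "within 4 nt" is taken to include a distance of exactly 4. The function name and the toy counts are hypothetical.

```python
from typing import Dict


def cleavage_variation(start_counts: Dict[int, int], expected_start: int, window: int = 4) -> float:
    """Abundance-weighted mean absolute distance of read 5' ends from the expected 5' end.

    Reads whose 5' end lies more than `window` nt from the expected position are ignored.
    """
    kept = {pos: n for pos, n in start_counts.items() if abs(pos - expected_start) <= window}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no reads within the window")
    return sum(abs(pos - expected_start) * n for pos, n in kept.items()) / total


# Toy example: 80 reads at the expected start, 20 reads 2 nt upstream,
# and 100 reads far outside the window that are excluded.
toy_counts = {10: 80, 8: 20, 20: 100}
score = cleavage_variation(toy_counts, expected_start=10)
print(score)
```

A score of 0 corresponds to perfectly homogeneous 5′ ends, and larger values indicate less precise cleavage.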
This work was supported by NIH DK 078424 (MAK) and NIH AI071068 (MAK). We thank Dr. Grace Zheng for the Dicer KO ES cells and Dr. George Mias for helpful discussions on statistical methods.
The NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/) accession number for the sequence reported in this paper is submitted but pending