Until recently, only nematodes among animals had a well-defined endogenous small interfering RNA (endo-siRNA) pathway. This has changed dramatically with the recent discovery of diverse intramolecular and intermolecular substrates that generate endo-siRNAs in Drosophila melanogaster and mice. These findings suggest broad and possibly conserved roles for endogenous RNA interference in regulating host-gene expression and transposable element transcripts. They also raise many questions regarding the biogenesis and function of small regulatory RNAs in animals.
RNA interference (RNAi), the process by which double-stranded RNA (dsRNA) is processed into small interfering RNAs (siRNAs) that silence homologous transcripts, is both a fascinating cellular machinery and a powerful experimental technique. Despite an avalanche of RNAi research over the past decade, however, a nagging question remained mostly unanswered: what good is RNAi to the organism itself?
Substantial roles for RNAi in regulating endogenous gene expression have been difficult to ascertain because Drosophila melanogaster1,2 and Caenorhabditis elegans3,4 mutants that selectively inactivate RNAi seem to be normal and fertile. These mutants are hypersensitive to viruses, which suggests that RNAi defends against selfish and invasive nucleic acids5. But if RNAi had an ancestral role in virus restriction it seems to have been subsumed in vertebrates by the interferon pathway. In fact, the nonspecific capacity of dsRNA to activate the interferon response, thereby leading to the general inhibition of cellular translation, was widely perceived to preclude substantial roles for endogenous RNAi in vertebrates.
Eight concurrent papers from the Zamore, Sasaki, Siomi, Lai and Hannon laboratories recently described a rich diversity of endogenous siRNAs (endo-siRNAs) in mice6,7 and D. melanogaster8–13. These studies introduce unanticipated complexity in small-RNA sorting pathways and in the biological roles of siRNAs. We highlight these new classes of endo-siRNAs and the pressing questions that are raised by their discovery.
Argonaute proteins lie at the heart of related small-RNA pathways that operate in organisms as diverse as Archaea, plants and animals14. They bind various small RNAs that are < 32 nucleotides (nt) in length, which guide the Argonaute complexes to their regulatory targets (FIG. 1).
Among animals, the AGO and Piwi subclasses constitute two main classes of conserved Argonaute proteins. AGO proteins bind to microRNAs (miRNAs)14,15, which are RNAs of ~22 nt that derive from host transcripts with short (usually < 100 nt) inverted repeats. These repeats are processed by the RNase III enzymes Drosha (in the nucleus) and Dicer (in the cytoplasm) (FIG. 1a). Specialized AGO proteins with efficient ‘slicing’ activity are the carriers of 21 nt siRNAs2,16,17. Exogenous dsRNAs are processed into siRNAs in the cytoplasm by Dicer, and therefore they do not require Drosha (FIG. 1b). Piwi-interacting RNAs (piRNAs) are slightly longer RNAs (~24–32 nt) that are bound by Piwi-family proteins, which also have slicer activity18–20. Although their biogenesis is not completely understood, a major pathway for piRNA production involves reciprocal cleavages of sense and antisense substrates by antisense and sense piRNAs, respectively19,21 (FIG. 2a).
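The size ranges given above can be illustrated with a minimal sketch that bins sequenced small-RNA reads by length. The function names and the exact cut-offs (21 nt, 22 nt, 24–32 nt) are assumptions made for illustration. Length alone cannot assign a read to a pathway, and class assignments in the studies discussed here rest on further evidence such as Argonaute association and 3′ modification.

```python
from collections import Counter
from typing import Iterable


def size_class(read: str) -> str:
    """Assign a read to a length bin (illustrative cut-offs, not a definitive classifier)."""
    n = len(read)
    if n == 21:
        return "siRNA-like"   # 21 nt, the size of AGO-bound siRNAs in the text
    if n == 22:
        return "miRNA-like"   # ~22 nt, the typical miRNA size in the text
    if 24 <= n <= 32:
        return "piRNA-like"   # ~24-32 nt, the piRNA size range in the text
    return "other"


def tally_size_classes(reads: Iterable[str]) -> Counter:
    """Count how many reads fall into each length bin."""
    return Counter(size_class(r) for r in reads)
```

A length profile of this kind is only a first pass; the bins overlap biologically, so it is best read as a summary of a library rather than as an annotation of individual reads.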
Until recently, C. elegans was the only animal for which endo-siRNAs had been well characterized. Primary siRNAs that are processed by dicing dsRNA are exceedingly rare in this organism22–24. Instead, the 3′ ends of targets that have been cleaved by primary siRNAs are recognized by an RNA-dependent RNA polymerase (RdRP), which generates abundant, untemplated, secondary siRNAs with distinctive 5′ triphosphates (FIG. 3). Secondary siRNAs are then loaded into specialized secondary Argonautes (SAGOs)25. Because other animals do not seem to encode RdRP or SAGOs, it is not evident that the mechanism for worm siRNA biogenesis is broadly conserved. Nematodes also lack conventional piRNAs, as the Piwi homologue PRG-1 contains ‘21U’ RNAs. The biogenesis of these 21 nt RNAs does not seem to be related to that of fly or vertebrate piRNAs22,26–28 (FIG. 2b). Therefore, fundamental aspects of conserved animal small-RNA pathways have clearly been altered in C. elegans.
Recent work now reveals diverse sources of endo-siRNAs in D. melanogaster and in mouse. Most of these endo-siRNA classes seem to be analogous between species, and include those derived from transposable elements, from complementary annealed transcripts, and from long ‘fold-back’ transcripts called hairpin RNAs (hpRNAs).
Because of the mutagenic consequences of transposable elements (TEs), powerful mechanisms are needed to restrict their activity. Such protection is indispensable in the germ line to maintain faithful transmission of the genome. In this context, piRNAs mediate a major defence against TEs29. However, scattered reports in the literature indicated that canonical RNAi also influences TEs. This was most clear in C. elegans, because many RNAi-defective mutants also deregulate transposons3,30. It was proposed that transcriptional read-through across Tc1 transposable elements might produce intramolecular dsRNA between the terminal inverted repeats, the processing of which by RNAi could generate siRNAs that silence Tc1 elements in trans31.
A conundrum for mammalian piRNA studies was that although multiple mouse Piwi-gene mutants exhibit testicular defects, transposon activation and sterility, corresponding mutant ovaries were normal and functional32–34. Instead, Dicer-mutant ovaries and oocytes exhibit higher levels of certain retrotransposon transcripts35,36. This is consistent with either an miRNA-based system for TE control or perhaps the usage of endo-siRNAs. In fact, earlier small-scale sequencing from mouse oocytes and testes revealed that some siRNAs derived from retrotransposons37, which could silence long interspersed nuclear elements (LINEs) in trans38. Newer large-scale cloning provided clearer evidence for TE-siRNAs in mouse oocytes6,7. Many of these mapped to the same genomic locations as piRNA clusters, which raised the possibility that these specialized ‘master loci’ are involved in both piRNA-mediated and siRNA-mediated TE control. However, some transposon classes were apparently targeted by only one of these RNA classes, which suggested that piRNAs or siRNAs preferentially control certain TEs. For example, several long-terminal repeat (LTR) retrotransposons were nearly exclusively targeted by siRNAs6,7.
In D. melanogaster, deep sequencing of the small RNAs that directly associate with AGO2 (the Argonaute that mediates RNAi) revealed that TEs are a substantial source of RNAs of precisely 21 nt11,12. Similar conclusions were reached by sequencing RNAs that were β-eliminated — this prevents RNAs from being ligated on their 3′ ends, unless they bear a 3′ modification, and thus enriches AGO2-loaded RNAs8 — or by analysing total head or cultured-cell RNAs13. Their accumulation is dependent on DCR2 (one of the two Dicers in D. melanogaster, and the one that generates exogenous siRNAs14; FIG. 1c), and the depletion or mutation of either DCR2 or AGO2 elevates TE transcript levels8,11–13. The TE-siRNA response is extremely active in various lines of cultured cells and correlates with the strong genomic amplification of specific LTR retrotransposons in these cells. Therefore, both TE-siRNAs and TE-piRNAs repress transposon transcripts in flies and mammals (FIG. 4).
Cis-natural antisense transcript (cis-NAT) arrangements are genomic regions that encode exons on both DNA strands, and can involve 5′, 3′ or internal exons (FIG. 4). Careful analysis of small-RNA sequences in mouse oocytes7 and D. melanogaster tissues and cultured cells8,9,11,12 revealed that cis-NAT overlaps are favourable for siRNA production. The extent of 21 nt RNA production was limited to annotated exons that are transcribed bidirectionally, excluding adjacent introns. D. melanogaster cis-NAT-siRNAs are dependent on DCR2, and mouse cis-NAT-siRNAs are similarly Dicer-dependent. However, although virtually all cis-NAT-siRNAs in flies derived from 3′ untranslated region (UTR) overlaps, one of the abundant mouse cis-NAT-siRNA loci involved Pdzd11/Kif4, whose transcripts overlap on their 5′ UTRs (FIG. 4).
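The definition of a cis-NAT overlap, exons on opposite strands that share genomic coordinates, can be sketched as a simple interval comparison. The `Exon` record, the half-open coordinate convention and the gene names in the usage note are hypothetical and chosen only for illustration; this is not the annotation procedure used in the cited studies.

```python
from typing import List, NamedTuple, Tuple


class Exon(NamedTuple):
    gene: str
    chrom: str
    strand: str  # "+" or "-"
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # exclusive (assumed convention)


def cis_nat_overlaps(exons: List[Exon]) -> List[Tuple[str, str, str, int, int]]:
    """Return (plus_gene, minus_gene, chrom, start, end) for each region that is
    covered by annotated exons on both strands of the same chromosome."""
    plus = [e for e in exons if e.strand == "+"]
    minus = [e for e in exons if e.strand == "-"]
    overlaps = []
    for p in plus:
        for m in minus:
            if p.chrom != m.chrom:
                continue
            lo, hi = max(p.start, m.start), min(p.end, m.end)
            if lo < hi:  # non-empty intersection: bidirectionally annotated exon sequence
                overlaps.append((p.gene, m.gene, p.chrom, lo, hi))
    return overlaps
```

For example, a plus-strand exon of a hypothetical `geneA` at 100–500 and a minus-strand exon of `geneB` at 450–900 on the same chromosome yield the interval 450–500. Because only exons are compared, adjacent introns are excluded, which matches the observed restriction of 21 nt RNA production to bidirectionally transcribed exons.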
The levels of the 5′-overlapping transcripts Pdzd11 and Kif4 increased modestly in mouse Dicer mutants7, consistent with an autoregulatory activity of the siRNAs generated by this cis-NAT. D. melanogaster cis-NAT-siRNAs specifically load AGO2, but evidence for changes in their progenitor transcripts on loss of DCR2 or AGO2 was equivocal. However, D. melanogaster cis-NAT-siRNA genes (but not cis-NAT genes in general) exhibited striking enrichment for several nucleic-acid-based functions, including transcription cofactors, deoxyribonucleases and ribonucleases9. In addition, most co-expressed cis-NATs in D. melanogaster S2 cells did not generate siRNAs. These data indicate that only a subset of co-expressed cis-NAT pairs are selected for siRNA production, presumably reflecting an endogenous functional use. Intriguingly, one of the most highly expressed cis-NAT-siRNA loci in the entire genome involves the CG7739/Ago2 gene pair8,9,12; thus, AGO2 carries its own siRNAs.
A special class of cis-NAT-siRNAs come from the D. melanogaster klarsicht (klar) gene, which is involved in lipid-droplet transport and nuclear migration, and from the thickveins (tkv) gene, which is involved in transforming growth factor-β signalling9,12. Although these loci produce 3′ modified, 21 nt, AGO2-bound RNAs from both DNA strands, they seem to involve a specialized mechanism for extremely efficient cis-NAT-siRNA production over extended genomic intervals that are 5–10 kb in length9,12. In addition, klar and tkv are not 3′ cis-NATs, but instead involve overlaps with 5′ exons, internal transcript exons and/or annotated intronic regions. Therefore, the strategy for klar and tkv siRNA production seems to differ from that of conventional cis-NAT-siRNAs.
Mammalian genomes encode large numbers of pseudogenes, which are presumed to be non-functional entities that will eventually be lost. Small-RNA cloning from mouse oocytes revealed an unexpected class of ‘functional’ pseudogenes. Multiple genes with antisense-transcribed pseudogenes were inferred to anneal with their complementary progenitors (as trans-NATs) and be diced into siRNAs6,7. The existence of siRNAs that bridge exon-exon junctions suggested that mature mRNAs constitute the dsRNA substrate, as proposed for cis-NAT-siRNA pairs. Microarray profiling and quantitative PCR analysis of Dicer-mutant oocytes revealed substantial upregulation of multiple genes with complementary siRNAs (FIG. 4), indicating that this system regulates endogenous gene expression6,7. It is unclear whether the dicing of targets during trans-NAT-siRNA biogenesis accounts for target regulation, or whether pseudogene-derived antisense siRNAs actively slice sense-strand mRNAs (FIG. 1c). In at least one case, histone deacetylase-1 (Hdac1), siRNAs derived exclusively from sense-antisense pseudogene duplexes, which were inferred to repress functional Hdac1 transcripts6.
Earlier functional tests showed that long dsRNA does not activate protein kinase R or the interferon response in oocytes, as it does in most other mammalian cells39,40. Therefore, oocytes might provide a favourable setting for the exploitation of endogenous RNAi to regulate host transcripts. Genes with complementary pseudogene siRNAs are heavily enriched for microtubule-related functions6. This suggests a regulatory focus to the trans-NAT-siRNA pathway.
Although animal miRNA hairpins are usually < 100 nt, plant miRNA hairpins can be significantly longer41. Because of this property, the hairpin precursors of some plant miRNAs were not initially recognized. Likewise, some ‘long’ miRNA hairpins that are double the length of typical miRNAs were only recently identified in D. melanogaster42. Therefore, animal RNAs that map to inverted repeats might have escaped conventional miRNA annotation.
Bioinformatics studies in D. melanogaster revealed a number of candidate loci that produce small RNAs from extended inverted repeats that are termed hairpin RNAs (hpRNAs), the stems of which were up to 400 base pairs in length10. At least seven distinct loci generate siRNAs, and the hp-CG4068 locus alone encodes 20 tandem hairpins10–12. Despite their structural similarity to miRNAs, hpRNAs are processed by DCR2 instead of DCR1, and generate 3′ blocked siRNAs that load AGO2 (REFS 10–12) (FIG. 1). As with siRNAs from artificial long-inverted repeats, the siRNA duplexes derived from hpRNAs are phased and direct AGO2 to cleave targets.
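The notion of phasing, in which successive siRNA duplexes are cut from a precursor at regular intervals, can be illustrated with a minimal sketch that tallies read 5′ start positions by their register. The 21 nt period follows the siRNA length given in the text; the function names, the single-strand input and the idea of summarizing phasing as the fraction of reads in the dominant register are assumptions for illustration and are not the analysis used in the cited work.

```python
from collections import Counter
from typing import List


def phase_register_counts(starts: List[int], period: int = 21) -> Counter:
    """Count reads by the register (start position modulo period) of their 5' ends.
    Assumes all starts are on the same strand of one precursor."""
    return Counter(s % period for s in starts)


def dominant_register_fraction(starts: List[int], period: int = 21) -> float:
    """Fraction of reads that fall in the single most populated register.
    Values near 1.0 indicate strong phasing; unphased reads approach 1/period."""
    if not starts:
        return 0.0
    counts = phase_register_counts(starts, period)
    return max(counts.values()) / len(starts)
```

With 5′ starts at positions 0, 21, 42 and 63 plus one stray read at position 5, four of the five reads share register 0, giving a dominant-register fraction of 0.8.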
One of the hp-CG4068 siRNAs is highly complementary to the coding region of mutagen-sensitive-308 (mus308), a DNA polymerase that is involved in the DNA-damage response, and can cleave this target site10,12. In this case, mus308 is the only obvious target of the many siRNAs that are generated by hp-CG4068. However, hp-CG18854 is a pseudogene with substantial homology to CG8289, which encodes a chromodomain protein, and elevated hp-CG18854 could repress CG8289 in trans10. Curiously, several candidate hpRNA loci were identified in mouse, including a long-inverted repeat pseudogene of the Ran GTPase-activating protein-1 (Rangap1) gene6,7. It is unclear whether these hpRNA pathways are conserved or convergent, but they at least suggest that analogous systems operate in D. melanogaster and mammals. However, it is clear that entry into an endo-siRNA pathway can endow pseudogenes in both species with regulatory activity.
There are two Dicers in D. melanogaster — DCR1 cleaves pre-miRNA hairpins into miRNA duplexes, whereas DCR2 cleaves long dsRNA into siRNA duplexes14 (FIG. 1). Each Dicer directly binds to a dsRNA-binding domain (dsRBD) partner that aids its function. DCR2 interacts with R2D2 (whose name derives from the fact that it contains two dsRNA-binding domains (R2) and is associated with DCR2 (D2)), which is essential for the loading of siRNA into AGO2 (REFS 43,44). DCR1 interacts with Loquacious (LOQS), which promotes its ability to cleave pre-miRNA hairpins, the products of which are preferentially loaded into AGO1 (REFS 45–47).
Although the attractive symmetry of RNase III, dsRBD and AGO partnerships in the RNAi and miRNA pathways lent support to the proposed division of these pathways, genetic observations suggested that there are much more complex interactions among these factors. For example, unlike Dcr2 mutants, r2d2 mutants reveal its requirement for early development and female fertility. Moreover, r2d2 (but not Dcr2) phenotypes are strongly enhanced on reduction of Dcr1 (REF 48). Reciprocally, Dcr1 proved to be an RNAi-defective mutant1. There is substantial functional overlap between AGO1 and AGO2, as detected by double-mutant analysis49, and some miRNAs sort to both AGO1 and AGO2 (REFS 11,12,50–52). Finally, loqs functions in inverted-repeat RNA-mediated silencing46. These findings indicate that there is substantial crosstalk between the RNAi and miRNA pathways.
Although LOQS was originally classified as a core component of the miRNA pathway, loqs-null mutants have only modest defects in the maturation of many miRNAs53. It seems that DCR1 can cleave pre-miRNAs without LOQS, albeit with lowered efficiency that varies between miRNAs53. Surprisingly, LOQS is essential for the accumulation of many endo-siRNAs9,10,12,13 (FIG. 1c). At least some of the members of all of the siRNA classes (TE-siRNAs, cis-NAT-siRNAs and hpRNA-siRNAs) are dependent on LOQS. Although previous tests did not reveal a physical interaction between LOQS and DCR2, proteomic analysis of DCR2 complexes revealed that there is comparable coverage of LOQS and R2D2 peptides12. Therefore, LOQS is a component of both miRNA and RNAi pathways.
The recent papers on endo-siRNAs raise fundamental questions regarding the biogenesis of small RNAs. Some of the most important questions concern mechanistic aspects of small-RNA sorting pathways. For example, how does LOQS work with DCR2? And given that R2D2 is needed to load exo-siRNAs into AGO2 (REFS 43,44), to what extent do endo-siRNAs require R2D2 for loading? How are miRNA and hpRNA precursors distinguished? Some ‘long’ miRNAs and ‘short’ hpRNAs in D. melanogaster are indistinguishable in size and structure10,42. They are effectively sorted, however, as long miRNAs make only a single small-RNA duplex (as is typical for DCR1 substrates), whereas short hpRNAs produce multiple duplexes (as is typical for DCR2 substrates). How can the cell distinguish these hairpins?
The regulation of dsRNA formation is another mystery. For example, the cis-NAT-siRNA pathway accepts many substrates — at least 17 in mouse oocytes6,7 and at least 140 in D. melanogaster8,9,12. However, cis-NAT-siRNA loci constitute only 25% of co-expressed cis-NATs in D. melanogaster9. Is there active selection for entry into the RNAi pathway, which could be mediated at the step of dsRNA formation? Conversely, how do co-expressed mammalian cis-NATs, and co-expressed pseudogene-gene complementary pairs, avoid triggering an interferon response outside of oocytes? Finally, although it seems evident that cis-NAT and trans-NAT siRNAs are generated from processed transcripts, it is not known whether the dsRNA substrate forms in the nucleus or cytoplasm, nor is it clear where the dsRNA encounters Dicer.
Valuable lessons were taught by the length and structure of primary hpRNA transcripts. Their dsRNA character was recognized only after genomic fragments of sufficient length were examined, and consequently their siRNAs were prone to being misannotated as having derived from shorter, unstructured precursors8. The stems of some plant miRNA hairpins are separated by long, unstructured terminal loops and even introns54, and we now recognize the same to be true for several hpRNAs10,12. It is therefore conceivable that the stems of some hpRNA precursors might be separated by kilobases or tens of kilobases. Do the structured precursors of any anonymous cloned small RNAs that are currently deposited in public databases await discovery?
Endogenous sources of mammalian dsRNA remain to be recognized outside of oocytes. As is the case with oocytes, introduction of long dsRNA into embryonic stem cells (ESCs) does not activate an interferon response55,56. Might ESCs also harbour endo-siRNAs, the action of which is relevant for maintaining pluripotency? Although endo-siRNAs were not previously found in ESCs57 this possibility might deserve further study.
Finally, although small-RNA sorting pathways have received little attention in mammalian systems, there is growing recognition of their importance to siRNA and miRNA function in plants58,59, worms60,61 and flies51,62. As only one of the four mammalian AGO proteins (AGO2) has slicer activity16,17, the directed sorting of mammalian siRNAs is presumably important for their ability to slice complementary targets63. Consequently, the elucidation of mammalian siRNA sorting rules might have important implications for attempts to improve siRNA efficacy for experimental and therapeutic purposes.
To return to the question posed at the beginning of this Perspective, what good is endogenous RNAi to an organism? The necessity to preserve RNAi in mammals has been somewhat of an enigma as they seem to have mostly dispensed with siRNAs for antiviral defence, and some aspects of mammalian biology can be rescued by slicer-defective AGO2 (REF. 64). However, in addition to a few endogenous cleavage targets of miRNAs65, and a role for AGO2 in the biogenesis of select miRNAs66, the new studies suggest widespread usage of endo-siRNAs as endogenous regulators of gene expression.
However, it is safe to say that we do not understand the specific biological functions of endo-siRNAs well. Indeed, the question of endo-siRNA function remains mostly unanswered in worms22,67,68, and the discovery of abundant endo-siRNAs in flies and mammals only makes the understanding of this topic more pressing. The recent papers do show deregulation of retrotransposon transcripts, pseudogene-complementary transcripts and some cis-NAT pairs in Dicer and/or Ago mutants, and thus their regulation by endo-siRNAs is plausible, although this remains to be shown directly. Evidence for direct siRNA-mediated target regulation was only explicitly shown for some hpRNAs in D. melanogaster10,12, and such evidence would be desirable for other classes of endo-siRNAs.
The established targets of D. melanogaster hpRNAs encode DNA-binding proteins10,12. This seems reminiscent of the fact that D. melanogaster cis-NAT-siRNA loci are significantly enriched for DNA and RNA-binding proteins9, raising this as a substantial molecular axis for endo-siRNA regulation. It is relevant to note, therefore, that D. melanogaster Dcr2 mutants exhibit abnormal nucleolar morphology69, whereas Ago2 mutants were reported to have chromosome segregation defects70. These phenotypes are plausibly connected to the types of gene functions that are highly enriched in cis-NAT-siRNAs. The mouse oocyte pseudogene-gene siRNA system seems to preferentially target genes that are involved in microtubule dynamics6, and this is plausibly connected to the observation that Dicer loss in growing oocytes disrupts spindle formation and chromosome segregation35,71. Nevertheless, the endogenous requirement of these systems remains to be demonstrated by specific knockouts of hpRNAs or siRNA-generating pseudogenes.
Overall, the fact that core RNAi pathway mutants in worms and flies are mostly normal and fertile, whereas core miRNA pathway mutants are lethal, suggests that the role of endogenous RNAi is fundamentally different from that of miRNA regulation. This is further suggested by the fact that many miRNAs are deeply conserved but most D. melanogaster hpRNA loci10,12 and most mouse pseudogenes that generate siRNAs6,7 are poorly conserved. We must therefore think more openly about their usage. Is the usage of these RNAs a matter of fine-tuning gene expression, or perhaps a matter of maintaining fitness in an ever-changing environment? Is endogenous RNAi used for robustness in gene regulation, perhaps to canalize traits? Or is it a regulatory mechanism that generates species-specific characters during evolution? These are questions that remain for the future, but given the pace with which the field of endo-siRNAs has recently advanced, we might expect some answers to be forthcoming soon.
K.O. was supported by the Charles Revson Foundation. E.C.L. was supported by the V Foundation for Cancer Research, the Sidney Kimmel Foundation for Cancer Research, the Alfred Bressler Scholars Fund and the National Institutes of Health (GM083300).