Small RNAs directly or indirectly impact nearly every biological process in eukaryotic cells. To perform their myriad roles, not only must precise small RNA species be generated, but they must also be loaded into specific effector complexes called RNA-induced silencing complexes (RISCs). Argonaute proteins form the core of RISCs and different members of this large family have specific expression patterns, protein binding partners and biochemical capabilities. In this Review, we explore the mechanisms that pair specific small RNA strands with their partner proteins, with an eye towards the substantial progress that has been recently made in understanding the sorting of the major small RNA classes — microRNAs (miRNAs) and small interfering RNAs (siRNAs) — in plants and animals.
The discovery of RNA interference (RNAi) in the late 1990s sparked a renaissance in our understanding of RNAs as regulatory molecules. A growing number of small RNA classes has since emerged from studies of eukaryotic organisms, and these RNAs can be approximately divided into two groups: small RNAs that engage RNAi-related machinery and those that do not. As yet, we know very little about many newly discovered groups of small RNAs, but our understanding of the biogenesis and biological functions of RNAi-related small RNA classes is growing rapidly.
Small RNAs that engage RNAi-related pathways share several characteristic features. They are mainly ~20–30 nucleotides (nt) in length, have 5′ phosphate groups and 3′ hydroxyl (−OH) termini (although these are sometimes modified), and they associate with specific members of a large protein family, the Argonautes. The precise combination of a small RNA with a particular Argonaute protein determines its biological function. Therefore, it is crucial that these very similar species are appropriately sorted among closely related partners. Only then can the target specificity conferred on Argonaute proteins by their small RNA guides enable their myriad important roles, which include the regulation of gene expression, modification of chromosome structure and protection from mobile elements. Conceptually, all small RNA-mediated regulatory events can be considered as the culmination of several consecutive steps: small RNA biogenesis, strand selection (in cases in which dsRNA is the precursor), loading into Argonaute, target recognition and effector function.
The biogenesis of most small RNA classes, including microRNAs (miRNAs) and many small interfering RNAs (siRNAs), requires the action of RNase III family proteins (reviewed in REFS 1–3). Some small RNA classes, including Piwi-interacting RNAs (piRNAs) and secondary siRNAs in worms, however, are not derived from dsRNA precursors and are produced through alternative biogenesis mechanisms independently of RNase III enzymes4-8.
Following their production, small RNAs are sorted to confer association with specific Argonaute family proteins, which function as the core of the RNA-induced silencing complex (RISC). Argonaute proteins can be classified into three subgroups according to their sequence relationships: the AGO subfamily, the Piwi subfamily and the worm-specific WAGO clade9-11. Piwi subfamily proteins load small RNAs derived from single-stranded precursors (piRNAs) and AGO clade proteins usually associate with small RNA duplexes processed by RNase III endonucleases (miRNAs and siRNAs; reviewed in REFS 1,2). Small RNAs that occupy WAGO clade proteins are usually direct products of RNA synthesis6,7,9.
Mature RISC consists of a single-stranded small RNA bound to an Argonaute protein. As some small RNAs are generated as duplexes, only one strand (the guide strand) is retained and the other (passenger) strand is discarded during RISC assembly12-14. AGO clade proteins are generally loaded with small RNA duplexes before RISC maturation. Thus, it is of key importance to assemble RISC in a manner that ensures that the appropriate guide strand is selectively stabilized, as loading of the passenger strand would obviously misdirect RISC towards inappropriate targets. Small RNAs guide mature RISC through complementary base pairing to its targets, with the most common outcome being target repression (reviewed in REFS 15-17).
The knowledge of the mechanisms that guide a particular small RNA strand into a specific Argonaute family member is crucial. It impacts our ability to predict the biological function of a small RNA and to effectively use small RNAs as experimental tools or therapeutics. This Review focuses on our understanding of small RNA sorting in plants and animals. We consider biogenesis as a starting point as this affects the nature of small RNAs and, in some cases, the complexes which the small RNAs join. Next, we discuss the small RNA-intrinsic determinants of sorting, followed by RISC loading and maturation. Finally, we briefly cover the implications of sorting for Argonaute function. We do not extensively discuss the effector mechanisms of mature RISC, but instead refer the reader to several excellent recent reviews on this topic15-17.
In effect, the first step of small RNA sorting is biogenesis, as this determines the small RNAs that are available for RISC loading. Moreover, the precise enzymes that liberate small RNAs from their precursor transcripts or generate them de novo seem to impact the choice of their ultimate Argonaute partner. Therefore, it is important to begin with an introduction to the varied mechanisms that can produce small RNAs.
Small RNA duplexes from partial or perfect dsRNA precursors are generated by RNase III family enzymes through sequential endonucleolytic cleavage events. These enzymes often partner with dsRNA binding domain (dsRBD) proteins, which serve to increase substrate specificity and affinity, leading to increased activity. The resulting products are duplex ~20–24-nt small RNAs consisting of two strands (the guide or miR and passenger or miR* strands). These small RNAs feature 5′ monophosphates and 2-nt overhangs that have hydroxyl groups at their 3′ termini.
miRNAs are ubiquitous in animal genomes and are often transcribed as separate coding units, many of which consist of polycistronic clusters containing multiple miRNAs. Some miRNAs are also present in introns and presumably arise from further processing of the excised introns of protein-coding genes18. Most miRNAs are transcribed by DNA-dependent RNA polymerase II (RNAPII) to generate a primary miRNA (pri-miRNA) containing a region of imperfect dsRNA, known as the stem–loop structure, that harbours the future mature miRNA19,20 (FIG. 1). Primary miRNA transcripts largely resemble the transcripts of protein-coding genes. They have 5′ cap structures and polyA tails, and may contain introns. The production of conventional miRNAs from these precursors proceeds through two site-specific cleavage events. Processing likely begins with a dsRBD protein, Pasha/DiGeorge syndrome critical region gene 8 (DGCR8), binding to the pri-miRNA and recruiting the RNase III enzyme Drosha to form a multiprotein complex called the Microprocessor21-24. This complex recognizes the duplex character of the pri-miRNA, although the precise RNA–protein interactions that select pri-miRNAs as Microprocessor substrates and how the cleavage site is determined by these interactions are matters of ongoing work. The pri-miRNA is cleaved by Drosha to liberate a ~60–70-nt precursor miRNA (pre-miRNA) from the primary transcript25. The nuclear export protein exportin 5 recognizes the 2-nt single-stranded 3′ overhang of the pre-miRNA (characteristic of RNase III-mediated cleavage) and actively transports it in a Ran–GTP-dependent manner to the cytoplasm26-28. Additional factors, including the nuclear export receptor exportin 1 (XPO1), the cap-binding complex (CBC) and the Arabidopsis thaliana SERRATE homologue, ARSENITE-RESISTANCE PROTEIN 2 (ARS2), were recently suggested to play a part in the transition from pri- to pre-miRNA29-31.
Once in the cytoplasm, the pre-miRNA is cleaved into a ~22–23-nt miRNA:miRNA* duplex by Dicer32-35. For this purpose, the sole mammalian Dicer partners with the dsRBD protein TAR RNA-binding protein 2 (TARBP2, also known as TRBP)36,37, whereas the Drosophila melanogaster miRNA-generating Dicer 1 (DCR1) similarly interacts with a specific isoform of its dsRBD protein partner loquacious (LOQS-PB)38-42. Small RNA duplexes generated by Dicer (and its protein partner) exhibit 2-nt single-stranded 3′ overhangs at both ends, a signature of RNase III cleavage.
Several unconventional miRNAs that are defined by their use of alternative maturation strategies have now been noted. For example, mirtrons have been found in flies and mammals43-45. Mirtrons bypass the Drosha processing step and instead use the splicing machinery to generate pre-miRNAs. Mirtrons are very short introns and are excised, debranched and refolded into short stem–loop structures that mimic pre-miRNAs and are processed into mature miRNAs by Dicer. A few recently discovered mirtrons in flies are initially generated with extended 3′ tails that must be resected by the exosome to form a pre-miRNA suitable for Dicer processing46.
Plant miRNAs are transcribed by RNAPII to yield capped and polyadenylated pri-miRNAs with local stem–loop structures that are potentially stabilized by the RNA-binding protein DAWDLE (DDL)47. Plant pri-miRNAs typically display greater diversity in the size and structure of their stem–loops compared with their animal counterparts48. As plants lack a Drosha orthologue, pri-miRNAs are converted into mature miR:miR* duplexes by a single RNase III family enzyme, DICER-LIKE 1 (DCL1)48-51, which fulfils the functions of both Drosha and Dicer (BOX 1). As in animals, Dicer is assisted by a dsRBD protein, in this case, HYPONASTIC LEAVES 1 (HYL1)52-54. HYL1 and the zinc finger protein SERRATE promote accurate miRNA processing53,55-57. miRNA maturation is also aided by the nuclear cap-binding complex53,58,59, probably by facilitating the loading of miRNA-processing factors onto pri-miRNAs.
Plant microRNAs (miRNAs) are generally produced by sequential rounds of Dicing. This is necessitated by the lack of a Drosha orthologue. The extensive nature of the hairpins that lead to many plant miRNAs also permits phased production of multiple small RNA duplexes through sequential Dicing events, conceptually the plant version of long hairpin endogenous small interfering RNA (siRNA) precursors or miRNA polycistrons in animals. a | Usually, consecutive Dicing proceeds from the base of the stem–loop. The secondary structure of the primary miRNA (pri-miRNA) flanking the mature miR:miR* duplex is important for proper and efficient processing, analogous to the proposed role of the ‘basal stem’ of animal pri-miRNAs156-159. Accurate processing depends on a region of imperfect pairing (junction between ssRNA and dsRNA) approximately 15 nucleotides (nt) from the miR:miR* duplex (towards the free end of the stem-loop), which localizes DICER-LIKE 1 (DCL1) to its initial cleavage site. This liberates an intermediate similar to animal precursor miRNAs (pre-miRNAs), which is further processed by DCL1 into the mature miR:miR* duplex. b | Variations in processing mechanisms are possible. For example, miR319 and miR159 (both with conserved long precursors) are produced by an unusual loop-to-stem mechanism. Following the first cleavage of the loop by DCL1, consecutive cuts by DCL1 are necessary to release the mature miRNA duplex160. CBC, cap-binding complex; DDL, DAWDLE; HEN1, HUA ENHANCER 1; HYL1, HYPONASTIC LEAVES 1; SE, SERRATE.
Maturation of plant miRNA duplexes often proceeds through several rounds of sequential Dicing from the base of a long stem–loop (BOX 1). Processed miRNA duplexes are modified by the methyltransferase HUA ENHANCER 1 (HEN1)60-62. In contrast to its D. melanogaster homologue, plant HEN1 is nuclear and adds methyl groups to the 3′ ends of both strands of the miR:miR* duplex. This 2′-O-methylation is thought to protect miRNAs from further modifications, such as 3′ uridylation60,62, which mark single-stranded miRNAs for destruction by exonucleases of the SMALL RNA-DEGRADING NUCLEASE (SDN) family63. This adaptation may be necessitated by the fact that plant miRNAs pair extensively with target mRNAs and cleave them, a process which in animals provides a trigger for small RNA destruction64. Following methylation by HEN1, miR:miR* duplexes are thought to be transported to the cytoplasm by an Exportin 5 homologue, HASTY (HST), or through HST-independent mechanisms65. Sorting and RISC assembly are thought to take place in the cytoplasm. However, the exact form of the exported cargo and the subcellular localization of plant RISC loading and maturation remain subjects of current debate3. In this regard, a recent study proposed a model in which RISC is assembled in the nucleus and only mature AGO1–RISC containing a single-stranded miR can be exported to the cytoplasm66.
The first siRNAs were discovered in plants67. The earliest identified examples were derived from viral replication intermediates or complex interactions between transgene copies. By considering the commonalities between these origins, dsRNAs were indicated as the source of small RNAs. It is now clear that plants and animals produce a wide range of siRNAs. These vary in their biogenesis mechanisms, but can be approximately divided into two classes, depending on whether they require RNA-dependent RNA polymerases (RdRPs) for their production.
The process of converting dsRNA into small RNAs is perhaps currently best understood in D. melanogaster. Here, the experimental introduction of long dsRNAs results in the production of exo-siRNAs that are ~21 nt in size (FIG. 2a). Long dsRNAs are processed into siRNA duplexes through sequential cleavage events by the RNase III protein Dicer 2 (DCR2) (REFS 68,69) in collaboration with its dsRBD co-factor, a particular Loquacious isoform, LOQS-PD42,70. Dicer 2 also interacts with another dsRBD protein, R2D2, but only LOQS-PD enhances siRNA production69,71. Recent studies indicate a role of R2D2 in loading siRNA duplexes into RISC (discussed below), suggesting that these two dsRBD proteins may have distinct and sequential functions71,72.
In flies, siRNAs also originate from numerous endogenous loci and were termed endogenous siRNAs (endo-siRNAs)73-77. These can originate from RNA transcripts with extensive hairpin structures, from convergent transcription units (similar to plant nat-siRNAs, see below) or from the annealing of sense and antisense RNAs from unlinked loci. One example of the latter type of siRNAs are endo-siRNAs that target transposons, which seem to arise at least in part from the hybridization of transposon mRNAs with piRNA cluster transcripts. Another possible source of dsRNA hybrids is the interaction of sense and antisense transcripts across individual transposon copies, and it has even been suggested that RdRPs may operate in animals to form dsRNAs78. As with exo-siRNAs, the biogenesis of endo-siRNAs depends on Dicer 2 assisted by LOQS-PD42,73,75-77.
A similar situation has been described in mammals; however, the range of cell types in which dsRNAs are produced and converted into siRNAs seems to be limited. Thus far, endo-siRNAs have been detected in abundance only in mouse oocytes and embryonic stem (ES) cells79-81. The dsRNA triggers that give rise to murine endo-siRNAs are predicted to arise from trans interactions between gene and pseudogene transcripts, from overlapping transcription units and from transcripts that can form long hairpins. As in flies, endo-siRNA biogenesis is dependent on Dicer and, presumably, its dsRBD partners.
In contrast to mammals and flies, worms and plants produce numerous endo-siRNAs using biogenesis mechanisms that depend on the action of RdRPs. Plant RdRPs copy single-stranded precursors into long dsRNAs that are cleaved by Dicer, whereas worm RdRPs can directly synthesize siRNAs without Dicer processing.
Primary siRNAs in Caenorhabditis elegans are produced conventionally, from long dsRNA triggers through the action of DCR-1 (REFS 33,35,82) (FIG. 2b). The siRNAs associate with the Argonaute family protein RDE-1 and guide it to target transcripts. The RDE-1–target interaction recruits an RdRP, an outcome that is independent of RDE-1 catalytic activity83. The RdRP uses the target as template for the synthesis of secondary siRNAs of 22–24 nt. Secondary siRNAs possess triphosphates at their 5′ ends, indicating that each small RNA is produced as a discrete moiety by de novo synthesis6,7.
The production of most plant siRNAs requires the action of RdRPs to convert ssRNA precursors to dsRNA triggers. Three major subclasses of endogenous siRNAs can be distinguished in plants: trans-acting siRNAs (ta-siRNAs), natural antisense transcript-derived siRNAs (nat-siRNAs) and heterochromatic siRNAs (hc-siRNAs). Each of these small RNA subclasses is produced by a specific Dicer family member and preferentially loaded into a distinct AGO complex.
The biogenesis of ta-siRNAs requires the interplay of canonical components of miRNA and siRNA processing84-90 (FIG. 2c). The process begins with miRNA-mediated cleavage of the TAS1 or TAS3 non-coding RNAs by miR173–AGO1 or miR390–AGO7, respectively. This triggers the recruitment of SUPPRESSOR OF GENE SILENCING 3 (SGS3) and RNA-DEPENDENT RNA POLYMERASE 6 (RDR6), which synthesizes dsRNA using the cleavage site as the entry point. The resulting dsRNA is processed by DCL4 and its dsRBD protein partner DRB4 into a phased series of 21-nt siRNA duplexes, which begins at the site of initial cleavage. ta-siRNAs are methylated by HEN1 before AGO loading. The subcellular localization of biogenesis factors and RNA intermediates, along with the recruitment of SDE5 (a putative export factor homologue), suggests that ta-siRNA biogenesis might involve specific nuclear–cytoplasmic shuttling3,91.
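The phased register described above, with consecutive 21-nt duplexes counted from the initial cleavage site, can be illustrated with a minimal sketch. The function name `phased_windows`, the 0-based half-open coordinates and the choice to count windows in one direction (downstream) from the cleavage site are illustrative assumptions, not details taken from the cited studies.

```python
def phased_windows(cleavage_site, transcript_length, phase=21):
    """Yield (start, end) half-open intervals of consecutive `phase`-nt
    windows, beginning at the cleavage site and running downstream.

    Coordinates are 0-based; `cleavage_site` is the first nucleotide of
    the first window. Direction and coordinate system are assumptions
    made for illustration only.
    """
    start = cleavage_site
    # Stop as soon as a full window no longer fits in the transcript.
    while start + phase <= transcript_length:
        yield (start, start + phase)
        start += phase


if __name__ == "__main__":
    # Hypothetical transcript of 170 nt cleaved so that phasing starts at 100.
    for window in phased_windows(cleavage_site=100, transcript_length=170):
        print(window)
```

Because every window boundary is fixed by the cleavage position, a shift of the cleavage site by a single nucleotide would shift every siRNA in the series, which is one way to see why the initiating miRNA cleavage sets the identity of all downstream ta-siRNAs.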
Plant genomes often possess convergent transcription units that can give rise to dsRNA. Under certain conditions, often resulting from biotic and abiotic stress, bidirectional transcription is induced and the resulting dsRNA is processed into nat-siRNAs92-94. Production of nat-siRNAs requires DCL2 (which produces 24-nt siRNAs) or DCL1 (resulting in 22-nt siRNAs), depending on the genomic origin of the overlapping transcripts. Other essential biogenesis factors include RDR6, SGS3, HYL1, HEN1 and RNAPIV92,93.
A highly abundant class of plant endo-siRNAs — hc-siRNAs — arises from repeats and transposable elements95-101. hc-siRNAs are predominantly 24 nt in size and their biogenesis, which is thought to occur in nucleolar bodies, depends on DCL3, its partner protein CLASSY1 (a SNF2 domain protein), the RdRP RDR2 and the plant-specific DNA-dependent RNA polymerases RNAPIV and RNAPV. Processed siRNA duplexes are methylated by HEN1 and primarily loaded into AGO4.
Once produced, small RNAs and, in many cases, specific small RNA strands must be loaded into Argonaute proteins. Sorting is influenced by the Dicer that processes the precursor, the structure of the small RNA duplex, its terminal nucleotides, its thermodynamic properties and the destination AGO protein (see BOX 2 for structural properties of AGO proteins).
Argonaute (AGO) proteins provide numerous possibilities for RNA–protein interactions that might underlie the proposed determinants of small RNA strand sorting. The interaction between AGOs and small RNAs occurs through several contact points in three characteristic domains of the protein: the PAZ, Mid and PIWI domains (a and b; part b shows a stereo view of the crystal structure of Thermus thermophilus AGO bound to a guide DNA–target RNA duplex161).
The PAZ domain hosts the 3′ end of the small RNA162,163, whereas the Mid domain forms a binding pocket that anchors the 5′ phosphate of the terminal nucleotide of the small RNA111,161,164-167. These interactions provide opportunities for base-specific contacts that might provide preferences for 5′ nucleotides or might encourage the loading of duplexes with unstable 5′ ends. Whereas plant, fly and worm microRNAs (miRNAs) show a strong tendency to start with U, human miRNAs are biased towards U or A as 5′ terminal nucleotides73,76,106-109,111,112. Recent work provides structural evidence for nucleotide-specific interactions in the Mid domain of human AGO2 that ensure the preference for a 5′ terminal U (or A), while excluding G or C through a nucleotide specificity loop111. Interestingly, this structure is well conserved in all four human AGO proteins as well as in Drosophila melanogaster AGO1 or the worm miRNA acceptors ALG-1 and ALG-2. By contrast, AGO proteins that function in other small RNA pathways, such as D. melanogaster AGO2 or plant AGOs, lack this nucleotide specificity loop111. Whether the region corresponding to the nucleotide specificity loop in these distant proteins contributes to sorting of small RNAs, depending on the 5′ nucleotide or not, awaits further structural investigation.
The PIWI domain, which shows similarity to RNase H folds, harbours the residues required for catalytic activity (in AGO proteins, usually Asp–Asp–His). Thus, cleavage-competent AGO proteins carry out endonucleolytic cleavage of target transcripts through their PIWI domain164,168-170. Cleavage products of AGO enzymes feature 3′ hydroxyl and 5′ phosphate ends171,172.
Panel b is reproduced from Ref. 161 © (2008) Macmillan Publishers Ltd. All rights reserved.
In part, sorting may be driven by specific protein–protein interactions between biogenesis and effector components. For example, in animals, Dicing and Argonaute loading have been proposed to occur as concerted processes102,103. If a particular Dicer preferentially binds one Argonaute family member, this coupling provides an opportunity to direct specific precursors into certain effector complexes. However, Dicer and Argonaute cannot be the full story. Instead, it is clear that more complex loading and strand-recognition pathways also influence the sorting of small RNAs. To exert its regulatory functions, mature RISC must be programmed with a single-stranded RNA. Thus, for small RNAs that are initially produced as duplexes, one strand must be chosen and the other discarded, a process called RISC loading. Strand selection must not be random. For example, for most miRNAs, evolutionary pressure has honed one particular strand of the duplex as a crucial regulator and loading of the other strand, the miR*, would cause silencing of the wrong set of genes.
Even from the first mechanistic studies, it was clear that strand choice was partly encoded in the intrinsic structure of the small RNA duplex, and a major determinant resides in its thermodynamic properties104,105. For both miRNAs and siRNAs in flies and mammals, the strand with the least stable 5′ end is more often retained. There are also additional favourable sequence characteristics, such as a bias for a U at position 1 (see BOX 2 for further details)73,76,106-110. Recently, our understanding of small RNA-sorting determinants has expanded substantially, and Argonaute and RNA structural studies have begun to provide a mechanistic basis for observations from in vitro and in vivo analyses90,106,107,110-112.
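The asymmetry rule can be made concrete with a toy calculation. The sketch below compares the pairing strength at the 5′ end of each strand of a duplex with 2-nt 3′ overhangs and returns the strand whose 5′ end is less stably paired. The pair scores, the 4-nt window and the function names are assumptions chosen for illustration; quantitative analyses would use measured nearest-neighbour free energies rather than these arbitrary units.

```python
# Toy pair strengths in arbitrary units (NOT measured free energies).
PAIR_STRENGTH = {
    frozenset("GC"): 3,
    frozenset("AU"): 2,
    frozenset("GU"): 1,  # wobble pair
}


def pair_strength(x, y):
    """Return the toy strength of a base pair; mismatches score 0."""
    return PAIR_STRENGTH.get(frozenset((x, y)), 0)


def five_prime_end_stability(strand, partner, window=4, overhang=2):
    """Sum pair strengths over the first `window` positions of `strand`.

    Both strands are written 5'->3' and are assumed to be of equal length,
    each with an unpaired 3' overhang of `overhang` nucleotides.
    """
    paired = len(strand) - overhang  # length of the base-paired region
    return sum(
        pair_strength(strand[i], partner[paired - 1 - i])
        for i in range(window)
    )


def pick_guide(strand_a, strand_b):
    """Return the strand with the less stable 5' end, or None on a tie."""
    stability_a = five_prime_end_stability(strand_a, strand_b)
    stability_b = five_prime_end_stability(strand_b, strand_a)
    if stability_a < stability_b:
        return strand_a
    if stability_b < stability_a:
        return strand_b
    return None


if __name__ == "__main__":
    # Hypothetical duplex: strand `a` starts in an A:U-rich end,
    # strand `b` starts in a G:C-rich end.
    a = "UUAAGCGCGCGCGCGCGGCUU"
    b = "GCCGCGCGCGCGCGCUUAAUU"
    print(pick_guide(a, b))
```

In this hypothetical duplex the first strand begins with four A:U pairs and the second with four G:C pairs, so the first strand is selected regardless of the order in which the two strands are passed.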
In mammals, a single Dicer assorts siRNAs and miRNAs among four Argonaute subfamily proteins, apparently without much discrimination. However, in D. melanogaster, two distinct Dicer proteins process small RNA duplexes that preferentially enter AGO1 or AGO2 complexes. Generally, AGO1 is occupied by miRNAs, whereas AGO2 associates with siRNAs. This parallels the processing of miRNAs by Dicer 1 and siRNAs by Dicer 2. However, there are exceptions to the rule. For example, there are Dicer 1-derived small RNAs that preferentially load AGO2, implying the existence of a post-processing sorting mechanism107,113. Although miRNA and siRNA processing intermediates are approximately 19–21-nt duplexes with 2-nt 3′ overhangs, the character of their duplexed portions substantially differs (FIG. 3a). siRNAs are derived from duplexes featuring perfect or nearly perfect dsRNA, whereas miRNAs originate from precursors that typically contain several mismatches or bulges. Other features that affect sorting include the terminal nucleotides and thermodynamic properties of the duplex ends.
The numerous inputs into the sorting decisions of small RNAs have posed a challenge to predicting their fates in D. melanogaster. However, recent studies have suggested the application of hierarchical rules to predict differential AGO loading106,107,110. At the top level is duplex structure, specifically its degree of base pairing. Small RNA strands with unpaired central regions (~nucleotides 9–10) tend to be directed into AGO1 and disfavoured for AGO2 loading. Although D. melanogaster AGO1 and AGO2 show different preferences for terminal nucleotides (AGO1 favours a terminal U, whereas AGO2 shows a bias towards a 5′ C)76,106,107, the identity of the 5′ nucleotide only makes a minor contribution to sorting107. For perfect duplexes, thermodynamic asymmetry dominates strand choice, which is precisely as was originally proposed104,105,107.
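One way to picture such a hierarchy is as a weighted score in which duplex structure always outweighs the identity of the 5′ nucleotide. The sketch below is a deliberately simplified illustration: the function name, the boolean encoding of central pairing and the numerical weights are all assumptions, and the cited studies do not report the rules in this form.

```python
def predict_fly_ago(central_unpaired, five_prime_nt,
                    structure_weight=3, nucleotide_weight=1):
    """Toy prediction of the preferred fly Argonaute for one strand.

    central_unpaired: True if the duplex is unpaired around
        nucleotides 9-10 of the strand under consideration.
    five_prime_nt: the 5' terminal nucleotide of that strand.

    The weights are hypothetical; keeping structure_weight larger than
    nucleotide_weight encodes the idea that the 5' nucleotide makes only
    a minor contribution and cannot override duplex structure.
    """
    # Top-level rule: central mismatches favour AGO1, pairing favours AGO2.
    score = structure_weight if central_unpaired else -structure_weight
    # Minor rule: AGO1 favours a 5' U, AGO2 shows a bias towards a 5' C.
    if five_prime_nt == "U":
        score += nucleotide_weight
    elif five_prime_nt == "C":
        score -= nucleotide_weight
    return "AGO1" if score > 0 else "AGO2"


if __name__ == "__main__":
    print(predict_fly_ago(central_unpaired=True, five_prime_nt="C"))
    print(predict_fly_ago(central_unpaired=False, five_prime_nt="U"))
```

With the default weights, a centrally mismatched strand is assigned to AGO1 even when it starts with C, and a perfectly paired strand is assigned to AGO2 even when it starts with U. Strand choice within a perfect duplex is a separate question, governed by thermodynamic asymmetry, and is not modelled here.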
It should be noted that sorting is a strand-centric process. Once a duplex is made, it seems that one strand is assessed and its fate determined. Thus, for many miRNAs, miR strands are abundant in AGO1 complexes and miR* strands predominate in AGO2–RISC; however, these miR and miR* strands arise from independent precursor molecules rather than through the stabilization of both strands of a given duplex. Thus, for each processed duplex, the choice seems to be whether the miR strand becomes committed to AGO1 or the miR* is committed to AGO2, with the complementary strand of each miRNA duplex being discarded during RISC maturation. Thus, AGO1 and AGO2 may compete for the selection of strands from each duplex, with the strength of preferential loading signals determining the ultimate abundance of the miR and miR* in AGO1 and AGO2 complexes, respectively106,107,110.
It was recently noted in D. melanogaster that some hairpin-derived endo-siRNAs accumulate in AGO2 even though they originate from mismatched duplexes and have a terminal U — features which are thought to direct them towards AGO1 (REF. 114). Interestingly, in vitro, these small RNAs are sorted into AGO1. In vivo, however, these AGO1-loaded endo-siRNAs silence targets with high sequence complementarity. This paradox can be resolved by invoking target-directed small RNA destruction; small RNAs of this sort may be loaded into AGO1 in vivo, but they are unstable owing to lack of the stabilizing 2′-O-methylation, which they acquire when loaded into AGO2.
As in D. melanogaster, worm miRNAs and siRNAs are partitioned among distinct AGO subfamily proteins. Although worm sorting rules have not been probed in detail, miRNAs show a tendency towards central mismatches and are sorted into ALG-1 or ALG-2, whereas siRNAs from perfect duplexes preferentially load RDE-1 (REFS 115,116). In contrast to flies and worms, individual mammalian AGO clade proteins show no specialized structural and 5′ nucleotide preferences for small RNA duplexes117-119. This raises the possibility that mammals lack a strict system for small RNA sorting, at least among their AGO subfamily members.
A. thaliana encodes ten Argonaute proteins, which vary in their degrees of specialization and expression patterns. As in animals, plant AGO proteins tend to show preferences for distinct small RNA classes, which are produced through somewhat compartmentalized biogenesis pathways. For example, AGO1 is mainly occupied by miRNAs that arise through processing by DCL1. AGO4 prefers hc-siRNAs that are processed by DCL3. AGO2 is the principal recipient for ta-siRNAs. An additional complexity is that different Dicers produce small RNAs of distinct sizes. Plant DCL1 and DCL4 produce 21-nt RNAs, DCL2 22-nt RNAs and DCL3 24-nt RNAs. Different Dicer proteins have also been proposed to reside in different subcellular compartments. Thus, a wide range of properties might be exploited to establish specificity in plant small RNA sorting. Surprisingly, although the terminal nucleotide of the small RNA has only a minor effect on sorting in flies and mammals, it strongly impacts sorting in plants.
Deep sequencing of small RNAs associated with AGO family members clearly indicated that distinct AGO proteins preferentially load small RNAs with specific 5′ nucleotides90,112,120,121. AGO1 showed a strong bias towards a terminal U. AGO2 and AGO4 selected sequences that begin with an A, and AGO5 mainly bound RNAs starting with a 5′ C. Simply changing the terminal nucleotides could redirect small RNAs into different complexes in a predictable manner, strongly supporting the dominance of this sorting signal.
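The 5′ nucleotide preferences listed above amount to a simple lookup, sketched below. The table holds only the preferences stated in this paragraph; the function names and the decision to return an empty list for a 5′ G (for which no preference is given here) are illustrative assumptions. The sketch deliberately ignores every other determinant of sorting.

```python
# 5' nucleotide preferences as stated in the text above.
FIVE_PRIME_PREFERENCE = {
    "U": ["AGO1"],
    "A": ["AGO2", "AGO4"],
    "C": ["AGO5"],
}


def predicted_plant_agos(small_rna):
    """Return the AGO proteins favoured by the 5' nucleotide alone."""
    if not small_rna:
        raise ValueError("small_rna must be a non-empty sequence")
    first = small_rna[0].upper().replace("T", "U")  # accept DNA-style input
    # No preference is listed for a 5' G, so return an empty list.
    return FIVE_PRIME_PREFERENCE.get(first, [])


def swap_five_prime(small_rna, nucleotide):
    """Return a copy of the sequence with its 5' nucleotide replaced."""
    return nucleotide + small_rna[1:]


if __name__ == "__main__":
    rna = "UGACAGAAGAGAGUGAGCAC"  # arbitrary example sequence
    print(predicted_plant_agos(rna))
    print(predicted_plant_agos(swap_five_prime(rna, "A")))
```

Changing only the first nucleotide changes the predicted destination, mirroring the redirection experiments described above. A lookup of this kind cannot capture small RNAs whose loading is governed by features other than the terminal base.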
There were exceptions to the simple rule proposed above. miR390, which begins with an A, would be predicted to load AGO2 but, instead, exclusively occupied AGO7 (REF. 90). Moreover, miR390 could not be redirected by altering its terminal base. Thus, although base recognition contributes strongly to sorting, other characteristics of small RNAs must also be taken into account. These could include duplex properties, such as thermodynamic asymmetry or degree of base pairing, although this hypothesis has yet to be examined. Overall, the data support a model in which plant small RNAs dissociate following Dicer cleavage and are subject to a sorting process, which surveys their terminal base. Other considerations, such as their size and the Dicer that produced them, may contribute to specificity to an extent that varies with the small RNA species, and in a few instances they become the dominant determinant of sorting.
To date, we know far more about the loading determinants of miRNAs and siRNAs than of any other small RNA class. Even within these well-studied groups, there are exceptions to the rules outlined above. For example, several reports now support the idea that pre-miRNA hairpins can be successfully loaded into RISC118,122-126 (BOX 3). Mirtrons bypass the Drosha step but are presumably loaded using the normal miRNA strand determinants following Dicer cleavage.
It has been reported that precursor-microRNA (pre-miRNA) hairpins are sometimes directly loaded into RNA-induced silencing complex (RISC) instead of being funnelled into the canonical Dicer-dependent biogenesis pathway118,125,126. Recently, it was shown that this strategy is actually used as a biogenesis mechanism by a conserved vertebrate miRNA, miR-451 (REFS 122-124). Like other endogenous miRNAs, mir-451 is synthesized by RNA polymerase II (RNAPII) as a polycistronic transcript together with mir-144 (see the figure above). This primary miRNA (pri-miRNA) is initially processed by the Microprocessor (Drosha–Pasha complex) through the canonical biogenesis pathway. However, following export to the cytoplasm, the two pre-miRNAs adopt distinct fates. Although pre-mir-144 continues along the canonical miRNA path and is processed by Dicer, pre-mir-451 is not a Dicer substrate, perhaps because its 17-nucleotide (nt) duplexed region is too short. Instead, the pre-mir-451 hairpin is directly loaded into Argonaute 2 (AGO2). There, the duplexed portion of the hairpin is cleaved by the Argonaute RNase H-like motif and the cleaved product is resected by an unknown activity to form mature miR-451. Although it is unclear whether pre-mir-451 is actively sorted into AGO2, only those species which occupy this catalytically competent AGO family member can mature.
As a second example, the pre-miRNA equivalents for mirtrons are formed by the splicing machinery rather than by Drosha. Their biogenesis is outlined in FIG. 1.
Several small RNA classes are formed without a double-stranded precursor. Even though this should pose a simpler sorting problem, with no need to discriminate guide versus passenger strands, we know little about how these species are selectively loaded into specific Argonautes. Good examples are the secondary siRNAs in worms, which are generated as direct RdRP products, presumably without the need for further processing6,7. These are specifically loaded into WAGO clade Argonautes through a still mysterious mechanism9. One could easily imagine that biogenesis and loading could be tightly coupled, or that the 5′ triphosphate termini on these small RNAs could contribute to binding specificity through interactions with the Mid domain of the Argonaute, but these ideas remain to be tested.
piRNAs, including worm 21U RNAs, do not depend on Dicer processing and are thought to originate from single-stranded precursors4,5,8,127,128. The loading of these small RNAs into Piwi subfamily proteins and the requirements of associated partner proteins for proper Piwi–RISC assembly are unknown. Whether the striking bias for a terminal U seen in many piRNAs reflects upstream processing activities or is a consequence of the nucleotide-binding preferences of these Piwi proteins (as is seen in plants) remains unclear.
Small RNA duplexes cannot be efficiently incorporated into AGO proteins without assistance from additional proteins118,119. These factors are also known as the RISC-loading machinery (or pre-RISC) and their precise nature differs for distinct AGO proteins. RISC loading is an active process that requires ATP118,129-132, probably owing to the need to drive conformational changes so that AGO proteins accept small RNA duplexes. This concept, which was originally suggested based on structural analyses of AGO proteins, has gained recent support from studies that characterized interactions between Argonautes and the heat shock cognate 70 (HSC70)–heat shock protein 90 (HSP90) chaperone complex133-136. These studies support a model in which the interaction between Argonautes and the chaperone complex creates an ‘open’ conformation that is suitable for the loading of duplexed small RNAs. ATP hydrolysis and dissociation of the chaperone result in a structure that can discard or cleave the miR* or passenger strand to form an active RISC.
In flies, the loading machinery for AGO2–RISC also involves Dicer 2 and its dsRBD partner R2D2 (REFS 68,69,130,131,137,138) (FIG. 3c). In fact, these factors have been proposed to be the biochemical sensors for thermodynamic asymmetry. In this regard, R2D2 has been shown to bind the more stable end of the dsRNA duplex, whereas Dicer 2 is positioned at the less-stable end of the duplex, providing a mechanism for orientated AGO2 loading139. Although a minimal pre-RISC could be constituted with only Dicer 2, R2D2 and AGO2 (REF. 140), the bona fide AGO2–RISC-loading machinery in vivo undoubtedly contains additional components, including the chaperone complexes described above. Roles for Dicers have also been suggested for AGO1 loading. One report suggests that AGO1–Dicer 1 complexes correspond to the AGO1–RISC-loading complex141, whereas a second report indicated that Dicer 1 was dispensable for AGO1–RISC assembly132.
Although little is known about the loading machinery in plants, a recent study proposed that the thermodynamic properties of duplex ends (instead of terminal nucleotides) are the dominant determinant for strand selection of some DCL1-processed miRNAs and that HYL1, like fly R2D2, functioned as a component of the asymmetry sensor66.
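The idea of sensing thermodynamic asymmetry can be illustrated with a small sketch. It applies the widely described asymmetry rule, in which the strand whose 5′ end lies at the less stable end of the duplex is preferentially retained as the guide. The scoring is an assumption made for illustration only: hydrogen-bond counts over a four-base-pair window stand in for measured free energies, and the function names, the window size and the example duplex are hypothetical rather than taken from any published tool.

```python
# Crude stability proxy: hydrogen bonds per base pair (assumed scores, not
# measured free energies). Unlisted pairs count as mismatches (score 0).
PAIR_SCORE = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 1, ("U", "G"): 1,  # wobble pair, scored low by assumption
}


def end_stability(strand, partner, window=4, overhang=2):
    """Score the duplex end that contains the 5' terminus of `strand`.

    Both strands are written 5'->3' and are assumed to have equal length,
    each with a 3' overhang of `overhang` nucleotides, so strand[i] pairs
    with partner[len(partner) - 1 - overhang - i].
    """
    last_paired = len(partner) - 1 - overhang
    return sum(
        PAIR_SCORE.get((strand[i], partner[last_paired - i]), 0)
        for i in range(window)
    )


def predict_guide(strand_a, strand_b, window=4, overhang=2):
    """Return "a" or "b" for the strand whose 5' end is less stably paired,
    or None when both ends score the same."""
    score_a = end_stability(strand_a, strand_b, window, overhang)
    score_b = end_stability(strand_b, strand_a, window, overhang)
    if score_a == score_b:
        return None
    return "a" if score_a < score_b else "b"


# Hypothetical 21-nt duplex: strand_a has an A-U-rich 5' end,
# strand_b a G-C-rich 5' end; each carries a 2-nt 3' overhang.
strand_a = "AUAUGCGCGCGCGCGGGCC" + "UU"
strand_b = "GGCCCGCGCGCGCGCAUAU" + "UU"
print(predict_guide(strand_a, strand_b))
```

In this toy duplex the 5′ end of `strand_a` is the weaker end, so it is the predicted guide. A more faithful model would replace the hydrogen-bond counts with nearest-neighbour free energies and would account for the terminal-nucleotide preferences of individual Argonautes.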
For RISC to exert its function, pre-RISC needs to mature (FIG. 3c). Although the orientation of the miRNA duplex is determined during RISC assembly and loading, the crucial maturation step is the discarding of the passenger or miR* strand. In flies and mammals, distinct AGO proteins seem to achieve this by different mechanisms, which depend on the nature of the AGO protein and the degree of base pairing in the loaded duplex. Using their ‘slicer’ activity, fly or mammalian AGO2 can cleave the passenger strand of perfect or nearly perfect duplexes12-14. The cleaved strand dissociates from RISC and, in flies, is degraded by a multimeric endonuclease complex (consisting of Translin and Trax), termed C3PO (component 3 promoter of RISC)140. Following passenger strand removal, AGO2-bound single-stranded small RNAs are methylated at their 3′ termini by the methyltransferase HEN1 (also known as Pimet) to yield mature AGO2–RISC142,143.
Maturation of miRNA RISC is less well understood (FIG. 3b, bottom). Human AGO1, AGO3 and AGO4 all lack slicer activity and fly AGO1 is a poor slicer. Moreover, miRNA duplexes often contain sufficient bulges to prevent slicing of miR* strands even by competent enzymes. Therefore, it has been proposed that miR* strands dissociate in a cleavage-independent manner by unwinding — a process that is facilitated by the presence of mismatches in the loaded duplexes113,118,132. Biochemical evidence supports unwinding as a passive, ATP-independent process, with degradation of the miR* strand on its release. It is unclear how plant Argonautes remove the miR* or passenger strand during RISC maturation. MiR* and passenger strands could be cleaved through the slicer activity of AGOs (similar to fly AGO2) or unwound passively (like fly AGO1)12-14,132.
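The two maturation routes described above can be summarized as a simple decision rule. The sketch below is only a reading aid: the slicer annotations follow the text, but treating fly AGO1 as non-slicing, the `max_mismatches_for_slicing` threshold and the function name are assumptions introduced for illustration. Plant Argonautes are deliberately left out because their route is stated to be unclear.

```python
# Slicer competence as described in the text. Fly AGO1 is a "poor slicer"
# and is treated as non-slicing here, which is a simplification.
SLICER_COMPETENT = {
    ("fly", "AGO2"): True,
    ("human", "AGO2"): True,
    ("fly", "AGO1"): False,
    ("human", "AGO1"): False,
    ("human", "AGO3"): False,
    ("human", "AGO4"): False,
}


def maturation_route(organism, ago, duplex_mismatches,
                     max_mismatches_for_slicing=1):
    """Return the expected passenger/miR* removal route for a loaded duplex.

    `max_mismatches_for_slicing` is an assumed cut-off standing in for
    "perfect or nearly perfect" pairing; it is not a measured value.
    """
    key = (organism, ago)
    if key not in SLICER_COMPETENT:
        raise ValueError(f"no slicer annotation for {organism} {ago}")
    if duplex_mismatches < 0:
        raise ValueError("duplex_mismatches must be non-negative")
    if SLICER_COMPETENT[key] and duplex_mismatches <= max_mismatches_for_slicing:
        return "passenger-strand cleavage"
    # Slicer-deficient AGOs, or duplexes with too many bulges or mismatches
    return "unwinding"


print(maturation_route("fly", "AGO2", duplex_mismatches=0))
print(maturation_route("fly", "AGO2", duplex_mismatches=4))
print(maturation_route("human", "AGO3", duplex_mismatches=0))
```

The rule captures the two inputs that the text identifies as decisive, namely the nature of the AGO protein and the degree of base pairing in the loaded duplex, and nothing more.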
The ultimate result of accurate strand selection and sorting is that an active RISC is formed, which is imbued with the ability to regulate a target gene or process. Argonaute family members differ in their biochemical properties, subcellular localization and expression patterns, and matching the right small RNA with the correct partner is key to proper biological function.
Although AGO proteins evolved as ribonucleases, animal miRNAs affect their targets without the need for this activity. miRNAs generally interact with their targets through limited base-pairing interactions that are insufficient to place the scissile phosphate of the target in the enzyme active site where cleavage can occur. The prevalence of cleavage-independent repression modes is also reflected in the diversity of the Argonaute family. In mammals, three of the four AGO proteins have lost catalytic potential, and AGO1, the D. melanogaster AGO protein into which most miRNAs are sorted, is a poor enzyme compared with its siRNA-binding cousin113.
miRNA-directed target cleavage has only been reported in a few cases144,145. However, this is assumed to be the principal regulatory mode for endo-siRNAs and for piRNA-mediated repression of transposons. Here again, the choice of a particular AGO partner is crucial. Piwi family members all retain catalytic competence and D. melanogaster AGO2, the main partner for endogenous and viral siRNAs, is tuned for highly active slicing (BOX 4).
Individual Argonaute (AGO) proteins differ in their expression patterns, subcellular localization and enzymatic properties. Thus, distinct AGOs can function through many different effector modes that may involve slicing of target transcripts, cleavage-independent regulation and chromatin modification (reviewed in REFS 15-17). Another layer of complexity is added by the degree of sequence complementarity between the AGO-bound small RNA and target transcripts, which determines the mechanism of regulation. a | In flies, AGO1-associated microRNAs (miRNAs) typically target mRNAs in their 3′ UTRs to reduce protein synthesis. Owing to limited sequence complementarity between the small RNA (seed region) and the mRNA, such interactions usually do not result in direct cleavage of the targeted transcript. Instead, AGO1 and its partner protein GW182 are likely to disrupt crucial interactions between the polyA tail and the cap of the transcript, leading to a reduction in translational initiation and an induction of mRNA decay173. In mammals, it was recently shown that reduced protein output is predominantly due to destabilization of the target transcript174. b | Small RNAs bound to Drosophila melanogaster AGO2 do not exhibit a bias towards binding their targets in the 3′ UTR. AGO2 primed with a small RNA sharing extensive complementarity with its target typically directs endonucleolytic cleavage of the mRNA through AGO2 slicer activity. The 2′-O-methyl modification of AGO2-bound small RNAs prevents their degradation when targeting perfectly complementary transcripts64,142,143. However, other modes are possible: AGO2 can also regulate targets with limited sequence complementarity through a block in translation initiation (not shown)173. PABP, poly(A)-binding protein.
AGO1-associated plant miRNAs usually share extensive sequence complementarity with their mRNA targets and these interactions often result in target cleavage146. However, recent studies have indicated that cleavage-independent translational repression is widespread in plants, even for highly complementary target sites147. Nevertheless, miRNA-mediated cleavage is of key importance for some processes, such as the biogenesis of ta-siRNAs, for which the initial slicing event is key to RdRP recruitment and dsRNA synthesis88.
Notably, small RNAs that direct cleavage, for example, plant miRNAs, piRNAs and fly endo-siRNAs, often have a 2′-O-methyl modification on their 3′ termini. Although the purpose of this modification was initially mysterious, it is now clear that this functions as a protective group to prevent small RNA destruction64,142,143. In flies and mammals, small RNAs that have extensive complementarity to their targets can be recognized by terminal uridyl transferases, which mark small RNAs for degradation64. The uridylation event is blocked by the 2′-O-methyl modification, preserving these small RNAs, which have evolved to function through cleavage64. The balance between protection and targeted destruction has been proposed as a quality control on small RNA sorting and as an evolutionary mechanism to drive animal miRNAs toward a cleavage-independent repression mode64.
hc-siRNAs are thought to function by different mechanisms148,149. They must be sorted into a particular Argonaute, AGO4, which they guide to target DNA loci by base pairing with nascent non-coding transcripts synthesized by RNAPV. Effector proteins, such as the chromatin-remodelling factor DRD1, the de novo methyltransferase DRM2 and other factors, are then recruited, resulting in DNA methylation at cytosine residues150,151. As this regulation functions by repressing RNA synthesis, it was termed transcriptional gene silencing to distinguish it from post-transcriptional gene-silencing modes. Some piRNAs in flies and mammals must associate with particular Piwi-family proteins — that is, PIWI and MIWI2, respectively — which enable these small RNAs to enter the nucleus, where they are thought to induce transcriptional repression through changes in chromatin structure or DNA methylation, respectively152-154. Similarly, worm NRDE-3, an Argonaute of the WAGO clade, transports siRNAs to the nucleus and functions through co-transcriptional gene silencing155.
Thus, the final effects of small RNA sorting are felt in the modes of repression that become available as they join specific AGO proteins. The consequences of improper sorting may range from a loss of target regulation to inappropriate regulatory modes.
An understanding of the mechanisms by which small RNAs are selected and sorted among different potential effector complexes is crucial. In part, this knowledge guides hypotheses concerning the cellular roles of an ever-growing roster of small RNA species. However, the ability to predict the fate of small RNAs based on their sequence and structural characteristics is also essential to their effective use as experimental tools and potential therapeutics. We have begun to piece together the properties that determine small RNA fates and, in some instances, these properties can even predict with reasonable accuracy which small RNAs will efficiently join a particular effector complex. Yet, we still have a relatively poor ability to design effective small RNAs ab initio for experimental or therapeutic use. This capacity will rest on advances in both our understanding of RISC as an enzyme, including its mechanisms of target recognition, silencing and product release, and a detailed knowledge of how specific RNA strands are efficiently loaded into RISC as guides.
The authors thank O. Tam, F. Muerdter, J.-W. Wang and R. Zhou for comments on the manuscript. The authors are greatly indebted to J. Duffy for assistance with figures. B.C. is supported by a Ph.D. fellowship from the Boehringer Ingelheim Fonds. This work was supported by grants from the National Institutes of Health and a kind gift from K. W. Davis. G.J.H. is an investigator of the Howard Hughes Medical Institute.
Competing interests statement
The authors declare no competing financial interests.
Gregory J. Hannon’s homepage: http://hannonlab.cshl.edu/index.html
Gregory J. Hannon’s Cold Spring Harbor Laboratory homepage: http://www.cshl.edu/Faculty/hannon-gregory.html