Our understanding of the mechanisms by which small RNAs are loaded into RISCs is derived mainly from biochemical studies in
Drosophila. There, siRNA loading is facilitated by the RISC-loading complex (RLC) that contains Dicer-2 and its partner R2D2 (
Liu et al., 2003;
Pham et al., 2004;
Tomari et al., 2004a). These proteins form a heterodimer that senses the thermodynamic asymmetry of an siRNA duplex and loads the strand with the less stably paired 5′ end into RISC (
Tomari et al., 2004b). A similar mechanism has also been proposed for loading
Drosophila miRNAs into Ago1-RISC, but the proteins facilitating miRNA loading remain to be identified (
Okamura et al., 2004).
Recent studies in
Drosophila and
C. elegans have addressed the mechanisms that assort small RNAs into specific Argonaute complexes. In neither case is small RNA loading coupled to a particular biogenesis mechanism, such as the identity of the Dicer that generates the mature RNA. Instead, structural features of precursor duplexes determine their ultimate Ago partner (
Forstemann et al., 2007;
Steiner et al., 2007;
Tomari et al., 2007). Cloning and sequencing data indicate that most
Arabidopsis miRNAs follow asymmetry guidelines (
Jones-Rhoades et al., 2006), which implies the existence of an RLC analogous to Dicer-2/R2D2. However, the coexistence of miRNAs and siRNAs in each
Arabidopsis AGO complex suggests that, in contrast to the loading of animal small RNAs, the structure of an
Arabidopsis small RNA duplex may not play a role in sorting the small RNA. Instead, our data support a small RNA sorting mechanism in which the 5′ terminal nucleotide determines its loading into a particular AGO complex.
Structural analysis of
A. fulgidus PIWI in complex with an siRNA revealed that the 5′ end of the RNA is anchored in a basic pocket in the Mid domain (
Ma et al., 2005;
Parker et al., 2005). Our data raise the possibility that the analogous pocket in each
Arabidopsis AGO protein has evolved to recognize a specific nucleotide. Such specific recognition could be conferred exclusively by the AGO protein itself, as supported by our in vitro binding studies () and domain-swap experiments (), or may be additionally aided by an unidentified factor. We cannot rule out the possibility that there might also exist one factor recognizing a particular 5′ terminal nucleotide and pooling small RNAs to be accepted by an AGO protein recognizing the same nucleotide.
A sorting mechanism directed by the 5′ terminal nucleotide explains the predominant association between
Arabidopsis miRNAs and AGO1. The need to act in concert with AGO1 has likely driven
Arabidopsis miRNAs toward a strong 5′ U bias. In contrast, excluding miRNA*s from AGO1 complexes simply required evolution of a different 5′ nucleotide on this mature strand (
Table S2). This type of discrimination is likely to be particularly important for the loading of miRNAs that do not follow thermodynamic asymmetry rules. In the miR391/miR391* and miR393b/miR393b* duplexes, the 5′ ends of the miRNA strands are more stable than those of the miRNA* strands (
Table S2). These miRNAs, which initiate with a U, are nevertheless selectively loaded into AGO1. Their miRNA*s, which contain a 5′ terminal A, are instead incorporated into AGO2 complexes (, and ).
DCL1 cleavage is not always precise, and occasional processing errors could give rise to miRNA variants with 5′ heterogeneity (
Rajagopalan et al., 2006). Moreover, some miRNA precursors, especially newly evolved ones, produce particularly heterogeneous small RNAs (
Rajagopalan et al., 2006). For example, the miR163 precursor produces two miRNAs, miR163.1 and miR163.2, which have 5′ terminal U and A, respectively (
Kurihara and Watanabe, 2004). Only miR163.1 is efficiently incorporated into AGO1 (, and ). Therefore, the specific recruitment of small RNAs bearing a 5′ terminal U by the AGO1 complex can help to compensate for inaccurate Dicer processing, which could otherwise lead to off-target effects.
A terminal nucleotide-directed loading mechanism is clearly not the only determinant of the destination of a small RNA. Our sequencing data indicate that both AGO2 and AGO4 complexes prefer small RNAs having a 5′ terminal A. Thus, it is surprising that AGO2 and AGO4 bind a limited number of small RNAs in common (<8%) (). Moreover, the types of small RNAs that join these two complexes differ significantly (). In part, this is reflected also in the binding of different size classes of small RNAs by each AGO protein (), suggesting a preferential coupling of each AGO with one or more Dicer-like proteins. We are also unable to explain the loading of miR172 (with a 5′ terminal A) into AGO1 (
Table S2).
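The limits of a purely 5′ nucleotide-based rule can be made concrete with a minimal sketch. The lookup below encodes only the preferences stated in this section (5′ U for AGO1; 5′ A for both AGO2 and AGO4). The function name `candidate_agos`, the table name, and the example sequences are hypothetical and introduced here for illustration; this is not the analysis used to generate the sequencing data.

```python
from typing import Tuple

# Preferences stated in this section only: AGO1 favors a 5' terminal U;
# AGO2 and AGO4 both favor a 5' terminal A. G and C are deliberately absent
# because no preference for them is stated here.
FIVE_PRIME_PREFERENCE = {
    "U": ("AGO1",),
    "A": ("AGO2", "AGO4"),
}


def candidate_agos(small_rna: str) -> Tuple[str, ...]:
    """Return the AGO complexes whose stated 5' preference matches the RNA.

    Hypothetical helper for illustration; it looks only at the first base.
    """
    if not small_rna:
        raise ValueError("small RNA sequence must not be empty")
    # Accept DNA-style input by treating T as U.
    first = small_rna[0].upper().replace("T", "U")
    return FIVE_PRIME_PREFERENCE.get(first, ())


# Placeholder sequences, not real miRNA sequences.
print(candidate_agos("UACGGAUCCGAUUAGCCAUGA"))  # 5' U
print(candidate_agos("AACGGAUCCGAUUAGCCAUGA"))  # 5' A
```

For a 5′ U the lookup returns AGO1 alone, but for a 5′ A it returns both AGO2 and AGO4 and cannot choose between them. It would also fail to place a 5′ A species such as miR172 in AGO1. Both outcomes are consistent with the need for the additional determinants discussed here.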
We envision that several mechanisms act in concert to sort small RNAs into specific AGO complexes. First, the localization of AGO proteins to specific subcellular compartments likely determines their access to different classes of small RNAs. AGO4 is localized in the nucleus (
Li et al., 2006;
Pontes et al., 2006). This could explain why 24 nt rasiRNAs, which are made and function in the nucleus, are mainly associated with AGO4. Second, the tissue-specific and developmental expression pattern of an AGO protein may enable it to preferentially recruit small RNAs that share that expression pattern. Third, distinct RLCs may interact with different groups of small RNAs to facilitate their loading into distinct AGO complexes.
Given that there are ten AGO proteins in
Arabidopsis, redundancy must exist among the AGO proteins in their recognition of the four possible 5′ terminal nucleotides. miR390-mediated cleavage of
TAS3 is required to initiate the production of tasiRNAs, which regulate the vegetative phase transition in
Arabidopsis (
Allen et al., 2005;
Axtell et al., 2006;
Peragine et al., 2004;
Vazquez et al., 2004;
Xie et al., 2005). We found that miR390 (bearing a 5′ terminal A) was underrepresented in AGO1 and instead accumulated in AGO2 (). However, there was no detectable change in the accumulation of miR390 and tasiRNAs in an
ago2 null plant (Salk_037548), and the vegetative phase transition is normal in
ago2 mutant plants (Figure S7). These data suggest that another AGO protein must play a predominant role in recruiting miR390. Genetic studies have shown that
ago7 mutants have defects in the regulation of vegetative phase change (
Adenot et al., 2006;
Fahlgren et al., 2006;
Garcia et al., 2006;
Hunter et al., 2006), which points to AGO7 as the best candidate.
Arabidopsis contains one of the most complex small RNA regulatory networks yet discovered. While a 5′ end recognition mechanism may be restricted to plants, it may also operate in other organisms with highly elaborated Argonaute families.
C. elegans has 27 Argonaute proteins that must distinguish between primary and secondary siRNAs bearing monophosphate and triphosphate termini, respectively (
Pak and Fire, 2007;
Sijen et al., 2007;
Yigit et al., 2006). Presumably, other Argonautes specifically associate with 21U RNAs that initiate with a 5′ U (
Ruby et al., 2006) and siRNAs that begin predominantly with a G (
Ambros et al., 2003;
Ruby et al., 2006). In several animals, Piwi-interacting RNAs (piRNAs) show a strong preference for a 5′ U (
Aravin et al., 2006;
Girard et al., 2006;
Grivna et al., 2006;
Lau et al., 2006). It has been presumed thus far that this bias reflects the biogenesis mechanism for this RNA class; however, our findings raise the possibility that the observed bias is instead generated by selection of RNAs with an appropriate 5′ end by Piwi clade Argonautes. Overall, many tantalizing hints suggest that the mechanisms that we have uncovered in plants are general properties of Argonaute proteins that have been exploited more broadly to help compartmentalize small RNA regulatory pathways.