RNA interference (RNAi) uses small RNA molecules to regulate gene expression at the transcriptional and post-transcriptional levels. In recent years, a number of structural studies have provided insights into the molecular architecture and mechanism of the functional modules of RNAi. Mechanisms of nucleic acid recognition and cleavage have been revealed by structural studies of proteins involved in RNA biogenesis and of their nucleic acid complexes, for example Argonaute, PIWI, RNase III, Dicer, Drosha and DGCR8. While quite a few questions remain, an excellent structural and mechanistic overview of RNAi processes has already emerged. In this review, we examine functional modules and their assemblies in RNAi processes.
The discovery of RNA interference was a major breakthrough in modern biology. RNAi influences cell proliferation, development, immunity, transposition and tumorigenesis. Functions and processes of RNAi have been reviewed extensively [1–3]. Briefly, two major classes of RNAi are known: small interfering RNAs (siRNA) and micro-RNAs (miRNA) [4,5] (Fig. 1A). Both processes require cleavage of dsRNA to generate ~20–30 bp dsRNA with 2-nucleotide overhangs at the 3´ ends. siRNA fragments are generated by Dicer from dsRNA precursors. The miRNA pathway starts in the nucleus with the processing of non-coding pri-miRNA by a complex of the Drosha and DGCR8 proteins to produce ~70-nt pre-miRNAs. Pre-miRNAs are then transported to the cytoplasm and cleaved by Dicer to make miRNA fragments. One strand of the short dsRNA product (known as the guide RNA) is then incorporated into the RISC complex, where it selects target mRNAs by base pairing. In the siRNA pathway, Argonaute (Ago, the nuclease component of RISC) uses the guide RNA to select complementary mRNA for degradation. In the miRNA pathway, base pairing between guide and target mRNA leads to translational inhibition. A recently discovered class of RNAi, piRNAs (PIWI-interacting RNAs; PIWI and Ago are closely related), prevents the spreading of selfish genetic elements by methylation-dependent epigenetic silencing [6,7] and cleavage of transposon mRNA (Fig. 1B).
The following functions are repeatedly required during RNAi: recognition of the 3´ and 5´ ends of RNA, binding of dsRNA, and cleavage of one or both strands of dsRNA at a defined distance from one end. Remarkably, nature uses a limited number of protein modules to accomplish these tasks (Fig. 1C). The PAZ domain recognizes the 3´ end of RNA in both the miRNA and siRNA pathways, the Mid domain of PIWI and Ago binds the 5´-phosphate, and dsRNA-binding domains (dsRBD) are found in Drosha, DGCR8 and Dicer. An RNase III-like endonuclease domain (endoND) carries out all dsRNA cleavage in RNAi. The PIWI domain in PIWI and Ago resembles RNase H and can nick the RNA strand (passenger or target RNA) that forms base pairs with the guide RNA. This is often referred to as the “slicer” activity. Some PIWI and Ago proteins lack slicer activity owing to active-site mutations, and their function is most likely to bind dsRNA. In this review, we briefly summarize the structural and functional modules in RNAi processes and how they are assembled into various complexes.
The PAZ domain contains around 130 residues and binds the 3´ end of ssRNA or a 3´ end protruding from dsRNA [9,10]. The core of the PAZ domain consists of a short six-stranded β-barrel that resembles an OB fold, with a β-hairpin and an α-helix on the side of the barrel (Fig. 2A). In some PAZ domains two α-helices cover one end of the barrel (Fig. 2B) [11–14]. Structures of the Drosophila Argonaute PAZ domain bound to ssRNA or ssDNA, human Ago PAZ bound to an siRNA-like duplex, and T. thermophilus Ago (Tt-Ago) bound to a ssDNA guide strand and an RNA/DNA duplex [15,16] reveal a conserved mechanism for nucleic acid recognition. PAZ approaches the 3´ end of RNA or DNA with the open end of its β-barrel (Fig. 2B), and the interactions with the last two nucleotides at the 3´ end are conserved among these structures (Fig. 2B). The orientation and structure of the remaining double- or single-stranded nucleic acid are variable, apparently because its interactions with PAZ are limited and non-specific.
The last two nucleotides at the 3´ end are stacked in the A conformation and inserted into a large binding pocket between the side of the β-barrel and the hairpin/helix element, forming extensive polar and hydrophobic interactions with PAZ (Fig. 2B). Several conserved residues, including key aromatic side chains, line the pocket and interact mainly with the phosphosugar backbone. In particular, a characteristically conserved tyrosine stabilizes the phosphate between the last two nucleotides with a hydrogen bond. The 3´ end is surrounded by hydrophobic and aromatic residues, and the 3´-OH is hydrogen bonded to a main-chain carbonyl oxygen. The Watson–Crick (W–C) edges of these unpaired bases are exposed to solvent, so nucleic acid binding is not sequence specific. There is no interaction with the 2´-OH of the RNA, which agrees with the observation that PAZ binds either RNA or DNA.
The ~70-residue dsRNA-binding domain (dsRBD) is present in many proteins involved in transcription, RNA processing, mRNA localization and translation. Structures of several dsRBDs have been determined. They all contain a central three-stranded anti-parallel β-sheet and two helices in the topological order α-β-β-β-α [18–20]. The two helices are roughly parallel and located on one side of the β-sheet. Structures of different dsRBDs in complex with RNAs (S. cerevisiae Rnt1p, X. laevis RNA-binding protein A and RNase III from A. aeolicus) reveal two alternative binding modes. Most often a dsRBD binds dsRNA without sequence preference and interacts with the phosphosugar backbones in the A conformation, which is characterized by a wide minor groove and a narrow major groove (Fig. 2C). Charge-charge interactions are formed between the helical dipoles (N-termini) of the α-helices and the phosphate groups surrounding the narrow major groove. Recognition of riboses rather than deoxyriboses is achieved through hydrogen bonding between protein side chains and 2´-OH groups along the minor groove (Fig. 2C). Alternatively, a dsRBD can recognize hairpin-loop structures, mismatches and bulges in a sequence-specific manner by interacting with unpaired RNA or a loop region. For example, the dsRBD from Rnt1p (yeast RNase III) binds a tetraloop (Fig. 2D) and directs cleavage of the adjacent dsRNA at a specific location by the endoND.
The 5´ end of a guide RNA is recognized by the RISC complex, and the cleavage site of the target mRNA is determined by the number of base pairs from the 5´ end. The Mid domain found in PIWI and Ago contains a four-stranded parallel β-sheet flanked by a pair of α-helices on each side (Fig. 2E). The fold is similar to the sugar-binding domain of Lac repressor. Structures of the Mid domain associated with dsRNA and ssDNA (A. fulgidus PIWI protein bound to a small siRNA-like duplex [27,28] and T. thermophilus Ago with ssDNA and RNA/DNA [15,16]) reveal that it specifically recognizes the 5´-phosphate. Even with dsRNA, most contacts are formed with the strand presenting the 5´ end. The 5´ end of either RNA or DNA makes a U-turn. The first base is flipped out and stacked on an Arg or Tyr side chain, and the 5´-phosphate is placed in a highly conserved pocket in the Mid domain, where it interacts with invariant Tyr, Lys and Gln residues (Fig. 2E). Interestingly, the 5´-phosphate is juxtaposed to the third phosphate, and a metal ion coordinated by the carboxylate group of a C-terminal valine or leucine stabilizes these phosphates (Fig. 2F).
All RNase III-like endonucleases are dimeric and cleave dsRNA. The cleavage products, which carry a 5´-phosphate and a 3´-OH with a 2-nt overhang, are characteristic of all RNAi pathways. Bacterial RNases III and yeast Rnt1p are true dimers with one endoND in each subunit (Fig. 3A). Dicer and Drosha are pseudo-dimeric, with the two endoNDs arranged in tandem in one polypeptide chain (Fig. 1C, Fig. 3B).
Both the endoND and the dimeric interface are highly conserved. A single endoND typically comprises seven α-helices. A pair of long perpendicular α-helices forms the active site as well as the dimer interface (Fig. 3A, B). The active site is at the intersection of the two helices and consists of two pairs of carboxylates, one pair from each helix. Dimerization results in a valley-shaped dsRNA-binding surface along the length of the dimer [23,31], and two active sites ~20 Å apart are symmetrically located at the bottom of the valley. The placement of the active sites matches the two scissile bonds, which lie 2 base pairs apart across the minor groove (Fig. 3C).
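The staggered geometry described above can be made concrete with a small sketch. The Python snippet below is only an illustration and is not taken from any of the cited structures: the function names, the example sequence and the coordinate convention (the bottom strand is stored 3´→5´ so that index i pairs with index i of the top strand) are assumptions chosen for clarity. It shows how two nicks offset by 2 bp on opposite strands leave a 2-nt 3´ overhang on each product.

```python
# Illustrative sketch only: names, sequence and coordinates are hypothetical.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def staggered_cut(top, k, stagger=2):
    """Cut a perfectly paired RNA duplex with two nicks offset by `stagger` bp.

    top     -- top strand written 5'->3'
    k       -- the top strand is nicked between indices k-1 and k
    stagger -- offset (in bp) of the nick on the bottom strand

    The bottom strand is stored 3'->5', so bottom[i] pairs with top[i].
    Returns ((left_top, left_bottom), (right_top, right_bottom)).
    """
    if not stagger <= k <= len(top):
        raise ValueError("k must satisfy stagger <= k <= len(top)")
    bottom = "".join(COMPLEMENT[base] for base in top)
    # The bottom-strand nick lies `stagger` bp upstream of the top-strand nick.
    left = (top[:k], bottom[:k - stagger])
    right = (top[k:], bottom[k - stagger:])
    return left, right


def three_prime_overhangs(left, right):
    """Length of the 3' overhang at the newly cut end of each product."""
    left_overhang = len(left[0]) - len(left[1])     # top-strand 3' end protrudes
    right_overhang = len(right[1]) - len(right[0])  # bottom-strand 3' end protrudes
    return left_overhang, right_overhang


if __name__ == "__main__":
    left, right = staggered_cut("GGAUCCGAUACG", k=6)
    print(left, right, three_prime_overhangs(left, right))
```

In this toy model the overhang length is simply the stagger between the two nicks, which is why a fixed spacing of the two active sites yields products with uniform ends.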
Divalent metal ions, preferably Mg²⁺, are essential for catalysis. Although only one metal ion is observed in many RNase III structures, biochemical data [32–34] and crystallographic studies of A. aeolicus RNase III complexed with RNA and of G. intestinalis Dicer with Er³⁺ ions indicate that dsRNA cleavage occurs via the two-metal ion mechanism (see the later discussion of RNase H). Structures of bacterial RNase III-dsRNA complexes (Fig. 3C) [23,31] indicate that the specificity for cleaving dsRNA, and not dsDNA or RNA/DNA hybrids, is due to specific recognition of the A-form duplex and the 2´-OH groups.
The only crystal structure of full-length Dicer to date is that of the G. intestinalis protein (Gi-Dicer), which consists of two endoNDs, a PAZ domain and a platform domain (Fig. 3D). The structure of Gi-Dicer has been likened to a hatchet, with the dimeric endoNDs forming the blade and the PAZ domain at the end of the handle. Gi-Dicer cleaves dsRNA into 25-bp products with a 2-nt overhang at the 3´ end. Using structures of bacterial RNase III-dsRNA and PAZ-nucleic acid complexes as templates, a model of the Gi-Dicer-dsRNA complex was proposed (Fig. 3D). The platform domain between the endoND and PAZ appears to have a positively charged flat surface for dsRNA binding and separates the cleavage site (endoND) from the 3´ end (PAZ) by ~25 bp. One can imagine that by slightly changing the size and shape of the platform domain, Dicer may generate 20 to 30 bp products.
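The ruler-like behaviour proposed for Gi-Dicer can be expressed as simple arithmetic. The sketch below is a hypothetical illustration rather than a model from the cited work: the function names are invented, and the rise of about 2.6 Å per base pair for A-form dsRNA is an assumed approximate value used only to convert a ruler length in base pairs into a distance. It measures a fixed number of base pairs from the end anchored in PAZ and reports the product and what remains of the substrate.

```python
# Illustrative sketch only: names and the rise-per-bp value are assumptions.
ASSUMED_RISE_PER_BP = 2.6  # angstroms per base pair, approximate A-form value


def ruler_span_angstrom(ruler_bp, rise=ASSUMED_RISE_PER_BP):
    """Approximate distance between the anchored 3' end and the cleavage site."""
    return ruler_bp * rise


def ruler_cut(duplex_bp, ruler_bp=25):
    """Measure `ruler_bp` base pairs from the anchored end and cut there.

    Returns (product_bp, remainder_bp), or None if the duplex is too short
    to reach from the anchored end to the cleavage site.
    """
    if ruler_bp <= 0:
        raise ValueError("ruler_bp must be positive")
    if duplex_bp < ruler_bp:
        return None
    return ruler_bp, duplex_bp - ruler_bp


if __name__ == "__main__":
    # A longer or shorter ruler (for example a differently sized platform
    # domain) shifts the product length across the 20-30 bp range.
    for ruler_bp in (20, 25, 30):
        print(ruler_bp, ruler_cut(60, ruler_bp), ruler_span_angstrom(ruler_bp))
```

The only parameter that sets the product size in this sketch is the ruler length, which mirrors the idea that the spacing between PAZ and the endoND active sites determines the length of the cleavage product.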
Gi-Dicer may be the minimal Dicer that is functional in vitro and in vivo. As depicted in Fig. 1C, the Drosha-DGCR8 microprocessor complex and other Dicers contain dsRBDs in addition to the endoND and PAZ domains. DUF283 in Dicer has been suggested to resemble a dsRBD based on sensitive sequence comparisons. In bacterial RNase III, each dsRBD is tethered to an endoND and contributes significantly to dsRNA binding (Fig. 3C). The core of DGCR8 (DiGeorge syndrome critical region gene 8 in human, Pasha in C. elegans) contains two tandem dsRBDs and complements Drosha in pri-miRNA processing as efficiently as full-length DGCR8 [37–39]. In the structure of the DGCR8 core domain, the dsRBDs interact extensively to form a compact structure. If these dsRBDs bind a single dsRNA, the RNA has to be significantly bent. Alternatively, they may interact with different segments or even separate molecules of pri-miRNA. How dsRBDs increase the catalytic activity of Dicer remains to be established experimentally. The absence of a dsRBD in Gi-Dicer may be compensated by the platform domain (Fig. 3D).
Both Drosha and Dicer can process imperfectly base-paired dsRNA (Fig. 1A). Bacterial RNases III can nick bulges present in dsRNA. By analogy, Dicers may accommodate unpaired bulges and bubbles. Indeed, Dicer is reported to cleave substrates with bulges or bubbles located 6–10 bp away from the endoND active sites, between the endoND and PAZ. In addition, the dsRBD of Dicer may facilitate binding to a hairpin loop region, as shown in the structure of the dsRBD from Rnt1p (Fig. 2D).
Crystal structures of PIWI and Ago, which contains a PIWI domain, reveal that the PIWI domain resembles RNase H, an endonuclease that nicks the RNA strand in an RNA/DNA hybrid (Fig. 4A). Further biochemical analyses have confirmed the intrinsic endonucleolytic activity of Ago. Two catalytically essential carboxylates of RNase H are conserved in the Ago and PIWI proteins that possess slicer activity, and the third carboxylate, which can be Asn in some RNases H, is replaced by His and occasionally Lys or Asp in Ago (Fig. 4A). RNase H cleaves RNA by the two-metal ion mechanism, the same as that employed by RNase III. An important feature of two-metal ion catalysis is that metal-ion coordination in the active site is substrate dependent. This ensures substrate specificity: only when the proper substrate is bound can the metal ions be coordinated to promote catalysis. This is probably why RNase H binds dsRNA and RNA/DNA hybrids but cleaves only RNA/DNA hybrids. Similar to the RNase H specificity, PIWI domains have high affinities for ssDNA, medium affinities for RNA/DNA and dsDNA, and the lowest affinities for dsRNA and ssRNA [27,47]. Yet the most efficient slicing activity has been observed with RNA/DNA hybrids, followed by dsRNA.
Ago is the core of the RISC complex, which uses ~22-nt ssRNA processed as siRNA or miRNA to target and cleave mRNA. Functionally, Ago recognizes the 3´ and 5´ ends of a guide RNA and the dsRNA formed between guide and target RNA, and it cleaves the target RNA at the 10th bp from the 5´ end of the guide RNA. Structurally, Ago can be divided into four domains: the N-terminal, PAZ, Mid and RNase H-like PIWI domains. Except for the N-terminal domain, each has been ascribed a specific function. The structures of Ago proteins with and without nucleic acid fragments [15,16,26,27,47] clearly show that each domain can move relative to the others (Fig. 4B). Recently, structures of T. thermophilus Ago (Tt-Ago) bound to a ssDNA mimicking a guide RNA and to an RNA/DNA duplex have been reported. In both structures the guide strand anchors its 5´-phosphate in the Mid domain and its 3´-OH in PAZ (Fig. 4C). In the RNA/DNA complex (with DNA as the guide strand) the duplex can only be traced for the first 10–11 base pairs from the 5´ end of the guide strand. Nucleotides 10 and 11 of the target strand are in the vicinity of the active site and are partially disordered because of the mismatch introduced there to prevent cleavage. Also visible in the electron density are the last 3 nucleotides of the guide strand bound to the PAZ domain. To accommodate the RNA/DNA duplex, the central groove is significantly wider than in other Ago structures. This is a result of large movements of the N-terminal and PAZ domains, which are flexible and can be compared to two lobster claws, while the Mid and PIWI (RNase H) domains of Ago may be compared to the lobster tail and head. When fully stretched, Ago can accommodate >10 bp of dsRNA or RNA/DNA from the 5´ end of a guide RNA. The structure demonstrates that this so-called “seed region” is key for target recognition through base pairing, which is also confirmed by biochemical data. Guide strands as short as 9 nt can support cleavage, and the introduction of bulges or mismatches in the seed region decreases cleavage, whereas up to six contiguous mismatches are tolerated in the 3´-end region of the guide beyond the active site.
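The positional rule for slicing described above, in which the cut is counted in base pairs from the 5´ end of the guide, can be illustrated with a short sketch. The Python code below is a simplified, hypothetical example: the function names and the example sequences are invented, only a perfectly complementary target site is handled, and "cleaves at the 10th bp" is interpreted here as cutting the target strand between the nucleotides paired with guide positions 10 and 11. That interpretation is an assumption made for the illustration.

```python
# Illustrative sketch only: names, sequences and the exact cut convention
# (between the target nucleotides paired to guide positions 10 and 11)
# are assumptions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def slice_target(guide, target, cut_after_pair=10):
    """Cut `target` at a site fully complementary to `guide`.

    Base pairs are counted from the 5' end of the guide. The target is cut
    between the nucleotides paired with guide positions `cut_after_pair`
    and `cut_after_pair + 1`.
    Returns (five_prime_fragment, three_prime_fragment), or None if the
    target contains no fully complementary site.
    """
    if not 1 <= cut_after_pair < len(guide):
        raise ValueError("cut_after_pair must lie within the guide")
    site = reverse_complement(guide)
    start = target.find(site)
    if start < 0:
        return None
    # Guide position p (1-based) pairs with target index start + len(guide) - p.
    cut = start + len(guide) - cut_after_pair
    return target[:cut], target[cut:]


if __name__ == "__main__":
    guide = "UGAGGUAGUAGGUUGUAUAGUU"  # example 22-nt guide, written 5'->3'
    target = "GGG" + reverse_complement(guide) + "AAA"
    print(slice_target(guide, target))
```

Because the cut position is defined relative to the guide 5´ end, the same function gives the same cut for guides of different lengths, consistent with the 5´ end being the reference point for cleavage.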
Tremendous progress has been made in understanding the mechanism of RNAi from the structural perspective, but structures of key proteins and protein-RNA complexes remain to be determined, e.g. Drosha, the N-terminal domains of Drosha and Dicer proteins, and Ago complexed with dsRNA substrates longer than 10 bp. In addition, new RNAi pathways are being discovered [8,48], and new Dicer and Slicer variants with different substrate specificities may emerge. Structural biology will likely continue to play a central role in illuminating RNAi pathways.
W. Yang is supported by NIDDK, NIH intramural research program. M. Nowotny is supported by the Wellcome Trust International Senior Research Fellowship and the EMBO Installation Grant. We thank Dr. R. Craigie for editing the manuscript.