RNaseIII proteins are dsRNA-specific endonucleases involved in many important biological processes, such as small RNA processing and maturation in eukaryotes. Various small RNAs have been identified in the protozoan parasite Entamoeba histolytica. EhRNaseIII is the only RNaseIII endonuclease domain (RIIID)-containing protein in E. histolytica. Here, we present three crystal structures that reveal several unique structural features of EhRNaseIII, especially the interactions between the two helixes (α1 and α7) flanking the RIIID core domain. Structure and sequence analyses indicate that EhRNaseIII is a noncanonical Dicer that lacks a dsRBD in the C-terminal region (CTR). In vitro studies suggest that EhRNaseIII prefers to bind and cleave longer dsRNAs, generating products around 25 nucleotides in length. Truncating the CTR or attaching the dsRBD of Aquifex aeolicus RNaseIII enhances the binding and cleavage activities of EhRNaseIII. In combination with an in vitro crosslinking assay, our results suggest that EhRNaseIII functions in a cooperative mode. We speculate that partner proteins may exist in E. histolytica and regulate the activity of EhRNaseIII through interaction with its CTR. Our studies support the view that EhRNaseIII plays an important role in producing small RNAs in E. histolytica.
Ribonuclease III (RNaseIII) proteins are metal-ion-dependent, double-stranded (ds) RNA-specific endonucleases1 that are highly conserved in bacteria and eukaryotes. In higher eukaryotes, RNaseIII proteins, such as Dicer and Drosha, play important roles in the RNA interference (RNAi) pathway2. Dicer converts long dsRNAs into small interfering RNA (siRNA) duplexes that are 21–25 nucleotides (nt) long with a phosphate group at the 5′-end and a 2-nt overhang at the 3′-end3,4. These features are critical for the loading of siRNA duplexes onto the RNA-induced silencing complex (RISC)5,6. Argonaute (Ago) is the effector protein of RISC and it is activated after the degradation of the siRNA passenger strand7. The guide strands of siRNA duplexes then direct the RISC to the target RNAs via Watson–Crick base pairing8,9. Similar to the passenger strands, the target RNAs are cleaved by RISC, leading to their silencing. In lower eukaryotes that do not have the RNAi system, such as Saccharomyces cerevisiae, RNaseIIIs play multiple roles in the processing and maturation of precursor rRNAs, small nucleolar RNAs, and small nuclear RNAs10. In bacteria, RNaseIIIs are mainly involved in rRNA maturation and post-transcriptional gene regulation11.
Based on their sizes and domain architectures (Fig. 1A), RNaseIIIs can be divided into four classes1,12. Class I RNaseIIIs mainly exist in bacteria, such as Escherichia coli RNaseIII (EcRNaseIII)13 and Aquifex aeolicus RNaseIII (AaRNaseIII)14,15,16,17. Class I RNaseIIIs are around 230aa in size and possess one N-terminal RIIID domain and one C-terminal dsRBD. Compared with class I RNaseIIIs, class II RNaseIIIs are longer (~500aa). Class II RNaseIIIs are represented by Saccharomyces cerevisiae Rnt1 (ScRnt1)18,19 and Kluyveromyces polysporus Dicer 1 (KpDcr1); both have an N-terminal extension domain (NTD) that forms an intermolecular dimer. As revealed by the crystal structure of the ScRnt1-product complex18, the NTD forms several hydrogen bonds (H-bonds) with the AGUC tetraloop of the RNA substrate. Deletion of the NTD lowers the RNA processing accuracy of ScRnt1. Class III RNaseIIIs are represented by Homo sapiens (Hs) Drosha20,21. HsDrosha is around 1,400aa long and possesses one P-rich and one RS-rich domain at the N-terminus, followed by one platform and one PAZ-like domain in the middle, and two RIIID domains and one dsRBD domain at the C-terminus. Class IV RNaseIIIs are represented by HsDicer22,23,24, which is close to 1,900aa in length. Similar to HsDrosha, HsDicer also contains two RIIID domains followed by one dsRBD domain at the C-terminus. The N-terminus of HsDicer is composed of one helicase domain and one DUF283 domain followed by a platform and a PAZ domain, and these domains are involved in the binding and terminus recognition of substrate RNAs9. Extensive structural and functional studies have been carried out for these representative RNaseIIIs, including AaRNaseIII, ScRnt1, and HsDrosha, which elucidated the catalytic mechanism and structural basis for substrate recognition; however, the structures and functions of many non-representative RNaseIIIs from other species remain elusive.
Entamoeba histolytica is a protozoan parasite that infects millions of people and causes nearly 100,000 amoebiasis deaths worldwide per year, according to a World Health Organization report of 199725. Entamoeba histolytica has two life-cycle stages: the cyst form and the trophozoite form. The cyst form is its dormant stage, which helps the parasite survive adverse conditions. In the trophozoite form, the parasite can infect people and cause disease26. Various small RNAs, with lengths of 16nt, 22nt, and 27nt, have been discovered in E. histolytica27. There are three genes in the E. histolytica genome (EHI_186850, EHI_125650, and EHI_177170) that encode Ago proteins. Among them, Ago2-2, encoded by EHI_125650, is highly expressed and associates with the 27-nt RNAs27,28,29. The E. histolytica genome also contains a homolog of the RNA-dependent RNA polymerases (RdRP), which are essential for small RNA biogenesis in some eukaryotes, such as S. pombe, C. elegans, and plants. Interestingly, there is only one RNaseIII protein (EhRNaseIII) encoded by the E. histolytica genome30, which is composed of 256 amino acids (aa). In vivo RNaseIII activity has been detected in E. histolytica trophozoites31 and, recently, it was confirmed that EhRNaseIII can process dsRNA in the RNAi-negative background of Saccharomyces cerevisiae32, and that it can partially reconstitute the RNAi pathway in conjunction with Saccharomyces castellii Ago133. These observations suggest that EhRNaseIII may play a role in the RNAi pathway in E. histolytica.
To further investigate the potential role of EhRNaseIII and to uncover the structural basis underlying its functions, we performed crystallographic studies and in vitro catalytic assays of EhRNaseIII (Supplementary Fig. S1). Herein, we present three high-resolution crystal structures of EhRNaseIII, including selenomethionine (SeMet)-labeled EhRNaseIII (aa 1–194, SeMet-EhRNaseIII194), EhRNaseIII229 (aa 1–229), and an EhRNaseIII229-Mn2+ complex. These structures, in combination with sequence analysis, suggest that EhRNaseIII is a noncanonical Dicer that possesses some very unique structural features. Our in vitro assays show that the C-terminal region (CTR) of EhRNaseIII has an inhibitory effect on its binding and cleavage of dsRNA, and that this effect can be attenuated by removal of the CTR or by attaching a classical dsRNA binding domain (dsRBD) after the CTR. EhRNaseIII preferentially binds and cleaves longer dsRNAs, generating products of around 25nt. Based on these observations, we propose that EhRNaseIII binds and cleaves dsRNAs in a cooperative way, and we speculate that some unknown partner proteins, most likely dsRBD-containing proteins, may exist in E. histolytica that can enhance the activity of EhRNaseIII.
The common structural features of all RNaseIIIs are the RIIIDs (Fig. 1B). RIIID is characterized by a signature motif, which is 38ERLEFLGD46 in EcRNaseIII; E41 and D46 are two conserved catalytic residues. Two more negatively charged residues (corresponding to D114 and E117 in EcRNaseIII) are also highly conserved and critical for the catalytic activity of RNaseIIIs, although in some class III RNaseIIIs there is a D→N variation in the first RIIID domain (RIIIDa) (Fig. 1B). Although EhRNaseIII has very low similarity to these representative RNaseIIIs, sequence alignment was able to identify the signature motif (48EKNEFYGD55) and the two conserved catalytic residues (D116 and E119). Sequence alignment also identified two more conserved residues (N91 and K112) in EhRNaseIII. These two residues are not conserved in bacterial RNaseIIIs; whereas they are highly conserved throughout eukaryotic RNaseIIIs (with a K→H variation in the RIIIDa domains of CeDrosha and HsDrosha). In vitro studies in HsDicer and KpDcr1 showed that N→A and K→A mutation will significantly reduce the cleavage activities of the proteins, suggesting that these two residues play important roles during the cleavage reaction34.
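The motif comparison described above can be illustrated with a small position-by-position check. This is only a sketch: the two eight-residue strings are the signature motifs quoted in the text, the EhRNaseIII numbering starts at residue 48 as given, and the helper name `identical_positions` is hypothetical. It is not the alignment procedure used in the study, which is not specified here.

```python
def identical_positions(ref_motif, query_motif, query_start):
    """Return query residues (e.g. 'E51') that are identical to the
    reference motif at the same position.

    Assumes the two motifs are already aligned without gaps and have
    the same length; query_start is the residue number of the first
    query position.
    """
    if len(ref_motif) != len(query_motif):
        raise ValueError("motifs must have the same length")
    return [
        f"{q}{query_start + i}"
        for i, (r, q) in enumerate(zip(ref_motif, query_motif))
        if r == q
    ]


# Signature motifs quoted in the text (EcRNaseIII vs. EhRNaseIII).
shared = identical_positions("ERLEFLGD", "EKNEFYGD", 48)
print(shared)  # ['E48', 'E51', 'F52', 'G54', 'D55']
```

Under these assumptions, the shared positions include E51 and D55, the two residues that the text identifies as the conserved catalytic residues of the signature motif.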
The conservation of the six catalytically important residues suggests that EhRNaseIII is closely related to the eukaryotic RNaseIIIs. Evolutionary analysis further indicates that EhRNaseIII is similar to the RIIIDa domains of Drosha and Dicer proteins (Fig. 1C). However, the size of EhRNaseIII is more similar to the bacterial RNaseIIIs compared with the eukaryotic RNaseIIIs. The size of the CTR (corresponding to aa 163–256) of EhRNaseIII is similar to the typical bacterial RNaseIII dsRBDs, but no sequence similarity was identified between them. A typical dsRBD, such as the dsRBDs of AaRNaseIII, adopts an αβββα fold (Supplementary Fig. S2). The second α-helix sits in-between the first α-helix and the three β-strands, and it plays two functionally important roles: it stabilizes the dsRBD structure, and enhances dsRBD and dsRNA binding through the formation of H-bonds. According to the secondary structure prediction program GOR4, there are two consecutive α-helixes followed by two short β-strands in the EhRNaseIII CTR region. Phylogenetic analysis and the lack of a dsRBD domain suggest that EhRNaseIII might represent a noncanonical Dicer.
Three EhRNaseIII structures were solved in this study: SeMet-EhRNaseIII194, EhRNaseIII229, and the EhRNaseIII229-Mn2+ complex. The structures all belong to the P212121 space group, with one EhRNaseIII intermolecular dimer per asymmetric unit. The C-terminal 34 residues (aa 196–229) were disordered in the EhRNaseIII229 structure. All structures are very similar, with root mean square deviations (rmsd) of 0.4–0.7Å between all EhRNaseIII dimers. Because of its high resolution, the EhRNaseIII229 structure was used for structural analysis and comparison hereafter.
Each EhRNaseIII229 monomer contains seven helixes (Fig. 2A,B); the conformations of α1 (aa 1–28) and α7 (aa 165–195) are unique compared with other RNaseIII structures, including AaRNaseIII, KpDcr1, and ScRnt1. In most RNaseIII structures, there is a flexible linker between the RIIID and the dsRBD domains, which provides the structural basis for the major conformational changes associated with substrate binding (Supplementary Fig. S2). As confirmed by the EcRNaseIII study, substitution of Q153 by the rigid P153 residue in the middle of the linker reduces the linker flexibility and abolishes the dsRNA cleavage activity35. α6 (aa 135–159) corresponds to the last helix in other RIIID domain structures, and it is connected to α7 through a 5-residue linker (referred to as the α6–α7 linker, 160NPPKL164). Unlike other RNaseIIIs, the α6–α7 linker of EhRNaseIII forms tight interactions with the surrounding residues (Fig. 2C). Via its N and ND2 atoms, N160 forms two H-bonds (2.8Å and 3.1Å) with the O atom of Y156. The side chain of P162 sits in a hydrophobic pocket formed by I76, M80, F157, V165, and I169; the backbone O atom of P162 interacts with K166 via an H2O-mediated H-bond. An H2O-mediated H-bond was also observed between K163 and Q75. Although P161 does not form direct interactions with other residues, it could further reduce the flexibility of the linker owing to its rigidity.
α7 is fixed in the structure and forms several interactions with α1 (Fig. 2D), including the hydrophobic interactions formed by the side chains of M10, F177, and L181, one H-bond (3.2Å) between the O atom of S3 and the NE2 atom of Q174, and one salt bridge (2.5Å) between the OD2 atom of D16 and the NH2 atom of R185. Both α1 and α7 are further stabilized by their interactions with α6. The N-terminus of α6 (135TLFLLFAHALI145) mainly interacts with α1 and α7 through hydrophobic interactions. Besides hydrophobic interactions, the C-terminus of α6 (147YIFYHSSYIYFNA159) also forms several H-bonds with α1 and α7, via the OH groups of Y147 and Y150, the ND1 atom of H151, and the OD1 atom of N158. Between the N-terminus and C-terminus of α6, there is one charged residue, D146. Interestingly, the OD2 atom of D146 forms one salt bridge (2.8Å) with the NZ atom of K180 and one H-bond (2.6Å) with the OH group of Y184. Other helixes, such as α2, also interact with α1 and α7.
The conformations of two loops, loop A (32DLLQLNQAYSS42, the loop between helixes α1 and α2) and loop B (103LGDTKTFE110, the loop between helixes α4 and α5), are significantly different in the EhRNaseIII229 and the SeMet-EhRNaseIII194 structures. The conformation of loop B is also different in the AaRNaseIII and KpDcr1 structures (Supplementary Figs S3A and C). In the AaRNaseIII-RNA complex structure, loop B interacts with the major groove of dsRNA (Supplementary Fig. S3B). Loop B of EhRNaseIII is shorter than the corresponding loops of AaRNaseIII and KpDcr1 by 5 and 12 aa, respectively (Fig. 2A), and it may contribute to the weak dsRNA-binding ability of EhRNaseIII described later on.
For efficient RNA cleavage, RNaseIIIs have to form a dimer, either intramolecularly or intermolecularly. Although EhRNaseIII229 contains seven helixes, structural comparison revealed that the RIIID core domain only contains the middle five helixes, α2–α6. As depicted in Fig. 2B, the dimer interface of EhRNaseIII229 is mainly formed by α2 (aa 42–71) and α3 (aa 78–89). Other dimerization-enhancing interactions, such as helix swapping in the KpDcr1 structure or loop cross-talk in the AaRNaseIII structure, have been reported; however, no such interaction was observed in the EhRNaseIII229 structure. α2 contains the signature motif [48EKNEFYGD55, which has three highly conserved residues (underlined)]. In all other RNaseIIIs, there is another highly conserved residue (Leu or Val) prior to the conserved Gly and Asp residues, whereas it is a Tyr residue (Y53) at the corresponding position in EhRNaseIII. The OH atom of Y53 forms one very tight H-bond (2.6Å) with the OE2 atom of E60 from the partner molecule (Fig. 2E); together with its hydrophobic interactions with the surrounding residues, such as Y57, L67, V126, and L127, Y53 may function as a lock holding the two monomers together from the side opposite the catalytic valley. Interestingly, there is another lock at the catalytic valley side (Fig. 2F), which is composed of F52, S63, and R85. The NH1 atom of the R85 side chain forms two H-bonds, one (3.0Å) with the O atom of the F52 backbone, and another (3.0Å) with the OG atom of the S63 side chain. In addition, F52 and R85 also interact with each other through the stacking of their side chains.
The rmsd between the core RIIID domain of EhRNaseIII229 and that of the AaRNaseIII (aa 18–145) is 2.8Å, and is 2.3Å and 2.1Å, when compared with the core RIIID domains of KpDcr1 (aa 112–260) and ScRnt1 (aa 197–363), respectively. As shown in Fig. 3A, the catalytic valley of EhRNaseIII is highly negatively charged; actually, it is more negatively charged when compared with AaRNaseIII, KpDcr1, and ScRnt1 structures, owing to the presence of E59 (which resides right next to the two-fold axis of the dimer). E59 is not conserved in the class I and class II RNaseIIIs, but it is highly conserved in the RIIIDa domains of class III RNaseIIIs and the RIIIDb domains of class IV RNaseIIIs (Fig. 1B), though the functional importance of this residue remains elusive.
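For readers unfamiliar with the rmsd values quoted above, the following is a minimal sketch of how a root mean square deviation is computed between two sets of paired atomic coordinates. It assumes the two structures have already been superimposed and that the atom pairing is given; the function name and the toy coordinates are illustrative, and the superposition software actually used for the comparisons is not stated in the text.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. Å)
    between two equally long lists of (x, y, z) tuples.

    Assumes the coordinates are already superimposed and that atom i
    in coords_a is paired with atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom is displaced by 1 Å along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(a, b))  # 1.0
```

In practice the value depends on which atoms are paired (for example Cα atoms only) and on the superposition, so figures such as 2.1–2.8Å are only comparable when computed with the same conventions.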
The active site of EhRNaseIII (Fig. 3B) contains four negatively charged residues (E51, D55, D116, and E119), which form two metal-binding sites: the prominent metal-binding site (M1) and the second metal-binding site (M2). Divalent metal ions (preferentially Mg2+ and Mn2+) are required for the RNA cleavage reaction catalyzed by RNaseIIIs. The complex structures have been determined for RNaseIIIs from different classes, such as AaRNIII-Mg2+ (PDB code: 1RC5) in class I36, and KpDcr1-Mg2+ (PDB code: 3RV0, Fig. 3D) in class II. Both AaRNIII-Mg2+ and KpDcr1-Mg2+ complex structures were obtained through co-crystallization method, which uses 1.5mM MgCl2 in the protein sample and 20mM MgCl2 in the crystallization buffer, respectively. Very surprisingly, although the crystallization buffer contains 20mM MgCl2, no Mg2+ ion was bound at the M1 or M2 sites in the EhRNaseIII229 structure, suggesting that the Mg2+ ion-binding affinity of EhRNaseIII is weak. The EhRNaseIII229-Mn2+ complex structure (Fig. 3B) was obtained by soaking the EhRNaseIII229 crystals overnight in crystallization buffer supplemented with 10mM MnCl2. The occupancy of Mn2+ ions at the catalytic site A was very low, so it was not modeled in the structure. In contrast to catalytic site A, two well-defined Mn2+ ions were bound at the M1 and M2 positions of catalytic site B. As depicted in Fig. 3C, the M1 site Mn2+ ion coordinates with the side chains of E51 and D55; whereas the Mn2+ ion at the M2 site coordinates with E51, D116, and E119. Structural comparison revealed that the conformations of D116 and E119 are conserved; whereas, E51 and D55 can undergo obvious conformational changes upon binding of Mn2+ ions.
In eukaryotic RNaseIIIs, there are two more important conserved catalytic residues, one Asn and one Lys (Fig. 3D). There are four consecutive Lys residues (K111-K114) in EhRNaseIII, and structural superimposition revealed that K112 is the important one for catalysis. The backbone of K112 is well defined, but the side chain is very flexible, indicated by the extremely weak electron density. In the KpDcr1 structure, the NZ atom of K217 forms one H-bond (2.5Å) with the nucleophilic water, which attacks the phosphorus atom at the cleavage site34. In the ScRnt1 structure, the NZ atom of K313 forms one H-bond (2.9Å) with the OP1 atom of the product 5′-phosphate group18. These differences suggest that the flexibility of the K112 side chain is functionally relevant and that it provides the structural basis for the necessary conformational changes associated with the metal ion and RNA binding. In the KpDcr1 structure, N184 interacts with the Mg2+ ion at the M1 site through one water molecule (the distance between the bridge water and the OD1 atom of N184 is 2.9Å). Both M1 and M2 sites were occupied by an Mg2+ ion in the ScRnt1 structure; interestingly, the M2 site Mg2+ ion also interacted with N278 through a water molecule, and the distance between the bridge water and OD1 atom of N278 is 2.6Å. These interactions indicated that the conserved Asn residue was mainly involved in the stabilization of the metal ions. N91 residues are very stable in all our EhRNaseIII structures, supported by the well-defined electron densities. Structural comparison further reveals that the overall conformations of N91 in EhRNaseIII structures are similar to that of N184 in KpDcr1 (Fig. 3D) and N278 in the ScRnt1-product structure.
In the EhRNaseIII229 structure, α7 (aa 165–195) folds back and forms tight interactions with the N-terminal α1 and other helixes of the RIIID core; the remaining 61 residues (aa 196–256) are too short to form a typical dsRBD, which is consistent with the secondary structure prediction results. For other RNaseIIIs, such as EcRNaseIII and KpDcr1, the dsRBDs play an important role in dsRNA substrate binding and cleavage. To better understand the functional role of the EhRNaseIII CTR, electrophoretic mobility shift assays (EMSAs; Fig. 4) were carried out using different dsRNA substrates and various EhRNaseIII proteins, including EhRNaseIII194, EhRNaseIII229, and EhRNaseIII256. In total, four sets of dsRNA substrates, RNA25, RNA50, RNA70, and RNA100, were used in the EMSAs. Among them, RNA25 was not bound by any of the three EhRNaseIII proteins (not shown). EhRNaseIII256 did not bind RNA50 or RNA70 (Fig. 4A,B, left panel), whereas it did bind RNA100 (Fig. 4C, left panel). The apparent Kd for RNA100 binding by EhRNaseIII256 was ~6×10−4M, corresponding to a binding affinity much lower than that of KpDcr134. EhRNaseIII229 did not bind RNA50 or RNA70 (Fig. 4A,B, middle panel); however, similar to EhRNaseIII256, EhRNaseIII229 could bind RNA100 (Fig. 4C, middle panel). Interestingly, the RNA100-binding affinity of EhRNaseIII229 was at least 2-fold higher than that of EhRNaseIII256, as revealed by the almost complete shifting of RNA100 in lane 6 (the concentration of EhRNaseIII229 was 3×10−4M). The lower Kd of the EhRNaseIII229 protein suggests that the C-terminal aa 230–256 have an inhibitory effect on dsRNA substrate binding.
EhRNaseIII194 can bind all dsRNA substrates, including RNA50, RNA70, and RNA100. Although its binding affinity to RNA50 is still low (Fig. 4A, lane 8 of the right panel), EhRNaseIII194 can bind RNA70 substrate at the concentration of 1.0×10−4M (Fig. 4B, lane 5 of the right panel), and this binding is tighter than the binding between RNA100 and EhRNaseIII256 (also at 1.0×10−4M concentration). RNA100 can be shifted by EhRNaseIII194 at the concentration of 5.0×10−5M (Fig. 4C, lane 4 of the right panel); this binding affinity is about 6- and 2-fold higher than that of EhRNaseIII256 and EhRNaseIII229, respectively, estimated from the molar ratios of bound RNAs versus free RNAs. These observations indicate that aa 195–229 also play a role in inhibiting dsRNA substrate binding.
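The fold differences above are described as being estimated from the molar ratios of bound versus free RNA. As a rough illustration of that kind of estimate, the sketch below uses a simple one-site binding model with protein in large excess over RNA, in which bound/free = [protein]/Kd. This model is an assumption made here for illustration; cooperative binding, as proposed for EhRNaseIII, would deviate from it, and the function names and example numbers are hypothetical rather than measured values.

```python
def apparent_kd(protein_conc_molar, fraction_bound):
    """Apparent Kd (M) from one EMSA lane under a one-site model with
    protein in large excess: Kd = [P] * (1 - f) / f.
    """
    if not 0.0 < fraction_bound < 1.0:
        raise ValueError("fraction_bound must be strictly between 0 and 1")
    return protein_conc_molar * (1.0 - fraction_bound) / fraction_bound


def fold_affinity(bound_over_free_a, bound_over_free_b):
    """Fold difference in affinity of protein A over protein B when both
    are compared at the same protein concentration.

    Under the one-site model, bound/free = [P]/Kd, so the ratio of the
    two bound/free values equals Kd_B / Kd_A.
    """
    if bound_over_free_a <= 0 or bound_over_free_b <= 0:
        raise ValueError("bound/free ratios must be positive")
    return bound_over_free_a / bound_over_free_b


# Illustrative numbers only (not taken from the gels):
# half of the RNA shifted at 1.0e-4 M protein gives Kd = 1.0e-4 M.
print(apparent_kd(1.0e-4, 0.5))
# A bound/free ratio of 3.0 versus 0.5 corresponds to a 6-fold difference.
print(fold_affinity(3.0, 0.5))
```

Because band intensities on a gel are only semi-quantitative, estimates of this kind are best read as order-of-magnitude comparisons.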
The dsRNA binding affinity of EhRNaseIII194 follows the order RNA100>RNA70>RNA50>RNA25 (Supplementary Fig. S4), indicating that the binding affinity is correlated with the substrate size. A similar conclusion can also be drawn for EhRNaseIII229 and EhRNaseIII256, based on the EMSA results depicted in Fig. 4. EhRNaseIII256 and EhRNaseIII229 form one major complex with RNA100, which moves just slightly slower than the free RNAs. However, no such complex was observed in the case of EhRNaseIII194; instead, EhRNaseIII194 forms two complexes with the RNAs, and both of them move much more slowly than the complexes formed in the presence of EhRNaseIII229 and EhRNaseIII256. These observations suggest that the slow-moving complexes may contain multiple EhRNaseIII194 dimers.
dsRNA cleavage activities of RNaseIIIs are dependent on the divalent metal ions, preferentially Mg2+. EhRNaseIII has a conserved RIIID, including the conserved residues that coordinate with the metal ions. No Mg2+ ion was observed at the catalytic site of EhRNaseIII structures; however, previous studies have revealed that the metal ion (especially the one at the M2 position) binding affinities of RNaseIIIs can be enhanced by the presence of RNA substrates. Therefore, it is possible that EhRNaseIII is still active in the presence of Mg2+. To explore this possibility, we carried out in vitro cleavage assays with RNA substrates in the presence of Mg2+; however, very surprisingly, no detectable dsRNA cleavage activity was observed for any EhRNaseIII proteins (Supplementary Fig. S5A), indicating that Mg2+ alone was not enough to assemble the EhRNaseIII-dsRNA complex in catalytic form.
Some RNaseIIIs are also active in the presence of Mn2+; and, as for other cation-dependent nucleases, our structure revealed that the binding of the negatively charged catalytic residues with Mn2+ was stronger than that with Mg2+. Therefore, we also carried out the cleavage assay in the presence of Mn2+. Almost no RNA25 was cleaved by the three native proteins, EhRNaseIII256, EhRNaseIII229, and EhRNaseIII194 (not shown), whereas RNA50, RNA70, and RNA100 could be cleaved by all three proteins under the same reaction conditions (37°C, 100min). As exemplified by EhRNaseIII194 (Fig. 5A, left panel), the major cleavage product of RNA50 is about 25nt in size; there are two major products formed in the case of RNA70, which are about 25nt and 50nt, respectively. Besides these two products, another product with a length close to the 70-nt marker was generated from RNA100. As indicated by the product intensity, the RNA cleavage activity of EhRNaseIII194 follows the order RNA100>RNA70>RNA50; similar results are also observed for EhRNaseIII229 (Fig. 5A, middle panel) and EhRNaseIII256 (Fig. 5A, right panel), suggesting that EhRNaseIII preferentially cleaves the longer RNAs. Interestingly, besides the product bands, some slow-moving bands were observed on the gel; these bands may be caused by EhRNaseIII proteins that were not completely denatured under these conditions.
The RNA100 cleavage activity of the proteins follows the order EhRNaseIII194>EhRNaseIII229>EhRNaseIII256; though the difference is less obvious than for RNA100, the proteins follow the same order in cleaving RNA70 and RNA50 (Fig. 5A). These observations suggest that the CTR of EhRNaseIII may have an inhibitory effect on substrate cleavage, and this conclusion is further supported by a time-course in vitro cleavage assay of RNA100. As depicted in Fig. 5B, a significant amount of product is generated after 60min of reaction for EhRNaseIII194, whereas only a small amount of product is formed in the presence of EhRNaseIII229 and only a trace amount is observed in the case of EhRNaseIII256. As the reaction time increases, more substrate is cleaved by the proteins, and the longest products are converted into shorter ones. Though it needs to be further determined, the pattern and the convergence of these products suggest that the sizes of the two longer fragments might be double and triple that of the smallest one, which is about 25nt in size. As a negative control, the in vitro cleavage assay was also carried out with the catalytically deficient mutant E119Q; as depicted in Fig. 5B, no product was generated, confirming that the above EhRNaseIII cleavage activities are not caused by contamination.
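The suggestion that the longer fragments are double and triple the size of the ~25-nt product can be made concrete with a toy calculation. The sketch below simply lists the integer multiples of a unit length that are shorter than the substrate; the function name `product_ladder` and the default unit of 25nt are assumptions for illustration, and the model ignores leftover fragments and any sequence or end effects.

```python
def product_ladder(substrate_nt, unit_nt=25):
    """Fragment lengths (nt) expected if partial digestion yields
    products that are integer multiples of unit_nt and shorter than
    the substrate.
    """
    if substrate_nt <= 0 or unit_nt <= 0:
        raise ValueError("lengths must be positive")
    return list(range(unit_nt, substrate_nt, unit_nt))


for length in (25, 50, 70, 100):
    print(f"RNA{length}: {product_ladder(length)}")
# RNA25: []
# RNA50: [25]
# RNA70: [25, 50]
# RNA100: [25, 50, 75]
```

Under this simplification, the ladder agrees with the reported bands: one ~25-nt product for RNA50, ~25-nt and ~50-nt products for RNA70, and an additional product near the 70-nt marker for RNA100.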
Many small RNAs exist in E. histolytica, and EhRNaseIII is the only RNaseIII protein identified in E. histolytica. As revealed by our in vitro studies, EhRNaseIII is active in the presence of Mn2+ ions, but very high protein concentrations (up to the μM level) are required for efficient substrate cleavage. The C-terminal dsRBDs play an important role in dsRNA processing by KpDcr1; removing the two dsRBDs dramatically reduces the cleavage activity and results in the formation of heterogeneous products34. Some other RNaseIIIs also have dsRBD-containing protein partners (Supplementary Fig. S8), which play critical roles in miRNA biogenesis, such as DGCR8 for Drosha in Homo sapiens37, HYL1 for DCL1 in Arabidopsis thaliana38, and Loqs-PD and R2D2 for Dicer2 in Drosophila39. Like the dsRBDs of KpDcr1, these partner proteins also have dsRNA-binding ability.
EhRNaseIII does not contain a dsRBD domain. To test whether a dsRBD can enhance the RNA-binding and cleavage activity of EhRNaseIII, we constructed a chimeric EhRNaseIII protein, EA256 (Fig. 6A), which is composed of EhRNaseIII and the dsRBD domain of AaRNaseIII. The AaRNaseIII dsRBD was selected because its structure and its interaction with dsRNAs have been well characterized14,15,16,17 (Supplementary Fig. S2). As shown in Fig. 6B, EA256 can bind RNA50, RNA70, and RNA100 completely at the concentration of 1.0×10−4M (lane 4); at this concentration, EA256 can also bind more than 70% of RNA25, which does not interact with EhRNaseIII256. These results suggest that EA256 has significantly improved dsRNA-binding affinity; compared with EhRNaseIII256, the binding affinity was estimated to be increased by 10- to 100-fold.
Similar to EhRNaseIII256, EA256 is not active in the presence of Mg2+ ions (Supplementary Fig. S5B). In the presence of Mn2+, the dsRNA cleavage activity of EA256 is much higher than that of EhRNaseIII256; at the concentration of 1.0×10−6M, EA256 can efficiently digest all the RNA substrates, including RNA50, RNA70, and RNA100 (Fig. 6C). As estimated from the gel, about 20% of RNA50 was cleaved after 100min of reaction, creating a product of about 25nt in size; under the same conditions, more than 50% of RNA70 was cleaved, forming two products with lengths of about 25 and 50nt, respectively. For RNA100, an additional, longer product was also observed; the pattern of these product bands is very similar to those of EhRNaseIII256, EhRNaseIII229, and EhRNaseIII194. Also similar to these EhRNaseIII proteins, EA256 can convert the longer products into the shorter ones as the reaction time increases (Fig. 6D). These observations suggest that EA256 shares a similar substrate binding and cleavage mechanism with the native EhRNaseIII proteins.
RNaseIIIs from higher species have developed various mechanisms to precisely control their product sizes, which is critical for their functions. The product lengths of Dicer are determined by the RNA structures and the cooperative interactions of the PAZ, dsRBD, and helicase domains40,41,42; ScRnt1 uses two molecular rulers embedded in the NTD and dsRBD domains to ensure accurate cleavage of the substrate18. Owing to the lack of a molecular ruler, the products created by class I RNaseIIIs vary in size. As revealed by the AaRNaseIII structure, the product can be as short as 11nt. Similar to the class I RNaseIIIs, EhRNaseIII has no known molecular ruler. However, the distinct product pattern (Figs 5, 6C and D) suggests that EhRNaseIII can control its product length by some mechanism.
Similar to EhRNaseIII, no molecular ruler exists in KpDcr1; however, a previous study revealed that KpDcr1 can achieve precise substrate cleavage through cooperative interactions between the protein molecules34. It was proposed that KpDcr1 dimers bind the dsRNA contiguously along its length, and that two variable loops (VL-1 and VL-2) within the RIIIDs play an important role in the packing of neighboring dimers. VL-1 corresponds to the loop that extends along the RNA minor groove in the AaRNaseIII-product structure (PDB code: 2NUG); replacement of VL-1 with the analogous region from the GiDicer RIIIDb domain reduces its RNA cleavage activity and generates heterogeneous products. VL-2 of KpDcr1 corresponds to the loop that constitutes the RNA-binding motif 4 in AaRNaseIII; substitution of VL-2 with the analogous region from the GiDicer RIIIDb domain completely abolishes the dsRNA cleavage activity of KpDcr1.
To verify whether EhRNaseIII cleaves its substrates via a cooperative model similar to that of KpDcr1, a protein-RNA crosslinking assay was carried out using RNA100 and EA256, which was chosen for its higher activity. As depicted in Fig. 7A, RNA alone has no impact on the migration of EA256 on the gel. EhRNaseIII functions as a dimer; however, DSS (disuccinimidyl suberate, the crosslinking reagent) alone does not lead to the formation of dimer bands, possibly owing to the lack of suitably positioned lysine residues at the dimerization interface. Interestingly, some faster-moving bands appeared at the bottom of the gel, which may be caused by DSS modification of the EhRNaseIII monomer. When both RNA and DSS are present, several bands corresponding to two, three, or more EhRNaseIII molecules appear. Loop A (aa 32–42) and loop B (aa 103–110, Supplementary Fig. S3) of EhRNaseIII correspond to the VL-1 and VL-2 loops of KpDcr1, respectively. Interestingly, loop B of EhRNaseIII is shorter than VL-2 by 12 aa; it is also 5 aa shorter than the corresponding loop in AaRNaseIII. As revealed by our EhRNaseIII structures, loop A and loop B are flexible and can undergo large conformational changes. Though it remains to be verified whether loop A and loop B are involved in substrate binding, our crosslinking assay indicates that EhRNaseIII functions through a cooperative mode.
In the AaRNaseIII-product structure, AaRNaseIII dimers were adjacently packed along the pseudocontinuous dsRNA formed by 11-nt RNAs, and the distance between the two active sites of the adjacent RNaseIII dimers was 22nt. The size of dsRNA products generated by KpDcr1 was 23nt, whereas the products were about 25nt in size in the presence of EhRNaseIII, suggesting that the RNA-binding mode of EhRNaseIII may not be exactly the same as that of AaRNaseIII and KpDcr1. To better illustrate the cooperative dsRNA binding mode, we built an EhRNaseIII-dsRNA binding model, depicted in Fig. 7B.
EhRNaseIII is the only RIIID-containing protein identified in E. histolytica. As revealed by our structural studies, EhRNaseIII lacks a typical dsRBD in its CTR and is a noncanonical Dicer protein. EhRNaseIII possesses several unique structural features, including the cross-talk between helixes α1 and α7 and a unique dimerization-enhancing mechanism. EhRNaseIII has a conserved RIIID core and, as revealed by in vitro catalytic assays, is active in the presence of Mn2+ and produces RNA products of ~25nt in length. These results indicate that EhRNaseIII may play a role in siRNA biogenesis in E. histolytica.
The size of the small RNAs identified in E. histolytica varies and includes RNAs of 16, 22, and 27nt. Unlike the RNA products generated in the in vitro assay, which possess a monophosphate group at the 5′-end, many of the small RNAs discovered in E. histolytica have a triphosphate group at their 5′-end, similar to siRNAs found in C. elegans. In C. elegans, the 5′-triphosphate-capped siRNAs are amplified by the RdRP-dependent secondary siRNA production pathway43,44, suggesting that the E. histolytica siRNAs are not the immediate products of EhRNaseIII cleavage. Besides EhRNaseIII and Ago proteins, homologs of other proteins involved in the secondary RNAi pathway, such as RdRP, also exist in E. histolytica. We speculate that the RdRP protein may be responsible for the 5′-triphosphate formation (Fig. 7C). In the RNAi pathway of higher eukaryotes, many RNaseIIIs have dsRBD-containing protein partners, such as DGCR8 for HsDrosha and TRBP for HsDicer, which play critical roles in small RNA biogenesis. No EhRNaseIII cleavage activity was observed in the presence of Mg2+ in our in vitro assay, whereas previous studies showed that Mg2+ can support the cleavage activity of EhRNaseIII in cell lysate43,44. The functions of many of the proteins expressed in E. histolytica remain to be characterized, and we speculate that one or more dsRBD-containing proteins may function as EhRNaseIII partners. These partner proteins may regulate the dsRNA-binding and cleavage activity of EhRNaseIII in vivo via their interaction with the CTR of EhRNaseIII. With the help of such partner proteins, EhRNaseIII may also be functional in the presence of Mg2+, which is the physiological cofactor of many known RNaseIII proteins.
The plasmid used for overproduction of the recombinant His-Sumo-EhRNaseIII was constructed as follows. The full-length gene of wild-type EhRNaseIII was PCR amplified from E. histolytica cDNA using two primers, EhRNaseIII-BamHI-1F and EhRNaseIII-SalI-256R. The product was double-digested with BamHI and SalI and cloned into the Sumo-tag-containing pET28 vector (Novagen), referred to as pET28-Sumo hereafter. The plasmid was then transformed into Escherichia coli strain BL21(DE3) and its sequence was confirmed by DNA sequencing. The CTR-truncated proteins EhRNaseIII194 and EhRNaseIII229 were constructed using the same procedure but with primer EhRNaseIII-SalI-256R replaced by EhRNaseIII-SalI-194R and EhRNaseIII-SalI-229R, respectively. The E119Q mutant was constructed by site-directed mutagenesis with two primers, EhRNaseIII-E119Q-F and EhRNaseIII-E119Q-R; the plasmid of the full-length wild-type His-Sumo-EhRNaseIII was used as the template. The DNA construct of the chimeric protein EhRNaseIII-AadsRBD (EA256, which contains the full-length EhRNaseIII followed by the dsRBD of AaRNaseIII) was generated by overlapping PCR. The EhRNaseIII and AaRNaseIII plasmids served as templates for two PCR reactions: (1) with primers EhRNaseIII-BamHI-1F and EhRNaseIII-256R-Aa, and (2) with primers AaRBD-145F and AaRBD-221-SalI-R, respectively. The amplicons of PCR reactions (1) and (2) were combined and used as the template for a third PCR reaction with primers EhRNaseIII-BamHI-1F and AaRBD-221-SalI-R. The resulting amplicon was cloned into the pET28-Sumo vector and transformed into E. coli BL21(DE3) competent cells for DNA sequencing and protein expression. A schematic diagram of the EhRNaseIII constructs is depicted in Supplementary Fig. S1, and the detailed sequences of the primers are listed in Supplementary Table S1.
All EhRNaseIII proteins were expressed and purified using identical procedures, as described below. Each recombinant strain was cultured at 37°C in 1L LB medium supplemented with 50μg/mL kanamycin, and protein expression was induced at OD600 ≈ 0.6 by the addition of isopropyl β-D-1-thiogalactopyranoside (IPTG, final concentration 0.2mM). The induced culture was then grown at 18°C overnight. The cells were harvested by centrifugation and resuspended in lysis buffer (20mM Tris-HCl pH 8.0, 500mM NaCl, 25mM imidazole pH 8.0). The cells were lysed under high pressure using a JNBIO homogenizer (Guangzhou Juneng Biology & Technology Co., Ltd.). The homogenate was clarified by centrifugation, and the supernatant was loaded onto a HisTrap™ HP column (GE Healthcare) and eluted with elution buffer (20mM Tris-HCl pH 8.0, 500mM NaCl, 500mM imidazole pH 8.0) using a linear gradient. The fractions containing the recombinant His-Sumo-EhRNaseIII protein were pooled and digested with Ulp1 protease while being dialyzed against buffer S (20mM Tris-HCl pH 8.0, 500mM NaCl). The protein was loaded onto a HisTrap™ HP column again to remove the cleaved His-Sumo tag. The flow-through containing the target EhRNaseIII protein was diluted to lower the NaCl concentration to 100mM and was then loaded onto a HiTrap Q column equilibrated with buffer A (20mM Tris-HCl pH 8.0, 100mM NaCl). The protein samples were eluted with buffer B (20mM Tris-HCl pH 8.0, 1M NaCl) using a linear gradient. The eluted sample was concentrated and loaded onto a HiLoad 16/60 Superdex™ 75 column (GE Healthcare) equilibrated with gel filtration buffer (10mM Tris-HCl pH 8.0, 100mM NaCl). The protein was concentrated using an Amicon-Ultra centrifugal device (Millipore) and its purity was analyzed by SDS-PAGE.
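The dilution step before ion exchange follows from a simple mass balance on NaCl. The sketch below shows the calculation; the diluent composition is not stated in the text, so a salt-free diluent is assumed by default, and the function name `diluent_volume` and the 10 mL example volume are hypothetical.

```python
def diluent_volume(sample_ml: float, c_start_mm: float, c_target_mm: float,
                   c_diluent_mm: float = 0.0) -> float:
    """Volume of diluent (mL) needed to bring a sample to a target NaCl level.

    Mass balance: sample_ml*c_start + v*c_diluent = (sample_ml + v)*c_target.
    The default salt-free diluent is an assumption, not stated in the text.
    """
    if not c_diluent_mm < c_target_mm <= c_start_mm:
        raise ValueError("need c_diluent < c_target <= c_start")
    return sample_ml * (c_start_mm - c_target_mm) / (c_target_mm - c_diluent_mm)


# Hypothetical 10 mL of flow-through in buffer S (500 mM NaCl), target 100 mM.
extra = diluent_volume(10.0, 500.0, 100.0)
print(f"add {extra:.1f} mL diluent, final volume {10.0 + extra:.1f} mL")
```

With a salt-free diluent, going from 500mM to 100mM NaCl is a five-fold dilution, that is, four volumes of diluent per volume of sample.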
For the overproduction of selenomethionine (Se-Met)-substituted EhRNaseIII194 protein (SeMet-EhRNaseIII194), the E. coli BL21(DE3) strain containing the recombinant pET28-Sumo-EhRNaseIII194 plasmid was grown in 100mL LB medium supplemented with 50μg/mL kanamycin at 37°C overnight. The next day, the cells were pelleted, resuspended in 2L M9 medium, and grown at 37°C. When the culture reached the early log phase (OD600 ≈ 0.6), the temperature was lowered to 18°C. One hour later, 0.2mM IPTG and 60mg/L Se-Met (J&K) were added to induce protein expression. The induced culture was then grown at 18°C overnight. The cells were pelleted by centrifugation and the SeMet-EhRNaseIII194 protein was purified using the same procedures as those used for the native proteins. DTT (1mM) was present in all purification buffers to prevent oxidation of Se.
Crystals of SeMet-EhRNaseIII194 were grown using the hanging-drop vapor diffusion method at 16°C. The droplets contained equal volumes of protein sample and reservoir solution [1.26M (NH4)2SO4, 0.1M Hepes pH 7.8]. Crystals of EhRNaseIII229 were obtained using the sitting-drop vapor diffusion method in a 3-drop intelliplate at 16°C. The drop contained 0.3μL protein solution and 0.3μL crystallization buffer [1.6M (NH4)2SO4, 0.02M MgCl2, 0.05M Tris-HCl pH 7.5]. To obtain the EhRNaseIII229-Mn2+ complex crystals, freshly grown EhRNaseIII229 crystals were transferred sequentially into crystallization buffer supplemented with 5mM and 10mM MnCl2 and soaked for 30min in each solution. The crystals were then transferred into crystallization buffer containing 10mM MnCl2 and 20% glycerol and soaked overnight. The soaked crystals were frozen by plunging them directly into liquid nitrogen. Crystals of SeMet-EhRNaseIII194 and EhRNaseIII229 were cryoprotected by dipping them quickly into their mother liquor supplemented with 20% glycerol and were flash frozen in liquid nitrogen.
All X-ray diffraction data were collected on beamlines BL17U and BL19U at the Shanghai Synchrotron Radiation Facility at cryogenic temperature. A single crystal was used in each case. The Se-Met single-wavelength anomalous diffraction data of SeMet-EhRNaseIII194 were collected at a wavelength of 0.97915Å. All other data were collected at a wavelength of 1.0000Å. All data processing was carried out with the HKL2000 program45, and the data collection and processing statistics are summarized in Table 1.
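The two wavelengths can be converted to photon energies with E = hc/λ, which makes the choice of the anomalous wavelength easier to read: 0.97915Å lies close to the selenium K absorption edge (about 12.66keV). The following is a minimal sketch; the function name `wavelength_to_kev` is hypothetical, and the constant is the standard value of hc in keV·Å.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV*Angstrom


def wavelength_to_kev(wavelength_angstrom: float) -> float:
    """Photon energy in keV for an X-ray wavelength given in Angstrom."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


for label, wavelength in [("Se-Met SAD", 0.97915), ("other data sets", 1.0000)]:
    print(f"{label}: {wavelength:.5f} A = {wavelength_to_kev(wavelength):.3f} keV")
```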
The SeMet-EhRNaseIII194 structure was solved using the Se-Met single-wavelength anomalous diffraction method46 with the AutoSol program47 embedded in the PHENIX suite48; the figure of merit was 0.49. The program identified six of the eight incorporated Se atoms and generated an initial model that covered more than 75% of the protein residues in the asymmetric unit. The side chains of the residues were manually built based on the electron density map using the graphics program Coot49. The partial model was then refined against the diffraction data using the Refmac5 program embedded in CCP4i50. A more complete model of SeMet-EhRNaseIII194 was built based on the improved density map resulting from the refinement. The EhRNaseIII229 and EhRNaseIII229-Mn2+ complex structures were solved by molecular replacement using the Phaser51 program of CCP4i, with the SeMet-EhRNaseIII194 structure as the search model. The resulting models were refined using Refmac5 and the phenix.refine program52 of PHENIX. During refinement, 5% of randomly selected data were set aside for free R-factor cross-validation calculations. The 2Fo-Fc and Fo-Fc electron density maps were regularly calculated and used as guides for building the missing residues. Water molecules were added either automatically or manually using Coot. Sulfate and metal ions were not modeled until the last few cycles of refinement. The Rwork and Rfree were 0.202 and 0.246, 0.207 and 0.240, and 0.197 and 0.240 for the SeMet-EhRNaseIII194, EhRNaseIII229, and EhRNaseIII229-Mn2+ structures, respectively. All residues were located in the favored or allowed regions of the Ramachandran plot. The detailed structure refinement statistics are summarized in Table 1.
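For readers unfamiliar with the Rwork and Rfree statistics, the sketch below illustrates the standard definition, R = Σ||Fo| − |Fc|| / Σ|Fo|, together with a random 5% test set. It is a toy illustration only: Refmac5 and phenix.refine apply scaling, weighting, and their own test-set flags, none of which is reproduced here, and the function names `r_factor` and `split_free_set` are hypothetical.

```python
import random


def r_factor(f_obs: list[float], f_calc: list[float]) -> float:
    """Crystallographic R-factor: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def split_free_set(n_reflections: int, fraction: float = 0.05, seed: int = 0):
    """Randomly reserve a fraction of reflection indices as the free (test) set."""
    rng = random.Random(seed)
    n_free = round(n_reflections * fraction)
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


work, free = split_free_set(1000)
print(len(work), len(free))
```

Rwork is the R-factor computed over the working reflections used in refinement; Rfree is the same quantity computed over the reserved reflections, which are never used to adjust the model.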
All the RNAs used in this work (Supplementary Table S1) were produced by in vitro transcription catalyzed by T7 RNA polymerase53. The pUC19 plasmids containing the target sequences were ordered from Shanghai GENERAY Biotech, amplified, and purified using a Miniprep Kit (Qiagen) according to the manufacturer’s instructions. Prior to in vitro transcription, the templates were linearized using SmaI, extracted with phenol-chloroform, and precipitated with ethanol. The transcription reactions were carried out at 37°C for 5h. In some cases, α-32P-UTP [at a 100:1 molar ratio of UTP:α-32P-UTP (3000Ci/mmol)] was included in the reaction to generate 32P body-labeled ssRNAs. Reactions were quenched by the addition of 1μL 0.5M EDTA. The template DNAs were digested with DNase I, and the samples were then resolved by denaturing PAGE (using a 15% gel for 25- and 50-nt ssRNAs, and a 10% gel for 70- and 100-nt ssRNAs). UV254-shadowing (over Xerox paper) was used to visualize the RNAs. The target RNAs were eluted from gel slices with elution buffer (0.02% SDS, 1mM EDTA, 0.3M NaAc) overnight at 37°C and precipitated using ethanol. ssRNAs were dissolved in RNase-free ddH2O and the concentrations were determined using an ultraviolet spectrometer.
The dsRNAs RNA25, RNA50, RNA70, and RNA100, which are 25, 50, 70, and 100bp long, respectively, were generated by annealing complementary ssRNAs. The complementary RNAs were mixed at a molar ratio of 1:1 in annealing buffer (30mM Tris-HCl pH 7.5, 100mM NaCl, 1mM EDTA). The mixture was heated at 95°C for 2min and slowly cooled to room temperature. Annealed dsRNAs were fractionated by native PAGE and detected by autoradiography or ultraviolet shadowing. dsRNAs were eluted from gel slices with elution buffer (0.02% SDS, 1mM EDTA, 0.3M NaAc), ethanol precipitated, and stored in storage buffer (10mM Tris-HCl pH 7.5, 10mM NaCl, 0.1mM EDTA).
Binding of EhRNaseIIIs, including EhRNaseIII256 (the full-length EhRNaseIII), EhRNaseIII229, EhRNaseIII194, and chimeric protein EA256, to the dsRNA substrates was monitored using electrophoretic mobility shift assays (EMSA). Five microliters of EhRNaseIII, 3μL dsRNA, and 2μL 5×binding buffer (150mM Tris-HCl pH 7.5, 150mM NaCl, 25mM MgCl2, 5mM DTT, 0.5mM EDTA, 25% glycerol) were mixed in a thin-wall Eppendorf tube. The final concentrations of EhRNaseIIIs and dsRNAs are indicated on the figures. The reaction mixtures were incubated at room temperature for 10min followed by incubation on ice for an additional 20min. Samples were loaded onto a pre-cooled 6% native polyacrylamide gel. Gels were run at 160V for 3–4h at 4°C in 0.5×TBE buffer supplemented with 5mM MgCl2. RNAs were visualized by phosphorimaging using a Typhoon 9000 (GE Healthcare) (for Fig. 4) or by staining with Gelred (Biotium) (for Figs 5 and 6).
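The working concentrations of the binding buffer follow from diluting 2μL of the 5× stock into the 10μL reaction. A minimal sketch of that arithmetic is given below; it counts only the contribution of the 5× buffer and ignores whatever the protein and RNA storage buffers add, and the helper name `final_concentrations` is hypothetical.

```python
def final_concentrations(stock: dict[str, float], stock_ul: float,
                         total_ul: float) -> dict[str, float]:
    """Scale each stock concentration by the fraction of the final volume it occupies."""
    if not 0 < stock_ul <= total_ul:
        raise ValueError("stock volume must be positive and no larger than the total")
    return {name: conc * stock_ul / total_ul for name, conc in stock.items()}


# 5x binding buffer as listed in the text (mM, except glycerol in percent).
BINDING_BUFFER_5X = {
    "Tris-HCl pH 7.5 (mM)": 150.0,
    "NaCl (mM)": 150.0,
    "MgCl2 (mM)": 25.0,
    "DTT (mM)": 5.0,
    "EDTA (mM)": 0.5,
    "glycerol (%)": 25.0,
}

# 2 uL of 5x buffer in a 10 uL reaction (5 uL protein + 3 uL dsRNA + 2 uL buffer).
emsa_1x = final_concentrations(BINDING_BUFFER_5X, stock_ul=2.0, total_ul=10.0)
for component, value in emsa_1x.items():
    print(f"{component}: {value:g}")
```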
Ten-microliter samples consisting of 2μL 5×reaction buffer (150mM Tris-HCl pH 7.5, 150mM NaCl, 50mM MnCl2, 5mM DTT, 0.5mM EDTA), 5μL protein, and 3μL RNA (3μM for RNA25, and 1μM for RNA50, RNA70, and RNA100) were mixed and incubated at 37°C; the protein concentrations and incubation times are given in the figure legends. Reactions were quenched by the addition of 10μL loading buffer (90% formamide, 18mM EDTA, 0.025% SDS, 0.02% bromophenol blue). Samples were heated at 95°C for 5min, centrifuged, and loaded onto an 8M urea 10% polyacrylamide gel. Gels were run at 10W for 50min in 0.5×TBE and stained with Gelred for 15min. RNAs were detected using a Gel-Imaging system (Bio-Rad).
Four-microliter samples consisting of 1μL 5×reaction buffer (150mM Hepes-NaOH pH 7.6, 150mM NaCl, 25mM MgCl2, 5mM DTT, 0.5mM EDTA), 2μL EA256 protein (50μM), and 1μL RNA100 (10μM or 20μM) were mixed and incubated at room temperature for 30min. One microliter of diluted DSS (disuccinimidyl suberate, Sigma) was then added to give a final DSS concentration of 50μM, 100μM, or 200μM. The final concentration of RNA100, when present, was 2μM or 4μM. The samples were incubated at room temperature for an additional 20min and quenched by the addition of 1.25μL SDS-loading buffer. Samples were heated at 95°C for 5min, centrifuged, and loaded onto a 10% SDS-PAGE gel. Gels were run at 200V for 60min in 1×SDS running buffer and stained with Coomassie brilliant blue.
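The concentrations quoted for the crosslinking reaction can be cross-checked against the 5μL final volume (4μL sample plus 1μL DSS). The sketch below does this and also back-calculates the DSS stock concentrations that would give the stated final values; those stock values and the EA256 final concentration are derived here and are not stated in the text, and the helper names are hypothetical.

```python
def final_conc(stock_um: float, added_ul: float, total_ul: float) -> float:
    """Final concentration (uM) after adding `added_ul` of stock to a `total_ul` reaction."""
    return stock_um * added_ul / total_ul


def required_stock(final_um: float, added_ul: float, total_ul: float) -> float:
    """Stock concentration (uM) needed to reach `final_um` with `added_ul` of stock."""
    return final_um * total_ul / added_ul


# 1 uL 5x buffer + 2 uL EA256 + 1 uL RNA100 + 1 uL DSS
TOTAL_UL = 1.0 + 2.0 + 1.0 + 1.0

rna_final = [final_conc(c, 1.0, TOTAL_UL) for c in (10.0, 20.0)]
ea256_final = final_conc(50.0, 2.0, TOTAL_UL)  # derived, not stated in the text
dss_stocks = [required_stock(c, 1.0, TOTAL_UL) for c in (50.0, 100.0, 200.0)]

print("RNA100 final (uM):", rna_final)
print("EA256 final (uM):", ea256_final)
print("DSS stock needed (uM):", dss_stocks)
```

The RNA100 values reproduce the 2μM and 4μM final concentrations given in the text, and 1μL of 5× buffer in 5μL corresponds to a 1× working concentration.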
How to cite this article: Yu, X. et al. Structural and functional studies of a noncanonical Dicer from Entamoeba histolytica. Sci. Rep. 7, 44832; doi: 10.1038/srep44832 (2017).
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We thank Prof. Upinder Singh of Stanford University School of Medicine for providing the EhRNaseIII cDNA. We thank the BL17U and BL19U beamline staff at the Shanghai Synchrotron Radiation Facility for help during data collection and Prof. Xinhua Ji and members of the Gan and Ma laboratories for insightful discussions. This work was supported by the National Natural Science Foundation of China (31370728 and 31230041), the National Basic Research Program of China (2011CB966304 and 2012CB910502), the National Postdoctoral Program for Innovative Talents (BX201600034), and the Key Research and Development Project of China (2016YFA0500600).
The authors declare no competing financial interests.
Author Contributions: X.Y., X.H.L. and L.N.Z. performed the experiments. Y.H., J.B.M. and J.H.G. determined the crystal structures and analyzed all the experimental data. Y.H., J.B.M. and J.H.G. designed the experiments and wrote the manuscript.