Free group II introns are infectious retroelements that can bind and insert themselves into RNA and DNA molecules via reverse-splicing. Here we report the 3.4Å crystal structure of a complex between an oligonucleotide target substrate and a group IIC intron, as well as the refined free intron structure. The structure of the complex reveals the conformation of motifs involved in exon recognition by group II introns.
The discovery that RNA molecules can fold into complex structures and carry out diverse cellular roles has led to interest in developing tools for modeling RNA tertiary structure. While significant progress has been made in establishing that the RNA backbone is rotameric, few libraries of discrete conformations specifically for use in RNA modeling have been validated. Here, we present six libraries of discrete RNA conformations based on a simplified pseudo-torsional notation of the RNA backbone, comparable to phi and psi in the protein backbone. We evaluate the ability of each library to represent single nucleotide backbone conformations and we show how individual library fragments can be assembled into dinucleotides that are consistent with established RNA backbone descriptors spanning from sugar to sugar. We then use each library to build all-atom models of 20 test folds and we show how the composition of a fragment library can limit model quality. Despite the limitations inherent in using discretized libraries, we find that several hundred discrete fragments can rebuild RNA folds up to 174 nucleotides in length with atomic-level accuracy (<1.5Å RMSD). We anticipate the libraries presented here could easily be incorporated into RNA structural modeling, analysis, or refinement tools.
RNA structure; RNA backbone conformation; RNA fragment library; RNA modeling
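The accuracy threshold quoted in the abstract above (<1.5 Å RMSD) can be illustrated with a minimal Python sketch. It assumes that the model and reference coordinates are already superposed and listed in the same atom order. The function name `rmsd` and the toy coordinates are hypothetical and are not taken from the published libraries.

```python
import math

def rmsd(model, reference):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) coordinates, in the units of the input (e.g. angstroms).
    Assumption: the two sets are already superposed and atom-matched."""
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(model, reference):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(model))

# Toy example (hypothetical coordinates): every atom displaced by 1 angstrom along z.
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
within_threshold = rmsd(model, reference) < 1.5  # threshold quoted in the abstract
```

In practice an optimal superposition step would normally precede this calculation; it is omitted here to keep the sketch short.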
The D135 group II intron ribozyme follows a unique folding pathway that is direct and appears to be devoid of kinetic traps. During the earliest stages of folding, D135 collapses slowly to a compact intermediate, and all subsequent assembly events are rapid. Collapse of intron Domain 1 (D1) has been shown to limit the rate constant for D135 folding, although the specific substructure of the D1 kinetic intermediate has not yet been identified. Employing time-resolved Nucleotide Analog Interference Mapping (NAIM), we have identified a cluster of atoms within the D1 main stem that control the rate constant for D135 collapse. Functional groups within the κ–ζ element are particularly important for this earliest stage of folding, which is intriguing given that this same motif also serves later as the docking site for catalytic Domain 5 (D5). Importantly, the κ–ζ element is shown to be a divalent ion binding pocket, indicating that this region is a Mg2+-dependent switch that initiates the cascade of D135 folding events. By measuring the Mg2+ dependence of the compaction rate constant, we conclude that the actual rate-limiting step in D1 compaction involves the formation of an unstable folding intermediate that is captured by the binding of Mg2+. This carefully orchestrated folding pathway, in which formation of an active-site docking region is early and rate-limiting, ensures proper folding of the intron core and faithful splicing. It may represent an important paradigm for the folding of large, multidomain RNA molecules.
ribozyme; RNA folding; catalysis; splicing; RNA structure
Background: RIG-I is an essential innate immune receptor that detects viral RNAs in infected cells.
Results: RIG-I uses distinct subdomains to recognize specific characteristics of viral RNAs.
Conclusion: The 5′-triphosphate is critical for high affinity RIG-I/RNA interaction.
Significance: Characterizing the RIG-I/RNA interface is essential for understanding early stages of immune response against RNA viruses.
RIG-I is a cytoplasmic surveillance protein that contributes to the earliest stages of the vertebrate innate immune response. The protein specifically recognizes 5′-triphosphorylated RNA structures that are released into the cell by viruses, such as influenza and hepatitis C. To understand the energetic basis for viral RNA recognition by RIG-I, we studied the binding of RIG-I domain variants to a family of dsRNA ligands. Thermodynamic analysis revealed that the isolated RIG-I domains each make important contributions to affinity and that they interact using different strategies. Covalent linkage between the domains enhances RNA ligand specificity while reducing overall binding affinity, thereby providing a mechanism for discriminating virus from host RNA.
ATPases; Interferon; RNA Helicase; RNA-Protein Interaction; Viral Immunology; RNA Triphosphate
RCrane is a new tool for the partially automated building of RNA crystallographic models into electron-density maps of low or intermediate resolution. This tool helps crystallographers to place phosphates and bases into electron density and then automatically predicts and builds the detailed all-atom structure of the traced nucleotides.
RNA crystals typically diffract to much lower resolutions than protein crystals. This low-resolution diffraction results in unclear density maps, which cause considerable difficulties during the model-building process. These difficulties are exacerbated by the lack of computational tools for RNA modeling. Here, RCrane, a tool for the partially automated building of RNA into electron-density maps of low or intermediate resolution, is presented. This tool works within Coot, a common program for macromolecular model building. RCrane helps crystallographers to place phosphates and bases into electron density and then automatically predicts and builds the detailed all-atom structure of the traced nucleotides. RCrane then allows the crystallographer to review the newly built structure and select alternative backbone conformations where desired. This tool can also be used to automatically correct the backbone structure of previously built nucleotides. These automated corrections can fix incorrect sugar puckers, steric clashes and other structural problems.
RCrane; RNA model building
Intracellular RIG-I-like receptors (RLRs, including RIG-I, MDA-5, and LGP-2) recognize viral RNAs as pathogen-associated molecular patterns (PAMPs) and initiate an antiviral immune response. To understand the molecular basis of this process, we determined the crystal structure of RIG-I in complex with double-stranded RNA. The dsRNA is sheathed within a network of protein domains that include a conserved “helicase” domain (regions HEL1 and HEL2), a specialized insertion domain (HEL2i), and a C-terminal regulatory domain (CTD). A V-shaped pincer connects HEL2 and the CTD by gripping an α-helical shaft that extends from HEL1. In this way, the pincer coordinates functions of all the domains and couples RNA binding with ATP hydrolysis. RIG-I falls within the Dicer-RIG-I clade of superfamily 2 helicases, and this structure reveals a complex interplay between motor domains, accessory mechanical domains and RNA that has implications for understanding the nanomechanical function of this protein family and of other ATPases more broadly.
RIG-I; RNA helicase; innate immunity; X-ray crystallography
Mss116 is a Saccharomyces cerevisiae mitochondrial DEAD-box RNA helicase protein essential for efficient in vivo splicing of all group I and II introns and activation of mRNA translation. Catalysis of intron splicing by Mss116 is coupled to its ATPase activity. Knowledge of the kinetic pathway(s) and biochemical intermediates populated during RNA-stimulated Mss116 ATPase is fundamental for defining how Mss116 ATP utilization is linked to in vivo function. We therefore measured the rate and equilibrium constants underlying Mss116 ATP utilization and nucleotide-linked RNA binding. RNA accelerates the Mss116 steady-state ATPase ~7-fold by promoting rate-limiting ATP hydrolysis, such that Pi release becomes (partially) rate-limiting. RNA binding displays strong thermodynamic coupling to the chemical states of the Mss116-bound nucleotide such that Mss116 with bound ADP-Pi binds RNA more strongly than with bound ADP or in the absence of nucleotide. The predominant biochemical intermediate populated during in vivo steady-state cycling is the strong RNA binding, Mss116-ADP-Pi state. Strong RNA binding allows Mss116 to fulfill its biological role in stabilization of group II intron folding intermediates. ATPase cycling allows for transient population of the weak RNA binding, ADP state of Mss116 and linked dissociation from RNA, which is required for the final stages of intron folding. In cases where Mss116 functions as a helicase, the data collectively favor a model in which ATP hydrolysis promotes a weak-to-strong RNA binding transition that disrupts stable RNA duplexes. The subsequent strong-to-weak RNA binding transition associated with Pi release dissociates RNA-Mss116 complexes, regenerating free Mss116.
RNA helicase; ATPase cycle; kinetics; fluorescence correlation spectroscopy (FCS)
Hepatitis C virus NS3-4A is a membrane-bound enzyme complex that exhibits serine protease, RNA helicase, and RNA-stimulated ATPase activities. This enzyme complex is essential for viral genome replication and has been recently implicated in virus particle assembly. To help clarify the role of NS4A in these processes, we conducted alanine scanning mutagenesis on the C-terminal acidic domain of NS4A in the context of a chimeric genotype 2a reporter virus. Of 13 mutants tested, two (Y45A and F48A) had severe defects in replication, while seven (K41A, L44A, D49A, E50A, M51A, E52A, and E53A) efficiently replicated but had severe defects in virus particle assembly. Multiple strategies were used to identify second-site mutations that suppressed these NS4A defects. The replication defect of NS4A F48A was partially suppressed by mutation of NS4B I7F, indicating that a genetic interaction between NS4A and NS4B contributes to RNA replication. Furthermore, the virus assembly defect of NS4A K41A was suppressed by NS3 Q221L, a mutation previously implicated in overcoming other virus assembly defects. We therefore examined the known enzymatic activities of wild-type or mutant forms of NS3-4A but did not detect specific defects in the mutants. Taken together, our data reveal interactions between NS4A and NS4B that control genome replication and between NS3 and NS4A that control virus assembly.
Multiple studies hypothesize that DEAD-box proteins facilitate folding of the ai5γ group II intron. However, these conclusions are generally inferred from splicing kinetics, and not from direct monitoring of DEAD-box protein-facilitated folding of the intron. Using native gel electrophoresis and DMS structural probing, we monitored Mss116-facilitated folding of ai5γ intron ribozymes and a catalytically active self-splicing RNA containing the full-length intron and short exons. We found that the protein directly stimulates folding of these RNAs by accelerating formation of the compact near-native state. This process occurs in an ATP-independent manner, although ATP is required for protein turnover. As Mss116 binds RNA non-specifically, most binding events do not result in the formation of the compact state, and ATP is required for the protein to dissociate from such non-productive complexes and rebind the unfolded RNA. Results obtained from experiments at different concentrations of magnesium ions suggest that Mss116 stimulates folding of ai5γ ribozymes by promoting the formation of unstable folding intermediates, which is then followed by a cascade of folding events resulting in the formation of the compact near-native state. DMS probing results suggest that the compact state formed in the presence of the protein is identical to the near-native state formed more slowly in its absence. Our results also indicate that Mss116 does not stabilize the native state of the ribozyme, but that such stabilization results from binding of attached exons.
ribozyme; RNA folding; DEAD-box protein; tertiary structure
Group II introns are self-splicing ribozymes that excise themselves from precursor RNAs and catalyze the joining of flanking exons. Excised introns can behave as parasitic RNA molecules, catalyzing their own insertion into DNA and RNA via a reverse-splicing reaction. Previous studies have identified mechanistic roles for various functional groups located in the catalytic core of the intron and within target molecules. Here we introduce a new method for synthesizing long RNA molecules with a modified nucleotide at the 3′-terminus. This modification allows us to examine the mechanistic role of functional groups adjacent to the reaction nucleophile. During reverse-splicing, the 3′-OH group of the intron terminus attacks the phosphodiester linkage of spliced exon sequences. Here we show that the adjacent 2′-OH group on the intron terminus plays an essential role in activating the nucleophile by stripping away a proton from the 3′-OH and then shuttling it away from the active site.
The autocatalytic group II intron ai5γ from Saccharomyces cerevisiae self-splices under high-salt conditions in vitro, but requires the assistance of the DEAD-box protein Mss116 in vivo and under near-physiological conditions in vitro. Here, we show that Mss116 influences the folding mechanism in several ways. By comparing intron precursor RNAs with long (∼300 nt) and short (∼20 nt) exons, we observe that long exon sequences are a major obstacle for self-splicing in vitro. Kinetic analysis indicates that Mss116 not only mitigates the inhibitory effects of long exons, but also assists folding of the intron core. Moreover, a mutation in conserved Motif III that impairs unwinding activity (SAT → AAA) only affects the construct with long exons, suggesting that helicase unwinding plays a role in exon unfolding, but not in intron folding. Strong parallels between Mss116 and the related protein Cyt-19 from Neurospora crassa suggest that these proteins form a subclass of DEAD-box proteins that possess a versatile repertoire of diverse activities for resolving the folding problems of large RNAs.
RNA helicases are proteins essential to almost every facet of RNA metabolism, including the gene-silencing pathways that employ small RNAs. A phylogenetically related group of helicases is required for the RNA-silencing mechanism in Caenorhabditis elegans. Dicer-related helicase 3 (DRH-3) is a Dicer-RIG-I family protein that is essential for RNA silencing and germline development in nematodes. Here we performed a biochemical characterization of the ligand binding and catalytic activities of DRH-3 in vitro. We identify signature motifs specific to this family of RNA helicases. We find that DRH-3 binds both single-stranded and double-stranded RNAs with high affinity. However, the ATPase activity of DRH-3 is stimulated only by double-stranded RNA. DRH-3 is a robust RNA-stimulated ATPase with a kcat value of 500/min when stimulated with short RNA duplexes. The DRH-3 ATPase may have allosteric regulation in cis that is controlled by the stoichiometry of double-stranded RNA to enzyme. We observe that the DRH-3 ATPase is stimulated only by duplexes containing RNA, suggesting a role for DRH-3 during or after transcription. Our findings provide clues to the role of DRH-3 during the RNA interference response in vivo.
ATPases; Double-stranded RNA; RNA Helicase; RNA Interference (RNAi); siRNA
The superfamily 2 vaccinia viral helicase nucleoside triphosphate phosphohydrolase-II (NPH-II) exhibits robust RNA helicase activity but typically displays little activity on DNA substrates. NPH-II is thus believed to make primary contacts with backbone residues of an RNA substrate. We report an unusual nucleobase bias, previously unreported in any superfamily 1 or 2 helicase, whereby purines are heavily preferred as components of both RNA and DNA tracking strands. The observed sequence bias allows NPH-II to efficiently unwind a DNA·RNA hybrid containing a purine-rich DNA track derived from the 3′-untranslated region of an early vaccinia gene. These results provide insight into potential biological functions of NPH-II and the role of sequence in targeting NPH-II to appropriate substrates. Furthermore, they demonstrate that in addition to backbone contacts, nucleotide bases play an important role in modulating the behavior of NPH-II. They also establish that processive helicase enzymes can display sequence selectivity.
DNA Helicase; Molecular Motors; Protein-Nucleic Acid Interaction; RNA Helicase; Transcription Termination; DExH Helicase; NPH-II; Helicase Activity; Sequence Dependence
Nonstructural protein 3 (NS3) is an essential replicative component of the hepatitis C virus (HCV) and a member of the DExH/D-box family of proteins. The C-terminal region of NS3 (NS3hel) exhibits RNA-stimulated NTPase and helicase activity, while the N-terminal serine protease domain of NS3 enhances RNA binding and unwinding by NS3hel. The nonstructural protein 4A (NS4A) binds to the NS3 protease domain and serves as an obligate cofactor for NS3 serine protease activity. Given its role in stimulating protease activity, we sought to determine whether NS4A also influences the activity of NS3hel. Here we show that NS4A enhances the ability of NS3hel to bind RNA in the presence of ATP, thereby acting as a cofactor for helicase activity. This effect is mediated by amino acids in the C-terminal acidic domain of NS4A. When these residues are mutated, one observes drastic reductions in ATP-coupled RNA binding and duplex unwinding by NS3. These same mutations are lethal in HCV replicons, thereby establishing in vitro and in vivo that NS4A plays an important role in the helicase mechanism of NS3 and its function in replication.
Quantitatively describing RNA structure and conformational elements remains a formidable problem. Seven standard torsion angles and the sugar pucker are necessary to completely characterize the conformation of an RNA nucleotide. Progress has been made toward understanding the discrete nature of RNA structure, but classifying simple and ubiquitous structural elements such as helices and motifs remains a difficult task. One approach for describing RNA structure in a simple, mathematically consistent, and computationally accessible manner involves the invocation of two pseudotorsions, η (C4′ₙ₋₁, Pₙ, C4′ₙ, Pₙ₊₁) and θ (Pₙ, C4′ₙ, Pₙ₊₁, C4′ₙ₊₁), which can be used to describe RNA conformation in much the same way that ϕ and ψ are used to describe backbone configuration of proteins. Here we conduct an exploration and statistical evaluation of pseudotorsional space and of the Ramachandran-like η−θ plot. We show that, through the rigorous quantitative analysis of the η−θ plot, the pseudotorsional descriptors η and θ, together with sugar pucker, are sufficient to describe RNA backbone conformation fully in most cases. These descriptors are also shown to contain considerable information about nucleotide base conformation, revealing a previously uncharacterized interplay between backbone and base orientation. A window function analysis is used to discern statistically relevant regions of density in the η−θ scatter plot and then nucleotides in colocalized clusters in the η−θ plane are shown to have similar three-dimensional structures through RMSD analysis of the RNA structural constituents. We find that major clusters in the η−θ plot are few in number, thereby underscoring the discrete nature of RNA backbone conformation.
Like the Ramachandran plot, the η−θ plot is a valuable system for conceptualizing biomolecular conformation, it is a useful tool for analyzing RNA tertiary structures, and it is a vital component of new approaches for solving the three-dimensional structures of large RNA molecules and RNA assemblies.
RNA structure; reduced representation; pseudotorsions; Ramachandran; cluster analysis
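The pseudotorsion definitions given in the abstract above, η (C4′ of residue n−1, P of n, C4′ of n, P of n+1) and θ (P of n, C4′ of n, P of n+1, C4′ of n+1), can be sketched as ordinary four-point dihedral angles. The Python below is a minimal illustration only: the function names, the tuple-based coordinate format, and the choice to report angles in the range 0 to 360 degrees are assumptions, not part of the published analysis.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, mapped to [0, 360)) defined by four
    points, using the standard atan2 formulation."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x)) % 360.0

def eta_theta(c4_prev, p, c4, p_next, c4_next):
    """Return (eta, theta) for nucleotide n from five backbone atoms:
    C4'(n-1), P(n), C4'(n), P(n+1), C4'(n+1)."""
    eta = dihedral(c4_prev, p, c4, p_next)
    theta = dihedral(p, c4, p_next, c4_next)
    return eta, theta

# Hypothetical, geometrically simple coordinates (not from a real structure).
eta, theta = eta_theta((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1), (1, 1, 1))
```

Because each angle needs only P and C4′ positions, a sketch like this can be applied to coarse backbone traces as well as to full all-atom models.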
Tetraloops are a common building block for RNA tertiary structure and most tetraloops fall into one of three well-characterized classes: GNRA, UNCG, and CUYG. Here, we present the sequence and structure of a fourth highly conserved class of tetraloop that occurs only within the ζ-ζ′ interaction of group IIC introns. This GANC tetraloop was identified, along with an unusual cognate receptor, in the crystal structure of the group IIC intron and through phylogenetic analysis of intron RNA sequence alignments. Unlike conventional tetraloop-receptor interactions, which are stabilized by extensive hydrogen bonding interactions, the GANC-receptor interaction is limited to a single base stack between the conserved adenosine of the tetraloop and a single purine of the receptor, which consists of a one to three nucleotide bulge and does not contain an A-platform. Unlike GNRA tetraloops, the GANC tetraloop forms a sharp angle relative to the adjacent helix, bending by approximately 45° towards the major groove side of the helix. These structural attributes allow GANC tetraloops to fit precisely within the group IIC intron core, thereby demonstrating that structural motifs can adapt to function in a specific niche.
tetraloop; motif; RNA structure; group II intron; ribozyme
Non-structural protein 3 (NS3) is a multifunctional enzyme possessing serine protease, NTPase, and RNA unwinding activities that are required for hepatitis C viral (HCV) replication. HCV non-structural protein 4A (NS4A) binds to the N-terminal NS3 protease domain to stimulate NS3 serine protease activity. In addition, the NS3 protease domain enhances the RNA binding, ATPase, and RNA unwinding activities of the C-terminal NS3 helicase domain (NS3hel). To determine whether NS3hel enhances the NS3 serine protease activity, we purified truncated and full-length NS3-4A complexes and examined their serine protease activities under a variety of salt and pH conditions. Our results indicate that the helicase domain enhances serine protease activity, just as the protease domain enhances helicase activity. Thus, the two enzymatic domains of NS3-4A are highly interdependent. This is the first time that such a complete interdependence has been demonstrated for a multifunctional, single chain enzyme. NS3-4A domain interdependence has important implications for function during the viral lifecycle as well as for the design of inhibitor screens that target the NS3-4A protease.
The folding of group II intron ribozymes has been studied extensively under optimal conditions for self-splicing in vitro (42 °C and high magnesium ion concentrations). In these cases, the ribozymes fold directly to the native state by an apparent two-state mechanism involving the formation of an obligate intermediate within intron Domain 1. We have now characterized the folding pathway under near-physiological conditions. We observe that compaction of the RNA proceeds slowly to completion, even at low magnesium (3 mM). Kinetic analysis shows that this compact species is a “near-native” intermediate state that is readily chased into the native state by the addition of high salt. Structural probing reveals that the “near-native” state represents a compact Domain 1 scaffold that is not yet docked with the catalytic domains (D3 and D5). Interestingly, native ribozyme reverts to the “near-native” state upon reduction in magnesium concentration. Therefore, while the intron can sustain the intermediate state under physiological conditions, the native structure is not maintained and is likely to require stabilization by protein cofactors in vivo.
ribozyme; RNA folding; kinetics; mechanism; RNA structure
Helicases are a ubiquitous class of enzymes involved in nearly all aspects of DNA and RNA metabolism. Despite recent progress in understanding their mechanism of action, limited resolution has left inaccessible the detailed mechanisms by which these enzymes couple the rearrangement of nucleic acid structures to the binding and hydrolysis of ATP [1,2]. Observing individual mechanistic cycles of these motor proteins is central to understanding their cellular functions. Here we follow in real time, at a resolution of two base pairs and 20 ms, the RNA translocation and unwinding cycles of a hepatitis C virus helicase (NS3) monomer. NS3 is a representative superfamily-2 helicase essential for viral replication [3], and therefore a potentially important drug target [4]. We show that the cyclic movement of NS3 is coordinated by ATP in discrete steps of 11 ± 3 base pairs, and that actual unwinding occurs in rapid smaller substeps of 3.6 ± 1.3 base pairs, also triggered by ATP binding, indicating that NS3 might move like an inchworm [5,6]. This ATP-coupling mechanism is likely to be applicable to other non-hexameric helicases involved in many essential cellular functions. The assay developed here should be useful in investigating a broad range of nucleic acid translocation motors.
Most RNA molecules collapse rapidly and reach the native state through a pathway that contains numerous traps and unproductive intermediates. The D135 group II intron ribozyme is unusual in that it can fold slowly and directly to the native state, despite its large size and structural complexity. Here we use hydroxyl radical footprinting and native gel analysis to monitor the timescale of tertiary structure collapse and to detect the presence of obligate intermediates along the folding pathway of D135. We find that structural collapse and native folding of Domain 1 precede assembly of the entire ribozyme, indicating that D1 contains an on-pathway intermediate to folding of the D135 ribozyme. Subsequent docking of Domains 3 and 5, for which D1 provides a preorganized scaffold, appears to be very fast, and the two docking events appear to be independent of one another. In contrast to other RNAs, the D135 ribozyme undergoes slow tertiary collapse to a compacted state, with a rate constant that is also limited by the formation of D1. These findings provide a new paradigm for RNA folding and they underscore the diversity of RNA biophysical behaviors.
NPH-II is a prototypical member of the DExH/D subgroup of superfamily II helicases. It exhibits robust RNA helicase activity, and a detailed kinetic framework for unwinding has been established. However, like most SF2 helicases, there is little known about its mode of substrate recognition and its ability to differentiate between RNA and DNA substrates. Here, we employ a series of chimeric RNA–DNA substrates to explore the molecular determinants for NPH-II specificity on RNA and to determine if there are conditions under which DNA is a substrate. We show that efficient RNA helicase activity depends exclusively on ribose moieties in the loading strand and in a specific section of the 3′-overhang. However, we also document the presence of trace activity on DNA polymers, showing that DNA can be unwound under extremely permissive conditions that favor electrostatic binding. Thus, while polymer-specific SF2 helicases control substrate recognition through specific interactions with the loading strand, alternative specificities can arise under appropriate reaction conditions.
Recurring RNA structural motifs are important sites of tertiary interaction and as such, are integral to RNA macromolecular structure. Although numerous RNA motifs have been classified and characterized, the identification of new motifs is of great interest. In this study, we discovered four new conformationally recurring motifs: the π-turn, the Ω-turn, the α-loop and the C2′-endo mediated flipped adenosine motif. Not only do they have complex and interesting structures, but they participate in contacts of high biological significance. In a first for the RNA field, new motifs were discovered by a fully automated algorithm. This algorithm, COMPADRES, utilized a reduced representation of the RNA backbone and was highly successful at discerning unique structural relationships. This study also shows that recurring RNA substructures are not necessarily accompanied by consistent primary or secondary structure.
Given the wealth of new RNA structures and the growing list of RNA functions in biology, it is of great interest to understand the repertoire of RNA folding motifs. The ability to identify new and known motifs within novel RNA structures, to compare tertiary structures with one another and to quantify the characteristics of a given RNA motif are major goals in the field of RNA research; however, there are few systematic ways to address these issues. Using a novel approach for visualizing and mathematically describing macromolecular structures, we have developed a means to quantitatively describe RNA molecules in order to rapidly analyze, compare and explore their features. This approach builds on the alternative η,θ convention for describing RNA torsion angles and is executed using a new program called PRIMOS. Applying this methodology, we have successfully identified major regions of conformational change in the 50S and 30S ribosomal subunits, we have developed a means to search the database of RNA structures for the prevalence of known motifs and we have classified and identified new motifs. These applications illustrate the powerful capabilities of our new RNA structural convention, and they suggest future adaptations with important implications for bioinformatics and structural genomics.
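One way to picture the kind of structure-to-structure comparison described in the abstract above is to score, nucleotide by nucleotide, how far two conformations sit apart in the η,θ plane. The Python sketch below is an illustration under stated assumptions and is not the PRIMOS implementation: the per-nucleotide score (a Euclidean distance on wrapped angular differences), the function names, and the sample angles are all hypothetical.

```python
import math

def wrap_deg(d):
    """Map an angular difference in degrees onto [-180, 180)."""
    return (d + 180.0) % 360.0 - 180.0

def eta_theta_differences(state_a, state_b):
    """Per-nucleotide distance in the eta,theta plane between two
    conformations of the same RNA. Each input is a list of (eta, theta)
    pairs in degrees, one pair per nucleotide, in matching order.
    Assumption: a simple Euclidean metric on wrapped differences."""
    if len(state_a) != len(state_b):
        raise ValueError("both conformations must cover the same nucleotides")
    scores = []
    for (eta_a, theta_a), (eta_b, theta_b) in zip(state_a, state_b):
        d_eta = wrap_deg(eta_a - eta_b)
        d_theta = wrap_deg(theta_a - theta_b)
        scores.append(math.hypot(d_eta, d_theta))
    return scores

# Hypothetical angles for three nucleotides in two conformational states.
state_a = [(170.0, 210.0), (350.0, 10.0), (165.0, 215.0)]
state_b = [(170.0, 210.0), (10.0, 350.0), (60.0, 215.0)]
scores = eta_theta_differences(state_a, state_b)
# Nucleotides with large scores mark candidate regions of conformational change.
changed = [i for i, s in enumerate(scores) if s > 25.0]  # cutoff is arbitrary
```

Wrapping the differences matters because η and θ are periodic: 350° and 10° differ by 20°, not 340°.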