Modular architecture is a hallmark of RNA structures, implying structural, and possibly functional, similarity among existing RNAs. To systematically delineate the existence of smaller topologies within larger structures, we develop and apply an efficient RNA secondary structure comparison algorithm using a newly developed two-dimensional RNA graphical representation. Our survey of similarity among 14 pseudoknots and subtopologies within ribosomal RNAs (rRNAs) uncovers eight pairs of structurally related pseudoknots with non-random sequence matches and reveals modular units in rRNAs. Significantly, three structurally related pseudoknot pairs have functional similarities not previously known: one pair involves the 3′ end of brome mosaic virus genomic RNA (PKB134) and the alternative hammerhead ribozyme pseudoknot (PKB173), both of which are replicase templates for viral RNA replication; the second pair involves structural elements for translation initiation and ribosome recruitment found in the viral internal ribosome entry site (PKB223) and the V4 domain of 18S rRNA (PKB205); the third pair involves 18S rRNA (PKB205) and viral tRNA-like pseudoknot (PKB134), which probably recruits ribosomes via structural mimicry and base complementarity. Additionally, we quantify the modularity of 16S and 23S rRNAs by showing that RNA motifs can be constructed from at least 210 building blocks. Interestingly, we find that the 5S rRNA and two tree modules within 16S and 23S rRNAs have similar topologies and tertiary shapes. These modules can be applied to design novel RNA motifs via build-up-like procedures for constructing sequences and folds.
RNA inverse folding is a computational technology for designing RNA sequences which fold into a user-specified secondary structure. Although pseudoknots are functionally important motifs in RNA structures, fewer reports have been published on the inverse folding of pseudoknotted RNAs than on pseudoknot-free RNA design. In this paper, we present a new version of our multi-objective genetic algorithm (MOGA), MODENA, which we have previously proposed for pseudoknot-free RNA inverse folding. In the new version of MODENA, (i) a new crossover operator is implemented and (ii) pseudoknot prediction methods, IPknot and HotKnots, are used to evaluate the designed RNA sequences, allowing us to perform the inverse folding of pseudoknotted RNAs. The new version of MODENA with the new crossover operator was benchmarked with a dataset composed of natural pseudoknotted RNA secondary structures, and we found that MODENA can successfully design more pseudoknotted RNAs than the other pseudoknot design algorithm. In addition, a sequence constraint function newly implemented in the new version of MODENA was tested by designing RNA sequences which fold into the pseudoknotted structure of a hepatitis delta virus ribozyme; as a result, we successfully designed eight RNA sequences. The new version of MODENA is downloadable from http://rna.eit.hirosaki-u.ac.jp/modena/.
inverse folding; pseudoknot; secondary structure; pseudobase; Rfam; sequence constraint
Translation of Hepatitis C viral proteins requires an internal ribosome entry site (IRES) located in the 5′ untranslated region of the viral mRNA. The core domain of the Hepatitis C virus (HCV) IRES contains a four-way helical junction that is integrated within a predicted pseudoknot. This domain is required for positioning the mRNA start codon correctly on the 40S ribosomal subunit during translation initiation. Here we present the crystal structure of this RNA, revealing a complex double-pseudoknot fold that establishes the alignment of two helical elements on either side of the four-helix junction. The conformation of this core domain constrains the open reading frame’s orientation for positioning on the 40S ribosomal subunit. This structure, representing the last major domain of HCV-like IRESs to be determined at near-atomic resolution, provides the basis for a comprehensive cryo-electron microscopy-guided model of the intact HCV IRES and its interaction with 40S ribosomal subunits.
The diverse landscape of RNA conformational space includes many canyons and crevices that are distant from the lowest minimum free energy valley and remain unexplored by traditional RNA structure prediction methods. A complete description of the entire RNA folding landscape can facilitate identification of biologically important conformations. The Crumple algorithm rapidly enumerates all possible non-pseudoknotted structures for an RNA sequence without consideration of thermodynamics while filtering the output with experimental data. The Crumple algorithm provides an alternative approach to traditional free energy minimization programs for RNA secondary structure prediction. A complete computation of all non-pseudoknotted secondary structures can reveal structures that would not be predicted by methods that sample the RNA folding landscape based on thermodynamic predictions. The free energy minimization approach is often successful but is limited by not considering RNA tertiary and protein interactions and the possibility that kinetics rather than thermodynamics determines the functional RNA fold. Efficient parallel computing and filters based on experimental data make practical the complete enumeration of all non-pseudoknotted structures. Efficient parallel computing for Crumple is implemented in a ring graph approach. Filters for experimental data include constraints from chemical probing of solvent accessibility, enzymatic cleavage of paired or unpaired nucleotides, phylogenetic covariation, and the minimum number and lengths of helices determined from crystallography or cryo-electron microscopy. The minimum number and length of helices have a significant effect on reducing conformational space. Pairing constraints reduce conformational space more than single nucleotide constraints.
Examples with Alfalfa Mosaic Virus RNA and Trypanosoma brucei guide RNA demonstrate the importance of evaluating all possible structures when pseudoknots, RNA-protein interactions, and metastable structures are important for biological function. Crumple software is freely available at http://adenosine.chem.ou.edu/software.html.
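As a rough illustration of what exhaustive enumeration of non-pseudoknotted structures involves, the following Python sketch lists every nested structure of a short sequence and then applies two simple filters of the kind described above. It is not the Crumple implementation: the function names, the pairing rules (Watson-Crick plus G-U), the minimum hairpin loop of three nucleotides, and the filter interface are all assumptions made for this example, and no parallel computing is attempted.

```python
from typing import Iterator, List, Sequence, Tuple

Pair = Tuple[int, int]

# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases in a hairpin loop


def enumerate_structures(seq: str, i: int = 0, j: int = None) -> Iterator[List[Pair]]:
    """Yield every pseudoknot-free structure of seq[i..j] as a list of base pairs."""
    if j is None:
        j = len(seq) - 1
    if j - i < MIN_LOOP + 1:  # segment too short to hold any pair
        yield []
        return
    # Case 1: position i is unpaired.
    yield from enumerate_structures(seq, i + 1, j)
    # Case 2: position i pairs with some k; the rest splits into two nested parts.
    for k in range(i + MIN_LOOP + 1, j + 1):
        if (seq[i], seq[k]) in PAIRS:
            for inner in enumerate_structures(seq, i + 1, k - 1):
                for outer in enumerate_structures(seq, k + 1, j):
                    yield [(i, k)] + inner + outer


def helix_lengths(pairs: Sequence[Pair]) -> List[int]:
    """Return the length of each helix (maximal run of directly stacked pairs)."""
    pair_set = set(pairs)
    lengths = []
    for i, j in sorted(pair_set):
        if (i - 1, j + 1) in pair_set:
            continue  # not the outermost pair of its helix
        n = 1
        while (i + n, j - n) in pair_set:
            n += 1
        lengths.append(n)
    return lengths


def keep_structure(pairs: Sequence[Pair], must_be_unpaired: Sequence[int] = (),
                   min_helices: int = 0, min_helix_len: int = 1) -> bool:
    """Hypothetical filter: probing-style unpaired positions plus a helix requirement."""
    paired = {pos for pair in pairs for pos in pair}
    if any(pos in paired for pos in must_be_unpaired):
        return False
    long_helices = sum(1 for n in helix_lengths(pairs) if n >= min_helix_len)
    return long_helices >= min_helices


if __name__ == "__main__":
    structures = list(enumerate_structures("GGAAACC"))
    kept = [s for s in structures if keep_structure(s, min_helices=1, min_helix_len=2)]
    print(len(structures), "structures in total,", len(kept), "after the helix filter")
```

The number of structures grows exponentially with sequence length, which is why filters that prune the search space matter so much in practice; this sketch filters only after enumeration and is therefore suitable for toy sequences only.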
Bovine viral diarrhea virus (BVDV) is the prototype representative of the pestivirus genus in the Flaviviridae family. It has been shown that the initiation of translation of BVDV RNA occurs by an internal ribosome entry mechanism mediated by the 5' untranslated region of the viral RNA. The 5' and 3' boundaries of the IRES of the cytopathic BVDV NADL have been mapped, and it has been suggested that the IRES extends into the coding region of the BVDV polyprotein. A putative pseudoknot structure has been recognized in the BVDV 5'UTR in close proximity to the AUG start codon. A pseudoknot structure is characteristic of flavivirus IRESes, and in the case of the closely related classical swine fever virus (CSFV) and the more distantly related Hepatitis C virus (HCV), pseudoknot function in translation has been demonstrated.
To characterize the BVDV IRESes in detail, we studied BVDV translational initiation by transfection of dicistronic expression plasmids into mammalian cells. A region coding for the amino terminus of the BVDV SD-1 polyprotein contributes considerably to efficient initiation of translation. The translation efficiency mediated by the IRES of BVDV strains NADL and SD-1 approximates that of poliovirus type I IRES-directed translation in BHK cells. Compared to the poliovirus IRES, the BVDV IRES of strain SD-1 mediates increased expression levels in murine cell lines and lower levels in human cell lines. Site-directed mutagenesis revealed that an RNA pseudoknot upstream of the initiator AUG is an important structural element for IRES function. Mutants with impaired ability to base pair in stem I or II lost their translational activity. In mutants with repaired base pairing in either stem I or stem II, full translational activity was restored. Thus, BVDV IRES translation is dependent on pseudoknot integrity. These features of the pestivirus IRES are reminiscent of those of the classical swine fever virus, a pestivirus, and the hepatitis C viruses, another genus of the Flaviviridae.
The IRES of the non-cytopathic BVDV SD-1 strain displays features known from other pestivirus IRESes. The predicted pseudoknot in the 5'UTR of BVDV SD-1 virus represents an important structural element in BVDV translation.
Key elements of the conformational switch model describing regulation of alfalfa mosaic virus (AMV) replication (R. C. Olsthoorn, S. Mertens, F. T. Brederode, and J. F. Bol, EMBO J. 18:4856-4864, 1999) have been tested using biochemical assays and functional studies in nontransgenic protoplasts. Although comparative sequence analysis suggests that the 3′ untranslated regions of AMV and ilarvirus RNAs have the potential to fold into pseudoknots, we were unable to confirm that a proposed pseudoknot forms or has a functional role in regulating coat protein-RNA binding or viral RNA replication. Published work has suggested that the pseudoknot is part of a tRNA-like structure (TLS); however, we argue that the canonical sequence and functional features that define the TLS are absent. We suggest here that the absence of the TLS correlates directly with the distinctive requirement for coat protein to activate replication in these viruses. Experimental data are evidence that elevated magnesium concentrations proposed to stabilize the pseudoknot structure do not block coat protein binding. Additionally, covarying nucleotide changes proposed to reestablish pseudoknot pairings do not rescue replication. Furthermore, as described in the accompanying paper (L. M. Guogas, S. M. Laforest, and L. Gehrke, J. Virol. 79:5752-5761, 2005), coat protein is not, by definition, inhibitory to minus-strand RNA synthesis. Rather, the activation of viral RNA replication by coat protein is shown to be concentration dependent. We describe the 3′ organization model as an alternate model of AMV replication that offers an improved fit to the available data.
The intergenic region internal ribosome entry site (IGR IRES) of the Dicistroviridae family adopts an overlapping triple pseudoknot structure to directly recruit the 80S ribosome in the absence of initiation factors. The pseudoknot I (PKI) domain of the IRES mimics a tRNA-like codon:anticodon interaction in the ribosomal P site to direct translation initiation from a non-AUG initiation codon in the A site. In this study, we have performed a comprehensive mutational analysis of this region to delineate the molecular parameters that drive IRES translation. We demonstrate that IRES-mediated translation can initiate at an alternate adjacent and overlapping start site, provided that basepairing interactions within PKI remain intact. Consistent with this, IGR IRES translation tolerates increases in the variable loop region that connects the anticodon- and codon-like elements within the PKI domain, as IRES activity remains relatively robust up to a 4-nucleotide insertion in this region. Finally, elements from an authentic tRNA anticodon stem-loop can functionally supplant corresponding regions within PKI. These results verify the importance of the codon:anticodon interaction of the PKI domain and further define the specific elements within the tRNA-like domain that contribute to optimal initiator Met-tRNAi-independent IRES translation.
Trans-translation releases stalled ribosomes from truncated mRNAs and tags defective proteins for proteolytic degradation using transfer-messenger RNA (tmRNA). This small stable RNA represents a hybrid of tRNA- and mRNA-like domains connected by a variable number of pseudoknots. Comparative sequence analysis of tmRNAs found in bacteria, plastids, and mitochondria provides considerable insights into their secondary structures. Progress toward understanding the molecular mechanism of template switching, which constitutes an essential step in trans-translation, is hampered by our limited knowledge about the three-dimensional folding of tmRNA.
To facilitate experimental testing of the molecular intricacies of trans-translation, which often require appropriately modified tmRNA derivatives, we developed a procedure for building three-dimensional models of tmRNA. Using comparative sequence analysis, phylogenetically-supported 2-D structures were obtained to serve as input for the program ERNA-3D. Motifs containing loops and turns were extracted from the known structures of other RNAs and used to improve the tmRNA models. Biologically feasible 3-D models for the entire tmRNA molecule could be obtained. The models were characterized by a functionally significant close proximity between the tRNA-like domain and the resume codon. Potential conformational changes which might lead to a more open structure of tmRNA upon binding to the ribosome are discussed. The method, described in detail for the tmRNAs of Escherichia coli, Bacillus anthracis, and Caulobacter crescentus, is applicable to every tmRNA.
Improved molecular models of biological significance were obtained. These models will guide the design of experiments and provide a better understanding of trans-translation. The comparative procedure described here for tmRNA is easily adapted for modeling the members of other RNA families.
In a previous study it was shown that RNase P from E. coli cleaves the tRNA-like structure of turnip yellow mosaic virus (TYMV) RNA in vitro (Guerrier-Takada et al. (1988) Cell, 53, 267-272). Cleavage takes place at the 3' side of the loop that crosses the deep groove of the pseudoknot structure present in the aminoacyl acceptor domain. In the present study fragments of TYMV RNA with mutations in the pseudoknot, generated by transcription in vitro, were tested for susceptibility to cleavage by RNase P. Changes in the specificity with respect to the site of cleavage and decreases in the rate of cleavage were observed with most of these substrates. The behaviour of various mutants in the reaction catalyzed by RNase P is in agreement with the present model of the TYMV RNA pseudoknot (Dumas et al. (1987), J. Biomol. Struct. Dyn. 263, 652-657). Base substitutions in the loop that crosses the shallow groove of the pseudoknot structure resulted, however, in an unexpected decrease in the rate of cleavage, probably due to conformational changes in the substrates. Studies on other tRNA-like structures revealed an important role in the reaction with RNase P for both the nucleotide at the 3' side of the loop that spans the deep groove and the nucleotide at position 4, which correspond to positions -1 and 73, respectively, in tRNA precursors.
Three tRNA-associated properties of a representative set of tymoviral RNAs have been quantitatively assessed using higher plant (wheat germ) proteins: aminoacylation, EF-1alpha*GTP binding, and 3'-adenylation of 3'-CC forms of the RNAs by CTP, ATP:tRNA nucleotidyltransferase. The RNAs fall into three classes differing in the extent of tRNA mimicry. Turnip yellow mosaic (TYMV) and kennedya yellow mosaic virus RNAs had activities in all three properties similar to those of a higher plant tRNAVal transcript, and thus are remarkable tRNA mimics. Although the isolated approximately 83 nt long tRNA-like structures showed high activity in these assays, in the case of TYMV, the 6318 nt long TYMV RNA was an even better substrate for valylation. Eggplant mosaic virus RNA, which has a differently constructed acceptor stem pseudoknot, differed from the above tymoviral RNAs in binding more weakly to EF-1alpha*GTP. Erysimum latent virus RNA, which lacks an identifiable anticodon domain, could not be valylated and had very low 3'-adenylation activity. tRNA mimicry within the tymovirus genus thus ranges from extremely highly developed to minimal. The implications for the role of tRNA mimicry in viral biology are discussed.
The 104 nucleotides long 3' terminal region of TMV RNA was shown previously to contain two pseudoknotted structures (Rietveld et al. (1984), EMBO J. 3, 2613-2619). We here present evidence for the occurrence, within the 204 nucleotides long 3' noncoding region, of another highly structured domain located immediately adjacent to the tRNA-like structure of 95 nucleotides (Joshi et al. (1985) Nucleic Acids Res. 13, 347-354). A model for the three-dimensional folding of this region, containing three more pseudoknots, is proposed on the basis of chemical modification and enzymatic digestion. The existence of these three consecutive pseudoknots was supported by sequence comparisons with the RNA from the related tobamoviruses TMV-L, CcTMV and CGMMV. Coaxial stacking of the six double helical segments involved gives rise to the formation of a 25 basepair long quasi-continuous double helix. The results show that the three-dimensional folding of the 3' non-translated region of tobamoviral RNAs is largely maintained by the formation of five pseudoknots. The organisation of this region in the RNA of the tobamovirus CcTMV suggests that recombinational events among aminoacylatable plant viral RNAs have to be considered.
Transfer-messenger RNA (tmRNA) is a unique molecule that combines properties from both tRNA and mRNA, and facilitates a novel translation reaction termed trans-translation. According to phylogenetic sequence analysis among various bacteria and chemical probing analysis, the secondary structure of the 350-400 nt RNA is commonly characterized by a tRNA-like structure, and four pseudoknots with different sizes. A mutational analysis using a number of Escherichia coli tmRNA variants as well as a chemical probing analysis has recently demonstrated not only the presence of the smallest pseudoknot, PK1, upstream of the internal coding region, but also its direct implication in trans-translation. Here, NMR methods were used to investigate the structure of the 31 nt pseudoknot PK1 and its 11 mutants in which nucleotide substitutions are introduced into each of two stems or the linking loops. NMR results provide evidence that the PK1 RNA is folded into a pseudoknot structure in the presence of Mg(2+). Imino proton resonances were observed consistent with formation of two helical stem regions, and these stems stacked on each other, as often seen in pseudoknot structures, in spite of the existence of three intervening nucleotides, loop 3, between the stems. Structural instability of the pseudoknot structure, even in the presence of Mg(2+), was found in the PK1 mutants except in the loop 3 mutants, which still maintained the pseudoknot folding. These results together with their biological activities indicate that trans-translation requires the pseudoknot structure stabilized by Mg(2+) and specific residues G61 and G62 in loop 3.
The importance of certain structural features of the 5′ untranslated region of classical swine fever virus (CSFV) RNA for the function of the internal ribosome entry site (IRES) was investigated by mutagenesis followed by in vitro transcription and translation. Deletions made from the 5′ end of the CSFV genome sequence showed that the IRES boundary was close to nucleotide 65: thus, the IRES includes the whole of domain II but no sequences upstream of this domain. Deletions which invaded domain II even to a small extent reduced activity to about 20% that of the full-length structure, and this 20% residual activity persisted with more extensive deletions until the whole of domain II had been removed and the deletions invaded the pseudoknot, whereupon IRES activity fell to zero. The importance of both stems of the pseudoknot was verified by making mutations in both sides of each stem; this severely reduced IRES activity, but the compensating mutations which restored base pairing caused almost full IRES function to be regained. The importance of the length of the loop linking the two stems of the pseudoknot was demonstrated by the finding that a reduction in length from the wild-type AUAAAAUU to AUU almost completely abrogated IRES activity. Random A→U substitutions in the wild-type sequence showed that IRES activity was fairly proportional to the number of A residues retained in this pseudoknot loop, with a preference for clustered neighboring A residues rather than dispersed As. Finally, it was found that the sequence of the highly conserved domain IIIa loop is, rather surprisingly, not important for the maintenance of full IRES activity, although amputation of the entire domain IIIa stem and loop was highly debilitating. These results are interpreted in the light of recent models, derived from cryo-electron microscopy, of the interaction of the closely related hepatitis C virus IRES with 40S ribosomal subunits.
Using a combined master equation and kinetic cluster approach, we investigate RNA pseudoknot folding and unfolding kinetics. The energetic parameters are computed from a recently developed Vfold model for RNA secondary structure and pseudoknot folding thermodynamics. The folding kinetics theory is based on the complete conformational ensemble, including all the native-like and non-native states. The predicted folding and unfolding pathways, activation barriers, Arrhenius plots, and rate-limiting steps lead to several findings. First, for the PK5 pseudoknot, a misfolded 5′ hairpin emerges as a stable kinetic trap in the folding process, and the detrapping from this misfolded state is the rate-limiting step for the overall folding process. The calculated rate constant and activation barrier agree well with the experimental data. Second, as an application of the model, we investigate the kinetic folding pathways for the hTR (human Telomerase RNA) pseudoknot. The predicted folding and unfolding pathways not only support the proposed role of the conformational switch between hairpin and pseudoknot in hTR activity, but also reveal the molecular mechanism for the conformational switch. Furthermore, for an experimentally studied hTR mutation, whose hairpin intermediate is destabilized, the model predicts a long-lived transient hairpin structure, and the switch between the transient hairpin intermediate and the native pseudoknot may be responsible for the observed hTR activity. Such a finding would help resolve the apparent contradiction between the observed hTR activity and the absence of a stable hairpin.
Kinetics; RNA pseudoknot; Activation energy; Misfolded state; Telomerase
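For orientation, the master equation behind a kinetic treatment of this kind can be written in its generic textbook form, together with the Arrhenius relation used to read an activation barrier from the temperature dependence of a rate. The symbols below (state populations $p_i$, transition rates $k_{i \to j}$, prefactor $A$, activation energy $E_a$) are notation assumed here for illustration and are not taken from the paper.

```latex
\frac{dp_i(t)}{dt} = \sum_{j \neq i} \left[ k_{j \to i}\, p_j(t) - k_{i \to j}\, p_i(t) \right],
\qquad
k = A \exp\!\left( -\frac{E_a}{k_B T} \right)
```

In this picture, a kinetic trap such as a misfolded hairpin is a state whose exit rates are all small, so that the slowest relaxation of the populations is governed by escape from that state.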
RNA virus genomes contain cis-acting sequence and structural elements that participate in viral replication. We previously identified a bulged stem-loop secondary structure at the upstream end of the 3′ untranslated region (3′ UTR) of the genome of the coronavirus mouse hepatitis virus (MHV). This element, beginning immediately downstream of the nucleocapsid gene stop codon, was shown to be essential for virus replication. Other investigators discovered an adjacent downstream pseudoknot in the 3′ UTR of the closely related bovine coronavirus (BCoV). This pseudoknot was also shown to be essential for replication, and it has a conserved counterpart in every group 1 and group 2 coronavirus. In MHV and BCoV, the bulged stem-loop and pseudoknot are, in part, mutually exclusive, because of the overlap of the last segment of the stem-loop and stem 1 of the pseudoknot. This led us to hypothesize that they form a molecular switch, possibly regulating a transition occurring during viral RNA synthesis. We have now performed an extensive genetic analysis of the two components of this proposed switch. Our results define essential and nonessential components of these structures and establish the limits to which essential parts of each element can be destabilized prior to loss of function. Most notably, we have confirmed the interrelationship of the two putative switch elements. Additionally, we have identified a pseudoknot loop insertion mutation that appears to point to a genetic interaction between the pseudoknot and a distant region of the genome.
The analysis of sequence-structure relations of RNA is based on a specific notion and folding of RNA structure. The notion of coarse grained structure employed here is that of canonical RNA pseudoknot contact-structures with at most two mutually crossing bonds (3-noncrossing). These structures are folded by a novel, ab initio prediction algorithm, cross, capable of searching all 3-noncrossing RNA structures. The algorithm outputs the minimum free energy structure.
After giving some background on RNA pseudoknot structures and providing an outline of the folding algorithm being employed, we present in this paper various statistical results on the mapping from RNA sequences into 3-noncrossing RNA pseudoknot structures. We study properties such as the fraction of pseudoknot structures, the dominant pseudoknot shapes, neutral walks, neutral neighbors and local connectivity. We then put our results into the context of molecular evolution of RNA.
Our results imply that, in analogy to RNA secondary structures, 3-noncrossing pseudoknot RNA represents a molecular phenotype that is well suited for molecular and in particular neutral evolution. We can conclude that extended, percolating neutral networks of pseudoknot RNA exist.
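In the terminology used above, a structure is 3-noncrossing when no three of its base-pair arcs are mutually crossing, so a single H-type pseudoknot (two crossing stems) is allowed while three mutually crossing stems are not. The Python sketch below checks this property by brute force on a list of base pairs. It is only an illustration of the definition, not the folding algorithm cross; the function names are hypothetical, and the additional "canonical" requirements of the paper are not checked.

```python
from itertools import combinations
from typing import Sequence, Tuple

Arc = Tuple[int, int]  # (i, j) with i < j, positions of the two paired bases


def crosses(a: Arc, b: Arc) -> bool:
    """Two arcs cross when exactly one endpoint of one lies inside the other."""
    (i, j), (k, l) = sorted((a, b))
    return i < k < j < l


def is_k_noncrossing(arcs: Sequence[Arc], k: int = 3) -> bool:
    """Return True if no k arcs are mutually crossing (brute force over k-subsets)."""
    return not any(
        all(crosses(a, b) for a, b in combinations(subset, 2))
        for subset in combinations(arcs, k)
    )


if __name__ == "__main__":
    h_type = [(0, 6), (1, 5), (3, 9), (4, 8)]   # two stems that cross each other
    triple = [(0, 3), (1, 4), (2, 5)]           # three mutually crossing arcs
    print(is_k_noncrossing(h_type, k=3))
    print(is_k_noncrossing(triple, k=3))
```

With `k=2` the same function tests for ordinary pseudoknot-free secondary structures. The subset enumeration is polynomial of degree k in the number of arcs, which is adequate for single structures but not for searching a structure space.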
Many functional RNA molecules fold into pseudoknot structures, which are often essential for the formation of an RNA’s 3D structure. Currently the design of RNA molecules, which fold into a specific structure (known as RNA inverse folding) within biotechnological applications, is lacking the feature of incorporating pseudoknot structures into the design. Hairpin-(H)- and kissing hairpin-(K)-type pseudoknots cover a wide range of biologically functional pseudoknots and can be represented on a secondary structure level.
The RNA inverse folding program antaRNA, which takes secondary structure, target GC-content and sequence constraints as input, is extended to provide solutions for such H- and K-type pseudoknotted secondary structure constraints.
We demonstrate the easy and flexible interchangeability of modules within the antaRNA framework by incorporating pKiss as structure prediction tool capable of predicting the mentioned pseudoknot types. The performance of the approach is demonstrated on a subset of the Pseudobase++ dataset.
This new service is available via a standalone version and is also part of the Freiburg RNA Tools webservice. Furthermore, antaRNA is available in Galaxy and is part of the RNA-workbench Docker image.
Pseudoknot RNA; Inverse folding RNA; RNAdesign; Synthetic biology; Biotechnology
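The three kinds of input named in the antaRNA abstract above (a pseudoknotted structure constraint, a target GC-content and a sequence constraint) can be made concrete with a small Python sketch. It assumes the widely used extended dot-bracket notation, in which a second bracket type such as `[ ]` marks the crossing stem of a pseudoknot, and IUPAC nucleotide codes for the sequence constraint. All function names are hypothetical and this is not antaRNA's interface; the check only tests whether a candidate sequence could form the requested pairs and does not predict its fold.

```python
from typing import List, Tuple

BRACKETS = {"(": ")", "[": "]", "{": "}", "<": ">"}
CLOSERS = {close: open_ for open_, close in BRACKETS.items()}
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}  # assumed allowed base pairs
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "GC", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG", "N": "ACGU",
}


def parse_dot_bracket(structure: str) -> List[Tuple[int, int]]:
    """Parse extended dot-bracket notation into a sorted list of base pairs."""
    stacks = {open_: [] for open_ in BRACKETS}  # one stack per bracket type
    pairs = []
    for pos, ch in enumerate(structure):
        if ch in BRACKETS:
            stacks[ch].append(pos)
        elif ch in CLOSERS:
            stack = stacks[CLOSERS[ch]]
            if not stack:
                raise ValueError(f"unmatched '{ch}' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError(f"unexpected character '{ch}' at position {pos}")
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides in a non-empty sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def is_valid_design(seq: str, structure: str, constraint: str) -> bool:
    """Check IUPAC sequence constraints and pairing compatibility of a candidate."""
    if not (len(seq) == len(structure) == len(constraint)):
        return False
    if any(base not in IUPAC[code] for base, code in zip(seq, constraint)):
        return False
    return all(seq[i] + seq[j] in CANONICAL for i, j in parse_dot_bracket(structure))


if __name__ == "__main__":
    target = "((..[[..))..]]"  # toy H-type pseudoknot
    candidate = "GGAACCAACCAAGG"
    print(is_valid_design(candidate, target, "N" * len(target)), gc_content(candidate))
```

A design program would generate many such candidates, keep those that pass checks like these, and then score them with a pseudoknot-capable folding tool.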
The 3' noncoding region of turnip yellow mosaic virus RNA includes an 82-nucleotide-long tRNA-like structure domain and a short upstream region that includes a potential pseudoknot overlapping the coat protein termination codon. Genomic RNAs with point mutations in the 3' noncoding region that result in poor replication in protoplasts and no systemic symptoms in planta were inoculated onto Chinese cabbage plants in an effort to obtain second-site suppressor mutations. Putative second-site suppressor mutations were identified by RNase protection and sequencing and were then introduced into genomic cDNA clones to permit their characterization. A C-57→U mutation in the tRNA-like structure was a strong suppressor of the C-55→A mutation which prevented both systemic infection and in vitro valylation of the viral RNA. Both of these phenotypes were rescued in the double mutant. An A-107→C mutation was a strong second-site suppressor of the U-96→G mutation, permitting the double mutant to establish systemic infection. The C-107 and G-96 mutations are located on opposite strands of one helix of a potential pseudoknot, and the results support a functional role for the pseudoknot structure. A mutation near the 5' end of the genome (G+92→A), at position -3 relative to the initiation codon of the essential open reading frame 206, was found to be a general potentiator of viral replication, probably as a result of enhanced expression of open reading frame 206. The A+92 mutation enhanced the replication of mutant TYMC-G96 in protoplasts but was not a sufficiently potent suppressor to permit systemic spread of the A+92/G-96 double mutant in plants.
Motivation: Thermodynamics-based dynamic programming RNA secondary structure algorithms have been of immense importance in molecular biology, where applications range from the detection of novel selenoproteins using expressed sequence tag (EST) data, to the determination of microRNA genes and their targets. Dynamic programming algorithms have been developed to compute the minimum free energy secondary structure and partition function of a given RNA sequence, the minimum free-energy and partition function for the hybridization of two RNA molecules, etc. However, the applicability of dynamic programming methods depends on disallowing certain types of interactions (pseudoknots, zig-zags, etc.), as their inclusion renders structure prediction a nondeterministic polynomial time (NP)-complete problem. Nevertheless, such interactions have been observed in X-ray structures.
Results: A non-Boltzmannian Monte Carlo algorithm was designed by Wang and Landau to estimate the density of states for complex systems, such as the Ising model, that exhibit a phase transition. In this article, we apply the Wang-Landau (WL) method to compute the density of states for secondary structures of a given RNA sequence, and for hybridizations of two RNA sequences. Our method is shown to be much faster than existing software, such as RNAsubopt. From the density of states, we compute the partition function over all secondary structures and over all pseudoknot-free hybridizations. The advantage of the WL method is that by adding a function to evaluate the free energy of arbitrary pseudoknotted structures and of arbitrary hybridizations, we can estimate thermodynamic parameters for situations known to be NP-complete. This extension to pseudoknots will be made in the sequel to this article; in contrast, the current article describes the WL algorithm applied to pseudoknot-free secondary structures and hybridizations.
Availability: The WL RNA hybridization web server is under construction at http://bioinformatics.bc.edu/clotelab/.
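The step from a density of states to a partition function, mentioned in the Results above, is a weighted Boltzmann sum, Z = Σ_E g(E) exp(-E/RT). The Python sketch below shows only this post-processing step, not the Wang-Landau sampling itself. It assumes energies in kcal/mol binned into discrete levels, a density g(E) already normalized to structure counts (Wang-Landau sampling determines g only up to a constant factor), and a temperature of 37 °C; the function names are hypothetical.

```python
import math
from typing import Dict

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def partition_function(density: Dict[float, float], temperature_k: float = 310.15) -> float:
    """Z = sum over energy levels E of g(E) * exp(-E / RT)."""
    rt = R_KCAL * temperature_k
    return sum(g * math.exp(-energy / rt) for energy, g in density.items())


def level_probabilities(density: Dict[float, float],
                        temperature_k: float = 310.15) -> Dict[float, float]:
    """Boltzmann probability of finding the molecule at each energy level."""
    rt = R_KCAL * temperature_k
    z = partition_function(density, temperature_k)
    return {energy: g * math.exp(-energy / rt) / z for energy, g in density.items()}


def ensemble_free_energy(density: Dict[float, float], temperature_k: float = 310.15) -> float:
    """Ensemble free energy G = -RT ln Z, in kcal/mol."""
    return -R_KCAL * temperature_k * math.log(partition_function(density, temperature_k))


if __name__ == "__main__":
    # Toy density of states: three structures at 0 kcal/mol, one at -1 kcal/mol.
    toy = {0.0: 3.0, -1.0: 1.0}
    print(partition_function(toy), level_probabilities(toy), ensemble_free_energy(toy))
```

Because the sum runs over energy levels rather than over individual structures, the same code applies unchanged once the density of states includes pseudoknotted structures or hybridizations.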
Based on the experimentally determined atomic coordinates for RNA helices and the self-avoiding walks of the P (phosphate) and C4 (carbon) atoms in the diamond lattice for the polynucleotide loop conformations, we derive a set of conformational entropy parameters for RNA pseudoknots. Based on the entropy parameters, we develop a folding thermodynamics model that enables us to compute the sequence-specific RNA pseudoknot folding free energy landscape and thermodynamics. The model is validated through extensive experimental tests both for the native structures and for the folding thermodynamics. The model predicts strong sequence-dependent helix-loop competitions in the pseudoknot stability and the resultant conformational switches between different hairpin and pseudoknot structures. For instance, for the pseudoknot domain of human telomerase RNA, a native-like and a misfolded hairpin intermediate are found to coexist on the (equilibrium) folding pathways, and the interplay between the stabilities of these intermediates causes the conformational switch that may underlie a human telomerase disease.
The approximately 200-nucleotide-long 3'-terminal noncoding region of tobacco mosaic virus (TMV) RNA contains a tRNA-like structure and, in its immediate upstream region, three consecutive pseudoknots, each of which is composed of two double-helical segments. To elucidate the biological functions of the pseudoknot region, we constructed several deletion mutant TMV-L (a tomato strain) RNAs by using an in vitro transcription system and tested their ability to multiply in both tobacco plants and protoplasts. When deletions were introduced just downstream of the termination codon of the coat protein gene in the 5'-to-3' direction progressively, five of six double-helical segments were dispensable for viral multiplication, indicating that the pseudoknot structures are not essential for multiplication. However, extension of the deletion into the central pseudoknot region resulted in reduction in viral multiplication, accompanied by loss of development of mosaic symptoms on systemic tobacco plants. Cessation of multiplication was observed when the sequence involved in formation of double-helical segment I just upstream of the tRNA-like structure was deleted irrespective of the start point and extent of deletion. Point mutations that destabilized double-helical segment I resulted in a loss or great reduction of viral multiplication, whereas the double mutants in which the double helix was restored by additional compensating base substitutions restored multiplication to nearly the wild-type level. Thus, double-helical segment I just upstream of the tRNA-like structure is a structural feature essential for viral multiplication.
The genomes of positive-strand RNA viruses undergo conformational shifts that complicate efforts to equate structures with function. We have initiated a detailed analysis of secondary and tertiary elements within the 3′ end of Turnip crinkle virus (TCV) that are required for viral accumulation in vivo. MPGAfold, a massively parallel genetic algorithm, suggested the presence of five hairpins (H4a, H4b, and previously identified hairpins H4, H5, and Pr) and one H-type pseudoknot (Ψ3) within the 3′-terminal 194 nucleotides (nt). In vivo compensatory mutagenesis analyses confirmed the existence of H4a, H4b, Ψ3 and a second pseudoknot (Ψ2) previously identified in a TCV satellite RNA. In-line structure probing of the 194-nt fragment supported the coexistence of H4, H4a, H4b, Ψ3 and a pseudoknot that connects H5 and the 3′ end (Ψ1). Stepwise replacements of TCV elements with the comparable elements from Cardamine chlorotic fleck virus indicated that the complete 142-nt 3′ end, and subsets containing Ψ3, H4a, and H4b or Ψ3, H4a, H4b, H5, and Ψ2, form functional domains for virus accumulation in vivo. A new 3-D molecular modeling protocol (RNA2D3D) predicted that H4a, H4b, H5, Ψ3, and Ψ2 are capable of simultaneous existence and bear some resemblance to a tRNA. The related Japanese iris necrotic ring virus does not have comparable domains. These results provide a framework for determining how interconnected elements participate in processes that require 3′ untranslated region sequences such as translation and replication.
tmRNA combines tRNA- and mRNA-like properties and ameliorates problems arising from stalled ribosomes. Research on the mechanism, structure and biology of tmRNA is served by the tmRNA website (http://www.indiana.edu/~tmrna), a collection of sequences, alignments, secondary structures and other information. Because many of these sequences are not in GenBank, a BLAST server has been added; another new feature is an abbreviated alignment for the tRNA-like domain only. Many tmRNA sequences from plastids have been added, five found in public sequence data and another 10 generated by direct sequencing; detection in early-branching members of the green plastid lineage brings coverage to all three primary plastid lineages. The new sequences include the shortest known tmRNA sequence. While bacterial tmRNAs usually have a lone pseudoknot upstream of the mRNA segment and a string of three or four pseudoknots downstream, plastid tmRNAs collectively show loss of pseudoknots at both positions. The pseudoknot-string region is also too short to contain the usual pseudoknot number in another new entry, the tmRNA sequence from a bacterial endosymbiont of insect cells, Tremblaya princeps. Pseudoknots may optimize tmRNA function in free-living bacteria, yet become dispensable when the endosymbiotic lifestyle relaxes selective pressure for fast growth.
To understand the role of structural elements of RNA pseudoknots in controlling the extent of -1 type ribosomal frameshifting, we determined the crystal structure of a high-efficiency frameshifting mutant of the pseudoknot from potato leaf roll virus (PLRV). Correlations of the structure with available in vitro frameshifting data for PLRV pseudoknot mutants implicate sequence and length of a stem-loop linker as modulators of frameshifting efficiency. Although the sequences and overall structures of the RNA pseudoknots from PLRV and beet western yellow virus (BWYV) are similar, nucleotide deletions in the linker and adjacent minor groove loop abolish frameshifting only with the latter. Conversely, mutant PLRV pseudoknots with up to four nucleotides deleted in this region exhibit nearly wild-type frameshifting efficiencies. The crystal structure helps rationalize the different tolerances for deletions in the PLRV and BWYV RNAs and we have used it to build a three-dimensional model of the PLRV pseudoknot with a four-nucleotide deletion. The resulting structure defines a minimal RNA pseudoknot motif composed of 22 nucleotides capable of stimulating -1 type ribosomal frameshifts.
Programmed −1 ribosomal frameshifting (PRF) and stop codon readthrough are two translational recoding mechanisms utilized by some RNA viruses to express their structural and enzymatic proteins at a defined ratio. Efficient recoding usually requires an RNA pseudoknot located several nucleotides downstream from the recoding site. To assess the strategic importance of the recoding pseudoknots, we have carried out a large scale genome-wide analysis in which we used an in-house developed program to detect all possible H-type pseudoknots within the genomic mRNAs of 81 animal viruses. Pseudoknots are detected downstream from ~85% of the recoding sites, including many previously unknown pseudoknots. ~78% of the recoding pseudoknots are the most stable pseudoknot within the viral genomes. However, they are not as strong as some designed pseudoknots that exhibit a roadblocking effect on the translating ribosome. Strong roadblocking pseudoknots are not detected within the viral genomes. These results indicate that the recoding pseudoknots have evolved to possess optimal stability for efficient recoding. We also found that the sequence at the gag-pol frameshift junction of HIV1 harbors potential elaborated pseudoknots encompassing the frameshift site. A novel mechanism is proposed for possible involvement of the elaborated pseudoknots in the HIV1 PRF event.
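As a point of reference for what "pseudoknot" means at the level of base pairs, the following minimal sketch tests whether a given secondary structure contains crossing pairs, that is, pairs (i, j) and (k, l) with i < k < j < l. It is not the in-house detection program mentioned above, which searches genomic sequences and is not described here. The sketch assumes the structure is already known and written in dot-bracket notation with `()` for one helix and `[]` for the crossing helix; the function names `parse_pairs` and `has_pseudoknot` are hypothetical.

```python
def parse_pairs(structure):
    """Parse a dot-bracket string using '()' and '[]' into a sorted
    list of (i, j) base-pair index tuples with i < j."""
    closers = {")": "(", "]": "["}
    stacks = {"(": [], "[": []}
    pairs = []
    for pos, ch in enumerate(structure):
        if ch in stacks:
            stacks[ch].append(pos)
        elif ch in closers:
            stack = stacks[closers[ch]]
            if not stack:
                raise ValueError(f"unmatched '{ch}' at position {pos}")
            pairs.append((stack.pop(), pos))
        # Any other character (e.g. '.') is treated as unpaired.
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(pairs)


def has_pseudoknot(structure):
    """Return True if any two base pairs cross (i < k < j < l)."""
    pairs = parse_pairs(structure)
    for a, (i, j) in enumerate(pairs):
        # Pairs are sorted by opening index, so k > i for all later pairs.
        for k, l in pairs[a + 1:]:
            if k < j < l:
                return True
    return False


# Illustrative structures, not taken from any viral genome.
print(has_pseudoknot("((..[[..))..]]"))  # two interleaved helices
print(has_pseudoknot("((..))..((..))"))  # two separate hairpins
```

The pairwise check is quadratic in the number of base pairs, which is adequate for single motifs; a genome-wide scan of the kind described above would need a sequence-based search with energy evaluation rather than this structural test.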