PMCCPMCCPMCC

Search tips
Search criteria 

Advanced

 
Logo of bioinfoLink to Publisher's site
 
Bioinformatics. Mar 15, 2012; 28(6): 792–798.
Published online Jan 27, 2012. doi:  10.1093/bioinformatics/bts044
PMCID: PMC3307117
TurboKnot: rapid prediction of conserved RNA secondary structures including pseudoknots
Matthew G. Seetin1 and David H. Mathews1,2*
1Department of Biochemistry and Biophysics, Center for RNA Biology and 2Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
* To whom correspondence should be addressed.
Associate Editor: Ivo Hofacker
Received July 14, 2011; Revised November 25, 2011; Accepted January 23, 2012.
Motivation: Many RNA molecules function without being translated into proteins, and function depends on structure. Pseudoknots are motifs in RNA secondary structures that are difficult to predict but are also often functionally important.
Results: TurboKnot is a new algorithm for predicting the secondary structure, including pseudoknotted pairs, conserved across multiple sequences. TurboKnot finds 81.6% of all known base pairs in the systems tested, and 75.6% of predicted pairs were found in the known structures. Pseudoknots are found with half or better of the false-positive rate of previous methods.
Availability: The program is available for download under an open-source license as part of the RNAstructure package at: http://rna.urmc.rochester.edu.
Contact: david_mathews/at/urmc.rochester.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
In many cases, RNA functions in cells without being translated into proteins. These non-coding RNAs (ncRNAs) carry out a wide range of functions, including catalysis, intracellular protein trafficking, immunity and gene regulation (Doudna and Cech, 2002; Gesteland et al., 2005; Marraffini and Sontheimer, 2010; Nissen et al., 2000; Tucker and Breaker, 2005; Walter and Blobel, 1982; Wu and Belasco, 2008). The functions of these ncRNAs depend on their structures.
RNA structure is hierarchical, beginning with the sequence of nucleotides (the primary structure), then the set of all canonical, i.e. GC, AU and GU base pairs (the secondary structure), and ultimately the full 3D positioning of all the atoms in the molecule (the tertiary structure). The secondary structure tends to form on a faster time scale and with stronger interactions than those interactions that fold the tertiary structure, so the secondary structure can usually be predicted independently of knowing the overall tertiary structure (Tinoco and Bustamante, 1999).
The most accurate method of secondary structure determination is comparative sequence analysis. It relies on the hypothesis that sequences that serve the same function share the same structure. But comparative sequence analysis is impractical in many situations because it requires significant human insight and a large number of diverse sequences (Gutell et al., 2002; Woese and Pace, 1993). Commonly, the secondary structure for a single sequence is predicted using free energy minimization. These methods use a set of thermodynamic parameters that predict the folding free energy change of a given structure and a dynamic programming algorithm to find the structure with lowest free energy change (Mathews and Turner, 2006; Mathews et al., 2004).
The thermodynamic parameters can be used with a different type of dynamic programming algorithm to predict a partition function, which can be used to predict the base pairing probability for all possible canonical pairs (McCaskill, 1990). Base pair probabilities can be used to assemble structures called maximum expected accuracy (MEA) structures (Do et al., 2006; Hamada et al., 2009; Knudsen and Hein, 1999; Lu et al., 2009). Briefly, MEA prediction seeks to maximize the sum of probabilities of each base pair for all paired bases and the probabilities of being single-stranded for all unpaired bases. Empirically, MEA structure prediction works because that base pairs predicted to be the most probable are the most likely to be in the known structure (Mathews, 2004).
Although comparative sequence analysis is not yet automated, a number of methods have been developed to improve structure prediction by using the information available in homologous sequences. (Bernhart et al., 2008; Harmanci et al., 2007, 2008; Mathews and Turner, 2002; Steffen et al., 2006; Will et al., 2007; Xu et al., 2007). These methods use a supplied input sequence alignment, predict the structure for each sequence and then find the common structure, or align and fold the sequences simultaneously. These programs have been reviewed previously (Bernhart et al., 2008).
The recently published TurboFold algorithm is one such method that improves secondary structure prediction by predicting a conserved structure using two or more homologous sequences (Harmanci et al., 2011). It refines base pairing probabilities of an arbitrary number of sequences using their probabilistic alignments in an iterative manner. Structures are then assembled using a MEA algorithm from the final set of base pairing probabilities.
Because of computational complexity, most programs that predict secondary structure do not predict pseudoknots. Formally, a pseudoknot occurs if an RNA has two base pairs, one between bases i and j, and a second between i and j, such that i< i <j< j. Pseudoknots make up only a small fraction of base pairs, but they are often found in ncRNAs. Predicting the minimum free energy secondary structure that includes pseudoknots of any topology has been proven to be NP-hard (Lyngsø and Pederson, 2000). In spite of this, a number of single-sequence methods have been developed to predict structures with pseudoknots (Akutsu, 2000; Condon et al., 2004; Dirks and Pierce, 2003; Reeder and Giegerich, 2004; Rivas and Eddy, 1999; Uemura et al., 1999). These methods limit the set of possible pseudoknot topologies to accelerate the search. A number of available methods has been previously reviewed (Liu et al., 2010).
Recently, a new algorithm was published, ProbKnot, that uses a single sequence partition function to predict pairing probabilities and assigns base pairs if the two bases are mutually maximally probable to pair with one another (Bellaousov and Mathews, 2010). This algorithm can find pseudoknots of any topology. While the partition function does not include terms for pseudoknotted structures, the ProbKnot algorithm was able to find pseudoknotted base pairs with reasonable accuracy compared with other methods, and at minimal computational cost: the algorithm is O(N2) in addition to the O(N3) partition function calculation.
Approaches to the pseudoknot prediction problem that employ information from sequence and structural homology are few in number. Like single sequence methods (Abrahams et al., 1990; Isambert and Siggia, 2000), a Monte Carlo alignment and folding method has also been published (Meyer and Miklos, 2007). Also analogous to single sequence methods (Dawson et al., 2007; Jabbari et al., 2008; Ren et al., 2005), another approach identifies common pseudoknot-free structures and then iteratively matches regions that can form pseudoknots (Ruan et al., 2004). A third approach computes a score matrix based on alignment and thermodynamic information and uses a maximum weight matching (MWM) algorithm to assemble an optimal secondary structure (Tabaska et al., 1998; Witwer et al., 2004). The latter two approaches are dependent on a high-quality sequence alignment as input, either one assembled manually or one computed from sequences with a large degree of sequence identity (70% or more). Such alignments are not always available, particularly for newly discovered classes of RNAs.
This contribution reports a new algorithm, TurboKnot, for predicting RNA secondary structures conserved in multiple sequences, including pseudoknots. TurboKnot assembles structures in the same way as ProbKnot, but with the pairing probabilities predicted by a TurboFold calculation using multiple sequences to inform the pairing probabilities. The additional computational cost compared with TurboFold is small compared with the TurboFold calculation itself, just O(MN2), where M is the number of sequences used and N is the length of the longest sequence. It is benchmarked against two algorithms capable of predicting pseudoknots that also use multiple sequences as input, ILM and Hxmatch (Ruan et al., 2004; Witwer et al., 2004). It is also benchmarked against the single sequence ProbKnot algorithm (Bellaousov and Mathews, 2010). Finally, two algorithms that cannot predict pseudoknots are included in the benchmark, TurboFold (Harmanci et al., 2011) and MEA (MaxExpect) (Lu et al., 2009).
2.1 Predicting base pair probabilities
For TurboKnot, base pair probabilities were predicted using TurboFold (Harmanci et al., 2011). This algorithm employs the nearest neighbor thermodynamic parameters (Mathews et al., 1999, 2004; Xia et al., 1998). The per-branching-helix bonus parameter, however, was left at -0.6 kcal/mol, consistent with optical melting data (Diamond et al., 2001), rather than using the optimized parameter of Mathews et al. (2004). TurboFold was run using four iterations instead of the default of three for an increase in accuracy at the expense of computational cost, but otherwise was run with the default parameters.
2.2 Assembling structures
Structures were constructed using the same method as the ProbKnot algorithm. First, maximal pairing probabilities for each base i were identified from the TurboFold output and stored in an array Pmax(i). Next, all pairing probabilities for all base pairs, P(i,j) were checked, and if P(i,j) is equal to Pmax(i) and Pmax(j), a base pair was assigned between bases i and j. This process can be iterated, where subsequent iterations only examine the bases that remained unpaired from previous iterations, but subsequent iterations were not found to increase the accuracy of the results (data not shown). Finally, all base pairs involved in helices shorter in length than 3 bp were removed (helices were not considered to be interrupted by single bulges or 1 × 1 internal loops). This structure prediction process was performed independently for each sequence run in a single TurboFold calculation.
2.3 MWM
MWM calculations were performed with Wmatch from MATHPROG (http://elib.zib.de/pub/Packages/mathprog/matching/weighted/) (Micali and Vazirani, 1980). Edges were only supplied to the algorithm for consideration if their computed pairing probability was ≥10−6. As with TurboKnot, all helices shorter than 3 bp were removed from the structure.
2.4 Assessment
Sensitivity and positive predictive value (PPV) were used to evaluate each algorithm tested. Sensitivity measures the fraction of pairs in each known structure found:
A mathematical equation, expression, or formula.
 Object name is bts044um1.jpg
PPV measures the fraction of pairs predicted that are correct:
A mathematical equation, expression, or formula.
 Object name is bts044um2.jpg
Because it is difficult to determine the exact register of pairs by comparative analysis and because pairing can fluctuate because of thermal motions, predicted base pairs between i and j were counted as correct if i was paired with j, j − 1, or j + 1, or if j was paired with i − 1 or i + 1. Assessment using only exact matching base pairs is supplied in Supplementary Tables S1 and S2.
All sequences were tested on a randomly chosen subset (with replacement) of seven different subtypes of RNA: 5S RNA, tRNA, RNase P RNA, SRP RNA, tmRNA, telomerase RNA and group I introns (Brown, 1998; Chen et al., 2000; Damberger and Gutell, 1994; Sprinzl et al., 1998; Szymanski et al., 1998; Waring and Davies, 1984; Zwieb and Wower, 2000). For all multisequence methods, five RNAs of each type were selected to be computed together in each calculation. Particular RNA sequences could be chosen more than once as a part of different calculations. These were considered to be separate results, as multiple sequence methods may give different results for the same sequence depending on the other sequences used in the calculation. For consistency, duplicate sequences were also counted as independent calculations when using single-sequence methods to predict structures, even though the resulting structures would be identical. Average values of sensitivity and PPV were computed for each class of RNAs. The overall average is the mean of the individual averages on sequence types.
2.5 Pseudoknot evaluation
Pseudoknots were identified using the algorithm of Smit et al. (2008) to identify the fewest pairs that could be removed to remove the pseudoknot from the structure. Predicted pseudoknotted base pairs were only counted as correct if the ij pair was in the accepted structure (allowing thermal fluctuation as in Section 2.4), the pair was counted as a pseudoknot by the Smit et al. algorithm, and the pair was pseudoknotted. A pair was considered pseudoknotted if it met the i< i <j< j criterion, with at least one other pair ij that was also a correct pair compared with the accepted structure (Bellaousov and Mathews, 2010).
Over the range of RNA families tested, which have a variety of functions and structures, TurboKnot averaged a sensitivity of 79.8% (Table 1) and a PPV of 72.9% (Table 2). For comparison, TurboKnot was benchmarked against two other multisequence methods that can predict pseudoknots: ILM version 1.0 (Ruan et al., 2004) and Hxmatch version 1.2.1 (Witwer et al., 2004). Each was run with default parameters. Sequences were aligned for input into these algorithms using MUSCLE version 3.6 (Edgar, 2004), which performs well for RNA sequence alignment compared with other available tools (Wilm et al., 2006). In addition to these two methods, benchmarks were run using ProbKnot (Bellaousov and Mathews, 2010), a single sequence method that can predict pseudoknots, TurboFold (Harmanci et al., 2011), the underlying multisequence method that cannot predict pseudoknots and MaxExpect (Lu et al., 2009), a single sequence method that cannot predict pseudoknots. Other single-sequence methods capable of predicting pseudoknots were recently benchmarked (Bellaousov and Mathews, 2010), and ProbKnot compares favorably against these. It is used here as a representative of this class of algorithms. One algorithm, DotKnot, has been published more recently. Its performance on this test set is included in Supplementary Tables S3–S5 for comparison. Similarly, multisequence algorithms that cannot predict pseudoknots were recently benchmarked (Harmanci et al., 2011), and TurboFold compares favorable and is used as a representative of them here.
Table 1.
Table 1.
Sensitivities of tested methods for all base pairs
Table 2.
Table 2.
PPVs of tested methods for all base pairs
The average sensitivity and PPV of TurboKnot were better than any other method capable of predicting pseudoknots. Additionally, TurboKnot had a higher sensitivity than TurboFold, albeit at a trade-off in PPV. Since TurboKnot considers a larger structure space than TurboFold, it is natural to find more correct base pairs with a cost to the PPV.
For pseudoknots in particular, TurboKnot found 289 correct pseudoknotted base pairs out of a total of 5897 in the known structures, and it made 672 false positive predictions of pseudoknotted pairs (Table 3). This is more true positives and fewer false positives than any other tested method. When considering structures with pseudoknots rather than individual pairs, TurboKnot found at least one true positive pseudoknot in 59 structures out of 239 predictions, and out of 426 tested structures that contain at least one pseudoknot (Supplementary Table S6). This is again more true positives and fewer false positives than any other tested method.
Table 3.
Table 3.
Evaluation of tested methods for pseudoknotted base pairs
The use of an MWM algorithm to assemble structures given the TurboFold pair probabilities was also considered as a control (Supplementary Tables S7 and S8). This approach resulted in the identification of the exact same set of correct base pairs as those based on mutual maximum probability pairs, plus a small number of extra incorrect pairs. The additional, incorrect pairs were so few that they only affected the second decimal places of the PPV percentages. The exception was the tmRNA family, in which a small number of additional correct base pairs were identified. The minor difference between these two approaches was due to how well the TurboFold algorithm refines the pairing probabilities by excluding structures that are not conserved.
As another control, an MWM algorithm was used to predict structures from pair probabilities calculated using a single sequence partition function. This added a number of spurious base pairs (Supplementary Table S9). This emphasized the importance of the ProbKnot approach when single sequences are used.
Time benchmarks were performed on all tested algorithms. Groups of five random sequences were used, with average lengths varying from 80.0 nt to 458.8 nt (Table 4). The Hxmatch algorithm performed the fastest. TurboKnot was significantly slower, but it could still carry out the computations in a reasonable amount of time for a user with a typical desktop computer.
Table 4.
Table 4.
Evaluation of time performance
Figure 1 shows a sample structure prediction for Mycobacterium tuberculosis RNase P using TurboKnot, ProbKnot or TurboFold (Brown, 1998). The TurboKnot and TurboFold structures are similar, but the TurboKnot algorithm is able to identify most of the pseudoknotted base pairs. ProbKnot has a lower sensitivity and PPV overall for non-pseudoknotted base pairs in this structure. It identifies the four correct pseudoknotted base pairs that TurboKnot missed, but it also finds eight pseudoknotted base pairs that are not in the accepted structure.
Fig. 1.
Fig. 1.
Accepted and predicted structures for M.tuberculosis RNase P. (a) The accepted structure of M.tuberculosis RNase P. (Brown, 1998). (b) The structure predicted by TurboKnot. (c) The structure predicted by ProbKnot. (d) The structure predicted by TurboFold. (more ...)
TurboKnot overextends some helices as compared with TurboFold when there are valid bases available to pair. The accuracy of the extra base pairs added when TurboKnot extends helices compared to TurboFold is low; of the base pairs on ends of helices predicted by TurboKnot and not TurboFold, 27.6% were correct in all the structures. This is for cases where a helix is in common in the predictions of TurboKnot and TurboFold. Single nucleotide bulges are not considered to break a helix. These helices that are extended by 1 bp on an end are sometimes correct, so they do increase the sensitivity score of TurboKnot compared with TurboFold but at the expense of PPV.
In the same way as the ProbKnot algorithm, TurboKnot is able to assemble structures containing pseudoknots using base pair probabilities. Using probabilities from TurboFold that are functions of structural alignment and homology information in addition to thermodynamic stability significantly improves predictions when considering all base pairs and when considering just pseudoknots. Compared with TurboKnot, some methods were able to find more true pseudoknotted pairs on some systems. For example, Hxmatch found more true pseudoknotted pairs in telomerase RNA, but in the process it predicted more than twice as many false positive pairs as there were true pseudoknotted pairs in the accepted structures.
Single sequence methods, with or without pseudoknot-prediction capabilities, were reported as having poor performance on the prediction of tmRNA structure (Bellaousov and Mathews, 2010). The results with the sequences used here are consistent with that result. TurboKnot (and TurboFold), however, show significant improvement for this difficult class of RNAs. TurboKnot shows its best pseudoknot-prediction capabilities on this class of RNAs, with 49% of all predicted pseudoknotted base pairs being true positives.
In contrast, Group I introns show slightly worse sensitivity in TurboFold and TurboKnot compared with MaxExpect and ProbKnot, although they did still have higher PPVs. This is likely because the long insertions present in some Group I introns makes accurate sequence alignment difficult. Indeed, ILM and Hxmatch, which are dependent on a sequence alignment algorithm for input, did not perform well with randomly chosen groups of sequences from this class of RNAs. These algorithms may have more success, however, when sequences with higher sequence similarity are used or when using a manually curated alignment, which would allow for more accurate sequence alignment. Such groups of sequences are not always available.
ILM was benchmarked in a previous study using just one sequence rather than an alignment, and, unexpectedly, it appears less accurate when using multiple sequences as input compared with previously reported benchmarks (Bellaousov and Mathews, 2010; Ruan et al., 2004). This is likely because it was hindered by errors made by the sequence alignment algorithm, which aligns sequences but not secondary structure (Edgar, 2004). It fared much better when given manually curated alignments based on structural homology (Ruan et al., 2004). Because Hxmatch, which was given the same input sequence alignments, had similar results on similar classes of RNAs, it is likely tools that consider both secondary structure and alignment at the same time are necessary to accurately align functionally similar RNAs which are not chosen to have a relatively high sequence identity.
While TurboKnot has significantly improved pseudoknot prediction accuracy, more work is needed to raise the accuracy to the current standard for non-pseudoknotted base pairs. Dirks and Pierce (2004) report an algorithm that computes a partition function that includes pseudoknots of some limited topologies in O(N4) time. This partition function could be used by TurboFold and TurboKnot to consider a larger structure space, ideally increasing the probability of true pseudoknots at the expense of false ones that are predicted currently at an acceptable increase in computational expense.
Supplementary Material
Supplementary Data
ACKNOWLEDGEMENTS
The authors thank A.O. Harmanci and G. Sharma for discussions and S. Bellaousov for performing MWM calculations. Computer time was provided by the University of Rochester Center for Research Computing.
Funding: National Institutes of Health (grant number R01HG004002 to D.H.M).
Conflict of Interest: none declared.
  • Abrahams J.P., et al. Prediction of RNA secondary structure, including pseudoknotting, by computer simulation. Nucleic Acids Res. 1990;18:3035–3044. [PMC free article] [PubMed]
  • Akutsu T. Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots. Discr. Appl. Math. 2000;104:45–62.
  • Bellaousov S., Mathews D.H. ProbKnot: fast prediction of RNA secondary structure including pseudoknots. RNA. 2010;16:1870–1880. [PubMed]
  • Bernhart S.H., et al. RNAalifold: improved consensus structure prediction for RNA alignments. BMC Bioinformatics. 2008;9:474. [PMC free article] [PubMed]
  • Brown J.W. The ribonuclease P database. Nucleic Acids Res. 1998;26:351–352. [PMC free article] [PubMed]
  • Chen J.L., et al. Secondary structure of vertebrate telomerase RNA. Cell. 2000;100:503–514. [PubMed]
  • Condon A., et al. Classifying RNA pseudoknotted structures. Theor. Comput. Sci. 2004;320:35–50.
  • Damberger S.H., Gutell R.R. A comparative database of group I intron structures. Nucleic Acids Res. 1994;22:3508–3510. [PMC free article] [PubMed]
  • Dawson W.K., et al. Prediction of RNA pseudoknots using heuristic modeling with mapping and sequential folding. PLoS One. 2007;2:e905. [PMC free article] [PubMed]
  • Diamond J.M., et al. Thermodynamics of three-way multibranch loops in RNA. Biochemistry. 2001;40:6971–6981. [PubMed]
  • Dirks R.M., Pierce N.A. A partition function algorithm for nucleic acid secondary structure including pseudoknots. J. Comput. Chem. 2003;24:1664–1677. [PubMed]
  • Dirks R.M., Pierce N.A. An algorithm for computing nucleic acid base-pairing probabilities including pseudoknots. J. Comput. Chem. 2004;25:1295–1304. [PubMed]
  • Do C.B., et al. CONTRAfold: RNA secondary structure prediction without physics-based models. Bioinformatics. 2006;22:e90–98. [PubMed]
  • Doudna J., Cech T. The chemical repertoire of natural ribozymes. Nature. 2002;418:222–228. [PubMed]
  • Edgar R.C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–1797. [PMC free article] [PubMed]
  • Gesteland R.F., et al. The RNA World. 3rd. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press; 2005.
  • Gutell R.R., et al. The accuracy of ribosomal RNA comparative structure models. Curr. Opin. Struct. Biol. 2002;12:301–310. [PubMed]
  • Hamada M., et al. Prediction of RNA secondary structure using generalized centroid estimators. Bioinformatics. 2009;25:465–473. [PubMed]
  • Harmanci A.O., et al. Efficient pairwise RNA structure prediction using probabilistic alignment constraints in Dynalign. BMC Bioinformatics. 2007;8:130. [PMC free article] [PubMed]
  • Harmanci A.O., et al. PARTS: Probabilistic Alignment for RNA joinT Secondary structure prediction. Nucleic Acids Res. 2008;36:2406–2417. [PMC free article] [PubMed]
  • Harmanci A.O., et al. TurboFold: Iterative probabilistic estimation of secondary structures for multiple RNA sequences. BMC Bioinformatics. 2011;12:108. [PMC free article] [PubMed]
  • Isambert H., Siggia E.D. Modeling RNA folding paths with pseudoknots: application to hepatitis delta virus ribozyme. Proc. Natl Acad. Sci. USA. 2000;97:6515–6520. [PubMed]
  • Jabbari H., et al. Novel and efficient RNA secondary structure prediction using hierarchical folding. J. Comput. Biol. 2008;15:139–163. [PubMed]
  • Knudsen B., Hein J.J. RNA secondary structure prediction using stochastic context-free grammars and evolutionary history. Bioinformatics. 1999;15:446–454. [PubMed]
  • Liu B., et al. RNA pseudoknots: folding and finding. F1000 Biol. Rep. 2010;2:8. [PMC free article] [PubMed]
  • Lu Z.J., et al. Improved RNA secondary structure prediction by maximizing expected pair accuracy. RNA. 2009;15:1805–1813. [PubMed]
  • Lyngsø R., Pederson C. RNA pseudoknot prediction in energy-based models. J. Comput. Biol. 2000;7:409–427. [PubMed]
  • Marraffini L.A., Sontheimer E.J. CRISPR interference: RNA-directed adaptive immunity in bacteria and archaea. Nat. Rev. Genet. 2010;11:181–190. [PMC free article] [PubMed]
  • Mathews D.H. Using an RNA secondary structure partition function to determine confidence in base pairs predicted by free energy minimization. RNA. 2004;10:1178–1190. [PubMed]
  • Mathews D.H., Turner D.H. Dynalign: an algorithm for finding the secondary structure common to two RNA sequences. J. Mol. Biol. 2002;317:191–203. [PubMed]
  • Mathews D.H., Turner D.H. Prediction of RNA secondary structure by free energy minimization. Curr. Opin. Struct. Biol. 2006;16:270–278. [PubMed]
  • Mathews D.H., et al. Expanded sequence dependence of thermodynamic parameters provides improved prediction of RNA secondary structure. J. Mol. Biol. 1999;288:911–940. [PubMed]
  • Mathews D.H., et al. Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure. Proc. Natl Acad. Sci. USA. 2004;101:7287–7292. [PubMed]
  • McCaskill J.S. The equilibrium partition function and base pair probabilities for RNA secondary structure. Biopolymers. 1990;29:1105–1119. [PubMed]
  • Meyer I.M., Miklos I. SimulFold: simultaneously inferring RNA structures including pseudoknots, alignments, and trees using a Bayesian MCMC framework. PLoS Comput. Biol. 2007;3:e149. [PubMed]
  • Micali S., Vazirani V.V. 21st Annual Symposium on Foundations of Computer Science. New York: IEEE Computer Society Press; 1980. An O(V1/2E) algorithm for finding maximum matching in general graphs; pp. 17–27.
  • Nissen P., et al. The structural basis of ribosomal activity in peptide bond synthesis. Science. 2000;289:920–930. [PubMed]
  • Reeder J., Giegerich R. Design, implementation and evaluation of a practical pseudoknot folding algorithm based on thermodynamics. BMC Bioinformatics. 2004;5:104. [PMC free article] [PubMed]
  • Ren J., et al. HotKnots: heuristic prediction of RNA secondary structures including pseudoknots. RNA. 2005;11:1494–1504. [PubMed]
  • Rivas E., Eddy S.R. A dynamic programming algorithm for RNA structure prediction including pseudoknots. J. Mol. Biol. 1999;285:2053–2068. [PubMed]
  • Ruan J., et al. An iterated loop matching approach to the prediction of RNA secondary structures with pseudoknots. Bioinformatics. 2004;20:58–66. [PubMed]
  • Smit S., et al. From knotted to nested RNA structures: a variety of computational methods for pseudoknot removal. RNA. 2008;14:410–416. [PubMed]
  • Sprinzl M., et al. Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res. 1998;26:148–153. [PMC free article] [PubMed]
  • Steffen P., et al. RNAshapes: an integrated RNA analysis package based on abstract shapes. Bioinformatics. 2006;22:500–503. [PubMed]
  • Szymanski M., et al. 5S rRNA data bank. Nucleic Acids Res. 1998;26:156–159. [PMC free article] [PubMed]
  • Tabaska J.E., et al. An RNA folding method capable of identifying pseudoknots and base triples. Bioinformatics. 1998;14:691–699. [PubMed]
  • Tinoco I., Jr, Bustamante C. How RNA folds. J. Mol. Biol. 1999;293:271–281. [PubMed]
  • Tucker B.J., Breaker R.R. Riboswitches as versatile gene control elements. Curr. Opin. Struct. Biol. 2005;15:342–348. [PubMed]
  • Uemura Y., et al. Tree adjoining grammars for RNA structure prediction. Theor. Comput. Sci. 1999;210:277–303.
  • Walter P., Blobel G. Signal recognition particle contains a 7S RNA essential for protein translocation across the endoplasmic reticulum. Nature. 1982;299:691–698. [PubMed]
  • Waring R.B., Davies R.W. Assessment of a model for intron RNA secondary structure relevant to RNA self-splicing - a review. Gene. 1984;28:277–291. [PubMed]
  • Will S., et al. Inferring noncoding RNA families and classes by means of genome-scale structure-based clustering. PLoS Comput. Biol. 2007;3:e65. [PubMed]
  • Wilm A., et al. An enhanced RNA alignment benchmark for sequence alignment programs. Algorithms Mol. Biol. 2006;1:19. [PMC free article] [PubMed]
  • Witwer C., et al. Prediction of consensus RNA secondary structures including pseudoknots. IEEE/ACM Trans. Comput. Biol. Bioinform. 2004;1:66–77. [PubMed]
  • Woese C.R., Pace N.R. Probing RNA structure, function, and history by comparative analysis. In: Gesteland R.F., editor. The RNA World. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press; 1993. pp. 91–117.
  • Wu L., Belasco J.G. Let me count the ways: mechanisms of gene regulation by miRNAs and siRNAs. Mol. Cell. 2008;29:1–7. [PubMed]
  • Xia T., et al. Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick pairs. Biochemistry. 1998;37:14719–14735. [PubMed]
  • Xu X., et al. RNA Sampler: a new sampling based algorithm for common RNA secondary structure prediction and structural alignment. Bioinformatics. 2007;23:1883–1891. [PubMed]
  • Zwieb C., Wower J. tmRDB (tmRNA database) Nucleic Acids Res. 2000;28:169–170. [PMC free article] [PubMed]
Articles from Bioinformatics are provided here courtesy of
Oxford University Press