RNAstructure is a software package for RNA secondary structure prediction and analysis. This contribution describes a new set of web servers to provide its functionality. The web server offers RNA secondary structure prediction, including free energy minimization, maximum expected accuracy structure prediction and pseudoknot prediction. Bimolecular secondary structure prediction is also provided. Additionally, the server can predict secondary structures conserved in either two homologs or more than two homologs. Folding free energy changes can be predicted for a given RNA structure using nearest neighbor rules. Secondary structures can be compared using circular plots or the scoring methods, sensitivity and positive predictive value. Additionally, structure drawings can be rendered as SVG, postscript, jpeg or pdf. The web server is freely available for public use at: http://rna.urmc.rochester.edu/RNAstructureWeb.
RNA is an important biomolecule, as a catalyst (1), a director of post-transcriptional modification (2) and gene regulation (3), a target of drugs (4,5) and also a pharmaceutical (6,7). Its structure is generally hierarchical (8). Primary structure is the sequence of nucleotides, and is a covalent bond structure. Secondary structure is the set of the canonical base pairs, and tertiary structure is the set of additional contacts and the complete three-dimensional structure.
Because RNA folding is generally hierarchical, secondary structure can often be predicted and analyzed without predicting tertiary structure. Secondary structure prediction can provide a framework for understanding the mechanism of action of RNA. It is also an important consideration in the design of siRNA and antisense DNA oligonucleotides, both to avoid structure in the mRNA target of these agents and to avoid self-structure in the agents themselves, either of which prevents them from hybridizing with their targets (9–14). RNA structure prediction can also identify which regions of a sequence are accessible for interacting with proteins (15). Finally, secondary structure prediction can be used to identify novel functional RNA sequences encoded in genomes (16–18).
RNAstructure is a software package for RNA secondary structure prediction and analysis (19). It was first reported in 1998 as a free energy minimization program for Microsoft Windows (20). It has been expanded to include methods for predicting bimolecular structure (12), conserved structures in multiple homologs (21–24) and siRNA design (9). Several methods are available for predicting structures for a single sequence, including maximum expected accuracy (25), stochastic sampling (26), exhaustive traceback (27) and pseudoknot prediction (28). Graphical user interfaces are provided for Microsoft Windows, Macintosh OS-X and Linux. Command line interfaces are also available for these operating systems. Finally, the set of underlying C++ classes is available as a library for use by programmers.
This report describes a new set of web servers that provide the functionality of RNAstructure. These web servers are open to the public and can be found at http://rna.urmc.rochester.edu/RNAstructureWeb.
The RNAstructure web servers are organized around two schemes. The first scheme provides specific programs for web users. Table 1 lists the programs that are available. The second scheme is a set of themes defined by the biological problem. For example, to predict a secondary structure for a single sequence, the server called Predict a Secondary Structure can be used to run calculations by multiple programs. Table 2 lists the themes and the exact programs each one includes. This approach is designed to be user-friendly because it bundles all the available methods for addressing a specific problem, and for this reason most users will want to use these servers.
The structure prediction servers work using either bare sequences pasted into a window, or FASTA-formatted files uploaded to the server. The exception to this is the set of servers that predict a structure common to three or more sequences. For these servers, including Multilign and TurboFold, the sequence window requires multiple FASTA-formatted sequences. Alternatively, a FASTA-formatted file with multiple sequences can be uploaded.
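As an illustration of the multi-sequence input described above, the following minimal Python sketch splits FASTA-formatted text into named sequences before it is pasted or uploaded. The function name `parse_fasta` is hypothetical and is not part of RNAstructure; the sketch assumes the conventional FASTA layout of a `>` header line followed by one or more sequence lines.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (name, sequence) tuples.

    Hypothetical helper for illustration only; not part of RNAstructure.
    Sequence lines are joined as-is, so letter case is preserved.
    """
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line.replace(" ", ""))
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


if __name__ == "__main__":
    example = ">seq1\nGGGAAA\nCCC\n>seq2\nGCGCUUCGGCGC\n"
    for seq_name, seq in parse_fasta(example):
        print(seq_name, len(seq))
```

Counting the parsed records in this way is a quick check that an input intended for a multiple-sequence server such as Multilign or TurboFold actually contains three or more sequences.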
Servers that act on structures, such as CircleCompare, draw, efn2, RemovePseudoknots or scorer, require an upload of a CT file. These can be generated manually or by a structure prediction server. Another alternative is to use a dot-bracket formatted structure and convert it to CT format using the dot2ct component of the RNAstructure server.
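To make the relationship between the two structure formats concrete, the sketch below converts a dot-bracket string into CT-style rows. It is an independent illustration, not the dot2ct implementation. It assumes the commonly described CT layout: a header line with the sequence length and a title, then one row per nucleotide holding the index, the base, the previous index, the next index (0 for the last nucleotide), the pairing partner (0 if unpaired) and the original numbering. Only the `(`, `)` and `.` symbols are handled, so pseudoknots written with additional bracket types are outside its scope. The function names are hypothetical.

```python
def dot_bracket_to_pairs(structure):
    """Return a list whose entry i holds the 1-based partner of nucleotide i+1 (0 if unpaired)."""
    partner = [0] * len(structure)
    stack = []  # 0-based positions of currently open '(' symbols
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i + 1}")
            j = stack.pop()
            partner[i] = j + 1
            partner[j] = i + 1
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {i + 1}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1] + 1}")
    return partner


def to_ct(title, sequence, structure):
    """Format a sequence and dot-bracket structure as CT-style text (assumed column layout)."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    partner = dot_bracket_to_pairs(structure)
    n = len(sequence)
    lines = [f"{n:5d}  {title}"]
    for i, (base, pair) in enumerate(zip(sequence, partner), start=1):
        next_index = i + 1 if i < n else 0
        lines.append(f"{i:5d} {base} {i - 1:5d} {next_index:5d} {pair:5d} {i:5d}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_ct("hairpin", "GGGAAACCC", "(((...)))"), end="")
```

A file produced this way should be checked against the format description in the RNAstructure online help before it is uploaded.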
For each individual server, sample input data can be generated automatically to illustrate format and to provide a test case for the server. For structure prediction servers, clicking a link pastes sample sequences into the sequence windows. For the servers that act on structures, a link is provided to download a sample CT file, and this can be uploaded back to the server as sample data.
For Fold, partition and Predict a Secondary Structure, structure prediction can be restrained using SHAPE mapping data, when available, to improve the accuracy of structure prediction (39). AllSub, Dynalign, Fold, partition and Predict a Secondary Structure accept folding constraints, including the ability to force specific base pairs, forbid specific pairs, force a nucleotide to be paired, force a nucleotide to be unpaired, and specify that a nucleotide is accessible to chemical modification (33,34). These constraints and restraints are supplied as file uploads.
Once a calculation is submitted, a notification page is displayed. This page continues to refresh until the calculation output is available. At that time, the page is replaced with a page that contains the output. This page can be bookmarked and returned to at a later time. Alternatively, an e-mail address can be provided as part of the input data. If this is done, an e-mail is sent to the address when the calculation is complete and the results are available.
The structure prediction servers display the predicted structure using an SVG drawing, which can be rendered by web browsers with an SVG plugin (Figure 1). If more than one structure is predicted, i.e. when suboptimal structures or a structure sample are present, ‘previous’ and ‘next’ buttons are displayed to enable scrolling between structures. Each predicted structure can be downloaded as a jpeg, svg, pdf, postscript or CT file.
Each output page displays the RNAstructure executable and the command line used to generate the results (40). Each input file is a link, allowing download of the processed-form data. This makes clear exactly what calculation is performed, and facilitates a user learning the capabilities of the RNAstructure software.
The RNAstructure web servers have limitations on the calculations that can be done. This ensures that the resource will be available to the broader community with reasonable wait times. A list of limitations is available with the online help. Currently, for example, the single sequence structure prediction methods are limited to sequences of 2500 nucleotides or fewer. TurboFold is limited to a maximum of 12 sequences, each up to 1600 nucleotides in length. If the required calculation exceeds the web server limits, the software can be downloaded (http://rna.urmc.rochester.edu/RNAstructure.html) and run locally.
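A user preparing input can test it against the limits quoted above before submitting. The following Python sketch encodes only the two example limits stated in the text (2500 nucleotides for single sequence prediction; 12 sequences of up to 1600 nucleotides for TurboFold). The constant and function names are hypothetical, and the online help remains the authoritative list because the limits may change.

```python
# Example limits quoted in the text; the online help is the authoritative source.
SINGLE_SEQUENCE_MAX_NT = 2500
TURBOFOLD_MAX_SEQUENCES = 12
TURBOFOLD_MAX_NT = 1600


def check_single_sequence(sequence):
    """Return a list of problems for a single-sequence prediction (empty if none)."""
    problems = []
    if len(sequence) > SINGLE_SEQUENCE_MAX_NT:
        problems.append(
            f"sequence has {len(sequence)} nt; limit is {SINGLE_SEQUENCE_MAX_NT}"
        )
    return problems


def check_turbofold(sequences):
    """Return a list of problems for a TurboFold submission (empty if none)."""
    problems = []
    if len(sequences) > TURBOFOLD_MAX_SEQUENCES:
        problems.append(
            f"{len(sequences)} sequences supplied; limit is {TURBOFOLD_MAX_SEQUENCES}"
        )
    for number, sequence in enumerate(sequences, start=1):
        if len(sequence) > TURBOFOLD_MAX_NT:
            problems.append(
                f"sequence {number} has {len(sequence)} nt; limit is {TURBOFOLD_MAX_NT}"
            )
    return problems


if __name__ == "__main__":
    for problem in check_turbofold(["ACGU" * 500, "GGGAAACCC"]):
        print(problem)
```

An empty result suggests the input fits these example limits; a non-empty result indicates that running the downloaded software locally is the appropriate route.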
Extensive online help is available for using the RNAstructure web server. The help documents each of the input and output fields, and is organized by individual server. For each server help page, a link is provided to the underlying RNAstructure executable (19), which provides additional details about the program. A separate help page details the server limitations, as explained above.
RNAstructure is designed to be a user-friendly software package, accessible to the community of investigators studying RNA (19). These new web servers continue this tradition, and provide the software to a wider user base. The accuracy of the algorithms has been extensively tested in prior publications. As the algorithms in RNAstructure are improved, and new algorithms developed, the web servers will be updated to continue to make RNAstructure available to the community.
National Institutes of Health (NIH) [R01 GM076485 to D.H.M.]. Funding for open access charge: NIH [R01 GM076485].
Conflict of interest statement. None declared.