RNA helicases are ubiquitous and essential enzymes that function in nearly all aspects of RNA metabolism. The RNA helicase database (www.rnahelicase.org) integrates the wealth of accumulating information on RNA helicases in a readily accessible format. The database is a portal that allows straightforward retrieval of comprehensive information on sequence, structure and on biochemical and cellular functions of all RNA helicases from the most widely used model organisms Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, mouse and human. Also included are RNA helicases from other organisms that are subject to specific investigation. The database is structured according to the most recent helicase classification into helicase superfamilies (SFs) and families, and thus emphasizes phylogenetic relations between RNA helicases as well. Information on individual RNA helicases can be accessed through various browsing routes or through text-based searches of the database.
Nearly all aspects of RNA metabolism involve RNA helicases, enzymes that use ATP to bind or remodel RNA and RNA–protein complexes (1). As one of the largest classes of enzymes in RNA metabolism, RNA helicases are encoded by organisms from all kingdoms of life and by many viruses (2). In structure and sequence, RNA helicases are closely related to DNA helicases (3). However, RNA helicases outnumber their DNA-bound cousins by a considerable margin, and RNA helicases often perform functions substantially different from those attributed to DNA helicases (1–3).
RNA helicases are essential not only for most processes of RNA metabolism, including ribosome biogenesis, pre-mRNA splicing and translation initiation (1–3), but also for sensing viral RNAs in the context of the innate immune system and for the biogenesis and function of miRNAs (4,5). Defects in or misregulation of certain RNA helicases have been linked to numerous health issues including cancer, neurodegenerative disorders and infectious diseases (6–9). Given their central biological roles, RNA helicases are subject to intensive ongoing research in diverse fields. To date, more than 7000 articles and roughly 500 reviews have been published on subjects related to RNA helicases. Currently, about 500 publications appear per year, with an upward trajectory.
With increasing volume and diversity of the accumulating data on RNA helicases, the need arises to integrate the wealth of information in a timely and readily accessible format. To address this issue, we have compiled the RNA helicase database (www.rnahelicase.org). This database is a completely restructured version of the DExH/D protein database from 1999, which covered only a subset of RNA helicases (10). We have now expanded the scope to RNA helicases in general. We approached the design of the RNA helicase database from the view of a user who is interested in the fast retrieval of comprehensive information on one or more specific RNA helicases, and in examining phylogenetic relations of RNA helicases to each other. Therefore, the RNA helicase database aims foremost to enable researchers to locate and retrieve comprehensive information about the sequence, structure, and biochemical and cellular function of RNA helicases. It covers all RNA helicases from the most widely used model organisms Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, mouse and human. Also included are RNA helicases from other organisms that are subject to specific investigation, e.g. NS3 from hepatitis C virus (11), the P4 RNA packaging motor from Φ12 (12) or RNA helicase A from bovine (13). Focus on these RNA helicases covers the vast majority of published data on RNA helicases (estimated at >95%), while providing an enduring structure for the database. Notwithstanding, the database can be readily expanded to other organisms, if this need arises in the future.
In addition to enabling access to information on individual RNA helicases, the database emphasizes phylogenetic relations between related RNA helicases. As briefly outlined below and in more detail in several recent reviews, RNA helicases are categorized in superfamilies and families, and many features are shared between RNA helicases of a given family (3,14,15). Functions and features of one RNA helicase often provide clues for other, less well-characterized enzymes from the same family (3). However, family relations are often not apparent to investigators outside the RNA helicase field. We therefore believe that emphasizing family relations is particularly useful for database users new to research on RNA helicases.
The structure of the RNA helicase database is based on the most recent helicase classification into helicase superfamilies (SFs) and families (3,14,15). According to sequence and protein structures, all RNA and DNA helicases are classified into six SFs (15). SFs 1 and 2 contain non-ring-forming helicases, SFs 3–6 contain ring-forming enzymes. RNA helicases are found in SFs 1–5. All eukaryotic RNA helicases identified to date belong to the non-ring-forming SFs 1 and 2 (3). Ring-forming RNA helicases are found in bacteria and viruses (12,16). SFs 1 and 2 are further divided into families (Figure 1). Although all SF1 and SF2 proteins share extensive structural similarities in the overall fold of their helicase core, the respective families have distinct structure and sequence characteristics that are reflected in specific functional features (3).
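The classification rules stated above can be written down as a small lookup structure. The following Python sketch is only an illustration of those rules, not a description of how the database is implemented; the names `Superfamily`, `SUPERFAMILIES` and `rna_helicase_superfamilies` are hypothetical and do not appear in the database.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Superfamily:
    """One helicase superfamily (hypothetical record type for illustration)."""
    number: int
    ring_forming: bool            # SFs 1 and 2: non-ring-forming; SFs 3-6: ring-forming
    contains_rna_helicases: bool  # RNA helicases are found in SFs 1-5


# Six superfamilies, encoded directly from the rules in the text.
SUPERFAMILIES: List[Superfamily] = [
    Superfamily(number=n, ring_forming=(n >= 3), contains_rna_helicases=(n <= 5))
    for n in range(1, 7)
]


def rna_helicase_superfamilies(ring_forming: Optional[bool] = None) -> List[Superfamily]:
    """Return the SFs that contain RNA helicases, optionally filtered by ring formation."""
    return [
        sf for sf in SUPERFAMILIES
        if sf.contains_rna_helicases
        and (ring_forming is None or sf.ring_forming == ring_forming)
    ]


if __name__ == "__main__":
    # Non-ring-forming SFs with RNA helicases; all eukaryotic RNA helicases fall here.
    print([sf.number for sf in rna_helicase_superfamilies(ring_forming=False)])
    # Ring-forming SFs with RNA helicases (found in bacteria and viruses).
    print([sf.number for sf in rna_helicase_superfamilies(ring_forming=True)])
```

The further division of SFs 1 and 2 into families is deliberately left out of the sketch, because the family names are given in Figure 1 rather than in the text.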
The user enters the database through the homepage (Figure 2). From this page, via the main navigation (Figure 3A), pages for the helicase SFs 3–5, as well as pages for the SF1 and 2 families can be accessed (Figures 2 and 3B). From these pages, one can directly reach ‘individual protein pages’ (Figure 2). These pages contain information on orthologs of a given RNA helicase from S. cerevisiae, C. elegans, D. melanogaster, mouse and human (Figure 3C). DEAD-box and DEAH/RHA helicases from E. coli are listed on separate pages (Figure 2), because there are considerably fewer enzymes than in eukaryotes and because the structure of the individual protein pages, which emphasizes orthologs of multiple organisms, appears not suitable for representing only one protein from E. coli.
Individual protein pages provide links to various sequence databases with original information about a particular RNA helicase. Links for each protein exist to GenBank and UniProt, as well as to the organism-specific databases SGD (S. cerevisiae), WormBase (C. elegans) and FlyBase (D. melanogaster) (Figure 3C). If structural information is available, respective links to the PDB database are given. The user also has the option to perform a targeted PubMed literature search for the helicase orthologs (Figure 3C). The cellular function of each protein is given only where explicitly tested, although it is generally believed that the cellular functions are largely conserved for orthologs. Finally, a curated alignment of the listed orthologs is provided. This alignment enables the user to identify regions of conservation among orthologs that extend beyond the characteristic helicase motifs.
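To illustrate what a targeted PubMed search for a set of orthologs might look like, the following Python sketch combines ortholog names into a single OR query and wraps it in a PubMed search URL. This is an assumption made for illustration: the text does not describe how the database constructs its searches, the function names are hypothetical, and the two protein names in the usage example are placeholders chosen by us. The sketch assumes the public PubMed web address `https://pubmed.ncbi.nlm.nih.gov/` with a `term` query parameter.

```python
from typing import Iterable
from urllib.parse import urlencode

# Assumed PubMed web search endpoint; not taken from the database itself.
PUBMED_SEARCH_URL = "https://pubmed.ncbi.nlm.nih.gov/"


def pubmed_query(ortholog_names: Iterable[str]) -> str:
    """Quote each ortholog name and join the names with OR (hypothetical helper)."""
    terms = [f'"{name}"' for name in ortholog_names if name]
    return " OR ".join(terms)


def pubmed_url(ortholog_names: Iterable[str]) -> str:
    """Build a search URL whose 'term' parameter carries the combined query."""
    return PUBMED_SEARCH_URL + "?" + urlencode({"term": pubmed_query(ortholog_names)})


if __name__ == "__main__":
    # Placeholder names for a yeast protein and a human ortholog.
    names = ["Ded1", "DDX3X"]
    print(pubmed_query(names))
    print(pubmed_url(names))
```

Joining the names with OR retrieves literature on any of the orthologs in one search, which matches the page layout in which orthologs from several organisms are presented together.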
Individual protein pages were designed based on our roughly decade-long experience with the DExH/D database. Guided by this experience, we chose to directly refer to the large sequence databases for detailed protein information, and not to copy this information into our database. Directly linking to comprehensive, constantly updated sequence databases ensures that information is current without direct intervention from our side.
Further features of the RNA helicase database include full-text search through text-based queries of the kind familiar from the Google search engine (Figure 3A). The search function can be accessed from any page and provides an alternative, yet important route to individual protein pages. A third route of access to individual protein pages is established by a list of proteins in the database (Figure 2). This list shows all RNA helicases in our database, ordered by SF and family and by the organism of origin. Our rationale for incorporating the various routes of access to the individual protein pages is the heterogeneity of the database users, who range from researchers looking for updated information on a helicase that they have long been working on, to those retrieving the first comprehensive information on a protein they just came across in their research. Finally, the database also contains other information of interest for research on RNA helicases, such as an RNA helicase primer, the naming code for DDX/DHX RNA helicases and a list of laboratories with specific interest in RNA helicases.
Funding for open access charge: National Institutes of Health (GM067700 to E.J.).
Conflict of interest statement. None declared.
We thank the many users of previous versions of our database for constructive feedback and valuable suggestions.