The mammalian genome encodes thousands of non-protein-coding RNAs (ncRNAs). Ribosomal RNAs (rRNAs), transfer RNAs (tRNAs) and small nuclear RNAs (snRNAs) fulfil mainly housekeeping roles in mRNA translation and splicing. Small nucleolar RNAs (snoRNAs) and the related small Cajal body-specific RNAs (scaRNAs) guide modifications of other RNAs (1). MicroRNAs (miRNAs) regulate gene expression by controlling mRNA translation and turnover (4), PIWI-interacting RNAs (piRNAs) are thought to be important in spermatogenesis (6), while larger ncRNAs have been found to be developmentally regulated (8) and to function in a range of processes including genomic imprinting, intracellular protein trafficking and brain development (11). The abundance of ncRNAs has become apparent only in the past few years and was largely unexpected. Although many recently identified ncRNAs remain of unknown function and appear to be evolving rapidly (15), it is increasingly clear that ncRNAs represent a diverse and important class of functional output from mammalian genomes.
RNAdb is a comprehensive database of mammalian ncRNAs. The database focuses on ncRNAs that have restricted expression and whose function is likely to be regulatory. Housekeeping RNAs (rRNAs, tRNAs, snRNAs) are not included and are covered elsewhere (18). The aim of the database is to provide a nucleotide sequence-based platform to facilitate both bioinformatic and experimental research in the burgeoning field of RNomics. RNAdb has already been used to develop machine-learning algorithms for identifying ncRNAs (20), to annotate transcripts from a large-scale transcriptome project (8) and to examine ncRNA evolution (17). In addition to sequence data, individual ncRNA entries in RNAdb carry annotations drawn from publicly available information in the literature or secondary databases. The database can therefore also be browsed or searched by the casual user interested in learning more about particular ncRNAs.
Since the original release of RNAdb two years ago (21), the number of known mammalian ncRNAs has grown considerably. In recognition of this growth, we have updated the database to include tens of thousands of novel ncRNAs. Some of these have been characterized in isolation, continuing the trend of ad hoc discovery by which many earlier ncRNAs were identified. The majority, however, come from large-scale cloning and sequencing studies or structural alignment-based predictions. As well as incorporating new ncRNA datasets, the current release of RNAdb provides other enhancements, including microarray-based expression data, a closer interface with specialized ncRNA resources such as miRBase and snoRNA-LBME-db (3), and the availability of data for use as custom tracks on the UCSC Genome Browser (23