The primary aims of the miRNA Registry are two-fold. The first is to assign unique names to distinct miRNAs prior to publication of their discovery. A web interface has been developed to facilitate the submission of miRNA sequences for naming. To avoid accidental overlap of gene names, and to minimize ‘pre-booking’ of assignments, the Registry will assign a name only after a paper describing the sequence has been accepted for publication. Authors are advised to use temporary names in initial submission of articles to journals for peer-review. On acceptance, final names are discussed and agreed with the corresponding author. The miRNA Registry maintains complete confidentiality for pre-publication data.
miRNAs are given numerical identifiers based on sequence similarity. At the time of writing, the last assigned name is miR-318 from Drosophila melanogaster. The next miRNA with no similarity to previously identified sequences will receive the name miR-319. It is desirable for homologues in different organisms to receive the same name. Names are based on the similarity of the excised ~22 nt sequence to previously identified miRNAs. Identical mature sequences are assigned the same name; if they originate from separate genomic loci in a given organism, they are given numerical suffixes, such as mir-6-1. Sequences with one or two base changes are assigned letter suffixes of the form miR-181a and miR-181b (17). Homologous sequences with more base differences may be suggested by sequence similarity in the hairpin portion of the primary transcript; such cases are discussed, and names agreed, with the corresponding author. Some miRNA hairpin precursors give rise to two excised miRNAs, one from each arm. Different naming conventions have been used to describe these sequences. Where cloning studies have allowed researchers to determine which arm of the precursor gives rise to the predominantly expressed miRNA, an asterisk has been used to denote the less predominant form, as in miR-56 and miR-56* from C.elegans. Previous reports have also denoted miRNAs from opposite arms of the hairpin precursor as, for example, miR-142-s (5′ arm) and miR-142-as (3′ arm) (5). Current opinion favours using names of the form miR-142-5p and miR-142-3p to designate miRNAs from the 5′ and 3′ arms, respectively, until the data are sufficient to confirm which is predominantly expressed (T. Tuschl and D. Bartel, personal communication). Capitalisation of names should not be relied upon to confer meaning, but historically, mir-16 has been used to designate the gene (and also the predicted stem–loop portion of the primary transcript), whereas miR-16 signifies the excised ~22 nt sequence. Plant gene names follow a slightly different convention, of the form MIR156.
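The name forms described above can be read mechanically. The following Python sketch is illustrative only and is not part of the Registry: the regular expression, the `ParsedName` class and the `parse_name` function are hypothetical names introduced here, and the pattern covers only the forms mentioned in the text (numeric locus suffix, letter suffix, arm suffix, asterisk and the plant-style prefix).

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern covering only the name forms mentioned in the text.
_NAME_RE = re.compile(
    r"^(?P<prefix>mir|miR|MIR)-?(?P<number>\d+)"  # mir-16, miR-16, MIR156
    r"(?P<letter>[a-z])?"                         # miR-181a / miR-181b
    r"(?:-(?P<locus>\d+))?"                       # mir-6-1 (separate loci)
    r"(?:-(?P<arm>5p|3p|s|as))?"                  # miR-142-5p, miR-142-as
    r"(?P<star>\*)?$"                             # miR-56*
)

# Arm labels from the text: -5p and -s mark the 5' arm, -3p and -as the 3' arm.
_ARM = {"5p": "5'", "s": "5'", "3p": "3'", "as": "3'"}


@dataclass(frozen=True)
class ParsedName:
    prefix: str            # capitalisation as written; not a reliable signal
    number: int            # numerical identifier, e.g. 181
    letter: Optional[str]  # suffix for closely related mature sequences
    locus: Optional[int]   # suffix for identical sequences at separate loci
    arm: Optional[str]     # "5'" or "3'" when an arm suffix is present
    starred: bool          # True for the less predominant form


def parse_name(name: str) -> ParsedName:
    """Split a miRNA name into its components, or raise ValueError."""
    m = _NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised miRNA name: {name!r}")
    locus = m.group("locus")
    arm = m.group("arm")
    return ParsedName(
        prefix=m.group("prefix"),
        number=int(m.group("number")),
        letter=m.group("letter"),
        locus=int(locus) if locus else None,
        arm=_ARM[arm] if arm else None,
        starred=m.group("star") is not None,
    )


if __name__ == "__main__":
    for example in ("mir-6-1", "miR-181a", "miR-142-5p", "miR-56*", "MIR156"):
        print(example, parse_name(example))
```

The prefix is stored exactly as written and is not interpreted, in keeping with the caution that capitalisation should not be relied upon to confer meaning.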
The second aim of the miRNA Registry is to provide a comprehensive and searchable database of all published miRNA sequences. To this end, submitted sequences are moved to the public sections of the database on their publication. The website includes a browsable list of miRNA entries, name, keyword and publication searches, and allows the user to search a sequence against the database of predicted hairpins and mature miRNAs. Each database entry represents a predicted stem–loop containing the miRNA, with the bounds of the excised sequence(s) reported. The publication describing the discovery of the miRNA is cited as the primary reference. A brief description of the genomic location, homologous sequences and possible targets is provided, with links to literature references for more information. Cross-links to nucleotide databases, model organism databases and RNA family databases are given. Hairpin base-paired structures are depicted as predicted by the RNAfold program from the ViennaRNA package (26). A typical entry page is shown in the accompanying figure.
The entry for mir-1 from C.elegans. The predicted stem–loop portion of the primary transcript and the excised miRNA sequence are depicted. Links to other data sources, references and annotation are also shown.
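As a minimal sketch of how an entry of this kind might be represented, the Python fragment below stores a stem–loop sequence together with the bounds of its excised sequence(s) and recovers the mature sequence from them. The class name, field names and the 1-based inclusive coordinate convention are assumptions made for illustration, and the sequence used in the example is a synthetic placeholder rather than real data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StemLoopEntry:
    """Hypothetical record: a predicted stem-loop plus mature-sequence bounds."""
    name: str
    hairpin: str
    # Assumed convention: 1-based, inclusive (start, end) positions.
    mature_bounds: List[Tuple[int, int]]

    def mature_sequences(self) -> List[str]:
        """Return the excised sequence for each pair of bounds."""
        sequences = []
        for start, end in self.mature_bounds:
            if not (1 <= start <= end <= len(self.hairpin)):
                raise ValueError(f"bounds {start}-{end} fall outside the hairpin")
            # Convert 1-based inclusive bounds to a Python slice.
            sequences.append(self.hairpin[start - 1:end])
        return sequences


if __name__ == "__main__":
    # Synthetic placeholder: 10 nt flank, 22 nt "mature" region, 10 nt flank.
    placeholder = "A" * 10 + "ACGU" * 5 + "AC" + "U" * 10
    entry = StemLoopEntry("mir-example", placeholder, [(11, 32)])
    print(entry.mature_sequences())
```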
A commitment to the long-term curation of the miRNA Registry ensures the rapid dissemination of new sequence data and annotation. Each database entry is identified by a stable accession number in addition to the miRNA gene name. This enables the rationalisation of gene names as more data become available, whilst maintaining information for tracking changes from initial published names and descriptions. At the time of writing, the database contains only published miRNA loci, but miRNA annotation guidelines allow for the computational identification of homologues of validated miRNA sequences (24). The size of the database is likely to increase significantly as such sequences are curated by us and others. As more information becomes available about the biogenesis of miRNAs, we predict that it will become desirable to curate sequence information for the primary transcript and the hairpin precursor, as well as the excised mature miRNA. Close integration with the Rfam database (27) facilitates the classification of related miRNA sequences into families.
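The role of the stable accession number can be illustrated with a short Python sketch. It shows one possible way to let a gene name change while the accession stays fixed and earlier names remain searchable. The `Entry` and `Registry` classes, the accession string and the temporary name are all hypothetical and do not describe the Registry's actual schema or identifiers.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    accession: str  # stable identifier; never changes
    name: str       # current gene name; may be rationalised later
    previous_names: List[str] = field(default_factory=list)


class Registry:
    """Hypothetical in-memory store keyed by stable accession."""

    def __init__(self) -> None:
        self._by_accession: Dict[str, Entry] = {}

    def add(self, accession: str, name: str) -> Entry:
        if accession in self._by_accession:
            raise ValueError(f"accession already in use: {accession}")
        entry = Entry(accession, name)
        self._by_accession[accession] = entry
        return entry

    def rename(self, accession: str, new_name: str) -> None:
        """Change the name but keep the old one for tracking."""
        entry = self._by_accession[accession]
        if new_name != entry.name:
            entry.previous_names.append(entry.name)
            entry.name = new_name

    def find_by_name(self, name: str) -> List[Entry]:
        """Match against current and earlier names."""
        return [
            e for e in self._by_accession.values()
            if name == e.name or name in e.previous_names
        ]


if __name__ == "__main__":
    registry = Registry()
    registry.add("ACC0001", "mir-temp-1")   # hypothetical accession and name
    registry.rename("ACC0001", "mir-319")   # illustrative final name
    print(registry.find_by_name("mir-temp-1"))
```

Keying the store on the accession rather than the name means that a rename touches one record only, and any external link that uses the accession continues to resolve.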