Results 1-9 (9)

1.  Vive la différence: naming structural variants in the human reference genome 
Human Genomics  2013;7(1):12.
The HUGO Gene Nomenclature Committee has approved gene symbols for the majority of protein-coding genes on the human reference genome. To adequately represent regions of complex structural variation, the Genome Reference Consortium now includes alternative representations of some of these regions as part of the reference genome. Here, we describe examples of how we name novel genes in these regions and how this nomenclature is displayed on our website.
PMCID: PMC3648363  PMID: 23634723
Gene nomenclature; Reference genome; Structural variants; Human
2.  Genenames.org: the HGNC resources in 2013 
Nucleic Acids Research  2012;41(D1):D545-D552.
The HUGO Gene Nomenclature Committee, situated at the European Bioinformatics Institute, assigns unique symbols and names to human genes. Since 2011, the data within our database has expanded, largely owing to an increase in naming pseudogenes and non-coding RNA genes, and we now have >33 500 approved symbols. Our gene families and groups have also increased to nearly 500, with ∼45% of our gene entries associated with at least one family or group. We have also redesigned the HUGO Gene Nomenclature Committee website, creating a consistent look and feel across the site and improving usability and readability for our users. The site provides a public access portal to our database with no restrictions imposed on access or the use of the data. Within this article, we review our online resources and data with particular emphasis on the updates to our website.
PMCID: PMC3531211  PMID: 23161694
3.  FlyBase: improvements to the bibliography 
Nucleic Acids Research  2012;41(D1):D751-D757.
An accurate, comprehensive, non-redundant and up-to-date bibliography is a crucial component of any Model Organism Database (MOD). Principally, the bibliography provides a set of references that are specific to the field served by the MOD. Moreover, it serves as a backbone to which all curated biological data can be attributed. Here, we describe the organization and main features of the bibliography in FlyBase, the MOD for Drosophila melanogaster. We present an overview of the current content of the bibliography, the pipeline for identifying and adding new references, the presentation of data within Reference Reports and effective methods for searching and retrieving bibliographic data. We highlight recent improvements in these areas and describe the advantages of using the FlyBase bibliography over alternative literature resources. Although this article is focused on bibliographic data, many of the features and tools described are applicable to browsing and querying other datasets in FlyBase.
PMCID: PMC3531214  PMID: 23125371
4.  Gene family matters: expanding the HGNC resource 
Human Genomics  2012;6(1):4.
The HUGO Gene Nomenclature Committee (HGNC) assigns approved gene symbols to human loci. There are currently over 33,000 approved gene symbols, the majority of which represent protein-coding genes, but we also name other locus types such as non-coding RNAs, pseudogenes and phenotypic loci. Where relevant, the HGNC organise these genes into gene families and groups. The HGNC website is an online repository of HGNC-approved gene nomenclature and associated resources for human genes, and includes links to genomic, proteomic and phenotypic information. In addition to this, we also have dedicated gene family web pages and are currently expanding and generating more of these pages using data curated by the HGNC and from information derived from external resources that focus on particular gene families. Here, we review our current online resources with a particular focus on our gene family data, using it to highlight our new Gene Symbol Report and gene family data downloads.
PMCID: PMC3437568  PMID: 23245209
5.  A revised nomenclature for transcribed human endogenous retroviral loci 
Mobile DNA  2011;2:7.
Endogenous retroviruses (ERVs) and ERV-like sequences comprise 8% of the human genome. A hitherto unknown proportion of ERV loci are transcribed and thus contribute to the human transcriptome. A small proportion of these loci encode functional proteins. As the role of ERVs in normal and diseased biological processes is not yet established, transcribed ERV loci are of particular interest. As more transcribed ERV loci are likely to be identified in the near future, the development of a systematic nomenclature is important to ensure that all information on each locus can be easily retrieved.
Here we present a revised nomenclature of transcribed human endogenous retroviral loci that sorts loci into groups based on Repbase classifications. Each symbol is of the format ERV + group symbol + unique number. Group symbols are based on a mixture of Repbase designations and well-supported symbols used in the literature. The presented guidelines will allow newly identified loci to be easily incorporated into the scheme.
The naming system will be employed by the HUGO Gene Nomenclature Committee for naming transcribed human ERV loci. We hope that the system will contribute to clarifying a certain aspect of a sometimes confusing nomenclature for human endogenous retroviruses. The presented system may also be employed for naming transcribed loci of human non-ERV repeat loci.
PMCID: PMC3113919  PMID: 21542922
6.  Genenames.org: the HGNC resources in 2011 
Nucleic Acids Research  2010;39(Database issue):D514-D519.
The HUGO Gene Nomenclature Committee (HGNC) aims to assign a unique gene symbol and name to every human gene. The HGNC database currently contains almost 30 000 approved gene symbols, over 19 000 of which represent protein-coding genes. The public website displays all approved nomenclature within Symbol Reports that contain data curated by HGNC editors and links to related genomic, phenotypic and proteomic information. Here we describe improvements to our resources, including a new Quick Gene Search, a new List Search, an integrated HGNC BioMart and a new Statistics and Downloads facility.
PMCID: PMC3013772  PMID: 20929869
7.  FlyBase: enhancing Drosophila Gene Ontology annotations 
Nucleic Acids Research  2008;37(Database issue):D555-D559.
FlyBase is a database of Drosophila genetic and genomic information. Gene Ontology (GO) terms are used to describe three attributes of wild-type gene products: their molecular function, the biological processes in which they play a role, and their subcellular location. This article describes recent changes to the FlyBase GO annotation strategy that are improving the quality of the GO annotation data. Many of these changes stem from our participation in the GO Reference Genome Annotation Project, a multi-database collaboration producing comprehensive GO annotation sets for 12 diverse species.
PMCID: PMC2686450  PMID: 18948289
8.  Natural Language Processing in aid of FlyBase curators 
BMC Bioinformatics  2008;9:193.
Despite increasing interest in applying Natural Language Processing (NLP) to biomedical text, whether this technology can facilitate tasks such as database curation remains unclear.
PaperBrowser is the first NLP-powered interface that was developed under a user-centered approach to improve the way in which FlyBase curators navigate an article. In this paper, we first discuss how observing curators at work informed the design and evaluation of PaperBrowser. Then, we present how we appraise PaperBrowser's navigational functionalities in a user-based study using a text highlighting task and evaluation criteria of Human-Computer Interaction. Our results show that PaperBrowser reduces the number of interactions between two highlighting events and therefore improves navigational efficiency by about 58% compared to the navigational mechanism that was previously available to the curators. Moreover, PaperBrowser is shown to enhance navigational utility for curators by over 74%, irrespective of the different ways in which they highlight text in the article.
We show that state-of-the-art performance in certain NLP tasks such as Named Entity Recognition and Anaphora Resolution can be combined with the navigational functionalities of PaperBrowser to support curation quite successfully.
PMCID: PMC2375127  PMID: 18410678
9.  Serum-deprivation stimulates cap-binding by PARN at the expense of eIF4E, consistent with the observed decrease in mRNA stability 
Nucleic Acids Research  2005;33(1):376-387.
PARN, a poly(A)-specific ribonuclease, binds the 5′ cap-structure of mRNA and initiates deadenylation-dependent decay. Eukaryotic initiation factor 4E (eIF4E) also binds to the cap structure, an interaction that is critical for initiating cap-dependent translation. The stability of various mRNA transcripts in human cell lines is reduced under conditions of serum starvation as determined by both functional and chemical half-lives. Serum starvation also leads to enhanced cap association by PARN. In contrast, the 5′ cap occupancy by eIF4E decreases under serum-deprivation, as does the translation of reporter transcripts. Further, we show that PARN is a phosphoprotein and that this modification can be modulated by serum status. Taken together, these data are consistent with a natural competition existing at the 5′ cap structure between PARN and eIF4E that may be regulated by changes in post-translational modifications. These phosphorylation-induced changes in the interplay of PARN and eIF4E may determine whether the mRNA is translated or decayed.
PMCID: PMC546156  PMID: 15653638
