Results 1-9 (9)
 

1.  tagtog: interactive and text-mining-assisted annotation of gene mentions in PLOS full-text articles 
The breadth and depth of biomedical literature are increasing year upon year. To keep abreast of these increases, FlyBase, a database for Drosophila genomic and genetic information, is constantly exploring new ways to mine the published literature to increase the efficiency and accuracy of manual curation and to automate some aspects, such as triaging and entity extraction. Toward this end, we present the ‘tagtog’ system, a web-based annotation framework that can be used to mark up biological entities (such as genes) and concepts (such as Gene Ontology terms) in full-text articles. tagtog leverages manual user annotation in combination with automatic machine-learned annotation to provide accurate identification of gene symbols and gene names. As part of the BioCreative IV Interactive Annotation Task, FlyBase has used tagtog to identify and extract mentions of Drosophila melanogaster gene symbols and names in full-text biomedical articles from the PLOS stable of journals. We show here the results of three experiments with different sized corpora and assess gene recognition performance and curation speed. We conclude that tagtog-named entity recognition improves with a larger corpus and that tagtog-assisted curation is quicker than manual curation.
Database URL: www.tagtog.net, www.flybase.org
doi:10.1093/database/bau033
PMCID: PMC3978375  PMID: 24715220
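The abstract above reports gene recognition performance without naming the metric. Named entity recognition is commonly scored with precision, recall and F1 over exact-match mentions, so the following is a minimal sketch under that assumption, not the authors' evaluation code. The function name and the example mentions are hypothetical and purely illustrative.

```python
def precision_recall_f1(predicted, gold):
    """Score predicted mentions against gold mentions using exact matching.

    Each mention is a hashable tuple, here (start, end, text).
    This is an assumed, generic scoring scheme, not tagtog's own.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against division by zero when either set is empty.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical curator (gold) and automatic (predicted) gene mentions.
gold = {(0, 3, "dpp"), (10, 12, "wg"), (20, 22, "hh")}
predicted = {(0, 3, "dpp"), (10, 12, "wg"), (30, 32, "en")}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching on character offsets is the strictest choice. A more lenient overlap-based match would give higher scores on the same annotations.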
2.  FlyBase 102—advanced approaches to interrogating FlyBase 
Nucleic Acids Research  2013;42(D1):D780-D788.
FlyBase (http://flybase.org) is the leading website and database of Drosophila genes and genomes. Whether you are using the fruit fly Drosophila melanogaster as an experimental system or wish to understand Drosophila biological knowledge in relation to human disease or to other model systems, FlyBase can help you successfully find the information you are looking for. Here, we demonstrate some of our more advanced searching systems and highlight some of our new tools for searching the wealth of data on FlyBase. The first section explores gene function in FlyBase, using our TermLink tool to search with Controlled Vocabulary terms and our new RNA-Seq Search tool to search gene expression. The second section of this article describes a few ways to search genomic data in FlyBase, using our BLAST server and the new implementation of GBrowse 2, as well as our new FeatureMapper tool. Finally, we move on to discuss our most powerful search tool, QueryBuilder, before describing pre-computed cuts of the data and how to query the database programmatically.
doi:10.1093/nar/gkt1092
PMCID: PMC3964969  PMID: 24234449
3.  The Drosophila phenotype ontology 
Background
Phenotype ontologies are queryable classifications of phenotypes. They provide a widely used means for annotating phenotypes in a form that is human-readable, programmatically accessible and that can be used to group annotations in biologically meaningful ways. Accurate manual annotation requires clear textual definitions for terms. Accurate grouping and fruitful programmatic usage require high-quality formal definitions that can be used to automate classification. The Drosophila phenotype ontology (DPO) has been used to annotate over 159,000 phenotypes in FlyBase to date, but until recently lacked textual or formal definitions.
Results
We have composed textual definitions for all DPO terms and formal definitions for 77% of them. Formal definitions reference terms from a range of widely-used ontologies including the Phenotype and Trait Ontology (PATO), the Gene Ontology (GO) and the Cell Ontology (CL). We also describe a generally applicable system, devised for the DPO, for recording and reasoning about the timing of death in populations. As a result of the new formalisations, 85% of classifications in the DPO are now inferred rather than asserted, with much of this classification leveraging the structure of the GO. This work has significantly improved the accuracy and completeness of classification and made further development of the DPO more sustainable.
Conclusions
The DPO provides a set of well-defined terms for annotating Drosophila phenotypes and for grouping and querying the resulting annotation sets in biologically meaningful ways. Such queries have already resulted in successful function predictions from phenotype annotation. Moreover, such formalisations make extended queries possible, including cross-species queries via the external ontologies used in formal definitions. The DPO is openly available under an open source license in both OBO and OWL formats. There is good potential for it to be used more broadly by the Drosophila community, which may ultimately result in its extension to cover a broader range of phenotypes.
doi:10.1186/2041-1480-4-30
PMCID: PMC3816596  PMID: 24138933
Drosophila; Phenotype; Ontology; OWL; OBO; Gene ontology; FlyBase
4.  Opportunities for text mining in the FlyBase genetic literature curation workflow 
FlyBase is the model organism database for Drosophila genetic and genomic information. Over the last 20 years, FlyBase has had to adapt and change to keep abreast of advances in biology and database design. We are continually looking for ways to improve curation efficiency and efficacy. Genetic literature curation focuses on the extraction of genetic entities (e.g. genes, mutant alleles, transgenic constructs) and their associated phenotypes and Gene Ontology terms from the published literature. Over 2000 Drosophila research articles are now published every year. These articles are becoming ever more data-rich and there is a growing need for text mining to shoulder some of the burden of paper triage and data extraction. In this article, we describe our curation workflow, along with some of the problems and bottlenecks therein, and highlight the opportunities for text mining. We do so in the hope of encouraging the BioCreative community to help us to develop effective methods to mine this torrent of information.
Database URL: http://flybase.org
doi:10.1093/database/bas039
PMCID: PMC3500518  PMID: 23160412
5.  FlyBase 101 – the basics of navigating FlyBase 
Nucleic Acids Research  2011;40(D1):D706-D714.
FlyBase (http://flybase.org) is the leading database and web portal for genetic and genomic information on the fruit fly Drosophila melanogaster and related fly species. Whether you use the fruit fly as an experimental system or want to apply Drosophila biological knowledge to another field of study, FlyBase can help you successfully navigate the wealth of available Drosophila data. Here, we review the FlyBase web site with novice and less-experienced users of FlyBase in mind and point out recent developments stemming from the availability of genome-wide data from the modENCODE project. The first section of this paper explains the organization of the web site and describes the report pages available on FlyBase, focusing on the most popular, the Gene Report. The next section introduces some of the search tools available on FlyBase, in particular, our heavily used and recently redesigned search tool QuickSearch, found on the FlyBase homepage. The final section concerns genomic data, including recent modENCODE (http://www.modencode.org) data, available through our Genome Browser, GBrowse.
doi:10.1093/nar/gkr1030
PMCID: PMC3245098  PMID: 22127867
6.  Inside FlyBase 
Fly  2009;3(1):112-114.
As research in the biological sciences continues to advance at a rapid pace, it is increasingly important that the data be captured, standardized, organized and made accessible to the scientific community. This is the job of a biocurator. Here we describe the process of biocuration from our perspective as FlyBase curators.
PMCID: PMC2837272  PMID: 19182544
biocuration; ontology; model organism database; FlyBase; career
7.  Drosophila Neurotrophins Reveal a Common Mechanism for Nervous System Formation 
PLoS Biology  2008;6(11):e284.
Neurotrophic interactions occur in Drosophila, but to date, no neurotrophic factor had been found. Neurotrophins are the main vertebrate secreted signalling molecules that link nervous system structure and function: they regulate neuronal survival, targeting, synaptic plasticity, memory and cognition. We have identified a neurotrophic factor in flies, Drosophila Neurotrophin (DNT1), structurally related to all known neurotrophins and highly conserved in insects. By investigating with genetics the consequences of removing DNT1 or adding it in excess, we show that DNT1 maintains neuronal survival, as more neurons die in DNT1 mutants and expression of DNT1 rescues naturally occurring cell death, and it enables targeting by motor neurons. We show that Spätzle and a further fly neurotrophin superfamily member, DNT2, also have neurotrophic functions in flies. Our findings imply that most likely a neurotrophin was present in the common ancestor of all bilateral organisms, giving rise to invertebrate and vertebrate neurotrophins through gene or whole-genome duplications. This work provides a missing link between aspects of neuronal function in flies and vertebrates, and it opens the opportunity to use Drosophila to investigate further aspects of neurotrophin function and to model related diseases.
Author Summary
Neurotrophins are secreted proteins that link nervous system structure and function in vertebrates. They regulate neuronal survival, thus adjusting cell populations, and connectivity, enabling the formation of neuronal circuits. They also regulate patterns of dendrites and axons, synaptic function, memory, learning, and cognition; and abnormal neurotrophin function underlies psychiatric disorders. Despite such relevance for nervous system structure and function, neurotrophins have been missing from invertebrates. We show here the identification and functional demonstration of a neurotrophin family in the fruit fly, Drosophila. Our findings imply that the neurotrophins may be present in all animals with a centralised nervous system (motor and sensory systems) or brain, supporting the notion of a common origin for the brain in evolution. This work bridges a void in the understanding of the Drosophila and human nervous systems, and it opens the opportunity to use the powerful fruit fly for neurotrophin related studies.
Members of the neurotrophin superfamily mediate critical roles in neuronal survival and targeting in the fruit fly Drosophila. Although this is an accepted role for neurotrophins in vertebrates, scant previous evidence has been able to demonstrate such a conserved role in invertebrates.
doi:10.1371/journal.pbio.0060284
PMCID: PMC2586362  PMID: 19018662
8.  FlyBase: enhancing Drosophila Gene Ontology annotations 
Nucleic Acids Research  2008;37(Database issue):D555-D559.
FlyBase (http://flybase.org) is a database of Drosophila genetic and genomic information. Gene Ontology (GO) terms are used to describe three attributes of wild-type gene products: their molecular function, the biological processes in which they play a role, and their subcellular location. This article describes recent changes to the FlyBase GO annotation strategy that are improving the quality of the GO annotation data. Many of these changes stem from our participation in the GO Reference Genome Annotation Project—a multi-database collaboration producing comprehensive GO annotation sets for 12 diverse species.
doi:10.1093/nar/gkn788
PMCID: PMC2686450  PMID: 18948289
9.  Natural Language Processing in aid of FlyBase curators 
BMC Bioinformatics  2008;9:193.
Background
Despite increasing interest in applying Natural Language Processing (NLP) to biomedical text, whether this technology can facilitate tasks such as database curation remains unclear.
Results
PaperBrowser is the first NLP-powered interface that was developed under a user-centered approach to improve the way in which FlyBase curators navigate an article. In this paper, we first discuss how observing curators at work informed the design and evaluation of PaperBrowser. Then, we present how we appraise PaperBrowser's navigational functionalities in a user-based study using a text highlighting task and evaluation criteria of Human-Computer Interaction. Our results show that PaperBrowser reduces the number of interactions between two highlighting events and therefore improves navigational efficiency by about 58% compared to the navigational mechanism that was previously available to the curators. Moreover, PaperBrowser is shown to improve navigational utility for curators by over 74%, irrespective of the different ways in which they highlight text in the article.
Conclusion
We show that state-of-the-art performance in certain NLP tasks such as Named Entity Recognition and Anaphora Resolution can be combined with the navigational functionalities of PaperBrowser to support curation quite successfully.
doi:10.1186/1471-2105-9-193
PMCID: PMC2375127  PMID: 18410678
