PMCC

Results 1-8 (8)
 

1.  WormBase 2014: new views of curated biology 
Nucleic Acids Research  2013;42(D1):D789-D793.
WormBase (http://www.wormbase.org/) is a highly curated resource dedicated to supporting research using the model organism Caenorhabditis elegans. With an electronic history predating the World Wide Web, WormBase contains information ranging from the sequence and phenotype of individual alleles to genome-wide studies generated using next-generation sequencing technologies. In recent years, we have expanded the contents to include data on additional nematodes of agricultural and medical significance, bringing the knowledge of C. elegans to bear on these systems and providing support for underserved research communities. Manual curation of the primary literature remains a central focus of the WormBase project, providing users with reliable, up-to-date and highly cross-linked information. In this update, we describe efforts to organize the original atomized and highly contextualized curated data into integrated syntheses of discrete biological topics. Next, we discuss our experiences coping with the vast increase in available genome sequences made possible through next-generation sequencing platforms. Finally, we describe some of the features and tools of the new WormBase Web site that help users better find and explore data of interest.
doi:10.1093/nar/gkt1063
PMCID: PMC3965043  PMID: 24194605
2.  Automatic categorization of diverse experimental information in the bioscience literature 
BMC Bioinformatics  2012;13:16.
Background
Curation of information from bioscience literature into biological knowledge databases is a crucial way of capturing experimental information in a computable form. During the biocuration process, a critical first step is to identify, from all published literature, the papers that contain results for a specific data type the curator is interested in annotating. This step normally requires curators to manually examine many papers to ascertain which few contain information of interest, and is thus usually time-consuming. We developed an automatic method for identifying papers containing these curation data types among a large pool of published scientific papers, based on the machine learning method Support Vector Machine (SVM). This classification system is completely automatic and can be readily applied to diverse experimental data types. It has been in production use for automatic categorization of 10 different experimental data types in the biocuration process at WormBase for the past two years, and it is in the process of being adopted in the biocuration process at FlyBase and the Saccharomyces Genome Database (SGD). We anticipate that this method can be readily adopted by various databases in the biocuration community and thereby greatly reduce the time spent on an otherwise laborious and demanding task. We also developed a simple, readily automated procedure to utilize training papers of similar data types from different bodies of literature, such as C. elegans and D. melanogaster, to identify papers with any of these data types for a single database. This approach has great significance because for some data types, especially those of low occurrence, a single corpus often does not have enough training papers to achieve satisfactory performance.
Results
We successfully tested the method on ten data types from WormBase, fifteen data types from FlyBase and three data types from Mouse Genome Informatics (MGI). It is being used in the curation workflow at WormBase for automatic association of newly published papers with ten data types including RNAi, antibody, phenotype, gene regulation, mutant allele sequence, gene expression, gene product interaction, overexpression phenotype, gene interaction, and gene structure correction.
Conclusions
Our methods are applicable to a variety of data types with training sets containing several hundred to a few thousand documents. They are completely automatic and thus can be readily incorporated into different workflows at different literature-based databases. We believe that the work presented here can contribute greatly to the tremendous task of automating the important yet labor-intensive biocuration effort.
doi:10.1186/1471-2105-13-16
PMCID: PMC3305665  PMID: 22280404
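The abstract above names a Support Vector Machine as the classifier but does not give features or training details. The following is only a minimal sketch of the general idea, under stated assumptions: bag-of-words features, a linear SVM trained with a Pegasos-style subgradient method, and four invented toy sentences standing in for papers. None of these choices is taken from the paper, and all function names are hypothetical.

```python
from collections import Counter


def featurize(text):
    """Turn text into an L2-normalized bag-of-words vector (a dict)."""
    words = [w for w in text.lower().split() if w.isalpha()]
    counts = Counter(words)
    norm = sum(c * c for c in counts.values()) ** 0.5 or 1.0
    return {w: c / norm for w, c in counts.items()}


def score(weights, features):
    """Dot product between the weight vector and a sparse feature vector."""
    return sum(weights.get(f, 0.0) * v for f, v in features.items())


def train_linear_svm(examples, epochs=20, lam=0.01):
    """Pegasos-style subgradient descent on the regularized hinge loss.

    examples: list of (feature dict, label) with label in {+1, -1}.
    This is an assumed training procedure, not the one used in the paper.
    """
    weights = {}
    step = 0
    for _ in range(epochs):
        for features, label in examples:
            step += 1
            eta = 1.0 / (lam * step)          # decaying learning rate
            margin = label * score(weights, features)
            shrink = 1.0 - eta * lam          # effect of the L2 regularizer
            for f in weights:
                weights[f] *= shrink
            if margin < 1.0:                  # hinge loss is active
                for f, v in features.items():
                    weights[f] = weights.get(f, 0.0) + eta * label * v
    return weights


def classify(weights, text):
    """Return +1 if the text looks like the target data type, else -1."""
    return 1 if score(weights, featurize(text)) > 0 else -1


# Invented toy corpus: +1 = contains RNAi data, -1 = does not.
TRAINING = [
    ("rnai knockdown caused embryonic lethality", 1),
    ("feeding rnai screen identified essential genes", 1),
    ("antibody staining revealed protein localization", -1),
    ("crystal structure shows enzyme active site", -1),
]
WEIGHTS = train_linear_svm([(featurize(t), y) for t, y in TRAINING])
```

With these toy weights, `classify(WEIGHTS, "rnai knockdown screen")` returns `1` and `classify(WEIGHTS, "antibody staining of protein")` returns `-1`. A production system would train one such classifier per data type on hundreds to thousands of labeled papers, as the abstract describes.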
3.  WormBase 2012: more genomes, more data, new website 
Nucleic Acids Research  2011;40(D1):D735-D741.
Since its release in 2000, WormBase (http://www.wormbase.org) has grown from a small resource focusing on a single species and serving a dedicated research community, to one now spanning 15 species essential to the broader biomedical and agricultural research fields. To enhance the rate of curation, we have automated the identification of key data in the scientific literature and use similar methodology for data extraction. To ease access to the data, we are collaborating with journals to link entities in research publications to their report pages at WormBase. To facilitate discovery, we have added new views of the data, integrated large-scale datasets and expanded descriptions of models for human disease. Finally, we have introduced a dramatic overhaul of the WormBase website for public beta testing. Designed to balance complexity and usability, the new site is species-agnostic, highly customizable, and interactive. Casual users and developers alike will be able to leverage the public RESTful application programming interface (API) to generate custom data mining solutions and extensions to the site. We report on the growth of our database and on our work in keeping pace with the growing demand for data, efforts to anticipate the requirements of users and new collaborations with the larger science community.
doi:10.1093/nar/gkr954
PMCID: PMC3245152  PMID: 22067452
4.  Worm Phenotype Ontology: Integrating phenotype data within and beyond the C. elegans community 
BMC Bioinformatics  2011;12:32.
Background
Caenorhabditis elegans gene-based phenotype information dates back to the 1970s, beginning with Sydney Brenner and the characterization of behavioral and morphological mutant alleles via classical genetics in order to understand nervous system function. Since then, C. elegans has become an important genetic model system for the study of basic biological and biomedical principles, largely through the use of phenotype analysis. Because of the growth of C. elegans as a genetically tractable model organism and the development of large-scale analyses, there has been a significant increase in phenotype data that need to be managed and made accessible to the research community. To do so, a standardized vocabulary is necessary to integrate phenotype data from diverse sources, permit integration with other data types and render the data in a computable form.
Results
We describe a hierarchically structured, controlled vocabulary of terms that can be used to standardize phenotype descriptions in C. elegans, namely the Worm Phenotype Ontology (WPO). The WPO currently comprises 1,880 phenotype terms, 74% of which have been used in the annotation of phenotypes associated with more than 18,000 C. elegans genes. The scope of the WPO is not limited to C. elegans biology; rather, it is devised to also incorporate phenotypes observed in related nematode species. We have enriched the value of the WPO by integrating it with other ontologies, thereby increasing the accessibility of worm phenotypes to non-nematode biologists. We are actively developing the WPO to continue to fulfill the evolving needs of the scientific community and hope to engage researchers in this crucial endeavor.
Conclusions
We provide a phenotype ontology (WPO) that will help to facilitate data retrieval and cross-species comparisons within the nematode community. In the larger scientific community, the WPO will permit data integration and interoperability across the different Model Organism Databases (MODs) and other biological databases. This standardized phenotype ontology will therefore allow for more complex data queries and enhance bioinformatic analyses.
doi:10.1186/1471-2105-12-32
PMCID: PMC3039574  PMID: 21261995
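To illustrate why a hierarchically structured vocabulary enables "more complex data queries", here is a minimal sketch of a query that returns genes annotated to a term or to any of its more specific descendants. The term identifiers, the is_a links and the gene names below are all hypothetical placeholders, not actual WPO terms or WormBase annotations.

```python
# Hypothetical mini-ontology: each term maps to its direct is_a parents.
IS_A = {
    "T:root": [],
    "T:behavior": ["T:root"],
    "T:locomotion": ["T:behavior"],
    "T:morphology": ["T:root"],
}

# Hypothetical annotations: gene -> set of directly annotated terms.
ANNOTATIONS = {
    "gene-a": {"T:locomotion"},
    "gene-b": {"T:morphology"},
    "gene-c": {"T:behavior"},
}


def ancestors(term):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = list(IS_A[term])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(IS_A[parent])
    return seen


def genes_for(term):
    """Genes annotated to `term` itself or to any descendant of it."""
    return sorted(
        gene
        for gene, terms in ANNOTATIONS.items()
        if any(t == term or term in ancestors(t) for t in terms)
    )
```

Here `genes_for("T:behavior")` returns `["gene-a", "gene-c"]`: the gene annotated to the more specific locomotion term is found by a query on the broader behavior term, which a flat list of free-text phenotype descriptions could not support.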
5.  WormBase: a comprehensive resource for nematode research 
Nucleic Acids Research  2009;38(Database issue):D463-D467.
WormBase (http://www.wormbase.org) is a central data repository for nematode biology. Initially created as a service to the Caenorhabditis elegans research field, WormBase has evolved into a powerful research tool in its own right. In the past 2 years, we expanded WormBase to include the complete genomic sequence, gene predictions and orthology assignments from a range of related nematodes. These comparative data enrich the C. elegans data with improved gene predictions and a better understanding of gene function. In turn, they bring the wealth of experimental knowledge of C. elegans to other systems of medical and agricultural importance. Here, we describe new species and data types now available at WormBase. In addition, we detail enhancements to our curatorial pipeline and website infrastructure to accommodate new genomes and an extensive user base.
doi:10.1093/nar/gkp952
PMCID: PMC2808986  PMID: 19910365
6.  WormBase 2007 
Nucleic Acids Research  2007;36(Database issue):D612-D617.
WormBase (www.wormbase.org) is the major publicly available database of information about Caenorhabditis elegans, an important system for basic biological and biomedical research. Derived from the initial ACeDB database of C. elegans genetic and sequence information, WormBase now includes genomic, anatomical and functional information about C. elegans, other Caenorhabditis species and other nematodes. As such, it is a crucial resource not only for C. elegans biologists but also for the larger biomedical and bioinformatics communities. Coverage of core areas of C. elegans biology will allow the biomedical community to make full use of the results of intensive molecular genetic analysis and functional genomic studies of this organism. Improved search and display tools, wider cross-species comparisons and extended ontologies are some of the features that will help scientists extend their research and take advantage of other nematode species genome sequences.
doi:10.1093/nar/gkm975
PMCID: PMC2238927  PMID: 17991679
7.  WormBase: new content and better access 
Nucleic Acids Research  2006;35(Database issue):D506-D510.
WormBase, a model organism database for Caenorhabditis elegans and other related nematodes, continues to evolve and expand. Over the past year WormBase has added new data on C. elegans, including data on classical genetics, cell biology and functional genomics; expanded the annotation of closely related nematodes with a new genome browser for Caenorhabditis remanei; and deployed new hardware for stronger performance. Several existing datasets, including phenotype descriptions and RNAi experiments, have seen a large increase in new content. New datasets such as the C. remanei draft assembly and annotations, the Vancouver Fosmid library and TEC-RED 5′ end sites are now available as well. Accessing and searching WormBase have become more dependable and flexible via multiple mirror sites and indexing through Google.
doi:10.1093/nar/gkl818
PMCID: PMC1669750  PMID: 17099234
8.  Initiation of male sperm-transfer behavior in Caenorhabditis elegans requires input from the ventral nerve cord 
BMC Biology  2006;4:26.
Background
The Caenorhabditis elegans male exhibits a stereotypic behavioral pattern when attempting to mate. This behavior has been divided into the following steps: response, backing, turning, vulva location, spicule insertion, and sperm transfer. We and others have begun in-depth analyses of all these steps in order to understand how complex behaviors are generated. Here we extend our understanding of the sperm-transfer step of male mating behavior.
Results
Based on observation of wild-type males and on genetic analysis, we have divided the sperm-transfer step of mating behavior into four sub-steps: initiation, release, continued transfer, and cessation. To begin to understand how these sub-steps of sperm transfer are regulated, we screened for ethylmethanesulfonate (EMS)-induced mutations that cause males to transfer sperm aberrantly. We isolated an allele of unc-18, a previously reported member of the Sec1/Munc-18 (SM) family of proteins that is necessary for regulated exocytosis in C. elegans motor neurons. Our allele, sy671, is defective in two distinct sub-steps of sperm transfer: initiation and continued transfer. By a series of transgenic site-of-action experiments, we found that motor neurons in the ventral nerve cord require UNC-18 for the initiation of sperm transfer, and that UNC-18 acts downstream or in parallel to the SPV sensory neurons in this process. In addition to this neuronal requirement, we found that non-neuronal expression of UNC-18, in the male gonad, is necessary for the continuation of sperm transfer.
Conclusion
Our division of sperm-transfer behavior into sub-steps has provided a framework for the further detailed analysis of sperm transfer and its integration with other aspects of mating behavior. By determining the site of action of UNC-18 in sperm-transfer behavior, and its relation to the SPV sensory neurons, we have further defined the cells and tissues involved in the generation of this behavior. We have shown both a neuronal and non-neuronal requirement for UNC-18 in distinct sub-steps of sperm-transfer behavior. The definition of circuit components is a crucial first step toward understanding how genes specify the neural circuit and hence the behavior.
doi:10.1186/1741-7007-4-26
PMCID: PMC1564418  PMID: 16911797