1.  Finding Our Way through Phenotypes 
Deans, Andrew R. | Lewis, Suzanna E. | Huala, Eva | Anzaldo, Salvatore S. | Ashburner, Michael | Balhoff, James P. | Blackburn, David C. | Blake, Judith A. | Burleigh, J. Gordon | Chanet, Bruno | Cooper, Laurel D. | Courtot, Mélanie | Csösz, Sándor | Cui, Hong | Dahdul, Wasila | Das, Sandip | Dececchi, T. Alexander | Dettai, Agnes | Diogo, Rui | Druzinsky, Robert E. | Dumontier, Michel | Franz, Nico M. | Friedrich, Frank | Gkoutos, George V. | Haendel, Melissa | Harmon, Luke J. | Hayamizu, Terry F. | He, Yongqun | Hines, Heather M. | Ibrahim, Nizar | Jackson, Laura M. | Jaiswal, Pankaj | James-Zorn, Christina | Köhler, Sebastian | Lecointre, Guillaume | Lapp, Hilmar | Lawrence, Carolyn J. | Le Novère, Nicolas | Lundberg, John G. | Macklin, James | Mast, Austin R. | Midford, Peter E. | Mikó, István | Mungall, Christopher J. | Oellrich, Anika | Osumi-Sutherland, David | Parkinson, Helen | Ramírez, Martín J. | Richter, Stefan | Robinson, Peter N. | Ruttenberg, Alan | Schulz, Katja S. | Segerdell, Erik | Seltmann, Katja C. | Sharkey, Michael J. | Smith, Aaron D. | Smith, Barry | Specht, Chelsea D. | Squires, R. Burke | Thacker, Robert W. | Thessen, Anne | Fernandez-Triana, Jose | Vihinen, Mauno | Vize, Peter D. | Vogt, Lars | Wall, Christine E. | Walls, Ramona L. | Westerfeld, Monte | Wharton, Robert A. | Wirkner, Christian S. | Woolley, James B. | Yoder, Matthew J. | Zorn, Aaron M. | Mabee, Paula
PLoS Biology  2015;13(1):e1002033.
Imagine if we could compute across phenotype data as easily as genomic data; this article calls for efforts to realize this vision and discusses the potential benefits.
Despite a large and multifaceted effort to understand the vast landscape of phenotypic data, their current form inhibits productive data analysis. The lack of a community-wide, consensus-based, human- and machine-interpretable language for describing phenotypes and their genomic and environmental contexts is perhaps the most pressing scientific bottleneck to integration across many key fields in biology, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. Here we survey the current phenomics landscape, including data resources and handling, and the progress that has been made to accurately capture relevant data descriptions for phenotypes. We present an example of the kind of integration across domains that computable phenotypes would enable, and we call upon the broader biology community, publishers, and relevant funding agencies to support efforts to surmount today's data barriers and facilitate analytical reproducibility.
doi:10.1371/journal.pbio.1002033
PMCID: PMC4285398  PMID: 25562316
2.  Gateways to the FANTOM5 promoter level mammalian expression atlas 
Genome Biology  2015;16(1):22.
The FANTOM5 project investigates transcription initiation activities in more than 1,000 human and mouse primary cells, cell lines and tissues using CAGE. Based on manual curation of sample information and development of an ontology for sample classification, we assemble the resulting data into a centralized data resource (http://fantom.gsc.riken.jp/5/). This resource contains web-based tools and data-access points for the research community to search and extract data related to samples, genes, promoter activities, transcription factors and enhancers across the FANTOM5 atlas.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-014-0560-6) contains supplementary material, which is available to authorized users.
doi:10.1186/s13059-014-0560-6
PMCID: PMC4310165
3.  The Porifera Ontology (PORO): enhancing sponge systematics with an anatomy ontology 
Background
Porifera (sponges) are ancient basal metazoans that lack organs. They provide insight into key evolutionary transitions, such as the emergence of multicellularity and the nervous system. In addition, their ability to synthesize unusual compounds offers potential biotechnical applications. However, much of the knowledge of these organisms has not previously been codified in a machine-readable way using modern web standards.
Results
The Porifera Ontology is intended as a standardized coding system for sponge anatomical features currently used in systematics. The ontology is available from http://purl.obolibrary.org/obo/poro.owl, or from the project homepage http://porifera-ontology.googlecode.com/. The version referred to in this manuscript is permanently available from http://purl.obolibrary.org/obo/poro/releases/2014-03-06/.
Conclusions
By standardizing character representations, we hope to facilitate more rapid description and identification of sponge taxa, to allow integration with other evolutionary database systems, and to perform character mapping across the major clades of sponges to better understand the evolution of morphological features. Future applications of the ontology will focus on creating (1) ontology-based species descriptions; (2) taxonomic keys that use the nested terms of the ontology to more quickly facilitate species identifications; and (3) methods to map anatomical characters onto molecular phylogenies of sponges. In addition to modern taxa, the ontology is being extended to include features of fossil taxa.
doi:10.1186/2041-1480-5-39
PMCID: PMC4177528  PMID: 25276334
Morphology; Taxonomic identification; Phylogenetics; Evolution
4.  Deletions of chromosomal regulatory boundaries are associated with congenital disease 
Genome Biology  2014;15(9):423.
Background
Recent data from genome-wide chromosome conformation capture analysis indicate that the human genome is divided into conserved megabase-sized self-interacting regions called topological domains. These topological domains form the regulatory backbone of the genome and are separated by regulatory boundary elements or barriers. Copy-number variations can potentially alter the topological domain architecture by deleting or duplicating the barriers and thereby allowing enhancers from neighboring domains to ectopically activate genes causing misexpression and disease, a mutational mechanism that has recently been termed enhancer adoption.
Results
We use the Human Phenotype Ontology database to relate the phenotypes of 922 deletion cases recorded in the DECIPHER database to monogenic diseases associated with genes in or adjacent to the deletions. We identify combinations of tissue-specific enhancers and genes adjacent to the deletion and associated with phenotypes in the corresponding tissue, whereby the phenotype matched that observed in the deletion. We compare this computationally with a gene-dosage pathomechanism that attempts to explain the deletion phenotype based on haploinsufficiency of genes located within the deletions. Up to 11.8% of the deletions could be best explained by enhancer adoption or a combination of enhancer adoption and gene-dosage effects.
Conclusions
Our results suggest that enhancer adoption caused by deletions of regulatory boundaries may contribute to a substantial minority of copy-number variation phenotypes and should thus be taken into account in their medical interpretation.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-014-0423-1) contains supplementary material, which is available to authorized users.
doi:10.1186/s13059-014-0423-1
PMCID: PMC4180961  PMID: 25315429
5.  Nose to tail, roots to shoots: spatial descriptors for phenotypic diversity in the Biological Spatial Ontology 
Background
Spatial terminology is used in anatomy to indicate precise, relative positions of structures in an organism. While these terms are often standardized within specific fields of biology, they can differ dramatically across taxa. Such differences in usage can impair our ability to unambiguously refer to anatomical position when comparing anatomy or phenotypes across species. We developed the Biological Spatial Ontology (BSPO) to standardize the description of spatial and topological relationships across taxa to enable the discovery of comparable phenotypes.
Results
BSPO currently contains 146 classes and 58 relations representing anatomical axes, gradients, regions, planes, sides, and surfaces. These concepts can be used at multiple biological scales and in a diversity of taxa, including plants, animals and fungi. The BSPO is used to provide a source of anatomical location descriptors for logically defining anatomical entity classes in anatomy ontologies. Spatial reasoning is further enhanced in anatomy ontologies by integrating spatial relations such as dorsal_to into class descriptions (e.g., ‘dorsolateral placode’ dorsal_to some ‘epibranchial placode’).
Conclusions
The BSPO is currently used by projects that require standardized anatomical descriptors for phenotype annotation and ontology integration across a diversity of taxa. Anatomical location classes are also useful for describing phenotypic differences, such as morphological variation in position of structures resulting from evolution within and across species.
doi:10.1186/2041-1480-5-34
PMCID: PMC4137724  PMID: 25140222
Anatomy; Spatial relationships; Position; Axes; Reasoning; BSPO; Ontology; Phenotype
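The class description quoted in entry 5 has the shape of an existential restriction ("relation some filler"). The following is a minimal Python sketch of that shape, not BSPO's actual tooling: the `SomeValuesFrom` class and the `render` function are hypothetical names introduced here purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SomeValuesFrom:
    """Existential restriction 'relation some filler' (hypothetical helper)."""
    relation: str
    filler: str


def render(subject: str, restriction: SomeValuesFrom) -> str:
    """Render a class description in the 'X rel some Y' style quoted in the abstract."""
    return f"'{subject}' {restriction.relation} some '{restriction.filler}'"


# The example pair of placodes and the dorsal_to relation come from the abstract.
axiom = SomeValuesFrom(relation="dorsal_to", filler="epibranchial placode")
description = render("dorsolateral placode", axiom)
print(description)
```

Keeping the relation and filler as separate fields, rather than one string, is what would let a program query for every class that is `dorsal_to` some other structure.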
6.  Standardized description of scientific evidence using the Evidence Ontology (ECO) 
The Evidence Ontology (ECO) is a structured, controlled vocabulary for capturing evidence in biological research. ECO includes diverse terms for categorizing evidence that supports annotation assertions including experimental types, computational methods, author statements and curator inferences. Using ECO, annotation assertions can be distinguished according to the evidence they are based on such as those made by curators versus those automatically computed or those made via high-throughput data review versus single test experiments. Originally created for capturing evidence associated with Gene Ontology annotations, ECO is now used in other capacities by many additional annotation resources including UniProt, Mouse Genome Informatics, Saccharomyces Genome Database, PomBase, the Protein Information Resource and others. Information on the development and use of ECO can be found at http://evidenceontology.org. The ontology is freely available under Creative Commons license (CC BY-SA 3.0), and can be downloaded in both Open Biological Ontologies and Web Ontology Language formats at http://code.google.com/p/evidenceontology. Also at this site is a tracker for user submission of term requests and questions. ECO remains under active development in response to user-requested terms and in collaborations with other ontologies and database resources.
Database URL: Evidence Ontology Web site: http://evidenceontology.org
doi:10.1093/database/bau075
PMCID: PMC4105709  PMID: 25052702
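The distinction drawn in entry 6, between assertions made by curators and those computed automatically, can be pictured as grouping annotations by their evidence label. This is a minimal Python sketch under stated assumptions: the `Annotation` record, the `group_by_evidence` function, the gene names and the evidence strings are all hypothetical placeholders, not ECO identifiers or any resource's real schema.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Annotation(NamedTuple):
    subject: str
    assertion: str
    evidence: str  # placeholder label standing in for an ECO term


def group_by_evidence(annotations: List[Annotation]) -> Dict[str, List[str]]:
    """Map each evidence label to the subjects annotated with it."""
    groups = defaultdict(list)
    for ann in annotations:
        groups[ann.evidence].append(ann.subject)
    return dict(groups)


# Toy records; none of these are real annotations.
records = [
    Annotation("geneA", "example assertion 1", "curator inference"),
    Annotation("geneB", "example assertion 2", "computational method"),
    Annotation("geneC", "example assertion 3", "curator inference"),
]
by_evidence = group_by_evidence(records)
print(by_evidence)
```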
7.  Representing Kidney Development Using the Gene Ontology 
PLoS ONE  2014;9(6):e99864.
Gene Ontology (GO) provides dynamic controlled vocabularies to aid in the description of the functional biological attributes and subcellular locations of gene products from all taxonomic groups (www.geneontology.org). Here we describe collaboration between the renal biomedical research community and the GO Consortium to improve the quality and quantity of GO terms describing renal development. In the associated annotation activity, the new and revised terms were associated with gene products involved in renal development and function. This project resulted in a total of 522 GO terms being added to the ontology and the creation of approximately 9,600 kidney-related GO term associations to 940 UniProt Knowledgebase (UniProtKB) entries, covering 66 taxonomic groups. We demonstrate the impact of these improvements on the interpretation of GO term analyses performed on genes differentially expressed in kidney glomeruli affected by diabetic nephropathy. In summary, we have produced a resource that can be utilized in the interpretation of data from small- and large-scale experiments investigating molecular mechanisms of kidney function and development and thereby help towards alleviating renal disease.
doi:10.1371/journal.pone.0099864
PMCID: PMC4062467  PMID: 24941002
8.  A method for increasing expressivity of Gene Ontology annotations using a compositional approach 
BMC Bioinformatics  2014;15:155.
Background
The Gene Ontology project integrates data about the function of gene products across a diverse range of organisms, allowing the transfer of knowledge from model organisms to humans, and enabling computational analyses for interpretation of high-throughput experimental and clinical data. The core data structure is the annotation, an association between a gene product and a term from one of the three ontologies comprising the GO. Historically, it has not been possible to provide additional information about the context of a GO term, such as the target gene or the location of a molecular function. This has limited the specificity of knowledge that can be expressed by GO annotations.
Results
The GO Consortium has introduced annotation extensions that enable manually curated GO annotations to capture additional contextual details. Extensions represent effector–target relationships such as localization dependencies, substrates of protein modifiers and regulation targets of signaling pathways and transcription factors as well as spatial and temporal aspects of processes such as cell or tissue type or developmental stage. We describe the content and structure of annotation extensions, provide examples, and summarize the current usage of annotation extensions.
Conclusions
The additional contextual information captured by annotation extensions improves the utility of functional annotation by representing dependencies between annotations to terms in the different ontologies of GO, external ontologies, or an organism’s gene products. These enhanced annotations can also support sophisticated queries and reasoning, and will provide curated, directional links between many gene products to support pathway and network reconstruction.
doi:10.1186/1471-2105-15-155
PMCID: PMC4039540  PMID: 24885854
Gene Ontology; Functional annotation; Annotation extension; Manual curation
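Entry 8 describes an annotation as an association between a gene product and a GO term, optionally extended with contextual details such as a substrate or a cell type. A minimal Python sketch of such a record might look as follows. The class names, the relation names (`has_input`, `occurs_in`) and all values are assumptions chosen for illustration; they are not taken from the abstract and do not reproduce the GO Consortium's file formats.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ExtendedAnnotation:
    """Gene product + GO term + optional (relation, target) extensions (hypothetical)."""
    gene_product: str
    go_term: str
    extensions: List[Tuple[str, str]] = field(default_factory=list)


def extension_targets(annotation: ExtendedAnnotation, relation: str) -> List[str]:
    """Return the targets attached to the annotation through the given relation."""
    return [target for rel, target in annotation.extensions if rel == relation]


# Placeholder values only; no real gene products or GO identifiers are used.
ann = ExtendedAnnotation(
    gene_product="ExampleKinase",
    go_term="protein kinase activity",
    extensions=[("has_input", "ExampleSubstrate"), ("occurs_in", "example cell type")],
)
substrates = extension_targets(ann, "has_input")
print(substrates)
```

Without the `extensions` field the record can only say that ExampleKinase has kinase activity; with it, the same record also states on what and where.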
9.  Unification of multi-species vertebrate anatomy ontologies for comparative biology in Uberon 
Background
Elucidating disease and developmental dysfunction requires understanding variation in phenotype. Single-species model organism anatomy ontologies (ssAOs) have been established to represent this variation. Multi-species anatomy ontologies (msAOs; vertebrate skeletal, vertebrate homologous, teleost, amphibian AOs) have been developed to represent ‘natural’ phenotypic variation across species. Our aim has been to integrate ssAOs and msAOs for various purposes, including establishing links between phenotypic variation and candidate genes.
Results
Previously, msAOs contained a mixture of unique and overlapping content. This hampered integration and coordination due to the need to maintain cross-references or inter-ontology equivalence axioms to the ssAOs, or to perform large-scale obsolescence and modular import. Here we present the unification of anatomy ontologies into Uberon, a single ontology resource that enables interoperability among disparate data and research groups. As a consequence, independent development of TAO, VSAO, AAO, and vHOG has been discontinued.
Conclusions
The newly broadened Uberon ontology is a unified cross-taxon resource for metazoans (animals) that has been substantially expanded to include a broad diversity of vertebrate anatomical structures, permitting reasoning across anatomical variation in extinct and extant taxa. Uberon is a core resource that supports single- and cross-species queries for candidate genes using annotations for phenotypes from the systematics, biodiversity, medical, and model organism communities, while also providing entities for logical definitions in the Cell and Gene Ontologies.
The ontology release files associated with the ontology merge described in this manuscript are available at: http://purl.obolibrary.org/obo/uberon/releases/2013-02-21/
Current ontology release files are always available at: http://purl.obolibrary.org/obo/uberon/releases/
doi:10.1186/2041-1480-5-21
PMCID: PMC4089931  PMID: 25009735
Evolutionary biology; Morphological variation; Phenotype; Semantic integration; Bio-ontology
10.  Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research 
F1000Research  2014;2:30.
Phenotype analyses, e.g. investigating metabolic processes, tissue formation, or organism behavior, are an important element of most biological and medical research activities. Biomedical researchers are making increased use of ontological standards and methods to capture the results of such analyses, with one focus being the comparison and analysis of phenotype information between species.
We have generated a cross-species phenotype ontology for human, mouse and zebrafish that contains classes from the Human Phenotype Ontology, Mammalian Phenotype Ontology, and generated classes for zebrafish phenotypes. We also provide up-to-date annotation data connecting human genes to phenotype classes from the generated ontology. We have included the data generation pipeline into our continuous integration system ensuring stable and up-to-date releases.
This article describes the data generation process and is intended to help interested researchers access both the phenotype annotation data and the associated cross-species phenotype ontology. The resource described here can be used in sophisticated semantic similarity and gene set enrichment analyses for phenotype data across species. The stable releases of this resource can be obtained from http://purl.obolibrary.org/obo/hp/uberpheno/.
doi:10.12688/f1000research.2-30.v2
PMCID: PMC3799545  PMID: 24358873
11.  The environment ontology: contextualising biological and biomedical entities 
As biological and biomedical research increasingly references the environmental context of the biological entities under study, the need for formalisation and standardisation of environment descriptors is growing. The Environment Ontology (ENVO; http://www.environmentontology.org) is a community-led, open project which seeks to provide an ontology for specifying a wide range of environments relevant to multiple life science disciplines and, through an open participation model, to accommodate the terminological requirements of all those needing to annotate data using ontology classes. This paper summarises ENVO’s motivation, content, structure, adoption, and governance approach. The ontology is available from http://purl.obolibrary.org/obo/envo.owl - an OBO format version is also available by switching the file suffix to “obo”.
doi:10.1186/2041-1480-4-43
PMCID: PMC3904460  PMID: 24330602
Environment; Ecosystem; Biome; Ontology
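The suffix switch mentioned in entry 11 can be written as a one-line string operation. The sketch below is a minimal Python illustration; the function name `obo_url` is hypothetical, and the code only builds the URL string without contacting any server.

```python
def obo_url(owl_url: str) -> str:
    """Swap a trailing '.owl' suffix for '.obo' (hypothetical helper)."""
    suffix = ".owl"
    if not owl_url.endswith(suffix):
        raise ValueError(f"expected a URL ending in {suffix!r}: {owl_url}")
    return owl_url[: -len(suffix)] + ".obo"


# The OWL URL is the one given in the abstract.
envo_obo = obo_url("http://purl.obolibrary.org/obo/envo.owl")
print(envo_obo)
```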
12.  The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data 
Nucleic Acids Research  2013;42(Database issue):D966-D974.
The Human Phenotype Ontology (HPO) project, available at http://www.human-phenotype-ontology.org, provides a structured, comprehensive and well-defined set of 10,088 classes (terms) describing human phenotypic abnormalities and 13,326 subclass relations between the HPO classes. In addition we have developed logical definitions for 46% of all HPO classes using terms from ontologies for anatomy, cell types, function, embryology, pathology and other domains. This allows interoperability with several resources, especially those containing phenotype information on model organisms such as mouse and zebrafish. Here we describe the updated HPO database, which provides annotations of 7,278 human hereditary syndromes listed in OMIM, Orphanet and DECIPHER to classes of the HPO. Various meta-attributes such as frequency, references and negations are associated with each annotation. Several large-scale projects worldwide utilize the HPO for describing phenotype information in their datasets. We have therefore generated equivalence mappings to other phenotype vocabularies such as LDDB, Orphanet, MedDRA, UMLS and phenoDB, allowing integration of existing datasets and interoperability with multiple biomedical resources. We have created various ways to access the HPO database content using flat files, a MySQL database, and Web-based tools. All data and documentation on the HPO project can be found online.
doi:10.1093/nar/gkt1026
PMCID: PMC3965098  PMID: 24217912
13.  The Gene Ontology (GO) Cellular Component Ontology: integration with SAO (Subcellular Anatomy Ontology) and other recent developments 
Background
The Gene Ontology (GO) (http://www.geneontology.org/) contains a set of terms for describing the activity and actions of gene products across all kingdoms of life. Each of these activities is executed in a location within a cell or in the vicinity of a cell. In order to capture this context, the GO includes a sub-ontology called the Cellular Component (CC) ontology (GO-CCO). The primary use of this ontology is for GO annotation, but it has also been used for phenotype annotation, and for the annotation of images. Another ontology with similar scope to the GO-CCO is the Subcellular Anatomy Ontology (SAO), part of the Neuroscience Information Framework Standard (NIFSTD) suite of ontologies. The SAO also covers cell components, but in the domain of neuroscience.
Description
Recently, the GO-CCO was enriched in content and links to the Biological Process and Molecular Function branches of GO as well as to other ontologies. This was achieved in several ways. We carried out an amalgamation of SAO terms with GO-CCO ones; as a result, nearly 100 new neuroscience-related terms were added to the GO. The GO-CCO also contains relationships to GO Biological Process and Molecular Function terms, as well as connecting to external ontologies such as the Cell Ontology (CL). Terms representing protein complexes in the Protein Ontology (PRO) reference GO-CCO terms for their species-generic counterparts. GO-CCO terms can also be used to search a variety of databases.
Conclusions
In this publication we provide an overview of the GO-CCO, its overall design, and some recent extensions that make use of additional spatial information. One of the most recent developments of the GO-CCO was the merging in of the SAO, resulting in a single unified ontology designed to serve the needs of GO annotators as well as the specific needs of the neuroscience community.
doi:10.1186/2041-1480-4-20
PMCID: PMC3852282  PMID: 24093723
Gene ontology; Cellular component ontology; Subcellular anatomy ontology; Neuroscience; Annotation; Ontology language; Ontology integration; Neuroscience information framework
14.  Ontology based molecular signatures for immune cell types via gene expression analysis 
BMC Bioinformatics  2013;14:263.
Background
New technologies are focusing on characterizing cell types to better understand their heterogeneity. With large volumes of cellular data being generated, innovative methods are needed to structure the resulting data analyses. Here, we describe an ‘Ontologically BAsed Molecular Signature’ (OBAMS) method that identifies novel cellular biomarkers and infers biological functions as characteristics of particular cell types. This method finds molecular signatures for immune cell types based on mapping biological samples to the Cell Ontology (CL) and navigating the space of all possible pairwise comparisons between cell types to find genes whose expression is core to a particular cell type’s identity.
Results
We illustrate this ontological approach by evaluating expression data available from the Immunological Genome project (IGP) to identify unique biomarkers of mature B cell subtypes. We find that using OBAMS, candidate biomarkers can be identified at every stratum of cellular identity, from broad classifications to very granular ones. Furthermore, we show that Gene Ontology can be used to cluster cell types by shared biological processes in order to find candidate genes responsible for somatic hypermutation in germinal center B cells. Moreover, through in silico experiments based on this approach, we have identified gene sets that represent genes overexpressed in germinal center B cells and identified genes uniquely expressed in these B cells compared to other B cell types.
Conclusions
This work demonstrates the utility of incorporating structured ontological knowledge into biological data analysis – providing a new method for defining novel biomarkers and providing an opportunity for new biological insights.
doi:10.1186/1471-2105-14-263
PMCID: PMC3844401  PMID: 24004649
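The idea in entry 14 of comparing one cell type against all others, pair by pair, to find genes central to its identity can be illustrated with a toy filter. This is not the OBAMS algorithm, whose details the abstract does not give. It is a simplified Python sketch in which the function name, the fold-change threshold and the expression values are all assumptions.

```python
from typing import Dict, List

# cell type -> gene -> expression level (toy numbers, not real data)
Expression = Dict[str, Dict[str, float]]


def signature_genes(expr: Expression, target: str, min_fold: float = 2.0) -> List[str]:
    """Genes whose level in `target` is at least `min_fold` times the level
    in every other cell type, checked one pairwise comparison at a time."""
    others = [cell_type for cell_type in expr if cell_type != target]
    hits = []
    for gene, level in expr[target].items():
        if all(level >= min_fold * expr[other].get(gene, 0.0) for other in others):
            hits.append(gene)
    return sorted(hits)


toy = {
    "cell_type_A": {"gene1": 8.0, "gene2": 1.0, "gene3": 5.0},
    "cell_type_B": {"gene1": 2.0, "gene2": 1.5, "gene3": 4.0},
    "cell_type_C": {"gene1": 1.0, "gene2": 6.0, "gene3": 4.5},
}
signature_a = signature_genes(toy, "cell_type_A")
print(signature_a)
```

In an ontology-driven setting the set of "other" cell types would be chosen by walking the Cell Ontology hierarchy rather than taking every remaining column, which is the part this sketch leaves out.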
15.  Dovetailing biology and chemistry: integrating the Gene Ontology with the ChEBI chemical ontology 
BMC Genomics  2013;14:513.
Background
The Gene Ontology (GO) facilitates the description of the action of gene products in a biological context. Many GO terms refer to chemical entities that participate in biological processes. To facilitate accurate and consistent systems-wide biological representation, it is necessary to integrate the chemical view of these entities with the biological view of GO functions and processes. We describe a collaborative effort between the GO and the Chemical Entities of Biological Interest (ChEBI) ontology developers to ensure that the representation of chemicals in the GO is both internally consistent and in alignment with the chemical expertise captured in ChEBI.
Results
We have examined and integrated the ChEBI structural hierarchy into the GO resource through computationally-assisted manual curation of both GO and ChEBI. Our work has resulted in the creation of computable definitions of GO terms that contain fully defined semantic relationships to corresponding chemical terms in ChEBI.
Conclusions
The set of logical definitions using both the GO and ChEBI has already been used to automate aspects of GO development and has the potential to allow the integration of data across the domains of biology and chemistry. These logical definitions are available as an extended version of the ontology from http://purl.obolibrary.org/obo/go/extensions/go-plus.owl.
doi:10.1186/1471-2164-14-513
PMCID: PMC3733925  PMID: 23895341
16.  MouseFinder: candidate disease genes from mouse phenotype data 
Human Mutation  2012;33(5):858-866.
Mouse phenotype data represents a valuable resource for the identification of disease-associated genes, especially where the molecular basis is unknown and there is no clue to the candidate gene’s function, pathway involvement or expression pattern. However, until recently these data have not been systematically used due to difficulties in mapping between clinical features observed in humans and mouse phenotype annotations. Here, we describe a semantic approach to solve this problem and demonstrate highly significant recall of known disease-gene associations and orthology relationships. A web application (MouseFinder; www.mousemodels.org) has been developed to allow users to search the results of our whole-phenome comparison of human and mouse. We demonstrate its use in identifying ARTN as a strong candidate gene within the 1p34.1-p32 mapped locus for a hereditary form of ptosis.
doi:10.1002/humu.22051
PMCID: PMC3327758  PMID: 22331800
phenotype; candidate disease genes; model organism; mouse
17.  Phenotypic overlap in the contribution of individual genes to CNV pathogenicity revealed by cross-species computational analysis of single-gene mutations in humans, mice and zebrafish 
Disease Models & Mechanisms  2012;6(2):358-372.
SUMMARY
Numerous disease syndromes are associated with regions of copy number variation (CNV) in the human genome and, in most cases, the pathogenicity of the CNV is thought to be related to altered dosage of the genes contained within the affected segment. However, establishing the contribution of individual genes to the overall pathogenicity of CNV syndromes is difficult and often relies on the identification of potential candidates through manual searches of the literature and online resources. We describe here the development of a computational framework to comprehensively search phenotypic information from model organisms and single-gene human hereditary disorders, and thus speed the interpretation of the complex phenotypes of CNV disorders. There are currently more than 5000 human genes about which nothing is known phenotypically but for which detailed phenotypic information for the mouse and/or zebrafish orthologs is available. Here, we present an ontology-based approach to identify similarities between human disease manifestations and the mutational phenotypes in characterized model organism genes; this approach can therefore be used even in cases where there is little or no information about the function of the human genes. We applied this algorithm to detect candidate genes for 27 recurrent CNV disorders and identified 802 gene-phenotype associations, approximately half of which involved genes that were previously reported to be associated with individual phenotypic features and half of which were novel candidates. A total of 431 associations were made solely on the basis of model organism phenotype data. Additionally, we observed a striking, statistically significant tendency for individual disease phenotypes to be associated with multiple genes located within a single CNV region, a phenomenon that we denote as pheno-clustering. 
Many of the clusters also display statistically significant similarities in protein function or vicinity within the protein-protein interaction network. Our results provide a basis for understanding previously un-interpretable genotype-phenotype correlations in pathogenic CNVs and for mobilizing the large amount of model organism phenotype data to provide insights into human genetic disorders.
doi:10.1242/dmm.010322
PMCID: PMC3597018  PMID: 23104991
18.  Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research 
F1000Research  2013;2:30.
Phenotype analyses, e.g. investigating metabolic processes, tissue formation, or organism behavior, are an important element of most biological and medical research activities. Biomedical researchers are making increased use of ontological standards and methods to capture the results of such analyses, with one focus being the comparison and analysis of phenotype information between species.
We have generated a cross-species phenotype ontology for human, mouse and zebrafish that contains zebrafish phenotypes. We also provide up-to-date annotation data connecting human genes to phenotype classes from the generated ontology. We have included the data generation pipeline into our continuous integration system ensuring stable and up-to-date releases.
This article describes the data generation process and is intended to help interested researchers access both the phenotype annotation data and the associated cross-species phenotype ontology. The resource described here can be used in sophisticated semantic similarity and gene set enrichment analyses for phenotype data across species. The stable releases of this resource can be obtained from http://purl.obolibrary.org/obo/hp/uberpheno/.
doi:10.12688/f1000research.2-30.v1
PMCID: PMC3799545  PMID: 24358873
19.  A knowledge based approach to matching human neurodegenerative disease and animal models 
Neurodegenerative diseases present a wide and complex range of biological and clinical features. Animal models are key to translational research, yet typically only exhibit a subset of disease features rather than being precise replicas of the disease. Consequently, connecting animal to human conditions using direct data-mining strategies has proven challenging, particularly for diseases of the nervous system, with its complicated anatomy and physiology. To address this challenge we have explored the use of ontologies to create formal descriptions of structural phenotypes across scales that are machine processable and amenable to logical inference. As proof of concept, we built a Neurodegenerative Disease Phenotype Ontology (NDPO) and an associated Phenotype Knowledge Base (PKB) using an entity-quality model that incorporates descriptions for both human disease phenotypes and those of animal models. Entities are drawn from community ontologies made available through the Neuroscience Information Framework (NIF) and qualities are drawn from the Phenotype and Trait Ontology (PATO). We generated ~1200 structured phenotype statements describing structural alterations at the subcellular, cellular and gross anatomical levels observed in 11 human neurodegenerative conditions and associated animal models. PhenoSim, an open source tool for comparing phenotypes, was used to issue a series of competency questions to compare individual phenotypes among organisms and to determine which animal models recapitulate phenotypic aspects of the human disease in aggregate. Overall, the system was able to use relationships within the ontology to bridge phenotypes across scales, returning non-trivial matches based on common subsumers that were meaningful to a neuroscientist with an advanced knowledge of neuroanatomy. The system can be used both to compare individual phenotypes and also phenotypes in aggregate. 
This proof of concept suggests that expressing complex phenotypes using formal ontologies provides considerable benefit for comparing phenotypes across scales and species.
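As a rough illustration of matching on common subsumers, the sketch below represents each phenotype as an (entity, quality) pair and intersects the ancestor sets of two phenotypes. It is a minimal sketch only: the term names, the `IS_A` table, and both functions are hypothetical and are not taken from NDPO, PATO, or PhenoSim.

```python
# Toy is_a hierarchy (child -> list of parents). All term names are
# hypothetical and only stand in for real ontology classes.
IS_A = {
    "Purkinje cell": ["neuron"],
    "dopaminergic neuron": ["neuron"],
    "neuron": ["cell"],
    "cell": [],
    "atrophied": ["decreased size"],
    "decreased size": ["quality"],
    "quality": [],
}


def ancestors(term):
    """Return the term itself plus every class reachable through is_a."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IS_A.get(current, []))
    return seen


def common_subsumers(phenotype_a, phenotype_b):
    """Each phenotype is an (entity, quality) pair.

    Returns every (entity, quality) pair that subsumes both phenotypes.
    """
    entities = ancestors(phenotype_a[0]) & ancestors(phenotype_b[0])
    qualities = ancestors(phenotype_a[1]) & ancestors(phenotype_b[1])
    return {(e, q) for e in entities for q in qualities}


human = ("dopaminergic neuron", "atrophied")
mouse = ("Purkinje cell", "decreased size")
shared = common_subsumers(human, mouse)
```

Here the two phenotypes share no exact term, yet they meet at the more general pair ("neuron", "decreased size"), which is the kind of non-trivial match the abstract describes. A real comparison would also rank the shared pairs, for example by how specific they are.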
doi:10.3389/fninf.2013.00007
PMCID: PMC3653101  PMID: 23717278
phenotype; ontology; Neuroscience Information Framework; neurodegenerative disease; semantics
20.  A Unified Anatomy Ontology of the Vertebrate Skeletal System 
PLoS ONE  2012;7(12):e51070.
The skeleton is of fundamental importance in research in comparative vertebrate morphology, paleontology, biomechanics, developmental biology, and systematics. Motivated by research questions that require computational access to and comparative reasoning across the diverse skeletal phenotypes of vertebrates, we developed a module of anatomical concepts for the skeletal system, the Vertebrate Skeletal Anatomy Ontology (VSAO), to accommodate and unify the existing skeletal terminologies for the species-specific (mouse, the frog Xenopus, zebrafish) and multispecies (teleost, amphibian) vertebrate anatomy ontologies. Previous differences between these terminologies prevented even simple queries across databases pertaining to vertebrate morphology. This module of upper-level and specific skeletal terms currently includes 223 defined terms and 179 synonyms that integrate skeletal cells, tissues, biological processes, organs (skeletal elements such as bones and cartilages), and subdivisions of the skeletal system. The VSAO is designed to integrate with other ontologies, including the Common Anatomy Reference Ontology (CARO), Gene Ontology (GO), Uberon, and Cell Ontology (CL), and it is freely available to the community to be updated with additional terms required for research. Its structure accommodates anatomical variation among vertebrate species in development, structure, and composition. Annotation of diverse vertebrate phenotypes with this ontology will enable novel inquiries across the full spectrum of phenotypic diversity.
doi:10.1371/journal.pone.0051070
PMCID: PMC3519498  PMID: 23251424
21.  The Plant Ontology as a Tool for Comparative Plant Anatomy and Genomic Analyses 
Plant and Cell Physiology  2012;54(2):e1.
The Plant Ontology (PO; http://www.plantontology.org/) is a publicly available, collaborative effort to develop and maintain a controlled, structured vocabulary (‘ontology’) of terms to describe plant anatomy, morphology and the stages of plant development. The goals of the PO are to link (annotate) gene expression and phenotype data to plant structures and stages of plant development, using the data model adopted by the Gene Ontology. From its original design covering only rice, maize and Arabidopsis, the scope of the PO has been expanded to include all green plants. The PO was the first multispecies anatomy ontology developed for the annotation of genes and phenotypes. Also, to our knowledge, it was one of the first biological ontologies that provides translations (via synonyms) in non-English languages such as Japanese and Spanish. As of Release #18 (July 2012), there are about 2.2 million annotations linking PO terms to >110,000 unique data objects representing genes or gene models, proteins, RNAs, germplasm and quantitative trait loci (QTLs) from 22 plant species. In this paper, we focus on the plant anatomical entity branch of the PO, describing the organizing principles, resources available to users and examples of how the PO is integrated into other plant genomics databases and web portals. We also provide two examples of comparative analyses, demonstrating how the ontology structure and PO-annotated data can be used to discover the patterns of expression of the LEAFY (LFY) and terpene synthase (TPS) gene homologs.
doi:10.1093/pcp/pcs163
PMCID: PMC3583023  PMID: 23220694
Bioinformatics; Comparative genomics; Genome annotation; Ontology; Plant anatomy; Terpene synthase
22.  Ontologies as integrative tools for plant science 
American journal of botany  2012;99(8):1263-1275.
Premise of the study
Bio-ontologies are essential tools for accessing and analyzing the rapidly growing pool of plant genomic and phenomic data. Ontologies provide structured vocabularies to support consistent aggregation of data and a semantic framework for automated analyses and reasoning. They are a key component of the semantic web.
Methods
This paper provides background on what bio-ontologies are, why they are relevant to botany, and the principles of ontology development. It includes an overview of ontologies and related resources that are relevant to plant science, with a detailed description of the Plant Ontology (PO). We discuss the challenges of building an ontology that covers all green plants (Viridiplantae).
Key results
Ontologies can advance plant science in four key areas: (1) comparative genetics, genomics, phenomics, and development; (2) taxonomy and systematics; (3) semantic applications; and (4) education.
Conclusions
Bio-ontologies offer a flexible framework for comparative plant biology, based on common botanical understanding. As genomic and phenomic data become available for more species, we anticipate that the annotation of data with ontology terms will become less centralized, while at the same time, the need for cross-species queries will become more common, causing more researchers in plant science to turn to ontologies.
doi:10.3732/ajb.1200222
PMCID: PMC3492881  PMID: 22847540
bio-ontologies; genome annotation; OBO Foundry; phenomics; plant anatomy; plant genomics; Plant Ontology; plant systematics; semantic web
23.  On the Use of Gene Ontology Annotations to Assess Functional Similarity among Orthologs and Paralogs: A Short Report 
PLoS Computational Biology  2012;8(2):e1002386.
A recent paper (Nehrt et al., PLoS Comput. Biol. 7:e1002073, 2011) has proposed a metric for the “functional similarity” between two genes that uses only the Gene Ontology (GO) annotations directly derived from published experimental results. Applying this metric, the authors concluded that paralogous genes within the mouse genome or the human genome are more functionally similar on average than orthologous genes between these genomes, an unexpected result with broad implications if true. We suggest, based on both theoretical and empirical considerations, that this proposed metric should not be interpreted as a functional similarity, and therefore cannot be used to support any conclusions about the “ortholog conjecture” (or, more properly, the “ortholog functional conservation hypothesis”). First, we reexamine the case studies presented by Nehrt et al. as examples of orthologs with divergent functions, and come to a very different conclusion: they actually exemplify how GO annotations for orthologous genes provide complementary information about conserved biological functions. We then show that there is a global ascertainment bias in the experiment-based GO annotations for human and mouse genes: particular types of experiments tend to be performed in different model organisms. We conclude that the reported statistical differences in annotations between pairs of orthologous genes do not reflect differences in biological function, but rather complementarity in experimental approaches. 
Our results underscore two general considerations for researchers proposing novel types of analysis based on the GO: 1) that GO annotations are often incomplete, potentially in a biased manner, and subject to an “open world assumption” (absence of an annotation does not imply absence of a function), and 2) that conclusions drawn from a novel, large-scale GO analysis should whenever possible be supported by careful, in-depth examination of examples, to help ensure the conclusions have a justifiable biological basis.
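The open world assumption in point 1 can be made concrete with a small sketch. It assumes a toy annotation table with hypothetical gene and term names; the point is only that a missing annotation is reported as unknown rather than as absence of function.

```python
# Experiment-based annotations per gene. Gene and term names are hypothetical.
ANNOTATIONS = {
    "geneA_mouse": {"DNA repair"},
    "geneA_human": {"protein binding"},
}


def has_function(gene, term):
    """Return True if the gene is annotated to the term, otherwise None.

    None means "unknown": under the open world assumption the function
    never returns False, because no annotation is not a negative result.
    """
    return True if term in ANNOTATIONS.get(gene, set()) else None


def compare(gene_1, gene_2, term):
    """Classify a term as 'shared' or 'unknown' for a pair of genes."""
    first = has_function(gene_1, term)
    second = has_function(gene_2, term)
    if first and second:
        return "shared"
    # A missing annotation on either side is not evidence of divergence.
    return "unknown"


result = compare("geneA_mouse", "geneA_human", "DNA repair")
```

A closed-world comparison would count this pair as a functional difference. Treating it as unknown leaves room for the explanation given in the abstract, namely that different kinds of experiments were performed in each organism.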
Author Summary
Understanding gene function—how individual genes contribute to the biology of an organism at the molecular, cellular and organism levels—is one of the primary aims of biomedical research. It has been a longstanding tenet of model organism research that experimental knowledge obtained in one organism is often applicable to other organisms, particularly if the organisms share the relevant genes because they inherited them from their common ancestor. Nevertheless this tenet is, like any hypothesis, not beyond question. A recent paper has termed this hypothesis a “conjecture,” and performed a statistical analysis, the results of which were interpreted as evidence against the hypothesis. This statistical analysis relied on a computational representation of gene function, the Gene Ontology (GO). As representatives of the international consortium that produces the GO, we show how the apparent evidence against the “ortholog conjecture” can be better explained as an artifact of how molecular biology knowledge is accumulated. In short, a complementarity between knowledge obtained in mouse and human experimental systems was incorrectly interpreted as a disagreement. We discuss the proper interpretation of GO annotations and potential sources of bias, with an eye toward enhancing the informed use of the GO by the scientific community.
doi:10.1371/journal.pcbi.1002386
PMCID: PMC3280971  PMID: 22359495
24.  Hematopoietic Cell Types: Prototype for a Revised Cell Ontology 
The Cell Ontology (CL) aims for the representation of in vivo and in vitro cell types from all of biology. The CL is a candidate reference ontology of the OBO Foundry and requires extensive revision to bring it up to current standards for biomedical ontologies, both in its structure and its coverage of various subfields of biology. We have now addressed the specific content of one area of the CL, the section of the ontology dealing with hematopoietic cells. This section has been extensively revised to improve its content and eliminate multiple inheritance in the asserted hierarchy, and the groundwork was laid for structuring the hematopoietic cell type terms as cross-products incorporating logical definitions built from relationships to external ontologies, such as the Protein Ontology and the Gene Ontology. The methods and improvements to the CL in this area represent a paradigm for improvement of the entire ontology over time.
doi:10.1016/j.jbi.2010.01.006
PMCID: PMC2892030  PMID: 20123131
ontology; hematopoietic cells; immunology
25.  Cross-Product Extensions of the Gene Ontology 
The Gene Ontology (GO) consists of nearly 30,000 classes for describing the activities and locations of gene products. Manual maintenance of an ontology of this size is a considerable effort, and errors and inconsistencies inevitably arise. Reasoners can be used to assist with ontology development, automatically placing classes in a subsumption hierarchy based on their properties. However, the historic lack of computable definitions within the GO has prevented the use of these tools.
In this paper we present preliminary results of an ongoing effort to normalize the GO by explicitly stating the definitions of compositional classes in a form that can be used by reasoners. These definitions are partitioned into mutually exclusive cross-product sets, many of which reference other OBO Foundry candidate ontologies for chemical entities, proteins, biological qualities and anatomical entities. Using these logical definitions we are gradually beginning to automate many aspects of ontology development, detecting errors and filling in missing relationships. These definitions also enhance the GO by weaving it into the fabric of a wider collection of interoperating ontologies, increasing opportunities for data integration and enhancing genomic analyses.
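To illustrate how logical definitions let a tool fill in missing relationships, the sketch below gives each compositional class a genus, a relation, and a filler drawn from a second toy ontology, then infers parents by a simple structural check. The class names, the relation name `acquires_features_of`, and the functions are all hypothetical, and the check is a deliberately simplified stand-in for a real reasoner.

```python
# Asserted is_a links (child -> parent, single parent for simplicity).
# All names are hypothetical placeholders.
IS_A = {
    "neuron": "cell",
    "cell": None,
    "differentiation": None,
}

# Cross-product definitions: class -> (genus, relation, filler).
DEFS = {
    "cell differentiation": ("differentiation", "acquires_features_of", "cell"),
    "neuron differentiation": ("differentiation", "acquires_features_of", "neuron"),
}


def is_a(child, parent):
    """True if child equals parent or reaches it by following is_a links."""
    while child is not None:
        if child == parent:
            return True
        child = IS_A.get(child)
    return False


def subsumes(parent, child):
    """True if the parent's definition is at least as general as the child's."""
    p_genus, p_rel, p_filler = DEFS[parent]
    c_genus, c_rel, c_filler = DEFS[child]
    return p_rel == c_rel and is_a(c_genus, p_genus) and is_a(c_filler, p_filler)


def infer_parents(name):
    """List every other defined class whose definition subsumes this one."""
    return sorted(p for p in DEFS if p != name and subsumes(p, name))


inferred = infer_parents("neuron differentiation")
```

No link between the two differentiation classes is asserted anywhere; it follows from the single external fact that a neuron is a cell. Comparing such inferred links with the hand-curated hierarchy is one way that missing or inconsistent relationships could be detected.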
doi:10.1016/j.jbi.2010.02.002
PMCID: PMC2910209  PMID: 20152934
