Ontologies have become a critical component of many applications in biomedical informatics. However, the landscape of the ontology tools today is largely fragmented, with independent tools for ontology editing, publishing, and peer review: users develop an ontology in an ontology editor, such as Protégé; and publish it on a Web server or in an ontology library, such as BioPortal, in order to share it with the community; they use the tools provided by the library or mailing lists and bug trackers to collect feedback from users. In this paper, we present a set of tools that bring the ontology editing and publishing closer together, in an integrated platform for the entire ontology lifecycle. This integration streamlines the workflow for collaborative development and increases integration between the ontologies themselves through the reuse of terms.
The National Center for Biomedical Ontology is now in its seventh year. The goals of this National Center for Biomedical Computing are to: create and maintain a repository of biomedical ontologies and terminologies; build tools and web services to enable the use of ontologies and terminologies in clinical and translational research; educate their trainees and the scientific community broadly about biomedical ontology and ontology-based technology and best practices; and collaborate with a variety of groups who develop and use ontologies and terminologies in biomedicine. The centerpiece of the National Center for Biomedical Ontology is a web-based resource known as BioPortal. BioPortal makes available for research in computationally useful forms more than 270 of the world's biomedical ontologies and terminologies, and supports a wide range of web services that enable investigators to use the ontologies to annotate and retrieve data, to generate value sets and special-purpose lexicons, and to perform advanced analytics on a wide range of biomedical data.
Collaborative technologies; knowledge representations; knowledge acquisition and knowledge management; controlled terminologies and vocabularies; ontologies; knowledge bases; applications that link biomedical knowledge from diverse primary sources (includes automated indexing); statistical analysis of large datasets; methods for integration of information from disparate sources; discovery; and text and data mining methods; automated learning; information retrieval; HIT data standards; representing; identifying; and modeling biological structures; developing and refining ehr data standards (including image standards)
Ontologies are commonly used in biomedicine to organize concepts to describe domains such as anatomies, environments, experiment, taxonomies etc. NCBO BioPortal currently hosts about 180 different biomedical ontologies. These ontologies have been mainly expressed in either the Open Biomedical Ontology (OBO) format or the Web Ontology Language (OWL). OBO emerged from the Gene Ontology, and supports most of the biomedical ontology content. In comparison, OWL is a Semantic Web language, and is supported by the World Wide Web consortium together with integral query languages, rule languages and distributed infrastructure for information interchange. These features are highly desirable for the OBO content as well. A convenient method for leveraging these features for OBO ontologies is by transforming OBO ontologies to OWL.
We have developed a methodology for translating OBO ontologies to OWL using the organization of the Semantic Web itself to guide the work. The approach reveals that the constructs of OBO can be grouped together to form a similar layer cake. Thus we were able to decompose the problem into two parts. Most OBO constructs have easy and obvious equivalence to a construct in OWL. A small subset of OBO constructs requires deeper consideration. We have defined transformations for all constructs in an effort to foster a standard common mapping between OBO and OWL. Our mapping produces OWL-DL, a Description Logics based subset of OWL with desirable computational properties for efficiency and correctness. Our Java implementation of the mapping is part of the official Gene Ontology project source.
Our transformation system provides a lossless roundtrip mapping for OBO ontologies, i.e. an OBO ontology may be translated to OWL and back without loss of knowledge. In addition, it provides a roadmap for bridging the gap between the two ontology languages in order to enable the use of ontology content in a language independent manner.
In this article we describe an approach to representing and building ontologies
advocated by the Bioinformatics and Medical Informatics groups at the University
of Manchester. The hand-crafting of ontologies offers an easy and rapid avenue to
delivering ontologies. Experience has shown that such approaches are unsustainable.
Description logic approaches have been shown to offer computational support for
building sound, complete and logically consistent ontologies. A new knowledge
representation language, DAML + OIL, offers a new standard that is able to support
many styles of ontology, from hand-crafted to full logic-based descriptions with
reasoning support. We describe this language, the OilEd editing tool, reasoning
support and a strategy for the language’s use. We finish with a current example,
in the Gene Ontology Next Generation (GONG) project, that uses DAML + OIL as
the basis for moving the Gene Ontology from its current hand-crafted, form to one
that uses logical descriptions of a concept’s properties to deliver a more complete
version of the ontology.
Recently, ideas from the field of ontology have been picked up by computer scientists as a basis for encoding knowledge and with the hope of achieving interoperability and intelligent system behavior. The use of anatomy ontologies to represent space in biological organisms, specifically mouse and human are reviewed here.
Ontology has long been the preserve of philosophers and logicians. Recently, ideas from this field have been picked up by computer scientists as a basis for encoding knowledge and with the hope of achieving interoperability and intelligent system behavior. In bioinformatics, ontologies might allow hitherto impossible query and data-mining activities. We review the use of anatomy ontologies to represent space in biological organisms, specifically mouse and human.
Cheminformatics is the application of informatics techniques to solve chemical problems in silico. There are many areas in biology where cheminformatics plays an important role in computational research, including metabolism, proteomics, and systems biology. One critical aspect in the application of cheminformatics in these fields is the accurate exchange of data, which is increasingly accomplished through the use of ontologies. Ontologies are formal representations of objects and their properties using a logic-based ontology language. Many such ontologies are currently being developed to represent objects across all the domains of science. Ontologies enable the definition, classification, and support for querying objects in a particular domain, enabling intelligent computer applications to be built which support the work of scientists both within the domain of interest and across interrelated neighbouring domains. Modern chemical research relies on computational techniques to filter and organise data to maximise research productivity. The objects which are manipulated in these algorithms and procedures, as well as the algorithms and procedures themselves, enjoy a kind of virtual life within computers. We will call these information entities. Here, we describe our work in developing an ontology of chemical information entities, with a primary focus on data-driven research and the integration of calculated properties (descriptors) of chemical entities within a semantic web context. Our ontology distinguishes algorithmic, or procedural information from declarative, or factual information, and renders of particular importance the annotation of provenance to calculated data. The Chemical Information Ontology is being developed as an open collaborative project. More details, together with a downloadable OWL file, are available at http://code.google.com/p/semanticchemistry/ (license: CC-BY-SA).
We propose a typology of representational artifacts for health care and life sciences domains and associate this typology with different kinds of formal ontology and logic, drawing conclusions as to the strengths and limitations for ontology of different kinds of logical resources, with a focus on description logics.
The four types of domain representation we consider are: (i) lexico-semantic representation, (ii) representation of types of entities, (iii) representations of background knowledge, and (iv) representation of individuals.
We advocate a clear distinction of the four kinds of representation in order to provide a more rational basis for using of ontologies and related artifacts to advance integration of data and interoperability of associated reasoning systems.
We highlight the fact that only a minor portion of scientifically relevant facts in a domain such as biomedicine can be adequately represented by formal ontologies when the latter are conceived as representations of entity types. In particular, the attempt to encode default or probabilistic knowledge using ontologies so conceived is prone to produce unintended, erroneous models.
biomedical ontology; description logic; formal ontology; knowledge representation
Data generated from cancer nanotechnology research are so diverse and large in volume that it is difficult to share and efficiently use them without informatics tools. In particular, ontologies that provide a unifying knowledge framework for annotating the data are required to facilitate the semantic integration, knowledge-based searching, unambiguous interpretation, mining and inferencing of the data using informatics methods. In this paper, we discuss the design and development of NanoParticle Ontology (NPO), which is developed within the framework of the Basic Formal Ontology (BFO), and implemented in the Ontology Web Language (OWL) using well-defined ontology design principles. The NPO was developed to represent knowledge underlying the preparation, chemical composition, and characterization of nanomaterials involved in cancer research. Public releases of the NPO are available through BioPortal website, maintained by the National Center for Biomedical Ontology. Mechanisms for editorial and governance processes are being developed for the maintenance, review, and growth of the NPO.
ontology; cancer; nanotechnology; nanoparticle; nanomaterial; informatics; NPO; BFO; annotation; BioPortal
To develop and apply formal ontology creation methods to the domain of antimicrobial prescribing and to formally evaluate the resulting ontology through intrinsic and extrinsic evaluation studies.
We extended existing ontology development methods to create the ontology and implemented the ontology using Protégé-OWL. Correctness of the ontology was assessed using a set of ontology design principles and domain expert review via the laddering technique. We created three artifacts to support the extrinsic evaluation (set of prescribing rules, alerts and an ontology-driven alert module, and a patient database) and evaluated the usefulness of the ontology for performing knowledge management tasks to maintain the ontology and for generating alerts to guide antibiotic prescribing.
The ontology includes 199 classes, 10 properties, and 1,636 description logic restrictions. Twenty-three Semantic Web Rule Language rules were written to generate three prescribing alerts: 1) antibiotic-microorganism mismatch alert; 2) medication-allergy alert; and 3) non-recommended empiric antibiotic therapy alert. The evaluation studies confirmed the correctness of the ontology, usefulness of the ontology for representing and maintaining antimicrobial treatment knowledge rules, and usefulness of the ontology for generating alerts to provide feedback to clinicians during antibiotic prescribing.
This study contributes to the understanding of ontology development and evaluation methods and addresses one knowledge gap related to using ontologies as a clinical decision support system component—a need for formal ontology evaluation methods to measure their quality from the perspective of their intrinsic characteristics and their usefulness for specific tasks.
Ontology; Clinical decision support; Evaluation
We present an analysis of some considerations involved in expressing the Gene Ontology (GO) as a machine-processible ontology, reflecting principles of formal ontology.
GO is a controlled vocabulary that is intended to facilitate communication between
biologists by standardizing usage of terms in database annotations. Making such
controlled vocabularies maximally useful in support of bioinformatics applications
requires explicating in machine-processible form the implicit background information
that enables human users to interpret the meaning of the vocabulary terms.
In the case of GO, this process would involve rendering the meanings of GO into
a formal (logical) language with the help of domain experts, and adding additional
information required to support the chosen formalization. A controlled vocabulary
augmented in these ways is commonly called an ontology. In this paper, we make a
modest exploration to determine the ontological requirements for this extended version
of GO. Using the terms within the three GO hierarchies (molecular function,
biological process and cellular component), we investigate the facility with which
GO concepts can be ontologized, using available tools from the philosophical and
ontological engineering literature.
The bio-ontology community falls into two camps: first we have biology domain experts, who actually hold the knowledge we wish to capture in ontologies; second, we have ontology specialists, who hold knowledge about techniques and best practice on ontology development. In the bio-ontology domain, these two camps have often come into conflict, especially where pragmatism comes into conflict with perceived best practice. One of these areas is the insistence of computer scientists on a well-defined semantic basis for the Knowledge Representation language being used. In this article, we will first describe why this community is so insistent. Second, we will illustrate this by examining the semantics of the Web Ontology Language and the semantics placed on the Directed Acyclic Graph as used by the Gene Ontology. Finally we will reconcile the two representations, including the broader Open Biomedical Ontologies format. The ability to exchange between the two representations means that we can capitalise on the features of both languages. Such utility can only arise by the understanding of the semantics of the languages being used. By this illustration of the usefulness of a clear, well-defined language semantics, we wish to promote a wider understanding of the computer science perspective amongst potential users within the biological community.
Motivation: For many years, the Unified Medical Language System (UMLS) semantic network (SN) has been used as an upper-level semantic framework for the categorization of terms from terminological resources in biomedicine. BioTop has recently been developed as an upper-level ontology for the biomedical domain. In contrast to the SN, it is founded upon strict ontological principles, using OWL DL as a formal representation language, which has become standard in the semantic Web. In order to make logic-based reasoning available for the resources annotated or categorized with the SN, a mapping ontology was developed aligning the SN with BioTop.
Methods: The theoretical foundations and the practical realization of the alignment are being described, with a focus on the design decisions taken, the problems encountered and the adaptations of BioTop that became necessary. For evaluation purposes, UMLS concept pairs obtained from MEDLINE abstracts by a named entity recognition system were tested for possible semantic relationships. Furthermore, all semantic-type combinations that occur in the UMLS Metathesaurus were checked for satisfiability.
Results: The effort-intensive alignment process required major design changes and enhancements of BioTop and brought up several design errors that could be fixed. A comparison between a human curator and the ontology yielded only a low agreement. Ontology reasoning was also used to successfully identify 133 inconsistent semantic-type combinations.
Availability: BioTop, the OWL DL representation of the UMLS SN, and the mapping ontology are available at http://www.purl.org/biotop/.
Many biomedical terminologies, classifications, and ontological resources such as the NCI Thesaurus (NCIT), International Classification of Diseases (ICD), Systematized Nomenclature of Medicine (SNOMED), Current Procedural Terminology (CPT), and Gene Ontology (GO) have been developed and used to build a variety of IT applications in biology, biomedicine, and health care settings. However, virtually all these resources involve incompatible formats, are based on different modeling languages, and lack appropriate tooling and programming interfaces (APIs) that hinder their wide-scale adoption and usage in a variety of application contexts. The Lexical Grid (LexGrid) project introduced in this paper is an ongoing community-driven initiative, coordinated by the Mayo Clinic Division of Biomedical Statistics and Informatics, designed to bridge this gap using a common terminology model called the LexGrid model. The key aspect of the model is to accommodate multiple vocabulary and ontology distribution formats and support of multiple data stores for federated vocabulary distribution. The model provides a foundation for building consistent and standardized APIs to access multiple vocabularies that support lexical search queries, hierarchy navigation, and a rich set of features such as recursive subsumption (e.g., get all the children of the concept penicillin). Existing LexGrid implementations include the LexBIG API as well as a reference implementation of the HL7 Common Terminology Services (CTS) specification providing programmatic access via Java, Web, and Grid services.
Integrative neuroscience research needs a scalable informatics framework that enables semantic integration of diverse types of neuroscience data. This paper describes the use of the Web Ontology Language (OWL) and other Semantic Web technologies for the representation and integration of molecular-level data provided by several of SenseLab suite of neuroscience databases.
Based on the original database structure, we semi-automatically translated the databases into OWL ontologies with manual addition of semantic enrichment. The SenseLab ontologies are extensively linked to other biomedical Semantic Web resources, including the Subcellular Anatomy Ontology, Brain Architecture Management System, the Gene Ontology, BIRNLex and UniProt. The SenseLab ontologies have also been mapped to the Basic Formal Ontology and Relation Ontology, which helps ease interoperability with many other existing and future biomedical ontologies for the Semantic Web. In addition, approaches to representing contradictory research statements are described. The SenseLab ontologies are designed for use on the Semantic Web that enables their integration into a growing collection of biomedical information resources.
We demonstrate that our approach can yield significant potential benefits and that the Semantic Web is rapidly becoming mature enough to realize its anticipated promises. The ontologies are available online at http://neuroweb.med.yale.edu/senselab/
Semantic Web; neuroscience; description logic; ontology mapping; Web Ontology Language; integration
The volume of publicly available data in biomedicine is constantly increasing. However, these data are stored in different formats and on different platforms. Integrating these data will enable us to facilitate the pace of medical discoveries by providing scientists with a unified view of this diverse information. Under the auspices of the National Center for Biomedical Ontology (NCBO), we have developed the Resource Index—a growing, large-scale ontology-based index of more than twenty heterogeneous biomedical resources. The resources come from a variety of repositories maintained by organizations from around the world. We use a set of over 200 publicly available ontologies contributed by researchers in various domains to annotate the elements in these resources. We use the semantics that the ontologies encode, such as different properties of classes, the class hierarchies, and the mappings between ontologies, in order to improve the search experience for the Resource Index user. Our user interface enables scientists to search the multiple resources quickly and efficiently using domain terms, without even being aware that there is semantics “under the hood.”
semantic Web; ontology-based indexing; semantic annotation; data integration; information mining; information retrieval; biomedical data; biomedical ontologies
Motivation: The systematic observation of phenotypes has become a crucial tool of functional genomics, and several large international projects are currently underway to identify and characterize the phenotypes that are associated with genotypes in several species. To integrate phenotype descriptions within and across species, phenotype ontologies have been developed. Applying ontologies to unify phenotype descriptions in the domain of physiology has been a particular challenge due to the high complexity of the underlying domain.
Results: In this study, we present the outline of a theory and its implementation for an ontology of physiology-related phenotypes. We provide a formal description of process attributes and relate them to the attributes of their temporal parts and participants. We apply our theory to create the Cellular Phenotype Ontology (CPO). The CPO is an ontology of morphological and physiological phenotypic characteristics of cells, cell components and cellular processes. Its prime application is to provide terms and uniform definition patterns for the annotation of cellular phenotypes. The CPO can be used for the annotation of observed abnormalities in domains, such as systems microscopy, in which cellular abnormalities are observed and for which no phenotype ontology has been created.
Availability and implementation: The CPO and the source code we generated to create the CPO are freely available on http://cell-phenotype.googlecode.com.
Supplementary data are available at Bioinformatics online.
Application oriented ontologies are important for reliably communicating and
managing data in databases. Unfortunately, they often differ in the
definitions they use and thus do not live up to their potential. This
problem can be reduced when using a standardized and ontologically
consistent template for the top-level categories from a top-level formal
foundational ontology. This would support ontological consistency within
application oriented ontologies and compatibility between them. The Basic
Formal Ontology (BFO) is such a foundational ontology for the biomedical
domain that has been developed following the single inheritance policy. It
provides the top-level template within the Open Biological and Biomedical
Ontologies Foundry. If it wants to live up to its expected role, its three
top-level categories of material entity (i.e., ‘object’,
‘fiat object part’, ‘object
aggregate’) must be exhaustive, i.e. every concrete material entity
must instantiate exactly one of them.
By systematically evaluating all possible basic configurations of material
building blocks we show that BFO's top-level categories of material
entity are not exhaustive. We provide examples from biology and everyday
life that demonstrate the necessity for two additional categories:
‘fiat object part aggregate’ and
‘object with fiat object part aggregate’. By
distinguishing topological coherence, topological adherence, and metric
proximity we furthermore provide a differentiation of clusters and groups as
two distinct subcategories for each of the three categories of material
entity aggregates, resulting in six additional subcategories of material
We suggest extending BFO to incorporate two additional categories of material
entity as well as two subcategories for each of the three categories of
material entity aggregates. With these additions, BFO would exhaustively
cover all top-level types of material entity that application oriented
ontologies may use as templates. Our result, however, depends on the premise
that all material entities are organized according to a constitutive
The Cell Ontology (CL) aims for the representation of in vivo and in vitro cell types from all of biology. The CL is a candidate reference ontology of the OBO Foundry and requires extensive revision to bring it up to current standards for biomedical ontologies, both in its structure and its coverage of various subfields of biology. We have now addressed the specific content of one area of the CL, the section of the ontology dealing with hematopoietic cells. This section has been extensively revised to improve its content and eliminate multiple inheritance in the asserted hierarchy, and the groundwork was laid for structuring the hematopoietic cell type terms as cross-products incorporating logical definitions built from relationships to external ontologies, such as the Protein Ontology and the Gene Ontology. The methods and improvements to the CL in this area represent a paradigm for improvement of the entire ontology over time.
ontology; hematopoietic cells; immunology
We begin by describing recent developments in the burgeoning discipline of applied ontology, focusing especially on the ways ontologies are providing a means for the consistent representation of scientific data. We then introduce Basic Formal Ontology (BFO), a top-level ontology that is serving as domain-neutral framework for the development of lower level ontologies in many specialist disciplines, above all in biology and medicine. BFO is a bicategorial ontology, embracing both three-dimensionalist (continuant) and four-dimensionalist (occurrent) perspectives within a single framework. We examine how BFO-conformant domain ontologies can deal with the consistent representation of scientific data deriving from the measurement of processes of different types, and we outline on this basis the first steps of an approach to the classification of such processes within the BFO framework.1
The OBML 2010 workshop, held at the University of Mannheim on September 9-10, 2010, is the 2nd in a series of meetings organized by the Working Group “Ontologies in Biomedicine and Life Sciences” of the German Society of Computer Science (GI) and the German Society of Medical Informatics, Biometry and Epidemiology (GMDS). Integrating, processing and applying the rapidly expanding information generated in the life sciences — from public health to clinical care and molecular biology — is one of the most challenging problems that research in these fields is facing today. As the amounts of experimental data, clinical information and scientific knowledge increase, there is a growing need to promote interoperability of these resources, support formal analyses, and to pre-process knowledge for further use in problem solving and hypothesis formulation.
The OBML workshop series pursues the aim of gathering scientists who research topics related to life science ontologies, to exchange ideas, discuss new results and establish relationships. The OBML group promotes the collaboration between ontologists, computer scientists, bio-informaticians and applied logicians, as well as the cooperation with physicians, biologists, biochemists and biometricians, and supports the establishment of this new discipline in research and teaching. Research topics of OBML 2010 included medical informatics, Semantic Web applications, formal ontology, bio-ontologies, knowledge representation as well as the wide range of applications of biomedical ontologies to science and medicine. A total of 14 papers were presented, and from these we selected four manuscripts for inclusion in this special issue.
An interdisciplinary audience from all areas related to biomedical ontologies attended OBML 2010. In the future, OBML will continue as an annual meeting that aims to bridge the gap between theory and application of ontologies in the life sciences. The next event emphasizes the special topic of the ontology of phenotypes, in Berlin, Germany on October 6-7, 2011.
The mouse has long been an important model for the study of human genetic disease. Through the application of genetic engineering and mutagenesis techniques, the number of unique mutant mouse models and the amount of phenotypic data describing them are growing exponentially. Describing phenotypes of mutant mice in a computationally useful manner that will facilitate data mining is a major challenge for bioinformatics. Here we describe a tool, the Mammalian Phenotype Ontology (MP), for classifying and organizing phenotypic information related to the mouse and other mammalian species. The MP Ontology has been applied to mouse phenotype descriptions in the Mouse Genome Informatics Database (MGI, http://www.informatics.jax.org/), the Rat Genome Database (RGD, http://rgd.mcw.edu), the Online Mendelian Inheritance in Animals (OMIA, http://omia.angis.org.au/) and elsewhere. Use of this ontology allows comparisons of data from diverse sources, can facilitate comparisons across mammalian species, assists in identifying appropriate experimental disease models, and aids in the discovery of candidate disease genes and molecular signaling pathways.
Ontology; Phenotype; Mammal; Annotation; Model System
Ontologies are intended to capture and formalize a domain of knowledge. The
ontologies comprising the Open Biological Ontologies (OBO) project, which includes
the Gene Ontology (GO), are formalizations of various domains of biological
knowledge. Ontologies within OBO typically lack computable definitions that serve to
differentiate a term from other similar terms. The computer is unable to determine the
meaning of a term, which presents problems for tools such as automated reasoners.
Reasoners can be of enormous benefit in managing a complex ontology. OBO term
names frequently implicitly encode the kind of definitions that can be used by
computational tools, such as automated reasoners. The definitions encoded in the
names are not easily amenable to computation, because the names are ostensibly
natural language phrases designed for human users. These names are highly regular
in their grammar, and can thus be treated as valid sentences in some formal or
computable language.With a description of the rules underlying this formal language,
term names can be parsed to derive computable definitions, which can then be
reasoned over. This paper describes the effort to elucidate that language, called Obol,
and the attempts to reason over the resulting definitions. The current implementation
finds unique non-trivial definitions for around half of the terms in the GO, and
has been used to find 223 missing relationships, which have since been added to
the ontology. Obol has utility as an ontology maintenance tool, and as a means of
generating computable definitions for a whole ontology.
The software is available under an open-source license from: http://www.fruitfly.
org/~cjm/obol. Supplementary material for this article can be found at: http://www.
Motivation: The classification of biological entities in terms of species and taxa is an important endeavor in biology. Although a large amount of statements encoded in current biomedical ontologies is taxon-dependent there is no obvious or standard way for introducing taxon information into an integrative ontology architecture, supposedly because of ongoing controversies about the ontological nature of species and taxa.
Results: In this article, we discuss different approaches on how to represent biological taxa using existing standards for biomedical ontologies such as the description logic OWL DL and the Open Biomedical Ontologies Relation Ontology. We demonstrate how hidden ambiguities of the species concept can be dealt with and existing controversies can be overcome. A novel approach is to envisage taxon information as qualities that inhere in biological organisms, organism parts and populations.
Availability: The presented methodology has been implemented in the domain top-level ontology BioTop, openly accessible at http://purl.org/biotop. BioTop may help to improve the logical and ontological rigor of biomedical ontologies and further provides a clear architectural principle to deal with biological taxa information.
Controlled vocabularies are common within bioinformatics resources. They can be used to
give a summary of the knowledge held about a particular entity. They are also used to
constrain values given for particular attributes of an entity. This helps create a shared
understanding of a domain and aids increased precision and recall during querying of
resources. Ontologies can also provide such facilities, but can also enhance their utility.
Controlled vocabularies are often simply lists of words, but may be viewed as a kind of
ontology. Ideally ontologies are structurally enriched with relationships between terms
within the vocabulary. Use of such rich forms of vocabularies in database annotation could
enhance those resources usability by both humans and computers. The representation of
the knowledge content of biological resources in a computationally accessible form opens
the prospect of greater support for a biologist investigating new data.
Phenotype ontologies are used in species-specific databases for the annotation of mutagenesis experiments and to characterize human diseases. The Entity-Quality (EQ) formalism is a means to describe complex phenotypes based on one or more affected entities and a quality. EQ-based definitions have been developed for many phenotype ontologies, including the Human and Mammalian Phenotype ontologies.
We analyze formalizations of complex phenotype descriptions in the Web Ontology Language (OWL) that are based on the EQ model, identify several representational challenges and analyze potential solutions to address these challenges.
In particular, we suggest a novel, role-based approach to represent relational qualities such as concentration of iron in spleen, discuss its ontological foundation in the General Formal Ontology (GFO) and evaluate its representation in OWL and the benefits it can bring to the representation of phenotype annotations.
Our analysis of OWL-based representations of phenotypes can contribute to improving consistency and expressiveness of formal phenotype descriptions.