The HIV epidemic has been continuously growing among women, and in some parts of the world, HIV-infected women outnumber men. Women's greater vulnerability to HIV, both biologically and socially, influences their health risk and health outcome. This disparity between sexes has been established for other diseases, for example, autoimmune diseases, malignancies and cardiovascular diseases. Differences in drug effects and treatment outcomes have also been demonstrated.
Despite proven sex and gender differences, women continue to be underrepresented in clinical trials, and the absence of gender analyses in published literature is striking. There is a growing advocacy for consideration of women in research, in particular in the HIV field, and gender mainstreaming of policies is increasingly called for. However, these efforts have not translated into improved reporting of sex-disaggregated data and provision of gender analysis in published literature; science editors, as well as publishers, lag behind in this effort.
Instructions for authors issued by journals contain many guidelines for good standards of reporting, and a policy on sex-disaggregated data and gender analysis should not be amiss here. It is time for editors and publishers to demonstrate leadership in changing the paradigm in the world of scientific publication. We encourage authors, peer reviewers and fellow editors to lend their support by taking necessary measures to substantially improve reporting of gender analysis. Editors' associations could play an essential role in facilitating a transition to improved standard editorial policies.
Several fields have created ontologies for their subdomains. For example, the biological sciences have developed extensive ontologies such as the Gene Ontology, which is considered a great success. Ontologies could provide similar advantages to the Modeling and Simulation community. They provide a way to establish common vocabularies and capture knowledge about a particular domain with community-wide agreement. Ontologies can support significantly improved (semantic) search and browsing, integration of heterogeneous information sources, and improved knowledge discovery capabilities. This paper discusses the design and development of an ontology for Modeling and Simulation called the Discrete-event Modeling Ontology (DeMO), and it presents prototype applications that demonstrate various uses and benefits that such an ontology may provide to the Modeling and Simulation community.
Discrete systems; simulation environments; standards; Web-based environments
Biological ontologies are now being widely used for annotation, sharing and retrieval of the biological data. Many of these ontologies are hosted under the umbrella of the Open Biological Ontologies Foundry. In order to support interterminology mapping, composite terms in these ontologies need to be translated into atomic or primitive terms in other, orthogonal ontologies, for example, gluconeogenesis (biological process term) to glucose (chemical ontology term). Identifying such decompositional ontology translations is a challenging problem. In this paper, we propose a network-theoretic approach based on the structure of the integrated OBO relationship graph. We use a network-theoretic measure, called the clustering coefficient, to find relevant atomic terms in the neighborhood of a composite term. By eliminating the existing GO to ChEBI Ontology mappings from OBO, we evaluate whether the proposed approach can re-identify the corresponding relationships. The results indicate that the network structure provides strong cues for decompositional ontology translation and the existing relationships can be used to identify new translations.
network theory; biomedical ontologies; ontology translation; open biomedical ontologies
Ontologies have increasingly been used in the biomedical domain, which has prompted the emergence of different initiatives to facilitate their development and integration. The Open Biological and Biomedical Ontologies (OBO) Foundry consortium provides a repository of life-science ontologies, which are developed according to a set of shared principles. This consortium has developed an ontology called OBO Relation Ontology aiming at standardizing the different types of biological entity classes and associated relationships. Since ontologies are primarily intended to be used by humans, the use of graphical notations for ontology development facilitates the capture, comprehension and communication of knowledge between its users. However, OBO Foundry ontologies are captured and represented basically using text-based notations. The Unified Modeling Language (UML) provides a standard and widely-used graphical notation for modeling computer systems. UML provides a well-defined set of modeling elements, which can be extended using a built-in extension mechanism named Profile. Thus, this work aims at developing a UML profile for the OBO Relation Ontology to provide a domain-specific set of modeling elements that can be used to create standard UML-based ontologies in the biomedical domain.
We have studied the OBO Relation Ontology, the UML metamodel and the UML profiling mechanism. Based on these studies, we have proposed an extension to the UML metamodel in conformance with the OBO Relation Ontology and we have defined a profile that implements the extended metamodel. Finally, we have applied the proposed UML profile in the development of a number of fragments from different ontologies. Particularly, we have considered the Gene Ontology (GO), the PRotein Ontology (PRO) and the Xenopus Anatomy and Development Ontology (XAO).
The use of an established and well-known graphical language in the development of biomedical ontologies provides a more intuitive form of capturing and representing knowledge than using only text-based notations. The use of the profile requires the domain expert to reason about the underlying semantics of the concepts and relationships being modeled, which helps preventing the introduction of inconsistencies in an ontology under development and facilitates the identification and correction of errors in an already defined ontology.
The Gene Ontology (GO) (http://www.geneontology.org/) contains a set of terms for describing the activity and actions of gene products across all kingdoms of life. Each of these activities is executed in a location within a cell or in the vicinity of a cell. In order to capture this context, the GO includes a sub-ontology called the Cellular Component (CC) ontology (GO-CCO). The primary use of this ontology is for GO annotation, but it has also been used for phenotype annotation, and for the annotation of images. Another ontology with similar scope to the GO-CCO is the Subcellular Anatomy Ontology (SAO), part of the Neuroscience Information Framework Standard (NIFSTD) suite of ontologies. The SAO also covers cell components, but in the domain of neuroscience.
Recently, the GO-CCO was enriched in content and links to the Biological Process and Molecular Function branches of GO as well as to other ontologies. This was achieved in several ways. We carried out an amalgamation of SAO terms with GO-CCO ones; as a result, nearly 100 new neuroscience-related terms were added to the GO. The GO-CCO also contains relationships to GO Biological Process and Molecular Function terms, as well as connecting to external ontologies such as the Cell Ontology (CL). Terms representing protein complexes in the Protein Ontology (PRO) reference GO-CCO terms for their species-generic counterparts. GO-CCO terms can also be used to search a variety of databases.
In this publication we provide an overview of the GO-CCO, its overall design, and some recent extensions that make use of additional spatial information. One of the most recent developments of the GO-CCO was the merging in of the SAO, resulting in a single unified ontology designed to serve the needs of GO annotators as well as the specific needs of the neuroscience community.
Gene ontology; Cellular component ontology; Subcellular anatomy ontology; Neuroscience; Annotation; Ontology language; Ontology integration; Neuroscience information framework
In recent years, as a knowledge-based discipline, bioinformatics has been made more computationally amenable. After its beginnings as a technology advocated by computer scientists to overcome problems of heterogeneity, ontology has been taken up by biologists themselves as a means to consistently annotate features from genotype to phenotype. In medical informatics, artifacts called ontologies have been used for a longer period of time to produce controlled lexicons for coding schemes. In this article, we review the current position in ontologies and how they have become institutionalized within biomedicine. As the field has matured, the much older philosophical aspects of ontology have come into play. With this and the institutionalization of ontology has come greater formality. We review this trend and what benefits it might bring to ontologies and their use within biomedicine.
bio-ontology; medical ontology; annotation; knowledge; knowledge representation; history
An abstraction network is an auxiliary network of nodes and links that provides a compact, high-level view of an ontology. Such a view lends support to ontology orientation, comprehension, and quality-assurance efforts. A methodology is presented for deriving a kind of abstraction network, called a partial-area taxonomy, for the Ontology of Clinical Research (OCRe). OCRe was selected as a representative of ontologies implemented using the Web Ontology Language (OWL) based on shared domains. The derivation of the partial-area taxonomy for the Entity hierarchy of OCRe is described. Utilizing the visualization of the content and structure of the hierarchy provided by the taxonomy, the Entity hierarchy is audited, and several errors and inconsistencies in OCRe’s modeling of its domain are exposed. After appropriate corrections are made to OCRe, a new partial-area taxonomy is derived. The generalizability of the paradigm of the derivation methodology to various families of biomedical ontologies is discussed.
We present an analysis of some considerations involved in expressing the Gene Ontology (GO) as a machine-processible ontology, reflecting principles of formal ontology.
GO is a controlled vocabulary that is intended to facilitate communication between
biologists by standardizing usage of terms in database annotations. Making such
controlled vocabularies maximally useful in support of bioinformatics applications
requires explicating in machine-processible form the implicit background information
that enables human users to interpret the meaning of the vocabulary terms.
In the case of GO, this process would involve rendering the meanings of GO into
a formal (logical) language with the help of domain experts, and adding additional
information required to support the chosen formalization. A controlled vocabulary
augmented in these ways is commonly called an ontology. In this paper, we make a
modest exploration to determine the ontological requirements for this extended version
of GO. Using the terms within the three GO hierarchies (molecular function,
biological process and cellular component), we investigate the facility with which
GO concepts can be ontologized, using available tools from the philosophical and
ontological engineering literature.
Effective Medline database exploration is critical for the understanding of high throughput experimental results and the development of novel hypotheses about the mechanisms underlying the targeted biological processes. While existing solutions enhance Medline exploration through different approaches such as document clustering, network presentations of underlying conceptual relationships and the mapping of search results to MeSH and Gene Ontology trees, we believe the use of multiple ontologies from the Open Biomedical Ontology can greatly help researchers to explore literature from different perspectives as well as to quickly locate the most relevant Medline records for further investigation.
We developed an ontology-based interactive Medline exploration solution called PubOnto to enable the interactive exploration and filtering of search results through the use of multiple ontologies from the OBO foundry. The PubOnto program is a rich internet application based on the FLEX platform. It contains a number of interactive tools, visualization capabilities, an open service architecture, and a customizable user interface. It is freely accessible at: .
With the growing amount of biomedical data available in public databases it has become increasingly important to annotate data in a consistent way in order to allow easy access to this rich source of information. Annotating the data using controlled vocabulary terms and ontologies makes it much easier to compare and analyze data from different sources. However, finding the correct controlled vocabulary terms can sometimes be a difficult task for the end user annotating these data.
In order to facilitate the location of the correct term in the correct controlled vocabulary or ontology, the Ontology Lookup Service was created. However, using the Ontology Lookup Service as a web service is not always feasible, especially for researchers without bioinformatics support. We have therefore created a Java front end to the Ontology Lookup Service, called the OLS Dialog, which can be plugged into any application requiring the annotation of data using controlled vocabulary terms, making it possible to find and use controlled vocabulary terms without requiring any additional knowledge about web services or ontology formats.
As a user-friendly open source front end to the Ontology Lookup Service, the OLS Dialog makes it straightforward to include controlled vocabulary support in third-party tools, which ultimately makes the data even more valuable to the biomedical community.
The integration of biomedical terminologies is indispensable to the process
of information integration. When terminologies are linked merely
through the alignment of their leaf terms, however, differences in context
and ontological structure are ignored. Making use of the SNAP and
SPAN ontologies, we show how three reference domain ontologies can be
integrated at a higher level, through what we shall call the OBR framework (for: Ontology
of Biomedical Reality). OBR is designed to facilitate
inference across the boundaries of domain ontologies in anatomy, physiology
ontology integration; top-level ontology; domain ontology; terminology; biomedicine
Integrating and relating images with clinical and molecular data is a crucial activity in translational research, but challenging because the information in images is not explicit in standard computer-accessible formats. We have developed an ontology-based representation of the semantic contents of radiology images called AIM (Annotation and Image Markup). AIM specifies the quantitative and qualitative content that researchers extract from images. The AIM ontology enables semantic image annotation and markup, specifying the entities and relations necessary to describe images. AIM annotations, represented as instances in the ontology, enable key use cases for images in translational research such as disease status assessment, query, and inter-observer variation analysis. AIM will enable ontology-based query and mining of images, and integration of images with data in other ontology-annotated bioinformatics databases. Our ultimate goal is to enable researchers to link images with related scientific data so they can learn the biological and physiological significance of the image content.
The Gene Ontology (GO) has become the internationally accepted standard for representing function, process, and location aspects of gene products. The wealth of GO annotation data provides a valuable source of implicit knowledge of relationships among these aspects. We describe a new method for association rule mining to discover implicit co-occurrence relationships across the GO sub-ontologies at multiple levels of abstraction. Prior work on association rule mining in the GO has concentrated on mining knowledge at a single level of abstraction and/or between terms from the same sub-ontology. We have developed a bottom-up generalization procedure called Cross-Ontology Data Mining-Level by Level (COLL) that takes into account the structure and semantics of the GO, generates generalized transactions from annotation data and mines interesting multi-level cross-ontology association rules. We applied our method on publicly available chicken and mouse GO annotation datasets and mined 5368 and 3959 multi-level cross ontology rules from the two datasets respectively. We show that our approach discovers more and higher quality association rules from the GO as evaluated by biologists in comparison to previously published methods. Biologically interesting rules discovered by our method reveal unknown and surprising knowledge about co-occurring GO terms.
Sensor networks are a concept that has become very popular in data acquisition and processing for multiple applications in different fields such as industrial, medicine, home automation, environmental detection, etc. Today, with the proliferation of small communication devices with sensors that collect environmental data, semantic Web technologies are becoming closely related with sensor networks. The linking of elements from Semantic Web technologies with sensor networks has been called Semantic Sensor Web and has among its main features the use of ontologies. One of the key challenges of using ontologies in sensor networks is to provide mechanisms to integrate and exchange knowledge from heterogeneous sources (that is, dealing with semantic heterogeneity). Ontology alignment is the process of bringing ontologies into mutual agreement by the automatic discovery of mappings between related concepts. This paper presents a system for ontology alignment in the Semantic Sensor Web which uses fuzzy logic techniques to combine similarity measures between entities of different ontologies. The proposed approach focuses on two key elements: the terminological similarity, which takes into account the linguistic and semantic information of the context of the entity's names, and the structural similarity, based on both the internal and relational structure of the concepts. This work has been validated using sensor network ontologies and the Ontology Alignment Evaluation Initiative (OAEI) tests. The results show that the proposed techniques outperform previous approaches in terms of precision and recall.
semantic sensor web; ontology alignment; fuzzy logic
Development of an ontology for the description of crystallization experiments and results is proposed.
When crystallization screening is conducted many outcomes are observed but typically the only trial recorded in the literature is the condition that yielded the crystal(s) used for subsequent diffraction studies. The initial hit that was optimized and the results of all the other trials are lost. These missing results contain information that would be useful for an improved general understanding of crystallization. This paper provides a report of a crystallization data exchange (XDX) workshop organized by several international large-scale crystallization screening laboratories to discuss how this information may be captured and utilized. A group that administers a significant fraction of the world’s crystallization screening results was convened, together with chemical and structural data informaticians and computational scientists who specialize in creating and analysing large disparate data sets. The development of a crystallization ontology for the crystallization community was proposed. This paper (by the attendees of the workshop) provides the thoughts and rationale leading to this conclusion. This is brought to the attention of the wider audience of crystallographers so that they are aware of these early efforts and can contribute to the process going forward.
crystallization screening data; crystallization ontology
The technological advances in the past decade have lead to massive progress in the field of biotechnology. The documentation of the progress made exists in the form of research articles. The PubMed is the current most used repository for bio-literature. PubMed consists of about 17 million abstracts as of 2007 that require methods to efficiently retrieve and browse large volume of relevant information. The State-of-the-art technologies such as GOPubmed use simple keyword-based techniques for retrieving abstracts from the PubMed and linking them to the Gene Ontology (GO). This paper changes the paradigm by introducing semantics enabled technique to link the PubMed to the Gene Ontology, called, SEGOPubmed for ontology-based browsing. Latent Semantic Analysis (LSA) framework is used to semantically interface PubMed abstracts to the Gene Ontology.
The Empirical analysis is performed to compare the performance of the SEGOPubmed with the GOPubmed. The analysis is initially performed using a few well-referenced query words. Further, statistical analysis is performed using GO curated dataset as ground truth. The analysis suggests that the SEGOPubmed performs better than the classic GOPubmed as it incorporates semantics.
The LSA technique is applied on the PubMed abstracts obtained based on the user query and the semantic similarity between the query and the abstracts. The analyses using well-referenced keywords show that the proposed semantic-sensitive technique outperformed the string comparison based techniques in associating the relevant abstracts to the GO terms. The SEGOPubmed also extracted the abstracts in which the keywords do not appear in isolation (i.e. they appear in combination with other terms) that could not be retrieved by simple term matching techniques.
In recent years, ontologies have become a mainstream topic in biomedical research. When biological entities are described using a common schema, such as an ontology, they can be compared by means of their annotations. This type of comparison is called semantic similarity, since it assesses the degree of relatedness between two entities by the similarity in meaning of their annotations. The application of semantic similarity to biomedical ontologies is recent; nevertheless, several studies have been published in the last few years describing and evaluating diverse approaches. Semantic similarity has become a valuable tool for validating the results drawn from biomedical studies such as gene clustering, gene expression data analysis, prediction and validation of molecular interactions, and disease gene prioritization.
We review semantic similarity measures applied to biomedical ontologies and propose their classification according to the strategies they employ: node-based versus edge-based and pairwise versus groupwise. We also present comparative assessment studies and discuss the implications of their results. We survey the existing implementations of semantic similarity measures, and we describe examples of applications to biomedical research. This will clarify how biomedical researchers can benefit from semantic similarity measures and help them choose the approach most suitable for their studies.
Biomedical ontologies are evolving toward increased coverage, formality, and integration, and their use for annotation is increasingly becoming a focus of both effort by biomedical experts and application of automated annotation procedures to create corpora of higher quality and completeness than are currently available. Given that semantic similarity measures are directly dependent on these evolutions, we can expect to see them gaining more relevance and even becoming as essential as sequence similarity is today in biomedical research.
While classifications of mental disorders have existed for over one hundred years, it still remains unspecified what terms such as 'mental disorder', 'disease' and 'illness' might actually denote. While ontologies have been called in aid to address this shortfall since the GALEN project of the early 1990s, most attempts thus far have sought to provide a formal description of the structure of some pre-existing terminology or classification, rather than of the corresponding structures and processes on the side of the patient.
We here present a view of mental disease that is based on ontological realism and which follows the principles embodied in Basic Formal Ontology (BFO) and in the application of BFO in the Ontology of General Medical Science (OGMS). We analyzed statements about what counts as a mental disease provided (1) in the research agenda for the DSM-V, and (2) in Pies' model. The results were used to assess whether the representational units of BFO and OGMS were adequate as foundations for a formal representation of the entities in reality that these statements attempt to describe. We then analyzed the representational units specific to mental disease and provided corresponding definitions.
Our key contributions lie in the identification of confusions and conflations in the existing terminology of mental disease and in providing what we believe is a framework for the sort of clear and unambiguous reference to entities on the side of the patient that is needed in order to avoid these confusions in the future.
One criterion for the well-formedness of ontologies is that their hierarchical structure forms a lattice. Formal Concept Analysis (FCA) has been used as a technique for assessing the quality of ontologies, but is not scalable to large ontologies such as SNOMED CT (> 300k concepts). We developed a methodology called Lattice-based Structural Auditing (LaSA), for auditing biomedical ontologies, implemented through automated SPARQL queries, in order to exhaustively identify all non-lattice pairs in SNOMED CT. The percentage of non-lattice pairs ranges from 0 to 1.66 among the 19 SNOMED CT hierarchies. Preliminary manual inspection of a limited portion of the over 544k non-lattice pairs, among over 356 million candidate pairs, revealed inconsistent use of precoordination in SNOMED CT, but also a number of false positives. Our results are consistent with those based on FCA, with the advantage that the LaSA pipeline is scalable and applicable to ontological systems consisting mostly of taxonomic links.