Results 1-10 (10)
 

1.  Effects of Guideline-Based Training on the Quality of Formal Ontologies: A Randomized Controlled Trial 
PLoS ONE  2013;8(5):e61425.
Background
The importance of ontologies in the biomedical domain is generally recognized. However, their quality is often too poor for large-scale use in critical applications, at least partially due to insufficient training of ontology developers.
Objective
To show the efficacy of guideline-based ontology development training on the performance of ontology developers. The hypothesis was that students who received training on top-level ontologies and design patterns perform better than those who only received training in the basic principles of formal ontology engineering.
Methods
A curriculum was implemented based on a guideline for ontology design. A randomized controlled trial on the efficacy of this curriculum was performed with 24 students from bioinformatics and related fields. After joint training on the fundamentals of ontology development the students were randomly allocated to two groups. During the intervention, each group received training on different topics in ontology development. In the assessment phase, all students were asked to solve modeling problems on topics taught differentially in the intervention phase. Primary outcome was the similarity of the students’ ontology artefacts compared with gold standard ontologies developed by the authors before the experiment; secondary outcome was the intra-group similarity of group members’ ontologies.
Results
The experiment showed no significant effect of the guideline-based training on the performance of ontology developers: (a) the ontologies developed after specific training were slightly, but not significantly, closer to the gold standard ontologies than the ontologies developed without prior specific training; (b) although significant differences for certain ontologies were detected, the intra-group similarity was not consistently influenced in one direction by the differential training.
Conclusion
Methodologically limited, this study cannot be interpreted as a general failure of a guideline-based approach to ontology development. Further research is needed to increase insight into whether specific development guidelines and practices in ontology design are effective.
doi:10.1371/journal.pone.0061425
PMCID: PMC3646875  PMID: 23667440
2.  Proposed actions are no actions: re-modeling an ontology design pattern with a realist top-level ontology 
Journal of Biomedical Semantics  2012;3(Suppl 2):S2.
Background
Ontology Design Patterns (ODPs) are representational artifacts devised to offer solutions for recurring ontology design problems. They promise to enhance the ontology building process in terms of flexibility, re-usability and expansion, and to make the result of ontology engineering more predictable. In this paper, we analyze ODP repositories and investigate their relation with upper-level ontologies. In particular, we compare the BioTop upper ontology to the Action ODP from the NeOn ODP repository. In view of the differences in the respective approaches, we investigate whether the Action ODP can be embedded into BioTop. We demonstrate that this requires re-interpreting the meaning of classes of the NeOn Action ODP in the light of the precepts of realist ontologies.
Results
As a result, the re-design required clarifying the ontological commitment of the ODP classes by assigning them to top-level categories. Thus, ambiguous definitions are avoided. Classes of real entities are clearly distinguished from classes of information artifacts. The proposed approach avoids the commitment to the existence of unclear future entities which underlies the NeOn Action ODP. Our re-design is parsimonious in the sense that existing BioTop content proved to be largely sufficient to define the different types of actions and plans.
Conclusions
The proposed model demonstrates that an expressive upper-level ontology provides enough resources and expressivity to represent even complex ODPs, here shown with the different flavors of Action as proposed in the NeOn ODP. The advantage of ODP inclusion into a top-level ontology is the given predetermined dependency of each class, an existing backbone structure and well-defined relations. Our comparison shows that the use of some ODPs is more likely to cause problems for ontology developers than to guide them. Besides the structural properties, the explanations of classification results were particularly hard to grasp for 'self-sufficient' ODPs as compared with implemented and 'embedded' upper-level structures which, for example in the case of BioTop, offer a detailed description of classes and relations in an axiomatic network. This ensures unambiguous interpretation and provides more concise constraints to leverage in the ontology engineering process.
doi:10.1186/2041-1480-3-S2-S2
PMCID: PMC3448525  PMID: 23046561
3.  OntoCheck: verifying ontology naming conventions and metadata completeness in Protégé 4 
Journal of Biomedical Semantics  2012;3(Suppl 2):S4.
Background
Although policy providers have outlined minimal metadata guidelines and naming conventions, ontologies of today still display inter- and intra-ontology heterogeneities in class labelling schemes and metadata completeness. This fact is at least partially due to missing or inappropriate tools. Software support can ease this situation and contribute to overall ontology consistency and quality by helping to enforce such conventions.
Objective
We provide a plugin for the Protégé Ontology editor to allow for easy checks on compliance towards ontology naming conventions and metadata completeness, as well as curation in case of found violations.
Implementation
In a requirement analysis, derived from a prior standardization approach carried out within the OBO Foundry, we investigate the needed capabilities for software tools to check, curate and maintain class naming conventions. A Protégé tab plugin was implemented accordingly using the Protégé 4.1 libraries. The plugin was tested on six different ontologies. Based on these test results, the plugin could be refined, also by the integration of new functionalities.
Results
The new Protégé plugin, OntoCheck, allows for ontology tests to be carried out on OWL ontologies. In particular, the OntoCheck plugin helps to clean up an ontology with regard to lexical heterogeneity, i.e. enforcing naming conventions and metadata completeness, meeting most of the requirements outlined for such a tool. Found test violations can be corrected to foster consistency in entity naming and meta-annotation within an artefact. Once specified, check constraints like name patterns can be stored and exchanged for later re-use. Here we describe a first version of the software, illustrate its capabilities and use within running ontology development efforts and briefly outline improvements resulting from its application. Further, we discuss OntoCheck's capabilities in the context of related tools and highlight potential future expansions.
Conclusions
The OntoCheck plugin facilitates labelling error detection and curation, contributing to lexical quality assurance in OWL ontologies. Ultimately, we hope this Protégé extension will ease ontology alignments as well as lexical post-processing of annotated data and hence can increase overall secondary data usage by humans and computers.
doi:10.1186/2041-1480-3-S2-S4
PMCID: PMC3448530  PMID: 23046606
4.  Towards an ontological representation of morbidity and mortality in Description Logics 
Journal of Biomedical Semantics  2012;3(Suppl 2):S7.
Background
Despite the high coverage of biomedical ontologies, very few sound definitions of death can be found. Nevertheless, this concept has its relevance in epidemiology, such as for data integration within mortality notification systems. We here introduce an ontological representation of the complex biological qualities and processes that inhere in organisms transitioning from life to death. We further characterize them by causal processes and their temporal borders.
Results
Several representational difficulties were faced, mainly regarding kinds of processes with blurred or fiat borders that change their type in a continuous rather than discrete mode. Examples of such hard-to-grasp concepts are life, death and its relationships with injuries and diseases. We illustrate an iterative optimization of definitions within four versions of the ontology, so as to stress the typical problems encountered in representing complex biological processes. We point out possible solutions for representing concepts related to biological life cycles, preserving identity of participating individuals, i.e. for a patient in transition from life to death. This solution, however, required the use of extended description logics not yet supported by tools. We also focus on the interdependencies and need to change further parts if one part is changed.
Conclusion
The axiomatic definition of mortality we introduce allows the description of biologic processes related to the transition from healthy to diseased or injured, and up to a final death state. Exploiting such definitions embedded into descriptions of pathogen transmissions by arthropod vectors, the complete sequence of infection and disease processes can be described, starting from the inoculation of a pathogen by a vector, until the death of an individual, preserving the identity of the patient.
doi:10.1186/2041-1480-3-S2-S7
PMCID: PMC3448531  PMID: 23046681
5.  Unintended consequences of existential quantifications in biomedical ontologies 
BMC Bioinformatics  2011;12:456.
Background
The Open Biomedical Ontologies (OBO) Foundry is a collection of freely available ontologically structured controlled vocabularies in the biomedical domain. Most of them are disseminated via both the OBO Flatfile Format and the semantic web format Web Ontology Language (OWL), which draws upon formal logic. Based on the interpretations underlying OWL description logics (OWL-DL) semantics, we scrutinize the OWL-DL releases of OBO ontologies to assess whether their logical axioms correspond to the meaning intended by their authors.
Results
We analyzed ontologies and ontology cross products available via the OBO Foundry site http://www.obofoundry.org for existential restrictions (someValuesFrom), from which we examined a random sample of 2,836 clauses.
According to a rating done by four experts, 23% of all existential restrictions in OBO Foundry candidate ontologies are suspicious (Cohen's κ = 0.78). We found that a smaller proportion of existential restrictions in OBO Foundry cross products is suspicious, but in this case an accurate quantitative judgment is not possible due to a low inter-rater agreement (κ = 0.07). We identified several typical modeling problems, for which satisfactory ontology design patterns based on OWL-DL were proposed. We further describe several usability issues with OBO ontologies, including the lack of ontological commitment for several common terms, and the proliferation of domain-specific relations.
Conclusions
The current OWL releases of OBO Foundry (and Foundry candidate) ontologies contain numerous assertions which do not properly describe the underlying biological reality, or are ambiguous and difficult to interpret. The solution is a better anchoring in upper ontologies and a restriction to relatively few, well defined relation types with given domain and range constraints.
doi:10.1186/1471-2105-12-456
PMCID: PMC3280341  PMID: 22115278
6.  Ontology patterns for tabular representations of biomedical knowledge on neglected tropical diseases 
Bioinformatics  2011;27(13):i349-i356.
Motivation: Ontology-like domain knowledge is frequently published in a tabular format embedded in scientific publications. We explore the re-use of such tabular content in the process of building NTDO, an ontology of neglected tropical diseases (NTDs), where the representation of the interdependencies between hosts, pathogens and vectors plays a crucial role.
Results: As a proof of concept we analyzed a tabular compilation of knowledge about pathogens, vectors and geographic locations involved in the transmission of NTDs. After a thorough ontological analysis of the domain of interest, we formulated a comprehensive design pattern, rooted in the biomedical domain upper level ontology BioTop. This pattern was implemented in a VBA script which takes cell contents of an Excel spreadsheet and transforms them into OWL-DL. After minor manual post-processing, the correctness and completeness of the ontology were tested using pre-formulated competence questions as description logics (DL) queries. The expected results could be reproduced by the ontology. The proposed approach is recommended for optimizing the acquisition of ontological domain knowledge from tabular representations.
Availability and implementation: Domain examples, source code and ontology are freely available on the web at http://www.cin.ufpe.br/~ntdo.
Contact: fss3@cin.ufpe.br
doi:10.1093/bioinformatics/btr226
PMCID: PMC3117366  PMID: 21685092
7.  The Pitfalls of Thesaurus Ontologization – the Case of the NCI Thesaurus 
Thesauri that are “ontologized” into OWL-DL semantics are highly amenable to modeling errors resulting from falsely interpreting existential restrictions. We investigated the OWL-DL representation of the NCI Thesaurus (NCIT) in order to assess the correctness of existential restrictions. A random sample of 354 axioms using the someValuesFrom operator was taken. According to a rating performed by two domain experts, roughly half of these examples, and in consequence more than 76,000 axioms in the OWL-DL version, make incorrect assertions if interpreted according to description logics semantics. These axioms therefore constitute a huge source for unintended models, rendering most logic-based reasoning unreliable. After identifying typical error patterns we discuss some possible improvements. Our recommendation is to either amend the problematic axioms in the OWL-DL formalization or to consider some less strict representational format.
PMCID: PMC3041372  PMID: 21347074
8.  Development of FuGO: An Ontology for Functional Genomics Investigations 
The development of the Functional Genomics Investigation Ontology (FuGO) is a collaborative, international effort that will provide a resource for annotating functional genomics investigations, including the study design, protocols and instrumentation used, the data generated and the types of analysis performed on the data. FuGO will contain both terms that are universal to all functional genomics investigations and those that are domain specific. In this way, the ontology will serve as the “semantic glue” to provide a common understanding of data from across these disparate data sources. In addition, FuGO will reference out to existing mature ontologies to avoid the need to duplicate these resources, and will do so in such a way as to enable their ease of use in annotation. This project is in the early stages of development; the paper will describe efforts to initiate the project, the scope and organization of the project, the work accomplished to date, and the challenges encountered, as well as future plans.
doi:10.1089/omi.2006.10.199
PMCID: PMC2783628  PMID: 16901226
9.  Survey-based naming conventions for use in OBO Foundry ontology development 
BMC Bioinformatics  2009;10:125.
Background
A wide variety of ontologies relevant to the biological and medical domains are available through the OBO Foundry portal, and their number is growing rapidly. Integration of these ontologies, while requiring considerable effort, is extremely desirable. However, heterogeneities in format and style pose serious obstacles to such integration. In particular, inconsistencies in naming conventions can impair the readability and navigability of ontology class hierarchies, and hinder their alignment and integration. While other sources of diversity are tremendously complex and challenging, agreeing a set of common naming conventions is an achievable goal, particularly if those conventions are based on lessons drawn from pooled practical experience and surveys of community opinion.
Results
We summarize a review of existing naming conventions and highlight certain disadvantages with respect to general applicability in the biological domain. We also present the results of a survey carried out to establish which naming conventions are currently employed by OBO Foundry ontologies and to determine what their special requirements regarding the naming of entities might be. Lastly, we propose an initial set of typographic, syntactic and semantic conventions for labelling classes in OBO Foundry ontologies.
Conclusion
Adherence to common naming conventions is more than just a matter of aesthetics. Such conventions provide guidance to ontology creators, help developers avoid flaws and inaccuracies when editing, and especially when interlinking, ontologies. Common naming conventions will also assist consumers of ontologies to more readily understand what meanings were intended by the authors of ontologies used in annotating bodies of data.
doi:10.1186/1471-2105-10-125
PMCID: PMC2684543  PMID: 19397794
10.  Facilitating the development of controlled vocabularies for metabolomics technologies with text mining 
BMC Bioinformatics  2008;9(Suppl 5):S5.
Background
Many bioinformatics applications rely on controlled vocabularies or ontologies to consistently interpret and seamlessly integrate information scattered across public resources. Experimental data sets from metabolomics studies need to be integrated with one another, but also with data produced by other types of omics studies in the spirit of systems biology, hence the pressing need for vocabularies and ontologies in metabolomics. However, it is time-consuming and non-trivial to construct these resources manually.
Results
We describe a methodology for rapid development of controlled vocabularies, a study originally motivated by the need for vocabularies describing metabolomics technologies. We present case studies involving two controlled vocabularies (for nuclear magnetic resonance spectroscopy and gas chromatography) whose development is currently underway as part of the Metabolomics Standards Initiative. The initial vocabularies were compiled manually, providing a total of 243 and 152 terms. A total of 5,699 and 2,612 new terms were acquired automatically from the literature. The analysis of the results showed that full-text articles (especially the Materials and Methods sections) are the major source of technology-specific terms as opposed to paper abstracts.
Conclusions
We suggest a text mining method for efficient corpus-based term acquisition as a way of rapidly expanding a set of controlled vocabularies with the terms used in the scientific literature. We adopted an integrative approach, combining relatively generic software and data resources for time- and cost-effective development of a text mining tool for expansion of controlled vocabularies across various domains, as a practical alternative to both manual term collection and tailor-made named entity recognition methods.
doi:10.1186/1471-2105-9-S5-S5
PMCID: PMC2367623  PMID: 18460187
