The National Center for Biomedical Ontology (NCBO) is one of the National Centers for Biomedical Computing funded under the NIH Roadmap Initiative. Contributing to the national computing infrastructure, NCBO has developed BioPortal, a web portal that provides access to a library of biomedical ontologies and terminologies (http://bioportal.bioontology.org) via the NCBO Web services. BioPortal enables community participation in the evaluation and evolution of ontology content by providing features to add mappings between terms, to add comments linked to specific ontology terms and to provide ontology reviews. The NCBO Web services (http://www.bioontology.org/wiki/index.php/NCBO_REST_services) enable this functionality and provide a uniform mechanism to access ontologies from a variety of knowledge representation formats, such as Web Ontology Language (OWL) and Open Biological and Biomedical Ontologies (OBO) format. The Web services provide multi-layered access to the ontology content, from getting all terms in an ontology to retrieving metadata about a term. Users can easily incorporate the NCBO Web services into software applications to generate semantically aware applications and to facilitate structured data collection.
Ontologies provide domain knowledge to drive data annotation, data integration, information retrieval, natural language processing and decision support. As the number of large data sets grows, providing a framework for data analysis and data integration using ontologies continues to be of critical importance (1). However, until recently, there has been a lack of common services for accessing this rich content from software applications. There has also been a lack of services to facilitate ontology development by reusing existing ontology content. BioPortal fills these gaps (2). BioPortal is a Web portal that provides access to a library of biomedical ontologies and terminologies developed in Web Ontology Language (OWL), Resource Description Framework (RDF)(S), Open Biological and Biomedical Ontologies (OBO) format, Protégé frames and Rich Release Format (http://bioportal.bioontology.org). BioPortal has a service-oriented architecture; the NCBO Web services provide the functionality found in BioPortal and these Web services can be incorporated into other software applications to access and use ontology content. BioPortal groups ontologies by domain to ease finding relevant ontologies and allows users to browse, search and visualize the content of ontologies. Registered users are able to add mappings between terms, to add comments on individual terms within the ontology and to provide reviews of ontologies. This user-generated content provides a critical evaluation and feedback mechanism for ontology developers. The specific focus on enabling community feedback to BioPortal content is a distinguishing characteristic of the system.
In 2008, BioPortal contained 72 ontologies (300000 total classes) and has grown significantly over the last 3 years to contain 260 ontologies (4.8 million total classes). Ontologies from a number of different groups are published in BioPortal, including caBIG (https://cabig.nci.nih.gov/), the recipients of Clinical and Translational Science Awards (http://www.ctsaweb.org/), the Consultative Group on International Agricultural Research (http://www.cgiar.org/), the OBO library (http://obofoundry.org/), the Proteomics Standards Initiative (http://www.psidev.info/), the Unified Medical Language System (http://www.nlm.nih.gov/research/umls/) and the World Health Organization (WHO) Family of International Classifications (http://www.who.int/classifications/en/). In addition to the increase in ontology content within BioPortal, non-biomedical organizations have also installed their own instances of BioPortal software. These organizations include DataONE (http://www.dataone.org/), the Marine Metadata Interoperability Project (http://mmisw.org/orr/) and other groups that require Official Use Only levels of privacy for their ontology content and access to the Web services (e.g. annotating HIPAA-regulated data) or need a repository for ontologies that cover domains not relevant to biomedicine.
When we initially released BioPortal, the system included RESTful Web services to get ontology metadata, to get individual ontology terms, to download ontologies and to search within ontologies. Since then, we have increased the number of Web services to provide expanded functionality and to include Web services to create and get ontology views, to get all terms from an ontology, to get instances, to post and get ontology mappings, to post and get comments and to get ontologies and individual ontology terms in RDF. BioPortal is designed to store multiple versions of the same ontology, which enables a historical overview of the ontology as it evolves. Each ontology has a global (virtual) ontology identifier and each new version of the ontology has an ontology version identifier. Many of the Web services can be called with either the virtual ontology identifier or the ontology version identifier.
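The relationship between the two kinds of identifier can be illustrated with a small in-memory model. This is only a sketch: the class name, method names and numeric identifiers are hypothetical, it assumes virtual and version identifiers never collide, and it does not reproduce the actual NCBO Web service signatures.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyRegistry:
    """Toy registry: each virtual id maps to its version ids, oldest first."""

    # All identifiers used with this class are invented for illustration.
    versions: dict[int, list[int]] = field(default_factory=dict)

    def add_version(self, virtual_id: int, version_id: int) -> None:
        """Record a new version of the ontology known by `virtual_id`."""
        self.versions.setdefault(virtual_id, []).append(version_id)

    def resolve(self, identifier: int) -> int:
        """Return a concrete version id for either kind of identifier."""
        if identifier in self.versions:
            # A virtual id resolves to the most recently added version.
            return self.versions[identifier][-1]
        for version_ids in self.versions.values():
            if identifier in version_ids:
                # A version id always refers to itself.
                return identifier
        raise KeyError(f"unknown ontology identifier: {identifier}")


registry = OntologyRegistry()
registry.add_version(1000, 40001)
registry.add_version(1000, 40002)
latest = registry.resolve(1000)    # follows the ontology as it evolves
pinned = registry.resolve(40001)   # stays on one historical version
```

A client that wants reproducible results would hold on to the version identifier, whereas a client that wants current content would call with the virtual identifier.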
Ontology views are subsets of one or more ontologies. Ontology subsets are also referred to as slims by the GO Consortium and as value sets when used for structured data entry. These subsets are a useful mechanism to work with smaller amounts of ontology content. For example, views can serve as value sets to populate a Web form select menu or as portions of ontologies to re-use in developing a new ontology. The ‘View’ Web services include functionality to get a list of all ontologies that have views and to create a view using the ‘View extraction’ Web service. The View extraction Web service is designed to extract branches of ontologies given a term to serve as the root node in the ontology view. This Web service is very popular for generating views of content-specific portions of large ontologies such as the NCBI Taxonomy, International Classification of External Causes of Injuries (http://www.who.int/classifications/icd/adaptations/iceci/en/index.html) and the Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT, http://www.ihtsdo.org/snomed-ct).
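The idea of extracting a branch below a chosen root term can be sketched as a traversal of a parent-to-children table. The function name and the toy hierarchy are assumptions made for illustration; the View extraction Web service itself may work quite differently.

```python
from collections import deque


def extract_branch(children: dict[str, list[str]], root: str) -> dict[str, list[str]]:
    """Return the sub-hierarchy reachable from `root`, visited breadth-first."""
    view: dict[str, list[str]] = {}
    queue = deque([root])
    while queue:
        term = queue.popleft()
        if term in view:
            # Already visited; this happens when a term has several parents.
            continue
        view[term] = list(children.get(term, []))
        queue.extend(view[term])
    return view


# Toy hierarchy, not taken from any real ontology release.
hierarchy = {
    "Organism": ["Bacteria", "Eukaryota"],
    "Eukaryota": ["Fungi", "Metazoa"],
    "Metazoa": ["Mammalia"],
}
eukaryote_view = extract_branch(hierarchy, "Eukaryota")
value_set = sorted(eukaryote_view)  # e.g. options for a Web form select menu
```

The resulting view contains only the root and its descendants, so it can be used directly as a value set or as a starting point for a new ontology.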
The ‘Get All Terms’ Web service returns the term details for all terms within an ontology. This Web service can be called using a specific ontology version identifier or with the virtual ontology identifier, therefore providing a common Web service signature that always returns data from the latest version of the ontology. Due to the large size of some ontologies, this Web service returns ‘pages’ of data to minimize the load on both client and server in dealing with extremely large XML files. The Web service is particularly valuable for use of ontology data in other knowledge base systems that require a custom ontology format, such as the Ontology Management Cell from the i2b2 clinical research data management Hive (https://www.i2b2.org/software/index.html).
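A client typically consumes such paged results by requesting one page at a time until a short page signals the end. The sketch below assumes a hypothetical `fetch_page(page_number, page_size)` callable standing in for the HTTP request; the real parameter names and response format of the Web service are not shown here.

```python
from typing import Callable, Iterator


def iter_all_terms(
    fetch_page: Callable[[int, int], list[dict]],
    page_size: int = 50,
) -> Iterator[dict]:
    """Yield every term, requesting one page at a time (pages start at 1)."""
    page_number = 1
    while True:
        terms = fetch_page(page_number, page_size)
        yield from terms
        if len(terms) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            break
        page_number += 1


def make_stub_fetcher(all_terms: list[dict]) -> Callable[[int, int], list[dict]]:
    """Build an in-memory stand-in for the remote service (for testing only)."""

    def fetch_page(page_number: int, page_size: int) -> list[dict]:
        start = (page_number - 1) * page_size
        return all_terms[start:start + page_size]

    return fetch_page


fake_terms = [{"id": f"T{i}"} for i in range(120)]
retrieved = list(iter_all_terms(make_stub_fetcher(fake_terms), page_size=50))
```

Because the generator only holds one page in memory at a time, the client can stream terms into another knowledge base without loading the whole ontology at once.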
The ‘Get Term’ Web service is now expanded to access instances from OWL ontologies. The Web service returns ‘pages’ of results containing all instances for a class. Based on the design of the ontology, it is useful to access the instances since these may be the terms to use for data annotation. For example, in the MGED Ontology the values to specify a requested Minimum Information About a Microarray Experiment (MIAME) checklist item are located as instances in the ontology.
The suite of ‘Mapping’ Web services provides access to the millions of ontology mappings published in BioPortal. The mapping data includes mappings provided by the ontology content providers, for example, mappings based on common Concept Unique Identifiers (CUI) in UMLS, mappings specified in OBO ontologies through the OBO xref property, mappings submitted directly to BioPortal by users and mappings generated automatically by algorithms such as LOOM (3). The Mapping Web services are parameterized to allow a high degree of flexibility to access the data. For example, the service can return mappings between individual terms, all mappings for a given term, or all mappings for a given ontology. Registered users can also submit mappings directly to BioPortal by using the ‘create new mapping’ Web service. This service allows automatic publishing of mapping content generated by ontology alignment software.
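The three query styles mentioned above (between two terms, for one term, for one ontology) can be sketched as optional filters over a list of mapping records. The record fields, function name and identifiers below are hypothetical and chosen only for illustration; they are not the parameters of the Mapping Web services.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Mapping:
    source_ontology: str
    source_term: str
    target_ontology: str
    target_term: str
    origin: str  # illustrative labels such as "cui", "xref", "user", "loom"


def find_mappings(
    mappings: Iterable[Mapping],
    ontology: Optional[str] = None,
    term: Optional[str] = None,
    other_term: Optional[str] = None,
) -> list[Mapping]:
    """Keep mappings that satisfy every filter that was supplied."""
    results = []
    for m in mappings:
        ontologies = {m.source_ontology, m.target_ontology}
        terms = {m.source_term, m.target_term}
        if ontology is not None and ontology not in ontologies:
            continue
        if term is not None and term not in terms:
            continue
        if other_term is not None and other_term not in terms:
            continue
        results.append(m)
    return results


# Invented records; the ontology and term ids are placeholders.
records = [
    Mapping("ONT_A", "A:1", "ONT_B", "B:7", "xref"),
    Mapping("ONT_A", "A:1", "ONT_C", "C:3", "loom"),
    Mapping("ONT_B", "B:9", "ONT_C", "C:4", "user"),
]
between_terms = find_mappings(records, term="A:1", other_term="B:7")
for_term = find_mappings(records, term="A:1")
for_ontology = find_mappings(records, ontology="ONT_C")
```

Treating each parameter as an optional filter is one simple way to get the flexibility described above from a single entry point.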
The ‘Notes’ Web services provide the ability for registered users to add comments directly to an ontology term or to the ontology. There are different types of notes that provide varying levels of structure for the note. For example, the note type ‘comment’ simply provides a text box to add unstructured comments. However, the ‘new term proposal’ note type provides fields to include the preferred name, synonyms, definition and reason for proposing the new term (Figure 1). Each note type has a specific XML response from the Notes Web service, easily allowing software applications to post or consume the content of the Notes Web service. For example, via the service-oriented architecture of BioPortal, the notes functionality is being incorporated into the ontology editing software WebProtégé (http://protegewiki.stanford.edu/wiki/WebProtege) and could also be incorporated into OBO-Edit (http://oboedit.org/). An email alert system is incorporated into the Notes feature, so the ontology author and other users interested in comments for a given ontology can be notified via email of new comments. The ontology author can then engage with the user to address the concern using the Notes feature and archive the Note once resolved.
The ‘RDF’ Web services are designed to return RDF snippets for individual terms and the content of an entire ontology in RDF. The goal of the RDF Web services is to provide the essential information about a term in RDF. The RDF Term Web service returns the term id, preferred name, synonyms, definitions and super-classes together with selected locally defined annotation properties for the ontology. The URI for each term is either the original URI present in the ontology file or the URI specified by the respective authority. If the ontology developers do not provide URIs for the terms, BioPortal generates these using the purl.bioontology.org server. By generating RDF for the ontologies, the entire content of BioPortal ontologies can be exposed as Linked Open Data. In addition to the RDF Web services, the BioPortal prototype SPARQL endpoint provides access to BioPortal ontology content in RDF and is available at: http://sparql.bioontology.org.
We have also developed other Web services that use BioPortal ontology content. These services include the NCBO Annotator Web service, which ‘tags’ textual data with ontology terms from BioPortal (4); the NCBO Resource Index Web services, which provide access to an ontology-based index of publicly available biomedical data (5); and the NCBO Ontology Recommender Web service (6), which, given a set of keywords or textual data as input, ranks ontologies by how well each ‘covers’ the data. More details on all NCBO Web services can be found at: http://www.bioontology.org/wiki/index.php/NCBO_REST_services.
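The notion of ranking ontologies by how well they cover a set of keywords can be illustrated with a deliberately naive score: the fraction of keywords that exactly match a term label. This is not the algorithm used by the NCBO Ontology Recommender (6); the function name, scoring rule and sample labels are assumptions made for illustration.

```python
def rank_ontologies(keywords, ontology_labels):
    """Rank ontologies by the fraction of keywords matching a term label.

    `ontology_labels` maps an ontology name to an iterable of its term labels.
    Matching is case-insensitive and exact, which is far cruder than a real
    recommender would be.
    """
    wanted = {k.lower() for k in keywords}
    scores = []
    for name, labels in ontology_labels.items():
        known = {label.lower() for label in labels}
        coverage = len(wanted & known) / len(wanted) if wanted else 0.0
        scores.append((name, coverage))
    # Highest coverage first; ties are broken alphabetically by name.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Invented label sets standing in for two small ontologies.
toy_ontologies = {
    "anatomy": ["Heart", "Lung", "Liver"],
    "disease": ["Melanoma", "Asthma"],
}
ranking = rank_ontologies(["heart", "lung", "melanoma"], toy_ontologies)
```

In this toy input the anatomy label set matches two of the three keywords and the disease label set matches one, so anatomy is ranked first.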
We have also wrapped the NCBO Web services as widgets for easy embedding of this functionality into Web sites. The ‘Form Autocomplete’ widget provides variations to fill in the form menu with the full term URI, term identifier, or the term name (Figure 2). The ‘Jump To’ widget also provides the term autocomplete function, but allows users to ‘jump’ directly into BioPortal to learn more about the term (e.g. term details, position in hierarchy, database records annotated with the term). The ‘RSS Feed’ provides updates on new versions and comments posted to the ontology and can be added to Web sites to notify community members of these updates. The ‘Graph’ and ‘Tree’ widgets provide a way to display an ontology structure on your own Web site. The Widgets are available for each ontology in BioPortal and the widget code is accessible from the Widgets tab.
BioPortal is a Web portal that provides access to a library of biomedical ontologies and terminologies via the NCBO Web services. While the overall architecture is domain independent, the NCBO instance is focused on publishing ontology content relevant for biomedicine. The ontology content in BioPortal continues to grow and the usage of the site continues to increase. Over the last 2 years, the number of ontology terms in BioPortal has grown by more than an order of magnitude and the number of ontologies increased 4-fold. The usage data for 2010 shows that almost 36000 unique visitors accessed 35 million pages via the BioPortal Web services, processing ~100GB of data.
The ease of use of the NCBO Web services and Widgets provides a convenient mechanism to incorporate ontology content into software applications. A number of data-annotation applications use NCBO Web services, including ISAcreator (7) for annotation of high-throughput experimental data; openMDR (8), a caBIG tool for annotation of data elements relevant to cancer-related studies; the ECG Gadget (http://wiki.cvrgrid.org/index.php/ECGGadget) for annotation of electrocardiographs; and BioScholar (https://wiki.birncommunity.org/display/NEWBIRNCC/BioScholar), a tool from BIRN to construct a knowledge repository derived from PDF files. The Widgets have also been added to a number of Web sites to enhance data annotation. These Web sites include REDfly, a database of Drosophila transcriptional cis-regulatory elements (9), the RadLex Tree Browser that provides a radiologist-friendly view of the RadLex terminology (10,11) and model organism databases such as Oryzabase (12). The NCBO Web services are facilitating ontology-based search of content in caBIG resources and in ODiSSea from Elsevier, which integrates journal publications in SciVerse Hub with publicly available databases via an ontology-based index. Table 1 lists additional software applications using NCBO technology. Additional use cases and sample code can be accessed from: http://www.bioontology.org/wiki/index.php/Sample_Code_Cookbook.
NCBO is continuing to develop new ontology-based tools. The User Group (https://mailman.stanford.edu/mailman/listinfo/bioportal-user-group) provides a mechanism to learn more about upcoming features and provide feedback. User feedback can also be sent to our support mailing list (firstname.lastname@example.org).
National Center for Biomedical Ontology, under roadmap-initiative from the National Institutes of Health [grant U54 HG004028]. Funding for open access charge: National Institutes of Health [grant U54 HG004028].
Conflict of interest statement. None declared.
We thank Alex Skrenchuk from Stanford University for computer support.