The Cell Ontology (CL) is a biomedical ontology originally built to represent in vivo and in vitro cell types, including those observed in specific developmental stages, of all the major model organisms.[1] The CL now aims to become a reference ontology within the Open Biomedical Ontology (OBO) Foundry (www.obofoundry.org). The CL both serves the terminology needs of data annotation and provides a base ontology from which compound terms in other ontologies can be derived by means of cross-product term formation.[3] Within the Mouse Genome Informatics resource (www.informatics.jax.org), for instance, the CL is used in conjunction with the Gene Ontology (GO) during annotation of mouse gene products to indicate the cell type in which a gene product is active, an approach that is now being adopted by other model organism databases within the GO Consortium. In ontology development for the GO, CL terms are employed in the formation of new GO terms using the cross-product of core GO biological process terms and CL terms: for instance, the GO term “leukocyte differentiation” can be defined using the GO term “cell differentiation” and the CL term “leukocyte.”[4 (this issue)] The Immunology Database and Analysis Portal (www.immport.org) is using the CL as a reference of cell types for the mapping of results from the analysis of flow cytometry data. The CL is also frequently used to compose descriptions of phenotypes.
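The cross-product idea described above can be illustrated with a minimal Python sketch. It is not the GO Consortium's actual tooling: the `Term` and `CrossProductTerm` classes, the `cross_product` helper, the naming rule, and the relation label `involves_cell_type` are all hypothetical and introduced only for illustration. The sketch composes the compound term “leukocyte differentiation” from the GO term “cell differentiation” and the CL term “leukocyte.”

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    ontology: str  # source ontology, e.g. "GO" or "CL"
    name: str


@dataclass(frozen=True)
class CrossProductTerm:
    name: str          # label of the compound term
    genus: Term        # the core GO biological process term
    relation: str      # hypothetical relation linking process to cell type
    differentia: Term  # the CL cell type term


def cross_product(process: Term, cell_type: Term,
                  relation: str = "involves_cell_type") -> CrossProductTerm:
    """Compose a compound term from a generic 'cell ...' process and a cell type.

    Assumed naming rule (illustration only): the leading word "cell" in the
    process name is replaced by the name of the cell type.
    """
    prefix = "cell "
    if not process.name.startswith(prefix):
        raise ValueError("expected a generic process name starting with 'cell '")
    name = f"{cell_type.name} {process.name[len(prefix):]}"
    return CrossProductTerm(name, process, relation, cell_type)


cell_differentiation = Term("GO", "cell differentiation")
leukocyte = Term("CL", "leukocyte")
leukocyte_differentiation = cross_product(cell_differentiation, leukocyte)
print(leukocyte_differentiation.name)
```

Because the compound term keeps references to both parent terms, a change to either the process term or the cell type term remains traceable from the derived term.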
The Cell Ontology is currently constructed using two relations, is_a and develops_from. The first relation is used to relate specific cell types to more general cell types (for example, between “T cell” and “lymphocyte”); the second is used to indicate cell lineage relationships (for example, between “neuron” and “neuroblast”). The ontology, as it was initially developed, relies upon a number of artificial high-level terms to capture types of cellular qualities, such as “cell in vivo,” “cell by organism,” and “cell by class,” a term which itself has the is_a child terms “cell by function,” “cell by histology,” “cell by lineage,” “cell by ploidy,” etc. These subclasses of cells have further is_a children denoting more specific qualities of cells. Depending on its qualities, a particular cell type may have one or more of these terms as is_a ancestors. For instance, in the original form of the ontology the cell type “macrophage” is a direct subtype of “mononuclear phagocyte” and “professional antigen presenting cell,” and an indirect subtype of “cell by function,” “cell by histology,” “cell by nuclear number,” and “animal cell” (see figure).
Figure. Term placement for the cell type term “macrophage” in the Cell Ontology.
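The two relations and the resulting multiple inheritance can be sketched as a small directed graph. The following Python fragment is only an illustration: the `MiniOntology` class is hypothetical, and the link from “professional antigen presenting cell” to “cell by function” is an assumed intermediate step, since the text states only that “cell by function” is an indirect ancestor of “macrophage.”

```python
from collections import defaultdict


class MiniOntology:
    """Toy container for typed links between cell type terms."""

    def __init__(self):
        # relation name -> child term -> set of parent terms
        self.edges = defaultdict(lambda: defaultdict(set))

    def add(self, subject, relation, obj):
        self.edges[relation][subject].add(obj)

    def ancestors(self, term, relation="is_a"):
        """Return every term reachable from `term` by following `relation`."""
        seen = set()
        stack = [term]
        while stack:
            current = stack.pop()
            for parent in self.edges[relation].get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


cl = MiniOntology()
# Links stated in the text
cl.add("T cell", "is_a", "lymphocyte")
cl.add("neuron", "develops_from", "neuroblast")
cl.add("macrophage", "is_a", "mononuclear phagocyte")
cl.add("macrophage", "is_a", "professional antigen presenting cell")
# Assumed intermediate link, for illustration only
cl.add("professional antigen presenting cell", "is_a", "cell by function")

macrophage_ancestors = cl.ancestors("macrophage")
neuron_lineage = cl.ancestors("neuron", relation="develops_from")
print(sorted(macrophage_ancestors))
print(sorted(neuron_lineage))
```

Keeping the relations in separate edge sets means that lineage links (develops_from) never leak into subtype queries (is_a), which mirrors the distinction the ontology draws between the two.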
With its multiple inheritance structure, the original CL can be described as having separate ontologies of cell types, each delineated by particular cellular qualities, overlaid upon each other, i.e. an ontology with multiple axes of differentia that are variously and sometimes arbitrarily applied to individual cell types. Furthermore, the high-level terms themselves do not represent actual cell types, so the ontology is not a true is_a hierarchy. This unwieldy ontological construct is not ideal for developing proper inference about cell types, nor does it always provide obvious placement of new cell type terms.
Discussions among interested parties in the past few years have focused on how best to restructure the CL to eliminate the complexity of its multiple inheritance structure. One way to achieve this restructuring is to use an increasingly common methodology for developing ontologies: assign logical definitions to classes based on their properties, and then let automated tools, called reasoners, infer the multiple inheritance hierarchy. This strategy exploits the work done in other ontologies. For example, neurons can be classified according to the types of chemical entities that they release, and the ontology of chemical entities can be used to determine the subsumption relationships between types of neuron.
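A toy version of this strategy is sketched below in Python. It is not a real reasoner and does not reflect the CL's actual logical definitions: the `LogicalDefinition` class, the `releases` property, the `infer_is_a` function, and the two neuron classes are assumptions chosen for illustration. The only asserted hierarchy is among chemical entities (dopamine is a catecholamine); the subtype link between the neuron classes is inferred from it.

```python
from dataclasses import dataclass

# Asserted is_a links in a tiny, illustrative chemical hierarchy
CHEMICAL_IS_A = {
    "dopamine": "catecholamine",
    "catecholamine": "chemical entity",
}


def chemical_subsumes(general, specific):
    """True if `specific` equals `general` or is one of its descendants."""
    current = specific
    while current is not None:
        if current == general:
            return True
        current = CHEMICAL_IS_A.get(current)
    return False


@dataclass(frozen=True)
class LogicalDefinition:
    name: str      # class being defined
    genus: str     # parent cell type
    releases: str  # differentiating property (hypothetical relation name)


def infer_is_a(definitions):
    """Infer subclass links: same genus and a more specific released chemical."""
    inferred = set()
    for child in definitions:
        for parent in definitions:
            if child.name == parent.name:
                continue
            if (child.genus == parent.genus
                    and chemical_subsumes(parent.releases, child.releases)):
                inferred.add((child.name, parent.name))
    return inferred


definitions = [
    LogicalDefinition("catecholaminergic neuron", "neuron", "catecholamine"),
    LogicalDefinition("dopaminergic neuron", "neuron", "dopamine"),
]
inferred_links = infer_is_a(definitions)
print(inferred_links)
```

In this arrangement no is_a link between the neuron classes is ever asserted by hand; adding a new chemical to the chemical hierarchy is enough for the corresponding neuron class to be placed automatically.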
The hematopoietic/immune cell types are of particular interest because of their roles in the immune response and consequent involvement in human health and disease. These cell types in particular have been the focus of two rounds of intensive curation in recent years. A set of improvements for hematopoietic cells was done in 2006 in conjunction with the revision of the terms for immunological processes in the GO.[6] At that time 80 new hematopoietic cell type terms were introduced, many other terms were revised, and many improvements in ontology structure were made for these specific cell types.
A second, more extensive round of revisions to the hematopoietic cell type terms in the CL is described herein. These revisions grew out of the proceedings of a National Institute of Allergy and Infectious Diseases (NIAID)-sponsored “Workshop on Immune Cell Representation in the Cell Ontology,” held in May 2008, where domain experts and biomedical ontologists worked together on two goals: 1) revising the existing terms and developing additional terms for T cells, B cells, natural killer cells, monocytes and macrophages, and dendritic cells, and 2) establishing a new paradigm for a comprehensive revision of the whole of the CL. These changes were needed to represent hematopoietic cells more completely and accurately. The aims were to include in the ontology all major hematopoietic cell types identified in the literature and to define these cell types in enough depth to greatly increase the descriptiveness of the ontology for data annotation and logical inference.