The Names table contains some 16,000 terms in eight languages denoting brain structures found in the four species most studied by neuroscientists. In the database schema, the homonym problem is resolved by devoting multiple rows of the Names table to the same character string. For example, the character string ‘arcuate nucleus’ appears three times in the Names column of the Names table. The three entries have different numeric IDs and link through different entries in the Concept ID column to different concepts, whose standard names are ‘arcuate nucleus of the hypothalamus’, ‘arcuate nucleus of the medulla’ and ‘ventral posteromedial nucleus’. When a user submits the query string ‘arcuate nucleus’, BrainInfo displays a list of all names that contain that string, each with the standard name of the corresponding structure. This enables the user to disambiguate homonyms by selecting the appropriate concept/entity from the NeuroNames Standard Names column.
BrainInfo Response to Query ‘arcuate nucleus’
In neuroanatomy there are on average six different English, Latin and anglicized Latin synonyms for the same brain structure. Since different names can refer to the same concept, the relation of the Concept ID column in the Names table to the Concepts table is many-to-one. This allows users to interpret synonyms. Note that the one-to-many relation of names to concepts, which is handled by multiple rows for the same name, combined with the many-to-one relation of names to concepts, gives a many-to-many relation of names to concepts. By interacting with users to disambiguate homonyms and interpret synonyms before initiating retrieval, BrainInfo is able to narrow the search and greatly reduce the false-positive and false-negative items in responses to users’ queries.
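The relation between names and concepts described above can be illustrated with a minimal sketch. It is not the BrainInfo schema: the table names, column names and numeric IDs are assumptions chosen for illustration, and only the name strings and standard names are taken from the text. The sketch uses Python's built-in sqlite3 module and shows how one character string stored in several rows resolves to several standard names.

```python
import sqlite3

# Hypothetical, simplified schema: each row of `names` points to exactly one
# concept, so homonyms occupy several rows and synonyms share a concept_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concepts (
    concept_id   INTEGER PRIMARY KEY,
    default_name TEXT NOT NULL
);
CREATE TABLE names (
    name_id    INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    concept_id INTEGER NOT NULL REFERENCES concepts(concept_id)
);
""")

# Illustrative IDs only; the standard names come from the example in the text.
conn.executemany("INSERT INTO concepts VALUES (?, ?)", [
    (1, "arcuate nucleus of the hypothalamus"),
    (2, "arcuate nucleus of the medulla"),
    (3, "ventral posteromedial nucleus"),
])
conn.executemany("INSERT INTO names (name, concept_id) VALUES (?, ?)", [
    ("arcuate nucleus", 1),             # homonym, row 1
    ("arcuate nucleus", 2),             # homonym, row 2
    ("arcuate nucleus", 3),             # homonym, row 3
    ("semilunar nucleus", 3),           # synonym of concept 3
    ("thalamic gustatory nucleus", 3),  # synonym of concept 3
])


def lookup(connection, query):
    """Return (name, standard name) pairs for names containing the query."""
    cursor = connection.execute(
        """
        SELECT n.name, c.default_name
        FROM names AS n
        JOIN concepts AS c ON c.concept_id = n.concept_id
        WHERE instr(lower(n.name), lower(?)) > 0
        ORDER BY c.default_name, n.name
        """,
        (query,),
    )
    return cursor.fetchall()


matches = lookup(conn, "arcuate nucleus")
for name, standard_name in matches:
    print(f"{name} -> {standard_name}")
```

In this sketch the user would choose among the three standard names returned for ‘arcuate nucleus’, while a query for ‘semilunar nucleus’ resolves directly to a single concept.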
From an applied ontology point of view, the most important attributes of names in the Names table are: 1) the concept to which the name applies, which is used to resolve ambiguities, and 2) the name’s use-frequency. The most important attributes recorded for each concept include the NeuroNames standard name and acronym (labeled ‘Default Name’ and ‘Default Acronym’ in the Concepts Table), which are selected from the set of names for the concept in the Names Table. These are used in composing text for the BrainInfo website, including the definition of the concept and the source of the definition.
The Concepts and Models Tables
The Concepts Table contains the names and definitions of both concepts and models. Through the Concept_Model junction table it stands in many-to-many relation to the Models Table. Rows in the Models Table identify a model, a structural concept and, if the model is hierarchical, the parent of the concept. The most important attribute of a concept in a hierarchical model is its parent, which BrainInfo uses to assemble hierarchical displays of the model.
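A short sketch may clarify how a hierarchical display can be assembled from rows that record only a model, a concept and the concept's parent. The rows, the model name ‘demo hierarchy’ and the function name are hypothetical, and the small hierarchy is deliberately simplified; it is not the NeuroNames brain hierarchy.

```python
from collections import defaultdict

# Hypothetical (model, concept, parent) rows; parent is None for the root.
rows = [
    ("demo hierarchy", "brain", None),
    ("demo hierarchy", "forebrain", "brain"),
    ("demo hierarchy", "thalamus", "forebrain"),
    ("demo hierarchy", "hypothalamus", "forebrain"),
    ("demo hierarchy", "ventral posteromedial nucleus", "thalamus"),
]


def build_outline(rows, model):
    """Return an indented outline of one model, built from parent links."""
    children = defaultdict(list)
    roots = []
    for row_model, concept, parent in rows:
        if row_model != model:
            continue  # ignore rows that belong to other models
        if parent is None:
            roots.append(concept)
        else:
            children[parent].append(concept)

    lines = []

    def visit(concept, depth):
        # Emit the concept, then its children in alphabetical order.
        lines.append("  " * depth + concept)
        for child in sorted(children.get(concept, [])):
            visit(child, depth + 1)

    for root in sorted(roots):
        visit(root, 0)
    return lines


outline = build_outline(rows, "demo hierarchy")
print("\n".join(outline))
```

Because each row stores only the immediate parent, the same concept can appear under different parents in different models without any change to the Concepts Table.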
Use cases: disambiguating homonyms and interpreting synonyms
The following use cases illustrate how the ontologic principles built into BrainInfo support precise retrieval of information when non-standard terminologies are involved. The first illustrates BrainInfo’s interpretation of a query that a user poses in non-standard terminology. The second illustrates how BrainInfo assists a user’s interpretation of information at a website that uses a different terminology.
A user submits a query to BrainInfo by keying a character string, ‘arcuate nucleus’, into a search box. BrainInfo compares the input character string to names in the Names table and presents a list of matches with the standard name(s) of the concept(s) to which each name corresponds. The user clicks the standard name that corresponds to the concept the user has in mind, ‘ventral posteromedial nucleus’, and BrainInfo displays the Central Directory for that concept. The Central Directory has iconic buttons for most kinds of information about brain structures that are likely to interest a user. Each button hyperlinks to a selection of pages in websites containing pertinent information.
BrainInfo’s Central Directory for ventral posteromedial nucleus
If BrainInfo does not have links to information in a given category, the icon is grayed to spare the user from navigating to a dead end. BrainInfo can then help the user search PubMed for information on the topic. When the user clicks the button labeled ‘What is Written about It?’, BrainInfo uses the ontology to compose a query that includes the standard name of the structure with its synonyms and sends the sometimes lengthy query string to PubMed. For example, the query submitted by BrainInfo for the ventral posteromedial nucleus reads: “arcuate nucleus-3” OR “Nucleus ventralis posteromedialis” OR “semilunar nucleus” OR “thalamic gustatory nucleus” OR “ventral posterior medial nucleus” OR “ventral posteromedial nucleus” OR “ventral posteromedial thalamic nucleus”. The value added by this application of the ontology is to eliminate false-negative omissions of citations that result from authors’ use of terms other than ‘ventral posteromedial nucleus’ in referring to the structure.
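The composition of such a query can be sketched in a few lines. The function name is hypothetical, and the alphabetical, case-insensitive ordering is an assumption that happens to reproduce the example above; the text does not state how BrainInfo orders the terms or how it transmits the query to PubMed.

```python
def compose_synonym_query(names):
    """Join a concept's names into one quoted, OR-separated query string."""
    # Remove duplicates and sort case-insensitively (assumed ordering).
    unique = sorted(set(names), key=lambda n: (n.lower(), n))
    return " OR ".join(f'"{name}"' for name in unique)


# Names for the ventral posteromedial nucleus, as listed in the text.
vpm_names = [
    "ventral posteromedial nucleus",
    "arcuate nucleus-3",
    "Nucleus ventralis posteromedialis",
    "semilunar nucleus",
    "thalamic gustatory nucleus",
    "ventral posterior medial nucleus",
    "ventral posteromedial thalamic nucleus",
]

query = compose_synonym_query(vpm_names)
print(query)
```

Quoting each name keeps multiword terms together as phrases, and joining with OR means that a citation using any one of the synonyms can match.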
If BrainInfo retrieves a page of information from a website that uses different terminology from the NeuroNames standard, BrainInfo uses the ontology to provide clarification in the format “Look for [terms used by the website authors]”. By disambiguating homonyms and interpreting synonyms BrainInfo achieves one of its major purposes, viz., to eliminate terminology as an obstacle to effective communication.
The NeuroNames ontology aids interpretation of other websites
The BrainInfo Portal was established in 2001. It now links to several thousand pages in more than 50 of the most informative neuroscience sites on the Web. Several observations suggest that BrainInfo is providing useful service to the neuroscience community. In recent years an average of 400 unique visitors have viewed an average of 2000 pages per day. The top 20 institutional affiliations of identifiable users include universities and governmental agencies with large concentrations of neuroscientists, such as Harvard University, the National Institutes of Health (US), Oxford University, Washington University St. Louis, the National Health Service (UK), McGill University and the University of California at Los Angeles (UCLA). At least 20% of users return to the site one or more times during a given month, and the total number of unique visitors per year exceeds 100,000.
Translation of NeuroNames into OWL
A large portion of the NeuroNames brain hierarchy has been translated into OWL for the Neuroscience Information Framework (Bug et al., 2008). The NeuroNames hierarchy provides the core anatomical ontology for the Neuroscience Information Framework Standard (NIFSTD) gross anatomy module. All ontology modules in the NIFSTD are normalized to the same upper ontology, the Basic Formal Ontology (BFO). The initial volumetric partonomy of NeuroNames was refactored to an “is-a” hierarchy through the creation of categories such as “Predominantly gray part of hypothalamus” and listing the NeuroNames parts underneath. As the reasoning of OWL over partonomies became more powerful, these somewhat contrived and artificial categories were replaced through the assignment of the “part of” relationship from the OBO relations ontology. As part of the NIFSTD infrastructure, each term within the NIFSTD is presented as its own page on the NeuroLex Wiki (NeuroLex, 2011). The NIFSTD has progressively added more bridging relationships among modules, e.g., defining cell types according to the brain region in which the cell soma lies, through the definition of bridge files.
Limitations of the BrainInfo portal as a web textbook
A serious challenge to development of BrainInfo as a web-resource compared to conventional publications is the copyright constraint. While images of most brain structures appear on the Web in one form or another, the original photomicrographs illustrating the definitions of the several hundred primary structures of the brain reside in copyrighted publications. We have been able to scan and display the original images of cortical areas from publications that are out of copyright, from Brodmann (1909) up to sources from the early 1960s. We are generally unable to display original images from later publications, because publishers who readily grant permission to republish images in a conventional textbook are hesitant to grant permission to publish them on the Web.
Another challenge arises from the great variability in care for accuracy exercised by the authors of neuroanatomy websites. Some of the best images for illustrating some structures show erroneous labels for others. We address this issue by linking to such images only for the structures that are correctly labeled and for which no other image is available. If later we find an equivalent image that is more accurately labeled, we eliminate the link to the first and link to the new image.
A third challenge, which fortunately occurs infrequently, is the disappearance or reorganization of a website that results in the loss of access to informative pages. In six years we have lost or discontinued contact with three websites on those bases.
Perhaps the most serious limitation of NeuroNames and BrainInfo in the eyes of its users is the failure to achieve comprehensive coverage of the neuroanatomical domain. While the NeuroNames vocabulary is estimated to contain 90% of English and Latin names of neuroanatomical structures a person encounters in the neuroscientific literature, the NeuroNames ontology only includes the text definitions of about 70% of the structures. And it provides even less complete information about the connectivity, cells, genes expressed, models, function and other features of specific structures. The main reason for less than comprehensive coverage is that in the beginning we populated the knowledge base with a focus on information not readily available in standard English-language textbooks. It is apparent, however, that the Web is becoming the first line of inquiry, even for basic information about the classical structures. As a result we are currently incorporating text definitions of all concepts in the NeuroNames ontology.
A further challenge to NeuroNames and BrainInfo is common to all web-based resources: gaining access to constructive peer review. The NeuroNames brain hierarchy was subjected to, and improved by, peer review when it was first published (Bowden and Martin, 1995). In the subsequent 15 years the number of concepts in the ontology has more than tripled without review. For several years we sought critique through a ‘Feedback’ button on the home page and a ‘Comments’ button on every informational page. Both attracted little other than e-graffiti.
In the long run the greatest limiting factor to NeuroNames development may prove to be its dependence on the continuous effort of a single individual. At this time it is unclear whether the web portal and textbook format of BrainInfo can merge into an institutional framework in a way that maintains its growth and development. Continuous development requires long-term expenditure of scholarly time and effort. It requires at least one individual who, like a textbook author, uses semiautomated informatics tools to modify and extend the ontology as the domain evolves. In the world of conventional publication, when Author A of a successful textbook XYZ moves on, the publisher recruits a new Author B and the series continues as A’s textbook of xyz by B. In the world of the noncommercial internet, when an author moves on, the resource risks deterioration as untended links break and lack of updating leads to obsolescence of its content. In 2009 the International Neuroinformatics Coordinating Facility (INCF, 2011) and the Center for Research in Biological Systems (CREBS, 2010) assumed sponsorship of BrainInfo to maintain current functions. We are pursuing several potential mechanisms to assure that the system will continue to grow in usefulness to the neuroscience community.