To maximize its value and durability, NeuroMorpho.Org adopted available bioinformatics standards and guidelines, while making necessary adjustments to accommodate the specific needs of its morphological content. Neuronal reconstructions are heterogeneous in terms of both their numerical specification (file format, size, resolution, etc.) and the scope of the originating studies (e.g. anatomical, behavioral, electrophysiological, pharmacological or developmental). The NeuroMorpho.Org infrastructure had to develop a unified set of design principles and choice of metadata to achieve several important goals:
- to store the various data types according to a pre-defined structured schema;
- to support different reconstruction formats, while providing a standardized output;
- to facilitate complex queries and data retrieval;
- to enable rapid development and deployment while remaining user-friendly;
- to be interoperable with other neuroscience resources;
- to optimize reliability, security, robustness, and scalability.
There have been several previous efforts to create repositories for neuronal reconstructions (reviewed in Ascoli, 2006b). An extended list of these existing archives is available at NeuroMorpho.Org under the “Tools and Links” menu. In general, all data available through these databases are mirrored in NeuroMorpho.Org. There are, however, substantial differences that make NeuroMorpho.Org unique among such resources. The foremost is that data in NeuroMorpho.Org are contributed by a large (and continuously increasing) number of different researchers rather than by an individual laboratory. A primary goal of NeuroMorpho.Org is to achieve and maintain dense coverage of all the publicly available digital reconstructions rather than to provide a static venue for distributing a particular subset of neuronal morphologies.
Reconstructions in NeuroMorpho.Org are organized according to cell types, animal species, brain regions, technical protocols, tracing methods, numerous morphometric measures, and several other dimensions. In general, metadata greatly adds to the significance and scientific value of data, and enables both effective user searches and integration with related resources. Three general categories of metadata in NeuroMorpho.Org relate to (1) the data source, such as the laboratory and researcher providing the reconstructions, the reference articles, and the archive internet address (if any); (2) the subject of the study, such as animal species and strain, area of the brain, and neuron type; and (3) the experimental methodology, such as histological protocols, reconstruction hardware and software, and format of the original data. The absence of a widely accepted neuroscience ontology caused considerable difficulty in the process, which underscores the need for a standard terminology in the field. We attempted to follow the classifications used in the original studies describing the reconstructions. In particular, the detailed information retrieved from peer-reviewed journal publications was initially evaluated, and relevant data was extracted. As more studies and data were processed, metadata was optimized accordingly. Several examples of metadata and their descriptions are provided in the table below. The database schema is continuously updated and publicly posted on NeuroMorpho.Org under the “Tools and Links” tab.
Examples of metadata extracted from peer-reviewed publications and their descriptions.
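The three metadata categories described above can be pictured as a nested record. The following is only a minimal sketch: the class names, field names, and example values are illustrative assumptions, not the actual NeuroMorpho.Org schema (which is posted on the site).

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class SourceMetadata:
    """(1) Data source: who provided the reconstruction and where it is described."""
    laboratory: str
    reference_pmid: Optional[str]  # PubMed identifier of the reference article, if any
    archive_url: Optional[str]     # internet address of the originating archive, if any


@dataclass(frozen=True)
class SubjectMetadata:
    """(2) Subject of the study."""
    species: str
    strain: str
    brain_region: str
    cell_type: str


@dataclass(frozen=True)
class MethodMetadata:
    """(3) Experimental methodology."""
    histological_protocol: str
    reconstruction_software: str
    original_format: str


@dataclass(frozen=True)
class NeuronMetadata:
    neuron_name: str
    source: SourceMetadata
    subject: SubjectMetadata
    method: MethodMetadata


# All values below are placeholders chosen for illustration only.
example = NeuronMetadata(
    neuron_name="example-cell-1",
    source=SourceMetadata("example-lab", None, None),
    subject=SubjectMetadata("rat", "Sprague Dawley", "hippocampus", "pyramidal cell"),
    method=MethodMetadata("example-stain", "example-tracer", "example-format"),
)

# A nested dictionary is a convenient intermediate form for storage or export.
record = asdict(example)
```

Keeping the three categories as separate types makes it easy to extend one of them (for instance, adding age or sex to the subject) without touching the others.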
The basic NeuroMorpho.Org framework consists of a standard three-tier architecture (web client, web server, and relational database). This organization is scalable, robust, and flexible, while at the same time allowing for easy management and network deployment. All the original data provided by researchers are stored in a back-end relational database, including raw and processed files, images, and metadata. NeuroMorpho.Org utilizes MySQL V5.0 as the database management system, with several add-on custom applications (written in Java, C++, and Matlab). The web application server, Apache Tomcat 5.5, runs on a 2.0 GHz Intel dual quad core processor machine under the Linux Fedora 8 operating system.
Structure and information flow among the components of NeuroMorpho.Org. (A) Organization of the processing pipeline. (B) Role of MRALD in the three-tier data retrieval architecture.
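As a rough illustration of the back-end tier, the sketch below stores neurons and their associated files in two related tables. It uses Python's built-in SQLite module as a stand-in for MySQL, and the table names, column names, and file paths are hypothetical rather than taken from the published schema.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL back end (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE neuron (
        neuron_id    INTEGER PRIMARY KEY,
        neuron_name  TEXT NOT NULL UNIQUE,
        species      TEXT NOT NULL,
        brain_region TEXT NOT NULL,
        cell_type    TEXT NOT NULL
    );
    CREATE TABLE neuron_file (
        file_id   INTEGER PRIMARY KEY,
        neuron_id INTEGER NOT NULL REFERENCES neuron(neuron_id),
        kind      TEXT NOT NULL
                  CHECK (kind IN ('original', 'standardized', 'log', 'notes', 'image')),
        path      TEXT NOT NULL
    );
    """
)

# Placeholder rows for illustration only.
conn.execute(
    "INSERT INTO neuron VALUES (1, 'example-cell-1', 'rat', 'hippocampus', 'pyramidal')"
)
conn.executemany(
    "INSERT INTO neuron_file (neuron_id, kind, path) VALUES (?, ?, ?)",
    [
        (1, "original", "files/example-cell-1.orig"),
        (1, "standardized", "files/example-cell-1.std"),
        (1, "image", "images/example-cell-1.png"),
    ],
)

# The middle tier would issue queries like this one on behalf of the web client.
rows = conn.execute(
    """
    SELECT n.neuron_name, f.kind, f.path
    FROM neuron AS n
    JOIN neuron_file AS f ON f.neuron_id = n.neuron_id
    WHERE n.species = ?
    ORDER BY f.kind
    """,
    ("rat",),
).fetchall()
```

Separating files from neuron records lets one neuron carry any number of raw, processed, and image files without changing the neuron table.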
Web-accessible, secure, and user-friendly interaction with the images and metadata in the back-end database is enabled by MRALD (Blake et al., 2002). MRALD is a platform- and database-independent application, is easily customized to new domains, and has been deployed in mission-critical systems (e.g., aviation) continuously since 2001. MRALD’s form builder enables system designers to rapidly generate intuitive hyper-text markup language (HTML)-based data retrieval forms; forms can be associated with specific users for access control. Data interaction is also possible via custom Java Server Pages (JSPs) and keyword search. Hidden from the user, MRALD translates requests into structured query language (SQL) and interrogates the underlying database via Java database connectivity (JDBC). It can return results in multiple formats, including HTML, extensible markup language (XML), comma-separated values (CSV) with a configurable separator, tab-delimited text, Excel spreadsheet files, and other user-defined formats. MRALD’s web-based administration features include form update, insertion, and deletion; a schema visualization tool; user account management; and the ability to assign data and users to collaborative communities (Smith et al., 2004). MRALD’s internal workflow processing for translating HTML into SQL is customizable, and can be extended by a developer to insert new steps (e.g., filters) into the normal processing pipeline. MRALD is freely available to the academic community at neuroinformatics.mitre.org.
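The general idea of translating a submitted form into SQL and returning the result in a chosen format can be sketched as follows. This is not MRALD's code or API: the function names, the field whitelist, and the table layout are assumptions made for illustration, and SQLite replaces the JDBC connection to MySQL.

```python
import csv
import io
import sqlite3

# Form fields that may be turned into WHERE clauses (hypothetical whitelist).
ALLOWED_FIELDS = {"species", "strain", "brain_region", "cell_type"}


def build_query(form):
    """Translate submitted form fields into a parameterized SELECT statement."""
    clauses, params = [], []
    for field, value in sorted(form.items()):
        if field not in ALLOWED_FIELDS:
            raise ValueError(f"unknown form field: {field}")
        if value:  # menus left empty do not constrain the search
            clauses.append(f"{field} = ?")
            params.append(value)
    sql = "SELECT neuron_name, species, strain, brain_region, cell_type FROM neuron"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY neuron_name", params


def to_csv(cursor):
    """Serialize a query result as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)
    return buf.getvalue()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE neuron (neuron_name TEXT, species TEXT, strain TEXT,"
    " brain_region TEXT, cell_type TEXT)"
)
conn.executemany(
    "INSERT INTO neuron VALUES (?, ?, ?, ?, ?)",
    [  # placeholder rows for illustration only
        ("example-cell-1", "rat", "Sprague Dawley", "hippocampus", "pyramidal"),
        ("example-cell-2", "rat", "Long-Evans", "neocortex", "pyramidal"),
        ("example-cell-3", "monkey", "Macaque", "neocortex", "pyramidal"),
    ],
)

sql, params = build_query({"species": "rat", "strain": "Sprague Dawley", "cell_type": ""})
csv_text = to_csv(conn.execute(sql, params))
```

Field names are checked against a whitelist and values are bound as parameters, so user input never becomes part of the SQL text itself. An additional filter step could be inserted between `build_query` and the database call.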
On the front-end web page, NeuroMorpho.Org offers three complementary search interfaces, namely by metadata, by morphometric measure, and by keywords. In the first interface, search terms are available to the user as drop-down menus grouped in four general sections (Animal, Experiment, Anatomy, or Source). Within each section, menus reflect the underlying metadata types (e.g. sex and age; histological protocol and reconstruction method; brain region and cell type; deposition date and original format). Sub-menus appear dynamically as appropriate. For example, upon selection of the term “Rat” under “Species”, a “Strain” sub-menu is offered including relevant options (“Sprague Dawley”, “Long-Evans”, etc.). In the second search interface, value ranges can be assigned to morphometric features such as soma surface, number of branches, arbor length or volume, height, width, etc. Features can be specified alone or in combination to construct simple or more complex queries. The third interface allows users to retrieve data by typing keywords in a simple search bar (Ascoli et al., 2007b).
Figure 3. Representative search through the Metadata page. Filter criteria can be combined and fine-tuned through cascades of drop-down menus to increasing levels of detail. In this example, Monkey is selected as species, Macaque as strain, Lucifer Yellow as staining ...
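The morphometric search, in which value ranges on one or more features are combined, can be illustrated with a small filter. The feature names and numeric values below are invented for the example, and the real system evaluates such constraints in the database rather than in application code.

```python
def within_ranges(measures, ranges):
    """Return True if every requested feature lies inside its (low, high) range.

    A neuron lacking a requested feature does not match.
    """
    for feature, (low, high) in ranges.items():
        value = measures.get(feature)
        if value is None or not (low <= value <= high):
            return False
    return True


# Hypothetical morphometric measures for three neurons.
neurons = {
    "example-cell-1": {"soma_surface": 450.0, "n_branches": 62, "total_length": 5200.0},
    "example-cell-2": {"soma_surface": 900.0, "n_branches": 18, "total_length": 1400.0},
    "example-cell-3": {"soma_surface": 510.0, "n_branches": 75},
}

# Two features combined: both constraints must hold.
query = {"soma_surface": (400.0, 600.0), "n_branches": (50, 100)}
hits = sorted(name for name, m in neurons.items() if within_ranges(m, query))

# Adding a third feature narrows the result further.
narrow_query = dict(query, total_length=(1000.0, 6000.0))
narrow_hits = sorted(name for name, m in neurons.items() if within_ranges(m, narrow_query))
```

Each added range acts as a further conjunctive constraint, which is why specifying features in combination yields progressively more selective queries.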
When activated by any of these three interfaces, MRALD processes the queries, generates the results dynamically, and sends them to the web client. The number of cells matching a given set of search criteria can also be requested before visualizing the results. The simplest option to display the search results is in a Summary format, in which each neuron is represented with a thumbnail image and an abridged set of its metadata. Clicking on one of these neuron entries calls the individual page of that reconstruction, with links to all raw and processed data, metadata, and related files (described below).
Alternatively, users can browse through search results (or the entire content of NeuroMorpho.Org) after sorting them by any of four criteria: brain region, animal species, cell type, and laboratory name. An overall view is rendered as a mouse-sensitive pie chart (http://cewolf.sourceforge.net), and results are visualized as they are loaded to minimize wait time even for massive data sets. The result of this organization is that a user can navigate from a conceptual query to the raw data in three clicks.
Data content diversity in NeuroMorpho.Org: pie chart representations of different species (top) and brain regions (bottom).
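The numbers behind such a pie chart are simple group counts over one of the four browsing criteria. A minimal sketch, with invented records and a hypothetical `pie_slices` helper (the chart rendering itself is not shown):

```python
from collections import Counter


def pie_slices(records, criterion):
    """Group records by one criterion and return (label, count, fraction) tuples,
    largest slice first."""
    counts = Counter(record[criterion] for record in records)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(label, n, n / total) for label, n in counts.most_common()]


# Placeholder records for illustration only.
records = [
    {"species": "rat", "brain_region": "hippocampus"},
    {"species": "rat", "brain_region": "neocortex"},
    {"species": "rat", "brain_region": "hippocampus"},
    {"species": "mouse", "brain_region": "neocortex"},
    {"species": "mouse", "brain_region": "hippocampus"},
    {"species": "monkey", "brain_region": "hippocampus"},
]

by_species = pie_slices(records, "species")
by_region = pie_slices(records, "brain_region")
```

The same function serves all four criteria, since only the grouping key changes.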
The data presentations by brain regions, cell types, and animal species have clear biological meaning, and are to be organized according to the NIF standard ontology (Bug et al., 2008). The view by laboratory name can be useful for data contributors to demonstrate to funding agencies and promotion committees that they have followed through with an effective data sharing plan.
NeuroMorpho.Org does not require any user registration or login to search and download data. For each neuron in the database, both graphic representations and flat files are made available through direct links for visualization and download. Flat files include the original reconstruction file as provided by the laboratory of origin, the version converted into a standardized format, the log detailing all modifications, and a document listing any remaining notes or irregularities (see Standardization process section below). Users may choose to download one or all of these four files for any number of neurons as a single compressed archive. Each neuron is illustrated with a static two-dimensional image as well as a 3D animation of the extending arborization while it rotates around the cell central axis. Moreover, to allow for interactive 3D manipulation of neurons, the Cell Viewer Application Cvapp (Cannon et al., 1998) was custom modified, streamlining functionality and enabling automatic online deployment through the Java Network Launching Protocol (JNLP).
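Bundling a user-selected subset of the four flat files for several neurons into one compressed archive could look roughly like this. The archive layout, file-kind labels, and `build_archive` helper are assumptions for illustration, and ZIP is used here only as an example of a compressed format.

```python
import io
import zipfile

# The four flat-file kinds described in the text (labels are hypothetical).
FILE_KINDS = ("original", "standardized", "log", "notes")


def build_archive(neurons, kinds=FILE_KINDS):
    """Pack the requested file kinds for each neuron into a single ZIP archive.

    `neurons` maps a neuron name to a dict of {kind: file content as bytes}.
    Returns the archive as bytes, ready to be sent to the client.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(neurons):
            for kind in kinds:
                content = neurons[name].get(kind)
                if content is not None:
                    archive.writestr(f"{name}/{kind}", content)
    return buf.getvalue()


# Placeholder contents for illustration only.
selection = {
    "example-cell-1": {"original": b"raw data", "standardized": b"std data",
                       "log": b"changes", "notes": b"remarks"},
    "example-cell-2": {"original": b"raw data", "standardized": b"std data"},
}

# Download only the original and standardized files for both neurons.
payload = build_archive(selection, kinds=("original", "standardized"))
names = zipfile.ZipFile(io.BytesIO(payload)).namelist()
```

Building the archive in memory avoids temporary files on the server, at the cost of holding the whole payload in RAM; for very large selections a streamed or on-disk archive would be preferable.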
The database also stores the PubMed identifier (PMID) for all referenced papers (see Data model and data management section below), and a corresponding XML file, created through Java Server Pages (JSP), is accessed by the NIF Broker that mediates the Entrez LinkOut functionality service provided by PubMed (Marenco et al., 2008). This architectural design allows direct reciprocal access between the peer-reviewed reference and the raw data. In particular, a link from the individual neuron page to the PubMed abstract of the publication(s) describing the experiment provides the users with a broader perspective on the reconstructions. To access the reconstructions from PubMed, users can follow the LinkOut option on the top right corner Links menu (see Figure 2 in Marenco et al., 2008), which leads to NeuroMorpho.Org through the Neuroscience Database Gateway, a precursor of the NIF (see Figure 3 in Marenco et al., 2008).
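To make the reciprocal linking concrete, the sketch below groups neurons by the PMID of their reference paper and emits a small XML document that a broker could read. The element and attribute names, the `build_link_document` helper, the base URL, and the PMIDs are all hypothetical; this is not the actual LinkOut file format nor the XML produced by NeuroMorpho.Org.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_link_document(neuron_refs, base_url):
    """Group neurons by reference PMID and serialize the mapping as XML.

    `neuron_refs` is an iterable of (neuron_name, pmid) pairs.
    """
    by_pmid = defaultdict(list)
    for neuron_name, pmid in neuron_refs:
        by_pmid[pmid].append(neuron_name)

    root = ET.Element("links")  # hypothetical element names throughout
    for pmid in sorted(by_pmid):
        reference = ET.SubElement(root, "reference", pmid=pmid)
        for name in sorted(by_pmid[pmid]):
            ET.SubElement(reference, "neuron", name=name, url=f"{base_url}/{name}")
    return ET.tostring(root, encoding="unicode")


# Placeholder names, PMIDs, and URL for illustration only.
refs = [
    ("example-cell-1", "00000001"),
    ("example-cell-2", "00000001"),
    ("example-cell-3", "00000002"),
]
xml_text = build_link_document(refs, "https://example.org/neuron")
parsed = ET.fromstring(xml_text)
```

Because the mapping is keyed by PMID, the same stored identifier supports both directions: from a neuron page to its publication, and from a publication to every neuron it describes.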