Models can be submitted by anyone to the curation pipeline of the database. At present, BioModels Database aims to store and annotate models that can be encoded with SBML. CellML models are also accepted. These formats are primarily used for models that can be integrated or iterated forwards in time, such as ordinary differential equation models. Although we are aware that this covers only a restricted part of the modeling field, we make it our initial focus for the following reasons: (i) since a crucial part of the curation process is verifying that the models produce numerical results similar to the ones described in the reference article, iterative simulations over ranges of parameter values and perturbations of simulations at equilibrium are mandatory; and (ii) a very large number of such models have already been published, and the pace of their publication is increasing steadily. As a consequence, these models are sufficient to occupy all the curation workforce we currently have, as well as the workforce we can envision gathering in the near future.
Pipeline describing the structure of BioModels Database.
To be accepted in BioModels Database, a model must be compliant with MIRIAM, the Minimal Information Requested in the Annotation of Models (10). One of the requirements of MIRIAM is that a model be associated with a reference description that provides, directly or through references, the structure of the model and the necessary quantitative parameters, and that presents the results of numerical analysis of the model. BioModels Database further refines the notion of reference description by considering only models described in the peer-reviewed scientific literature.
A series of automated tasks are performed by the pipeline prior to human intervention (see Materials and Methods for details):
- Verification that the file is well-formed XML.
- If necessary, conversion to the latest version of SBML.
- Verification of the syntax of SBML.
- Series of consistency checks, enforcing the validity of the model.
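As an illustration only, the first of these checks can be sketched with the Python standard library. The function name `check_submission` and the returned messages are hypothetical and are not part of the actual pipeline. The sketch tests XML well-formedness and the presence of an `sbml` root element carrying `level` and `version` attributes; it does not perform the full SBML syntax and consistency validation, which requires a dedicated SBML library.

```python
import xml.etree.ElementTree as ET


def check_submission(xml_text):
    """Return (ok, message) after two minimal checks on a submitted file.

    Hypothetical helper: covers well-formedness and a basic root-element
    test only, not full SBML validation.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return False, f"not well-formed XML: {exc}"

    # ElementTree reports namespaced tags as "{namespace}localname".
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name != "sbml":
        return False, f"root element is <{local_name}>, expected <sbml>"

    if "level" not in root.attrib or "version" not in root.attrib:
        return False, "missing level or version attribute on <sbml>"

    level, version = root.attrib["level"], root.attrib["version"]
    return True, f"SBML level {level} version {version}"
```

A file that fails the first test never reaches the later ones, which mirrors the ordering of the list above.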
If any of these steps fails, a member of the distributed team of curators can reject the model, or instead correct it and resubmit it to the pipeline. The last and most important step of the curation process is verifying that, when instantiated in a simulation, the model provides results corresponding to those of the reference scientific article. Curators do not normally challenge the biological relevance of the models, and assume that the peer-review process has already filtered out unsuitable contributions. However, in specific cases, curators can spot mistakes in an article and, with the agreement of the authors, modify the model accordingly. Once the model is verified to be valid SBML and to correspond well to the article, it is accepted in the production database for annotation.
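The comparison between simulated and published results can be pictured with a toy example. The sketch below is an assumption-laden illustration, not the curators' actual procedure: it integrates a single hypothetical decay equation dx/dt = -k·x with a forward-Euler scheme and compares the outcome with reference values within a relative tolerance. The model, the step size and the 1% tolerance are all chosen for illustration.

```python
import math


def simulate_decay(x0, k, t_end, dt=1e-3):
    """Forward-Euler integration of dx/dt = -k*x; returns x at t_end."""
    steps = round(t_end / dt)
    x = x0
    for _ in range(steps):
        x += dt * (-k * x)
    return x


def matches_reference(simulated, reference, rel_tol=0.01):
    """True if both sequences have equal length and agree within rel_tol."""
    if len(simulated) != len(reference):
        return False
    return all(
        math.isclose(s, r, rel_tol=rel_tol)
        for s, r in zip(simulated, reference)
    )
```

In practice, the reference values would be read from the figures or tables of the article, and the simulation would be run with an established simulation tool rather than a hand-written integrator.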
In order to be confident in reusing an encoded model, one should be able to trace its origin and the people who were involved in its inception. The following information is therefore added to the model: (i) either a PubMed identifier (http://www.pubmed.gov), a DOI (http://www.doi.org) or a URL that permits identifying the peer-reviewed article describing the model; (ii) the name and contact details of the individuals who actually contributed to the encoding of the model in its present form; (iii) the name and contact details of the person who finally entered the model in the production database and who should be contacted if there is a problem with the encoding of the model or the annotation.
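The three kinds of provenance information listed above can be pictured as a small record. The sketch below is hypothetical: the class name `Provenance`, its field names and the example values are assumptions made for illustration and do not reflect the database schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Provenance:
    """Hypothetical record of the provenance attached to a model."""
    encoders: List[str]               # (ii) people who encoded the model
    submitter: str                    # (iii) person who entered the model
    pubmed_id: Optional[str] = None   # (i) one of these three must be set
    doi: Optional[str] = None
    url: Optional[str] = None

    def reference(self) -> Tuple[str, str]:
        """Return the first available reference as (kind, value)."""
        candidates = (
            ("pubmed", self.pubmed_id),
            ("doi", self.doi),
            ("url", self.url),
        )
        for kind, value in candidates:
            if value:
                return kind, value
        raise ValueError("a PubMed identifier, DOI or URL is required")
```

For example, `Provenance(encoders=["A. Encoder"], submitter="C. Curator", doi="10.1000/example").reference()` returns `("doi", "10.1000/example")`, where every value is a placeholder.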
In addition, model components are annotated with references to relevant resources, such as terms from controlled vocabularies (Taxonomy, Gene Ontology, ChEBI, etc.) and links to other databases (UniProt, KEGG, Reactome, etc.). This annotation is a crucial feature of BioModels Database in that it permits the unambiguous identification of molecular species or reactions and enables effective search strategies.
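To suggest how such annotations can support searching, the following sketch builds an inverted index from (resource, identifier) pairs to model identifiers. It is a minimal illustration under stated assumptions: the function names, the model identifiers and the term identifiers are placeholders, and the database's actual search implementation is not described here.

```python
from collections import defaultdict


def build_index(annotations):
    """Map each (resource, identifier) pair to the set of models using it.

    `annotations` is a dict: model id -> iterable of (resource, identifier).
    """
    index = defaultdict(set)
    for model_id, refs in annotations.items():
        for ref in refs:
            index[ref].add(model_id)
    return index


def find_models(index, resource, identifier):
    """Return the sorted ids of models annotated with the given term."""
    return sorted(index.get((resource, identifier), ()))


# Placeholder annotations; the identifiers are not real database entries.
example = {
    "model_A": [("GO", "GO:0000000"), ("UniProt", "X00000")],
    "model_B": [("GO", "GO:0000000")],
}
```

With these placeholder data, `find_models(build_index(example), "GO", "GO:0000000")` returns both models, whereas a query on the UniProt placeholder returns only `model_A`. Because each annotation points to an entry in an external resource rather than to a free-text name, two models that name the same molecule differently are still retrieved by the same query.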