Search tips
Search criteria

Results 1-8 (8)

Clipboard (0)

Select a Filter Below

Year of Publication
1.  COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project 
BMC Bioinformatics  2014;15(1):369.
With the ever increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. To this end, various standards have been proposed that can be used to describe models, simulations, data or other essential information in a consistent fashion. These constitute various separate components required to reproduce a given published scientific result.
We describe the Open Modeling EXchange format (OMEX). Together with the use of other standard formats from the Computational Modeling in Biology Network (COMBINE), OMEX is the basis of the COMBINE Archive, a single file that supports the exchange of all the information necessary for a modeling and simulation experiment in biology. An OMEX file is a ZIP container that includes a manifest file, listing the content of the archive, an optional metadata file adding information about the archive and its content, and the files describing the model. The content of a COMBINE Archive consists of files encoded in COMBINE standards whenever possible, but may include additional files defined by an Internet Media Type. Several tools that support the COMBINE Archive are available, either as independent libraries or embedded in modeling software.
The COMBINE Archive facilitates the reproduction of modeling and simulation experiments in biology by embedding all the relevant information in one file. Having all the information stored and exchanged at once also helps in building activity logs and audit trails. We anticipate that the COMBINE Archive will become a significant help for modellers, as the domain moves to larger, more complex experiments such as multi-scale models of organs, digital organisms, and bioengineering.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-014-0369-z) contains supplementary material, which is available to authorized users.
PMCID: PMC4272562  PMID: 25494900
Data format; Archive; Computational modeling; Reproducible research; Reproducible science
2.  FieldML, a proposed open standard for the Physiome project for mathematical model representation 
The FieldML project has made significant progress towards the goal of addressing the need to have open standards and open source software for representing finite element method (FEM) models and, more generally, multivariate field models, such as many of the models that are core to the euHeart project and the Physiome project. FieldML version 0.5 is the most recently released format from the FieldML project. It is an XML format that already has sufficient capability to represent the majority of euHeart’s explicit models such as the anatomical FEM models and simulation solution fields. The details of FieldML version 0.5 are presented, as well as its limitations and some discussion of the progress being made to address these limitations.
Electronic supplementary material
The online version of this article (doi:10.1007/s11517-013-1097-7) contains supplementary material, which is available to authorized users.
PMCID: PMC3825639  PMID: 23900627
Science technology; Life sciences biomedicine; Biochemical research methods; Mathematical computational biology; Mathematical models; FieldML; CellML
3.  Declarative Representation of Uncertainty in Mathematical Models 
PLoS ONE  2012;7(7):e39721.
An important aspect of multi-scale modelling is the ability to represent mathematical models in forms that can be exchanged between modellers and tools. While the development of languages like CellML and SBML have provided standardised declarative exchange formats for mathematical models, independent of the algorithm to be applied to the model, to date these standards have not provided a clear mechanism for describing parameter uncertainty. Parameter uncertainty is an inherent feature of many real systems. This uncertainty can result from a number of situations, such as: when measurements include inherent error; when parameters have unknown values and so are replaced by a probability distribution by the modeller; when a model is of an individual from a population, and parameters have unknown values for the individual, but the distribution for the population is known. We present and demonstrate an approach by which uncertainty can be described declaratively in CellML models, by utilising the extension mechanisms provided in CellML. Parameter uncertainty can be described declaratively in terms of either a univariate continuous probability density function or multiple realisations of one variable or several (typically non-independent) variables. We additionally present an extension to SED-ML (the Simulation Experiment Description Markup Language) to describe sampling sensitivity analysis simulation experiments. We demonstrate the usability of the approach by encoding a sample model in the uncertainty markup language, and by developing a software implementation of the uncertainty specification (including the SED-ML extension for sampling sensitivty analyses) in an existing CellML software library, the CellML API implementation. We used the software implementation to run sampling sensitivity analyses over the model to demonstrate that it is possible to run useful simulations on models with uncertainty encoded in this form.
PMCID: PMC3389025  PMID: 22802941
4.  Reproducible computational biology experiments with SED-ML - The Simulation Experiment Description Markup Language 
BMC Systems Biology  2011;5:198.
The increasing use of computational simulation experiments to inform modern biological research creates new challenges to annotate, archive, share and reproduce such experiments. The recently published Minimum Information About a Simulation Experiment (MIASE) proposes a minimal set of information that should be provided to allow the reproduction of simulation experiments among users and software tools.
In this article, we present the Simulation Experiment Description Markup Language (SED-ML). SED-ML encodes in a computer-readable exchange format the information required by MIASE to enable reproduction of simulation experiments. It has been developed as a community project and it is defined in a detailed technical specification and additionally provides an XML schema. The version of SED-ML described in this publication is Level 1 Version 1. It covers the description of the most frequent type of simulation experiments in the area, namely time course simulations. SED-ML documents specify which models to use in an experiment, modifications to apply on the models before using them, which simulation procedures to run on each model, what analysis results to output, and how the results should be presented. These descriptions are independent of the underlying model implementation. SED-ML is a software-independent format for encoding the description of simulation experiments; it is not specific to particular simulation tools. Here, we demonstrate that with the growing software support for SED-ML we can effectively exchange executable simulation descriptions.
With SED-ML, software can exchange simulation experiment descriptions, enabling the validation and reuse of simulation experiments in different tools. Authors of papers reporting simulation experiments can make their simulation protocols available for other scientists to reproduce the results. Because SED-ML is agnostic about exact modeling language(s) used, experiments covering models from different fields of research can be accurately described and combined.
PMCID: PMC3292844  PMID: 22172142
6.  Revision history aware repositories of computational models of biological systems 
BMC Bioinformatics  2011;12:22.
Building repositories of computational models of biological systems ensures that published models are available for both education and further research, and can provide a source of smaller, previously verified models to integrate into a larger model.
One problem with earlier repositories has been the limitations in facilities to record the revision history of models. Often, these facilities are limited to a linear series of versions which were deposited in the repository. This is problematic for several reasons. Firstly, there are many instances in the history of biological systems modelling where an 'ancestral' model is modified by different groups to create many different models. With a linear series of versions, if the changes made to one model are merged into another model, the merge appears as a single item in the history. This hides useful revision history information, and also makes further merges much more difficult, as there is no record of which changes have or have not already been merged. In addition, a long series of individual changes made outside of the repository are also all merged into a single revision when they are put back into the repository, making it difficult to separate out individual changes. Furthermore, many earlier repositories only retain the revision history of individual files, rather than of a group of files. This is an important limitation to overcome, because some types of models, such as CellML 1.1 models, can be developed as a collection of modules, each in a separate file.
The need for revision history is widely recognised for computer software, and a lot of work has gone into developing version control systems and distributed version control systems (DVCSs) for tracking the revision history. However, to date, there has been no published research on how DVCSs can be applied to repositories of computational models of biological systems.
We have extended the Physiome Model Repository software to be fully revision history aware, by building it on top of Mercurial, an existing DVCS. We have demonstrated the utility of this approach, when used in conjunction with the model composition facilities in CellML, to build and understand more complex models. We have also demonstrated the ability of the repository software to present version history to casual users over the web, and to highlight specific versions which are likely to be useful to users.
Providing facilities for maintaining and using revision history information is an important part of building a useful repository of computational models, as this information is useful both for understanding the source of and justification for parts of a model, and to facilitate automated processes such as merges. The availability of fully revision history aware repositories, and associated tools, will therefore be of significant benefit to the community.
PMCID: PMC3033326  PMID: 21235804
7.  A Bayesian Search for Transcriptional Motifs 
PLoS ONE  2010;5(11):e13897.
Identifying transcription factor (TF) binding sites (TFBSs) is an important step towards understanding transcriptional regulation. A common approach is to use gaplessly aligned, experimentally supported TFBSs for a particular TF, and algorithmically search for more occurrences of the same TFBSs. The largest publicly available databases of TF binding specificities contain models which are represented as position weight matrices (PWM). There are other methods using more sophisticated representations, but these have more limited databases, or aren't publicly available. Therefore, this paper focuses on methods that search using one PWM per TF. An algorithm, MATCHTM, for identifying TFBSs corresponding to a particular PWM is available, but is not based on a rigorous statistical model of TF binding, making it difficult to interpret or adjust the parameters and output of the algorithm. Furthermore, there is no public description of the algorithm sufficient to exactly reproduce it. Another algorithm, MAST, computes a p-value for the presence of a TFBS using true probabilities of finding each base at each offset from that position. We developed a statistical model, BaSeTraM, for the binding of TFs to TFBSs, taking into account random variation in the base present at each position within a TFBS. Treating the counts in the matrices and the sequences of sites as random variables, we combine this TFBS composition model with a background model to obtain a Bayesian classifier. We implemented our classifier in a package (SBaSeTraM). We tested SBaSeTraM against a MATCHTM implementation by searching all probes used in an experimental Saccharomyces cerevisiae TF binding dataset, and comparing our predictions to the data. We found no statistically significant differences in sensitivity between the algorithms (at fixed selectivity), indicating that SBaSeTraM's performance is at least comparable to the leading currently available algorithm. Our software is freely available at:
PMCID: PMC2987817  PMID: 21124986
8.  An overview of the CellML API and its implementation 
BMC Bioinformatics  2010;11:178.
CellML is an XML based language for representing mathematical models, in a machine-independent form which is suitable for their exchange between different authors, and for archival in a model repository. Allowing for the exchange and archival of models in a computer readable form is a key strategic goal in bioinformatics, because of the associated improvements in scientific record accuracy, the faster iterative process of scientific development, and the ability to combine models into large integrative models.
However, for CellML models to be useful, tools which can process them correctly are needed. Due to some of the more complex features present in CellML models, such as imports, developing code ab initio to correctly process models can be an onerous task. For this reason, there is a clear and pressing need for an application programming interface (API), and a good implementation of that API, upon which tools can base their support for CellML.
We developed an API which allows the information in CellML models to be retrieved and/or modified. We also developed a series of optional extension APIs, for tasks such as simplifying the handling of connections between variables, dealing with physical units, validating models, and translating models into different procedural languages.
We have also provided a Free/Open Source implementation of this application programming interface, optimised to achieve good performance.
Tools have been developed using the API which are mature enough for widespread use. The API has the potential to accelerate the development of additional tools capable of processing CellML, and ultimately lead to an increased level of sharing of mathematical model descriptions.
PMCID: PMC2858041  PMID: 20377909

Results 1-8 (8)