Summary: JSBML, the official pure Java programming library for the Systems Biology Markup Language (SBML) format, has evolved with the advent of different modeling formalisms in systems biology and their ability to be exchanged and represented via extensions of SBML. JSBML has matured into a major, active open-source project with contributions from a growing, international team of developers who not only maintain compatibility with SBML, but also drive steady improvements to the Java interface and promote ease-of-use with end users.
Availability and implementation: Source code, binaries and documentation for JSBML can be freely obtained under the terms of the LGPL 2.1 from the website http://sbml.org/Software/JSBML. More information about JSBML can be found in the user guide at http://sbml.org/Software/JSBML/docs/.
firstname.lastname@example.org or email@example.com
Supplementary data are available at Bioinformatics online.
The Computational Modeling in Biology Network (COMBINE) is a consortium of groups involved in the development of open community standards and formats used in computational modeling in biology. COMBINE’s aim is to act as a coordinator, facilitator, and resource for different standardization efforts whose domains of use cover related areas of the computational biology space. In this perspective article, we summarize COMBINE, its general organization, and the community standards and other efforts involved in it. Our goals are to help guide readers toward standards that may be suitable for their research activities, as well as to direct interested readers to relevant communities where they can best expect to receive assistance in how to develop interoperable computational models.
standardization; data sharing; reproducible science; computational modeling; biology; file formats
With the ever increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. To this end, various standards have been proposed that can be used to describe models, simulations, data or other essential information in a consistent fashion. These constitute various separate components required to reproduce a given published scientific result.
We describe the Open Modeling EXchange format (OMEX). Together with the use of other standard formats from the Computational Modeling in Biology Network (COMBINE), OMEX is the basis of the COMBINE Archive, a single file that supports the exchange of all the information necessary for a modeling and simulation experiment in biology. An OMEX file is a ZIP container that includes a manifest file, listing the content of the archive, an optional metadata file adding information about the archive and its content, and the files describing the model. The content of a COMBINE Archive consists of files encoded in COMBINE standards whenever possible, but may include additional files defined by an Internet Media Type. Several tools that support the COMBINE Archive are available, either as independent libraries or embedded in modeling software.
The COMBINE Archive facilitates the reproduction of modeling and simulation experiments in biology by embedding all the relevant information in one file. Having all the information stored and exchanged at once also helps in building activity logs and audit trails. We anticipate that the COMBINE Archive will become a significant help for modellers, as the domain moves to larger, more complex experiments such as multi-scale models of organs, digital organisms, and bioengineering.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-014-0369-z) contains supplementary material, which is available to authorized users.
Data format; Archive; Computational modeling; Reproducible research; Reproducible science
BioModels (http://www.ebi.ac.uk/biomodels/) is a repository of mathematical models of biological processes. A large set of models is curated to verify both correspondence to the biological process that the model seeks to represent, and reproducibility of the simulation results as described in the corresponding peer-reviewed publication. Many models submitted to the database are annotated, cross-referencing its components to external resources such as database records, and terms from controlled vocabularies and ontologies. BioModels comprises two main branches: one is composed of models derived from literature, while the second is generated through automated processes. BioModels currently hosts over 1200 models derived directly from the literature, as well as in excess of 140 000 models automatically generated from pathway resources. This represents an approximate 60-fold growth for literature-based model numbers alone, since BioModels’ first release a decade ago. This article describes updates to the resource over this period, which include changes to the user interface, the annotation profiles of models in the curation pipeline, major infrastructure changes, ability to perform online simulations and the availability of model content in Linked Data form. We also outline planned improvements to cope with a diverse array of new challenges.
Genomic data now allow the large-scale manual or semi-automated reconstruction of metabolic networks. A network reconstruction represents a highly curated organism-specific knowledge base. A few genome-scale network reconstructions have appeared for metabolism in the baker’s yeast Saccharomyces cerevisiae. These alternative network reconstructions differ in scope and content, and further have used different terminologies to describe the same chemical entities, thus making comparisons between them difficult. The formulation of a ‘community consensus’ network that collects and formalizes the ‘community knowledge’ of yeast metabolism is thus highly desirable. We describe how we have produced a consensus metabolic network reconstruction for S. cerevisiae. Special emphasis is laid on referencing molecules to persistent databases or using database-independent forms such as SMILES or InChI strings, since this permits their chemical structure to be represented unambiguously and in a manner that permits automated reasoning. The reconstruction is readily available via a publicly accessible database and in the Systems Biology Markup Language, and we describe the manner in which it can be maintained as a community resource. It should serve as a common denominator for system biology studies of yeast. Similar strategies will be of benefit to communities studying genome-scale metabolic networks of other organisms.
The Computational Modeling in Biology Network (COMBINE) is an initiative to coordinate the development of community standards and formats in computational systems biology and related fields. This report summarizes the topics and activities of the fourth edition of the annual COMBINE meeting, held in Paris during September 16-20 2013, and attended by a total of 96 people. This edition pioneered a first day devoted to modeling approaches in biology, which attracted a broad audience of scientists thanks to a panel of renowned speakers. During subsequent days, discussions were held on many subjects including the introduction of new features in the various COMBINE standards, new software tools that use the standards, and outreach efforts. Significant emphasis went into work on extensions of the SBML format, and also into community-building. This year’s edition once again demonstrated that the COMBINE community is thriving, and still manages to help coordinate activities between different standards in computational systems biology.
Qualitative frameworks, especially those based on the logical discrete formalism, are increasingly used to model regulatory and signalling networks. A major advantage of these frameworks is that they do not require precise quantitative data, and that they are well-suited for studies of large networks. While numerous groups have developed specific computational tools that provide original methods to analyse qualitative models, a standard format to exchange qualitative models has been missing.
We present the Systems Biology Markup Language (SBML) Qualitative Models Package (“qual”), an extension of the SBML Level 3 standard designed for computer representation of qualitative models of biological networks. We demonstrate the interoperability of models via SBML qual through the analysis of a specific signalling network by three independent software tools. Furthermore, the collective effort to define the SBML qual format paved the way for the development of LogicalModel, an open-source model library, which will facilitate the adoption of the format as well as the collaborative development of algorithms to analyse qualitative models.
SBML qual allows the exchange of qualitative models among a number of complementary software tools. SBML qual has the potential to promote collaborative work on the development of novel computational approaches, as well as on the specification and the analysis of comprehensive qualitative models of regulatory and signalling networks.
Multiple models of human metabolism have been reconstructed, but each represents only a subset of our knowledge. Here we describe Recon 2, a community-driven, consensus ‘metabolic reconstruction’, which is the most comprehensive representation of human metabolism that is applicable to computational modeling. Compared with its predecessors, the reconstruction has improved topological and functional features, including ~2× more reactions and ~1.7× more unique metabolites. Using Recon 2 we predicted changes in metabolite biomarkers for 49 inborn errors of metabolism with 77% accuracy when compared to experimental data. Mapping metabolomic data and drug information onto Recon 2 demonstrates its potential for integrating and analyzing diverse data types. Using protein expression data, we automatically generated a compendium of 65 cell type–specific models, providing a basis for manual curation or investigation of cell-specific metabolic properties. Recon 2 will facilitate many future biomedical studies and is freely available at http://humanmetabolism.org/.
Systems biology projects and omics technologies have led to a growing number of biochemical pathway models and reconstructions. However, the majority of these models are still created de novo, based on literature mining and the manual processing of pathway data.
To increase the efficiency of model creation, the Path2Models project has automatically generated mathematical models from pathway representations using a suite of freely available software. Data sources include KEGG, BioCarta, MetaCyc and SABIO-RK. Depending on the source data, three types of models are provided: kinetic, logical and constraint-based. Models from over 2 600 organisms are encoded consistently in SBML, and are made freely available through BioModels Database at http://www.ebi.ac.uk/biomodels-main/path2models. Each model contains the list of participants, their interactions, the relevant mathematical constructs, and initial parameter values. Most models are also available as easy-to-understand graphical SBGN maps.
To date, the project has resulted in more than 140 000 freely available models. Such a resource can tremendously accelerate the development of mathematical models by providing initial starting models for simulation and analysis, which can be subsequently curated and further parameterized.
Modular rate law; Constraint based models; Logical models; SBGN; SBML
The increasing use of computational simulation experiments to inform modern biological research creates new challenges to annotate, archive, share and reproduce such experiments. The recently published Minimum Information About a Simulation Experiment (MIASE) proposes a minimal set of information that should be provided to allow the reproduction of simulation experiments among users and software tools.
In this article, we present the Simulation Experiment Description Markup Language (SED-ML). SED-ML encodes in a computer-readable exchange format the information required by MIASE to enable reproduction of simulation experiments. It has been developed as a community project and it is defined in a detailed technical specification and additionally provides an XML schema. The version of SED-ML described in this publication is Level 1 Version 1. It covers the description of the most frequent type of simulation experiments in the area, namely time course simulations. SED-ML documents specify which models to use in an experiment, modifications to apply on the models before using them, which simulation procedures to run on each model, what analysis results to output, and how the results should be presented. These descriptions are independent of the underlying model implementation. SED-ML is a software-independent format for encoding the description of simulation experiments; it is not specific to particular simulation tools. Here, we demonstrate that with the growing software support for SED-ML we can effectively exchange executable simulation descriptions.
With SED-ML, software can exchange simulation experiment descriptions, enabling the validation and reuse of simulation experiments in different tools. Authors of papers reporting simulation experiments can make their simulation protocols available for other scientists to reproduce the results. Because SED-ML is agnostic about exact modeling language(s) used, experiments covering models from different fields of research can be accurately described and combined.
The Computational Modeling in Biology Network (COMBINE), is an initiative to coordinate the development of the various community standards and formats in computational systems biology and related fields. This report summarizes the activities pursued at the first annual COMBINE meeting held in Edinburgh on October 6-9 2010 and the first HARMONY hackathon, held in New York on April 18-22 2011. The first of those meetings hosted 81 attendees. Discussions covered both official COMBINE standards-(BioPAX, SBGN and SBML), as well as emerging efforts and interoperability between different formats. The second meeting, oriented towards software developers, welcomed 59 participants and witnessed many technical discussions, development of improved standards support in community software systems and conversion between the standards. Both meetings were resounding successes and showed that the field is now mature enough to develop representation formats and related standards in a coordinated manner.
The use of computational modeling to describe and analyze biological systems is at the heart of systems biology. This Perspective discusses the development and use of ontologies that are designed to add semantic information to computational models and simulations.
The use of computational modeling to describe and analyze biological systems is at the heart of systems biology. Model structures, simulation descriptions and numerical results can be encoded in structured formats, but there is an increasing need to provide an additional semantic layer. Semantic information adds meaning to components of structured descriptions to help identify and interpret them unambiguously. Ontologies are one of the tools frequently used for this purpose. We describe here three ontologies created specifically to address the needs of the systems biology community. The Systems Biology Ontology (SBO) provides semantic information about the model components. The Kinetic Simulation Algorithm Ontology (KiSAO) supplies information about existing algorithms available for the simulation of systems biology models, their characterization and interrelationships. The Terminology for the Description of Dynamics (TEDDY) categorizes dynamical features of the simulation results and general systems behavior. The provision of semantic information extends a model's longevity and facilitates its reuse. It provides useful insight into the biology of modeled processes, and may be used to make informed decisions on subsequent simulation experiments.
dynamics; kinetics; model; ontology; simulation
Summary: The specifications of the Systems Biology Markup Language (SBML) define standards for storing and exchanging computer models of biological processes in text files. In order to perform model simulations, graphical visualizations and other software manipulations, an in-memory representation of SBML is required. We developed JSBML for this purpose. In contrast to prior implementations of SBML APIs, JSBML has been designed from the ground up for the Java™ programming language, and can therefore be used on all platforms supported by a Java Runtime Environment. This offers important benefits for Java users, including the ability to distribute software as Java Web Start applications. JSBML supports all SBML Levels and Versions through Level 3 Version 1, and we have strived to maintain the highest possible degree of compatibility with the popular library libSBML. JSBML also supports modules that can facilitate the development of plugins for end user applications, as well as ease migration from a libSBML-based backend.
Availability: Source code, binaries and documentation for JSBML can be freely obtained under the terms of the LGPL 2.1 from the website http://sbml.org/Software/JSBML.
Supplementary information: Supplementary data are available at Bioinformatics online.
BioPAX (Biological Pathway Exchange) is a standard language to represent biological pathways at the molecular and cellular level. Its major use is to facilitate the exchange of pathway data (http://www.biopax.org). Pathway data captures our understanding of biological processes, but its rapid growth necessitates development of databases and computational tools to aid interpretation. However, the current fragmentation of pathway information across many databases with incompatible formats presents barriers to its effective use. BioPAX solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. BioPAX was created through a community process. Through BioPAX, millions of interactions organized into thousands of pathways across many organisms, from a growing number of sources, are available. Thus, large amounts of pathway data are available in a computable form to support visualization, analysis and biological discovery.
pathway data integration; pathway database; standard exchange format; ontology; information system
A recent article in BMC Bioinformatics describes new advances in workflow systems for computational modeling in systems biology. Such systems can accelerate, and improve the consistency of, modeling through automation not only at the simulation and results-production stages, but also at the model-generation stage. Their work is a harbinger of the next generation of more powerful software for systems biologists.
See research article: http://www.biomedcentral.com/1471-2105/11/582/abstract/
Ever since the rise of systems biology at the end of the last century, mathematical representations of biological systems and their activities have flourished. They are being used to describe everything from biomolecular networks, such as gene regulation, metabolic processes and signaling pathways, at the lowest biological scales, to tissue growth and differentiation, drug effects, environmental interactions, and more. A very active area in the field has been the development of techniques that facilitate the construction, analysis and dissemination of computational models. The heterogeneous, distributed nature of most data resources today has increased not only the opportunities for, but also the difficulties of, developing software systems to support these tasks. The work by Li et al.  published in BMC Bioinformatics represents a promising evolutionary step forward in this area. They describe a workflow system - a visual software
environment enabling a user to create a connected set of operations to be performed sequentially using seperate tools and resources. Their system uses third-party data resources accessible over the Internet to elaborate and parametrize (that is, assign parameter values to) computational models in a semi-automated manner. In Li et al.'s work, the authors point towards a promising future for computational modeling and simultaneously highlight some of the difficulties that need to be overcome before we get there.
Quantitative models of biochemical and cellular systems are used to answer a variety of questions in the biological sciences. The number of published quantitative models is growing steadily thanks to increasing interest in the use of models as well as the development of improved software systems and the availability of better, cheaper computer hardware. To maximise the benefits of this growing body of models, the field needs centralised model repositories that will encourage, facilitate and promote model dissemination and reuse. Ideally, the models stored in these repositories should be extensively tested and encoded in community-supported and standardised formats. In addition, the models and their components should be cross-referenced with other resources in order to allow their unambiguous identification.
BioModels Database http://www.ebi.ac.uk/biomodels/ is aimed at addressing exactly these needs. It is a freely-accessible online resource for storing, viewing, retrieving, and analysing published, peer-reviewed quantitative models of biochemical and cellular systems. The structure and behaviour of each simulation model distributed by BioModels Database are thoroughly checked; in addition, model elements are annotated with terms from controlled vocabularies as well as linked to relevant data resources. Models can be examined online or downloaded in various formats. Reaction network diagrams generated from the models are also available in several formats. BioModels Database also provides features such as online simulation and the extraction of components from large scale models into smaller submodels. Finally, the system provides a range of web services that external software systems can use to access up-to-date data from the database.
BioModels Database has become a recognised reference resource for systems biology. It is being used by the community in a variety of ways; for example, it is used to benchmark different simulation systems, and to study the clustering of models based upon their annotations. Model deposition to the database today is advised by several publishers of scientific journals. The models in BioModels Database are freely distributed and reusable; the underlying software infrastructure is also available from SourceForge https://sourceforge.net/projects/biomodels/ under the GNU General Public License.
Summary: The XML-based Systems Biology Markup Language (SBML) has emerged as a standard for storage, communication and interchange of models in systems biology. As a machine-readable format XML is difficult for humans to read and understand. Many tools are available that visualize the reaction pathways stored in SBML files, but many components, e.g. unit declarations, complex kinetic equations or links to MIRIAM resources, are often not made visible in these diagrams. For a broader understanding of the models, support in scientific writing and error detection, a human-readable report of the complete model is needed. We present SBML2LaTEX, a Java-based stand-alone program to fill this gap. A convenient web service allows users to directly convert SBML to various formats, including DVI, LaTEX and PDF, and provides many settings for customization.
Availability: Source code, documentation and a web service are freely available at http://www.ra.cs.uni-tuebingen.de/software/SBML2LaTeX.
Supplementary information:Supplementary data are available at Bioinformatics online.
LibSBML is an application programming interface library for reading, writing, manipulating and validating content expressed in the Systems Biology Markup Language (SBML) format. It is written in ISO C and C++, provides language bindings for Common Lisp, Java, Python, Perl, MATLAB and Octave, and includes many features that facilitate adoption and use of both SBML and the library. Developers can embed libSBML in their applications, saving themselves the work of implementing their own SBML parsing, manipulation, and validation software.
LibSBML 3 was released in August 2007. Source code, binaries and documentation are freely available under LGPL open-source terms from http://sbml.org/software/libsbml.
Summary: MathSBML is a Mathematica package designed for manipulating Systems Biology Markup Language (SBML) models. It converts SBML models into Mathematica data structures and provides a platform for manipulating and evaluating these models. Once a model is read by MathSBML, it is fully compatible with standard Mathematica functions such as NDSolve (a differential-algebraic equations solver). MathSBML also provides an application programming interface for viewing, manipulating, running numerical simulations; exporting SBML models; and converting SBML models in to other formats, such as XPP, HTML and FORTRAN. By accessing the full breadth of Mathematica functionality, MathSBML is fully extensible to SBML models of any size or complexity.
Availability: Open Source (LGPL) at http://www.sbml.org and http://www.sf.net/projects/sbml.
BioModels Database (), part of the international initiative BioModels.net, provides access to published, peer-reviewed, quantitative models of biochemical and cellular systems. Each model is carefully curated to verify that it corresponds to the reference publication and gives the proper numerical results. Curators also annotate the components of the models with terms from controlled vocabularies and links to other relevant data resources. This allows the users to search accurately for the models they need. The models can currently be retrieved in the SBML format, and import/export facilities are being developed to extend the spectrum of formats supported by the resource.