With the ever increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. To this end, various standards have been proposed that can be used to describe models, simulations, data or other essential information in a consistent fashion. These constitute various separate components required to reproduce a given published scientific result.
We describe the Open Modeling EXchange format (OMEX). Together with the use of other standard formats from the Computational Modeling in Biology Network (COMBINE), OMEX is the basis of the COMBINE Archive, a single file that supports the exchange of all the information necessary for a modeling and simulation experiment in biology. An OMEX file is a ZIP container that includes a manifest file, listing the content of the archive, an optional metadata file adding information about the archive and its content, and the files describing the model. The content of a COMBINE Archive consists of files encoded in COMBINE standards whenever possible, but may include additional files defined by an Internet Media Type. Several tools that support the COMBINE Archive are available, either as independent libraries or embedded in modeling software.
The COMBINE Archive facilitates the reproduction of modeling and simulation experiments in biology by embedding all the relevant information in one file. Having all the information stored and exchanged at once also helps in building activity logs and audit trails. We anticipate that the COMBINE Archive will become a significant help for modellers, as the domain moves to larger, more complex experiments such as multi-scale models of organs, digital organisms, and bioengineering.
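The container layout described above (a ZIP file holding a manifest that lists every entry) can be sketched with Python's standard library. This is a minimal illustration under stated assumptions, not a replacement for the dedicated libraries: the manifest namespace and format identifiers are written from memory of the COMBINE Archive specification and should be checked against it, and the file name model.xml and the helper names are hypothetical.

```python
import io
import zipfile
from xml.sax.saxutils import quoteattr

# Assumed identifiers: verify against the COMBINE Archive specification.
MANIFEST_NS = "http://identifiers.org/combine.specifications/omex-manifest"
OMEX_FORMAT = "http://identifiers.org/combine.specifications/omex"
SBML_FORMAT = "http://identifiers.org/combine.specifications/sbml"


def build_manifest(entries):
    """Return manifest XML text for a list of (location, format) pairs."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             "<omexManifest xmlns=%s>" % quoteattr(MANIFEST_NS)]
    for location, fmt in entries:
        lines.append("  <content location=%s format=%s/>"
                     % (quoteattr(location), quoteattr(fmt)))
    lines.append("</omexManifest>")
    return "\n".join(lines) + "\n"


def write_archive(target, files):
    """Write a ZIP container holding a manifest plus the given files.

    files maps an archive path to a (format identifier, bytes) pair.
    """
    # The archive itself and the manifest are listed alongside the content.
    entries = [(".", OMEX_FORMAT), ("./manifest.xml", MANIFEST_NS)]
    entries += [("./" + path, fmt) for path, (fmt, _data) in files.items()]
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", build_manifest(entries))
        for path, (_fmt, data) in files.items():
            archive.writestr(path, data)


# Hypothetical content: a placeholder standing in for a real SBML model.
buffer = io.BytesIO()
write_archive(buffer, {"model.xml": (SBML_FORMAT, b"<sbml/>")})
with zipfile.ZipFile(buffer) as archive:
    print(archive.namelist())
```

Writing to an in-memory buffer keeps the sketch free of side effects; passing a file path to write_archive instead would produce a file on disk.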
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-014-0369-z) contains supplementary material, which is available to authorized users.
Data format; Archive; Computational modeling; Reproducible research; Reproducible science
Accurate estimation of parameters of biochemical models is required to characterize the dynamics of molecular processes. This problem is intimately linked to identifying the most informative experiments for accomplishing such tasks. While significant progress has been made, effective experimental strategies for parameter identification and for distinguishing among alternative network topologies remain unclear. We approached these questions in an unbiased manner using a unique community-based approach in the context of the DREAM initiative (Dialogue for Reverse Engineering Assessment of Methods). We created an in silico test framework under which participants could probe a network with hidden parameters by requesting a range of experimental assays; results of these experiments were simulated according to a model of network dynamics only partially revealed to participants.
We proposed two challenges; in the first, participants were given the topology and underlying biochemical structure of a 9-gene regulatory network and were asked to determine its parameter values. In the second challenge, participants were given an incomplete topology with 11 genes and asked to find three missing links in the model. In both challenges, a budget was provided to buy experimental data generated in silico with the model and mimicking the features of different common experimental techniques, such as microarrays and fluorescence microscopy. Data could be bought at any stage, allowing participants to implement an iterative loop of experiments and computation.
A total of 19 teams participated in this competition. The results suggest that the combination of state-of-the-art parameter estimation and a varied set of experimental methods using a few datasets, mostly fluorescence imaging data, can accurately determine parameters of biochemical models of gene regulation. However, the task is considerably more difficult if the gene network topology is not completely defined, as in challenge 2. Importantly, we found that aggregating independent parameter predictions and network topology across submissions creates a solution that can be better than the one from the best-performing submission.
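The idea of aggregating independent parameter predictions can be illustrated with a small sketch. The abstract does not state how submissions were combined, so the per-parameter median of log-values used here is an assumption, and the team estimates and "true" values are toy numbers invented for the illustration, not challenge data. Aggregation is not guaranteed to beat the best submission; it does so here because the individual errors point in different directions.

```python
import math
from statistics import median


def aggregate_predictions(submissions):
    """Combine estimates with the per-parameter median taken in log space.

    Assumed aggregation rule, chosen because rate constants are positive
    and often span orders of magnitude.
    """
    names = submissions[0].keys()
    return {name: math.exp(median(math.log(s[name]) for s in submissions))
            for name in names}


def mean_squared_log_error(estimate, truth):
    """Average squared log-ratio between estimated and true parameters."""
    return sum(math.log(estimate[n] / truth[n]) ** 2 for n in truth) / len(truth)


# Toy values for illustration only.
truth = {"k1": 1.0, "k2": 10.0}
teams = [
    {"k1": 2.0, "k2": 9.0},
    {"k1": 0.5, "k2": 12.0},
    {"k1": 1.1, "k2": 5.0},
]

combined = aggregate_predictions(teams)
best_single = min(mean_squared_log_error(t, truth) for t in teams)
print(combined, mean_squared_log_error(combined, truth), best_single)
```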
With the growing importance of computational models in systems biology, there has been much interest in recent years in developing standard model interchange languages that permit biologists to easily exchange models between different software tools. In this chapter, two chief model exchange standards, SBML and CellML, are described. In addition, other related features including visual layout initiatives, ontologies and best practices for model annotation are discussed. Software tools such as developer libraries and basic editing tools are also introduced, together with a discussion on the future of modeling languages and visualization tools in systems biology.
SBML; CellML; Ontology; MIRIAM; SBGN; TEDDY; MIASE; Standards; libSBML; SBMLEditor; Biomodels.net; SBO
A great variety of software applications are now employed in the metabolic engineering field. These applications have been created to support a wide range of experimental and analysis techniques. Computational tools are utilized throughout the metabolic engineering workflow to extract and interpret relevant information from large data sets, to present complex models in a more manageable form, and to propose efficient network design strategies. In this review, we present a number of tools that can assist in modifying and understanding cellular metabolic networks. The review covers seven areas of relevance to metabolic engineers. These include metabolic reconstruction efforts, network visualization, nucleic acid and protein engineering, metabolic flux analysis, pathway prospecting, post-structural network analysis and culture optimization. The list of available tools is extensive and we can only highlight a small, representative portion of the tools from each area.
metabolic engineering; software; genome-scale metabolic networks; network visualization
Many biological studies are carried out on large populations of cells, often in order to obtain enough material to make measurements. However, we now know that noise is endemic in biological systems and this results in cell-to-cell variability in what appears to be a population of identical cells. Although often neglected, this noise can have a dramatic effect on system responses to environmental cues with significant and often counter-intuitive biological outcomes. A recent study in BMC Systems Biology provides an example of this, documenting a bimodal distribution of activated extracellular signal-regulated kinase in a population of cells exposed to epidermal growth factor and demonstrating that the observed bimodality of the response is induced purely by noise.
See research article: http://www.biomedcentral.com/1752-0509/6/109
One problem with synthetic genes in genetically engineered organisms is that these foreign DNAs will eventually lose their functions over evolutionary time in the absence of selective pressures. This general limitation can restrain the long-term study and industrial application of synthetic genetic circuits. Previous studies have shown that because of their crucial regulatory functions, prokaryotic promoters in synthetic genetic circuits are especially vulnerable to mutations. In this study, we rationally designed robust bidirectional promoters (BDPs), which are self-protected through the complementarity of their overlapping forward and backward promoter sequences on the DNA duplex. When the transcription of a target non-essential gene (e.g. green fluorescent protein) was coupled to the transcription of an essential gene (e.g. antibiotic resistance gene) through the BDP, the evolutionary half-time of the gene of interest increased 4–10 times, depending on the strain and experimental conditions used. This design of using BDPs to increase the mutational stability of genetic circuits can potentially be applied to synthetic biology applications in general.
Genetically identical cells can show phenotypic variability. This is often caused by stochastic events that originate from randomness in biochemical processes involved in gene expression and other extrinsic cellular processes. From an engineering perspective, there have been efforts focused on theory and experiments to control noise levels by perturbing and replacing gene network components. However, systematic methods for noise control are lacking, mainly due to the intractable mathematical structure of noise propagation through reaction networks. Here, we provide a numerical analysis method by quantifying the parametric sensitivity of noise characteristics at the level of the linear noise approximation. Our analysis is readily applicable to various types of noise control and to different types of system; for example, we can orthogonally control the mean and noise levels and can control system dynamics such as noisy oscillations. As an illustration we applied our method to HIV and yeast gene expression systems and metabolic networks. The oscillatory signal control was applied to p53 oscillations from DNA damage. Furthermore, we showed that the efficiency of orthogonal control can be enhanced by applying extrinsic noise and feedback. Our noise control analysis can be applied to any stochastic model belonging to continuous time Markovian systems such as biological and chemical reaction systems, and even computer and social networks. We anticipate the proposed analysis to be a useful tool for designing and controlling synthetic gene networks.
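To make the notion of parametric sensitivity of noise at the level of the linear noise approximation concrete, the following sketch treats the simplest possible case: one species produced at a constant rate k and degraded at rate g·x. The model, the parameter values, and the finite-difference estimate of the sensitivities are assumptions made for illustration; they are not the numerical method of the paper, which addresses general reaction networks.

```python
import math


def lna_mean_and_cv(k, g):
    """Steady-state mean and coefficient of variation under the LNA.

    Reactions: 0 -> X at rate k, X -> 0 at rate g*x.
    """
    mean = k / g                    # steady state of dx/dt = k - g*x
    jacobian = -g                   # derivative of the rate equation
    diffusion = k + g * mean        # sum of the reaction fluxes
    # Scalar Lyapunov equation 2*J*C + D = 0 gives the variance C.
    variance = -diffusion / (2.0 * jacobian)
    return mean, math.sqrt(variance) / mean


def mean_of(k, g):
    return lna_mean_and_cv(k, g)[0]


def cv_of(k, g):
    return lna_mean_and_cv(k, g)[1]


def log_sensitivity(func, params, name, rel_step=1e-4):
    """Central finite-difference estimate of d ln(func) / d ln(parameter)."""
    up = dict(params)
    down = dict(params)
    up[name] *= 1.0 + rel_step
    down[name] *= 1.0 - rel_step
    d_log_f = math.log(func(**up)) - math.log(func(**down))
    d_log_p = math.log(1.0 + rel_step) - math.log(1.0 - rel_step)
    return d_log_f / d_log_p


params = {"k": 20.0, "g": 0.5}      # hypothetical values
for name in params:
    print(name,
          log_sensitivity(mean_of, params, name),
          log_sensitivity(cv_of, params, name))
```

In this toy model the mean and the noise level respond to both parameters in a coupled way (the mean scales as k/g and the coefficient of variation as the square root of g/k), which illustrates why independent control of the two generally requires a richer network.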
Stochastic gene expression at the single cell level can lead to significant phenotypic variation at the population level. To obtain a desired phenotype, the noise levels of intracellular protein concentrations may need to be tuned and controlled. Noise levels often decrease in relative amount as the mean values increase. This implies that the noise levels can be passively controlled by changing the mean values. From an engineering perspective, the noise levels can be further controlled while the mean values are simultaneously adjusted to desired values. Here, systematic schemes for such simultaneous control are described by identifying where and by how much the system needs to be perturbed. The schemes can be applied to the design process of a potential therapeutic HIV drug that targets a certain set of reactions that are identified by the proposed analysis, to prevent stochastic transition to the lytic state. In some cases, when the noise levels change strongly with the mean values, the simultaneous control cannot be performed efficiently. This problem is shown to be resolved by applying extra noise and feedback.
Motivation: The SBML Render Extension enables coloring and shape information of biochemical models to be stored in the Systems Biology Markup Language (SBML). Rendering of this stored graphical information in a portable and well-supported system such as TeX would be useful for researchers preparing documentation and presentations. In addition, since the Render Extension is not yet supported by many applications, it is helpful for such rendering functionality to be extended to the more popular CellDesigner annotation as well.
Results: SBML2TikZ supports automatic generation of graphics for biochemical models in the popular TeX typesetting system. The library generates a script of TeX macro commands for the vector graphics languages PGF/TikZ that can be compiled into scalable vector graphics described in a model.
Availability: Source code, documentation and compiled binaries for the SBML2TikZ library can be found at http://www.sbml2tikz.org. In addition, a web application is available at http://www.sys-bio.org/layout
Supplementary information: Supplementary data are available at Bioinformatics online.
We have created the Knowledgebase of Standard Biological Parts (SBPkb) as a publicly accessible Semantic Web resource for synthetic biology (sbolstandard.org). The SBPkb allows researchers to query and retrieve standard biological parts for research and use in synthetic biology. Its initial version includes all of the information about parts stored in the Registry of Standard Biological Parts (partsregistry.org). SBPkb transforms this information so that it is computable, using our semantic framework for synthetic biology parts. This framework, known as SBOL-semantic, was built as part of the Synthetic Biology Open Language (SBOL), a project of the Synthetic Biology Data Exchange Group. SBOL-semantic represents commonly used synthetic biology entities, and its purpose is to improve the distribution and exchange of descriptions of biological parts. In this paper, we describe the data and our methods for transformation to SBPkb, and finally we demonstrate the value of our knowledgebase with a set of sample queries. We use RDF technology and SPARQL queries to retrieve candidate “promoter” parts that are known to be both negatively and positively regulated. This method provides new web-based data access that enables part searches that are not currently possible.
In synthetic biology, gene regulatory circuits are often constructed by combining smaller circuit components. Connections between components are achieved by transcription factors acting on promoters. If the individual components behave as true modules and certain module interface conditions are satisfied, the function of the composite circuits can in principle be predicted.
In this paper, we investigate one of the interface conditions: fan-out. We quantify the fan-out, a concept widely used in electrical engineering, to indicate the maximum number of downstream inputs that an upstream output transcription factor can regulate. The fan-out is shown to be closely related to the retroactivity studied by Del Vecchio et al. An efficient operational method for measuring the fan-out is proposed and shown to be applicable to various types of module interfaces. The fan-out is also shown to be enhanced by self-inhibitory regulation on the output. The potential role of an inhibitory regulation is discussed.
The proposed estimation method for fan-out not only provides an experimentally efficient way for quantifying the level of modularity in gene regulatory circuits but also helps characterize and design module interfaces, enabling the modular construction of gene circuits.
One problem with engineered genetic circuits in synthetic microbes is their stability over evolutionary time in the absence of selective pressure. Since design of a selective environment for maintaining function of a circuit will be unique to every circuit, general design principles are needed for engineering evolutionarily robust circuits that permit the long-term study or applied use of synthetic circuits.
We first measured the stability of two BioBrick-assembled genetic circuits propagated in Escherichia coli over multiple generations and identified the mutations that caused their loss-of-function. The first circuit, T9002, loses function in less than 20 generations and the mutation that repeatedly causes its loss-of-function is a deletion between two homologous transcriptional terminators. To measure the relationship between transcriptional terminator homology levels and evolutionary stability, we re-engineered six versions of T9002 with a different transcriptional terminator at the end of the circuit. When there is no homology between terminators, the evolutionary half-life of this circuit is significantly improved, by more than 2-fold, and is independent of the expression level. Removing homology between terminators and decreasing expression level 4-fold increases the evolutionary half-life more than 17-fold. The second circuit, I7101, loses function in less than 50 generations due to a deletion between repeated operator sequences in the promoter. This circuit was re-engineered with different promoters from a promoter library and with a kanamycin resistance gene (kanR) within the circuit to put a selective pressure on the promoter. The evolutionary stability dynamics and loss-of-function mutations in all these circuits are described. We also found that, on average, evolutionary half-life decreases exponentially with increasing expression levels.
A wide variety of loss-of-function mutations are observed in BioBrick-assembled genetic circuits including point mutations, small insertions and deletions, large deletions, and insertion sequence (IS) element insertions that often occur in the scar sequence between parts. Promoter mutations are selected for more than mutations in any other biological part. Genetic circuits can be re-engineered to be more evolutionarily robust with a few simple design principles: high expression of genetic circuits comes at the cost of low evolutionary stability, repeated sequences should be avoided, and the use of inducible promoters increases stability. Inclusion of an antibiotic resistance gene within the circuit does not ensure evolutionary stability.
Motivation: Model exchange in systems and synthetic biology has been standardized for computers with the Systems Biology Markup Language (SBML) and CellML, but specialized software is needed for the generation of models in these formats. Text-based model definition languages allow researchers to create models simply, and then export them to a common exchange format. Modular languages allow researchers to create and combine complex models more easily. We saw a use for a modular text-based language, together with a translation library to allow other programs to read the models as well.
Summary: The Antimony language provides a way for a researcher to use simple text statements to create, import, and combine biological models, allowing complex models to be built from simpler models, and provides a special syntax for the creation of modular genetic networks. The libAntimony library allows other software packages to import these models and convert them either to SBML or their own internal format.
Availability: The Antimony language specification and the libAntimony library are available under a BSD license from http://antimony.sourceforge.net/
Synthetic biology is an engineering discipline that builds on modeling practices from systems biology and wet-lab techniques from genetic engineering. As synthetic biology advances, efficient procedures will be developed that will allow a synthetic biologist to design, analyze and build biological networks. In this idealized pipeline, computer-aided design (CAD) is a necessary component. The role of a CAD application would be to allow efficient transition from a general design to a final product. TinkerCell is a design tool for serving this purpose in synthetic biology. In TinkerCell, users build biological networks using biological parts and modules. The network can be analyzed using one of several functions provided by TinkerCell or custom programs from third-party sources. Since best practices for modeling and constructing synthetic biology networks have not yet been established, TinkerCell is designed as a flexible and extensible application that can adjust itself to changes in the field.
synthetic biology; modeling; software; CAD; simulation; systems biology; design; standards; computational
Genetic circuits can be assembled from standardized biological parts called BioBricks. Examples of BioBricks include promoters, ribosome-binding sites, coding sequences and transcriptional terminators. Standard BioBrick assembly normally involves restriction enzyme digestion and ligation of two BioBricks at a time. The method described here is an alternative assembly strategy that allows for two or more PCR-amplified BioBricks to be quickly assembled and re-engineered using the Clontech In-Fusion PCR Cloning Kit. This method allows for a large number of parallel assemblies to be performed and is a flexible way to mix and match BioBricks. In-Fusion assembly can be semi-standardized by the use of simple primer design rules that minimize the time involved in planning assembly reactions. We describe the success rate and mutation rate of In-Fusion assembled genetic circuits using various homology and primer lengths. We also demonstrate the success and flexibility of this method with six specific examples of BioBrick assembly and re-engineering. These examples include assembly of two basic parts, part swapping, a deletion, an insertion, and three-way In-Fusion assemblies.
Probably one of the most characteristic features of a living system is its continual propensity to change as it juggles the demands of survival with the need to replicate. Internally these changes are manifest as changes in metabolite, protein and gene activities. Such changes have become increasingly obvious to experimentalists with the advent of high-throughput technologies. In this chapter we highlight some of the quantitative approaches used to rationalize the study of cellular dynamics. The chapter focuses attention on the analysis of quantitative models based on differential equations using biochemical control theory. Basic pathway motifs are discussed, including straight chain, branched and cyclic systems. In addition, some of the properties conferred by positive and negative feedback loops are discussed particularly in relation to bistability and oscillatory dynamics.
Motifs; control analysis; stability; dynamic models
Synthetic biology brings together concepts and techniques from engineering and biology. In this field, computer-aided design (CAD) is necessary in order to bridge the gap between computational modeling and biological data. Using a CAD application, it would be possible to construct models using available biological "parts" and directly generate the DNA sequence that represents the model, thus increasing the efficiency of design and construction of synthetic networks.
An application named TinkerCell has been developed in order to serve as a CAD tool for synthetic biology. TinkerCell is a visual modeling tool that supports a hierarchy of biological parts. Each part in this hierarchy consists of a set of attributes that define the part, such as sequence or rate constants. Models that are constructed using these parts can be analyzed using various third-party C and Python programs that are hosted by TinkerCell via an extensive C and Python application programming interface (API). TinkerCell supports the notion of modules, which are networks with interfaces. Such modules can be connected to each other, forming larger modular networks. TinkerCell is a free and open-source project under the Berkeley Software Distribution license. Downloads, documentation, and tutorials are available at .
An ideal CAD application for engineering biological systems would provide features such as: building and simulating networks, analyzing robustness of networks, and searching databases for components that meet the design criteria. At the current state of synthetic biology, there are no established methods for measuring robustness or identifying components that fit a design. The same is true for databases of biological parts. TinkerCell's flexible modeling framework allows it to cope with changes in the field. Such changes may involve the way parts are characterized or the way synthetic networks are modeled and analyzed computationally. TinkerCell can readily accept third-party algorithms, allowing it to serve as a platform for testing different methods relevant to synthetic biology.
Motivation: Simulations are an essential tool when analyzing biochemical networks. Researchers and developers seeking to refine simulation tools or develop new ones would benefit greatly from being able to compare their simulation results.
Summary: We present an approach to compare simulation results between several SBML capable simulators and provide a website for the community to share simulation results.
Availability: The website with simulation results and additional material can be found under: http://sys-bio.org/sbwWiki/compare. The software used to generate the simulation results is available on the website for download.
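One simple way to compare simulation results between simulators is to check each trajectory against a reference pointwise. The sketch below assumes that all simulators report the same species at the same time points; the tolerance, the data layout, and the simulator names are hypothetical and are not taken from the website or the software described above.

```python
def max_relative_difference(reference, candidate, floor=1e-12):
    """Largest pointwise relative difference between two trajectories."""
    if len(reference) != len(candidate):
        raise ValueError("trajectories must share the same time points")
    # The floor avoids division by zero when the reference value is zero.
    return max(abs(r - c) / max(abs(r), floor)
               for r, c in zip(reference, candidate))


def compare_to_reference(results, reference_name, rel_tol=1e-4):
    """Report, per simulator and species, agreement with the reference.

    results maps simulator name -> {species name: list of values}.
    """
    reference = results[reference_name]
    report = {}
    for name, trajectories in results.items():
        if name == reference_name:
            continue
        report[name] = {
            species: max_relative_difference(reference[species], values) <= rel_tol
            for species, values in trajectories.items()
        }
    return report


# Toy trajectories for illustration only.
results = {
    "simulator_a": {"S1": [10.0, 6.07, 3.68]},
    "simulator_b": {"S1": [10.0, 6.0701, 3.6799]},
    "simulator_c": {"S1": [10.0, 6.5, 3.9]},
}
print(compare_to_reference(results, "simulator_a"))
```

A relative tolerance is used because species concentrations can differ by orders of magnitude; stochastic simulators would need a statistical comparison instead.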
Synthetic biology is a useful tool to investigate the dynamics of small biological networks and to assess our capacity to predict their behavior from computational models. In this work we report the construction of three different synthetic networks in Escherichia coli based upon the incoherent feed-forward loop architecture. The steady state behavior of the networks was investigated experimentally and computationally under different mutational regimes in a population-based assay. Our data show that the three incoherent feed-forward networks, using three different macromolecular inhibitory elements, reproduce the behavior predicted from our computational model. We also demonstrate that specific biological motifs can be designed to generate similar behavior using different components. In addition we show how it is possible to tune the behavior of the networks in a predictable manner by applying suitable mutations to the inhibitory elements.
Feed-forward networks; Modules; Simulation; Synthetic biology
Recent ChIP experiments of human and mouse embryonic stem cells have elucidated the architecture of the transcriptional regulatory circuitry responsible for cell determination, which involves the transcription factors OCT4, SOX2, and NANOG. In addition to regulating each other through feedback loops, these genes also regulate downstream target genes involved in the maintenance and differentiation of embryonic stem cells. A search for the OCT4–SOX2–NANOG network motif in other species reveals that it is unique to mammals. With a kinetic modeling approach, we ascribe function to the observed OCT4–SOX2–NANOG network by making plausible assumptions about the interactions between the transcription factors at the gene promoter binding sites and RNA polymerase (RNAP), at each of the three genes as well as at the target genes. We identify a bistable switch in the network, which arises due to several positive feedback loops, and is switched on/off by input environmental signals. The switch stabilizes the expression levels of the three genes, and through their regulatory roles on the downstream target genes, leads to a binary decision: when OCT4, SOX2, and NANOG are expressed and the switch is on, the self-renewal genes are on and the differentiation genes are off. The opposite holds when the switch is off. The model is extremely robust to parameter changes. In addition to providing a self-consistent picture of the transcriptional circuit, the model generates several predictions. Increasing the binding strength of NANOG to OCT4 and SOX2, or increasing its basal transcriptional rate, leads to an irreversible bistable switch: the switch remains on even when the activating signal is removed. Hence, the stem cell can be manipulated to be self-renewing without the requirement of input signals. We also suggest tests that could discriminate between a variety of feedforward regulation architectures of the target genes by OCT4, SOX2, and NANOG.
One key issue in developmental biology is how embryonic stem cells are regulated at the genetic level. Recent high-throughput experiments have elucidated the architecture of the gene regulatory network responsible for embryonic stem cell fate decisions in human and mouse. In this work the authors develop a computational model to describe the mutual regulation of the genes involved in these networks and their subsequent effects on downstream target genes. They find that the core genetic network incorporates the functionality of a bistable switch, which arises due to positive feedback loops in the system. This switch behaviour is also very robust with respect to model parameters. The switch, together with the architecture by which the genetic network regulates the downstream genes, is responsible for either keeping the genes responsible for self-renewal on and the genes involved in differentiation off, or the opposite outcome, depending on whether the switch is on or off, respectively. The model also provides several predictions which can lead to further understanding of the network. The methods employed are fairly standard and transparent, which facilitates further discoveries in future experimental investigations of genetic networks.
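How positive feedback can produce a bistable switch that an input signal turns on can be seen in a deliberately small sketch. The one-variable model below (basal input plus a Hill-type self-activation term minus linear decay) is a generic textbook form chosen for illustration; it is not the OCT4–SOX2–NANOG model of the paper, and all parameter values and function names are assumptions.

```python
def rate(x, signal=0.05, vmax=1.0, half_sat=0.5, decay=1.0):
    """Net production rate: input signal + self-activation - decay."""
    feedback = vmax * x ** 2 / (half_sat ** 2 + x ** 2)
    return signal + feedback - decay * x


def steady_state(x0, dt=0.01, steps=20000, **params):
    """Integrate dx/dt = rate(x) with forward Euler from x0."""
    x = x0
    for _ in range(steps):
        x += dt * rate(x, **params)
    return x


# Same parameters, different histories: two stable expression levels.
off_state = steady_state(0.0)
on_state = steady_state(1.0)

# A stronger input removes the low state, so a cell starting "off" turns on.
switched_on = steady_state(0.0, signal=0.2)

print(off_state, on_state, switched_on)
```

With these assumed parameters the low and high states settle near 0.07 and 0.73, separated by an unstable point at 0.25, and raising the signal to 0.2 leaves a single state at 1.0.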