In this section we present several examples which serve as illustrations of how to use BioNetS and test the accuracy and efficiency of the numerical methods. One particular concern is the accuracy of the Euler methods. While these methods are only of order
, we show that when the approximations that lead to the chemical Langevin equations are valid, the difference between the numerical solutions of the SDEs and the exact discrete Gillespie method are negligible. Currently, the graphical user interface to BioNetS runs on the Macintosh OS X operating system, though the software will generate portable C/C++ code that can be compiled and run in any computing environment. The files needed to install and run BioNetS can be downloaded from http://x.amath.unc.edu/BioNetS
. The following examples illustrate the way in which models are entered and run in BioNetS. More detailed documentation is available with the software package.
We begin with a simple system that consists of the following two reactions:
In this system, monomer molecules M are produced at an average rate γ and degraded at an average rate δmM(t). Two monomers can then bind to form a dimer molecule D. The average forward and backward rates for the this reaction are k1M(t)(M(t) - 1) and k2D(t), respectively. The dimers are degraded at a rate δd. We will treat two cases. In the first case the cell volume is assumed to be constant, and in the second case the cell is allowed to grow and divide. To model cell growth, the cell volume Vc is treated as a random variable Vc =
αV, where V is a non-negative discrete random variable and α represents a unit of volume. The random variable V is governed by the reaction
The above reaction causes V to grow exponentially fast with an average rate of k3. Note that logistic growth is produced when the backward reaction in Eq. 32 is included.
We start by considering the simple case in which the volume of the cell remains constant. To use BioNetS follow these steps. Copy BioNetS onto your machine, and double click to launch. Help is included as part of the program, and accessed from the Help menu. The Help document will walk you through all the steps needed to enter reactions and run the simulator.
The user interface asks you to enter the reaction and corresponding rate constants in the top part of the script window as shown in Fig. . In the bottom part of the script window, you can toggle between panels. The Species panel is shown in Fig. and allows the user to specify how the simulator treats each chemical species, discrete or continuous. The Constants panel lists the order in which the rate constants are referenced. The Output panel allows the user to specify the ouput type. There are two ways to generate program output, either binary or ASCII. Binary output is based on MATLAB binary files, so it is possible to drive the program with MATLAB and use MATLAB's plotting routines to view the output. It is also possible to generate time series and histograms of the data from within BioNetS. Using ASCII files for I/O allows the simulator to be run through shell scripts. The Executable panel allows the user to generate either an executable file or source code. BioNetS generates portable C/C++ code that can be compiled and run in any computing environment. BioNetS can directly compile the C/C++ code. However, this requires the Developer tools, included on all recent Apple machines and available directly from http://developer.apple.com
for free. The compiled code can then be run from within BioNetS. The Comments panel is available for the user to enter descriptive comments about the model.
Figure 1 A) A screen shot of BioNetS that illustrates how the dimerization example with constant volume is entered. The tabs at the bottom of the screen allow the user to enter various options. B) A screen shot that illustrates how the dimerization example with (more ...)
To run BioNetS as a BioSpice agent, you need to move the source directory onto a OAA-supported system. Once there, open up the MakeOAA file and specify the locations of your oaalib folder. Then just type "make -f MakeOAA" (without the quotes) to create the agent.
Figures and show plots of time series for the monomer number generated by BioNetS. The parameter values used to generate these figures are given in the caption. Figure is the result obtained when M and D are treated as discrete variables. Figure is the result from the chemical Langevin equations. The solid line shown in both panels is the result from the following equations for the first moments:
Figure 2 A) A single realization of M(t) for the discrete process. B) A single realization of M(t) produced by the chemical Langevin equations. The solid line in both panels is the result produced from Eqs. 33 and 34. The parameter values used to generate these (more ...)
Figure shows the probability density function (PDF) of the dimer concentration at various times for the discrete and continuous case. Notice that for all times, the agreement between the two different methods is very good. At the final time, t = 200 s, the system has reached steady state. These figures indicate that the chemical Langevin equations are accurately capturing the dynamics and steady-state behavior of the discrete system.
Figure 3 PDFs for the dimer concentration at various times. The staircase plots are the results for the discrete case and the continuous lines are the results for the continuous case. Panel A is for the case of constant volume and Panel B is the varying volume (more ...)
Cell growth and division
In this section we describe how cell growth and division can be modeled using BioNetS. We will assume that the cell is experiencing exponential growth up until the time it divides. As discussed above, the cell volume Vc = αV is treated as a random variable. In this model cell division occurs when V exceeds a threshold value Vmax. Note that the choice of Vmax influences the degree of variability observed in the cell division times: cells growing from V = 1 to Vmax = 2 will have a large amount of variability in their division times, while those growing from V = 100 to Vmax = 200 will have less variable times, and those ranging from V = 1000 to Vmax = 2000 will be still less variable. Changing the range of V in this way requires rescaling the relationship of V to the cell volume by adjusting the value of α. When cell division occurs the volume is halved, and the proteins are randomly divided between the two cells using a binomial distribution. Only one of the daughter cells is tracked. Because second order reactions require two molecules to collide, the rate constants for these reactions should scale like k1 = k'1/Vc. We also assume that the production rate of monomers scales as γ = γ'Vc. This is a reasonable assumption, because as the cell grows the transcription and translation machinery increases. These assumptions produce the following rate equations for the concentrations
The terms in Eqs. 35 and 36 that involve k3
arise because of dilution due to cell growth. We use the same parameter values as in the constant volume case except δm
= 1 and δd
= 0. The cell growth rate is k3
= 0.02 (assuming a scaling of 1 time unit to one minute, this yields an average cell division time of ln 2/k3
35 minutes, typical for bacteria), the scale factor for the cell volume is α = 1 (for simplicity), and Vmax
= 100. With these choices of parameter values, Eqs. 35 and 36 are identical with Eqs. 33 and 34, and we expect the average behavior of this system to be similar to that of the constant volume case.
The screen shot shown in Fig. illustrates how this model is entered into BioNetS. Figure shows the time series for the volume and monomer number treating each variable discretely. The results for the continuous case are virtually identical. In Fig. the concentration is plotted as a function of time. This figure should be compared with Fig. . The solid line in Fig. is the result from solving Eqs. 35–37. Figure shows the PDFs for the dimer concentration at various times. Both the discrete and continuous results are shown. By comparing Fig. with Fig. , we see not surprisingly that for this simple example the main effect of volume growth is to act as an additional noise source and increase the variability of the distributions.
Figure 4 A) Time series for the volume and monomer number for the case of cell growth and division. The parameter values used to generate these figures were γ = 50, δm = 1.0, δd = 0.0, k1 = 0.01, k2 = 0.1, k3 = 0.02, and Vmax = 100. The (more ...)
A chemical oscillator
We next use BioNetS to simulate a two gene system that has been studied in the literature [37
]. In this system, the protein A
coded for by gene a
acts as an activator for gene a
and gene r
, by binding to the promoter regions, Pa
, of the respective gene. This increases the rate of mRNAa
production by a factor αa
, respectively. The protein R, acts as a represser for both genes by binding to A
to form the inactive complex A_R
. All gene products, mRNA and protein, are actively degraded. However, the heterodimer A_R
protects the R
subunit from degradation. The system consists of 9 chemical species and the following 14 biochemical reactions:
Figure shows the way this system is entered into BioNetS.
Figure 5 A screen shot of the BioNetS user interface for the chemical oscillator example. The parameter values shown in the figure are the ones used to produce the results shown in Figs. , , . The area to the right of the (more ...)
An interesting feature of the system is that it is capable of producing sustained oscillations [37
]. Figure shows a times series for the repressor protein number when all the chemical species are treated as discrete random variables. The values of the rate constants used to generate this figure are listed in Fig. .
Sample paths for the repressor protein number. Panel A is the fully discrete case and Panel B is the hybrid model.
The chemical species Pa, Pr, Pr_A, and Pr_A are binary random variables: they can only take on the values 0 or 1. Therefore, these species can not be approximated as continuous random variables. All the other chemical species appear in sufficient quantities to justify the continuum approximation. Figure shows a time series corresponding to Fig. using the hybrid model. The hybrid model was run using the semi-implicit Euler method, and for these parameter values, runs 3 times faster than full model. Visually, the agreement between the two methods appears good. To test the accuracy of the Euler method, we used BioNetS to construct 2-D histograms of R versus mRN Ar. The results for the discrete and hybrid models are shown in Figs. and . To construct these histograms 10, 000 oscillations were used. Excellent agreement between the discrete and hybrid model is seen. This indicates that the hybrid model is accurately sampling the steady-state distribution. To verify that the hybrid model faithfully captures the dynamics of the system, we computed the power spectra of both models. The results are shown in Figs. and . Again, excellent agreement is seen between the discrete and hybrid model.
Figure 7 2-D histograms of repressor mRNA number versus repressor protein number. Red indicates regions of large frequency (i.e., where the system spends a lot of time) and blue indicates regions of low frequency. Panel A is for the discrete system and Panel B (more ...)
Power spectra for the repressor protein number. Panel A is the discrete case and B is the hybrid model.
An engineered promoter system
Using standard techniques in modern molecular biology, it is possible to design novel systems of promoter-gene pairs, such that virtually any desired regulatory network architecture may be instantiated; such networks are often called "synthetic gene networks." Recent implementations have included direct negative [22
] and positive [23
] feedback, a bistable switch [12
], an oscillator [11
], an intercellular communication system [38
], and a bimodal self-activating system [39
In this example, we use BioNetS to implement a model of a simple, open-loop network based around a novel engineered promoter, which has been designed and constructed by N. Guido and J. J. Collins at Boston University. The promoter, called OROlac, combines the Olac, OR1, and OR2 operator sites, so that it is repressed by the lac repressor protein (LacI) and activated by the lambda repressor protein (CI); see Fig. . Experiments have been conducted in which the promoter, along with other sites to produce the activator and repressor proteins, is integrated into a high copy number plasmid and inserted into a strain of Escherichia coli. The promoter's activity is observed using a fluorescent reporter, Green Fluorescent Protein (GFP). A detailed modeling study with direct comparisons to experimental results has been carried out using a fully discrete stochastic approach, and will be reported elsewhere (McMillen et al., manuscript in preparation). Our goal here is to provide a reasonably complex test case to evaluate the performance of BioNetS.
Figure 9 Schematic of the OROlac engineered promoter. The promoter fuses three operator sites, one of which (Olac) yields repression when the LacI tetramer is bound to it, while binding of the CI dimer to OR2 yields approximately ten-fold activation (the OR1 site (more ...)
The processes to be captured by the model are: transcription and degradation of mRNA strands; translation of mRNA into protein; degradation of protein; formation of protein multimers (dimers in the case of CI, tetramers in the case of LacI); LacI binding to isopropyl-β-D-thiogalactopyranoside (IPTG), a chemical inducer that reduces LacI's binding affinity for Olac; and protein-DNA binding at the OROlac promoter's operator sites. We define the following chemical species: G, GFP; Mg, mRNA coding for GFP; X, CI monomer; X2, CI dimer; Mx, mRNA coding for CI; Dx, the arabinose-inducible pBAD promoter site producing CI; Y, LacI monomer; Y2, LacI dimer; Y4, LacI tetramer; I0, IPTG (present in massive excess and thus taken to be constant); YI, LacI tetramer bound to IPTG; My, mRNA coding for LacI; and Dy, the PLtetO1 site constitutively producing LacI. In addition to these, we define species D0 through D8, representing the various permutations of proteins bound to the three operator sites in the OROlac promoter (see Table for a list). There are twelve combinatorial possibilities, but we eliminate three of them on the basis that CI (X2) binding OR2 but notOR1 is unlikely, because of the low binding affinitity of CI for OR2 compared toOR1. Table also lists the effect on the basal rate of production when the promoter is in each state. This reflects the regulatory effect of the proteins; for example, CI bound to OR2 leads to a 10-fold increase in transcription rate, while LacI bound to Olac halts transcription completely (note that we assume in the event of simultaneous binding of activator and repressor, repression "wins" and transcription is halted).
The following irreversible reactions represent the processes of transcription, translation, and degradation:
As in previous reactions, the caligraphic letters represent individual molecules of each species. We scale all times and rates by the cell division time.
Experimental measurements generally provide equilibrium rather than rate constants, and thus when writing reversible reactions we use the following notational convention: a reaction with equilibrium constant K has forward rate constant KR and backward rate constant R, where R is a scaling factor which sets the speed at which the reaction approaches equilibrium (we will consider three values of R: 1, 10, and 100). Using this notation, we represent protein-protein binding with the following set of reactions
Finally, protein-DNA binding is given by:
In all, the system consists of 21 species, participating in 34 reactions. The reactions are entered into BioNetS using the same method described in the previous examples. We use BioNetS' ability to represent individual species as either discrete or continuous to formulate three versions of the model: fully discrete, fully continuous, and a hybrid version in which the DNA species D0 through D8 are discrete while all other species are continuous. We vary the value of R, the scaling factor for reversible reactions, and keep all other parameters fixed at the following nondimensionalized values: βg = 0.1, βy = 1, βx = 0.5, βT = 10, γmrna = 3.5, γprot = 0.7, Ky = 0.01, Ky2 = 0.1, KyI = 2 × 10-6, Kx = 0.05, K1 = 0.3, K2 = 2K1,K3 = 0.008, K4 = 1.4 × l0-4K3, I0 = 1 × 106.
To evaluate the steady-state probability distributions produced by the reaction system, simulations 250000 cell cycles in length were used to accumulate histograms (a built-in feature of BioNetS) of the number of molecules of GFP (species G), for each of the three versions of the model. As Fig. shows, the resulting distributions are essentially identical, indicating that the continuum approximations used in the fully continuous and hybrid forms of the model were valid. Not all species in the system are well approximated as continuous variables: Fig. shows the continuous probability distribution for species D8, representing a promoter fully populated with two CI dimers and a LacI tetramer-IPTG complex. This situation is very rare in our parameter regime, and the system spends essentially all of its time with D8 = 0. The fully continuous model, however, fluctuates into negative values, indicating that the continuum approximation has broken down. This does not significantly affect the distribution for GFP because the other, more common DNA states dominate the system's behavior; note, however, that if we were considering genomic DNA rather than a high copy number plasmid, we would not be able to employ a fully continuous model. The hybrid model, by treating the DNA species as continuous, eliminates the fluctuations into negative values. In general, the appropriate approximations will depend on both the system and the variables of interest: in the present example, if we were interested in the behavior of the operator sites themselves, we would not be able to use the fully continuous version of the model, but as a model solely of GFP expression the approximation suffices. Comparisons between types of models should be made to test the underlying assumptions, and BioNetS facilitates this process.
Figure 10 A) Probability densities of GFP molecules, for three versions of the OROlac promoter model. Densities were generated by accumulating statistics in runs 250000 cell cycles in duration. (Solid line) Fully discrete model. (Dashed line) Fully continuous model. (more ...)
We used simulations 200 cell cycles in length to test the speed at which the three model versions ran. In each case, 200 simulations were run using a consistent set of 200 different random seeds; all runs were started with identical initial conditions. For the fully continuous and hybrid systems, the semi-implicit scheme was numerically stable and yielded consistent histograms for all time step sizes between dt = 0.001 and dt = 0.5, but the latter corresponds to just two time points per cell division cycle (recall that all times are scaled by the cell division time), and we chose instead to sample 20 points per cycle and set dt = 0.05. As shown in Table , the fully continuous method was always fastest, with the degree of improvement over the exact, fully discrete method depending strongly on the value of R, the scaling factor for the reversible reaction rates. For R = 1, the fully continuous method was only 1.4-fold faster than the fully discrete method, but as R is increased this speed advantage increases to over 4-fold at R = 10, then to over 30-fold at R = 100. (Note that the speed advantage of the fully continuous over the fully discrete method increases with the abundances of the chemical species. Shifting parameters to generate higher protein numbers can yield cases in which the continuum approximation is hundreds of times faster than the discrete approach; runs not shown here.) Use of a hybrid discrete/continuous method did not, for this particular model system, offer any speed gain over the fully discrete approach; the increased time involved in computing the Jacobian for the semi-implicit method is more time-consuming than simply simulating the reactions directly. Optimizing efficiency requires testing various potential approaches, and BioNetS makes this a simple process.
Execution times for three versions of the OROlac promoter model.