Home | About | Journals | Submit | Contact Us | Français |

**|**Bioinformatics**|**PMC2935439

Formats

Article sections

- Abstract
- 1 INTRODUCTION
- 2 MODEL AND INVERSION ANALYSIS
- 3 BAYESIAN EXPERTS ON KINETIC LEARNING
- 4 APPLICATION
- 5 CONCLUDING REMARKS
- Supplementary Material
- REFERENCES

Authors

Related links

Bioinformatics. 2010 September 15; 26(18): i589–i595.

Published online 2010 September 4. doi: 10.1093/bioinformatics/btq389

PMCID: PMC2935439

The Institute of Statistical Mathematics, Research Organization of Information and Systems, 10-3 Midori-cho, Tachikawa, Tokyo, 190-8562, Japan

* To whom correspondence should be addressed.

Copyright © The Author(s) 2010. Published by Oxford University Press.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article has been cited by other articles in PMC.

**Motivation** Biochemical reactions in cells are made of several types of biological circuits. In current systems biology, making differential equation (DE) models simulatable *in silico* has been an appealing, general approach to uncover a complex world of biochemical reaction dynamics. Despite of a need for simulation-aided studies, our research field has yet provided no clear answers: how to specify kinetic values in models that are difficult to measure from experimental/theoretical analyses on biochemical kinetics.

**Results:** We present a novel non-parametric Bayesian approach to this problem. The key idea lies in the development of a Dirichlet process (DP) prior distribution, called *Bayesian experts*, which reflects substantive knowledge on reaction mechanisms inherent in given models and experimentally observable kinetic evidences to the subsequent parameter search. The DP prior identifies significant local regions of unknown parameter space before proceeding to the posterior analyses. This article reports that a Bayesian expert-inducing stochastic search can effectively explore unknown parameters of *in silico* transcription circuits such that solutions of DEs reproduce transcriptomic time course profiles.

**Availability:** A sample source code is available at the URL http://daweb.ism.ac.jp/~yoshidar/lisdas/

**Contact:** pj.ca.msi@radihsoy

Cells can sense many different environmental signals, and respond to extraneous stimuli by switching activation/inactivation of specific molecules, such as transcription factors, via several types of biological circuits. Kinetic biochemical modelling of such reaction circuits (often called biochemical reaction *networks*) using DEs realizes experiments *in silico* to enhance our understanding of the complex dynamic systems (Alon, 2006; Chen *et al.*, 2004; Wilkinson, 2006). It enables us to test a plausibility of current knowledge and hypothetical models on biological circuits and to create new further models when inconsistencies arise in simulation, domain knowledge and experimentally observed quantitative data (Yoshida *et al.*, 2008).

Despite of a growing need for simulation-aided studies on bio-pathways, some fundamental issues prevent us from drawing their full potentials. One critical issue addressed here is related to uncertainty in specified kinetic parameters of models: in many cases, we have no clear answer on how to specify kinetic values and initial conditions on simulations because biochemical reaction kinetics differ depending on a cell-by-cell basis and intra-/extracellular environments such as temperatures, presence/absence of biochemical components in specific tissues.

With the aim to explore kinetic parameters in *in silico* biological circuits that make solutions of DEs fit experimental data, a range of advanced technologies have appeared in the last several years of computational systems biology, involving genetic algorithm (GA; Kikuchi *et al.*, 2003; Kimura *et al.*, 2004; Koh *et al.*, 2006), sequential Monte Carlo (SMC; Nagasaki *et al.*, 2006; Nakamura *et al.*, 2009), MCMC-driven parameter search methods (Finkenstädt *et al.*, 2008) and also gradient descent algorithms (Cao *et al.*, 2008; Fujarewicz *et al.*, 2007; Yoshida *et al.*, 2008). Quantitative data, such as transcriptomic profiles and protein expression values, are usually measured over times. In general, a Bayesian statistical method involves all the preceding methods as a unified framework: any estimation tasks are treated as evaluations of posterior distributions in which prior distributions embodying our prior belief on reaction kinetics are adapted to experimental data using likelihood functions. As demonstrated later, however, the practical relevance of the existing approaches remains far below what practical analyses require. Our view on this is that such a low performance depends highly on the lack of effective biochemical kinetic priors.

This article presents a previously unexplored class of prior distributions, called Bayesian experts on biochemical reaction kinetics. The key notion of our approach can be found in the construction of Dirichlet process (DP) prior distribution (Blei *et al.*, 2005; Escobar *et al.*, 1995; MacEachern *et al.*, 1998) whose base measure *G*_{0}(θ) is modelled by a mixture of *m* Gibbs distributions:

(1)

with mixing rates α_{i} (*i* = 1,…, *m*), temperature *T* > 0 and partition functions *Z*_{i}. The *m* potential functions ϕ_{i}(θ) defined on unknown parameters θ reflect substantive knowledge on reaction kinetics for a given model such that regions of interest in the parameter space are prescribed prior to a posterior analysis. In many applications, we are able to collect several kinetic evidences on such local regions by observing temporal patterns of experimental data. For instance, rates of increasing and/or decreasing to which DE solutions have to match can be guessed in advance by observing steepness of relevant local patterns of given time course data. Alternatively, maximum and minimum levels of gene expression data provide a helpful clue to determinations of steady-state levels to which a transcription circuit model reaches in equilibrium. The Gibbsian mixture distribution, enabling Bayesian expert system in kinetic explorations, aggregates such *m* constraint conditions on the kinetic parameters that are induced automatically in light of experimental data and reaction speeds inherent in given models.

This article describes our Bayesian experts, focusing on key examples of specific types of *in silico* transcription circuits. Transcription factor activities are modelled on Hill functions and mass action kinetics (Alon, 2006; Wilkinson, 2006) as described in Section 2. To emphasize the ability of the Bayesian expert prior, we explore unknown parameters using a simplified stochastic search algorithm without using more advanced computational techniques. The DP prior with additional technical details is introduced in Section 3. Experimental results appear in Section 4, demonstrating drastic efficiency gains from uses of the Bayesian experts and a benchmark on computational times. Section 5 concludes with more generalization of our approach. Supplementary Material and a sample program—called LiSDAS (Life Science Data Assimilation Systems)—are accessible at the Supplementary web site. LiSDAS features some basic functions to explore kinetic parameter values for given data and a model. The user can develop biological circuit models with the LiSDAS model construction file in place.

The proposed Bayesian method will be derived and described, in later sections, with examples on analyses of transcriptional reaction kinetics. Before that, we briefly summarize specific forms of the DEs making up elements of the transcription circuit models.

Control of transcription factor-inducing gene expressions is a key regulatory mechanism in cells. Transcription factors can sense external stimuli of cells by mediations of signal transduction pathways. A transcription factor—activated by the transmitted signals—acts as either activator or repressor in transcription processes of specific target genes with many recruited enzymatic activities at the promoter regions. One commonly used DE for describing the transcriptional process is Hill function which is in theory derived by considering an equilibrium binding of a transcription factor to its target site on the promoter (see Appendix of Alon, 2006). The time evolution of mRNA concentration [*X*] is modelled as

where the mRNA decay rate and the maximal rate of transcription are denoted by ψ_{rd} and ψ_{tc}. In addition to this, some studies commonly include a baseline ψ_{b} (e.g. Cantone *et al.*, 2009). The Hill functions involving the transcriptional activators and repressors are

(2)

(3)

The unknown kinetic parameters σ_{i} and *K*_{i} control, respectively, location and steepness of each Hill function. As σ_{i} gets larger, the Hill function becomes steeper and looks more like the indicator function *I*(·), taking the value one if argument is true, otherwise zero. The multiple set of the Hill functions totally defines the AND gate that describes an abrupt increasing/decreasing of mRNA transcript level in response to excess concentrations of the regulatory proteins in and .

Multimeric molecules can make up a transcription factor *complex* in any of transcription regulatory circuits. In modelling, we need to build a DE describing molecular binding reactions, such as protein–protein binding or one or more proteins interacting with small molecules. Instantaneous rate of a binding reaction is described here by mass action kinetics

(4)

In the formation of a complex *C*, the binding efficiency among *d* molecules, *P*_{i} (*i* = 1,…, *d*), is determined on the affinity proportional to the active masses with the binding rate constant ψ_{cb} and powers τ_{i}.

An mRNA transcribed at level [*X*] proceeds to its translation process. The model that we employ describes a production of protein *P* simply as

(5)

Instantaneous rate of translation is represented by the linear functions in the first two terms with the rates of degradation ψ_{pd} and translation ψ_{tl}, respectively. The last term involving (4) appears in (5) when the protein *P* binds to the other *d* − 1 molecules {*P*_{i}|*P*_{i} ≠ *P*, *i* = 1,…, *d*} in system. Note that we omit dissociation processes of the protein–protein complex *C*, relying on the assumption that molecular stability of *C* is much higher and reaction kinetics involved in the dissociation is negligible in the entire system.

A biological circuit is modelled as a set of DEs d*x*_{i}(*t*)/d*t* = *f*_{i}(pa_{i}(*x*), ψ) defining rates of change in evolving concentrations of the *p* biological entities, *x*(*t*) = (*x*_{i}(*t*))_{1≤i≤p}, over continuous times *t* . In the above context, each endogenous variable involved in a bio-pathway makes up a state variable *x*_{i}(*t*). The *i*-th state variable is regulated by the parent variables pa_{i}(*x*)—a subset of the state variables appearing in the right-hand side of each DE—with the rate equation *f*_{i} defined by the kinetic parameters ψ. Conduction of quantitative experiments along times enables us to measure changes of the *unknown* state variables, *x*(*t*), directly or indirectly via observed data *y*(*t*) = (*y*_{i}(*t*))_{1≤i≤p} _{+}^{p} at discrete time points *t* . Here, to cope with partially observed systems where data of the state variables are in part missing, it is helpful to define a set of observable variables, *i* {1,…, *p*}. For instance, in transcriptomic experiments using DNA microarrays, it is the case that amounts of protein expression levels are unobservable.

To proceed with an inversion analysis, the DEs are related to the experimental data using a state space model:

(6)

(7)

where *w*_{i}(*t*) denotes measurement error independently and identically distributed.

The processes *Y* = {*y*_{i}(*t*)|*t* , *i* } and *X* = {*x*(*t*)|*t* } follow (6) and (7) with initial state variables *x*(0) having a certain prior distribution *x*(0) ~ *p*(*x*(0)). Bayesian inversion analysis explores all the unknown parameters in the model—initial states *x*(0) and kinetic parameters ψ—based on the posterior distribution *P*(*x*(0), ψ|*Y*, Pa) under which a circuit structure Pa = {pa_{i}(*x*)}_{1≤i≤p}—a collection of parent variable sets—is specified and a priori knowledge on reaction kinetics is expressed via a prior distribution *P*(ψ). Rather than *P*(*x*(0), ψ|*Y*, Pa), in many cases, it is of more interest to evaluate the *augmented* posterior distribution *P*(*x*(0), ψ, Pa|*Y*) where a circuit structure is unknown or unreliable. In what follows, we focus on the estimation of ψ with a given circuit structure. LiSDAS, however, has a pre-mature function for structural search of reaction circuits.

All the unknown parameters relevant to an analysis are expressed by θ, i.e. θ = {ψ, *x*(0)} or augmented parameters involving circuit structures Pa. In usual, an analytical form of the posterior distribution *P*(θ|*Y*) *P*(*Y*|θ)*P*(θ) is unavailable due to non-linearity of DEs, thereby driving a need for efficient posterior approximation techniques in solving the inversion problems—search for posterior means or modes. The recent systems biology has explored parallel developments of SMC, MCMC-driven posterior explorations and gradient descent methods. However, with a view to scaling to larger numbers of unknown parameters, efficient and effective computation remains a challenge.

Our study started from Nagasaki *et al.* (2006) that applied a simplest SMC to the stochastic search for transcriptional reaction kinetics. The use of SMC algorithm had been suggested also by Quach *et al.* (2007) whereas they actually applied the unscented Kalman filter as a variant of conventional SMC. A key notion of the SMC and MCMC-driven search—a more efficient counterpart of SMC—is that the posterior distribution is approximately evaluated on a finite number of Monte Carlo samples called *particles* such that the empirical distributions are ensured to reach the true posterior distribution as the number of particles approaches to infinity. Whenever following such an approach, however, a quite huge number of particles are needed to draw as raising the number of unknown parameters. Nagasaki *et al.* (2006) and Yoshida *et al.* (2008) reported that the practical relevance of the SMC was limited by estimations of only a few parameters.

Some early studies focused on GA (Kikuchi *et al.*, 2003; Kimura *et al.*, 2004; Koh *et al.*, 2006) that explores values of parameters by alternating crossover and mutation of initially generated seeds for the parameters. With no loss of generality, any optimization problems for which GA aims to solve can be converted to posterior mode searches in Bayesian context. The key for success in GA is whether or not an initial set of candidate parameters lies in the region close to the posterior mode. According to Higuchi (1997), the intrinsic nature of GA is essentially the same to the SMC. Hence, the low practical relevance of the SMC (Nagasaki *et al.*, 2006; Yoshida *et al.*, 2008) can be a rule of thumb for a performance of GA.

The substantial merit of Bayesian kinetic inference is in the use of prior distributions, *P*(θ), for conducting effective regularization in the inversion analysis. To the best of our knowledge, however, no one has addressed this fundamental issue while the range of cutting-edge *computational* technologies have appeared over the last several years. The Bayesian expert system that we present is the first attempt to make a class of prior distributions aiming to enhance the power of previously explored class of kinetic search methods.

A prior distribution *P*(θ) on the unknown parameters is modelled by a DP written as

The distribution *G* that θ follows is treated as an infinite dimensional random variable distributed according to the DP—*DP*(*G*|γ, *G*_{0}). The DP is defined by the two parameter: base measure *G*_{0} and concentration parameter γ ≥ 0. The *G*_{0} is a distribution function defined on the support same as *G*. The smaller γ is, the more *G* looks like the *G*_{0}, and vice verse as larger γ. A key notion of DP is in a discrete representation of probability measure; *G* is decomposable via the sum of the Dirac measures δ_{θi}(·) at the infinite set of atoms θ_{i} ~*G*_{0} (*i* = 1,…, ∞). The explicit form is hence given as the infinite mixture *G*(θ) = ∑_{i=1}^{∞} π_{i}δ_{θi}(θ) with the mixing rates, π_{i} ≥ 0 and ∑_{i= 1}^{∞} π_{i} = 1. The infinite sequence of θ_{i} can be made by successively conducting stick breaking construction or Chinese restaurant process (Blei *et al.*, 2005; Escobar *et al.*, 1995).

Of the base measure with the Gibbsian mixture (1), the *m* potential functions represent a set of implicit functions ϕ_{i}(θ) ≈ 0 (*i* = 1,…, *m*) that specifies regions of interest in the kinetic parameters. We here describe a simple idea on how to make such potentials with an example of the transcription regulatory models in Section 2.1.

In many applications, maximal or minimal level of each observable variable can be a helpful kinetic evidence to prescribe steady-state conditions on which each variable should reach in simulation. We illustrate some examples on how to make such prior distributions.

The transcriptional process shown in Section 2.1 with each Hill curve varying sigmoidally between 0 and 1 defines the maximal production rate as

where the AND gate opens. The equilibrium condition d[*X*]/d*t* = 0 induces the steady-state level of [*X*]—the maximal level of the ODE solution—as the ratio of the production and dilution rates [X]_{max} = (ψ_{tc} + ψ_{b})/ψ_{rd}. If time course profiles are available on the mRNA, we are able to evaluate, in some ways, the maximal expression level of each mRNA, *y*_{max}. Then, all possible configurations of {ψ_{tc}, ψ_{rd}, ψ_{b}} such that *y*_{max} (ψ_{tc} + ψ_{b})/ψ_{rd} can be eliminated in advance by making an arbitral potential function. For instance, we will use, in later, ϕ_{max} = (λ*y*_{max} − (ψ_{tc} + ψ_{b})/ψ_{rd})^{2} with a given discount factor λ [1, ∞). Solutions with the parameters satisfying ϕ_{max} ≈ 0 yield trajectories saturated around at *y*_{max} when λ = 1. The specified λ = 1 is to express the prior belief that the observed maximal level results in an equilibrium state to which the underlying dynamic system reaches. The restriction can be weaken/strengthen by the control of the temperature parameter *T*. By analogy, the minimal expression level *y*_{min} can also be associated with the minimal production rate on the model with the closed AND gate, −ψ_{rd}[*X*] + ψ_{b} = 0, which derives a potential function ϕ_{min} = (λ*y*_{min} − ψ_{b}/ψ_{rd})^{2}.

Relevant to the mass action kinetics (4) and the translation process (5) that involve protein-level reactions, we can often specify the maximal/minimal expression level of protein in data-driven ways or empirical bases. Conventional DNA microarray experiments measure transcript levels of mRNA molecules in arbitrary units, and hence any kinetic parameters in a fitted model result in unit free as well as quantities of regulatory proteins. A natural way to cope with such cases is to put an arbitrary chosen maximal/minimal value to the solutions for (4) and (5). The DE (5) attains the maximal speed of translation when the maximal level of mRNA, [*X*]_{max} = (ψ_{tc} + ψ_{b})/ψ_{rd}, is specified:

where the minimal protein concentrations [*P*_{i}]_{min} are specified arbitrary in the rest of *d* − 1 reactants. Given this, a maximal level *p*_{max} is then related to the equilibrium state ψ_{tl}[*X*]_{max}/(ψ_{pd} + ψ_{cb} _{Pi≠P}[*P*_{i}]_{min}), which makes a potential function ϕ_{max} = (λ*p*_{max} − ψ_{tl}[*X*]_{max}/(ψ_{pd} + ψ_{cb} _{Pi≠P}[*P*_{i}]_{min}^{τi}))^{2} with a discount factor λ. By analogy, the solution of (4) describing a complex formation attains at the maximal rate of production with [*P*_{i}]_{max} (*i* = 1,…, *d*). The steady-state condition induces an expert ϕ_{max} = (λ*c*_{max} − ψ_{cb} _{i=1}^{d}[*P*_{i}]_{max}^{τi}/ψ_{cd})^{2} with an arbitrary specified maximum value of the complex, *c*_{max}.

As well as the maximal and minimal levels of production, experimental data provide us a further clue to what temporal patterns—reaction speeds—DEs must exhibit in simulation. Suppose that a temporal profile of mRNA is observed as shown in Figure 1, showing increasing and decreasing trends alternatively. Gradients of any fitted DE models should reproduce the observed steepness during each trend.

Schematic expression on making Bayesian experts for (i)–(*ii*) maximal/minimal production levels, and (iii) local reaction speeds.

Of many possible ways to develop such experts, we present a practically useful, simple procedure that is implemented on LiSDAS. The first task to be addressed is a decomposition of an entire time interval into *active* and *inactive* durations according to observed gene expression profiles, and values of gradients or steepness in each of the identified phases. The proposed procedure starts with fitting a non-parametric regressor, *f*_{i}(*t*) = μ_{i} + *w*_{i}^{T}*g*(*t*) + *v*_{i}(*t*), to the *i*-th mRNA profile, *y*_{i}(*t*) (*t* ), where μ_{i} and *v*_{i}(*t*) are intercept and residual, respectively. The *J* basis functions placed at equally-spaced grids in are aggregated to *g*(*t*) ^{J}, and *w*_{i} ^{J} is a vector of coefficients. LiSDAS fits the Gaussian radial functions by using _{2} regression scheme. A value of bandwidth common to all basis functions and a shrinkage parameter in the _{2} regression are prescribed by users.

The fitted curve offers a gradient function over time, ŵ_{i}^{T} (d*g*(*t*)/d*t*). Let *D*_{on} and *D*_{off} be *duration* times of increasing and decreasing trends that are identified as subsets of the entire time interval. Each duration keeps a same sign of the evaluated gradients during the consecutive times. The key idea here is to derive a set of experts such that each potential function reflects a reaction speed of increasing or decreasing during each of the identified local times.

The Hill curve transcription dynamics in Section 2.1 are locally linear where the concentration levels of all the transcription factors rise much above or fall much below the relevant *K*_{i}s. At the reaction speeds of transcriptional process with open/closed AND gates, we have the solutions:

where [*X*]_{0} is an initial condition.

Denote the start and end times of each identified duration by (*t*_{start}^{on}, *t*_{end}^{on}) for *D*_{on} and (*t*_{start}^{off}, *t*_{end}^{off}) for *D*_{off}, respectively. Of interest here is to identify regions of the parameter spaces such that the simulations instantaneously reproduce the observed reaction speeds—, respectively, for *t* [*t*_{start}^{on}, *t*_{end}^{on}] or *t* [*t*_{start}^{off}, *t*_{end}^{off}]. Specify and to [*X*] and [*X*]_{0} in the former ODE solution corresponding to the active duration where *t* is set to the length of the duration, *t*_{on} = *t*_{end}^{on} − *t*_{start}^{on}. An expert on reaction speed is then made of the potential function:

By analogy, the implicit function relevant to the latter solution induces the expert on speed in the inactive duration with *t*_{off} = *t*_{end}^{off} − *t*_{start}^{off}:

These two potential functions form elements of a Gibbsian distribution in various ways, for instance exp(−ϕ_{off} × ϕ_{on}/*T*). Alternatively, we would develop an expert based on either exp(−ϕ_{off}/*T*) or exp(−ϕ_{on}/*T*). In advance to the posterior exploration, LiSDAS generates many experts for all the identified duration times by joining/dissociating ϕ_{on} and ϕ_{off} at random.

Figure 1 summarizes all the potential functions that we use in later and a schematic view on the idea. It is important to see that the preceding potential functions constitute, with their *combination*, a single potential function in (1). For instance, the potential functions relevant to each mRNA—ϕ_{max}, ϕ_{min}, ϕ_{on} and ϕ_{off}—are defined on the transcriptional parameters (ψ_{rd}, ψ_{tc}, ψ_{b}), while the other potentials involve only the kinetics relevant to protein regulations. LiSDAS draws at random potential functions and conventional Gaussian priors (the user can specify in the model construction file) for each set of kinetic parameters. Each ϕ_{i} in (1) is then constructed by aggregating the extracted component prior distribution corresponding to different types of kinetic parameters.

DP is discrete with probability one, and represented as an infinite mixture *P*(θ|Θ_{∞}) = ∑_{i=1}^{∞} π_{i}δ_{θi}(θ) conditional on infinite number of the atoms, Θ_{∞} = {θ_{i}|*i* = 1,…, ∞} following *G*_{0}. Hence, the time course data also follow the infinite mixture

Of interest in the subsequent posterior analysis is to realize random sample from the posterior distribution of Θ_{∞}, *a posteriori*, and also the mixing rates π_{i}, conditional on *Y*. The Bayesian model in total forms a DP mixture (DPM) with the key hierarchical structure: (i) *G* ~ DP(*G*|γ, *G*_{0}), (ii) θ|*G* ~ *G* and (iii) *Y*|θ ~ *p*(*Y*|θ). In statistical science, there have been several excellent reviews, providing technical details on DPM (Blei *et al.*, 2005; Escobar *et al.*, 1995; MacEachern *et al.*, 1998). We therefore present only a specific computational algorithm.

The procedure is computationally very straightforward. Sample from the posterior is constructed in a recursive way as a conventional sampling algorithm does (e.g. Escobar *et al.*, 1995). Suppose that, in a current *distinct* parameter set, Θ_{k} = {θ_{i}|*i* = 1,…, *k*} with the realized values being all distinct, there are *n*_{i} occurrences of each θ_{i}. The total sample size is *N*_{k} = ∑_{i=1}^{k} *n*_{i}. Then, the (*k* + 1)-th case is newly generated by the conditional posterior (θ_{k+1}|Θ_{k}, *Y*) given as the mixture

with the mixing rates given by

The first *k* component distributions consist of the Dirac measures δ_{θj}(θ_{k+1}) placed at the preceding sample points, with the mixing rates involving the frequency of sampling *n*_{j} and the likelihood *P*(*Y*|θ_{j}). The last component represents the conditional posterior *P*(θ_{k+1}|*G*_{0}, *Y*) *P*(*Y*|θ_{k+1})*G*_{0}(θ_{k+1}), and the normalizing constant cannot be evaluated analytically. If a sample θ_{k+1} is drawn from one of the preceding samples, the occurrence rate is updated, *n*_{j} → *n*_{j} + 1. Given a new θ^{*} coming from the (*k* + 1)-th distribution, we increment the current set such that Θ_{k+1} = Θ_{k} + {θ^{*}} and *n*_{k+1} = 1. Random draws having the higher likelihood *P*(*Y*|θ) appear more frequently and result in an approximation of the posterior distribution *P*(θ|*Y*) as the sampling process proceeds iteratively until convergence.

The remaining task to be addressed is to make a sample from *P*(θ_{k+1}|*G*_{0}, *Y*), and evaluation of its normalizing constant. We here provide a simple Monte Carlo technique using importance sampling (IS). One difficulty is relevant to making a sample from the Gibbsian mixture *G*_{0} ∑_{i=1}^{m} α_{i}*Z*_{i}^{−1} exp(−*T*^{−1} ϕ_{i}(θ)). Our approach is very straightforward: (i) conduct the second-order approximation of each potential function around with Φ_{i}(θ) the Hessian matrix of ϕ_{i}, and (ii) the proposal distribution of IS is then defined as the Gaussian distribution with the covariance matrix given by the inverse of Hessian matrix times *T*. Let *Q*(θ) = ∑_{i=1}^{m} α_{i}*Q*_{i}(θ) be the Gaussian mixture proposal distribution. Assuming that we have a realization of *k* samples from the *Q* and the importance weights β_{i} = *G*_{0}(θ_{i}) / *Q*(θ_{i}) (*i* = 1,…, *k*), then the normalization constant appearing in the mixing rate of the (*k* + 1)-th component can be approximated by (∑_{j=1}^{k}β_{j})^{−1} ∑_{j=1}^{k} β_{j}*P*(*Y*|θ_{j}). The approximated mixing rates are then proportional to

One of the previously obtained *k* samples is replicated if a sampling process draws a component coincident to one of {1,…, *k*}. If not, we add a new case θ_{k+1} = θ^{*} from *Q*, and then assign the likelihood *P*(*Y*|θ_{k+1}) and the IS weight β_{k+1}. All the cases, whether new or old, are independently distributed according to the *Q*, thereby ensuring the statistical consistency of the Monte Carlo approximation of .

An example in kinetic search of a circadian clock transcriptional circuit illustrates the practical relevance of the Bayesian expert systems, and we make a comparison to the non-expert approach using a conventional SMC (Nagasaki *et al.*, 2006; Nakamura *et al.*, 2009). Our model considered here is an extended version of the gene regulatory model developed in Matsuno *et al.* (2000) where among several studies used in evaluating relevance of various parameter search algorithms (Nagasaki *et al.*, 2006; Nakamura *et al.*, 2009; Yoshida *et al.*, 2008). The original model describes a reaction circuit of 12 endogenous variables, involving five mRNAs, five translated proteins and two protein complexes, which constitute an interlock of several types of positive and negative feedback loops. We here added, to the previous model, seven clock-related variables and their regulatory mechanisms, reflecting in part the recent discoveries (Baggs *et al.*, 2009; Ueda *et al.*, 2005). The newly constructed DE model—its mathematical form is available in the Supplementary Material—contains 116 unknown parameters in total, consisting of 97 kinetic parameters and 19 initial conditions on *x*_{(0)}.

The search methods were applied to time course gene expression indices from GeneChip mouse genome microrrays (Ueda *et al.*, 2002). Of the 19 variables, temporal expression changes of the seven mRNAs were measured along 12 time points equally spaced on 44 h (http://sirius.cdb.riken.jp/MouseSCN/MouseSCNCCG(020707).html). A much detailed information on the specified prior distributions—values for γ, temperature and conventional Gaussian priors used in combinations—is accessible from the Supplementary web site, as well as the LiSDAS code that can reproduce the subsequent numerical analyses with a default configuration file.

Figure 2 displays simulation trajectories of the 19 endogenous variables, where the identified kinetic parameters and initial state variables were defined as the θ_{i} exhibiting the best fit in terms of *P*(*Y*|θ_{i}). The DEs were solved using the Euler discretization with 700 time steps equally spaced on the entire time interval. CPU time required for the execution of LiSDAS was ~40 min (Intel Core2 Quad processor, 2.66 GHz) for the particles of size *k* = 5 × 10^{5}. Compared with the very poor results of the non-expert SMC method using only the conventional Gaussian priors, the LiSDAS could capture the underlying circadian rhythm inherent in the observed data with much better degree of fitness. Note that the applied search procedure is almost the same as a random search. The particles were all drawn from *G*_{0}, and then the best fit value was merely chosen. Therefore, the good performance arose from only the added Bayesian expert system. To the best of our knowledge, the methods being able to estimate such many parameters have never appeared in the related research field, while so far, our previous study (Yoshida *et al.*, 2008) and the GA of Koh *et al.* (2006) could draw reasonably good performances in analyses of a transcription model and *in silico* signal transduction pathway, containing, 44 and 84 unknown parameters respectively.

This brief article has described the new idea on data-driven parameter search for bio-pathway DE models. The key contributions lie in the novel Bayesian expert systems induced by the base measure of DP prior. The expert systems rely on the Gibbsian potential functions that reflect automatically recognized model-specific contexts—reaction speeds, equilibrium states—to the subsequent search process so as to prune insignificant parameter spaces. The ability to aid in improving efficiency has been demonstrated in the example on transcriptional regulatory models. The practical benefit was surprisingly significant; the applied simple search algorithm could draw fairly good performances only by a plug-in of the expert systems.

With a view to general versatility of the current approach, developments of more practically relevant expert systems remain a challenge yet. A biological circuit is made of several types of biochemical reactions—phosphorylation cascade-mediated signal transductions and metabolic reactions—other than the transcriptional processes we studied. In many cases, these reactions are modelled by more complex equations, being either stochastic or non-stochastic, than the Hill functions used in this article. Though it would be straightforward, even for more complex DE systems, to relate observed maximal/minimal production levels to steady-state conditions, some difficulties arise in evaluations of experts on reaction speeds with a given more complex model due to unavailability of the analytical expression for the reaction speeds. We then need to explore some numerical techniques for making expert system priors.

Making further variants in expert construction is possible, and will then be inherently context specific. In view of biochemical kinetics, it would be more helpful to incorporate, to further prior modelling, differences of time scales on each reaction speed, such as difference in molecular stability of different proteins and mRNAs. Signal transduction pathways usually change in transcription factor activities on subsecond time scales. Binding of transcription factor to its target promoter reaches equilibrium in seconds. Transcription and translation of genes take many minutes in reaching steady state. We need to explore ways of reflecting these time scales to subjective prior distributions expressing a biologically significant subspace of kinetic parameters.

Another important issue is unreliability of model. *In silico* circuits embody currently obtained knowledge on many pairs of interacting molecules. In many applications, such a circuit structure is totally unreliable because of environmental dependency and diversity of cells. Very often, simulations fail to reproduce experimental data irrelevant to which kinetic parameters are specified. The inconsistency indicates presence or absence of some regulatory mechanisms to be reflected into model reconstruction. Recent systems biology therefore has explored a structural learning of reactions circuits in data-driven ways (Cantone *et al.*, 2009; Cao *et al.*, 2008; Porreca *et al.*, 2010). LiSDAS with the procedure specification file in place can explore the most plausible circuit with the highest degree of fit as well as kinetic values by adding/deleting edges completely at random. However, with a view to larger scale models, the ability of such a low-level search method reaches a performance limit due to a huge space of possible edge configurations. Our next challenge is the development of Bayesian experts in exploring circuit structure.

*Conflict of Interest*: none declared.

- Alon U. An Introduction to Systems Biology. London, UK: Chapman and Hall; 2006.
- Baggs JE, et al. Network features of the mammalian circadian clock. PLoS Biol. 2009;7:e1000052. [PMC free article] [PubMed]
- Blei DM, et al. Variational inference for Dirichlet process mixtures. Bayesian Anal. 2005;16:121–144.
- Cantone I, et al. A yeast synthetic network for
*in vivo*assessment of reverse-engineering and modeling approaches. Cell. 2009;137:172–181. [PubMed] - Cao J, et al. Estimating dynamic models for gene regulation networks. Bioinformatics. 2008;24:1619–1624. [PMC free article] [PubMed]
- Chen KC, et al. Integrative analysis of cell cycle control in budding yeast. Mol. Biol. Cell. 2004;15:3841–3862. [PMC free article] [PubMed]
- Escobar MD, et al. Bayesian density estimation and inference using mixtures. J. Am. Stat. Assoc. 1995;90:577–588.
- Koh G, et al. A decompositional approach to parameter estimation in pathway modeling: a case study of the Akt and MAPK pathways and their crosstalk. Bioinformatics. 2006;22:e271–e280. [PubMed]
- Finkenstädt B, et al. Reconstruction of transcriptional dynamics from gene reporter data using differential equations. Bioinformatics. 2008;24:2901–2907. [PMC free article] [PubMed]
- Fujarewicz K, et al. Adjoint systems for models of cell sinaling pathways and their applications to parameter fitting. IEEE/ACM Trans. Comput. Biol. Bioinformatics. 2007;4:322–335. [PubMed]
- Higuchi T. Monte Carlo filter using the genetic algorithm operators. J. Stat. Comput. Simul. 1997;59:1–23.
- Kikuchi S, et al. Dynamic modeling of genetic networks using genetic algorithm and S-system. Bioinformatics. 2003;19:643–650. [PubMed]
- Kimura S, et al. Inference of S-system models of genetic networks using a cooperative coevolutionary algorithm. Bioinformatics. 2004;21:1154–1163. [PubMed]
- MacEachern SN, et al. Estimating mixture of Dirichlet process models. J. Comput. Graph. Stat. 1998;7:223–238.
- Matsuno H, et al. Hybrid Petri net representation of gene regulatory network. Pac. Symp. Biocomput. 2000;5:341–352. [PubMed]
- Nagasaki M, et al. Genomic data assimilation for estimating hybrid functional petri net from time-course gene expression data. Genome Inform. 2006;17:46–61. [PubMed]
- Nakamura K, et al. Parameter estimation of in silico biological pathways with particle filtering towards a petascale computing. Pac. Symp. Biocomput. 2009;14:227–238. [PubMed]
- Perrin BE, et al. Inference of gene regulatory network using dynamic Bayesian networks. Bioinformatics. 2003;19:ii138–ii148. [PubMed]
- Porreca R, et al. Identification of genetic network dynamics with unate structure. Bioinformatics. 2010;26:1239–1245. [PubMed]
- Quach M, et al. Estimating parameters and hidden variables in non-linear state-space models based on ODEs for biological networks inference. Bioinformatics. 2007;23:3209–3216. [PubMed]
- Ueda HR, et al. A transcription factor response element for gene expression during circadian night. Nature. 2002;418:534–539. [PubMed]
- Ueda HR, et al. System-level identification of transcriptional circuits underlying mammalian circadian clocks. Nat. Genet. 2005;37:187–192. [PubMed]
- Wilkinson DJ. Stochastic Modelling for Systems Biology. London, UK: Chapman and Hall; 2006.
- Yoshida R, et al. Bayesian learning of biological pathways on genomic data assimilation. Bioinformatics. 2008;24:2592–2601. [PubMed]

Articles from Bioinformatics are provided here courtesy of **Oxford University Press**

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |