Home | About | Journals | Submit | Contact Us | Français |

**|**BMC Syst Biol**|**v.4; 2010**|**PMC2955603

Formats

Article sections

- Abstract
- Background
- Methods
- Results
- Discussion
- Conclusions
- Authors' contributions
- Supplementary Material
- References

Authors

Related links

BMC Syst Biol. 2010; 4: 130.

Published online 2010 September 22. doi: 10.1186/1752-0509-4-130

PMCID: PMC2955603

Sophie Lèbre,^{1,}^{2} Jennifer Becq,^{3,}^{4,}^{5} Frédéric Devaux,^{6} Michael PH Stumpf,^{}^{#}^{1,}^{7} and Gaëlle Lelandais^{}^{#}^{3,}^{4,}^{5}

Sophie Lèbre: rf.artsinu.srnc-tiisl@erbel.eihpos; Jennifer Becq: rf.toredid-sirap-vinu@qceb.refinnej; Frédéric Devaux: rf.sne.eigoloib@xuaved; Michael PH Stumpf: ku.ca.lairepmi@fpmuts.m; Gaëlle Lelandais: rf.toredid-sirap-vinu@siadnalel.elleag

Received 2010 March 22; Accepted 2010 September 22.

Copyright ©2010 Lèbre et al; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article has been cited by other articles in PMC.

Biological networks are highly dynamic in response to environmental and physiological cues. This variability is in contrast to conventional analyses of biological networks, which have overwhelmingly employed static graph models which stay constant over time to describe biological systems and their underlying molecular interactions.

To overcome these limitations, we propose here a new statistical modelling framework, the ARTIVA formalism (Auto Regressive TIme VArying models), and an associated inferential procedure that allows us to learn temporally varying gene-regulation networks from biological time-course expression data. ARTIVA simultaneously infers the topology of a regulatory network and how it changes over time. It allows us to recover the chronology of regulatory associations for individual genes involved in a specific biological process (development, stress response, etc.).

We demonstrate that the ARTIVA approach generates detailed insights into the function and dynamics of complex biological systems and exploits efficiently time-course data in systems biology. In particular, two biological scenarios are analyzed: the developmental stages of *Drosophila melanogaster *and the response of *Saccharomyces cerevisiae *to benomyl poisoning.

ARTIVA does recover essential temporal dependencies in biological systems from transcriptional data, and provide a natural starting point to learn and investigate their dynamics in greater detail.

Molecular interactions and regulatory networks underlie the development and functioning of biological systems [1-3]. These networks reliably and robustly coordinate the molecular and biochemical processes inside a cell, while remaining flexible in order to respond to physiological and environmental changes. The changing nature of regulatory and signalling interactions is beyond doubt, and a dynamical point of view is already deeply enshrined into cell and molecular biology. Illustrations of such time-varying biological systems can be provided for instance by the development of the fruitfly *Drosophila melanogaster *- which is segmented into different life stages: embryogenesis, larva, pupa and adult, or the adaptation of cellular organisms (the yeast *Saccharomyces cerevisiae *for instance) to growth defects and cellular damages induced by environmental stresses. Because they are extensively studied, considerable large-scale functional screening data exist for these examples. But while a growing number of studies report detailed and time-resolved analyses of regulatory and signalling processes [4,5], mapping these temporally changing networks systematically remains a major and increasingly pressing challenge.

From available data, *in-silico *methods can generate hypotheses about biochemical and molecular mechanisms [6,7] and guide further experimental and theoretical investigations into regulatory interactions underlying biological systems. Biological networks are usually described mathematically in a way where each gene is represented by a node and the interactions (or regulatory associations) between genes as edges. A range of approaches has been proposed, which learn or infer correlative and causal relationships among the genes from high-throughput, in particular gene expression, data. However most of these approaches assume that the topology of the network, *i.e*. the sets of nodes and edges, stays constant over time. Inferring the temporal changes in biological networks is an important statistical challenge [8], but it does open up new perspectives for biological data analyses and will aid the generation of hypotheses about the dynamics of biological systems.

Serious attempts to reconstruct dynamic networks whose topology changes with time started in 2005 [9,10]. Yoshida *et al*. [9] employed a dynamic linear model with Markov switching for estimating time-dependent gene network structures from time-series gene expression data. Although promising this approach assumes that there is a fixed (user-specified) number of distinct networks or phases, and the switching between phases is modelled *via *a stochastic transition matrix that requires an estimation of many parameters. Talih and Hengarten [10] developed a Markov Chain Monte Carlo (MCMC) methodology to recover time-varying Gaussian graphical models in a financial context. Again the total number of distinct network topologies is assumed to be known *a priori *and the network evolution is restricted to changing at most a single edge at a time. More recently (2007-2008) methods in which the number of distinct regulatory phases is determined *a posteriori *have been proposed. Fujita *et al*. [11] developped a Dynamic Vector Autoregressive model to estimate time-varying gene regulatory networks. Notably, only the values of the network parameters change over time, meaning that the global topology of the network remains constant. Xuan and Murphy [12] introduced an iterative procedure based on a similar modelling *ansatz*, which switches between a convex optimization approach for determining a suitable candidate graph structure and a dynamic programming algorithm for calculating the segmentation of the time into distinct phases, *i.e*. the sets of successive time-points for which the graph structure remains unchanged. This time, the number of phases is explicitly determined, but it requires that the graph structure is decomposable. Finally, Robinson and Hartemink [13] used a MCMC sampler for the inference of non-stationary dynamical Bayesian networks, with the attractive feature that the network structure within a temporal phase depends on the structure of the contiguous phases.

The approaches cited so far produce global network topologies with global changes, meaning that all the genes of the network change their regulatory inputs simultaneously. In reality however, we would rather expect that each gene (or at most a subset of genes) has its own and characteristic regulatory pattern. To that end, Rao *et al*. [14] developed a method called *regime-SSM*, which is divided into two steps. The main idea is to first cluster the genes that share the same temporal phases before inferring, in a second step, the network topology describing the regulatory associations between genes within each cluster using an expectation-maximization (EM) algorithm. Ahmed and Xing [15] introduced in 2009 a machine learning algorithm (called TESLA) to infer time evolving networks (that are gene-specific), by solving a set of temporally smoothed *l*_{1}-regularized logistic regression problems *via *convex optimization techniques.

The challenge of inferring time-varying structures of gene regulation networks is only starting to be adressed and in this paper we present the ARTIVA algorithm (Auto Regressive TIme VArying models) that is particularly well-suited for addressing the issues raised above. Starting from time-course gene expression data, ARTIVA performs a gene-by-gene analysis and infers simultaneously (*i*) the topology of the regulatory network, and (*ii*) how it changes over time. In order to strike a balance between model refinement and the amount of information available to infer the model parameters, the ARTIVA model delimits temporal segments for each gene where the influence factors and their weights can be assumed homogeneous. For that we use a combination of efficient and robust methods: dynamical Bayesian networks (DBN) to model directed regulatory interactions between genes and Reversible Jump MCMC for inferring simultaneously the times when the network changes and the resulting network topologies. We evaluate the performance of ARTIVA on simulated data and illustrate our approach in the context of two different biological systems. We start by analyzing a commonly used dataset related to the developmental stages of *Drosophila melanogaster *and demonstrate the utility of our approach by a comparative analysis of the ARTIVA results with the TESLA results [15]. Next, we analyze the response of the yeast *Saccharomyces cerevisiae *to benomyl poisoning. This dataset represents an important challenge for the inference of time-varying networks since (*i*) the number of time-points is extremelly small (only 5 time points) and (*ii*) the expression values combine measurements obtained in wild-type and knock-out yeast strains. The biological relevance of the results obtained with ARTIVA are finally assessed using functional annotations and transcription factor binding information.

Bayesian Networks (BNs) have become a popular framework for representing regulatory networks [6,16] as they offer both a probabilistic interpretation of dependencies among expression of genes and a graphical representation that is more readily accessible than mathematical expressions (see Figure Figure1).1). For example if the expression level of gene *i*, here denoted by *X ^{i}*, determines the expression level of genes

$$j\leftarrow i\to k$$

can be drawn and the joint distribution of gene expression levels written,

$$P\left({X}^{i},{X}^{j},{X}^{k}\right)=P\left({X}^{k}|{X}^{i}\right)P\left({X}^{j}|{X}^{i}\right)P\left({X}^{i}\right),$$

where *P*(*a*|*b*) denotes the probability of *a *conditional on *b*. Because BNs aim to represent the joint probability distribution (in our case for the expression levels of *p *genes) the corresponding graphical representation is limited to graphs which contain no cycles (Figure (Figure1B).1B). This means that closed loops or complex feed-back structures (as in Figure Figure1A)1A) cannot faithfully be represented, whereas they are known to pervade regulatory networks [17].

With time-course measurements, this limitation can be overcome by employing a Dynamical Bayesian Network (DBN) formalism [18], where the expression levels of all the genes in a system are modelled as a generally discrete-time stochastic process (Figure (Figure1C).1C). For *p *genes and *n *measurements the expression levels are written as *X ^{i}*(

$$P({X}^{i}(t)|{X}^{r}(t-1),\text{\hspace{0.17em}}...,\text{\hspace{0.17em}}{X}^{s}(t-1))$$

(1)

This means that the expression level of gene *i *at time *t *depends on the expression levels of genes *r*, ..., *s *at time *t *- 1. Genes *r*, ..., *s *are called the 'parents' of gene *i *and denoted by Pa* ^{i }*(reciprocally gene

Let *p *be the number of observed genes and *n *the number of time-points at which expression levels are measured for each gene. In this study, the discrete-time stochastic process *X *= {*X ^{i}*(

$$\forall t\ge 2,\text{}X(t)=A(t)X(t-1)+B(t)+\u03f5(t)\text{}with\text{}\u03f5(t)~\mathcal{N}(0,\text{\hspace{0.17em}}\Sigma (t)),$$

(2)

where ** X**(

Crucially, the coefficient matrix ** A**(

For each gene, *i*, a set of time-points for which the regulatory inputs of the gene change is determined. These time-points are referred to as 'changepoints' and delimit homogeneous phases, *i.e*. sets of time-points for which the local network topology (edges between gene *i *and its parents Pa* ^{i }*) remains unchanged. Assuming

$$\begin{array}{ccc}\forall {\xi}_{h-1}^{i}\le t<{\xi}_{h}^{i}\text{,}& {b}^{i}(t)={b}_{h}^{i},& \forall 1\le j\le p,\text{}{a}^{ij}(t)={a}_{h}^{ij}\end{array}$$

(3)

For phase *h *of gene *i*, the parents ${\text{Pa}}_{h}^{i}$ of gene *i *include every gene *j *such that the coefficient ${a}_{h}^{ij}$ differs from 0: ${\text{Pa}}_{h}^{i}=\{j;\forall 1\le j\le p,\text{\hspace{0.17em}}{a}_{h}^{ij}\ne 0\}$. Hence for ${\xi}_{h-1}^{i}\le t<{\xi}_{h}^{i}$ the expression of gene *i *is modelled in a regression framework as:

$$\begin{array}{cc}{X}^{i}(t)={\displaystyle \sum _{j\in {\text{Pa}}_{h}^{i}}{a}_{h}^{ij}{X}^{j}(t-1)+{b}_{h}^{i}+{e}^{i}(t),}& with\text{}{e}^{i}(t)~\mathcal{N}(0,\text{\hspace{0.17em}}{({\sigma}_{h}^{i})}^{2}),\end{array}$$

(4)

where *X ^{j}*(

We want to infer the autoregressive time-varying network model, which belongs to the overall parameter space that is the union of the parameter spaces of all phases delimited by *k *changepoints (*k *= {0, ..., $\overline{k}$}). Furthermore, for each phase the number of incoming edges on each node (or the network topology) is unknown. Adding or removing a changepoint results in a change in the dimension of the system's state-space: for each additional changepoint a new network topology has to be estimated, and for each deleted changepoint the results previously obtained for the two distinct phases have to be reconciled. Thus, the dimension of the model is unknown and can vary substantially. In order to infer the posterior distribution Pr(*k*, *ξ*, *s*, Pa, *θ*, *σ*|*x*) given the observed data *x *over all of the system's parameters, we used a Reversible Jump Markov Chain Monte Carlo (RJ-MCMC) procedure. The principle of RJ-MCMC lies in constructing a reversible Markov chain sampler that can jump between parameter subspaces of different dimensions; thus allowing the generation of an ergodic Markov chain whose equilibrium distribution is the desired posterior distribution [21,22].

Presented in Figure Figure2,2, our inference procedure allows us to simultaneously consider all possible combinations of changepoints and network topologies within the different phases. In the RJ-MCMC procedure, the likelihood of the expression measurements *x*(1) observed at time-point *t *= 1 is denoted by Pr(*x*(1)). From the hierarchical structure of the overall parameter space, the joint probability distribution over all parameters can thus be written as the product:

$$\text{Pr}(\text{}k,\text{}\xi ,\text{}s,\text{}Pa,\text{}\theta ,\text{}\sigma ,\text{}x)\text{}=\text{Pr}(x(1)){\displaystyle \prod _{i=1}^{p}\{}\text{Pr}({k}^{i},\text{\hspace{0.17em}}{\xi}^{i},\text{\hspace{0.17em}}{s}^{i},\text{\hspace{0.17em}}{\text{Pa}}^{i},\text{\hspace{0.17em}}{\theta}^{i},\text{\hspace{0.17em}}{\sigma}^{i},\text{\hspace{0.17em}}{x}^{i})\}.$$

(5)

For each gene, *i*, we construct a RJ-MCMC sampler that directly samples from the joint distribution:

$$\begin{array}{l}\text{Pr}({k}^{i},\text{\hspace{0.17em}}{\xi}^{i},\text{\hspace{0.17em}}{s}^{i},\text{\hspace{0.17em}}{\text{Pa}}^{i},\text{\hspace{0.17em}}{\theta}^{i},\text{\hspace{0.17em}}{\sigma}^{i},\text{\hspace{0.17em}}{x}^{i})={\text{Pr}}_{\stackrel{\xaa}{k}}({k}^{i})\text{Pr}({\xi}^{i}|{k}^{i})\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\displaystyle \prod _{h=1}^{{k}^{i}+1}\text{P}}\text{r}({s}_{h}^{i},\text{\hspace{0.17em}}{\text{Pa}}_{h}^{i},\text{\hspace{0.17em}}{\theta}_{h}^{i},\text{\hspace{0.17em}}{\sigma}_{h}^{i})\text{Pr}({x}_{h}^{i}|{\xi}_{h-1}^{i},\text{\hspace{0.17em}}{\xi}_{h}^{i},\text{\hspace{0.17em}}{s}_{h}^{i},\text{\hspace{0.17em}}{\text{Pa}}_{h}^{i},\text{\hspace{0.17em}}{\theta}_{h}^{i},\text{\hspace{0.17em}}{\sigma}_{h}^{i}),\end{array}$$

(6)

Where ${\text{Pr}}_{\overline{k}}({k}^{i})$ Pr(*ξ ^{i}|k^{i}*) and are respectively the prior probabilities of the number of changepoints

$$\text{Pr}({x}_{h}^{i}|{\xi}_{h-1}^{i},\text{\hspace{0.17em}}{\xi}_{h}^{i},\text{\hspace{0.17em}}{s}_{h}^{i},\text{\hspace{0.17em}}{\text{Pa}}_{h}^{i},\text{\hspace{0.17em}}{\theta}_{h}^{i},\text{\hspace{0.17em}}{\sigma}_{h}^{i})={\displaystyle \prod _{{\xi}_{h}^{i}\le t<{\xi}_{h+1}^{i}}\text{Pr}({x}^{i}(t)|{\xi}_{h-1}^{i},\text{\hspace{0.17em}}{\xi}_{h}^{i},\text{\hspace{0.17em}}{s}_{h}^{i},\text{\hspace{0.17em}}{\text{Pa}}_{h}^{i},\text{\hspace{0.17em}}{\theta}_{h}^{i},\text{\hspace{0.17em}}{\sigma}_{h}^{i})}$$

(7)

is the likelihood of the expression levels ${x}_{h}^{i}={({x}^{i}(t))}_{{\xi}_{h}^{i}\le t<{\xi}_{h+1}^{i}}$ of gene *i *observed during phase *h*, and is a realization of the Gaussian distribution defined in Equation(4).

In order to reinforce sparsity of the network and following multiple changepoint approaches involving RJ-MCMC [23,24], we assume the number of changepoints *k ^{i }*to be distributed

$$\forall {k}^{i}\le \overline{k},\text{}{\text{Pr}}_{\overline{k}}({k}^{i})\propto {e}^{-\lambda}\frac{{\lambda}^{{k}^{i}}}{{k}^{i}!}{1}_{\{{k}^{i}\le \overline{k}\}}.$$

(8)

Similarly, the prior probability for the number of parents is a truncated Poisson distribution ${\text{Pr}}_{\stackrel{\xaa}{s}}({s}_{h}^{i})$ with mean Λ and maximum $\overline{s}$. Here *λ *and Λ can be interpreted as the expected number of changepoints and parent variables, respectively. Following [25], *λ *and Λ are drawn according to a Gamma distribution: $\lambda ,\text{\hspace{0.17em}}\Lambda ~\mathcal{G}a(\alpha ,\text{\hspace{0.17em}}\beta )$ where the shape parameter *α *and the scale parameter *β *are chosen so that the prior probability decreases when the numbers of changepoints or parents increase (we set *α *= 1, *β *= 0.5, see Additional file 1 for an illustration of the corresponding distribution). Conditional on there being *k ^{i }*changepoints, we assume that the changepoint positions vector

$$\text{Pr}({s}_{h}^{i},\text{\hspace{0.17em}}{\text{Pa}}_{h}^{i},\text{\hspace{0.17em}}{\theta}_{h}^{i}|{\sigma}_{h}^{i})={\text{Pr}}_{\stackrel{\xaa}{s}}({s}_{h}^{i})\text{Pr}({\text{Pa}}_{h}^{i}|{s}_{h}^{i})\text{Pr}({\theta}_{h}^{i}|{\sigma}_{h}^{i},\text{\hspace{0.17em}}{s}_{h}^{i},\text{\hspace{0.17em}}{\text{Pa}}_{h}^{i}).$$

(9)

Given the parent gene set ${\text{Pa}}_{h}^{i}$ of size ${s}_{h}^{i}$, the ${s}_{h}^{i}+1$ regression coefficients, ${\theta}_{{\text{Pa}}_{h}^{i}}=({b}_{h}^{i},\text{\hspace{0.17em}}{({a}_{h}^{ij})}_{j\in {\text{Pa}}_{h}^{i}})$, are assumed to be drawn from zero-mean Gaussian distributions with covariance ${({\sigma}_{h}^{i})}^{2}{\Sigma}_{{\text{Pa}}_{h}^{i}}$,

$$\text{Pr}({\theta}_{h}^{i}|{\sigma}_{h}^{i},\text{\hspace{0.17em}}{s}_{h}^{i},\text{\hspace{0.17em}}{\text{Pa}}_{h}^{i})=|2\pi {({\sigma}_{h}^{i})}^{2}{\Sigma}_{{\text{Pa}}_{h}^{i}}{|}^{-1/2}\mathrm{exp}\left[-\frac{{\theta}_{{\text{Pa}}_{h}^{i}}^{\u02c6}{\displaystyle {\sum}_{{\text{Pa}}_{h}^{i}}^{-1}{\theta}_{{\text{Pa}}_{h}^{i}}}}{2{({\sigma}_{h}^{i})}^{2}}\right],$$

(10)

where the symbol † denotes matrix transposition, ${\sum}_{{\text{Pa}}_{h}^{i}}=\text{\hspace{0.17em}}}{\delta}^{-2}{D}_{{\text{Pa}}_{h}^{i}}^{\u02c6}(x){D}_{{\text{Pa}}_{h}^{i}}(x)$ and ${D}_{{\text{Pa}}_{h}^{i}}(x)$ is a matrix of size ${m}^{i}({\xi}_{h}^{i}-{\xi}_{h-1}^{i})\times ({s}_{h}^{i}+1)$, whose first column is a vector of 1 when the regression model includes a constant, and each *j *+ 1* ^{th }*column contains the observed (eventually repeated) value ${({x}_{tl}^{j})}_{({\xi}_{h-1}^{i}-1)\le t<({\xi}_{h}^{i}-1);1\le l\le m}$ for all parent gene

A noticeable advantage of the model is that the marginalization over the regression parameters (*θ ^{i }*,

$$\text{Pr}({k}^{i},\text{\hspace{0.17em}}{\xi}^{i},\text{\hspace{0.17em}}{s}^{i},\text{\hspace{0.17em}}{\text{Pa}}^{i},\text{\hspace{0.17em}}{x}^{i})=\int \int \text{Pr}({k}^{i},\text{\hspace{0.17em}}{\xi}^{i},\text{\hspace{0.17em}}{s}^{i},\text{\hspace{0.17em}}{\text{Pa}}^{i},\text{\hspace{0.17em}}{\theta}^{i},\text{\hspace{0.17em}}{\sigma}^{i}|{x}^{i})d{\theta}^{i}d{\sigma}^{i}$$

(11)

(see Additional file 1, Section 2 for more details). Then the proposals are sampled from the analytical expression of the network topology posterior distribution (11) -- which is proportional to Pr(*k ^{i }*,

In order to traverse the parameter space of unknown dimension we propose here four different update moves (see Figure Figure22 and Additional file 1): birth of a new changepoint (*B*); death (removal) of an existing changepoint (*D*); shift of a changepoint to a different time-point (*S*); and update of the regression model defining the network topology within the phases (*R*). These moves occur with probabilities *b _{k }*for

$${b}_{k}=c\mathrm{min}\left\{1,\text{}\frac{{\text{Pr}}_{\stackrel{\xaa}{k}}({k}^{i}+1)}{{\text{Pr}}_{\stackrel{\xaa}{k}}({k}^{i})}\right\},\text{}{d}_{k+1}=c\mathrm{min}\left\{1,\text{}\frac{{\text{Pr}}_{\stackrel{\xaa}{k}}({k}^{i})}{{\text{Pr}}_{\stackrel{\xaa}{k}}({k}^{i}+1)}\right\}$$

(12)

where ${\text{Pr}}_{\overline{k}}$ is the prior distribution for the number of changepoints and the constant *c *is chosen to be smaller than 14 so that the regression model updates and changepoint position shifts are proposed more frequently than births and deaths of changepoints. This improves our ability to infer changepoint positions and the network structures (using the vector-autoregressive framework) within the different phases. Proposed shifts in changepoint positions are accepted using a standard Metropolis-Hastings step, while regression model updates within phases invoke a second RJ-MCMC criterion, which was adapted from the model selection approach of Andrieu and Doucet [25]. As proposals are sampled from the analytical network topology posterior distribution (11), the generation of the regression model parameters $({\theta}_{h}^{i},\text{\hspace{0.17em}}{\sigma}_{h}^{i})$ is optional. Together the four moves *B*, *D*, *S *and *R *allow the generation of samples from probability distributions defined on unions of spaces of different dimensions for both the number, *k ^{i}*, of changepoints and the number ${s}_{h}^{i}$ of parents within each phase for gene

Given *a priori *probabilities, the ARTIVA algorithm produces posterior probability estimations over the algorithm iterations for changepoint vectors and network topologies. These posterior probabilities give a detailed picture of all the results and allow in depth analyses of the entire regulatory network architecture. In this study we use in complement to posterior probabilities, the Bayes factor, *i.e*. the ratio of the posterior odds of an hypothesis over its prior odds [27]. The Bayes factor has the advantage to consider the posterior distribution with respect to the priors and to obtain quantitative measurements of the statistical significance of the ARTIVA results which are comparable between different datasets. As an indication, according to Kass and Raftery [27], a proposition is (*i*) *not *supported when it has a Bayes factor below 3, (*ii*) *positively *supported for a Bayes factor between 3 and 20 and (*iii*) *strongly *supported for a Bayes factor over 20. The performance of ARTIVA is evaluated on synthetic and real data (see the following section) by selecting the network structure according to the following procedure. For each gene *i *we first choose the number *k ^{i }*of changepoints having the greatest Bayes factor. Then the

In order to evaluate the accuracy of the ARTIVA algorithm to recover changepoints and network topologies correctly, expression data for randomly defined dynamic networks were generated. With respect to the experimental datasets analyzed later (see the following section), two types of expression data were produced. The first type -- referred to Wild-Type (WT) simulations -- match the 'Drosophila life cycle' data. This dataset contains time-series expression data of several genes and the algorithm must find the correlations between unknown parent genes and each target gene. The second type -- referred to as Knock-Out (KO) simulations -- is equivalent to the 'benomyl' dataset. This dataset only contains time-series expression data of target genes in different genetic contexts: wild-type and knock-out mutants for several transcription factors (TFs).

The simulation procedure for a given target gene, also presented in detail in Additional file 2, involves three main steps. First, the structure of the dynamic regulatory network is defined. This consists of randomly setting the number and the localization of changepoints, thereby defining regulatory phases. Then the parent genes and the corresponding coefficients are chosen for each phase. Once the regulatory network is defined, expression data can be generated from this network model. The expression values of the parent genes are first generated randomly (uniformly drawn from [-2, -0.1] [0.1, 2]) and subsequently used to calculate the target gene expression according to the autoregressive model presented in Eqn. (4). The whole procedure is repeated (to represent experimental replicates) and noise is added (to represent experimental variability). Because the simulations use the ARTIVA hypothesis concerning the expression associations between parent and target gene expression profiles (autoregressive model), we expect all the results to be correct under ideal conditions (like absence of noise). Therefore, this simulation protocol evaluates the ARTIVA performance and studies the influence of the following parameters:

• the quantity of noise in the data. For all phases *h *of gene *i*, the noise *e ^{i}*(

• the size of the temporal phases (*phasesize *= 1, 2, ..., 5, 12), and

• for WT simulations only, the number of potential parent genes (*Pa*# = 5, 10, 20, 40). This is not necessary for KO simulations because the potential parent genes are obviously restricted to the transcription factors for which KO data is being generated. Regardless of the number of potential parent genes, a maximum of 5 edges from parent genes to a target gene is allowed.

For each parameter value, 200 gene time-series of length = 12 timepoints were randomly generated with 8 and 4 replicates for WT and KO data respectively. Note that KO simulations present less replicates because each replicate already comprises the measurements of the gene expression levels for each knock-out mutant. All other parameters were set to their default values and are specified in Table Table1.1. The ARTIVA algorithm was run on each expression data set and we compared the proposed network model (selected as described in the previous subsection 'Model selection') with the original one. The ability of the algorithm to recover changepoints was evaluated via the Positive Predictive Value (PPV) and the Sensitivity,

$$PPV=\frac{TP}{(TP+FP)}\text{}Sensitivity\text{}=\frac{TP}{(TP+FN)}$$

(13)

with TP = True Positives, FP = False Positives and FN = False Negatives. The edges PPV and Sensitivity was computed for the phases whose changepoints were correctly inferred.

The first microarray dataset -- referred to as '*Drosophila *life cycle' data -- was produced by Arbeitman *et al*. [28]. It includes the mRNA expression levels of 4028 genes at 67 successive time-points spanning the four stages of the *D. melanogaster *life cycle: the embryonic (31 time-points), larval (10 time-points) and pupal stage (18 time-points) and the first 30 days of adulthood (8 time-points). Expression data were collected from the Gene Expression Omnibus database: http://www.ncbi.nlm.nih.gov/geo/. 4005 genes with consistent annotation are used for the analysis. Potential parent genes were restricted to genes with known transcriptional activity based on Gene Ontology information [29]. Hence, 136 genes were selected as potential parents. They belong to one of the four following Gene Ontology molecular functions: 'Transcription activator activity' (GO:0016563), 'Transcription repressor activity' (GO:0016564), 'Transcription factor activity' (GO:0003700) and 'Transcription cofactor activity' (GO:0003712). For each target gene, we gave priority to the 10 potential parent genes with the most highly correlated gene expression profiles over any successive 10 time-points.

The second microarray dataset -- referred to as 'benomyl' data -- was published by Lucau-Danila *et al*. [30]. In this study, the authors measured the changes in mRNA concentrations for each gene at successive times after addition of benomyl (an antimitotic drug) in the growth media of *Saccharomyces cerevisiae *cells. Parallel experiments were conducted in different genetic contexts: the wild type strain and knock-out (KO) strains in which the genes coding for different transcription factors connected to drug response, *YAP1*, *PDR1*, *PDR3*, and *YRR1*, were deleted. For each yeast strain, the measured expression values for 5 time-points (at 30 s, 2 min, 4 min, 10 min, 20 min) were obtained from the website: http://www.biologie.ens.fr/lgmgml/publication/benomyl. We only considered genes that (*i*) showed significant changes in mRNA levels during the time-course analysis in the WT strain (119 genes presented by Lucau-Danila *et al*. [30]), and (*ii*) had less than 20% of missing expression measurement data in the four KO strains. The resulting expression table comprised data for 78 genes (see Additional file 3 for complete list of genes). Hierarchical clustering was performed applying the 'hclust' function available in the R programming langage http://cran.r-project.org/, using Euclidean distance between gene expression profiles and the 'ward' method for gene agglomeration (see also Additional file 4).

The ARTIVA algorithm is implemented in R programming language. The source code is freely distributed to academic users upon request to the authors. A 50,000 iterations procedure lasts around 5 min times the number of genes for the analysis of 100 time-course measurements (for example 5 replicates over 20 time-points) with a 2.66 GHz Intel(R) Xeon(R) CPU and 4 G RAM.

To evaluate the performance of our ARTIVA approach, simulations are run in order to assess the impact of three major factors on the algorithm performances: noise in the data, minimal length of phases, and number of proposed parent genes (the latter for WT simulations only, see Methods). Sensitivity and Positive Predictive Value (PPV) calculated for the detection of changepoints and of models, *i.e*. the topology of the network within the phases, are presented in Table Table1.1. In WT simulations, the changepoint sensitivity is greater than 80% when the noise standard error reaches *σ _{i }*= 1. As noise increases further, the ability of the algorithm to recover changepoints decreases in terms of sensitivity, but still, the changepoint sensitivity remains greater than 70% when the noise standard deviation reaches

The edge detection in Table Table11 was evaluated when the correct changepoint segmentation was recovered. Once the correct changepoints are recovered, neither noise nor short phases appear to strongly affect the detection of edges. The edge sensitivity deteriorates for extreme situations only. Indeed, the edge sensitivity is equal to 18% when phase size is 1 for KO simulations. For WT simulations, the edge sensitivity is about 50% when phase size is 3 or when the number of proposed parents is 20. In all other cases, the edge sensitivity is greater than 75% and the edge PPV is greater than 95%.

Simulation studies such as the one performed here do, of course, only provide a partial insights into an algorithms performance and robustness. They are nevertheless essential to gain confidence in the performance of novel algorithms and to develop understanding of their likely limitations. Together these results serve to illustrate of the robustness of the ARTIVA algorithm. In particular, ARTIVA can deal with some of generic problems encountered in real experimental data. It still performs well when noise standard error is on the order of the mean value of the regression coefficients, when the number of measurement per phase is reduced to 8 or when the number of possible parents reaches 20. At some point, the ARTIVA algorithm misses some changepoints, but the PPV is still very large, meaning that we can have great confidence in the changepoints having a high posterior probability.

In light of the simulation analysis, we then apply our method to the well-studied expression datasets produced by Arbeitman *et al*. [28]. In this study, the authors report gene expression patterns for nearly one-third of all *D. melanogaster *genes during a complete time-course of development. The ARTIVA algorithm is run for each gene for 50,000 iterations, looking for parental relationships with the 10 transcription factors for which gene expression profiles were most highly correlated over any successive 10 time-points (see Methods). Out of the 4005 analyzed genes, 1704 (42%) were found to be involved in the time-varying networks spanning the whole *Drosophila *life-cycle (134 were identified as parent genes, 1623 as target genes and 53 were both parent and target genes). Interestingly, 2583 changepoints were also identified. The distribution over the time-points and with respect to the developmental stages is shown Figure Figure3.3. We observe that time intervals {18 to 19}, {31 to 33}, {41 to 43} and {59 to 61} contain more than 40% of the changepoints. Notably the intervals {31 to 33}, {41 to 43} and {59 to 61} include the developmental stage transitions from embryo to larva, from larva to pupa and from pupa to adult, respectively. The high number of changepoints at mid-embryogenesis (interval {18 to 19}) corresponds to a major morphological change related to a modification of transcriptional regulations, as described in [28].

To further evaluate ARTIVA, we compared our results with those obtained using the TESLA algorithm [15]. TESLA has been recently published (2009) and to our knowledge it is with ARTIVA, the only other procedure which recovers time varying regulatory networks where the changepoints are gene specific. As described in [15], we first discretized the expression measurements into two levels: 1 for up-regulation and 0 for down-regulation. The TESLA procedure requires specification of two parameters, *λ*_{1}, which is a sparsity coefficient, and *λ*_{2}, which is a smoothness penalty coefficient. Several combinations of (*λ*_{1}, *λ*_{2}) parameters were tested (data not shown), and we finally retained the average values presented by the authors in their simulation study [15], *i.e*. *λ*_{1 }= 0.01, *λ*_{2 }= 1. The TESLA analysis was run using the same subset of *Drosophila *genes used with ARTIVA, and the 2583 most significant temporal changes identified with TESLA are compared to the 2583 ARTIVA changepoints (Figure (Figure3,3, dashed line). In agreement with the ARTIVA results, an important number of regulatory changes (28%) occurred during the developmental stage transitions (mid-embryogenesis, embryo to larva, larva to pupa and pupa to adult), but notably this number is significantly lower than the one obtained with ARTIVA (40%, see previous paragraph). This is especially remarkable considering the last phase transition from pupa to adult. The observation of a significant number of changepoints at developmental stage transitions lends credibility and supports our ARTIVA results. Our method appears powerful in inferring the timepoints at which transcriptional control of individual genes switches.

In our second example, we apply ARTIVA to a selected set of 78 gene expression profiles from *Saccharomyces cerevisiae *cells grown under benomyl-induced stress conditions [30] (see Methods). A hierarchical cluster analysis identifies 18 clusters of genes with concordant transcription profiles (see Additional file 5). For each cluster, time varying networks are inferred using the included gene expressions measured in the wild type and four deletion strains (Yap1, Pdr1, Pdr3 and Yrr1), running the RJ-MCMC scheme for 50,000 iterations. Regulatory associations between parent and target genes are proposed if the deletion of a parent gene significantly alters the expression measurements of its target genes (compared to the WT situation) (see Methods). The results are presented in Figure Figure44 and Dataset S1. As an illustration, the cluster #1 comprises 10 genes (Figure (Figure4A)4A) for which two changepoints are detected at the 4 and 10 minute time-points (Figure (Figure4B),4B), when these genes fall under the regulatory control of Yap1 (Figure (Figure4C).4C). Even if Yap1 is the only transcription factor identified here, its regulatory interactions with the target genes in the third phase are highly significant (Bayes factor = 9.10^{3}) compared to those in the second phase (Bayes factor = 14.22). This explains the detection of two changepoints. The results obtained for all other clusters are combined to obtain a global view of the time-varying regulatory network involved in benomyl stress response (Figure (Figure4D).4D). In agreement with the pioneering study of Lucau-Danila *et al*. [30], the transcription factor Yap1 appears to have the predominant role in the benomyl stress response as ARTIVA identified edges with 79% of the analyzed genes (62 associations with clusters # {1, 2, 5, 6, 7, 8, 9, 13, 18, 17}). Also PDR1, being the parent gene of 24% of the genes, exerts significant control (19 associations with clusters # {5, 6}). Pdr3 and Yrr1 present only a small number of target genes (10 associations with cluster #6 and 2 associations with cluster #13, respectively).

Furthermore, our ARTIVA model provides a dynamic classification of the benomyl responsive-genes, based on their time of induction. Such a dynamical point of view can elucidate the chronology of events, especially regarding the Yap1 activity. ARTIVA identified three classes of Yap1 targets, depending on their time of induction: 4 minutes (clusters # {1, 7, 18}, orange arrows Figure Figure4D);4D); 10 minutes (clusters # {2, 8, 9, 13, 17}, yellow arrows Figure Figure4D);4D); and 20 minutes (cluster # {5, 6}, green arrows Figure Figure4D).4D). Almost all the genes included in the earliest group are known to be transcriptionally controlled by Yap1 (95% based on YEASTRACT information [31]). They encode proteins involved in redox control (*GPX2*, *TRR1*, *GSH1*, *GTT2*) or vacuolar transporters (*YCF1*). The middle group contains also an important rate of Yap1 targets (87%), which act at the level of the plasma membrane (*FLR1 *and *FRM2*) or encode proteins involved in response to toxins (for instance *AAD6*, *AAD16*, *ECM4*). Yap1 activity in the last group is partially overlapping with the actions of Pdr1 and Pdr3. Most of the genes in this group have unknown functions, but some of them are still labelled in the YEASTRACT database as being targets for Yap1 (74%), Pdr1 (32%) and Pdr3 (20%). Finally, *YRR1 *deserves a special mention. Unlike the genes that encode the transcription factors Yap1, Pdr1 and Pdr3, the *YRR1 *gene is transcriptionnally activated during the benomyl response. As a consequence, ARTIVA identified *YRR1 *(*i*) as a Yap1 target whose expression was induced 4 minutes after benomyl addition in the cell growth culture (see *** Figure Figure4A);4A); and (*ii*) as a parent for genes *SNG1 *and *YLL056C *at 10 minutes. Interestingly these observations highlight a sequential activity of Yap1 and Yrr1 transcription factors together with an overlap of their targets (Figure (Figure4E).4E). This regulatory model, in which Yrr1 seconds Yap1, is fully supported by recent experimental data [32].

The ARTIVA approach allows us to reverse engineer the temporally varying structure of transcriptional networks by inferring simultaneously the times at which regulatory inputs of genes change and the nature of these incoming inputs. Our approach is computationally efficient and can exploit powerful search heuristics to scan the space of potential incoming edges. Compared to others methodologies recently proposed in the literature, ARTIVA has the major advantage of combining efficient and well-tried techniques (Bayesian networks and RJ-MCMC sampler) in order to solve several related problems. First, with ARTIVA there is no need for prior information regarding either the number of regulatory phases or the number of regulatory interactions between parent and target genes. Starting from uninformative priors (such as truncated Poisson or uniform distributions, see Methods), the posterior distribution for the number of changepoints, their positions and the regulatory models within each recovered phase is directly obtained from the ARTIVA runs. Also, ARTIVA allows the detection of regulatory phases for individual genes. Finally, whereas many approaches -- like Bayesian Dirichlet Equivalent (BDE) score in a dynamic context [13] or the TESLA framework [15]-- require the expression measurements to be discretized, the ARTIVA procedure has the advantage to work directly with continuous datasets. Thus there is no need to set arbitrary thresholds to define up- and down-regulated groups of genes.

We demonstrate the performance of the ARTIVA algorithm by (*i*) applying it to simulated data (Table (Table1)1) and (*ii*) performing a comparative analysis of the ARTIVA and TESLA [15] results (Figure (Figure3).3). Because the simulations were such that they mirror the biological data analyzed afterwards as much as possible, we gain considerable confidence in the output of the ARTIVA approach when used on the two datasets considered here. Overall, the algorithm shows very good performance in retrieving the simulated dynamic networks, except in extremely unfavourable conditions, such as too much noise in the data or time series that are not sufficiently long and dense. These exploratory studies allow us to interpret the ARTIVA outputs more reliably.

The two biological networks presented in this study (Figures (Figures33 and and4)4) are very different, both from a biological and a technical point of view. Their respective analyses represent different challenges for the application of the ARTIVA algorithm. The '*Drosophila *life cycle' data is representative of data used for classical regulatory network inference; successive gene expression measurements spanning a given biological process - here the *Drosophila *development - in order to detect potential regulatory interactions from gene expression profiles. This data is particularly suited for the inference of a temporally varying regulation network, since (*i*) the number of time-points is large (more than 80% of all published time series expression datasets are short with 8 time-points or fewer [33]) and (*ii*) the transitions between the distinct stages of *Drosophila *development {Embryo (E), Larva (L), Pupa (P), Adult (A)} are well-described in the literature [28]. We can thus reasonably expect to identify changepoints precisely at transitions between life stages (Figure (Figure3).3). On the other hand, discrimination between parent and target genes represents an important additional step towards a complete description of the genetic networks that control development. These inferred temporal changes can form hypotheses as to how we can interfere rationally with developmental processes; *e.g*. arresting development in a given state by selectively knocking down transcription factors or targets at a given developmental stage.

The 'benomyl' dataset represents a particular challenge for ARTIVA to retrieve a dynamic regulatory network for two reasons. First, the number of time-points is extremely small (only 5 time-points), and no replicate data points are available. To manage the lack of data, we cluster genes with concordant transcription profiles and analyze them jointly with ARTIVA. This cluster analysis was possible because the maximal intracluster variability did not exceed 0.2 (see Additional file 4), a value that ARTIVA is able to manage based on our simulation results (Table (Table1).1). Second, in this *S. cerevisiae *dataset, it is known that the genes coding for key regulators of the stress response system, *i.e*. transcription factors Yap1, Pdr1 and Pdr3, exhibit flat expression patterns during stress condition (Additional file 4); this prevents the use of correlation measures with their expression profiles to identify causal relationships with their potential target genes. In this context, we needed to adapt the ARTIVA inference procedure in order to integrate gene expression profiles measured in the wild type and KO strains. Regulatory associations between parent and target genes are thus proposed if the deletion of a parent gene significantly alters the expression measurements of its target genes (compared to the WT situation). Compared to the previous study of Lucau-Danila *et al*. [30], the main benefit of ARTIVA analyses is that it provides a dynamic classification of the benomyl response genes (Figure (Figure4).4). It also points out contributions of the Yrr1 and Pdr3 transcription factors, which were ignored in previous analyses. Interestingly, the versatile and non exclusive joint action of Pdr1 and Pdr3 in chemical stress response, together with the overlap with Yap1 activity, is supported by recent experimental data available on these two factors [32,34,35].

The comprehensive analysis suggests that the ARTIVA approach allows us to describe and reverse-engineer the dynamic aspects of molecular networks. Such time-varying networks provide a middle ground between networks homogeneous in time and explicit dynamical models. The latter require substantial further information in order to model the dynamics of biological systems [36]. Inferring such systems is a considerable statistical challenge and it has recently been shown that some parameters cannot be inferred with any degree of certainty from time-course data. This so-called sloppy behaviour [37,38] has been identified even in very simple dynamical systems. In contrast to classical network reverse engineering approaches such as dynamic Bayesian networks [18] and graphical Gaussian models [39], ARTIVA also allows us to construct more complex hypotheses where interactions may depend on time.

As no particular constraint is imposed to the changepoint positions or to the succession in network topologies within phases, the ARTIVA model appears to be highly flexible. The results are not *a priori *directed toward any particular regulatory associations between genes. This flexibility can be extremely valuable, especially when no information regarding the studied biological process is available. But the rapid accumulation of data obtained with different experimental approaches gives the opportunity to acquire a more comprehensive picture of all the interactions between cellular components. To understand the biology of the studied systems better, the trend is clearly towards the aggregation of multiple sources of information. A natural future direction in the development of ARTIVA will be to incorporate data originating from different sources in the model. In particular, protein/DNA interaction data (ChIP-chip or ChIP-seq experiments) could be effective by replacing the uniform prior for the edges with a prior favouring edges that correspond to the experimentally identified interactions (see [40,41] for an illustration). Also, ARTIVA assumes independent network topologies within successive phases and can identify very different regulatory associations between two phases, even if the time delay between the phases is very short. This assumption was appropriate in case of biological models like the *Drosophila *development and the yeast stress response, mainly because those are processes in which transcriptional regulations are highly dynamic. However, when considering systems that evolve more smoothly or in case of datasets with a small number of time points, it would be interesting to incorporate a regularization scheme into ARTIVA in order to favour slight changes from one phase to the next one. Such an approach has already been initiated in [13] for discretized data and in [42] where the regularization scheme is based on a common network structure. There are still huge gaps in our knowledge of biological networks and of the dynamics they mediate. What triggers whether or not an interaction is present depends subtly on the cellular context, the complement of molecules inside a cell (if we focus attention of intra-cellular processes and networks) and their respective molecular interactions. Understanding all of these factors and their interplay will ultimately be crucial in order to design biological interventions rationally. But statistically inferring them poses a set of formidable challenges. The use of relatively simple mathematical models (such as vector-autoregressive processes) allows us to distil the essential dynamics of complex temporal processes in biological systems. Thus ARTIVA provides a platform for the analysis of transcriptomic data, which could be straightforwardly expanded to include other data, *e.g*. transcription factor activities or other proteomic measurements.

SL conceived, implemented the first version of ARTIVA algorithm and drafted the manuscript. JB optimized the ARTIVA source code and performed simulations. FD provided help with data analysis. MPHS and GL contributed equally to this work. MPHS provided help with the model selection formalism and drafted the manuscript. GL designed the experiments, performed data analysis and drafted the manuscript. All the authors read and approved the final manuscript.

**Supplementary Text S1 - Priors illustration and complete mathematical description of the RJMCMC procedure and of the Bayes factor computation**.

Click here for file^{(263K, PDF)}

**Supplementary Figure S1 - Principle of the simulation study**.

Click here for file^{(222K, PDF)}

**Supplementary Dataset S1 - Full edge list of the inferred time varying networks of the 'benomyl' data**.

Click here for file^{(34K, XLS)}

**Supplementary Text S2 - Supplementary results related to the 'benomyl' analyses**.

Click here for file^{(276K, PDF)}

**Supplementary Figure S2 - Expression measurements for the 18 clusters used in the 'benomyl' analyses**.

Click here for file^{(112K, PDF)}

We thank Liam Kelly and Tina Toni for helpful comments on this manuscript, Catherine Etchebest for helpful discussions and the anonymous reviewers for their constructive advices. SL and MPHS gratefully acknowledge support from the BBSRC. JB and GL gratefully acknowledge support from the Institut National de la Transfusion Sanguine. MPHS is a Royal Society Wolfson Research Merit Award holder.

- Luscombe N, Babu M, Yu H, Snyder M, Teichmann S, Gerstein M. Genomic analysis of regulatory network dynamics reveals large topological change. Nature. 2004;431:308–312. doi: 10.1038/nature02782. [PubMed] [Cross Ref]
- Seshasayee A, Bertone P, Fraser G, Luscombe N. Transcriptional regulatory networks in bacteria: from input signals to output responses. Curr Opin Microbiol. 2006;9(5):511–519. doi: 10.1016/j.mib.2006.08.007. [PubMed] [Cross Ref]
- Zhu J, B Z, Smith E, Drees B, Brem R, Kruglyak L, Bumgarner R, Schadt E. Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks. Nature Genetics. 2008;40(7):854–861. doi: 10.1038/ng.167. [PMC free article] [PubMed] [Cross Ref]
- Barenco M, Tomescu D, Brewer D, Callard R, Stark J, Hubank M. Ranked prediction of p53 targets using hidden variable dynamic modeling. Genome Biology. 2006;7:R27. doi: 10.1186/gb-2006-7-3-r25. [PMC free article] [PubMed] [Cross Ref]
- Bonneau R, Facciotti M, Reiss D, Schmid A, Pan M, Kaur A, Thorsson V, Shannon P, Johnson M, Bare J, Longabaugh W, Vuthoori M, Whitehead K, Madar A, Suzuki L, Mori T, Chang D, Diruggiero J, Johnson C, Hood L, Baliga N. A predictive model for transcriptional control of physiology in a free living cell. Cell. 2007;131(7):1354–1365. doi: 10.1016/j.cell.2007.10.053. [PubMed] [Cross Ref]
- Friedman N. Inferring Cellular Networks Using Probabilistic Graphical Models. Science. 2004;303:799–805. doi: 10.1126/science.1094068. [PubMed] [Cross Ref]
- Hartemink A. Reverse engineering gene regulatory networks. Nature Biotechnology. 2005;23(5):554–555. doi: 10.1038/nbt0505-554. [PubMed] [Cross Ref]
- Werhli A, Grzegorczyk M, Husmeier D. Comparative evaluation of reverse engineering gene regulatory networks with relevance networks, graphical Gaussian models and Bayesian networks. Bioinformatics. 2006;22(20):2523–2531. doi: 10.1093/bioinformatics/btl391. [PubMed] [Cross Ref]
- Yoshida R, Imoto S, Higuchi T. CSB '05: Proceedings of the 2005 IEEE Computational Systems Bioinformatics Conference. Washington, DC, USA: IEEE Computer Society; 2005. Estimating Time-Dependent Gene Networks from Time Series Microarray Data by Dynamic Linear Models with Markov Switching; pp. 289–298. full_text. [PubMed]
- Talih M, Hengartner N. Structural learning with time-varying components: tracking the cross-section of financial time series. Journal of the Royal Statistical Society Series B. 2005;67(3)
- Fujita A, Sato J, Garay-Malpartida H, Morettin P, Sogayar M, Ferreira C. Time-varying modeling of gene expression regulatory networks using the wavelet dynamic vector autoregressive method. Bioinformatics. 2007;23(13):1623–1630. doi: 10.1093/bioinformatics/btm151. [PubMed] [Cross Ref]
- Xuan X, Murphy KP. ICML. Vol. 227. ACM International Conference Proceeding Series; 2007. Modeling changing dependency structure in multivariate time series; pp. 1055–1062. full_text.
- Robinson J, Hartemink A. Non-Stationary Dynamic Bayesian Networks. NIPS '08: Neural Information Processing Systems 2008. 2008. pp. 1369–1376.
- Rao A, Hero AO, States DJ, Engel JD. Inferring Time-Varying Network Topologies from Gene Expression Data. EURASIP Journal on Bioinformatics and Systems Biology. 2007. [PMC free article] [PubMed]
- Ahmed A, Xing EP. Recovering time-varying networks of dependencies in social and biological studies. Proceedings of the National Academy of Sciences. 2009;106(29):11878–11883. doi: 10.1073/pnas.0901910106. [PubMed] [Cross Ref]
- Needham C, Bradford J, Bulpitt A, Westhead D. Inference in Bayesian networks. Nature Biotechnology. 2006;24:51–53. doi: 10.1038/nbt0106-51. [PubMed] [Cross Ref]
- Alon U. Network motifs: theory and experimental approaches. Nature Reviews Genetics. 2007;8(6):450–461. doi: 10.1038/nrg2102. [PubMed] [Cross Ref]
- Sachs K, Perez O, Pe'er D, Lauffenburger D, Nolan G. Causal protein-signaling networks derived from multiparameter single-cell data. Science. 2005;308(5721):523–529. doi: 10.1126/science.1105809. [PubMed] [Cross Ref]
- Lebre S. Inferring dynamic genetic networks with low order independencies. Statistical Applications in Genetics and Molecular Biology. 2009;8 doi: 10.2202/1544-6115.1294. [PubMed] [Cross Ref]
- de Silva E, Stumpf M. Complex networks and simple models in biology. J Roy Soc Interface. 2005;2:419–340. doi: 10.1098/rsif.2005.0067. [PMC free article] [PubMed] [Cross Ref]
- Green P. Reversible jump Markoc chain Monte Carlo computation and Bayesian model determination. Biometrika. 1995;82:711–732. doi: 10.1093/biomet/82.4.711. [Cross Ref]
- Robert C, Ryden T, Titterington D. Bayesian inference in hidden Markov models through reversible jump Markov chain Monte Carlo. Journal of the Royal Statistical Society B. 2000;62:57–75. doi: 10.1111/1467-9868.00219. [Cross Ref]
- Green PJ. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika. 1995;82:711–732. doi: 10.1093/biomet/82.4.711. [Cross Ref]
- Suchard M, Weiss R, Dorman K, Sinsheimer J. Inferring spatial phylogenetic variation along nucleotide sequences: a multiple change-point model. Journal of the American Statistical Assocation. 2003;98:427–437. doi: 10.1198/016214503000215. [Cross Ref]
- Andrieu C, Doucet A. Joint Bayesian model selection and estimation of noisy sinusoids via reversible jump MCMC. IEEE Trans. on Signal Processing. 1999;47(10):2667–2676. doi: 10.1109/78.790649. [Cross Ref]
- Zellner A. In: Bayesian Inference and Decision Techniques. Goel PK, Zellner A, editor. New York: Elsevier; 1986. On assessing prior distributions and Bayesian regression analysis with g-prior distribution; pp. 233–243.
- Kass RE, Raftery AE. Bayes Factors. Journal of the American Statistical Association. 1995;90:773–795. doi: 10.2307/2291091. [Cross Ref]
- Arbeitman MN, Furlong F, Imam E Eand, Johnson E, Null BH, Baker BS, Krasnow MA, Scott MP, Davis RW, P WK. Gene expression during the life cycle of drosophila melanogaster. Science. 2002;297(5590):2270–2275. doi: 10.1126/science.1072152. [PubMed] [Cross Ref]
- The Gene Ontology Consortium. Gene ontology: tool for the unification of biology. Nature Genetics. 2000;25:25–29. doi: 10.1038/75556. http://www.geneontology.org [PMC free article] [PubMed] [Cross Ref]
- Lucau-Danila A, Lelandais G, Kozovska Z, Tanty V, Delaveau T, Devaux F, Jacq C. Early Expression of Yeast Genes Affected by Chemical Stress. Mol Cell Biol. 2005;25(5):1860–1868. doi: 10.1128/MCB.25.5.1860-1868.2005. [PMC free article] [PubMed] [Cross Ref]
- Monteiro PT, Mendes ND, Teixeira MC, d'Orey S, Tenreiro S, Mira NP, Pais H, Francisco AP, Carvalho AM, Lourenco AB, Sa-Correia I, Oliveira AL, Freitas AT. YEASTRACT-DISCOVERER: new tools to improve the analysis of transcriptional regulatory associations in Saccharomyces cerevisiae. Nucl Acids Res. 2008;36(suppl_1):D132–136. http://nar.oxfordjournals.org/cgi/content/abstract/36/suppl_1/D132 [PMC free article] [PubMed]
- Salin H, Fardeau V, Piccini E, Lelandais G, Tanty V, Lemoine S, Jacq C, Devaux F. Structure and properties of transcriptional networks driving selenite stress response in yeasts. BMC Genomics. 2008;9:333. doi: 10.1186/1471-2164-9-333. http://www.biomedcentral.com/1471-2164/9/333 [PMC free article] [PubMed] [Cross Ref]
- Ernst J, Nau GJ, Bar-Joseph Z. Clustering short time series gene expression data. Bioinformatics (Oxford, England) 2005;21(Suppl 1):i159–168. doi: 10.1093/bioinformatics/bti1022. [PMID: 15961453] [PubMed] [Cross Ref]
- Banerjee D, Lelandais G, Shukla S, Mukhopadhyay G, Jacq C, Devaux F, Prasad R. Responses of Pathogenic and Nonpathogenic Yeast Species to Steroids Reveal the Functioning and Evolution of Multidrug Resistance Transcriptional Networks. Eukaryotic Cell. 2008;7:68–77. doi: 10.1128/EC.00256-07. http://ec.asm.org/cgi/content/abstract/7/1/68 [PMC free article] [PubMed] [Cross Ref]
- Fardeau V, Lelandais G, Oldfield A, Salin H, Lemoine S, Garcia M, Tanty V, Crom SL, Jacq C, Devaux F. The Central Role of PDR1 in the Foundation of Yeast Drug Resistance. J Biol Chem. 2007;282(7):5063–5074. doi: 10.1074/jbc.M610197200. http://www.jbc.org/cgi/content/abstract/282/7/5063 [PubMed] [Cross Ref]
- Toni T, Welch D, Strelkowa N, Ipsen D, Stumpf M. Approximate Bayesian Computation scheme for parameter inference and model selection in dynamical systems. J Roy Soc Interface. 2009;6:187–202. doi: 10.1098/rsif.2008.0172. [PMC free article] [PubMed] [Cross Ref]
- Gutenkunst RN, Waterfall JJ, Casey FP, Brown KS, Myers CR, Sethna JP. Universally sloppy parameter sensitivities in systems biology models. PLoS Comput Biol. 2007;3:1871–1878. doi: 10.1371/journal.pcbi.0030189. [PubMed] [Cross Ref]
- Gutenkunst RN, Casey FP, Waterfall JJ, Myers CR, Sethna JP. Extracting falsifiable predictions from sloppy models. Ann N Y Acad Sci. 2007;1115:203–11. doi: 10.1196/annals.1407.003. [PubMed] [Cross Ref]
- Lauritzen SL. Graphical models. Oxford University Press; 1996.
- Werhli A, Husmeier D. Reconstructing Gene Regulatory Networks with Bayesian Networks by Combining Expression Data with Multiple Sources of Prior Knowledge. Statistical Applications in Genetics and Molecular Biology. 2007;6 doi: 10.2202/1544-6115.1282. [PubMed] [Cross Ref]
- Mukherjee S, Speed TP. Network inference using informative priors. Proceedings of the National Academy of Sciences (PNAS) 2008;105(38):14313–14318. doi: 10.1073/pnas.0802272105. [PubMed] [Cross Ref]
- Grzegorczyk M, Husmeier D. In: Advances in Neural Information Processing Systems (NIPS) Bengio Y, Schuurmans D, Lafferty J, Williams CKI, Culotta A, editor. Vol. 22. 2009. Non-stationary continuous dynamic Bayesian networks; pp. 682–690.

Articles from BMC Systems Biology are provided here courtesy of **BioMed Central**

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |