PMCCPMCCPMCC

Search tips
Search criteria 

Advanced

 
Logo of plosonePLoS OneView this ArticleSubmit to PLoSGet E-mail AlertsContact UsPublic Library of Science (PLoS)
 
PLoS One. 2012; 7(8): e42095.
Published online 2012 August 31. doi:  10.1371/journal.pone.0042095
PMCID: PMC3432056
Inference of Biological Pathway from Gene Expression Profiles by Time Delay Boolean Networks
Tung-Hung Chueh1 and Henry Horng-Shing Lu2*
1Green Energy and Environment Research Laboratories, Industrial Technology Research Institute, Chutung, Hsinchu, Taiwan, Republic of China
2Institute of Statistics, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
Frank Emmert-Streib, Editor
Queen’s University Belfast, United Kingdom
* E-mail: hslu/at/stat.nctu.edu.tw
Competing Interests: The authors have declared that no competing interests exist.
Conceived and designed the experiments: THC HHL. Performed the experiments: THC HHL. Analyzed the data: THC HHL. Contributed reagents/materials/analysis tools: THC HHL. Wrote the paper: THC HHL.
Received February 21, 2012; Accepted July 2, 2012.
One great challenge of genomic research is to efficiently and accurately identify complex gene regulatory networks. The development of high-throughput technologies provides numerous experimental data such as DNA sequences, protein sequence, and RNA expression profiles makes it possible to study interactions and regulations among genes or other substance in an organism. However, it is crucial to make inference of genetic regulatory networks from gene expression profiles and protein interaction data for systems biology. This study will develop a new approach to reconstruct time delay Boolean networks as a tool for exploring biological pathways. In the inference strategy, we will compare all pairs of input genes in those basic relationships by their corresponding An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e001.jpg-scores for every output gene. Then, we will combine those consistent relationships to reveal the most probable relationship and reconstruct the genetic network. Specifically, we will prove that An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e002.jpg state transition pairs are sufficient and necessary to reconstruct the time delay Boolean network of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e003.jpg nodes with high accuracy if the number of input genes to each gene is bounded. We also have implemented this method on simulated and empirical yeast gene expression data sets. The test results show that this proposed method is extensible for realistic networks.
In order to understand complex biological networks and pathways, we need to investigate global structures instead of individual behaviors since there are interactions and associations between genes. Due to the invention of high throughput technology, genome-wide expression profiles can be measured simultaneously [1]. However, it is still a great challenge to identify complex biological networks from genome-wide data because the number of gene interactions is huge [2]. In recent years, there has been a significant progress in research concerning genetic network models and network reconstruction problems.
Clustering and dimension reduction are important methods for grouping genes that have similar expression profiles [3], [4]. In the framework of clustering, it is important to define the degree of similarity between genes. By the method of clustering, we can group genes that have similar expressions. However, we still cannot find the causal relationship between genes. Hence, apart from the relationship of similarity, we will also have to consider another causal relationship between genes.
There have been many methods proposed in the literature to tackle the problem of genetic regulatory network reconstruction. For instance, the steady state approach have been used to model gene regulatory networks [5]. In addition, the Bayesian network model is an important technique that has been studied extensively in the past two decades [6][11]. A Bayesian network is a directed acyclic graph (DAG) comprised of two components. The first component is comprised of nodes that correspond to a set of variables and a set of directed edges between variables with Markov properties. The second component describes a conditional distribution for each variable given its parents in the graph. Recently, Bayesian network models have been applied to analyze microarray expression and biological data [12][15]. However, Bayesian network algorithms have limitations when dealing with large-scale gene regulatory networks because of their complex modeling structure [16]. Although algorithms for reconstructing Bayesian networks have already been developed [17], [18], the algorithms’ computational costs remain a concern for the searching of all potential network structures on the genome-wide expression data.
Therefore, we are considering a simpler model: Boolean networks, which have been studied extensively in a variety of contexts. Boolean networks [19], [20] can effectively explain the dynamic behaviors of living systems. Moreover, for large-scale gene regulatory networks, Kim et al. [21] have used Boolean network with chi-square test on the yeast cell cycle microarray gene expression data sets. The chaos and attractors of Boolean network are also discussed widely from the aspect of power spectrum [22][24]. Recently, Boolean network also have been used as a discrete model for the lac operon [25].
Boolean networks were originally introduced by Kauffman, and received attention in the studies of gene regulatory networks because of their simple structures [26]. In a Boolean network model, nodes represent the gene expression states. The status of a gene is quantized to one of the two states: on or off, representing a gene as active or inactive respectively. The wiring of rules between nodes in the graph represents a functional link between genes and determines the expressions of target genes after giving a series of input genes. Under the structure of Boolean networks, the target gene is determined by a set of genes with specific rules. For each gene, if the indegree (i.e., the number of input genes to each gene) is bounded by a constant An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e004.jpg, only An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e005.jpg pairs of state transition are necessary and sufficient to reconstruct the original network with An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e006.jpg nodes [27], [28]. However, Boolean networks have been criticized for their deterministic nature. The assumption that every affected gene would be expressed immediately at the next time step may be unsound.
Another point of view of constructing genetic network is to focus on the indication the pairwise relationships between genes. Most of the works is to find the gene-pairs with similarity relationship [29][33]. The similarity of a gene-pair represents the two genes with the same expression or opposite expression. In 2005, Li and Lu proposed directed acyclic Boolean network and the statistical reconstruction method of SPAN to infer the pair wise relations of every element [34]. The proposed method can reconstruct Boolean networks from noisy array data by assigning an s-p-score for every pair of genes. In the study, they proposed another relationship between two genes: relationship of prerequisite under the Boolean network model. If gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e007.jpg is a prerequisite for gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e008.jpg, then the “on” status of gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e009.jpg is necessary for the “on” status of gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e010.jpg. Boolean implication network, with the similar aspect, investigated all Boolean implication between pairs of gene for large scale genome microarray datasets [35]. Following the model, Wang et al.[36] proposed a two step counting approach for constructing biological pathways with Boolean network. However, most of these methods only consider pair wise relationship in order to decrease the time complexity. Therefore, if the structure of network is a combination of a set of genes to affect another gene, the algorithms will lose some information and rules in the genetic network reconstruction.
In this study, we will consider a much more generalized model by combining the structure of the above two models. If a Boolean function with one or several genes is a prerequisite for a target gene, then the induction of the Boolean function with input genes is necessary for the expression of the target gene. Hence, the target will be influenced by the Boolean function with several input genes. However, the induction of the Boolean function may not activate the target gene immediately, but at a future time. Therefore, the target gene may not have been influenced right now and we will treat these relationships as time delay affection. In this study, we will infuse these additional relationships for more generalized systems.
Boolean Network
Boolean networks were introduced by Kauffman (1969) forty years ago to represent genetic regulatory networks. First, we will review the definition of a Boolean network. A Boolean network An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e035.jpg is a directed graph consisting of two components: a set of nodes An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e036.jpg that corresponds to genes, and a list of Boolean functions An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e037.jpg that corresponds to the rule of interaction and combination of several genes. For every node An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e038.jpg, its expression is simplified to two levels: on and off, representing a gene as active or inactive. For every Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e039.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e040.jpg specified input nodes An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e041.jpg are assigned to the node An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e042.jpg in the graph and represent the rules of regulatory mechanisms between genes. The expression of a gene is determined by the expression of the gene directly affecting it with a Boolean function. Therefore, the state of each node An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e043.jpg is determined by the Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e044.jpg.
For each node An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e045.jpg, the gene expression state at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e046.jpg is assumed to take either 0 (not-expressed) or 1 (expressed) and is expressed as An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e047.jpg. In a Boolean network, every gene expression profile at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e048.jpg is completely determined by the expression profile of a set of genes An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e049.jpg at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e050.jpg and the corresponding Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e051.jpg. That is, we can write An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e052.jpg.
For convenience, we converted the Boolean network An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e056.jpg to the wiring diagram An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e057.jpg (See Figure 1) [37]. For each node An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e058.jpg, suppose An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e059.jpg are the input nodes assigned to An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e060.jpg. Then we construct an additional node An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e061.jpg and connected the edge from An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e062.jpg to An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e063.jpg for each An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e064.jpg. That is, the set of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e065.jpg represents the gene expression profile at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e066.jpg and the set of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e067.jpg corresponds to the gene expression profile at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e068.jpg. Hence we can treat the set of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e069.jpg as the input values and the set of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e070.jpg as the corresponding output values. Therefore, the output values of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e071.jpg are determined by An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e072.jpg.
Figure 1
Figure 1
Boolean network G(V,F), wiring diagram G′(V′,F′) and its input/output.
The Structure of Time Delay Boolean Network
In the previous subsection, we found that given the values of the node (An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e073.jpg) at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e074.jpg, the expressions at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e075.jpg will be updated immediately by specific Boolean function (An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e076.jpg). That is, for every gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e077.jpg, if the expression profile of a set of genes An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e078.jpg at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e079.jpg and the corresponding Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e080.jpg is observed, the gene expression of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e081.jpg at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e082.jpg is determined by An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e083.jpg. However, in real genetic regulatory situations, the deterministic system has been criticized due to the existence of misclassification error and noise. In addition, some of the gene expression may result in time delay when the gene is influenced by one or several input genes. That is, the induction of Boolean function may not activate the target gene immediately, but in the future. Hence, it would have been much more flexible to use a non-deterministic network system. In this subsection, we will consider two relationships between the Boolean function and the target gene instead of the deterministic relation.
First, we will introduce the structure of time delay Boolean networks. Suppose there are An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e101.jpg elements, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e102.jpg in a Boolean network. For any elements An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e103.jpg with specific Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e104.jpg, we have two kinds of pair wise relationship: prerequisite and similarity. We say that a Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e105.jpg with specific An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e106.jpg input genes An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e107.jpg at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e108.jpg is the prerequisite for the target gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e109.jpg at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e110.jpg, if the on-status of Boolean function at time t is necessary for the on-status of gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e111.jpg at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e112.jpg. This relationship is denoted by An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e113.jpg. In other words, if the Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e114.jpg is not active at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e115.jpg, gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e116.jpg will be inactive at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e117.jpg. If it does not cause confusion, we will omit the notation of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e118.jpg and input genes as denoted by An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e119.jpg. Moreover, for every gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e120.jpg, we use An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e121.jpg as its dual (from 0 to 1, or from 1 to 0) in this paper. Therefore, for any Boolean function and target gene with a prerequisite relationship there are a total of two possible relationships: An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e122.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e123.jpg. In this model, we do not consider the situation of a dual of Boolean function prerequisite to the target gene, that is An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e124.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e125.jpg. Since for any Boolean function whose dual is a prerequisite to the target gene, there must exist another Boolean function that is a prerequisite to the target gene. For instance, if An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e126.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e127.jpg, then An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e128.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e129.jpg. Therefore, for the prerequisite relationship, we only consider the Boolean function that is a prerequisite to target gene and the dual of target gene.
The other type of relationship between Boolean function and target gene is similarity. We say that the Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e130.jpg and target gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e131.jpg are similar if the status of the Boolean function and the status of the target gene are in the same expression, and we denoted this by fi~vi. In the same way, we do not consider the situation of Boolean function similar to the dual of target gene such as fi~An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e132.jpg in this study. Since if there is one Boolean function that is similar to the dual of target gene, there must exist another Boolean function that is similar to the target gene.
In the diagram, if a Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e133.jpg is a prerequisite to An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e134.jpg, we draw a directed arrow from the vertex An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e135.jpg to An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e136.jpg and if An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e137.jpg is similar to An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e138.jpg, we use an undirected line to connect An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e139.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e140.jpg.
In the model of time delay Boolean network we proposed, the output of the gene expression is not completely determined by the input state and Boolean function. The output expression may have more than one possible result in the time delay Boolean network. We illustrate the above construction by an example in Figure 2. It has three elements, one similarity and two prerequisite relationships. The possible outputs for every input state are listed in the right part of the graph. If we knew the network structure, some of the inputs would have more than one possible output expression in the time delay Boolean network.
Figure 2
Figure 2
One example of time delay Boolean network and its input/output.
Identification Algorithm
First, we only consider Boolean networks in which the maximum number of input genes is bounded by a constant An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e143.jpg for every target gene, because it has been proven that the number of profiles required grows exponentially if An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e144.jpg is not bounded [38]. For simplicity, we only show algorithms for the case of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e145.jpg. However, the algorithm can be intuitively generalized to any An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e146.jpg in a straightforward way. For the inference of genetic network, we need to clarify the following questions for each target gene.
  • Which input genes will affect the target gene?
  • What kind of Boolean functions will be used for combining those input genes?
  • What kind of relationship exists between the Boolean function and the target gene?
In this subsection, we propose an algorithm to clarify the above questions. The algorithm below is conceptually very simple since it simply uses output Boolean functions with input genes and relationships with target genes that are consistent with the data. First, for each output gene expression at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e147.jpg such as An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e148.jpg, we consider all the pairs of elements in An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e149.jpg at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e150.jpg, for instance An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e151.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e152.jpg. Then we count the eight incidents of (An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e153.jpg) being (0,0,0), (0,0,1), An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e154.jpg, (1,1,1) from the sample and arrange them in a An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e155.jpg table; see the left part of Table 1. We mark a cell “+” if the count is positive and mark it “0” otherwise.
Table 1
Table 1
Count and probabilities table for An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e011.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e012.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e013.jpg assuming no misclassification error.
For detecting whether there exists a Boolean function which is a prerequisite to the target gene, we will compare the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e156.jpg output table with the left four basic relationships in Table 2. We consider the basic relationships to be consistent with the output table if the position of 0 cell in the basic relationships is also 0 in the output table. By comparing the output table with the four basic relationships, we can find relationships that are consistent with the output tables. If there is more than one relationship that is consistent with the output tables, we would use the Boolean logic gate “and” to combine the Boolean function and transfer the result to another Boolean function. Hence, the final Boolean function is the prerequisite to the target gene. Similarly, by comparing the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e157.jpg output table with the right four basic relations in Table 2, we could get another Boolean function which is the prerequisite to the dual of target gene.
Table 2
Table 2
Count profiles for the basic eight relationships without misclassification error.
Moreover, if only one Boolean function occurred in above relationship, that is, if there is only one Boolean function that is the prerequisite to the target gene or the dual of target gene, we will treat that relationship as our final relationship between the Boolean function and the target gene. However, if both of the two prerequisite relationships happened (i.e. An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e158.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e159.jpg An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e160.jpg An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e161.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e162.jpg), we need to check whether these two relationships are in conflict. If the dual of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e163.jpg is equivalent to An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e164.jpg, our conclusion for inference will be that An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e165.jpg is similar to the target gene (that is, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e166.jpg); otherwise, we will treat it as if there is no relationship between the input genes and the target gene because we did not gather enough information to judge true relationships between An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e167.jpg and (An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e168.jpg) at this moment. By the above identification procedure, we can find the corresponding input genes, Boolean function and their relationship for every target gene.
Identification Algorithm with Noisy Array
In previous subsection, we discussed an identification method for data without noise. In this section we will consider the situation of noisy array data. We assume that every element in the entry of (An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e169.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e170.jpg), An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e171.jpg switches to its reverse status with a misclassification probability An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e172.jpg independently; that is
A mathematical equation, expression, or formula.
 Object name is pone.0042095.e173.jpg
(1)
A mathematical equation, expression, or formula.
 Object name is pone.0042095.e174.jpg
(2)
Thus, the observed array (An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e175.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e176.jpg) contains misclassification error. Our goal is to reconstruct time delay Boolean network from noisy array of binary data (An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e177.jpg).
Similar to section 2, we assume that the maximum number of input genes is bounded by 2 for every target gene. We treat the data in the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e178.jpg table as a multinomial distribution with eight cells whose probabilities are An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e179.jpg as shown in the right part of Table 1, where An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e180.jpg. Similarly, we extract the data with misclassification error for every output gene and each pair of input genes as the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e181.jpg table. Now the observed data An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e182.jpg are not generated from the multinomial An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e183.jpg, but from another multinomial An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e184.jpg as shown in Table 3, where An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e185.jpg.
Table 3
Table 3
Count and probabilities table for An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e053.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e054.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e055.jpg with misclassification error.
Because of the misclassification error, a portion of the samples of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e186.jpg may change to the other seven cells. We use the notations of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e187.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e188.jpg to represent the counts of eight cells changed from An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e189.jpg. Analogous notations are defined for An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e190.jpg. The splitting is shown in Table 4. Consequently, the generated probabilities (An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e191.jpg) are calculated as follows: An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e192.jpg, where An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e193.jpg. Here, we adopt the notation An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e194.jpg analogous to An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e195.jpg. The above parameters and splits are shown in Table 4. In the table, it is easy to find that the correspondence between two sets of counts and probabilities is the following:
A mathematical equation, expression, or formula.
 Object name is pone.0042095.e196.jpg
A mathematical equation, expression, or formula.
 Object name is pone.0042095.e197.jpg
(3)
A mathematical equation, expression, or formula.
 Object name is pone.0042095.e198.jpg
Table 4
Table 4
Splitting counts caused by misclassification error.
For the complete data An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e199.jpg, the log-likelihood is given by
A mathematical equation, expression, or formula.
 Object name is pone.0042095.e200.jpg
(4)
where An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e201.jpg are those splitting probabilities. Since the complete data An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e202.jpg are not observable, we use the EM algorithm to maximize the log-likelihood. In the E-step, the splitting counts of complete data An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e203.jpg are evaluated by the conditional expectations using the current values of the parameters by the following formula
A mathematical equation, expression, or formula.
 Object name is pone.0042095.e204.jpg
(5)
where An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e205.jpg. One probabilities of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e206.jpg are zero in those different hypotheses specified in Table 5. In the M-step, we maximize the conditional expectation of the log-likelihood for the complete data to obtain the maximum likelihood estimates (MLEs) of the parameters. According to the MLEs, we can compute the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e207.jpg-score for every pair of input genes and target gene, which are obtained by estimating for the misclassification probability under every prerequisite relationship.
Table 5
Table 5
The eight basic relationships and their probabilistic hypotheses and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e084.jpg-scores.
For the first step, we would like to determine the most probable relationships between every pair of input genes and one output gene. Next, we find the most probable Boolean function with pair input genes for every output gene and select candidate pairs of input genes and output gene for the watch list. Finally, we reconstruct a time delay Boolean network by integrating the relationship of those genes selected.
For one output gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e208.jpg and a pair of input genes An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e209.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e210.jpg, we define the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e211.jpg-scores An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e212.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e213.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e214.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e215.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e216.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e217.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e218.jpg are, respectively, the maximum likelihood estimates of p under the triangular model: An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e219.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e220.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e221.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e222.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e223.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e224.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e225.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e226.jpg.
According to the EM algorithm described above, we can evaluate the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e227.jpg-score for every output gene. We use the MLE An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e228.jpg to measure how well each hypothesis fits: the smaller the score is, the more likely that the corresponding hypothesis could be true.
If the samples are generated from a time delay Boolean network, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e229.jpg-score are quite useful for the discovery of true relationships. Here we can consider the maximum compatibility criterion: to choose the maximum threshold value so that the selected relationships contain no conflicts [34]. We collect those relationships whose An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e230.jpg-scores are smaller than a threshold. Known biological results are helpful for the determination of a threshold. For example, if we know the relationship An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e231.jpg is true, then the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e232.jpg-scores smaller than An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e233.jpg should be in our watch list. As more relationships are included in the watch list, the more likely we are to observe incompatible ones. In general, we can choose the threshold that allows the maximum number of relationships with no conflicting relationships. Next we will demonstrate the method by illustration examples.
Theoretical Results
First, we will analyze the number of input/output pairs required for the network reconstruction of time delay Boolean network to be unique. The theoretical results of classical Boolean networks only consider the similar relationship [27], [38], [39]. The following results prove the theoretical results time delay Boolean networks that has a more flexible structure and consider both similar and prerequisite relationship.
Proposition 1
For all subsets of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e234.jpg with An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e235.jpg genes, if all assignments (i.e., An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e236.jpg assignments) of Boolean values appear in input expression patterns and all of its possible output expression patterns of the target gene are present, the identification of genetic network is determined to be unique, if it exists.
(Proof) Let An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e237.jpg be any gene in An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e238.jpg and suppose An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e239.jpg is controlled by a Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e240.jpg with similarity or prerequisite relationship (i.e., An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e241.jpg or An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e242.jpg). If the Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e243.jpg is similar to An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e244.jpg, the case is proved for the classical Boolean networks in Akutsu et al. (1998). Next, we consider the case of Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e245.jpg as a prerequisite to An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e246.jpg. In this case, there must exist a specific input value An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e247.jpg for An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e248.jpg such that An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e249.jpg have two possible values 0 and 1. Hence, any other genes would not control An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e250.jpg because all assignments of Boolean values are appearance. Let us illustrate the above statement by the example for the case of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e251.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e252.jpg. If An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e253.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e254.jpg, when the input of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e255.jpg is 1, the outcome of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e256.jpg being both 0 and 1 will appearance. Therefore, given the input of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e257.jpg, the outcome of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e258.jpg is not deterministic no matter the value of any other gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e259.jpg is 1 or 0. Hence, any other gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e260.jpg would not affect gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e261.jpg. If An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e262.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e263.jpg for some Boolean function An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e264.jpg, there must exist an input An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e265.jpg such that An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e266.jpg. Then, for any other pair of gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e267.jpg where An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e268.jpg, the outcome of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e269.jpg is not deterministic for any input of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e270.jpg, if the input of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e271.jpg is An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e272.jpg. In a case of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e273.jpg, we can prove that gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e274.jpg which does not belong to An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e275.jpg would not affect the gene An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e276.jpg in a similar way.
Proposition 2
The probability that one sub-assignment with all of its possible results in the target gene does not appear among m random input expression pattern is at most An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e277.jpg.
(Proof) For any fixed set of nodes An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e278.jpg, the probability that a sub-assignment An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e279.jpg does not appear in one random input expression pattern is An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e280.jpg. Thus, among the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e281.jpg random input expressions, the probability that An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e282.jpg appears is An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e283.jpg times is equal to An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e284.jpg where An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e285.jpg. In addition, the probability that all of the possible results in the target gene does not appear among An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e286.jpg times input is smaller than An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e287.jpg for An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e288.jpg and equal to 1 for An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e289.jpg. Hence the probability that one sub-assignment and all of its possible results does not appear among An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e290.jpg random input expression is smaller than An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e291.jpg and this can be bounded by An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e292.jpg by an algebra calculation.
Next we prove the main theorem.
Theorem 1
For the identification of one time delay Boolean network of n nodes with maximum indegree An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e293.jpg, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e294.jpg uniformly and randomly sampled input patterns are sufficient for exact inference with probability at least An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e295.jpg.
(Proof) We consider the probability that the condition of Proposition 1 is not satisfied under An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e296.jpg random input expression patterns.
By Proposition 2, the probability that An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e297.jpg with all of its possible results in the target gene does not appear among the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e298.jpg random input expression patterns is bounded by An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e299.jpg for any fixed set of nodes An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e300.jpg. Since the number of combinations of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e301.jpg nodes from a set of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e302.jpg possibilities is bounded by An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e303.jpg, the probability that the condition of Proposition 1 is not satisfied is at most An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e304.jpg. It is not difficult to see that An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e305.jpg holds for An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e306.jpg. Hence, we obtain the theorem by letting the non-identification probability An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e307.jpg.
Next we develop an information theoretic lower bound on the number of input/output pairs needed for the identification of a time delay Boolean network.
Theorem 2
If the maximum indegree An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e308.jpg, at least An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e309.jpg input/output pairs are required for the identification of a time delay Boolean network in the worst case.
(Proof) The number of time delay Boolean networks is given by all the possible combination of Boolean function with An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e310.jpg nodes from a set of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e311.jpg possibilities with all possible relations between Boolean functions with target node. Since there are An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e312.jpg possible combinations of input nodes, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e313.jpg possible Boolean functions and 3 possible relations between Boolean function with each node, there are An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e314.jpg Boolean networks whose maximum indegree is at most An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e315.jpg. On the other hand, there are at most An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e316.jpg possible output patterns with one input expression pattern. Therefore, An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e317.jpg which is the same as An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e318.jpg input/output pairs are required in the worst case.
Example with Simulation and Real Data
We will illustrate our method by the example described in Figure 2. For the pair of samples consist of three elements list in the right part of Figure 2, we uniformly generated 100 input samples and their corresponding possible output samples with misclassification probability An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e319.jpg. For the prerequisite relationship, if the status of Boolean function with input genes is on, then we allow the output value to have equal probability of on or off. The data can be arranged as input/output sample similar to that obtained from the microarray data with time. Namely, the input of each sample can represent the gene expression at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e320.jpg and the output can represent the gene expression at time An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e321.jpg. For each pair of input and output genes, we compute the 8 basic An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e322.jpg-scores that represent the 8 basic hypotheses in Table 5 for all of pair input genes and output genes. After the calculation, the simulation results of every An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e323.jpg-score are listed in Table 6.
Table 6
Table 6
By the time delay Boolean network in Figure 1, we generate 100 samples with p = 0.05.
Beside the example with 3 elements, in order to shows the superiority of the proposed method can be applied to a larger network, a more comprehensive example with a larger network is given in Figure S1.
Next, we have to decide the threshold for choosing the relations. When we increase the threshold of the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e324.jpg-score, the relations whose An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e325.jpg-score are smaller than the threshold will be chosen. Moreover, when the number is 0.138, the conflict occurs, since we have An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e326.jpg and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e327.jpg. However, in our model, there are at most two genes that would affect an output gene. Therefore, we can choose 0.138 as our threshold and include relations whose An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e328.jpg-score is smaller than the threshold. By these procedures, we can reconstruct the time delay Boolean network identical to Figure 2.
In the area of gene regulatory network study, Schuller has summarized regulatory cis-acting elements of structural genes of the nonfermentative metabolism and described the molecular interactions among general regulators and pathway-specific factors [40]. In the gene regulation of gluconeogenesis by Sip4 and Cat8 pathway, the carbon source control could be identified for the regulator Cat8; see (Figure 6) in Schuller [40]. In this study, we apply our proposed approach to explore the expression profiles and show some exploratory result on the Cat8 pathway.
In order to demonstrate the effectiveness of reconstruction, we use the microarray expression dataset of yeast Saccharomyces cerevisiae produced by DeRisi et al. [1] and Spellman et al. [41]. In total, the data is comprised of 41 experiments after filtering out experiments with missing values. By these experimental microarray data sets, we can use our proposed method to reconstruct the biological pathway and the genetic regulation network result is shown in Figure 3. The result is consistent with the genetic network in the literature. That is, the restraint of Mig1 or activation of Snf1 is a prerequisite for the decreasing of Cat8. Moreover, the restraint of Snf1 or Cat8 is a prerequisite for the decreasing of Mls1. However, the negative similarity between Snf1 and Mig1 is undetectable in our current model.
Figure 3
Figure 3
Network reconstruct from the expression data of yeast Saccharomyces cerevisiae.
Conclusions
In this paper, we have introduced the model of time delay Boolean network that generalizes the Boolean network model in order to cope with dependencies that have two kinds of relationships: similarity and prerequisite. The approach for reconstruction of genetic network inference from gene expression data relies on the assumption that the expression of a gene is likely to be controlled by a relatively small number (say An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e329.jpg) of genes. Also, some bounds on the size of data required for the identification of the time delay Boolean networks under constant of indegree are stated and discussed. Moreover, the algorithm of the network reconstruction from noisy array data is developed.
One characteristic of a Boolean network is that all the variables in the graph are binary. If the data we observed is continuous or quantized to have more than two levels, we need to discretize them. For microarray data, the ratios of expression level would be one possible approach of discretization. That is, we can treat the gene as on (active) if the log-ratio of its expression is larger than zero. We treat it as off (inactive) otherwise. In general, biological background knowledge will be helpful for setting thresholds for discretizaion. On the other hand, if the samples are obtained from a time course, then we can consider the gene as on or off by detecting whether the gene is either increasing or decreasing with time.
The work in progress is aimed at evaluating the effectiveness of the described approach for inferring genetic networks from biological gene expression time series data. Besides that, implementation on some other real biological data is also an important task.
For the implement of the network reconstruction algorithm, the greatest complexity is the computation of An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e330.jpg-score for each of the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e331.jpg input elements and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e332.jpg output elements, where An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e333.jpg is the number of elements and An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e334.jpg is the number of indegree. It is an iterative algorithm to compute the MLE for the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e335.jpg-scores by EM procedure while the common practice is to set an upper bound for iterations in numerical implementation. Consequently, this keeps the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e336.jpg complexity for the computation of MLE. In addition, the sorting algorithm for the An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e337.jpg data cost An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e338.jpg in terms of time. Hence, the overall time complexity for the network reconstruction is An external file that holds a picture, illustration, etc.
Object name is pone.0042095.e339.jpg for this algorithm.
Supporting Information
Figure S1
An example of genetic network with 8 nodes.
(PDF)
Acknowledgments
We thank the editor and reviewers for their constructive comments.
Funding Statement
The authors acknowledge support from the National Science Council, National Center for Theoretical Sciences, Shing-Tung Yau Center, and Center of Mathematical Modeling and Scientic Computing at the National Chiao Tung University in Taiwan. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
1. DeRisi JL, Iyer VR, Brown PO (1997) Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 278: 680–686. [PubMed]
2. Bornholdt S (2005) Less is more in modeling large genetic networks. Science 310: 449–451. [PubMed]
3. Eisen MB, Spellman PT, Brown PO, Botstein D (1998) Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A 95: 14863–14868. [PubMed]
4. Tzeng J, Lu HH-S, Li WH (2008) Multidimensional scaling for large genomic data sets. BMC Bioinformatics 9: 179. [PMC free article] [PubMed]
5. Rawool S, Venkatesh K (2007) Steady state approach to model gene regulatory networks–simulation of microarray experiments. Biosystems 90: 636–655. [PubMed]
6. Jensen FV (1996) An introduction to Bayesian networks. London: University College London Press.
7. Jensen FV (2001) Bayesian networks and decision graphs. New York: Springer.
8. Moler EJ, Radisky DC, Mian IS (2000) Integrating naive bayes models and external knowledge to examine copper and iron homeostasis in s. cerevisiae. Physiol Genomics 4: 127–135. [PubMed]
9. Pearl J (1988) Probabilistic reasoning in intelligent systems: networks of plausible inference. San Mateo: Morgan Kaufmann.
10. Needham C, Bradford J, Bulpitt A, Westhead D (2007) A primer on learning in Bayesian networks for computational biology. PLoS Comput Biol 3: e129. [PMC free article] [PubMed]
11. Reynolds S, Käll L, Riffle M, Bilmes J, Noble W (2008) Transmembrane topology and signal peptide prediction using dynamic Bayesian networks. PLoS Computational Biology 4: e1000213. [PMC free article] [PubMed]
12. Friedman N, Linial M, Nachman I, Pe’er D (2000) Using bayesian networks to analyze expression data. Journal of Computational Biology 7: 601–620. [PubMed]
13. Imoto S, Higuchi T, Goto T, Tashiro K, Kuhara S, et al. (2004) Combining microarrays and biological knowledge for estimating gene networks via bayesian networks. Journal of Bioinformatics and Computational Biology 2: 77–98. [PubMed]
14. Scharfe C, Lu HH-S, Neuenburg JK, Allen EA, Li GC, Klopstock T, Cowan TM, Enns GM, Davis RW (2009) Mapping gene associations in human mitochondria using clinical disease phenotypes. PLoS Computational Biology 5: e1000374. [PMC free article] [PubMed]
15. Liu T, Sung W, Mittal A (2006) Learning gene network using time-delayed bayesian network. International Journal of Artificial Intelligence Tools 15: 353–370.
16. Friedman N, Nachman I, Pe’er D (1999) Learning bayesian network structure from massive datasets: the sparse candidate algorithm. In: Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence. 206–215.
17. Heckerman D, Geiger D, Chickering DM (1995) Learning bayesian networks: the combination of knowledge and statistical data. Machine Learning 20: 197–243.
18. Balakrishnan S, Madigan D (2006) A one-pass sequential Monte Carlo method for Bayesian analysis of massive datasets. Bayesian Analysis 1: 345–362.
19. Huang S (1999) Gene expression profiling, genetic networks and cellular states: An integrating concept for tumorigenesis and drug discovery. Journal of Molecular Medicine 77: 469–480. [PubMed]
20. Shmulevich I, Gluhovsky I, Hashimoto RF, Dougherty ER, Zhang W (2003) Steady-state analysis of genetic regulatory networks modelled by probabilistic Boolean networks. Comparative and Functional Genomics 4: 601–608. [PMC free article] [PubMed]
21. Kim H, Lee JK, Park T (2007) Boolean networks using the chi-square test for inferring large-scale gene regulatory networks. BMC Bioinformatics 8: 37. [PMC free article] [PubMed]
22. Zhang R, de SCavalcante HLD, Gao Z, Gauthier DJ, Socolar JES, et al. (2009) Boolean chaos. Phys Rev E 80: 045202. [PubMed]
23. Socolar JES, Kauffman SA (2003) Scaling in ordered and critical random Boolean networks. Physical Review Letters 90: 68702. [PubMed]
24. Dealy S, Kauffman SA, Socolar JES (2005) Modeling pathways of differentiation in genetic regulatory networks with boolean networks: Research articles. Complex 11: 52–60.
25. Veliz-Cuba A, Stigler B (2011) Boolean models can explain bistability in the lac operon. Journal of Computational Biology 18: 783–794. [PubMed]
26. Kauffman SA (1969) Metabolic stability and epigenesis in randomly constructed genetic nets. Journal of Theoretical Biology 22: 437–467. [PubMed]
27. Akutsu T, Miyano S (1999) Identification of genetic networks from a small number of gene expression patterns under the boolean network model. Proc Pacific Symposium on Biocomputing : 17–28. [PubMed]
28. Ideker T, Thorsson V, Karp R (2000) Discovery of regulatory interactions through perturbation: inference and experimental design. Proc Pacific Symposium on Biocomputing : 302–313. [PubMed]
29. Allocco DJ, Kohane IS, Butte AJ (2004) Quantifying the relationship between co-expression, coregulation and gene function. BMC Bioinformatics 5: 18. [PMC free article] [PubMed]
30. Bosl WJ (2007) Systems biology by the rules: hybrid intelligent systems for pathway modeling and discovery. BMC Systems Biology 1: 13. [PMC free article] [PubMed]
31. Jordan IK, Marino-Ramirez L, Wolf YI, Koonin EV (2004) Conservation and coevolution in the scale-free human gene coexpression network. Molecular Biology and Evolution 21: 2058–2070. [PubMed]
32. Lee HK, Hsu AK, Sajdak J, Qin J, Pavlidis P (2004) Coexpression analysis of human genes across many microarray data sets. Genome Research 14: 1085–1094. [PubMed]
33. Opgen-Rhein R, Strimmer K (2007) From correlation to causation networks: a simple approximate learning algorithm and its application to high-dimensional plant gene expression data. BMC Systems Biology 1: 37. [PMC free article] [PubMed]
34. Li LM, Lu HH-S (2005) Explore biological pathways from noisy array data by directed acyclic boolean networks. Journal of Computational Biology 12: 170–185. [PubMed]
35. Sahoo D, Dill D, Gentles A, Tibshirani R, Plevritis S (2008) Boolean implication networks derived from large scale, whole genome microarray datasets. Genome Biology 9: R157. [PMC free article] [PubMed]
36. Wang H, Lu HH-S, Chueh TH (2011) Constructing biological pathway by a two-step counting approach. Plos one 6: e20074. [PMC free article] [PubMed]
37. Somogyi R, Sniegoski CA (1996) Modeling the complexity of genetic networks: Understanding multigene and pleiotropic regulation. Complexity 1: 45–63.
38. Akutsu T, Kuhara S, Maruyama O, Miyano S (1998) Identification of gene regulatory networks by strategic gene disruptions and gene overexpression. Proc 9th ACM-SIAM Symp Discrete Algorithms: 695–702.
39. Akutsu T, Kuhara S, Maruyama O, Miyano S (2003) Identification of genetic networks by strategic gene disruptions and gene overexpressions under a boolean model. Theoretical Computer Science 298: 235–251.
40. Schuller HJ (2003) Transcriptional control of nonfermentative metabolism in the yeast saccharomyces cerevisiae. Current Genetics 43: 139–160. [PubMed]
41. Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, et al. (1998) Comprehensive identification of cell cycle-regulated genes of the yeast saccharomyces cerevisiae by microarray hybridization. Molecular Biology of Cell 9: 3273–3297. [PMC free article] [PubMed]
Articles from PLoS ONE are provided here courtesy of
Public Library of Science