Nuclear receptors (NRs) are a superfamily of ligand-activated transcription factors that modulate the expression of specific genes by interacting with specific DNA sequences upstream of their target genes. More than 100 nuclear receptors have been identified so far [1
]. Estrogen receptor (ER) is a member of the nuclear receptor superfamily and was categorized as a ligand-dependent steroid receptor in the 1960s. That work showed that ER controls diverse biological processes by mediating the actions of the steroid hormone estrogen, and it afforded an appreciation of ER's global importance in cell growth, cellular signalling, differentiation, maturation and homeostasis in eukaryotic cells. The general pathway for steroid hormone action was subsequently elucidated [4
Unlike conventional transcription factors, ER is composed of several domains, including those for ligand binding, DNA binding, dimerization, and transcriptional activation. The ligand binding domain participates in several activities, including hormone binding, homo- and/or heterodimerization, and transcriptional activation and repression. The binding of estrogen induces conformational changes in ER that can regulate gene expression either through direct interaction with DNA (the genomic pathway of ER action) or indirectly, through the modulation of specific proteins (the non-genomic pathway) [5
In a gene regulatory network, variations in gene transcription are controlled by many transcription factors. It has been established that regulatory sequences lie in the proximity of genes and that proteins can bind to those elements and control gene activity by either activating or repressing transcription [7
]. Inferring the regulatory network is therefore an important research topic for understanding gene regulation [8
]. Recent genomic technologies, such as genome-wide expression arrays and sequencing, allow us to elucidate global gene regulatory mechanisms. Mature microarray technology provides a wealth of gene expression data: the expression levels of thousands of genes can be observed at once, which helps predict gene-to-gene interactions more accurately from the similarity or dissimilarity of their expression profiles.
One approach to establishing a gene regulatory network is to start from gene-gene correlations or interactions. Many computational approaches have been developed to measure associations between mRNA abundance profiles in order to predict transcriptional regulatory interactions. Some attempts to determine gene regulation are based on gene expression clustering algorithms, which group genes that show similar expression using a correlation coefficient matrix [9
] or a mutual information-based algorithm [10
] under the same condition [8
]. However, clustering similar, co-regulated genes reveals little about the biological mechanisms of gene regulation or the regulatory pathway. Thus, several computational algorithms have been proposed to reconstruct gene networks with statistical approaches, such as Relevance Network, Bayesian Network, Linear Regression Network [12
], and our own Regulation Network [3
Relevance Network detects the relatedness between two genes from their gene expression profiles and places a link between a transcription factor and its target gene if the two are correlated [13
]. The typical measures of relatedness are the Pearson correlation coefficient and mutual information. The Pearson correlation coefficient performs better at detecting linear relationships, but it is not as intuitive as the Euclidean distance measure [17
]. Mutual information (MI) performs well on non-linear relationships. For example, the ARACNE algorithm [16
] estimates the mutual information between the expression profiles of two genes using a Gaussian kernel estimator. The MI measure of relatedness is non-negative and, once normalized, ranges from 0 to 1. Relevance Network is a relatively simple model that computes the pair-wise similarity or dissimilarity between two genes.
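To make the two relatedness measures concrete, the following is a minimal sketch of a relevance network, not the implementation of any cited method. It assumes toy expression profiles, a hypothetical edge threshold of 0.8, and a simple equal-width histogram estimate of MI in place of the Gaussian kernel estimator used by ARACNE. All gene names and values are illustrative.

```python
import math
from collections import Counter
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sx * sy)

def discretize(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    return [min(int((v - lo) * bins / (hi - lo)), bins - 1) for v in values]

def mutual_information(x, y, bins=3):
    """Histogram-based MI estimate in nats (an assumption; not a kernel estimator)."""
    n = len(x)
    bx, by = discretize(x, bins), discretize(y, bins)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in pxy.items()
    )

def relevance_network(profiles, threshold=0.8):
    """Return edges (gene_a, gene_b, r) whose |Pearson r| reaches the threshold."""
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges

# Toy expression profiles (hypothetical values, one entry per sample).
profiles = {
    "TF1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneA": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with TF1
    "geneB": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls with TF1
    "geneC": [3.0, 1.0, 4.0, 1.0, 5.0],  # no clear linear relation to TF1
}
edges = relevance_network(profiles)

# A purely non-linear relationship: v = u**2 on a symmetric range.
u = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
v = [a * a for a in u]
r_uv = pearson(u, v)              # the linear measure misses the dependence
mi_uv = mutual_information(u, v)  # the MI estimate is clearly positive
```

In this toy data the Pearson-based network links TF1, geneA and geneB but leaves geneC unconnected, while the quadratic pair shows why MI is preferred when the dependence is not linear.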
Bayesian Network (BN) can identify causal relationships between variables. The topology of a BN encodes the dependence or independence of variables [18
], and a BN algorithm can reveal the dynamics of the gene regulation hierarchy. While a BN has the advantage of a structured model, it is difficult to determine whether a node (gene) is important enough to be included. Another challenge is computational stability: inference usually yields multiple optimal networks [19
]. The high computational requirement makes inference of a large-scale regulatory network almost impossible [20
]. In addition, a BN assumes no gene-gene interaction, which can misrepresent the data.
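The point about multiple optimal networks can be illustrated with a small sketch. It assumes two genes whose expression has been discretized to binary states (hypothetical data) and compares the maximised log-likelihood of the two possible single-edge structures. The helper name and the data are illustrative only and are not taken from any cited BN method.

```python
import math
from collections import Counter

def max_log_likelihood(data, parent, child):
    """Maximised log-likelihood of the two-node network parent -> child.

    Parameters are the maximum-likelihood estimates: P(parent) from marginal
    counts and P(child | parent) from joint counts.
    """
    n = len(data)
    parent_counts = Counter(row[parent] for row in data)
    joint_counts = Counter((row[parent], row[child]) for row in data)
    ll = 0.0
    for count in parent_counts.values():
        ll += count * math.log(count / n)
    for (p_state, _), count in joint_counts.items():
        ll += count * math.log(count / parent_counts[p_state])
    return ll

# Hypothetical discretized expression states (1 = high, 0 = low).
data = (
    [{"TF": 1, "gene": 1}] * 6
    + [{"TF": 1, "gene": 0}] * 2
    + [{"TF": 0, "gene": 1}] * 1
    + [{"TF": 0, "gene": 0}] * 5
)

ll_tf_to_gene = max_log_likelihood(data, "TF", "gene")
ll_gene_to_tf = max_log_likelihood(data, "gene", "TF")
```

Both edge directions reach the same maximised likelihood, so a likelihood-based score cannot choose between them from observational data alone. This is one reason why structure learning can return several equally optimal networks.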
Our proposed ERα regulatory network combines TF binding affinity estimated from ChIP-seq data, up- or down-regulation determined from gene expression, and motif conservation in probe sequences. This approach made effective use of the genomic or non-genomic actions. Unlike previous regression approaches, it did not use correlation information.
In this paper, our proposed approach analyzes the interaction between a TF and its target gene conditioned on a group of specific modulator genes. We also consider changes in the modulators' expression levels to assess their influence on transcriptional activity. We reconstruct gene regulatory networks in related biological subjects via multiple linear regression with an interaction term, so that the inferred modulator gene is directly embodied in the model and the relationships among the biological subjects they represent are easily exploited. As a result, this approach gives deeper insight into how the structure, function, and behaviour of components evolve.
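As a rough illustration of multiple linear regression with an interaction term, the sketch below fits target = b0 + b1·TF + b2·M + b3·TF·M by ordinary least squares. The exact model form is not given in this section, so the single modulator M, the variable names, and the noise-free toy data are all assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def fit_interaction_model(tf, modulator, target):
    """Least-squares fit of target ~ b0 + b1*tf + b2*m + b3*tf*m.

    Solves the normal equations (X^T X) b = X^T y for the four coefficients.
    """
    rows = [[1.0, t, m, t * m] for t, m in zip(tf, modulator)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical expression levels across eight samples.
tf = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 1.2, 2.2]
modulator = [0.2, 1.5, 0.7, 1.9, 0.4, 1.1, 2.0, 0.9]
# Noise-free target generated from known coefficients, so the fit is checkable.
target = [1.0 + 0.5 * t - 0.3 * m + 0.8 * t * m for t, m in zip(tf, modulator)]

b0, b1, b2, b3 = fit_interaction_model(tf, modulator, target)

def tf_effect(m):
    """Fitted effect of the TF on the target at modulator level m."""
    return b1 + b3 * m
```

Under this form, the effect of the TF on its target at modulator level m is b1 + b3·m, so a non-zero b3 indicates that the modulator changes the strength of the TF-target relationship.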