Transcription factors (TFs) and microRNAs (miRNAs) are important regulators that determine the expression levels of mRNAs and miRNAs [1]. TFs activate or repress gene transcription by binding to specific sites (transcription factor binding sites, or TFBSs) in promoter regions, thus regulating gene expression at the transcriptional level; miRNAs inhibit mRNA translation by inducing mRNA degradation and/or blocking the translation machinery, thus negatively regulating gene expression at the post-transcriptional level. Given that the transcription of both mRNAs and miRNAs is regulated by TFs, and that mRNA expression, including that of TF-encoding mRNAs, can be modulated by miRNAs, the cellular transcriptome is believed to be determined by a combinatorial regulatory network of at least two interconnected layers, in which TFs work as master regulators in the transcriptional layer and miRNAs as fine tuners in the post-transcriptional layer [1]. It is therefore critical to delineate and characterize these two-layered combinatorial regulatory networks in order to understand regulatory mechanisms with higher precision than either layer alone allows.
Databases such as TransFAC [2] on TF-to-mRNA regulation, TransmiR [3] on TF-to-miRNA regulation, and TarBase [4] on miRNA-to-mRNA regulation provide experimentally validated regulatory relationships between regulators and their targets. However, such data alone are too limited to enable large-scale studies. Researchers have therefore resorted to a forward-prediction strategy, inferring regulatory relationships between TFs or miRNAs and their putative targets from the matching or complementarity of motif or seed sequences [5]. In this way, they built two-layered combinatorial regulatory networks and investigated their global and local architectural properties [7]. Such predictions inevitably carry a high rate of false positives [9]. Moreover, these forward approaches generate ‘reference networks’ that span all spatiotemporal contexts: in concept, all regulations that take place at different time points and in different cells or tissues are merged indiscriminately. That is, forward engineering cannot resolve a conditional regulatory network that corresponds to a particular cellular context. The reverse-engineering strategy therefore comes into use, in which the regulatory relationships between TFs or miRNAs and their putative targets (cause) are inferred from the observed expression correlations (consequence) (for a review see [10]).
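To make the forward-prediction idea concrete, the following minimal Python sketch scans a 3'UTR for sites complementary to a miRNA seed. It is an illustration only, not the procedure of any tool cited here: the function names, the choice of nucleotides 2-8 as the seed, and the toy sequences are all assumptions made for this example.

```python
# Toy forward prediction: find 3'UTR sites complementary to a miRNA seed.
# All names and sequences are hypothetical and for illustration only.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna, utr, seed_start=1, seed_end=8):
    """Return 0-based positions in `utr` that match the seed's reverse complement.

    Assumption: the seed is nucleotides 2-8 of the miRNA (1-based),
    i.e. the Python slice mirna[1:8].
    """
    site = reverse_complement(mirna[seed_start:seed_end])
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # toy miRNA sequence
toy_utr = "AAACUACCUCAGGG"            # toy 3'UTR containing one seed site
print(seed_sites(toy_mirna, toy_utr))
```

A match found this way says only that binding is sequence-compatible; it carries no information about whether the regulation occurs in a given cell type, which is the context-independence problem described above.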
Reverse engineering has been applied to inferring TF-controlled transcriptional regulatory networks [11] as well as to screening potential miRNA targets [14]. However, successful applications of reverse engineering to combinatorial networks involving both TFs and miRNAs have been rare, apart from a few studies in which small-scale combinatorial circuits of miRNAs and TFs were mapped around selected genes prioritized from the expression data [16]. The major obstacle in this direction, the lack of simultaneously measured miRNA and mRNA expression data, is being relieved as parallel miRNA and mRNA expression datasets are continuously released to the public [19], such as those for epithelial samples [20] or various tumor samples [22]. So far these parallel expression datasets have been used only to confirm predicted miRNA targets [24] or to extract tumor-classifying molecular signatures [26]; they have far more potential to be exploited.
Previously, we integrated forward-predicted gene regulatory relationships with miRNA-perturbed gene expression datasets (MPGE datasets) and, using a nonparametric test and linear regression modeling, elucidated miRNA-centered primary and secondary regulatory cascades in human cancer [27]. Because the expression data were limited to mRNA, the combinatorial regulatory networks mapped therein encompassed only the regulation of mRNAs by TFs and by miRNAs (TF-to-mRNA, miRNA-to-mRNA), missing the regulation of miRNAs by TFs (TF-to-miRNA). The same limitation applies to a contemporary study [28], which substituted an mRNA's expression data for that of its embedded intragenic miRNA in order to identify miRNA-mediated feedback and feed-forward loops. Studies of combinatorial gene regulatory networks can be advanced significantly with the help of the aforementioned parallel miRNA and mRNA expression datasets. Following our preceding work on human cancers [27], we are particularly interested in the NCI-60 data panel [23], which comprises 60 cancer cell lines originating from breast, central nervous system, colon, leukemia, melanoma, non-small cell lung, ovarian, prostate, and renal tissues.
In the present work, we demonstrate an efficient integration of forward-predicted candidate regulatory relationships with the NCI-60 panel of parallel miRNA and mRNA expression datasets, giving rise to a genome-scale combinatorial network of transcriptional and miRNA regulation in human cancer. The resultant combinatorial regulatory network provides a scaffold for the systematic study of cancer gene regulation, and the demonstrated pipeline can be extended to reveal conditional combinatorial regulatory landscapes in other cellular contexts.
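The general idea of combining forward-predicted candidates with parallel expression profiles can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline used in this work: it swaps in a plain Pearson correlation with a fixed cutoff as the filter, and the identifiers (`miR-x`, `TF_y`, `GENE_A`, `GENE_B`), the expression values, and the threshold are all hypothetical.

```python
# Toy integration: keep forward-predicted regulator-target pairs whose
# expression profiles across samples are consistent with regulation.
# Pearson correlation with a fixed cutoff is an assumption for this sketch.
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def filter_candidates(candidates, expr, threshold=0.8):
    """Return (regulator, target) pairs supported by expression correlation.

    miRNA candidates must be strongly anti-correlated with their targets
    (repression); TF candidates may be correlated in either direction
    (activation or repression).
    """
    kept = []
    for regulator, target, kind in candidates:
        r = pearson(expr[regulator], expr[target])
        if kind == "miRNA" and r <= -threshold:
            kept.append((regulator, target))
        elif kind == "TF" and abs(r) >= threshold:
            kept.append((regulator, target))
    return kept


# Hypothetical expression profiles over five samples.
toy_expr = {
    "miR-x": [1, 2, 3, 4, 5],
    "TF_y": [2, 4, 6, 8, 10],
    "GENE_A": [10, 8, 6, 4, 2],
    "GENE_B": [3, 1, 4, 1, 5],
}
# Hypothetical forward-predicted candidates: (regulator, target, regulator type).
toy_candidates = [
    ("miR-x", "GENE_A", "miRNA"),
    ("miR-x", "GENE_B", "miRNA"),
    ("TF_y", "miR-x", "TF"),
]
print(filter_candidates(toy_candidates, toy_expr))
```

Because miRNA and mRNA levels are measured in the same samples, the TF-to-miRNA pair can be tested in the same way as the miRNA-to-mRNA pairs, which is what mRNA-only expression data cannot offer.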