Homeostasis, the ability to respond to a plethora of environmental challenges, is vital to the cell. This adaptation is achieved by an orchestrated regulation of gene expression. Some transcription factors (TFs) have been found to act as master regulators under many different conditions, with the specificity of the regulatory response obtained by dispatching the signal from the master regulators to downstream TFs (1). It is quite clear that direct TF interactions (TFIs), both physical and genetic, are the prevalent mechanisms of this dispatching (2–4). A method for the detection of functionally relevant, condition-specific TFIs would therefore greatly contribute to our understanding of gene regulation.
A necessary first step toward the detection of TFIs is the quantification of individual TF activity. It is difficult to deduce the activity of a TF from its expression alone [only a small fraction of TFs show expression levels that correlate with those of their target genes (5)], as there are many alternative mechanisms to activate TFs. A complementary approach is the quantification of TF-DNA binding with chromatin immunoprecipitation (ChIP) assays (6). Computational approaches rely on a known TF-target interaction graph (6). A linear model that describes gene expression as the product of a position-specific activity matrix derived from binding data and the unknown TF activities is presented in (8). The experimental detection of TFIs is based on techniques such as co-immunoprecipitation and protein binding arrays (6), which are costly and time-consuming. A statistical framework to deduce TF cooperativity from the overrepresentation of common TF motifs in the promoter regions of target genes is presented in (10). However, these approaches neither make direct use of gene expression profiles nor yield condition-specific predictions. The most promising approaches integrate multiple sources of information, e.g. expression data with binding sites from ChIP. The idea is that if two TFs act cooperatively, then there should exist a sufficiently large set of target genes to which both TFs bind, and the expression profiles of these target genes should be similar across a series of experiments (12). This concept is used to rigorously assess cooperativity among TFs in the yeast cell cycle (13). Bar-Joseph et al. construct regulatory gene modules by requiring co-regulation and the co-occurrence of binding sites for a pair of interacting TFs. Beer et al. cluster gene expression profiles in a preliminary step and apply a Bayesian classifier to predict TF modules, i.e. groups of TFs that act together in regulating a set of targets. Advanced statistical models for the integration of binding data and expression data are used in (16). Single TFs and TF sets are modeled as hidden variables in a sparse regression model. In this way, the authors can assign a significance value to the combinatorial activity of each TF set. Wang et al. view the problem of TFI identification as a learning task and use Bayesian networks to integrate multiple sources of evidence and predict cooperatively binding TFs.
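The integration idea described above (a sufficiently large set of common targets whose expression profiles are similar across experiments) can be made concrete with a small sketch. This is only an illustration under stated assumptions and not the procedure of any of the cited studies: the function names, the `min_shared` threshold, the use of the mean pairwise Pearson correlation as the similarity measure, and the toy data are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def shared_target_coherence(targets_a, targets_b, profiles, min_shared=3):
    """Mean pairwise correlation among genes bound by both TFs.

    Returns None if fewer than `min_shared` common targets exist
    (the threshold is an assumption made for this illustration).
    """
    shared = sorted(set(targets_a) & set(targets_b))
    if len(shared) < min_shared:
        return None
    pairs = list(combinations(shared, 2))
    total = sum(pearson(profiles[g], profiles[h]) for g, h in pairs)
    return total / len(pairs)


# Toy expression profiles across four experiments (invented values).
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [0.5, 1.0, 1.5, 2.0],
    "g4": [4.0, 1.0, 3.0, 2.0],
}
targets_tf1 = {"g1", "g2", "g3", "g4"}  # hypothetical binding targets of TF 1
targets_tf2 = {"g1", "g2", "g3"}        # hypothetical binding targets of TF 2

coherence = shared_target_coherence(targets_tf1, targets_tf2, profiles)
too_few = shared_target_coherence({"g1"}, {"g4"}, profiles)
```

In this toy example the three common targets have proportional profiles, so their coherence is close to 1, while a TF pair without enough common targets is not scored at all.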
Although there are only a few studies that focus on TFIs, genetic interactions in general have been investigated extensively. Classically, the biological concept of genetic interaction (e.g. epistasis) relies on the simultaneous perturbation of two components, which yields an effect different from what one would expect from the perturbation of the individual components. This was applied at large scale in synthetic lethality/growth defect screens such as (18–20), to name a few. Typically, as many genes as possible are screened for interaction in an automated way by measuring the fitness of single and double gene deletions. Both fitness measures (growth and lethality) are one-dimensional. It is still under debate how the deviation of the double deletion fitness from the fitness of the single deletions can be appropriately measured and tested in a rigorous mathematical framework (21). While this direct interaction measure proved to be rather fragile, the comparison of interaction profiles (the vector of all interaction scores of one gene with all others) yielded surprisingly robust and good results (22). Furthermore, it became evident that the experimental effort can be reduced considerably if not all pairwise combinations of the genes of interest [~5.4 million combinations tested in (18)] are screened, and that even more information can be gained from measurements under different conditions. This insight is reflected in the work of Bandyopadhyay et al., who identified genes interacting with DNA damage-specific partners by screening a comparatively low number of 80 000 double mutants.
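To illustrate the two quantities discussed above, the following sketch computes a direct interaction score and a profile similarity. As noted in the text, the appropriate measure of deviation is still debated; the multiplicative null model used here is only one common choice and is an assumption of this example, as are the function names and the toy numbers. Fitness values are taken relative to a wild-type fitness of 1.

```python
from math import sqrt


def epistasis(f_a, f_b, f_ab):
    """Deviation of the double-deletion fitness from the multiplicative
    expectation f_a * f_b (assumed null model, one of several in use)."""
    return f_ab - f_a * f_b


def profile_similarity(p, q):
    """Pearson correlation between two interaction profiles, i.e. the
    vectors of interaction scores of two genes with the same partners."""
    n = len(p)
    mp, mq = sum(p) / n, sum(q) / n
    cov = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    sp = sqrt(sum((a - mp) ** 2 for a in p))
    sq = sqrt(sum((b - mq) ** 2 for b in q))
    return cov / (sp * sq)


# Toy example: two single deletions and their double deletion (invented).
eps = epistasis(f_a=0.9, f_b=0.8, f_ab=0.4)  # negative: worse than expected

# Toy interaction profiles of two genes against the same four partners.
profile_x = [-0.3, 0.0, 0.1, -0.2]
profile_y = [-0.6, 0.0, 0.2, -0.4]
similarity = profile_similarity(profile_x, profile_y)
```

A single score such as `eps` depends on one noisy measurement triple, whereas the profile similarity pools evidence over all partners, which is one intuitive reason why profile comparison tends to be more robust.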
In the present work, we extend the concept of genetic interaction to high-dimensional phenotypes (e.g. genome-wide messenger RNA (mRNA) measurements, RNA-seq) as these become increasingly available. We formulate a mathematical concept of TFI which relies on the assumption that the common targets of interacting TFs should behave significantly differently from the genes targeted by only one of the TFs. So far, each pairwise genetic interaction had to be tested in an individual experiment, requiring a huge number of combinatorial perturbations. Our method instead needs only one global intervention to the system [the fact that led to the name One Hand Clapping (OHC)] in the form of an environmental stimulus, together with a high-dimensional gene activity readout, in order to score all pairwise TFIs. As in the case of synthetic genetic arrays, we compare the obtained interaction profiles between TFs to obtain reliable and stable predictions. A first proof of concept of this method was given in (24), where we applied OHC to transcriptional activity data obtained under osmotic stress. Here, we establish a solid methodological basis and provide a proof of its universal applicability. After benchmarking the performance of OHC, we construct a compendium of high-confidence, condition-specific TFIs based on a large gene expression screen (25). Finally, we validate two of the novel TFI predictions under osmotic stress, one of them in silico, the other one in vivo. OHC is available as an open-source, user-friendly R package (see Supplementary data). The current best practice in the study of gene regulation, consisting of quantification of differential expression and gene set enrichment analysis, can now be extended by screening for combinatorial TF activity.
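The underlying assumption, that common targets of two interacting TFs respond differently from genes targeted by only one of them, can be illustrated with a toy comparison. The sketch below is not the OHC statistic and does not reproduce the R package; it is a hypothetical stand-in that compares mean responses with a simple permutation test. The function name, the choice of test, the parameters `n_perm` and `seed`, and all data are assumptions made for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def shared_vs_exclusive(response, targets_a, targets_b, n_perm=2000, seed=0):
    """Compare genes bound by both TFs with genes bound by exactly one.

    `response` maps gene -> expression change after the stimulus.
    Returns (difference of mean responses, two-sided permutation p-value).
    """
    targets_a, targets_b = set(targets_a), set(targets_b)
    shared = sorted(targets_a & targets_b)
    exclusive = sorted(targets_a ^ targets_b)
    if not shared or not exclusive:
        raise ValueError("need both shared and exclusive targets")

    k = len(shared)
    pooled = [response[g] for g in shared + exclusive]
    observed = mean(pooled[:k]) - mean(pooled[k:])

    # Shuffle the group labels and count differences at least as extreme.
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if abs(diff) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy responses to a single stimulus (invented values).
response = {f"s{i}": 2.0 + 0.1 * i for i in range(5)}         # common targets
response.update({f"a{i}": -0.2 + 0.1 * i for i in range(5)})  # TF A only
response.update({f"b{i}": -0.2 + 0.1 * i for i in range(5)})  # TF B only

targets_a = {g for g in response if g[0] in "sa"}
targets_b = {g for g in response if g[0] in "sb"}
diff, p_value = shared_vs_exclusive(response, targets_a, targets_b)
```

Because the score needs only one response vector and the two target sets, every TF pair can be evaluated from the same single-stimulus experiment, which is the practical appeal of the approach described above.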