We have introduced MINDy, a novel method for the identification of context-specific, post-translational modulators of TF activity. Literature-based and experimental validation suggests that MINDy can recapitulate a large fraction of known MYC modulators and infer novel, context-specific modulators, both of MYC and of other TFs.
For well-studied TFs, targets for the analysis may be selected from the literature or by performing genome-wide ChIP assays [37, 38]. However, computationally inferred targets performed as well as or better than literature-based ones, likely due to their context-specific nature. About 269 TFs have more than 50 ARACNe-inferred targets using the B-cell profiles, and may thus be effectively analyzed by MINDy.
Algorithm performance was remarkably robust to candidate target selection (DB-targets vs. AR-targets). Additionally, MINDy’s formulation is relatively simple, requiring only the availability of a large gene expression profile dataset (n ≥ 200 profiles) characterizing a sufficient variety of naturally occurring or experimentally perturbed cellular phenotypes. This suggests that MINDy can be used to analyze most TFs in a variety of cellular contexts.
Several limitations should be noted. First, candidate modulators that do not satisfy the range constraint cannot be tested by the method. These, however, are mostly either genes that are not expressed or genes whose availability is so tightly regulated (e.g., housekeeping genes) that variability in the gene expression profiles is too limited to establish a low and a high range of expression. Second, candidate modulators that do not satisfy the independence constraint may not be tested using this approach. In practice, fewer than half (100/233 = 42.9%) of the Ingenuity modulators were in this category. This is not a theoretical limitation of the method but rather an assumption we use to increase its sensitivity. If desired, the more general test may be used without relying on the assumption I[TF; M] = 0 (see SI Section 1.1). Additionally, transcriptional modulators of MYC can be directly inferred by ARACNe and do not require MINDy. Third, in the rare event that the regulatory program of a TF changes from activation to repression for specific targets, as a function of a modulator, this may not be detected by the algorithm because the MI may not change substantially. In this case the multi-information could be used instead of the conditional mutual information.
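The first and third limitations can be illustrated with a small numerical sketch. It is not the MINDy implementation: the function names `binned_mi` and `delta_mi`, the equal-frequency histogram estimator, the 35% tail fraction, and the synthetic data are all assumptions made for illustration. The sketch compares the TF-target mutual information in the samples where a candidate modulator is highest with that in the samples where it is lowest, for a target whose dependence on the TF is switched on by the modulator and for a target whose dependence merely changes sign.

```python
import numpy as np

def binned_mi(x, y, bins=8):
    """Plug-in mutual information (nats) from an equal-frequency 2-D histogram.

    Assumption: a simple binning estimator stands in for whatever MI
    estimator is actually used by the method.
    """
    n = len(x)
    # Rank-transform each variable, then map ranks to `bins` equal-sized bins.
    xb = (np.argsort(np.argsort(x)) * bins) // n
    yb = (np.argsort(np.argsort(y)) * bins) // n
    joint = np.zeros((bins, bins))
    np.add.at(joint, (xb, yb), 1.0)      # count samples per (x-bin, y-bin) cell
    joint /= n
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0                        # skip empty cells to avoid log(0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def delta_mi(tf, target, modulator, frac=0.35, bins=8):
    """MI(TF; target) in the high-modulator tail minus that in the low tail.

    `frac` (hypothetical default) is the fraction of samples in each tail.
    """
    order = np.argsort(modulator)
    k = int(frac * len(modulator))
    low, high = order[:k], order[-k:]
    return (binned_mi(tf[high], target[high], bins)
            - binned_mi(tf[low], target[low], bins))

# Synthetic data: the modulator is drawn independently of the TF,
# so the independence assumption I[TF; M] = 0 holds by construction.
rng = np.random.default_rng(0)
n = 1000
tf = rng.normal(size=n)
modulator = rng.normal(size=n)
noise = 0.3 * rng.normal(size=n)

# Case 1: the target follows the TF only when the modulator is high.
gated = np.where(modulator > 0, tf, 0.0) + noise
# Case 2: the target is activated or repressed depending on the modulator.
switched = np.where(modulator > 0, tf, -tf) + noise

d_gated = delta_mi(tf, gated, modulator)
d_switched = delta_mi(tf, switched, modulator)
print(f"gated target:    delta MI = {d_gated:.3f}")
print(f"switched target: delta MI = {d_switched:.3f}")
```

In this toy setting the gated target shows a clear MI difference between the two modulator ranges, whereas the sign-switching target does not, because the strength of the dependence is the same in both ranges even though its direction is reversed. A modulator with almost no variability would make the low and high tails indistinguishable, which is the situation the range constraint excludes.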
Comparison with existing methods reveals that MINDy is unique in its ability to discover large numbers of modulators of the same TF. Finding the optimal Bayesian Network structure, assuming arbitrary interactions among genes’ parents, for instance, is hyper-exponential in the number of parents. As a result, dissecting network topologies with large numbers (tens to hundreds) of upstream modulators, as is the case for MYC, may be difficult. Other methods [39], while promising in a yeast context, have not yet been extended to mammalian networks. Finally, comparison with a recently introduced algorithm, NetworKIN [11], shows that the latter is restricted to substrates of only 73 kinases, from 20 families, while MINDy can be used to dissect post-translational interactions of a much wider nature, including phosphorylation, acetylation, chromatin modification, formation of transcriptional complexes, and binding site antagonism, as shown for STK38, HDAC1, MEF2B and BHLHB2, just to mention those experimentally validated in this manuscript.
The ability to infer direct and upstream modulators of a desired TF’s activity suggests that MINDy may provide highly specific pharmacological targets for the activation or repression of specific transcriptional programs, when modulators are restricted to druggable genes [40]. This could be valuable because TFs are generally considered difficult pharmacological targets.
Although preliminarily applied to the identification of modulators of MYC for experimental validation purposes, MINDy has already provided novel biological insights of general significance. First, the results indicate that not all modulators can influence the program of a TF in a global fashion, but may rather influence specific subsets of the target genes. This observation suggests that additional levels of regulation can influence the relationships between modulators and the TF they control in different cellular contexts or depending on different signals. Second, MINDy provided novel information about molecules that regulate the activity of the MYC protein. These mechanisms may be critically altered in tumors, thus modulating MYC’s established oncogenic activity. Finally, MINDy is not limited to dissecting post-translational interactions and may be applied without modification to identify TFs that are directly modulated by microRNAs or indirectly by genetic and epigenetic alterations.
At manuscript publication, the source code and executables for MINDy will be made available under an Open Source licensing agreement. Additionally, MINDy will be incorporated into the geWorkbench package, which is distributed both by the NCI and by the National Center for Biomedical Computing at Columbia University (http://wiki.c2b2.columbia.edu/workbench/index.php/Home).