Co-functional prediction suggests distinct regulatory pattern between the phosphorylation and transcriptional networks
Since various networks of phosphorylation and transcriptional regulation are available, a straightforward question is to test how well different datasets can recapitulate current biological knowledge. We used five datasets to predict co-functional gene pairs, and assessed the accuracy by comparing the predictions with a gold standard dataset (see Materials and methods
). The datasets covered both genetic and biochemical aspects of phosphorylation network and transcriptional regulatory networks, including KPFN (functional networks derived from a microarray study of kinase/phosphatase single deletion strains 
), TFFN (functional networks derived from TF single deletion strains 
), KBN (biochemical networks derived from in vitro
protein chip 
), KPIN (physical networks of kinase/phosphatase interaction 
), and TFBN (TF binding network derived from ChIP-chip experiments 
). Except for KPIN, the other networks are directed. In each network, the similarity between regulators is calculated by the Pearson correlation coefficient of their interaction profiles, which measures the extent two regulators share common targets. It is expected that highly correlated pairs are co-functional, however the prediction accuracy varies a lot across the five networks considered ( A, B). In phosphorylation network, functional networks (KPFN) are more predictive than biochemical or physical interaction networks (KPIN, KBN); while in the transcriptional regulatory network the opposite is true.
Co-function prediction using different datasets suggests distinct regulatory pattern in phosphorylation network and transcriptional network.
Apart from our observation, in transcriptional regulatory network, it has been pointed out in literature that TF signatures overlap poorly with their corresponding binding targets 
, possible explanations of which include protein-protein interactions between TFs 
, homology relationships 
, and indirect transcriptional regulation 
. Our data and other studies indicate a substantial discrepancy between the biochemical networks and functional networks; explaining this contradictory behavior is an interesting question that we will address below.
We present a simple model to explain the difference in genetic signature and biochemical interaction profile. If two regulators act in a linear pathway (), the deletion of either one will cause the same effect, thus lead to similar signatures. However, their binding targets may vary. In contrast, if two regulators work in parallel () and they bind to the same targets, the deletion of either one will have no effect on the expression level of target genes due to genetic buffering. As a result, they have similar biochemical interaction profiles but distinct signatures. Hence, we hypothesize that the regulatory motifs in the phosphorylation and transcriptional regulatory networks are different, with phosphorylation networks being abundant with linear pathways and transcriptional regulatory network abundant with parallel pathways.
Phosphorylation network and transcriptional network differ in motif usage
To validate our hypothesis that phosphorylation network and transcriptional regulatory network are abundant with different motifs, we examined the network motifs of KBN, TFBN, and their combination. We excluded KPIN because it lacks the direction between two kinase. In these networks, nodes represent regulators and targets, and the edges are directed, representing the physical binding of a regulator to certain targets. In order to investigate the cooperative pattern between regulators, we enumerated three node motifs with the restriction that the two regulators have a direct or indirect regulation on the target gene. By calculating the occurrence of the motifs and contrasting with randomly shuffled networks (see Materials and methods
), the significance of network motifs is evaluated ().
Motif enrichment analysis reveals different motif usage in the phosphorylation and transcriptional regulatory networks.
Our data demonstrates phosphorylation network and transcriptional regulatory network utilize different network motifs. The motif “regulator chain” is only enriched in phosphorylation network. Based on the linear regulatory model (), it can be inferred that KPFN is more predictive of co-function than KBN, which coincides with our observation. The “bi-input” motif is also enriched in phosphorylation network, resulting in a genetic buffering effect of phosphorylation events 
. In transcriptional regulatory networks “feed forward loop” (FFL) motif was enriched, which have already been widely discussed 
. The motifs with loop structure within regulators (bi-component loop1, bi-component loop2) are also enriched in transcriptional regulatory networks. In these motifs, two TFs transcribe each other, and generate a bi-stable system, which switches between two alternative states 
. The two motifs tend to characterize an important mode in transcriptional regulation. TFs cooperate to regulate a set of genes (bi-component loop 2), but their functions are not completely redundant (bi-component loop 1). For example, Ste12 and Tec1 are two TFs that co-regulate genes in filamentous pathway (for example, Kss1), but only Ste12 activates genes in mating pheromone pathway (for example, Ste3) (). In this case, the resultant signatures are divergent but their binding profiles overlap with each other on the co-regulated genes (Figure S1
). This phenomenon is termed mixed epistasis in phosphorylation network 
, where two kinase partly buffer each other, and also have unique functions themselves. We demonstrate that this definition can also be extended to transcriptional network according to the enriched bi-component loops. Because of the enriched buffering relationships in transcriptional regulatory networks, TFBN is more predictive of co-function than TFFN, which is also consistent with our observation. In the combined network, the enriched motif is FFL, which couples phosphorylation with transcription. A previous study showed that FFL formed by kinase CDK1 and transcriptional factors was important to drive temporal transcriptional responses in cell cycle regulation 
It is noted that the motifs enriched in the phosphorylation networks are completely disjoint from those in the transcriptional regulatory networks, which suggests the two networks are structurally quite different. Our results also support the idea that motifs are indeed building blocks of biological networks, and different usage of network motifs carries out distinct function. Except for motif usage, the difference might also be due to different global topological features of KBN and TFBN. However, this possibility is not strongly supported as in both KBN and TFBN the degree distributions obey a power law form (Figure S2
), and their edge densities are at comparative levels (0.035 in KBN vs. 0.030 in TBN).
Hetero-regulatory modules couple phosphorylation with transcriptional regulation
Comprehensive techniques that individually analyze phosphorylation network and transcriptional regulatory network have been extensively studied 
, however, an integrative method is still lacking in literature. To our knowledge, this is the first attempt to systematically address the problem of identifying signaling modules through integration of the two networks. We now describe the basic procedures in identifying hetero-regulatory modules (HeR module). Briefly, we first define the hetero-regulatory similarity score (HeRS score) that measures the co-function potential of one kinase/phosphatase(KP) and one TF. Then for each pair of TF-(Kinase/Phosphotase), we use their HeRS score as the entry of the HeRS matrix (), which represents the similarity between TFs and Kinase/Phosphotase. Through clustering the HeRS matrix, we can identify hetero-regulatory modules in which TFs and Kinases/Phosphotases share similar targets ().
Strategy of identifying hetero-regulatory (HeR) modules.
The HeRS score is defined as the Pearson correlation coefficient of the signature profile of a kinase/phosphatase and the binding profile of a TF. Assuming that the transcriptional response to deletion of a kinase/phosphatase is mediated by TFs that function in the same pathway, a high correlation is expected for heterogeneous regulators within pathway. The definition is inspired by the above network motif analysis. In phosphorylation network, where linear regulatory model applies (), the signature profile better characterizes the regulatory role of a kinase/phosphatase than its binding profile. Conversely, transcriptional regulatory network is enriched with parallel regulatory model (). As a result, a regulatory target of a TF is often missing in its signature, but present in the binding profile.
Among the large scale datasets available, we choose KPFN (which provides the signature profile of the kinase/phosphatase) and TFBN (while provides the binding profile of the TF) for integration according to the following reasons: (1) KPFN achieves the best accuracy in predicting co-functional pairs among the phosphorylation datasets, and TFBN is the best among transcriptional regulatory datasets; (2) The transcriptome can serve as an anchor in coupling phosphorylation events with transcriptional regulation, since any change in the transcriptome can be traced back to transcriptional regulation. Among the large scale datasets available, only KPFN characterizes the phosphorylation network at transcriptome level, while KBN and KPIN are at proteome levels. (3) The signature of a TF is usually reduced compared to the actual regulatory targets due to buffering effect. As a result, inference based on TFFN without considering its cellular context can be misleading. Conversely, the binding profile is an unbiased set of potential target genes, and is demonstrated to be a better representative of the transcription factor's function. In addition, a comparison between TFFN and TFBN was conducted, and the HeRS score based on TFBN is proved more accurate in predicting co-functional heterogeneous pairs than that based on TFFN (Figure S3
Next, we identified hetero-regulatory modules (HeR module) using a clustering approach. A hetero-regulatory module is a set of kinases/phosphatases (KPs) and transcription factors which share targets at the transcriptome level. Mathematically, a cluster of KPs and a cluster of TFs form a HeR module if they have high HeRS scores with each other. Biologically, a HeR module is a set of functionally relevant KPs and TFs, in which KPs transmit signals among each other and regulate the transcriptome through TFs. In another word, a HeR module is a set of regulators, and it is very different from co-regulated gene modules derived from common cluster analysis of microarray datasets, which is a set of co-regulated targets.
Given the hetero-regulatory matrix, in which the element (i, j) represents the hetero-regulating similarity of kinase i and TF j, KPs and TFs are clustered separately by hierarchical clustering. In a HeR module, heterogeneous regulators stand for the corresponding KPs and TFs. The target set of a HeR module is naturally derived, composed of genes that are present in the signature of a KP and bound by a TF in this module (). We applied the procedure to identify HeR modules through integration of KPFN and TFBN in S.cerevisiae
(see Materials and methods
, Figure S4
, ), and evaluated the results. To our delight, the HeR modules can be neatly mapped to MAPK pathways. The structure and function of MAPK pathways, as well as its complexity, is well studied in S.cerevisiae
. Therefore, we take it as a model system to illustrate how HeR modules shed light on the structure and functions of signaling pathways, the cross talk between pathways, and how new functional links are inferred.
Hetero-regulatory modules can recover the structure and function of known signaling pathways
By linking phosphorylation events with transcriptional regulation, HeR modules recover several MAPK pathways known in S.cerevisiae
, including filamentous growth (FG) pathway, mating pheromone (MP) pathway, cell wall integrity (CWI) pathway, and high osmolarity glycerol (HOG) pathway, mainly due to the high HeRS scores between the KPs and TFs in the MAPK pathways (). For example, Ste7, Ste11, and Ste20 are shared kinase in the upstream of FG and MP pathway, and they form a tight cluster with TFs Ste12, Tec1, Mcm1, and Dig1, which also function in the two pathways. Kinase Fus3 is an inhibitor of the FG pathway, and it forms a HeR cluster with main FG pathway TFs, Tec1 and Ste12. Similarly, kinase in the HOG pathway, Hog1, Pbs2, and Ssk2, are clustered with TFs Hot1 and Sko1. These TFs are involved in osmotic stress response. In another example, our method places TF Rlm1 in the same cluster with kinase Bck1 and Slt2, which are MAPKKK and MAPK in the CWI pathway, respectively. The high HeRS score suggests Rlm1 is a major TF in the CWI pathway. This is consistent with the finding that CWI pathway stimulates expression of cell wall biosynthesis genes via phosphorylation of TF Rlm1 
. These examples show the advantage of HeRS score in coupling phosphorylation network with transcriptional regulatory network, and the efficiency of HeR modules in pathway identification.
Hetero-regulatory modules inference results.
Besides linking hetero-regulators, a target set is assigned to a HeR module, which provides the capacity to predict the cellular function of the corresponding module. The target set of one HeR module includes all targets regulated by at least one kinase/phosphotase and at least one TF in this module. For the MAPK related HeR modules, we investigated the functional distribution of their target sets. In most cases the abundant function of the target set is consistent with that of the hetero-regulators (). For example, 56% of the target genes with known function in the MP module are annotated with “mating”; all target genes in the CWI module are annotated with “cell wall”. In addition, the kinase in HOG pathway (Hog1, Pbs2, Ssk2) form three HeR modules with different TFs, each representing different functional aspects of the HOG pathway, and the target genes are also enriched with relative functions. For “HOG Kinase - MP TFs” module, 33% of the target genes are annotated with “mating”, for “HOG Kinase - CWI TFs” module, 67% of the target genes are annotated with “cell wall”, for “HOG Kinase - HOG TFs” module, 50% of the target genes are annotated with “stress response”.
HeR modules reveal the cross-talk between pathways
Many components are shared across different MAPK pathways, but cells maintain the specificity in response to signals. The mechanism to suppress erroneous cross-talk between pathways is not very clear in spite of intensive study on this subject.
In our prediction, the HOG kinase cluster (Hog1, Pbs2, Ssk2) forms hetero-regulatory modules with several TF clusters besides HOG pathway TFs. These TF clusters are involved in FG(Tec1, Ste12), MP(Ste12) and CWI(Rlm1) pathways separately(), which suggests the cross talk between the HOG pathway and these other pathways. To further investigate in more detail these HOG related HeR modules, we examined the target set of the HOG kinase (Hog1, Pbs2, Ssk2). When the HOG kinase are deleted, most of the up-regulated genes are annotated in the MP or FG pathway, while the down-regulated genes are mostly annotated in HOG or CWI pathway. The up-regulated genes can only be bound by MP/FG TFs (Ste12, Tec1), while the down-regulated genes are mainly bound by HOG TFs or CWI TFs. These observations suggest that HOG kinase suppresses the cross talk between the HOG pathway and the MP/FG pathway by inhibiting TF activity of Ste12 and Tec1, and induces cross-talk between the HOG and CWI pathways through the activation of Rlm1().
In fact, there are some experimental evidences supporting our inference. A recent study demonstrated the HOG signaling probably indirectly interrupts signaling transduction in the FG pathway between phosphorylation of Kss1(the MAPK in FG pathway) and activation of Tec1 
. Plus, the HOG and MP pathways are likely insulated from each other by specific scaffolds, although whether this is sufficient to prevent inappropriate cross-talk is not clear 
. Ultimately, it has been found that the HOG pathway could also induce Slt2 through the transcriptional factor Rlm1, which induced the cross talk between the HOG and the CWI pathways 
A similar analysis revealed cross-talk between the MP and FG pathways based on two relevant HeR modules((KP: Ste7, Ste11, Ste20; TF:Ste12, Tec1), (KP: Fus3; TF: Ste12, Tec1)). All kinase in these two HeR modules participate in the MP pathway. It is found that the deletion of STEs (Ste7, Ste11, Ste20) mainly induced down-regulation of MP pathway genes, and the deletion of Fus3 caused the up-regulation of FG pathway genes. In addition, TF Ste12 binds to almost all targets of the two HeR modules, but Tec1 mainly binds to targets of Fus3. Mcm1, a MP pathway-specific TF, also forms a HeR module with the STE kinase (KP: Ste7, Ste11, Ste20; TF: Mcm1, Dig1), and it mainly binds to targets of STEs. (). These results indicate that MP kinase Fus3 suppresses the cross talk between the MP and FG pathways by suppressing Tec1, and Ste12 is the common TF of both pathways.
Fus3 inhibits filamentous pathway mainly through inactivating Tec1.
The above analysis also coincides with experimental findings. STEs could regulate both MP and FG pathways. However, in non-inducing conditions, the MP pathway is activated, while the FG pathway is suppressed. The specificity is decided by Fus3. Fus3 activates the MP specific genes and inactivates the FG pathway by suppressing Tec1, which is the major TF in the FG pathway. Hence, deletion of STE kinase will only influence genes in the MP pathway but not those in the FG pathway. When Fus3 is knocked out, another kinase, Kss1, will become up-regulated and could partially take over the role of Fus3 to activate Ste12. However, it will not inactivate Tec1 as Fus3 does. As a result, the expression of the MP pathway genes is not sensitive to Fus3 deletion, while FG pathway genes are activated since Tec1 is activated. 
Novel function of Sok2 can be inferred from HeR modules
We have demonstrated that hetero-regulators in the same signaling pathway tend to have a high HeRS score, and HeR modules map well to known pathways. Conversely, a high HeRS score can indicate a co-pathway relationship of the corresponding hetero-regulators. Here, we take TF Sok2 as an example to illustrate how to predict gene functions based on HeR modules. Sok2 forms a HeR module with known HOG TFs and HOG kinase (, ), and it binds to many genes in the HOG pathway (). These data predicts Sok2 as a potential TF in the HOG pathway. Although no previous study has reported Sok2's function in HOG pathway, there is indirect evidence to support this claim.
First, in our analysis, Msn2, Sok2, and HOG TFs (Hot1 and Sko1) form a TF cluster (), and the HeRS scores between Sok2 and HOG kinase are greater than that of the Msn2 and HOG kinase. Since Msn2 is the substrate of Hog1 
, the above data indicates close relationship between Sok2 and HOG pathway. Plus, in meiosis and mitosis, Sok2 associates with Msn2/4, and they are co-regulated in the cAMP-dependent protein kinase signal transduction pathway 
. These observations suggest that Sok2 and Msn2/4 may also be co-regulated in HOG pathway. Second, a recent comprehensive phenotypic analysis has found that the deletion of Sok2 caused a decrease in the hyperosmotic stress resistance of cells 
. This provides direct evidence that Sok2 is involved in HOG pathway.
Both our results and experimental data suggest Sok2 has an extensive interaction with HOG pathway, and may be a novel transcription factor in this pathway.
A potential feedback loop in mating pathway is predicted
Fus3 and Kss1 are paralogs, and they are redundant MAPKs in the MP pathway. Previous studies reported that the redundancy of Fus3 and Kss1 is partial, since deletion of Fus3 resulted in a 10% reduction in mating efficiency compared to the wild type level, but the deletion of Kss1 has virtually no effect 
. The phenomenon of partial redundancy is lacking in explanation, while an in-depth examination of the HeR module reveals the missing link.
Two HeR modules are related to the MP pathway, HeR module FUS3 (KP: Fus3; TF: Tec1, Ste12) and HeR module STE (KP: Ste20, Ste11, Ste7; TF: Tec1, Ste12). As analyzed above, genes in MP pathway should be down-regulated upon deletion of the STE kinase, but not sensitive to the deletion of Fus3. When we compared the target sets of Fus3 and STE, all genes behave as expected except for MF(ALPHA)2, which is down-regulated upon deletion of both Fus3 and STE kinase (). Interestingly, MF(ALPHA) 2 is the upstream signal (alpha factor) of the mating pathway, which activates the STE kinase. A simple analysis based on these observations could illustrate a positive feedback loop including Fus3 but not Kss1 (). In addition, MF(ALPHA)2 is occupied by Mcm1 and Ste12 but not Tec1, which is further evidence that MF(ALPHA)2 is transcriptional regulated in the mating pathway. The deletion of Fus3 leads to a decreased expression level of MF(ALPHA)2, resulting in the positive feedback being cut off. In contrast, the deletion of Kss1 does not affect the activity of MF(ALPHA)2, allowing the positive feedback loop to be retained. Since another mating pheromone alpha factor MF(ALPHA)1 is more highly expressed and produces most alpha-factor, deprivation of this positive feedback explains the slight reduction (10%) of mating efficiency.