To describe the inference method more formally, let $M_{m\times n}$ be the connectivity matrix connecting $m$ kinases and $n$ phosphosites, such that $M_{ij} = 1$ if kinase $i$ is known to phosphorylate phosphosite $j$ and $M_{ij} = 0$ otherwise. Let $X_n$ be the vector that describes the behavior of all phosphosites in a particular phosphoproteomics experiment, where $X_j \in \{0, 1, -1\}$: $X_j = 0$ if the phosphorylation level of phosphosite $j$ did not change during the experiment or was not determined, $X_j = 1$ if phosphosite $j$ was increasingly phosphorylated, and $X_j = -1$ if the phosphorylation level of phosphosite $j$ decreased.

Given the connectivity matrix $M$ and the vector $X$, and since a specific kinase usually has multiple substrates, the most common behavior of the substrates of each kinase in a specific experiment can be calculated as

$$T_m = \mathrm{sign}(M_{m\times n} X_n).$$

Because we are only interested in whether most phosphosite-substrates of a specific kinase increased or decreased overall, we take the sign of the matrix-vector product. Here $T$ is the resulting vector of size $m$, with $T_i \in \{1, -1, 0\}$: $T_i = 1$ means that most phosphosite-substrates of kinase $i$ were increased, $T_i = -1$ means that most phosphosite-substrates of kinase $i$ were decreased, and $T_i = 0$ means that the experiment vector $X$ carries no relevant information on the phosphosite-substrates of kinase $i$ (none of them changed, or increases and decreases cancel out).

Once $T$ has been computed, the next step is to infer regulation based on the behavior of the sites on those kinases. To do this we define an "association matrix" $P_{n\times m}$, such that $P_{ji} = 1$ if phosphosite $j$ is on kinase $i$ and $P_{ji} = 0$ otherwise; $P$ associates kinases with the phosphosites on them. Then,
$$Q_n = [P_{n\times m} \cdot X_n]\, T_m,$$

where $[P_{n\times m} \cdot X_n]$, the $n \times m$ matrix with entries $P_{ji} X_j$, describes the behavior of each phosphosite $j$ on kinase $i$ in the experiment, and $Q$ is the 'inference regulation vector' per phosphosite, so that $Q_j = X_j T_i$ for the kinase $i$ carrying phosphosite $j$. $Q_j = 1$ means the effect of phosphosite $j$ is positive, $Q_j = -1$ means the effect of phosphosite $j$ is negative, and $Q_j = 0$ means the effect of phosphosite $j$ is unknown. Finally, taking the connectivity matrix into account, we can infer the sign of the direct links in the network, each of which takes the sign of the corresponding phosphosite:
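$$R_{ij} = M_{ij}\, Q_j.$$

$R_{m\times n}$ is the 'inference regulation matrix' for $m$ kinases and $n$ phosphosites on kinases, where $R_{ij} = 1$ means kinase $i$ activates the kinase carrying phosphosite $j$ through that phosphosite, $R_{ij} = -1$ means kinase $i$ inhibits the kinase carrying phosphosite $j$ through that phosphosite, and $R_{ij} = 0$ means that the regulation is unknown. The final and complete formula is:

$$R_{ij} = M_{ij}\, \big([P_{n\times m} \cdot X_n]\, \mathrm{sign}(M_{m\times n} X_n)\big)_j.$$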
The same method can be applied to infer signs for phosphatases, but with the inference rules reversed.
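The kinase steps described above can be traced on a toy example in plain Python. This is an illustrative sketch, not the authors' implementation: the function name `infer_regulation`, the kinase labels A, B, C and the toy matrices are hypothetical. The per-site vector is computed as Q_j = X_j T_i for the kinase i carrying site j, which is an assumed reading of the definitions above.

```python
def sign(v):
    """Return 1, -1 or 0 according to the sign of v."""
    return (v > 0) - (v < 0)


def infer_regulation(M, P, X):
    """Sketch of the sign inference for m kinases and n phosphosites.

    M: m x n connectivity matrix, M[i][j] = 1 if kinase i phosphorylates site j.
    P: n x m association matrix, P[j][i] = 1 if site j lies on kinase i.
    X: length-n vector of site changes in one experiment (1, -1 or 0).
    """
    m, n = len(M), len(X)
    # T[i]: majority direction of the known substrates of kinase i.
    T = [sign(sum(M[i][j] * X[j] for j in range(n))) for i in range(m)]
    # Q[j]: assumed reading, site change times the T of the kinase carrying it.
    Q = [sum(P[j][i] * X[j] * T[i] for i in range(m)) for j in range(n)]
    # R[i][j]: each known link kinase i -> site j takes the sign of site j.
    R = [[M[i][j] * Q[j] for j in range(n)] for i in range(m)]
    return T, Q, R


# Hypothetical toy network: kinases A, B, C (rows of M) and five sites.
# Sites 0 and 1 lie on kinase B, site 2 lies on kinase C,
# sites 3 and 4 lie on proteins that are not kinases.
M = [
    [1, 1, 0, 0, 0],  # A phosphorylates sites 0 and 1
    [0, 0, 1, 1, 0],  # B phosphorylates sites 2 and 3
    [0, 0, 0, 0, 1],  # C phosphorylates site 4
]
P = [
    [0, 1, 0],  # site 0 is on B
    [0, 1, 0],  # site 1 is on B
    [0, 0, 1],  # site 2 is on C
    [0, 0, 0],  # site 3 is not on a kinase
    [0, 0, 0],  # site 4 is not on a kinase
]
X = [1, -1, 1, 1, -1]  # made-up measurements for one experiment

T, Q, R = infer_regulation(M, P, X)
print(T)  # [0, 1, -1]
print(Q)  # [1, -1, -1, 0, 0]
print(R)  # [[1, -1, 0, 0, 0], [0, 0, -1, 0, 0], [0, 0, 0, 0, 0]]
```

In this toy case the substrates of B mostly increase, so T is 1 for B. Site 0 on B also increases and is inferred as activating, while site 1 on B decreases and is inferred as inhibitory. Kinase A, which phosphorylates both sites, therefore gets one positive and one negative link to B.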