We model interactome networks as large electrical circuits of interconnecting junctions (proteins) and resistors (interactions). Our model identifies candidate proteins that make significant contributions to the transfer of biological information between various modules. Compared to degree and betweenness, our model has two major advantages: first, it incorporates the confidence scores of protein-protein interactions; second, it considers all possible paths of information transfer. When a protein that mediates information exchange between modules is knocked down, the disintegration of multiple modules is very likely to result in lethality. Even if the organism is still viable, pleiotropy may be observed because multiple phenotypes imply the breakdown of multiple modules. In support of our model, we find that the information flow score of a protein is well correlated with the likelihood of observing lethality or pleiotropy when the protein is eliminated. Even among proteins of low or medium betweenness, the information flow model is predictive of a protein's essentiality or pleiotropy. Compared to betweenness, the information flow model is not only more effective but also more robust in the face of a large amount of low-confidence data. Our method is publicly accessible: the MATLAB implementation of the information flow algorithm, along with the information flow scores for proteins in the yeast and worm interactome networks, can be downloaded at http://jura.wi.mit.edu/ge/information_flow_plos/
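To make the circuit analogy concrete, the following is a minimal Python sketch, not the MATLAB implementation referred to above. It rests on several assumptions that this section does not state: each interaction is treated as a resistor whose conductance equals its confidence score, a unit current is injected at one protein and withdrawn at another, and a protein's score is the current passing through it as an intermediate junction, summed over all protein pairs. The function names (`solve_linear`, `information_flow`) and the toy network are hypothetical, and the exact scoring used in the paper may differ.

```python
from itertools import combinations


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def information_flow(edges):
    """Sketch of a current-based score. `edges` maps (u, v) -> confidence.

    Assumptions (not taken from the paper): the network is connected,
    conductance equals confidence, and source/sink proteins are not
    credited for the pair in which they are the endpoints.
    """
    nodes = sorted({p for edge in edges for p in edge})
    idx = {p: i for i, p in enumerate(nodes)}
    n = len(nodes)
    # Weighted graph Laplacian of the circuit.
    L = [[0.0] * n for _ in range(n)]
    for (u, v), w in edges.items():
        i, j = idx[u], idx[v]
        L[i][i] += w
        L[j][j] += w
        L[i][j] -= w
        L[j][i] -= w
    score = dict.fromkeys(nodes, 0.0)
    for s, t in combinations(nodes, 2):
        # Ground the sink t (voltage 0) and solve for the other voltages.
        keep = [i for i in range(n) if i != idx[t]]
        A = [[L[i][j] for j in keep] for i in keep]
        b = [1.0 if i == idx[s] else 0.0 for i in keep]
        volt = [0.0] * n
        for k, v in zip(keep, solve_linear(A, b)):
            volt[k] = v
        # Sum of |edge currents| at a junction is twice the current through it.
        through = [0.0] * n
        for (u, v), w in edges.items():
            current = abs(w * (volt[idx[u]] - volt[idx[v]]))
            through[idx[u]] += current
            through[idx[v]] += current
        for p in nodes:
            if p not in (s, t):
                score[p] += through[idx[p]] / 2.0
    return score


# Toy network with hypothetical confidence scores.
path_scores = information_flow({("A", "B"): 0.9, ("B", "C"): 0.5})
triangle_scores = information_flow(
    {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 1.0}
)
```

Because every pair requires solving a linear system, this sketch is only practical for small networks; it is meant to show how confidence scores and all alternative paths enter the score, not how to compute it at interactome scale.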
The information flow model identifies central proteins in interactome networks, and these proteins are likely to connect different functional modules. We developed an algorithm that decomposes interactome networks into subnetworks by recursively removing proteins of high information flow (Materials and Methods). Starting from the largest network component, we removed the protein with the highest information flow score. If the remaining proteins stayed connected in a single network, we removed the proteins with the next highest information flow scores one at a time until the network fell into multiple pieces. We then counted the number of proteins in each of the subnetworks. If a subnetwork contained between 15 and 50 proteins, we examined whether any Gene Ontology (GO) term was enriched among proteins in the subnetwork. If a subnetwork contained over 50 proteins, we repeated the procedure of removing high information flow proteins from the subnetwork. Overall, we obtained 37 subnetworks, and all but two of them were enriched with proteins from certain GO categories (Table S7). We investigated the effects of varying the minimum and maximum size of subnetworks (Text S2). The selected range of 15 to 50 proteins was based on the number of recovered subnetworks as well as the overall GO enrichment scores. If we increased the minimum subnetwork size to 20 proteins, the number of subnetworks shrank to 24, all of which were functionally enriched. However, in order to recover the additional 11 GO-enriched subnetworks, for a total of 35, we decided to keep the lower threshold at 15 proteins. The fact that the majority of subnetworks are functionally enriched provides additional evidence that proteins with high information flow scores interconnect different modules.
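The decomposition procedure described above can be sketched in Python as follows. This is an illustrative reading of the text rather than the authors' code: it assumes that information flow scores are computed once and reused during recursion (the section does not say whether they are recomputed), that subnetworks smaller than the minimum size are simply discarded, and that scores are distinct so the removal order is unambiguous. The names `build_adjacency`, `components`, and `decompose`, and the toy network with its scores, are hypothetical.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) interactions."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def components(adj, nodes):
    """Connected components of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for v in adj[u]:
                if v in nodes and v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps


def decompose(adj, score, start_nodes, min_size=15, max_size=50):
    """Recursively split a network by removing high-scoring proteins."""
    found = []
    pending = [set(start_nodes)]  # networks that still need splitting
    while pending:
        pieces = [pending.pop()]
        # Remove the highest-scoring protein, one at a time,
        # until the network falls into multiple pieces.
        while len(pieces) == 1 and len(pieces[0]) > 1:
            top = max(pieces[0], key=lambda p: score[p])
            pieces = components(adj, pieces[0] - {top})
        for piece in pieces:
            if len(piece) > max_size:
                pending.append(piece)   # too large: split again
            elif len(piece) >= min_size:
                found.append(piece)     # candidate module for GO analysis
    return found


# Toy example: two 4-protein rings joined by a high-scoring protein "h".
toy_edges = [("a0", "a1"), ("a1", "a2"), ("a2", "a3"), ("a3", "a0"),
             ("b0", "b1"), ("b1", "b2"), ("b2", "b3"), ("b3", "b0"),
             ("h", "a0"), ("h", "b0")]
toy_adj = build_adjacency(toy_edges)
toy_score = {"h": 10.0}
toy_score.update({f"a{i}": 1.0 + 0.1 * i for i in range(4)})
toy_score.update({f"b{i}": 2.0 + 0.1 * i for i in range(4)})
largest = max(components(toy_adj, toy_adj), key=len)
subnetworks = decompose(toy_adj, toy_score, largest, min_size=3, max_size=5)
```

In the toy example the size bounds are scaled down to 3 and 5 so that removing the single bridging protein yields two subnetworks within range; with the bounds used in the text, the defaults of 15 and 50 would apply.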
An interactome network can be partitioned into subnetworks by recursively removing proteins with high information flow scores.
It was previously observed in a yeast interactome network that ‘date hubs’, which connect different modules, are more likely to participate in genetic interactions than randomly sampled proteins, because elimination of date hubs may make the organism more sensitive to any further genetic perturbations. We tested whether proteins of high information flow and proteins of high betweenness show the same property in the C. elegans interactome. We found that genes that rank in the top 30% in terms of information flow or betweenness are more likely to participate in genetic interactions than randomly selected genes (P
, respectively). This is not particularly surprising because many proteins of high information flow or high betweenness are hubs in the network.
Another possible feature of “between-module” proteins is related to the expression dynamics of these proteins and their interacting partners. In general, interacting proteins are likely to share similar expression profiles. Date hubs in yeast interactome networks have been found to be less correlated with their binding partners in terms of expression dynamics than ‘party hubs’, which function within a functional module. Proteins of high betweenness in yeast interactome networks have also been reported to show a lack of expression correlation with their binding partners. On the other hand, another study has argued that the lack of correlation depends on the datasets examined. We investigated the correlation of expression profiles for proteins of high information flow or proteins of high betweenness with their interacting partners in the C. elegans interactome. We did not find that proteins of high information flow or proteins of high betweenness behave differently from other proteins in terms of expression correlation with their interacting partners (data not shown). Thus, the expression correlation between topologically central proteins and their binding partners may merit further investigation.
The transmission of biological signals is directional, whereas current interactome networks often reflect the formation of protein complexes and do not contain directionality. We explored whether the information flow model is also applicable to signaling networks with directionality. We generated a signaling network for S. cerevisiae by integrating phosphorylation events and Y2H interactions (see Materials and Methods). In this network, we examined the top 30% versus the bottom 30% of genes ranked by the information flow score. We found a significant increase in the percentage of pleiotropic genes in the former group (17.0%) as compared to the latter (5.3%) (Table S8; P < 0.01), though the percentages of essential genes are similar for the two groups. This analysis suggests that the information flow model is useful for discovering crucial proteins in signaling networks, as well as in networks of protein complexes. The lack of correlation with lethality may reflect the fact that fewer proteins in signaling networks participate in “housekeeping” functions, which are often mediated by multi-protein molecular machines.
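The top-versus-bottom comparison used here can be illustrated with a short Python sketch. It only computes the two percentages; the statistical test behind the reported significance is not described in this section, so none is included. The function name `top_bottom_fractions`, the gene identifiers, and the scores are hypothetical.

```python
def top_bottom_fractions(score, trait_genes, fraction=0.3):
    """Fraction of genes carrying a trait in the top and bottom score groups.

    `score` maps gene -> information flow score; `trait_genes` is the set
    of genes annotated with the trait (for example, pleiotropic genes).
    """
    ranked = sorted(score, key=score.get, reverse=True)
    k = int(len(ranked) * fraction)
    if k == 0:
        raise ValueError("too few genes for the requested fraction")
    top, bottom = ranked[:k], ranked[-k:]

    def trait_fraction(group):
        return sum(1 for g in group if g in trait_genes) / len(group)

    return trait_fraction(top), trait_fraction(bottom)


# Hypothetical data: ten genes with scores 10 down to 1.
toy_scores = {f"g{i}": float(11 - i) for i in range(1, 11)}
toy_trait = {"g1", "g2", "g10"}
top_frac, bottom_frac = top_bottom_fractions(toy_scores, toy_trait)
```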
In the future, as more information is integrated into interactome networks, we should be able to improve the performance of the information flow model. In addition, interactome networks can vary over time and across spatial locations. We still have a very limited understanding of how biological information flows through cellular networks; most likely, it does not flow exactly as electrical current does. As more knowledge accumulates, we should be able to modify the information flow model according to the design principles of cellular networks and to capture their dynamic nature.