From our 3D analysis of disease-associated mutations and their corresponding genes within the atomic-level structurally resolved human protein interactome, we find that specific alteration of protein interactions by in-frame mutations plays an important role in the pathogenesis associated with many disease genes. More importantly, our results show that the locations of mutations with respect to the interaction interfaces are crucial to understanding complex genotype-to-phenotype relationships, including pleiotropy and locus heterogeneity. All observations are robust to the random removal of interactions and proteins, as well as to the removal of interaction, disease and domain hubs, all of which are potential sources of bias in our datasets (Supplementary Note 15 and Supplementary Figs. 7-22). Furthermore, all observations remain the same when the calculations are repeated using only known domain-domain interactions from existing co-crystal structures (Supplementary Note 16 and Supplementary Fig. 22). Our findings are directly applicable to understanding the molecular mechanisms of human genetic diseases and to discovering new disease-associated genes and mutations, both experimentally and computationally. This is of significant interest to both the pharmaceutical and medical industries, and is especially important for treating diseases whose target genes are currently undruggable. To this end, we provide a list of novel disease-to-gene associations and generate many new hypotheses. Moreover, with the development of exome sequencing, many mutations are being discovered in every study44, and it is difficult to determine the functional relevance of all of them experimentally. Our analysis could potentially provide a novel approach to prioritizing mutations discovered in large-scale sequencing projects, especially for protein pairs without known co-crystal structures.
The construction of our structurally resolved protein interactome relies largely on the availability of 3D co-crystal structures, which limits the coverage of our network. However, with the rapid growth of the PDB45, more co-crystal information will become available, and the principles that we developed here can be readily applied to uncover potential molecular mechanisms of many more disease genes whose structural information is currently missing. Another limiting factor is that some interaction interfaces fall outside of the known domain structures, including in disordered regions46. Incorporating this type of information will further improve the coverage of hSIN. Moreover, other parts of the protein, especially regions immediately outside of the interacting domains we predicted, might also contribute to the interaction directly or contribute to the correct folding of the corresponding domains. For example, a previous study indicated that the SAM2 domain alone might not be sufficient for the TP63 interaction and suggested that residues upstream and downstream of the SAM2 domain and the P53_tetramer domain could also be involved in the interaction47. Accordingly, based on the known co-crystal structure of TP53, we predicted in hSIN that the P53_tetramer domain of TP63 could also be part of the interface for this interaction.
Although we have shown that the interaction pairs in hSIN have significantly higher co-expression correlation and functional similarity in general, further studies can be carried out by considering gene expression under disease-specific conditions and/or within the corresponding tissues for specific disorders. Moreover, studying changes in the protein-protein interaction network during disease progression can also assist in the identification of disease biomarkers and modules49. In addition to genetic mutations, many other factors, including environmental stress, epigenetic modifications and invasion by pathogens, might also contribute to human clinical disorders50. Integrating these factors into follow-up studies of the hypotheses generated by our analysis will likely expand our understanding of many human genetic disorders in the near future.