A systematic classification of protein–protein interfaces is a valuable resource for understanding the principles of molecular recognition and for modelling protein complexes. Here, we present a classification of domain interfaces according to their geometry. Our new algorithm takes a hybrid approach that combines sequence and structural features. Its accuracy is evaluated on a hand-curated dataset of 416 interfaces. The hybrid procedure achieves 83% precision and 95% recall, improving on the earlier sequence-based method by 5% on both measures. We classify virtually all domain interfaces of known structure, which results in nearly 6,000 distinct types of interfaces. In 40% of the cases, the interacting domain families associate in multiple orientations, suggesting that all possible binding orientations need to be explored when modelling multidomain proteins and protein complexes. In general, hub proteins are shown to use distinct surface regions (multiple faces) for interactions with different partners. Our classification provides a convenient framework to query for genuine gene fusion events, in which the binding orientation is conserved in both the fused and separate forms. The result suggests that binding orientations are not conserved in at least one-third of the gene fusion cases detected by a conventional sequence similarity search. We show that any evolutionary analysis of interfaces can be skewed by multiple binding orientations and multiple interaction partners. The taxonomic distribution of interface types suggests that ancient interfaces common to the three major kingdoms of life are enriched in symmetric homodimers. The classification results are available online at http://www.scoppi.org.
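To make the precision and recall figures concrete, the following minimal sketch shows one common way to score a clustering of interfaces against a hand-curated reference, by counting pairs of interfaces that are placed in the same type. The text above does not state how the two measures were computed, so the pairwise scheme, the function name `pairwise_precision_recall`, and the toy labels are all assumptions made for illustration; they are not the evaluation protocol or data of this study.

```python
from itertools import combinations


def pairwise_precision_recall(predicted, reference):
    """Score a predicted clustering against a reference clustering.

    Both arguments map an interface identifier to a cluster label.
    A pair of interfaces counts as a true positive when both
    clusterings place the two interfaces in the same cluster.
    (Hypothetical helper; pairwise scoring is an assumption.)
    """
    shared_ids = sorted(set(predicted) & set(reference))
    tp = fp = fn = 0
    for a, b in combinations(shared_ids, 2):
        same_pred = predicted[a] == predicted[b]
        same_ref = reference[a] == reference[b]
        if same_pred and same_ref:
            tp += 1  # grouped together in both clusterings
        elif same_pred:
            fp += 1  # grouped together only in the prediction
        elif same_ref:
            fn += 1  # grouped together only in the reference
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels, invented for illustration only.
reference = {"i1": "A", "i2": "A", "i3": "A", "i4": "B"}
predicted = {"i1": "x", "i2": "x", "i3": "y", "i4": "y"}

toy_precision, toy_recall = pairwise_precision_recall(predicted, reference)
print(f"precision={toy_precision:.2f} recall={toy_recall:.2f}")
```

Under this scheme, high recall with lower precision would indicate that interfaces of the same curated type are rarely split apart, while some distinct types are merged.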
The behaviour of biological systems is governed by protein interactions. Considerable effort has already been dedicated to characterising individual proteins and their evolution. As a next step, researchers need to understand the characteristics, dynamics, and evolution of complex networks of proteins. While many experimental techniques determine protein–protein interactions at high throughput, only a few provide structural insights into the actual interfaces. The authors provide a comprehensive compendium and classification of these structural interfaces. To this end, they design a fast and accurate algorithm, which they apply to all known structural interactions. As a result, they shed light on the geometry and the evolution of protein interfaces. Their analysis reveals that 40% of protein interactions between homologues occur in multiple orientations. This has particular implications for gene fusion events detected by conventional sequence homology: for at least one-third of these genes, the fused and nonfused proteins associate in alternative binding orientations. The classification also shows that any evolutionary analysis, such as interface conservation, can be skewed by multiple binding orientations and interaction partners. Hub proteins, which are highly connected to many other proteins in interaction networks, are shown to use distinct surfaces, or faces, for different partners. Interestingly, some proteins develop many different faces for the same partner (e.g., long-chain cytokines and fibronectin), and others use the same face for evolutionarily unrelated partners (e.g., the PUA domain family). Finally, the authors show that ancient interfaces, which appear in all three kingdoms of life, are dominated by symmetric homodimers, reflecting the direction of evolution from symmetric to asymmetric or heteromeric.