Protein-protein recognition is of fundamental importance in the vast majority of biological processes. However, it has already been demonstrated that it is very hard to distinguish true complexes from false complexes in so-called cross-docking experiments, where binary protein complexes are separated and the isolated proteins are all docked against each other and scored. Does this result, at least in part, reflect a physical reality? False complexes could reflect possible nonspecific or weak associations.
In this paper, we investigate the twilight zone of protein-protein interactions, building on an interesting outcome of cross-docking experiments: false complexes seem to favor residues from the true interaction site, suggesting that randomly chosen partners dock in a non-random fashion on protein surfaces. Here, we carry out arbitrary docking of a non-redundant data set of 198 proteins, with more than 300 randomly chosen "probe" proteins. We investigate the tendency of arbitrary partners to aggregate at localized regions of the protein surfaces, the shape and compositional bias of the generated interfaces, and the potential of this property to predict biologically relevant binding sites. We show that the non-random localization of arbitrary partners after protein-protein docking is a generic feature of protein structures. The interfaces generated in this way are not systematically planar or curved, but tend to be closer than average to the center of the proteins. These results can be used to predict biological interfaces with an AUC value up to 0.69 alone, and 0.72 when used in combination with evolutionary information. An appropriate choice of random partners and number of docking models make this method computationally practical. It is also noted that nonspecific interfaces can point to alternate interaction sites in the case of proteins with multiple interfaces. We illustrate the usefulness of arbitrary docking using PEBP (Phosphatidylethanolamine binding protein), a kinase inhibitor with multiple partners.
An approach using arbitrary docking, and based solely on physical properties, can successfully identify biologically pertinent protein interfaces.