Many proteins and protein regions have been shown to be intrinsically disordered under native conditions; that is, they contain little or no well-defined structure
[1]–
[6]. Intrinsically disordered proteins (IDPs) have been found in a wide range of organisms, and their disorder content has been shown to increase with organism complexity
[7]–
[11]. Comparative analysis of the functional roles of disordered proteins suggests that they are located predominantly in the cell nucleus, are involved in transcription regulation and cell signaling, and can also be associated with cell cycle control, endocytosis, replication, and cytoskeleton biogenesis
[10],
[12].
IDPs have certain properties and functions that distinguish them from proteins with well-defined structures. 1) IDPs have no unique three-dimensional structure in an isolated state but can fold upon binding to their interaction partners
[1],
[4],
[13]–
[18]. 2) Conformational changes upon binding in proteins with unstructured regions are much larger than those in structured proteins
[1]. 3) The conformations of disordered regions in a protein complex are determined not only by the amino acid sequences but also by the interacting partners
[1],
[19]. 4) IDPs can have many different functions and can bind to many different partners using the same or different interfaces
[20]. 5) IDPs can accommodate larger interfaces on smaller scaffolds compared to proteins with well-defined structure
[14],
[21],
[22]. 6) IDPs typically have an amino acid composition of low aromatic content and high net charge as well as low sequence complexity and high flexibility
[2],
[10],
[23]. 7) Intrinsic disorder allows rapid degradation of unfolded proteins, thereby enabling a rapid response to changes in protein concentration (regulation through degradation)
[24]. 8) Finally, intrinsic disorder offers an elegant mechanism of regulation through post-translational modifications for many cellular processes
[20],
[25].
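The compositional signals listed in point 6 can be illustrated with a minimal sketch that computes three simple per-sequence quantities. The function name `composition_features`, the choice of F, W, and Y as aromatic residues, the treatment of K and R as positive and D and E as negative (histidine ignored), and the use of Shannon entropy as a proxy for sequence complexity are all assumptions made for illustration; they are not the definitions used by any of the cited predictors.

```python
import math
from collections import Counter

# Illustrative residue classes (assumptions, not taken from the cited methods).
AROMATIC = set("FWY")
POSITIVE = set("KR")
NEGATIVE = set("DE")


def composition_features(sequence: str) -> dict:
    """Return simple composition features of a one-letter amino acid sequence."""
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")

    counts = Counter(seq)
    aromatic = sum(counts[aa] for aa in AROMATIC)
    positive = sum(counts[aa] for aa in POSITIVE)
    negative = sum(counts[aa] for aa in NEGATIVE)

    # Shannon entropy of the residue distribution, in bits; lower values
    # indicate a less diverse (lower-complexity) composition.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())

    return {
        "aromatic_fraction": aromatic / n,
        "net_charge_per_residue": abs(positive - negative) / n,
        "entropy_bits": entropy,
    }
```

Under these assumptions, a disordered region would be expected to show a low `aromatic_fraction`, a high `net_charge_per_residue`, and a low `entropy_bits` relative to a typical folded domain; the thresholds separating the two classes are not specified here.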
Predictions of disorder in proteins take into account the characteristic features of unstructured proteins and have been shown to be rather successful, especially in the case of large regions. According to the results of CASP7 (7th Community-Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction), the best prediction groups successfully identified 50–70% of the disordered residues with false positive rates from 3% to 16%
[26]. Prediction methods aim to identify disordered regions through the analysis of amino acid sequences using mainly the physico-chemical properties of the amino acids
[23],
[27]–
[36] or evolutionary conservation
[12],
[37]–
[39].
As protein interactions are crucial for protein function (
[40] and references therein), the biological role of disordered proteins should also be studied in this context. Indeed, folding of disordered proteins into ordered structures may occur upon binding to their specific partners
[1],
[4],
[13]–
[17], which may allow disordered regions to structurally accommodate multiple interaction partners with high specificity and low affinity
[1],
[41]–
[43]. Moreover, it has been shown that the binding mechanism (that is, whether binding occurs between folded or unfolded chains) depends on the structural characteristics, interface properties, and degree of minimal frustration of the monomers
[21],
[44]. Binding through unfolded or partially unfolded intermediates can provide a kinetic advantage through the “fly-casting” mechanism
[19]. According to this mechanism, a dimensionality reduction occurs when the folding of a disordered protein is coupled with binding, thereby speeding up the search for specific targets.
A database of continuous protein fragments (Molecular Recognition Features or MORFs) has been compiled from the Protein Data Bank to include short protein chains (with fewer than 70 residues) bound to larger proteins
[45],
[46]. It has been argued that MORFs participate in the coupling of binding and folding, a hypothesis supported by analysis of the composition and predicted disorder of MORF segments. In addition, studies of subtle structural differences in the same proteins under different conditions and in different functional states have identified many so-called “dual personality” protein segments that can exist in both ordered and disordered states
[47]. There is a continuous range between completely structured and completely disordered proteins in which intermediate cases are rather common
[24]: proteins that are disordered but compact, multi-domain proteins with disordered linkers, and ordered proteins with some local disorder.
Examples of proteins with intrinsically disordered regions that exhibit coupling between folding and binding have been described previously
[1],
[4],
[13]–
[18]. Nevertheless, the universality of this phenomenon and the functional importance of many disordered regions remain unclear. The question can be extended further: how much intrinsic disorder do protein complexes contain, and what is its functional importance? To answer these questions, we examine observed and predicted disorder in protein complexes and unbound proteins using a large-scale dataset of protein structures. The atomic details of the structures and the conserved binding mode analysis introduced earlier
[48] allow us to monitor changes happening on or near interaction interfaces and to infer their functional importance.