There exist > 78,000 proteins and/or nucleic acids structures that were determined experimentally. Only a small portion of these structures corresponds to those of protein complexes. While homology modeling is able to exploit knowledge-based potentials of side-chain rotomers and backbone motifs to infer structures for new proteins, no such general method exists to extend our understanding of protein interaction motifs to novel protein complexes.
We use a Motif Binding Geometries (MBG) approach, to infer the structure of a protein complex from the database of complexes of homologous proteins taken from other contexts (such as the helix-turn-helix motif binding double stranded DNA), and demonstrate its utility on one of the more important regulatory complexes in biology, that of the RNA polymerase initiating transcription under conditions of phosphate starvation. The modeled PhoB/RNAP/σ-factor/DNA complex is stereo-chemically reasonable, has sufficient interfacial Solvent Excluded Surface Areas (SESAs) to provide adequate binding strength, is physically meaningful for transcription regulation, and is consistent with a variety of known experimental constraints.
Based on a straightforward and easy to comprehend concept, "proteins and protein domains that fold similarly could interact similarly", a structural model of the PhoB dimer in the transcription initiation complex has been developed. This approach could be extended to enable structural modeling and prediction of other bio-molecular complexes. Just as models of individual proteins provide insight into molecular recognition, catalytic mechanism, and substrate specificity, models of protein complexes will provide understanding into the combinatorial rules of cellular regulation and signaling.