Protein-protein interactions are key to the functioning of all cells and to many biological processes. To understand the mechanism of a protein-protein interaction, the structure of the protein complex is essential. While many high-resolution (X-ray) structures of protein complexes are available in the Protein Data Bank (PDB),1 the structures of a vast number of protein complexes have not yet been determined. Meanwhile, structural genomics projects are underway,2
producing new structures of proteins, many of them monomeric. Given the crystal structures (or modeled structures) of the component monomers, protein-protein docking (referred to as protein docking for brevity) can be used to predict the structure of the protein complex when no experimental structure of the complex is available. Recent developments in protein docking allow for atomic-scale protein complex predictions,3
yet further work is needed to refine these methods so that they can be applied quickly and reliably to protein complexes of unknown structure.
Many protein docking algorithms are divided into several steps: the initial global search and subsequent steps to improve these initial predictions.4
The global search is an exhaustive search over the relative orientations of the two proteins, typically keeping the larger protein (referred to as the receptor) fixed while moving the smaller protein (the ligand). This is often a rigid-body search in six dimensions (three translational and three rotational), using a Fast Fourier Transform (FFT) for efficiency and a soft scoring function that tolerates small overlaps,5-7
but other methods such as Monte Carlo with side chain searching have also been successful.8,9
The subsequent steps can include clustering,10,11
and structural refinement13
of the initial set of predictions. Structural refinement is useful because it can improve the contacts and accuracy of initial predictions that are close to the correct conformation but still leave room for improvement.
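As a minimal illustration of why the FFT makes the translational part of the rigid-body search efficient, the sketch below scores every circular translation of a ligand grid against a receptor grid with a single correlation. The grids, the overlap-counting score, and the function names (`fft_translation_scores`, `brute_force_score`, `best_translation`) are assumptions made for illustration; they are not the scoring functions of any published docking program, which also sample rotations and use more elaborate grid encodings.

```python
import numpy as np

def fft_translation_scores(receptor_grid, ligand_grid):
    """Score all circular translations t of the ligand grid at once.

    scores[t] = sum_x receptor_grid[x] * ligand_grid[x - t]
    """
    receptor_ft = np.fft.fftn(receptor_grid)
    ligand_ft = np.fft.fftn(ligand_grid)
    # Correlation theorem: one inverse FFT yields the score of every shift.
    return np.fft.ifftn(receptor_ft * np.conj(ligand_ft)).real

def brute_force_score(receptor_grid, ligand_grid, shift):
    """Direct evaluation of a single translation, for comparison."""
    shifted = np.roll(ligand_grid, shift, axis=(0, 1, 2))
    return float(np.sum(receptor_grid * shifted))

def best_translation(scores):
    """Grid indices of the highest-scoring translation."""
    flat_index = np.argmax(scores)
    return tuple(int(i) for i in np.unravel_index(flat_index, scores.shape))

# Toy data (assumed): a favorable 3x3x3 region on the receptor grid and a
# 3x3x3 ligand placed at the grid origin.
n = 16
receptor = np.zeros((n, n, n))
receptor[8:11, 5:8, 2:5] = 1.0
ligand = np.zeros((n, n, n))
ligand[0:3, 0:3, 0:3] = 1.0

scores = fft_translation_scores(receptor, ligand)
print(best_translation(scores))  # (8, 5, 2)
```

In a full six-dimensional search, a correlation of this kind would be repeated for each sampled rotation of the ligand, so the FFT replaces only the inner translational loop.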
We previously implemented several algorithms for initial-stage docking and refinement: ZDOCK, RDOCK, and ZRANK. ZDOCK performs a grid-based docking search using the Fast Fourier Transform (FFT), and its scoring includes desolvation, electrostatics, and a novel shape complementarity function.14
It has performed consistently among the top algorithms during the CAPRI docking experiment;15
with ZDOCK, at least one prediction was rated Acceptable or higher for 5 of 6 recent targets16
(the highest number among all participants). ZDOCK was also found to compare favorably to other FFT-based docking algorithms in a recent study on clustering initial-stage docking predictions.17
While ZDOCK produces many near-native predictions (hits), they are often not ranked in the top 10. To improve the rank of the hits, RDOCK performs docking refinement by reranking the top 2000 ZDOCK predictions using energy minimization followed by scoring using electrostatics and desolvation.18
Although RDOCK has been shown to improve the success rate of ZDOCK predictions, it lacks the ability to quickly process all 54,000 predictions from a ZDOCK run.
To address this limitation, we developed the ZRANK program, which uses a weighted energy function with van der Waals, electrostatics, and desolvation terms to quickly and effectively rerank the ZDOCK predictions without energy minimization.19
It was tested on protein docking Benchmark 2.0,20
using predictions from two versions of ZDOCK: ZDOCK 2.1 (which employs shape complementarity alone) and ZDOCK 2.3 (which employs shape complementarity, desolvation, and electrostatics). In both cases there was significant improvement in docking performance when using ZRANK to rescore the rigid-body predictions; the number of cases with top-ranked hits increased from 2 to 11 for ZDOCK 2.1 and from 6 to 12 for ZDOCK 2.3.
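The idea of reranking with a weighted energy function can be sketched as follows. This is a simplified illustration only: the three-term linear form follows the description above, but the weights, the energy values, and the names (`Prediction`, `HYPOTHETICAL_WEIGHTS`, `weighted_score`, `rerank`) are assumptions and do not reproduce the actual ZRANK terms or parameters.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    name: str
    vdw: float     # van der Waals energy (made-up values below)
    elec: float    # electrostatics energy
    desolv: float  # desolvation energy

# Hypothetical weights chosen for illustration, not the published ones.
HYPOTHETICAL_WEIGHTS = {"vdw": 1.0, "elec": 0.5, "desolv": 0.8}

def weighted_score(pred, weights=HYPOTHETICAL_WEIGHTS):
    """Linear combination of the energy terms; lower is better."""
    return (weights["vdw"] * pred.vdw
            + weights["elec"] * pred.elec
            + weights["desolv"] * pred.desolv)

def rerank(predictions, weights=HYPOTHETICAL_WEIGHTS):
    """Return the predictions sorted from best (lowest) to worst score."""
    return sorted(predictions, key=lambda p: weighted_score(p, weights))

predictions = [
    Prediction("A", vdw=-10.0, elec=-2.0, desolv=1.0),
    Prediction("B", vdw=-4.0, elec=-8.0, desolv=-3.0),
    Prediction("C", vdw=-12.0, elec=3.0, desolv=4.0),
]
ranked = rerank(predictions)
print([p.name for p in ranked])  # ['B', 'A', 'C']
```

Because no energy minimization is involved, each prediction costs only one score evaluation, which is what makes rescoring a full set of rigid-body predictions practical.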
It was noted that ZRANK could be followed with structural refinement to further improve the docking success rate.19
To examine this possibility, we have combined the initial-stage docking of ZDOCK and scoring of ZRANK with the structural refinement of RosettaDock.8
The local refinement of RosettaDock includes side chain repacking and a Monte Carlo search of the local rigid-body space of the ligand. While RosettaDock can be highly successful in obtaining atomically accurate models through its refinement, it is sometimes unsuccessful in locating near-native structures in its initial (Monte Carlo based) global search due to the large size of the search space, particularly for larger proteins.21
On the other hand, ZDOCK is less limited by the size of the protein structures, as it uses the FFT to scan the entire translational space of the proteins quickly.
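A Monte Carlo search of the local rigid-body space can be illustrated with a generic Metropolis sketch. The code below is not the RosettaDock protocol: the quadratic `toy_energy`, the six-component pose vector, and the parameters `perturbation`, `n_steps`, and `kT` are assumptions chosen only to show how the perturbation size controls how far each trial move departs from the current pose.

```python
import math
import random

def toy_energy(pose):
    # Hypothetical stand-in for a docking score: squared distance of the
    # 6-D pose (3 translations, 3 rotation angles) from an assumed optimum at 0.
    return sum(x * x for x in pose)

def monte_carlo_refine(start_pose, energy_fn, perturbation=0.1,
                       n_steps=500, kT=0.05, seed=0):
    """Metropolis search around start_pose; returns the best pose and energy."""
    rng = random.Random(seed)
    current = list(start_pose)
    current_e = energy_fn(current)
    best, best_e = list(current), current_e
    for _ in range(n_steps):
        # Perturb every rigid-body degree of freedom by Gaussian noise.
        trial = [x + rng.gauss(0.0, perturbation) for x in current]
        trial_e = energy_fn(trial)
        delta = trial_e - current_e
        # Metropolis criterion: always accept downhill moves, and accept
        # uphill moves with probability exp(-delta / kT).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e

start = [0.5, -0.4, 0.3, 0.2, -0.1, 0.6]
best_pose, best_energy = monte_carlo_refine(start, toy_energy)
print(toy_energy(start), best_energy)
```

A larger `perturbation` lets the search reach poses farther from the starting model at the cost of a lower acceptance rate for trial moves near a good solution.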
In this study, we tested the effectiveness of refining the initial-stage docking structures from ZDOCK and ZRANK using RosettaDock, and of selecting refined models using either the RosettaDock score or the ZRANK score. We also explored using a larger perturbation size in the RosettaDock refinement search, to determine whether this allows successful refinement of models that are more distant from the native structure. Finally, we optimized the ZRANK scoring function specifically to evaluate refined structures, which led to a significant improvement in accuracy.