HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Proteins. Author manuscript; available in PMC 2009 November 15.
Published in final edited form as:
PMCID: PMC2726780

Protein-Protein Docking Benchmark Version 3.0


We present version 3.0 of our publicly available protein-protein docking benchmark. This update includes 40 new test cases, representing a 48% increase from Benchmark 2.0. For all of the new cases, the crystal structures of both binding partners are available. As with Benchmark 2.0, SCOP1 (Structural Classification of Proteins) was used to remove redundant test cases. The 124 unbound-unbound test cases in Benchmark 3.0 are classified into 88 rigid-body cases, 19 medium difficulty cases, and 17 difficult cases, based on the degree of conformational change at the interface upon complex formation. In addition to providing the community with more test cases for evaluating docking methods, the expansion of Benchmark 3.0 will facilitate the development of new algorithms that require a large number of training examples. Benchmark 3.0 is available to the public at

Keywords: protein-protein docking, protein complexes, protein-protein interactions, complex structure


In 2003 and 2005 we published the first two versions of a protein-protein docking benchmark.2,3 The benchmark contains proteins for which high-resolution crystal structures are available in both the unbound and bound states. Our goal is to provide a wide variety of test cases so that the protein docking community can evaluate the progress of docking methods. The two previous editions2,3 have been widely used for training and testing protein docking algorithms,4-9 developing re-ranking algorithms,10 formulating energy functions,11 and performing protein structure analysis.12

Since 2005, the number of protein structures in the Protein Data Bank13 (PDB) has increased by more than 10,000, which allowed us to update the Benchmark to version 3.0. Although manual curation of the data during some steps of the benchmark construction was inevitable, we have constructed a semi-automated process to ensure that this update covers all available test cases in the PDB. The new test cases are exclusively unbound-unbound: crystal structures are available for the complex and for each of the two unbound proteins.

Semi-automated Dataset Retrieval and Curation

To collect unbound-unbound benchmark cases, we parsed all PDB entries as described previously.2,3 We first identified multi-protein x-ray structures with individual sequence length longer than 30 amino acids and resolution better than 3.25 Å; these two cutoffs were used in the two previous editions of the benchmark. The biological unit information provided by the PDB was used to differentiate biologically relevant interactions from crystal contacts. We filtered out obligate complexes manually, after consulting the literature.
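The automated part of this first filter can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Entry` record and its field names are hypothetical, and the sketch assumes that the length cutoff applies to every protein chain in the entry. The biological-unit check and the manual removal of obligate complexes are not modeled.

```python
from dataclasses import dataclass
from typing import List

MIN_CHAIN_LENGTH = 30   # each chain must be longer than this (residues)
MAX_RESOLUTION = 3.25   # resolution must be better (numerically lower) than this, in Å


@dataclass
class Entry:
    """Hypothetical summary of one PDB entry (not a real PDB API object)."""
    entry_id: str
    method: str               # experimental method, e.g. "X-RAY"
    resolution: float         # in Å
    chain_lengths: List[int]  # one sequence length per protein chain


def passes_initial_filter(entry: Entry) -> bool:
    """Return True for multi-protein x-ray entries that meet both cutoffs."""
    return (
        entry.method == "X-RAY"
        and len(entry.chain_lengths) >= 2
        and entry.resolution < MAX_RESOLUTION
        and all(length > MIN_CHAIN_LENGTH for length in entry.chain_lengths)
    )


# Made-up entries for illustration only.
entries = [
    Entry("ex01", "X-RAY", 2.1, [245, 58]),   # kept
    Entry("ex02", "X-RAY", 3.5, [310, 120]),  # rejected: resolution too low
    Entry("ex03", "X-RAY", 1.9, [280, 22]),   # rejected: one chain too short
    Entry("ex04", "NMR", 0.0, [150, 90]),     # rejected: not an x-ray structure
]
kept = [e.entry_id for e in entries if passes_initial_filter(e)]
```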

For the remaining protein complexes, we utilized SCOP1 to examine protein family-family pair redundancy within the new cases and against the existing cases from Benchmark 2.0. In addition to the latest version of SCOP (1.71), which was released in Oct. 2006, we used its pre-classification version, Pre-SCOP, for the structures deposited in the PDB since the SCOP 1.71 release. Non-redundancy was set at the family level of SCOP, i.e., no two test cases in Benchmark 3.0 are allowed to belong to the same family-family pair. Users interested in developing statistical potentials with our benchmark may also want to exclude test cases that belong to the same superfamily-superfamily pair. This would affect two pairs of test cases: 1EZU/1N8O and 1GRN/1WQ1 (labeled with “*” in Table 1). To avoid this level of redundancy, one test case from each of these pairs can be removed. We then eliminated the test cases for which the unbound structures had less than 96% sequence identity to the corresponding bound structures, as defined by BLAST.14 For the remaining test cases with multiple crystal structures of the unbound proteins, we chose the unbound structure with the highest sequence similarity, highest structure resolution and fewest missing residues. Finally, we discarded test cases that present unusual difficulties for docking algorithms, e.g., three or more residues in the binding site were missing in the unbound structure, or the bound and the unbound structures had different cofactors at the binding site. The cofactors included in the structures are listed in the table at the benchmark website.
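The family-level redundancy rule and the sequence identity cutoff can be illustrated with a short sketch. The function name, the tuple layout, and the family labels are hypothetical. For simplicity, the sketch keeps the first case seen for each family-family pair, whereas the selection described above involved further criteria and manual curation.

```python
from typing import Iterable, List, Set, Tuple

MIN_IDENTITY = 96.0  # percent identity between unbound and bound sequences

# (case id, SCOP family of partner 1, SCOP family of partner 2,
#  lowest unbound-vs-bound sequence identity in percent)
Candidate = Tuple[str, str, str, float]
FamilyPair = Tuple[str, str]


def family_pair(family_1: str, family_2: str) -> FamilyPair:
    """Order-independent key, so that (A, B) and (B, A) count as the same pair."""
    ordered = sorted((family_1, family_2))
    return ordered[0], ordered[1]


def select_cases(candidates: Iterable[Candidate],
                 existing_pairs: Iterable[FamilyPair] = ()) -> List[str]:
    """Keep cases above the identity cutoff whose family pair is not yet covered."""
    seen: Set[FamilyPair] = {family_pair(a, b) for a, b in existing_pairs}
    kept: List[str] = []
    for case_id, fam_1, fam_2, identity in candidates:
        if identity < MIN_IDENTITY:
            continue  # unbound structure differs too much from the bound one
        key = family_pair(fam_1, fam_2)
        if key in seen:
            continue  # family-family pair already represented
        seen.add(key)
        kept.append(case_id)
    return kept


# Made-up candidates; "famA" etc. stand in for SCOP family identifiers.
candidates = [
    ("case1", "famA", "famB", 100.0),
    ("case2", "famB", "famA", 99.0),   # same pair as case1, reversed order
    ("case3", "famC", "famD", 92.0),   # below the identity cutoff
    ("case4", "famE", "famF", 98.5),   # pair already present in the older benchmark
    ("case5", "famC", "famD", 97.0),
]
selected = select_cases(candidates, existing_pairs=[("famF", "famE")])
```

To apply the stricter superfamily-level rule mentioned above, the same function could be called with superfamily labels in place of family labels.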

Table 1
Protein-protein Docking Benchmark 3.0

Benchmark Test Cases and Classification

There are a total of 40 new test cases. They are listed in Table 1, along with the existing cases from Benchmark 2.0. Six of these test cases are identical or homologous to CAPRI targets, as indicated in the legend of Table 1. To assign difficulty levels to the test cases, we used the degree of conformational change, as measured by Interface Cα-RMSD (I-RMSD18) and fraction of non-native residue contacts (fnon-nat16), of the unbound structures fitted onto the bound structures. Specifically, the rigid-body cases are cases with I-RMSD ≤ 1.5 Å and fnon-nat ≤ 0.4, the difficult cases are cases with I-RMSD > 2.2 Å, and the medium cases are all remaining cases (i.e., with 1.5 Å < I-RMSD ≤ 2.2 Å, or I-RMSD ≤ 1.5 Å and fnon-nat > 0.4). We used Cα RMSD instead of backbone RMSD (the latter is used in the CAPRI evaluation) because we have been using Cα RMSD since the creation of the Benchmark, which predates CAPRI.
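The classification rule can be written as a small function. This is a sketch of the stated cutoffs only; the function and argument names are our own, and computing I-RMSD and fnon-nat from structures is outside its scope.

```python
def classify_difficulty(i_rmsd: float, f_non_nat: float) -> str:
    """Assign a difficulty class from I-RMSD (in Å) and fnon-nat (0 to 1)."""
    if i_rmsd > 2.2:
        return "difficult"
    if i_rmsd <= 1.5 and f_non_nat <= 0.4:
        return "rigid-body"
    # Everything else: 1.5 Å < I-RMSD <= 2.2 Å, or a small I-RMSD
    # combined with fnon-nat > 0.4.
    return "medium"


# Illustrative inputs (not values taken from Table 1).
examples = {
    (0.8, 0.25): classify_difficulty(0.8, 0.25),  # small change at the interface
    (1.2, 0.55): classify_difficulty(1.2, 0.55),  # low I-RMSD, many non-native contacts
    (1.9, 0.30): classify_difficulty(1.9, 0.30),  # intermediate I-RMSD
    (3.4, 0.60): classify_difficulty(3.4, 0.60),  # large conformational change
}
```

Checking the I-RMSD > 2.2 Å condition first means that fnon-nat only matters for cases with I-RMSD ≤ 1.5 Å, as in the rule above.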

We use this difficulty classification to quantify the extent of conformational change around the binding interface, which broadly affects most docking methods. For Benchmark 2.0 we assigned difficulty level based on the number of possible high-quality hit predictions (as measured by the CAPRI criteria16) attainable using rigid-body docking on a grid. To remove possible bias due to this method and to simplify the classification, we opted to utilize the I-RMSD and fnon-nat metrics for the new cases, selecting cutoffs to maintain consistency among the new cases and those from Benchmark 2.0. Besides conformational changes, other factors such as the size and hydrophobic/electrostatic composition of the interface, as well as the available experimental data on the complex, can also affect the difficulty of a test case.17,18

In total, Benchmark 3.0 has 88 rigid-body cases, 19 medium cases and 17 difficult cases. There are two difficult cases with large hinge movement (1E4K and 1IRA). Table 2 provides the average values of I-RMSD and fnon-nat for the three classes. We have also included statistics on the fraction of native residue contacts (fnat16) because it is used in the CAPRI evaluation, even though it does not provide information beyond the two metrics above.
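For readers unfamiliar with the two contact fractions, the sketch below computes them from sets of residue-residue contacts, following the CAPRI definitions:16 fnat is the fraction of the native contacts that are reproduced, and fnon-nat is the fraction of the contacts in the fitted model that are not native. The residue labels are made up, and the distance cutoff that defines a contact is deliberately left out because it is not stated here; the contact sets are taken as given.

```python
from typing import Set, Tuple

Contact = Tuple[str, str]  # (receptor residue, ligand residue), e.g. ("A:10", "B:5")


def contact_fractions(native: Set[Contact], model: Set[Contact]) -> Tuple[float, float]:
    """Return (fnat, fnon-nat) for a model compared with the native complex."""
    if not native or not model:
        raise ValueError("both contact sets must be non-empty")
    shared = native & model
    f_nat = len(shared) / len(native)             # native contacts recovered
    f_non_nat = len(model - native) / len(model)  # model contacts that are not native
    return f_nat, f_non_nat


# Made-up contact sets for illustration.
native_contacts = {("A:10", "B:5"), ("A:11", "B:5"), ("A:12", "B:7"), ("A:15", "B:9")}
model_contacts = {("A:10", "B:5"), ("A:11", "B:5"), ("A:12", "B:8")}
f_nat, f_non_nat = contact_fractions(native_contacts, model_contacts)
```

In the benchmark setting, the "model" is the pair of unbound structures fitted onto the bound complex, so both fractions measure how much the interface changes upon binding.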

Table 2
Statistics on the Three Difficulty Groups in Benchmark 3.0

In addition to the difficulty assessment, we have classified the new test cases into three biochemical categories: Enzyme-Inhibitor (E; 9 cases), Antigen-Antibody (A; 3 cases) and Others (O; 28 cases), as with previous Benchmark versions.2,3 This information is provided in Table 1. We corrected the category assignments for two Benchmark 2.0 cases (1IJK and 1FQ1) from O to E.

Comparison with DOCKGROUND

DOCKGROUND is a relational database of x-ray and simulated protein-protein complexes. Its second release19 contains 99 test cases for which the x-ray structures of the complex and the individual proteins are available. Among these, 39 cases are included in our Benchmark 3.0, based on the PDB IDs of the complexes. For an additional 20 cases, the unbound proteins fall within the same SCOP family pairs as test cases in our Benchmark 3.0. The remaining 40 cases were rejected by our annotation pipeline because of redundancy or complications at the interface (one or more of the following: three or more missing contact residues at the binding site; cofactors at the binding site of the complex structure but not of the unbound structure, or vice versa; different numbers of protein chains at the interface between the bound and unbound states; or dimerization of the receptor, the ligand, or both in the complex with no corresponding unbound structures). Note that antigen-antibody cases were kept although they have multiple chains at the interface. One difference between our curated benchmark and automatically generated databases such as DOCKGROUND is that we provide the residue-aligned and superposed structures of the unbound proteins, which greatly facilitates evaluation of the RMSDs of docked structures. Because the bound and unbound molecules are often not identical, this step requires non-trivial manual effort. The sequence alignments are accessible through the “Sequence Alignment” column of each test case, and the cleaned-up PDB files of the superposed structures can be downloaded as a single gzipped file. We suggest using randomly rotated configurations of the superposed structures as the starting structures for docking, so that the results are not biased by a near-native conformation being sampled by default.
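One way to randomize a starting orientation is sketched below. It is an illustration under our own assumptions rather than a prescribed procedure: it draws a uniformly distributed rotation from a random unit quaternion (Shoemake's method) and rotates a list of atomic coordinates about their centroid. Reading and writing PDB files is omitted, and whether to rotate one partner or both is left to the user.

```python
import math
import random
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def random_rotation_matrix(rng: random.Random) -> List[List[float]]:
    """Uniform random 3x3 rotation matrix from a random unit quaternion."""
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    a, b = math.sqrt(1.0 - u1), math.sqrt(u1)
    x = a * math.sin(2.0 * math.pi * u2)
    y = a * math.cos(2.0 * math.pi * u2)
    z = b * math.sin(2.0 * math.pi * u3)
    w = b * math.cos(2.0 * math.pi * u3)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def rotate_about_centroid(coords: Sequence[Vec3],
                          rot: List[List[float]]) -> List[Vec3]:
    """Rotate coordinates about their centroid, leaving the centroid in place."""
    n = len(coords)
    center = tuple(sum(p[i] for p in coords) / n for i in range(3))
    rotated: List[Vec3] = []
    for p in coords:
        d = [p[i] - center[i] for i in range(3)]
        q = tuple(sum(rot[i][j] * d[j] for j in range(3)) + center[i]
                  for i in range(3))
        rotated.append((q[0], q[1], q[2]))
    return rotated


# Toy coordinates standing in for the atoms of one unbound protein.
rng = random.Random(2007)  # fixed seed so the example is reproducible
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
start_pose = rotate_about_centroid(atoms, random_rotation_matrix(rng))
```

Because the rotation is rigid, all interatomic distances are unchanged; only the orientation relative to the binding partner is scrambled.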


Benchmark 3.0 includes all possible test cases from the structures deposited in the PDB up to May 2007, and represents a significant increase in cases over the previous versions. With 124 non-redundant test cases, this benchmark should enable the development and testing of algorithms that require a large training set, in addition to those developed for a particular biochemical category or difficulty level.


We are grateful to the Scientific Computing Facilities at Boston University and the Advanced Biomedical Computing Center at NCI, NIH for computing support. This work was funded by NSF grants DBI-0078194, DBI-0133834 and DBI-0116574.


1. Murzin AG, Brenner SE, Hubbard T, Chothia C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol. 1995;247(4):536–540.
2. Chen R, Mintseris J, Janin J, Weng Z. A protein-protein docking benchmark. Proteins. 2003;52(1):88–91.
3. Mintseris J, Wiehe K, Pierce B, Anderson R, Chen R, Janin J, Weng Z. Protein-Protein Docking Benchmark 2.0: an update. Proteins. 2005;60(2):214–216.
4. Bordner AJ, Gorin AA. Protein docking using surface matching and supervised machine learning. Proteins. 2007;68(2):488–502.
5. Andrusier N, Nussinov R, Wolfson HJ. FireDock: fast interaction refinement in molecular docking. Proteins. 2007;69(1):139–159.
6. Li CH, Ma XH, Shen LZ, Chang S, Chen WZ, Wang CX. Complex-type-dependent scoring functions in protein-protein docking. Biophys Chem. 2007;129(1):1–10.
7. Liang S, Liu S, Zhang C, Zhou Y. A simple reference state makes a significant improvement in near-native selections from structurally refined docking decoys. Proteins. 2007;69(2):244–253.
8. Lorenzen S, Zhang Y. Identification of near-native structures by clustering protein docking conformations. Proteins. 2007;68(1):187–194.
9. Tovchigrechko A, Vakser IA. GRAMM-X public web server for protein-protein docking. Nucleic Acids Res. 2006;34(Web Server issue):W310–314.
10. Pierce B, Weng Z. ZRANK: reranking protein docking predictions with an optimized energy function. Proteins. 2007;67(4):1078–1086.
11. Audie J, Scarlata S. A novel empirical free energy function that explains and predicts protein-protein binding affinities. Biophys Chem. 2007;129(2–3):198–211.
12. Headd JJ, Ban YE, Brown P, Edelsbrunner H, Vaidya M, Rudolph J. Protein-protein interfaces: properties, preferences, and projections. J Proteome Res. 2007;6(7):2576–2586.
13. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The Protein Data Bank. Nucleic Acids Res. 2000;28(1):235–242.
14. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–3402.
15. Hubbard SJ, Thornton JM. NACCESS. 2.1.1. Department of Biochemistry and Molecular Biology, University College London; 1993.
16. Mendez R, Leplae R, De Maria L, Wodak SJ. Assessment of blind predictions of protein-protein interactions: current status of docking methods. Proteins. 2003;52(1):51–67.
17. Chen R, Weng Z. Docking unbound proteins using shape complementarity, desolvation, and electrostatics. Proteins. 2002;47(3):281–294.
18. Vajda S. Classification of protein complexes based on docking difficulty. Proteins. 2005;60(2):176–180.
19. Gao Y, Douguet D, Tovchigrechko A, Vakser IA. DOCKGROUND system of databases for protein recognition studies: Unbound structures for docking. Proteins. 2007;69(4):845–851.