Given the variety and success of available flexible-ligand/rigid-receptor docking algorithms, the easiest way to include multiple receptor conformations in a docking experiment is simply to run multiple independent simulations. However, integrating MRC sampling into the docking algorithm may offer advantages in calculation speed and may simplify data management. Such ‘ensemble docking’ extensions of original rigid-receptor algorithms have been reported, for example for AUTODOCK [6] and ICM [11]. FlexE, an extension of the popular FlexX algorithm, not only utilizes the MRC individually but also attempts to extend the search space beyond the input set of conformations by detecting dissimilar parts and joining them combinatorially [12]. New, potentially accessible receptor conformations are thus generated during the search. However, consideration of too many conformations can lead to reduced performance. In a recent critical evaluation of FlexE on two targets of pharmaceutical interest, β-secretase and JNK-3, the algorithm was unable to handle large loop movements and could not match the enrichment factors obtained from multiple independent FlexX runs on each receptor structure [13].
Multiple receptor conformation (MRC) docking flowchart. The filtering step of the flowchart may be skipped if the number of initial conformations is small. The refinement step is skipped in some implementations of the MRC docking.
FLIPDock is another algorithm using the AutoDock force field; it introduces a highly sophisticated data structure for the MRC representation, termed the Flexibility Tree (FT) [14]. The FT data structure describes the receptor as a nested system of molecular fragments that can undergo a range of movement types, such as hinge, shear, twist, normal modes and side-chain rotameric states. Representing protein flexibility as an intuitive hierarchical classification of movements is a great strength of the approach. The authors demonstrate that the FT can be used successfully to include side-chain movements in docking simulations. It is less clear to what extent the FT can generate realistic atomic-level receptor structures when large-scale movements are involved, as only balanol/protein kinase A docking results were presented.
The original DOCK algorithm was also extended to ensembles of receptor structures, and a comparative study of MRC docking versus a ‘soft’ docking potential was reported by Ferrari and coworkers [15]. Two cavities in a mutant T4 lysozyme and the aldose reductase (AldR) active site were used as targets for docking and VLS experiments. The lysozyme cavities were considered an ideal case for the ‘soft’ docking approach, and indeed the authors found improved compound ranking compared with ‘hard’ docking to a single apo conformation. Docking to four manually chosen conformations resulted in a modest further improvement of ranking for both cavities: 72% and 68% of the native ligands were recovered in the top 1% of the database, versus 57% and 64% for ‘soft’ docking and only 51% and 49% for ‘hard’ docking. Results of VLS experiments with multiple AldR conformations also showed a 40% improvement over ‘hard’ docking to a single conformation. However, different forms of softened potential, e.g. a truncated 6-12 Lennard-Jones type potential instead of the 6-9 form used by Ferrari and coworkers, are likely to yield different results. Remarkably, the authors were also able to use the results of the MRC VLS for AldR to select a few compounds for experimental testing and discovered two novel low-micromolar inhibitors.
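The recovery figures quoted above (the fraction of known ligands found in the top 1% of the ranked database) can be illustrated with a minimal sketch. It assumes that lower scores are better and that MRC results are merged by keeping each compound's best score over all conformations; the function names and toy scores are hypothetical and are not taken from the cited study.

```python
import math

def merge_best_score(scores_by_conformation):
    """Keep, for each compound, its best (lowest) score over all conformations."""
    merged = {}
    for scores in scores_by_conformation:
        for compound, score in scores.items():
            if compound not in merged or score < merged[compound]:
                merged[compound] = score
    return merged

def recovery_in_top_fraction(scores, known_ligands, fraction=0.01):
    """Fraction of known ligands ranked within the top `fraction` of the database."""
    ranked = sorted(scores, key=scores.get)  # assumption: lower score = better
    cutoff = max(1, math.ceil(fraction * len(ranked)))
    top = set(ranked[:cutoff])
    return len(top & set(known_ligands)) / len(known_ligands)

# Toy database (hypothetical numbers): 198 decoys plus 2 known ligands,
# each ligand scoring well against only one of two receptor conformations.
decoys = {f"decoy{i}": -5.0 - 0.01 * i for i in range(198)}
conf_a = {**decoys, "ligA": -9.0, "ligB": -4.0}
conf_b = {**decoys, "ligA": -3.0, "ligB": -8.5}
ligands = ["ligA", "ligB"]

single_a = recovery_in_top_fraction(conf_a, ligands)
single_b = recovery_in_top_fraction(conf_b, ligands)
merged = recovery_in_top_fraction(merge_best_score([conf_a, conf_b]), ligands)
print(single_a, single_b, merged)
```

In this toy case each single conformation recovers one of the two ligands, whereas the merged ranking recovers both. The sketch does not model the false positives that additional conformations can introduce in practice.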
The ensemble docking approach of Huang and Zou [16] also builds upon DOCK. A ‘reference’ consensus receptor structure is derived from the input ensemble and used in a rigid ligand placement step. The additional Simplex minimization step treats the receptor conformation as a discrete variable. It seems questionable whether the Simplex algorithm is well suited to the optimization of integer parameters. The authors report good validation results, although an unorthodox success criterion (a solution within 2.5 Å of the native pose among the top 5 poses) makes objective comparison with other methods difficult.
The recently reported FITTED algorithm allows two receptor flexibility modes [17]. The first mode, termed ‘semi-flexible’, is essentially MRC ensemble docking. The second, ‘fully-flexible’ mode allows a genetic algorithm (GA) to generate different combinations of the side-chain rotamers and backbone conformations found in the input ensemble. In addition, the algorithm can simulate displaceable interface water molecules by combining a special functional form for water interactions with sampling of the presence or absence of waters in the GA.
Ensemble methods may offer a significant performance advantage over sequential docking to multiple conformations by conventional rigid-receptor algorithms. For example, a 6-fold speedup was reported for ensemble docking to 12 conformations of a lysozyme cavity [18], and the speedup reached 18-fold when docking against 48 conformations. Ultimately, the efficiency of ensemble methods should depend on the diversity of the receptor conformations: if the ensemble involves only minor structural variations, its exploration may contribute only additively to the overall computational cost; however, if highly dissimilar binding site conformations are included, each of them will have to be explored virtually independently, potentially multiplying the search time by the number of conformations.
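The reported speedups can be restated as the cost of one ensemble run in units of a single rigid-receptor run. The sketch below assumes that sequential docking to n conformations costs exactly n single runs; the function names are hypothetical and the per-conformation overhead is simple arithmetic on the quoted figures, not a value reported in the cited work.

```python
def ensemble_cost_in_single_runs(n_conformations, speedup):
    """Cost of one ensemble run, in units of one single-receptor run.

    Assumes sequential docking costs n_conformations single runs.
    """
    return n_conformations / speedup

def overhead_per_extra_conformation(n_conformations, speedup):
    """Average extra cost added by each conformation beyond the first."""
    cost = ensemble_cost_in_single_runs(n_conformations, speedup)
    return (cost - 1.0) / (n_conformations - 1)

for n, s in [(12, 6), (48, 18)]:
    cost = ensemble_cost_in_single_runs(n, s)
    extra = overhead_per_extra_conformation(n, s)
    print(f"{n} conformations: {cost:.2f} single runs, "
          f"{extra:.3f} per extra conformation")
```

Under this assumption both ensemble runs cost fewer than three single-receptor runs, consistent with the near-additive regime described above. A fully multiplicative regime would correspond to a cost of n single runs, i.e. a speedup of 1.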
Post-docking optimization may help to further improve both the docking pose and its score. Nabuurs, Wagener and de Vlieg demonstrated robust performance of FlexX-Ensemble docking combined with post-docking explicit receptor-ligand optimization on a benchmark of 35 ligand-receptor complexes [19].
Advantages and pitfalls of the MRC approach in docking and VLS are well illustrated by an in-depth benchmarking study by Barril and Morley [20]. Their test set consisted of 49 structures of cyclin-dependent kinase 2 (CDK2) with 34 ligands and 149 structures of heat shock protein 90 (HSP90) with 57 ligands. These receptors are among the most thoroughly investigated experimentally. On average, only 33% (CDK2) and 25% (HSP90) of ligands would dock within 2 Å RMSD to a given single receptor structure, while 97% of them would dock correctly to at least one receptor structure. Unfortunately, the best-performing single receptor structure for each ligand is not known in advance. The best-performing single receptor structure could be used to correctly dock up to 68% and 49% of ligands (CDK2 and HSP90, respectively). Best-performing combinations of two or more receptor structures were investigated next. The success rate gradually improved to 94% and 77% for the best subsets of 6 CDK2 structures and 8 HSP90 structures, respectively. A pitfall of using a large number of receptor structures was also observed: the success rate actually declined when more than 39 (CDK2) or 81 (HSP90) structures were used. For realistic random subsets, the dependence of performance on the MRC set is less dramatic: the average performance improved monotonically with the number of conformations, reaching 76% and 51% for the full sets. The bulk of the improvement still occurred within the first 10 (CDK2) and 25 (HSP90) structures, suggesting that a relatively small subset of structures can embody enough receptor conformational states to represent induced fit adequately. Interestingly, at least for HSP90, the performance could be significantly improved by adding an ad hoc solvation-based receptor conformational penalty to the scoring function. This observation emphasizes the need for eventual development of methods for accurate scoring of receptor conformations, which is currently often disregarded.
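The relationship between single-structure success, "at least one structure" success and subset selection can be sketched as follows. The sketch assumes a precomputed table of which receptor structures yield a correct pose for each ligand, and it uses a greedy heuristic to pick subsets; the greedy search, the names and the toy data are illustrative assumptions and not the selection procedure of the cited study. Because it counts a ligand as docked whenever any chosen structure gives a correct pose, it yields an upper bound and cannot reproduce the decline in success rate seen with very large ensembles, where scoring errors select wrong poses.

```python
def success_rate(success, structures):
    """Fraction of ligands docked correctly to at least one of `structures`.

    `success` maps each ligand to the set of receptor structures that
    give a correct pose for it (hypothetical precomputed data).
    """
    chosen = set(structures)
    hits = sum(1 for ok in success.values() if ok & chosen)
    return hits / len(success)

def greedy_subset(success, all_structures, size):
    """Greedily add the structure that most improves the success rate."""
    chosen = []
    for _ in range(size):
        remaining = [s for s in all_structures if s not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda s: success_rate(success, chosen + [s]))
        chosen.append(best)
    return chosen

# Toy data: five ligands, three receptor structures (all hypothetical).
success = {
    "lig1": {"R1"},
    "lig2": {"R1", "R2"},
    "lig3": {"R2"},
    "lig4": {"R3"},
    "lig5": set(),  # never docked correctly
}
structures = ["R1", "R2", "R3"]

singles = {s: success_rate(success, [s]) for s in structures}
average_single = sum(singles.values()) / len(singles)
best_pair = greedy_subset(success, structures, 2)
print(singles, round(average_single, 3))
print(best_pair, success_rate(success, best_pair))
print(success_rate(success, structures))
```

In the toy data, the average single-structure rate is well below the rate for the best pair, which in turn is below the rate for the full set, mirroring the ordering of the figures quoted above.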