Incorporating receptor flexibility is considered crucial for improvement of docking-based virtual screening. With an abundance of crystallographic structures freely available, docking with multiple crystal structures is believed to be a practical approach to cope with protein flexibility. Here we describe a successful application of the docking of multiple structures to discover novel and potent Chk1 inhibitors. Forty-six Chk1 structures were first compared in single structure docking by predicting the binding mode and recovering known ligands. Combinations of different protein structures were then compared by recovery of known ligands and an optimal ensemble of Chk1 structures was selected. The chosen structures were used in the virtual screening of over 60,000 diverse compounds for Chk1 inhibitors. Six novel compounds ranked at the top of the hit list were tested experimentally and two of these compounds inhibited Chk1 activity, the best with an IC50 value of 9.6 μM. Further study indicated that achieving a better enrichment and identifying more diverse compounds was more likely using multiple structures than using only a single structure, even when protein structures were randomly selected. Taking into account the conformational energy difference did not help to improve enrichment in the top ranked list.
Molecular docking and receptor-based virtual screening have become an indispensable component of structure-based drug design for hit identification and lead optimization.1-3 Although ligand flexibility can be handled by a variety of algorithms in current docking implementations, receptor flexibility remains a major outstanding challenge in the practice of docking-based virtual screening because of the high dimensionality of the conformational space and the complexity of the energy function.4 Numerous experimental and theoretical studies have long acknowledged that protein flexibility is often coupled to ligand binding. Two ligand-binding mechanisms have been widely discussed:5 conformational selection, which assumes that the ligand binds to a pre-existing receptor conformation in an equilibrated ensemble, and induced fit, which presumes that the receptor is driven into the bound conformation upon ligand binding. In either case, structural fluctuations of receptors need to be taken into account in docking studies.
Various modeling methods have been developed to incorporate receptor flexibility in molecular docking and virtual screening.6 Soft docking, which docks ligands to a rigid receptor with a soft scoring function tolerating some steric clashes, has been reported to be worse for identifying known ligands than the hard scoring function when multiple receptor conformations were used.7 Several docking programs limit protein flexibility to side chains by exploration of rotamer libraries and make the problem less computationally demanding, but cannot deal with backbone movement or other major structural rearrangements. Another implementation is docking of ligands to multiple receptor conformations, which may either be obtained experimentally by X-ray crystallography8-12 and NMR spectroscopy13 or computationally by molecular dynamics,14 normal mode analysis15-17 and other techniques.18
Checkpoint kinase 1 (Chk1), a serine/threonine kinase involved in the S-phase and G2 checkpoints, is a key regulator in the DNA damage-induced signaling pathway.19 In response to DNA damage, normal cells are arrested at various cell cycle checkpoints (G1/S/G2) to allow DNA to be repaired; however, cancer cells with p53 deficiency, which is common in tumors, are arrested only at the S or G2 checkpoint because p53 is required at the G1 checkpoint.20 The inhibition of Chk1 is believed to abrogate the remaining checkpoints in cancer cells, consequently leading to cell death as the cytotoxic effects of chemotherapeutics accumulate. Chk1 as a target for chemosensitization and chemoprevention has been reviewed previously.21-22 Therefore, Chk1 inhibitors are capable of increasing the therapeutic efficacy of anticancer drugs as sensitizing agents. A number of Chk1 inhibitors have been studied during the past decade, and have been reviewed previously.23-25 Several compounds have been in advanced preclinical or early clinical development.20, 22 However, a clear need exists for potent and selective Chk1 inhibitors derived from distinct chemotypes, given the unfavorable properties and toxicities of known compounds.
Chk1 is a 54 kDa protein of 476 amino acids comprising a highly conserved N-terminal kinase domain followed by a linker region and a less conserved C-terminal domain. The first crystal structure of the kinase domain of human Chk1 was reported at 1.7 Å resolution in 2000.26 To date, 43 complex structures of Chk1 with ATP-competitive inhibitors have been solved, as well as 3 complex structures of Chk1 with allosteric inhibitors. The abundance of available Chk1 structures offers the opportunity not only to investigate ligand docking but also to explore the impact of receptor flexibility.
Here we present a successful application of docking-based virtual screening with multiple crystal structures in the discovery of novel Chk1 inhibitors. Although this method was evaluated previously, few successful applications have been reported. Forty-six crystallographic structures of Chk1 were employed in this work, targeting both the ATP-binding site and the allosteric site. Each structure was first evaluated by its docking accuracy in prediction of binding mode and by its enrichment factor in recovery of known ligands from decoy compounds. Then the optimal ensemble of structures was determined and utilized in virtual screening of more than 60,000 diverse compounds for Chk1 inhibitors. Finally, six compounds were purchased and tested experimentally, and two of them showed inhibitory effects. In addition, the conformational energy difference was calculated with targeted molecular dynamics, and we discuss whether considering conformational energy could improve the enrichment factor in virtual screening. We also discuss the advantage of multiple structures over a single structure and the chemical space of known ligands explored by each protein structure.
Forty-six crystal structures of Chk1 with different ligands were retrieved from the Protein Data Bank: PDB entries 1NVQ, 1NVR, 1NVS, 1ZLT, 1ZYS, 2AYP, 2BR1, 2BRB, 2BRG, 2BRH, 2BRM, 2BRN, 2BRO, 2C3J, 2C3K, 2C3L, 2CGU, 2CGV, 2CGW, 2CGX, 2E9N, 2E9O, 2E9P, 2E9U, 2E9V, 2GDO, 2GHG, 2HOG, 2HXL, 2HXQ, 2HY0, 2QHM, 2QHN, 2R0U, 2WMQ, 2WMR, 2WMS, 2WMT, 2WMU, 2WMV, 2WMW, 2WMX, 2YWP, 3F9N, 3JVR and 3JVS. Each structure was prepared using the protein preparation workflow in Maestro, and missing atoms and loops were added with Prime in Schrödinger Suite 2009. All water molecules, metal ions and small chemical groups were removed. AMBER03 partial charges were assigned to each receptor atom with Chimera.27
Forty-six ligands were extracted from the Chk1 crystal complexes listed above and were prepared with LigPrep. More than 60,000 structurally diverse compounds were downloaded from the all-purchasable subset of the publicly accessible ZINC database, filtered at a Tanimoto index cutoff of 70%. These compounds were selected because they are commercially available and structurally diverse. Two thousand compounds were randomly selected from this dataset to be used in validation of virtual screening. All small molecules, including crystal ligands and compounds from the ZINC database, were submitted to ConfGen28 to generate ensembles of low-energy conformations in advance. Of the 2,000 random compounds, ConfGen output ensembles of conformations for 1,996. The testing set used in validation of virtual screening therefore contained a total of 2,042 small molecules: 1,996 random compounds and 46 crystal ligands. All operations were performed within Schrödinger Suite 2009.
DOCK 6.329 was utilized to dock the pre-generated conformational ensemble into Chk1 structures. The DMS program was used to generate a molecular surface for each receptor. The SPHGEN algorithm was utilized to create a negative image of the binding pocket, which is complementary to the molecular surface. All the spheres within 10 Å of the ligand were selected for docking. The receptor box delimiting the binding pocket was calculated using SHOWBOX with an additional box size so that it was greater than 20 Å in length, which is large enough to contain all crystal ligands and most of the compounds. Potential grids were calculated for each receptor using a 0.3 Å spacing that fully enclosed the spheres, and were generated by the GRID program in advance for increasing docking speed. Each conformation was treated as rigid in the docking simulation, and the number of ligand poses sampled was set to 500. All docking experiments were carried out using the IBM Blue Gene/L Supercomputer.
For ensembles with 2, 3, 40, 41, 42, and 43 structures, all possible combinations were enumerated and the enrichment factor (EF) for each combination was calculated. For ensembles of other sizes, the structures were randomly selected and duplicate combinations were discarded until 40,000 distinct combinations for each ensemble size were generated. The combination with the highest EF was reported as the best solution. All of the analyses were performed with a utility program and scripts developed in our laboratory on a Linux workstation.
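The enumeration and sampling scheme described above can be sketched as follows. This is a minimal illustration, not the authors' utility program: the function names `enumerate_ensembles` and `sample_ensembles` and the fixed random seed are assumptions introduced here.

```python
import itertools
import math
import random


def enumerate_ensembles(structures, size):
    """Return an iterator over every combination of `size` structures."""
    return itertools.combinations(structures, size)


def sample_ensembles(structures, size, limit=40_000, seed=0):
    """Draw random combinations, discarding duplicates, until `limit`
    distinct ones are collected (or every combination has been found)."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    target = min(limit, math.comb(len(structures), size))
    seen = set()
    while len(seen) < target:
        combo = tuple(sorted(rng.sample(structures, size)))
        seen.add(combo)  # a repeated combination leaves the set unchanged
    return seen
```

For 43 structures, `math.comb(43, 2)` gives 903 and `math.comb(43, 3)` gives 12,341, which is why the smallest and largest ensemble sizes can be enumerated exhaustively while intermediate sizes are better sampled.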
The formula used for the EF calculation is:
$$\mathrm{EF} = \frac{a/n}{A/N}$$
where N is the total size of the compound library; n is the number of compounds selected after screening; A is the total number of known inhibitors; and a is the number of known inhibitors in the selection.
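In terms of the symbols just defined, the enrichment factor is the hit rate within the selection (a/n) divided by the hit rate in the whole library (A/N). A minimal sketch follows; the function name `enrichment_factor` and the example inputs are illustrative assumptions, in particular the choice of 20 compounds as the top 1% of a 2,042-compound library and the hypothetical count of 9 recovered inhibitors.

```python
def enrichment_factor(a, n, A, N):
    """EF = (a / n) / (A / N).

    a: known inhibitors found in the selection
    n: number of compounds selected after screening
    A: total number of known inhibitors in the library
    N: total size of the compound library
    """
    if n <= 0 or A <= 0 or N <= 0:
        raise ValueError("n, A and N must be positive")
    return (a / n) / (A / N)


# Illustrative inputs only (assumed, not taken from the authors' scripts).
ef = enrichment_factor(a=9, n=20, A=43, N=2042)
print(round(ef, 1))  # 21.4
```

With the same assumed n, A and N, recovering only known inhibitors in the selection (a = n) gives the upper bound N/A on the EF.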
The targeted molecular dynamics (TMD) simulation was performed with Amber 11. The Chk1 crystal structure from PDB entry 1IA8 (ref. 26) was used as the starting structure. Of the 46 complex structures, 1ZYS was excluded from the TMD simulation because of a residue mutation, and the other 45 structures were used as target structures. AMBER03 parameters were applied to describe protein atoms. All structures were first subjected to energy minimization, using the steepest descent algorithm for 100 steps and then the conjugate gradient algorithm for 1,000 steps. A time step of 1 fs was used to integrate the equations of motion. The system was started with initial velocities assigned at 100 K and was then held at a constant temperature of 300 K. A restraint force of 15.0 was applied to backbone atoms and the RMSD of backbone atoms was calculated after superposition of all protein atoms. All TMD simulations were performed in the Generalized Born (GB) implicit solvent model for 100 ps.
All compounds tested in vitro were purchased from InterBioScreen (Moscow, Russia). The active CHK1 human recombinant protein and CHKtide (substrate peptide) for the kinase assay were purchased from Millipore (Temecula, CA).
The kinase assay was performed in accordance with instructions provided by Millipore (Temecula, CA). Briefly, the reaction was carried out in the presence of 10 μCi of [γ-32P]ATP with each compound in 40 μl of reaction buffer containing 20 mM HEPES (pH 7.4), 10 mM MgCl2, 10 mM MnCl2, and 1 mM dithiothreitol. After incubation at room temperature for 30 min, the radioactivity was determined by scintillation counter. Each experiment was repeated three times.
Most of the known Chk1 inhibitors are competitive with ATP. Out of 46 complex structures published, 43 complexes30-43 have a ligand occupying the ATP-binding site, and the other 3 complexes44-45 have a ligand bound to the P-5 substrate binding site, which is about 13 Å44 from the ATP-binding site and is referred to as an allosteric site (Figure 1). Because the ATP-binding site of kinases is highly conserved in sequence and structure, targeting the ATP site could result in off-target side effects. On the other hand, this might offer an advantage of synergistic effects.46 The allosteric site is considered to be more structurally distinct than the ATP binding site, and makes possible the discovery and design of highly selective kinase inhibitors. Thus, interest has been growing in the development of kinase inhibitors interacting with the allosteric site, and progress has been reviewed recently.47
Forty-six Chk1 complex structures with a crystallographic resolution from 1.70 Å to 3.50 Å were utilized in this work. To explore the diversity of these Chk1 structures, we compared both the backbone conformation and binding site conformation for all structures. To compare the backbone conformation of a pair of structures, one structure was used as the reference and the other structure was superposed onto it, and then the pairwise C-α RMSD was calculated. To compare the binding site, only residues within 5 Å of the co-crystallized ligand were used in the superposition and then the C-α RMSD of selected residues was calculated. The calculation was performed with scripts in Schrödinger Suite 2009.
In three complexes with allosteric inhibitors, the C-α RMSD of backbone between 3JVR and 3JVS is 5.68 Å, indicating that they have similar global structures. In contrast, the C-α RMSD of the backbone between 3F9N-3JVR and 3F9N-3JVS is 11.75 Å and 9.13 Å respectively, showing that 3F9N has a slightly different structure from the other two (Figure 1). However, the C-α RMSD of the allosteric site is between 0.18 and 0.56 Å for any pair of structures, suggesting that the allosteric site for all three structures is very similar.
For the forty-three complex structures with ATP-competitive inhibitors, the C-α RMSD of the backbone varies from 0.13 Å (2CGX-2CGV) to 18.90 Å (2WMS-2GHG) (Figure 2). The structural fluctuation comes mainly from the loop rearrangement in the N-terminal domain (Figure 1). The C-α RMSD of the ATP-binding site is between 0.05 Å (2WMW-2WMX) and 3.72 Å (1NVS-2YWP) (Figure 2). Notably, 61.4% of structure pairs have a backbone RMSD over 10.0 Å (Figure 2, left) and 31.0% of structure pairs have a binding site RMSD over 3.0 Å (Figure 2, right), indicating that the ensemble of Chk1 structures displays conformational diversity in both backbone and binding site.
A primary measurement of docking accuracy is the ability to reproduce correctly the experimentally determined binding mode of known ligands. To compare docking performance with different protein structures, the ligands from crystallographic complexes were docked to all Chk1 structures, which is also called cross-docking. For each ligand, the cognate protein structure was utilized as the reference and other structures were superposed onto it. In this process, only the C-α atoms of residues forming the binding site were used. The docked pose of the ligand was moved alongside the protein, resulting in an alignment of the docked pose with respect to the reference ligand. Then the RMSD of heavy atoms was computed between the docked pose with the lowest docking score and the crystal pose of the ligand. The common criterion is that the binding mode of a ligand is considered to be successfully predicted if the RMSD is no greater than 2 Å. Thus, performance of one protein structure was simply evaluated by counting successfully docked ligands. The superposition of structures and calculation of RMSD were performed using scripts in Schrödinger Suite 2009.
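The success criterion above can be sketched in a few lines. This is a simplified illustration rather than the Schrödinger scripts used in the work: it assumes the docked pose has already been transformed into the reference frame, that both poses list the same heavy atoms in the same order, and it ignores ligand symmetry. The function names are hypothetical.

```python
import math


def heavy_atom_rmsd(pose, reference):
    """RMSD between two poses given as equal-length lists of (x, y, z)
    heavy-atom coordinates, already expressed in the same frame."""
    if not pose or len(pose) != len(reference):
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(
        (p - q) ** 2
        for atom_p, atom_q in zip(pose, reference)
        for p, q in zip(atom_p, atom_q)
    )
    return math.sqrt(squared / len(pose))


def success_rate(rmsds, cutoff=2.0):
    """Percentage of ligands whose top-scored pose is within `cutoff` Å."""
    if not rmsds:
        raise ValueError("rmsds must be non-empty")
    return 100.0 * sum(r <= cutoff for r in rmsds) / len(rmsds)
```

Calling `success_rate` on the list of RMSDs obtained for one protein structure yields the per-structure percentage of successfully docked ligands.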
Table 1 lists the results from cross-docking for the 3 allosteric inhibitors and their co-crystallized protein structures. In native docking, that is, re-docking of the ligand into the co-crystallized protein structure, the docked pose is within 2.0 Å of the crystal ligand pose for all 3 ligands. In cross-docking, the protein structure from 3JVR performs better than the other two: with it, the predicted binding modes for all three known inhibitors are close (RMSD < 2.0 Å) to the corresponding reference ligand.
Figure 3 displays the performance of 43 Chk1 structures with ATP-competitive ligands, expressed as a percentage of successfully docked ligands. In native docking, the success rate is 90.7%, and for all ligands, the RMSDs are less than 3.0 Å between the docked pose and the reference ligand. In cross-docking, the success rate of individual structures varies from 16.3% to 60.5%, and the average rate of docking accuracy is 34.9%. Any ligand can be recognized by at least one structure, not necessarily the co-crystallized one. Our results are consistent with previous reports48-51 and also show a significant decrease in success rates from native docking to cross-docking.
Another measurement to assess docking performance is the ability to identify active compounds from a pool of decoys. To evaluate performance of different Chk1 structures, a small compound library was docked to each protein structure. The compound library was constructed as described in Methods and comprises more than 2,000 compounds, including 46 ligands from crystal complexes. The performance of each structure was evaluated by enrichment factor (EF) at the top 1% of ranked list.
Only three true binders exist in the testing set for the allosteric site. With the structure of 3JVR, all three inhibitors are ranked at the top 1%; with the structure of 3F9N, only the native ligand is ranked at the top 1%; with the structure of 3JVS, none of them is ranked at the top 1%.
Forty-three true binders exist in the testing set for the ATP-binding site, giving a theoretical maximum EF of 47.6 at the top 1%. Figure 4 shows that docking to a single structure can deliver an EF of up to 21.4, which is achieved by two structures (1NVR and 2R0U). In the worst case, which occurs with two structures (2GDO and 2WMS), none of the known ligands is ranked at the top 1%. A structure giving a high success rate in prediction of binding mode does not necessarily obtain a high EF value in recovery of known ligands, which agrees with previous results.48-49
When several structures are combined, obtaining better results is more likely than when only the best single structure is used (Figure 5). Notably, the EF can rise from 21.4 with the best single structure to 28.5 with only two structures combined. The EF reaches its maximal value (33.2) when six structures are combined, falls to 30.9 when 27 structures are combined, and decreases further as the number of combined structures increases. When all 43 structures are combined, the EF is only 19.0, which is a little worse than that with the best single structure (21.4). Our results agree with the so-called anti-cooperative behavior49 reported previously, in which the result may degrade as the number of protein structures increases. Nevertheless, combining all 43 structures still performs better than most of the single structures: only 3 out of 43 structures (7.0%) have an EF greater than or equal to 19.0. This will be further discussed later.
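One way to picture how several structures are combined into a single ranked list is sketched below. The merging rule is an assumption made for illustration, since the text does not spell it out: each compound keeps its best (lowest) docking score over the structures in the ensemble. The function names `merge_best_scores` and `actives_in_top` are hypothetical.

```python
def merge_best_scores(score_tables):
    """Combine per-structure docking results.

    score_tables: list of dicts mapping compound id -> docking score
    (lower is better). Assumed rule: keep the lowest score per compound.
    """
    merged = {}
    for table in score_tables:
        for compound, score in table.items():
            if compound not in merged or score < merged[compound]:
                merged[compound] = score
    return merged


def actives_in_top(merged, actives, n):
    """Count known inhibitors among the n best-scored compounds."""
    ranked = sorted(merged, key=merged.get)  # ascending: best score first
    return sum(1 for compound in ranked[:n] if compound in actives)
```

Under this rule a decoy that scores well against any one structure also enters the merged list, which is one plausible reading of why enrichment can degrade as more structures are added.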
The choice of the most appropriate receptor structure is key for successful virtual screening. The ideal structure should have a high success rate in prediction of binding mode and a high EF in recovery of known ligands. Prediction of binding modes is considered the most successful area in docking and scoring, whereas scoring functions remain too primitive to rank a docking hit list correctly.52 In our experience, choosing a structure that performs best in both prediction of binding mode and recovery of known ligands is difficult. In practice, possible lead compounds are generally chosen according to scoring rank. Therefore, we determined the optimal structures for virtual screening mainly by the enrichment in recovery of known ligands.
For the allosteric site, all three structures successfully reproduce the experimental poses of their native ligands in prediction of the binding mode, while only 3JVR correctly predicts the docking poses of all three ligands. In recovery of known ligands, 3F9N and 3JVR rank their native ligands at the top 1%, and 3JVR manages to rank all three known ligands at the top 1%. That is, 3JVR succeeds in reproducing the crystal poses of all three ligands and in distinguishing them from decoy compounds. Thus, it is the ideal structure for virtual screening and was used in the subsequent virtual screening for Chk1 inhibitors.
For the ATP-binding site, it is possible to find a small subset of Chk1 structures that performs best in both prediction of binding mode and recovery of known ligands. In practice, the results from recovery of known ligands are more intuitive than those from the prediction of binding mode. Therefore, we selected several structures based on the results from recovery of known ligands. Three structures, 1NVR, 2R0U and 2WMX, perform the best or near the best in single structure docking, and their combinations perform the best in multiple structures docking. The EF for each single structure at the top 1% is: 1NVR, 21.4; 2R0U, 21.4; 2WMX, 16.6. The EF for each structure ensemble at the top 1% is: 1NVR-2R0U, 28.5; 1NVR-2WMX, 28.5; 1NVR-2R0U-2WMX, 30.9. These three structures were used in the subsequent virtual screening for Chk1 inhibitors.
More than 60,000 diverse compounds were retrieved from the ZINC53 database and prepared as described in Methods. They were each docked sequentially to the four chosen Chk1 structures (3JVR for the allosteric site and 1NVR, 2R0U and 2WMX for the ATP-binding site) using DOCK 6.3. More than one hundred compounds with the lowest docking scores were visually inspected, focusing on chemically attractive and novel structures. Finally, six compounds were selected to be tested experimentally based on availability and purchase cost (Figure 6). Two of the compounds (chk#2 and chk#4) exhibited an inhibitory effect against Chk1, with IC50 values of 9.6 μM and 49.5 μM, respectively. The two active compounds and 43 crystal ligands were compared using Canvas v1.4.54 The similarity was calculated based on fingerprints with the Tanimoto index as a metric. For chk#2, the similarity with the 43 crystal ligands varies from 0.005 to 0.039; for chk#4, the similarity varies from 0.002 to 0.039. For comparison, the similarity among the 43 crystal ligands ranges from 0.002 to 0.89. Therefore, the two active compounds are structurally different from known inhibitors.
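The Tanimoto index used in this comparison is the number of fingerprint bits shared by two molecules divided by the number of bits set in either. The sketch below illustrates the metric on fingerprints represented as sets of "on" bit positions; it is not the Canvas implementation, and the fingerprint type used by Canvas is not reproduced here.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto index between two fingerprints given as collections of
    'on' bit positions: |A ∩ B| / |A ∪ B|."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Toy fingerprints (hypothetical bit positions).
print(tanimoto({1, 2, 3}, {2, 3, 4}))  # 0.5
```

Values near 0, such as those reported for chk#2 and chk#4 against the crystal ligands, mean the fingerprints share almost no bits.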
Figure 7 displays the distribution of the enrichment factor for single structures and for all possible combinations of 2, 3, 40, 41 and 42 structures. The EF is 19.0 when all 43 structures are utilized in docking, and this value is used as a threshold to examine how the EF varies with an increasing number of structures. For a single structure, 7.0% of the 43 structures obtain an EF no less than 19.0; for ensembles with 2 and 3 structures, 15.5% of 903 combinations and 21.9% of 12,341 combinations, respectively, have an EF equal to or greater than 19.0; for ensembles with 40, 41, 42 and 43 structures, the percentage rises to 96.6%, 98.8%, 100% and 100%, respectively. The probability of achieving an EF no less than 19.0 thus rises with the number of structures combined, although we did not calculate the percentage for ensembles with 4 to 39 structures because of the huge number of possible combinations. Our results clearly indicate that obtaining a good enrichment at the top of the ranked list is more likely with multiple structures than with a single structure when structures are chosen randomly. On the other hand, the percentage of single structures or structure combinations with an EF of 0 decreases steadily, from 4.7% with a single structure to 3.2% with 2 structures and 2.5% with 3 structures. Moreover, the lowest EF is 11.9, 14.2 and 19.0, respectively, for ensembles with 40, 41 and 42 structures, suggesting that the risk of getting the worst enrichment is greatly reduced when multiple structures are used. Therefore, the advantage of multiple structures is that they can not only outperform the best single structure (Figure 5), but also avoid the worst result and deliver a good enrichment even if it is not the best.
One common criticism of the multiple structures approach is that it neglects the conformational energy difference between receptor structures. Taking conformational energy into account has been reported to improve enrichment,55 binding mode prediction49 and ligand selectivity profile12 even with very crude approximations when multiple receptor conformations are applied in docking. Here, to evaluate whether considering the conformational energy difference can improve docking performance with multiple structures, we calculated the conformational energy difference with targeted molecular dynamics,56 a method that is computationally demanding but more precise than those in previous reports,49, 55 where the conformational energy difference was simply estimated. The maximal difference of docking score between the top 1% compounds for all structures is 20.6 kcal/mol (Figure 8). The conformational energy difference ranges from 0.19 to 278.95 kcal/mol for pairwise structures, and only 27.6% of pairwise structures have an energy difference less than 20 kcal/mol (Figure 8). This means that, for most pairs of structures, the enrichment factor of the combination would be no better than that from the best single structure, because conformational energy plays a dominant role in ranking. A total of 238 pairs of structures have an energy difference that is less than 20 kcal/mol. We examined whether the enrichment factor could be improved for 2-structure ensembles when taking into account the conformational energy difference. Our results show that considering conformational energy hardly improves the enrichment factor in recovery of known ligands. In contrast, the enrichment factor deteriorates in most cases. In the few ensembles for which the enrichment factor increases, a large conformational energy difference exists between the two protein structures, so the ensemble performs no better than the better single structure.
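The dominance argument above can be illustrated with a toy calculation. The sketch assumes, as a simplification that the text does not state explicitly, that the relative conformational energy of a receptor structure is simply added to every docking score obtained with that structure before the results are merged. The function name, structure labels and all numbers are hypothetical.

```python
def penalized_ranking(scores_by_structure, conf_energy):
    """Rank compounds after adding each structure's relative conformational
    energy (kcal/mol) to the docking scores obtained with that structure.

    Returns a list of (compound, (adjusted_score, structure)) sorted from
    best (lowest) to worst adjusted score.
    """
    best = {}
    for structure, table in scores_by_structure.items():
        penalty = conf_energy[structure]
        for compound, score in table.items():
            adjusted = score + penalty
            if compound not in best or adjusted < best[compound][0]:
                best[compound] = (adjusted, structure)
    return sorted(best.items(), key=lambda item: item[1][0])


# Toy data: s2 scores compound x better, but carries a 50 kcal/mol penalty.
scores = {"s1": {"x": -40.0, "y": -35.0}, "s2": {"x": -50.0, "y": -30.0}}
energy = {"s1": 0.0, "s2": 50.0}
ranking = penalized_ranking(scores, energy)
```

In this toy case every entry of `ranking` comes from `s1`: once the penalty exceeds the spread of docking scores, the higher-energy structure no longer contributes, so the pair behaves like the lower-energy structure alone.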
Of the 43 ATP-competitive inhibitors, only 22 ligands (51%) are ranked at the top 1% of the screening results in at least one of the 43 independent docking simulations. The other 21 true inhibitors are falsely classified as negatives no matter which structure is used, with 1% as the threshold. The most frequently recognized ligands are from 2E9O, 1ZYS, 2E9V and 2GHG, and they are ranked at the top 1% by 31, 28, 27 and 26 structures, respectively (Figure 9), indicating that the crystal structures tend to identify some ligands more easily than others. The 22 ligands are recognized a total of 187 times at the top 1%, and these 4 ligands account for 60% of those occurrences. To further explore the chemical diversity of Chk1 ligands, we utilized the hierarchical clustering method in Canvas54 to classify all ligands from crystal complexes (Figure 10). Of the 4 most recognizable ligands, the ligand from 2E9O belongs to cluster 10; the ligands from 1ZYS and 2E9V belong to cluster 5; and the ligand from 2GHG belongs to cluster 4. In the results from recovery of known ligands, most known ligands ranked at the top 1% belong to cluster 5 no matter which structure is employed. Twelve of the 22 recognizable ligands belong to cluster 5, the biggest cluster (19 ligands), and they occur 106 times at the top 1%. However, in cluster 6, which contains 12 ligands and is the second biggest cluster, only three ligands are ranked at the top 1%, for a total of 8 times. For the 3 structures employed in virtual screening, the ligand from 1NVR belongs to cluster 11; the ligand from 2R0U belongs to cluster 5; and the ligand from 2WMX belongs to cluster 6. In addition to the easily recognized ligands in cluster 5, the three ligands in cluster 11 are ranked at the top 1% with 1NVR, and 2 ligands in cluster 6 are ranked at the top 1% with 2WMX. The results suggest that virtual screening is more likely to identify diverse compounds with multiple structures than with a single structure.
Previous results also raise another common question as to whether a simple and reliable strategy exists to choose the optimal ensemble of receptor structures according to available properties of receptors and ligands. Several efforts have been made to identify the most enriching structure without running a small-scale virtual screening. Selecting significant structures based on docking results from prediction of binding mode seems unsuccessful because of absence of correlation between good docked poses and good scores or ranks.48-49 Interestingly, the volume of the binding site that the co-crystallized ligand occupies is considered a possible reason to explain the performance of different crystal structures.50 Five strategies to construct ensembles have been compared in a recent study.11
Here we examine the protein structures in the optimal ensemble and the corresponding co-crystallized ligands to look for properties that can be utilized as selection criteria. The backbone C-α RMSD for pairwise structures in the optimal ensemble is: 1NVR-2R0U, 7.2 Å; 1NVR-2WMX, 15.7 Å; and 2R0U-2WMX, 14.7 Å. Although 2R0U and 2WMX have different backbone conformations, the C-α RMSD of their binding sites is less than 0.3 Å, suggesting that they have a similar binding site. The C-α RMSD of the binding site for the other structure pairs is between 2.2 and 3.1 Å. As mentioned earlier, the ligands from 1NVR, 2R0U and 2WMX belong to clusters 11, 5 and 6, respectively. In summary, the optimal ensemble of Chk1 structures can be described by three criteria: 1) they have different backbone conformations; 2) their cognate ligands belong to different chemical spaces (clusters); and 3) the ligands come from the three biggest clusters. However, determining the optimal ensemble based on protein backbone comparison and ligand clustering is not sufficient, because a great number of ensembles meet the three requirements. According to our results, when active ligands are available, evaluating protein structures by the recovery of known ligands is more appropriate for selecting the optimal ensemble for virtual screening than a simple strategy based on properties of receptors and ligands, though this approach needs more computer time.
One disadvantage of multiple structures docking is that the required computation time increases linearly with the number of structures, which makes it a computationally demanding process. However, now with the emergence of powerful and parallel supercomputers and clusters, multiple structures docking can be performed daily in drug discovery. In this work, all docking simulations were performed on an IBM Blue Gene/L supercomputer. In predicting the binding mode, less than 17 minutes with 128 nodes was required to dock one Chk1 structure to the 46 known ligands, and about 3 hours with 512 nodes were required to dock 46 Chk1 structures to 46 known ligands. In the recovery of known ligands, less than 15 minutes with 512 nodes was needed to dock one Chk1 structure to the testing set of over 2,000 compounds. When 46 Chk1 structures were docked to the testing set, less than 6 hours were needed because 1,024 nodes were used. In virtual screening of compounds extracted from ZINC database, about 20 hours were needed with 512 nodes to dock over 60,000 compounds into one Chk1 structure.
In this work, we examined the performance of a number of Chk1 crystal structures in the prediction of binding mode and recovery of known ligands. An optimal ensemble of Chk1 structures was selected mainly based on the results from recovery of known ligands and used for the successful discovery of novel Chk1 inhibitors. Furthermore, we compared the performance of multiple structures versus a single structure in the recovery of known ligands when structures were selected randomly. We found that obtaining a good enrichment is more likely with multiple structures than with a single structure. Identifying more diverse ligands is also more likely with multiple structures than with a single structure according to our ligand clustering results. We found that the chosen protein structures displayed diverse conformations in backbone and binding site, and the co-crystallized ligands belonged to different clusters. However, selecting a small subset of protein structures according to the available information on receptors and ligands remained challenging. Evaluating protein structures by the recovery of known inhibitors is a reliable approach to determine the optimal ensemble of structures. Although this process is computationally expensive, the increasing power of supercomputers and clusters makes it less time-consuming. We demonstrated that the evaluation process could be finished in 10 hours using the IBM Blue Gene/L supercomputer. Finally, the conformational energy difference was calculated by targeted molecular dynamics and our results indicated that including it did not improve the enrichment.
This work was supported by The Hormel Foundation, National Institutes of Health Grant R37 CA081064 and NCI Contract Number HHSN-261200533001C - NO1-CN-53301.