The success of ligand docking calculations typically depends on the quality of the receptor structure. Given improvements in protein structure prediction approaches, approximate protein models now can be routinely obtained for the majority of gene products in a given proteome. Structure-based virtual screening of large combinatorial libraries of lead candidates against theoretically modeled receptor structures requires fast and reliable docking techniques capable of dealing with structural inaccuracies in protein models. Here, we present Q-DockLHM, a method for low-resolution refinement of binding poses provided by FINDSITELHM, a ligand homology modeling approach. We compare its performance to that of classical ligand docking approaches in ligand docking against a representative set of experimental (both holo and apo) as well as theoretically modeled receptor structures. Docking benchmarks reveal that unlike all-atom docking, Q-DockLHM exhibits the desired tolerance to the receptor’s structure deformation. Our results suggest that the use of an evolution-based approach to ligand homology modeling followed by fast low-resolution refinement is capable of achieving satisfactory performance in ligand-binding pose prediction with promising applicability to proteome-scale applications.
Q-dock; ligand docking; homology modeling; low-resolution modeling; threading
To find out whether linarin can be used as a potential natural inhibitor to target CDK4 in retinoblastoma using virtual screening studies.
Materials and Methods:
In this study, molecular modeling and protein structure optimization were performed on the crystal structure of CDK4 (PDB id: 3G33), which was then subjected to a 10-nanosecond Molecular Dynamics (MD) simulation as a preparatory step for docking. The stable conformation obtained from the MD simulation was then used for virtual screening against the library of natural compounds in the Indian Plant Anticancer Compounds Database (InPACdb) using AutoDock Vina. Finally, the best docked ligands were revalidated individually through semi-flexible docking with AutoDock 4.0.
The CDK4 structure was stereochemically optimized to fix clashes and bad angles, which placed 96.4% of residues in the core region of the Ramachandran plot. The final structure of CDK4 that emerged after MD simulation was found to be highly stable by several validation tools. Virtual screening and docking were carried out for CDK4 against optimized ligands from InPACdb using AutoDock Vina. This identified linarin (InPACdb AC.NO. acd0073) as a potential therapeutic agent with a binding energy of -8.9 kcal/mol. This result was also supported by the AutoDock 4.0 semi-flexible docking procedure, which gave a binding energy of -8.18 kcal/mol and a Ki value of 1.01 μM.
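The relation between a docking free energy and an inhibition constant can be cross-checked with a few lines of arithmetic. The sketch below assumes the AutoDock 4 convention, in which the estimated free energy of binding is expressed in kcal/mol and Ki is derived as exp(ΔG/RT) at 298.15 K; the function name and constants are illustrative and not part of the study.

```python
import math

R_KCAL = 1.98720425864083e-3  # gas constant in kcal/(mol*K)

def inhibition_constant(delta_g_kcal_per_mol, temperature_k=298.15):
    """Return Ki in mol/L from a binding free energy, using Ki = exp(dG / (R*T))."""
    return math.exp(delta_g_kcal_per_mol / (R_KCAL * temperature_k))

# AutoDock 4.0 energy reported for linarin, treated here as kcal/mol (assumption).
ki_molar = inhibition_constant(-8.18)
print(f"Ki = {ki_molar * 1e6:.2f} uM")  # Ki = 1.01 uM
```

Under these assumptions an energy of -8.18 reproduces the reported Ki of about 1.01 μM.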
The docking results indicate that linarin, a flavonoid plant compound, is a potential inhibitor of CDK4 when compared with some of the anticancer drugs currently used for retinoblastoma. This finding can be extended to experimental validation to assess the in vivo efficacy of the identified compound.
Cyclin-dependent kinase 4; InPACdb; molecular docking; molecular dynamics; retinoblastoma; virtual screening
Virtual screening by molecular docking has become a widely used approach to lead discovery in the pharmaceutical industry when a high resolution structure of the biological target of interest is available. The performance of three widely-used docking programs (Glide, GOLD, and DOCK) for virtual database screening is studied when they are applied to the same protein target and ligand set. Comparisons of the docking programs and scoring functions using a large and diverse data set of pharmaceutically interesting targets and active compounds are carried out. We focus on the problem of docking and scoring flexible compounds which are sterically capable of docking into a rigid conformation of the receptor. The Glide XP methodology is shown to consistently yield enrichments superior to the two alternative methods, while GOLD outperforms DOCK on average. The study also shows that docking into multiple receptor structures can decrease the docking error in screening a diverse set of active compounds.
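Enrichment, the metric used to compare such programs, can be illustrated with a short calculation. The sketch below is a minimal example, assuming that lower docking scores are better and using the common definition of the enrichment factor (the hit rate in the top-ranked fraction divided by the hit rate in the whole library); the compound names and scores are hypothetical placeholders, not data from the study.

```python
def enrichment_factor(scores, actives, top_fraction=0.1):
    """Enrichment factor of known actives in the top-scoring fraction of a screen.

    scores: dict mapping compound name -> docking score (lower = better, an assumption).
    actives: set of compound names known to be active (all assumed present in scores).
    """
    if not scores or not actives:
        raise ValueError("need at least one scored compound and one active")
    ranked = sorted(scores, key=scores.get)            # best score first
    n_top = max(1, round(top_fraction * len(ranked)))  # size of the selected subset
    hits = sum(1 for name in ranked[:n_top] if name in actives)
    return (hits / n_top) / (len(actives) / len(ranked))

# Hypothetical screen: 10 compounds, 2 actives, both ranked at the top.
demo_scores = {f"cpd{i}": -10.0 + i for i in range(10)}
demo_actives = {"cpd0", "cpd1"}
demo_ef = enrichment_factor(demo_scores, demo_actives, top_fraction=0.2)
print(demo_ef)  # 5.0
```

An enrichment factor of 1 corresponds to random selection; the maximum attainable value is bounded by the inverse of the selected fraction.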
The rapidly growing number of theoretically predicted protein structures requires robust methods that can utilize low-quality receptor structures as targets for ligand docking. Typically, docking accuracy falls off dramatically when apo or modeled receptors are used in docking experiments. Low-resolution ligand docking techniques have been developed to deal with structural inaccuracies in predicted receptor models. In this spirit, we describe the development and optimization of a knowledge-based potential implemented in Q-Dock, a low-resolution flexible ligand docking approach. Self-docking experiments using crystal structures reveal satisfactory accuracy, comparable with all-atom docking. All-atom models reconstructed from Q-Dock’s low-resolution models can be further refined by even a simple all-atom energy minimization. In decoy-docking against distorted receptor models with a root-mean-square deviation, RMSD, from native of ~3 Å, Q-Dock recovers on average 15–20% more specific contacts and 25–35% more binding residues than all-atom methods. To further improve docking accuracy against low-quality protein models, we propose a pocket-specific protein-ligand interaction potential derived from weakly homologous threading holo-templates. The success rate of Q-Dock employing a pocket-specific potential is 6.3 times higher than that previously reported for the Dolores method, another low-resolution docking approach.
Q-Dock; ligand docking; low-resolution docking; pocket-specific potential; protein models; threading
Incorporating receptor flexibility is considered crucial for improvement of docking-based virtual screening. With an abundance of crystallographic structures freely available, docking with multiple crystal structures is believed to be a practical approach to cope with protein flexibility. Here we describe a successful application of the docking of multiple structures to discover novel and potent Chk1 inhibitors. Forty-six Chk1 structures were first compared in single structure docking by predicting the binding mode and recovering known ligands. Combinations of different protein structures were then compared by recovery of known ligands, and an optimal ensemble of Chk1 structures was selected. The chosen structures were used in the virtual screening of over 60,000 diverse compounds for Chk1 inhibitors. Six novel compounds ranked at the top of the hits list were tested experimentally and two of these compounds inhibited Chk1 activity, the best with an IC50 value of 9.6 μM. Further study indicated that achieving a better enrichment and identifying more diverse compounds was more likely using multiple structures than using only a single structure, even when protein structures were randomly selected. Taking into account conformational energy difference did not help to improve enrichment in the top ranked list.
Virtual screening is becoming an important tool for drug discovery. However, the application of virtual screening has been limited by the lack of accurate scoring functions. Here, we present a novel scoring function, MedusaScore, for evaluating protein-ligand binding. MedusaScore is based on models of physical interactions that include van der Waals, solvation and hydrogen bonding energies. To ensure the best transferability of the scoring function, we do not use any protein-ligand experimental data for parameter training. We then test the MedusaScore for docking decoy recognition and binding affinity prediction and find superior performance compared to other widely used scoring functions. Statistical analysis indicates that one source of inaccuracy of MedusaScore may arise from the unaccounted entropic loss upon ligand binding, which suggests avenues of approach for further MedusaScore improvement.
Molecular docking simulations of fully flexible protein receptor (FFR) models are coming of age. In our studies, an FFR model is represented by a series of different conformations derived from a molecular dynamics simulation trajectory of the receptor. For each conformation in the FFR model, a docking simulation is executed and analyzed. An important challenge is to perform virtual screening of millions of ligands using an FFR model in a sequential mode, since it can become computationally very demanding. In this paper, we propose a cloud-based web environment, called web Flexible Receptor Docking Workflow (wFReDoW), which reduces the CPU time in the molecular docking simulations of FFR models to small molecules. It is based on the new workflow data pattern called self-adaptive multiple instances (P-SaMI) and on a middleware built on Amazon EC2 instances. P-SaMI reduces the number of molecular docking simulations while the middleware speeds up the docking experiments using a High Performance Computing (HPC) environment on the cloud. The experimental results show a reduction in the total elapsed time of docking experiments and demonstrate the quality of the new reduced receptor models produced by discarding the nonpromising conformations from an FFR model as directed by the P-SaMI data pattern.
Molecular docking is the most practical approach to leverage protein structure for ligand discovery, but the technique retains important liabilities that make it challenging to deploy on a large scale. We have therefore created an expert system, DOCK Blaster, to investigate the feasibility of full automation. The method requires a PDB code, sometimes with a ligand structure, and from that alone can launch a full screen of large libraries. A critical feature is self-assessment, which estimates the anticipated reliability of the automated screening results using pose fidelity and enrichment. Against common benchmarks, DOCK Blaster recapitulates the crystal ligand pose within 2 Å rmsd 50−60% of the time; inferior to an expert, but respectable. Half the time the ligand also ranked among the top 5% of 100 physically matched decoys chosen on the fly. Further tests were undertaken culminating in a study of 7755 eligible PDB structures. In 1398 cases, the redocked ligand ranked in the top 5% of 100 property-matched decoys while also posing within 2 Å rmsd, suggesting that unsupervised prospective docking is viable. DOCK Blaster is available at http://blaster.docking.org.
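The two self-assessment criteria mentioned above, pose fidelity and ranking against decoys, can be sketched as follows. This is an illustrative example only, not DOCK Blaster code: it assumes poses are given as lists of (x, y, z) coordinates in the same atom order and frame, that lower scores are better, and that "top 5% of 100 decoys" means fewer than five decoys score better than the ligand. All names and values are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two poses with identical atom ordering (no superposition applied)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

def in_top_fraction(ligand_score, decoy_scores, fraction=0.05):
    """True if fewer than fraction * len(decoy_scores) decoys outscore the ligand."""
    better = sum(1 for score in decoy_scores if score < ligand_score)
    return better < fraction * len(decoy_scores)

# Hypothetical two-atom pose displaced by 1 A along z, and 100 placeholder decoy scores.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
docked_pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]
decoy_scores = [-float(i) for i in range(100)]

pose_ok = rmsd(crystal_pose, docked_pose) <= 2.0
rank_ok = in_top_fraction(-96.5, decoy_scores)
print(pose_ok and rank_ok)  # True
```

A redocking case counts as a success in this sketch only when both checks pass, mirroring the combined criterion described for the 1398 cases.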
The rapidly increasing number of high-resolution X-ray structures of G-protein coupled receptors (GPCRs) creates a unique opportunity to employ comparative modeling and docking to provide valuable insight into the function and ligand binding determinants of novel receptors, to assist in virtual screening and to design and optimize drug candidates. However, low sequence identity between receptors, conformational flexibility, and chemical diversity of ligands present an enormous challenge to molecular modeling approaches. It is our hypothesis that rapid Monte-Carlo sampling of protein backbone and side-chain conformational space with Rosetta can be leveraged to meet this challenge. This study performs unbiased comparative modeling and docking methodologies using 14 distinct high-resolution GPCRs and proposes knowledge-based filtering methods for improvement of sampling performance and identification of correct ligand-receptor interactions. On average, top ranked receptor models built on template structures over 50% sequence identity are within 2.9 Å of the experimental structure, with an average root mean square deviation (RMSD) of 2.2 Å for the transmembrane region and 5 Å for the second extracellular loop. Furthermore, these models are consistently correlated with low Rosetta energy score. To predict their binding modes, ligand conformers of the 14 ligands co-crystallized with the GPCRs were docked against the top ranked comparative models. In contrast to the comparative models themselves, however, it remains difficult to unambiguously identify correct binding modes by score alone. On average, sampling performance was improved by 10^3-fold over random using knowledge-based and energy-based filters. In assessing the applicability of experimental constraints, we found that sampling performance is increased by one order of magnitude for every 10 residues known to contact the ligand. Additionally, in the case of DOR, knowledge of a single specific ligand-protein contact improved sampling efficiency 7-fold. These findings offer specific guidelines which may lead to increased success in determining receptor-ligand complexes.
We employ ensemble docking simulations to characterize the interactions of two enantiomeric forms of a Ru-complex compound (1-R and 1-S) with three protein kinases, namely PIM1, GSK-3β, and CDK2/cyclin A. We show that our ensemble docking computational protocol adequately models the structural features of these interactions and discriminates between competing conformational clusters of ligand-bound protein structures. Using the determined X-ray crystal structure of PIM1 complexed to the compound 1-R as a control, we discuss the importance of including the protein flexibility inherent in the ensemble docking protocol, for the accuracy of the structure prediction of the bound state. A comparison of our ensemble docking results suggests that PIM1 and GSK-3β bind the two enantiomers in similar fashion, through two primary binding modes: conformation I, which is very similar to the conformation presented in the existing PIM1/compound 1-R crystal structure; conformation II, which represents a 180° flip about an axis through the NH group of the pyridocarbazole moiety, relative to conformation I. In contrast, the binding of the enantiomers to CDK2 is found to have a different structural profile including a suggested bound conformation, which lacks the conserved hydrogen bond between the kinase and the ligand (i.e., ATP, staurosporine, Ru-complex compound). The top scoring conformation of the inhibitor bound to CDK2 is not present among the top-scoring conformations of the inhibitor bound to either PIM1 or GSK-3β and vice-versa. Collectively, our results help provide atomic-level insights into inhibitor selectivity among the three kinases.
Small molecular kinase inhibitor; Protein kinase; Inhibitor selectivity; Ruthenium-based organometallic compound; Molecular dynamics simulation; Molecular docking; Protein flexibility; Ensemble molecular docking
The number of protein targets with a known or predicted three-dimensional structure and of drug-like chemical compounds is growing rapidly, and so is the need for new therapeutic compounds or chemical probes. Performing flexible structure-based virtual screening computations on thousands of targets with millions of molecules is neither tractable for most laboratories nor indeed desirable. Since shape complementarity is of primary importance for most protein-ligand interactions, we have developed a tool/protocol based on rigid-body docking to select compounds that fit well into binding sites.
Here we present an efficient multiple conformation rigid-body docking approach, MS-DOCK, which is based on the program DOCK. This approach can be used as the first step of a multi-stage docking/scoring protocol. First, we developed and validated the Multiconf-DOCK tool that generates several conformers per input ligand. Then, each generated conformer (bioactives and 37970 decoys) was docked rigidly using DOCK6 with our optimized protocol into seven different receptor-binding sites. MS-DOCK was able to significantly reduce the size of the initial input library for all seven targets, thereby facilitating subsequent more CPU demanding flexible docking procedures.
MS-DOCK can be easily used for the generation of multi-conformer libraries and for shape-based filtering within a multi-step structure-based screening protocol in order to shorten computation times.
Progress in functional genomics and structural studies on biological macromolecules are generating a growing number of potential targets for therapeutics, adding to the importance of computational approaches for small molecule docking and virtual screening of candidate compounds. In this review, recent improvements in several public domain packages that are widely used in the context of drug development, including DOCK, AutoDock, AutoDock Vina and Screening for Ligands by Induced-fit Docking Efficiently (SLIDE) are surveyed. The authors also survey methods for the analysis and visualisation of docking simulations, as an important step in the overall assessment of the results. In order to illustrate the performance and limitations of current docking programs, the authors used the National Center for Toxicological Research (NCTR) oestrogen receptor benchmark set of 232 oestrogenic compounds with experimentally measured strength of binding to oestrogen receptor alpha. The methods tested here yielded a correlation coefficient of up to 0.6 between the predicted and observed binding affinities for active compounds in this benchmark.
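The correlation coefficient quoted for predicted versus observed binding affinities is typically a Pearson coefficient; the abstract does not state which coefficient was used, so this is an assumption. A minimal sketch of the calculation is shown below, with purely hypothetical input values.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical predicted docking scores and measured affinities (placeholder values).
predicted = [-7.1, -8.3, -9.0, -6.2, -10.4]
observed = [-6.5, -8.9, -8.1, -6.8, -9.7]
r_value = pearson_r(predicted, observed)
print(f"r = {r_value:.2f}")
```

A value of 0.6, as reported for the benchmark, would indicate a moderate linear relationship between predicted and measured affinities.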
drug discovery; small molecule docking; virtual screening; docking packages; visualisation of docking poses; oestrogen receptor; oestrogen activity prediction; SAR
Unconstrained rigid docking, flexible side chain docking and protein crystal structure determinations reveal a water-mediated hinge binding mode for a series of benzimidazole ligands of the protein kinase CHK2. This binding mode is different from those previously postulated in the literature and may provide a useful approach to selective small molecule inhibitor design.
Two closely related binding modes have previously been proposed for the ATP-competitive benzimidazole class of checkpoint kinase 2 (CHK2) inhibitors; however, neither binding mode is entirely consistent with the reported SAR. Unconstrained rigid docking of benzimidazole ligands into representative CHK2 protein crystal structures reveals an alternative binding mode involving a water-mediated interaction with the hinge region; docking which incorporates protein side chain flexibility for selected residues in the ATP binding site resulted in a refinement of the water-mediated hinge binding mode that is consistent with observed SAR. The flexible docking results are in good agreement with the crystal structures of four exemplar benzimidazole ligands bound to CHK2 which unambiguously confirmed the binding mode of these inhibitors, including the water-mediated interaction with the hinge region, and which is significantly different from binding modes previously postulated in the literature.
ADP, adenosine diphosphate; ATM, ataxia telangiectasia mutated; ATP, adenosine triphosphate; CHK2, checkpoint kinase 2; GOLD, genetic optimisation for ligand docking; GST, glutathione S-transferase; KD, kinase domain; MOE, molecular operating environment; PARP, poly ADP-ribose polymerase; PDB, protein data bank; PLIF, protein ligand interaction fingerprints; SAR, structure activity relationship; SIFt, structural interaction fingerprints; Kinase; CHK2; Flexible docking; Crystallography; Inhibitor
Incorporating receptor flexibility into molecular docking should improve results for flexible proteins. However, the incorporation of explicit all-atom flexibility with molecular dynamics for the entire protein chain may also introduce significant error and “noise” that could decrease docking accuracy and deteriorate the ability of a scoring function to rank native-like poses. We address this apparent paradox by comparing the success of several flexible receptor models in cross-docking and multiple receptor ensemble docking for p38α mitogen-activated protein (MAP) kinase. Explicit all-atom receptor flexibility has been incorporated into a CHARMM-based molecular docking method (CDOCKER) using both molecular dynamics (MD) and torsion angle molecular dynamics (TAMD) for the refinement of predicted protein-ligand binding geometries. These flexible receptor models have been evaluated, and the accuracy and efficiency of TAMD sampling is directly compared to MD sampling. Several flexible receptor models are compared, encompassing flexible side chains, flexible loops, multiple flexible backbone segments, and treatment of the entire chain as flexible. We find that although including side chain and some backbone flexibility is required for improved docking accuracy as expected, docking accuracy also diminishes as additional and unnecessary receptor flexibility is included into the conformational search space. Ensemble docking results demonstrate that including protein flexibility leads to improved agreement with binding data for 227 active compounds. This comparison also demonstrates that a flexible receptor model enriches high affinity compound identification without significantly increasing the number of false positives from low affinity compounds.
CDOCKER; CHARMM; Binding Pocket; Protein-Ligand Interactions; Flexible Docking; DFG-out; linear interaction energy
The use of multiple X-ray protein structures has been reported to be an efficient alternative for the representation of the binding pocket flexibility needed for accurate small-molecule docking. However, the docking performance of the individual single conformations varies widely, and adding certain conformations to an ensemble can even be counterproductive. Here we used a very large and diverse benchmark of 1068 X-ray protein conformations of 99 therapeutically relevant proteins, first, to compare the performance of the ensemble and single conformation docking, and, secondly, to find the properties of best performing conformers that can be used to select a smaller set of conformers for ensemble docking. The conformer selection has been validated through retrospective virtual screening experiments aimed at separating known ligand binders from decoys. We found that the conformers co-crystallized with the largest ligands displayed high selectivity for binders, and when combined in ensembles they consistently provided better results than randomly chosen protein conformations. The use of ensembles encompassing between 3 and 5 experimental conformations consistently improved the docking accuracy and binders vs. decoys separation.
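One common way to combine docking results from several receptor conformations is to keep, for each ligand, its best score over the ensemble. The abstract does not specify the merging rule used, so the sketch below is an assumption for illustration; the conformer labels, ligand names and scores are hypothetical, and lower scores are taken to be better.

```python
def ensemble_scores(scores_by_conformer):
    """Merge per-conformer docking scores by keeping each ligand's best (lowest) score.

    scores_by_conformer: dict mapping conformer label -> {ligand name: score}.
    """
    merged = {}
    for scores in scores_by_conformer.values():
        for ligand, score in scores.items():
            if ligand not in merged or score < merged[ligand]:
                merged[ligand] = score
    return merged

# Hypothetical scores from two receptor conformations.
demo_runs = {
    "conformer_A": {"lig1": -7.0, "lig2": -5.0},
    "conformer_B": {"lig1": -6.0, "lig2": -9.0, "lig3": -4.0},
}
merged = ensemble_scores(demo_runs)
print(sorted(merged.items(), key=lambda item: item[1]))
# [('lig2', -9.0), ('lig1', -7.0), ('lig3', -4.0)]
```

Because a decoy also benefits from every added conformation, a rule of this kind is one way in which a poorly chosen conformer can degrade binder versus decoy separation.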
Identification of antigenic peptide epitopes is an essential prerequisite in T cell-based molecular vaccine design. Computational (sequence-based and structure-based) methods are inexpensive and efficient compared to experimental approaches in screening numerous peptides against their cognate MHC alleles. In structure-based protocols, suited to alleles with limited epitope data, the first step is to identify high-binding peptides using docking techniques, which need improvement in speed and efficiency to be useful in large-scale screening studies. We present pDOCK: a new computational technique for rapid and accurate docking of flexible peptides to MHC receptors and primarily apply it on a non-redundant dataset of 186 pMHC (MHC-I and MHC-II) complexes with X-ray crystal structures.
We have compared our docked structures with experimental crystallographic structures for the immunologically relevant nonameric core of the bound peptide for MHC-I and MHC-II complexes. Primary testing for re-docking of peptides into their respective MHC grooves generated 159 out of 186 peptides with Cα RMSD of less than 1.00 Å, with a mean of 0.56 Å. Amongst the 25 peptides used for single and variant template docking, the Cα RMSD values were below 1.00 Å for 23 peptides. Compared to our earlier docking methodology, pDOCK shows up to 2.5-fold improvement in accuracy and is ~60% faster. Results of validation against previously published studies represent a seven-fold increase in pDOCK accuracy.
The limitations of our previous methodology have been addressed in the new docking protocol, making it a rapid and accurate method to evaluate pMHC binding. pDOCK is a generic method and, although benchmarked against experimental structures, it can be applied to alleles with no structural data using sequence information. Our outcomes establish the efficacy of our procedure to predict highly accurate peptide structures, permitting conformational sampling of the peptide in the MHC binding groove. Our results also support the applicability of pDOCK for in silico identification of promiscuous peptide epitopes that are relevant to higher proportions of the human population, with greater propensity to activate T cells, making them key targets for the design of vaccines and immunotherapies.
Screening of ligand molecules against target proteins using computer-aided docking is a critical step in rational drug discovery. With this in mind, we attempted to develop a virtual screening application system, named VSDK (Virtual Screening by Docking), which runs on the Windows platform. It is a user-friendly, flexible, and versatile tool which can be used by anyone familiar with Windows OS. The virtual screening performance was tested for an arbitrarily selected receptor, FGFR tyrosine kinase (PDB code: 1agw), using ligands downloaded from the ZINC database, with a grid size of x,y,z = 30,30,30 and a run number of 10. This virtual screening took 90 minutes for 100 molecules. VSDK is freely available at the designated URL, and a simplified manual can be downloaded from the VSDK home page. The scope and usefulness of this tool will grow as computer speed and accuracy improve in the future.
The database is available for free at http://www.pharm.kobegakuin.ac.jp/~akaho/english_top.html
Different classes of compounds were investigated for their binding affinities for different protein tyrosine kinases (PTKs) employing a novel flexible ligand docking approach using AutoDock 3.05 and 4. These compounds include many flavin analogs, which were developed in our group with varying degrees of cytotoxic activity (comparable or moderately superior to cisplatin and ara-c), and database-selected analogs. They were docked onto twelve different families of PTKs retrieved from the Protein Data Bank. These proteins are representatives of plausible models of interactions with chemotherapeutic agents. A comparative study of the intact co-crystallized ligands of various types of PTKs was carried out. Results revealed that the new class of 5-deazapteridine and steroid hybrid compounds VIa,b, and d, and the vertical-type bispyridodipyrimidine with n-hexyl chain junction between its N-10 and N-10 atoms Xa, exhibited non-selective PTK binding capacities, with the lowest (Gb). On the other hand, 2-amino benzoic acid analog IIa, phenoxypyrido [3, 4-d]pyrimidine derivative IVc, tyrosine containing tripeptide Vd, and the one from Sumisho data base 831 are proposed to have selective PTK binding affinities to certain classes of tyrosine kinases, namely, HGFR (c-met), ZAP-70, insulin receptor kinase, and EGFR, respectively. All these compounds of highest affinities were docked within the binding sites of PTKs with reasonable RMSD and 1-5 hydrogen bonds.
The increasing number of RNA crystal structures enables a structure-based approach to the discovery of new RNA-binding ligands. To develop the poorly explored area of RNA-ligand docking, we have conducted a virtual screening exercise for a purine riboswitch to probe the strengths and weaknesses of RNA-ligand docking. Using a standard protein-ligand docking program with only minor modifications, four new ligands with binding affinities in the micromolar range were identified, including two compounds based on molecular scaffolds not resembling known ligands. RNA-ligand docking performed comparably to protein-ligand docking indicating that this approach is a promising option to explore the wealth of RNA structures for structure-based ligand design.
► Using RNA-ligand docking, four new ligands were discovered for a purine riboswitch ► Two of the ligands were based on scaffolds not known to bind to this riboswitch ► Crystal structures were determined that confirm the binding modes of new ligands ► Molecular docking is a promising method for RNA-structure-based ligand design
Protein binding sites undergo ligand specific conformational changes upon ligand binding. However, most docking protocols rely on a fixed conformation of the receptor, or on the prior knowledge of multiple conformations representing the variation of the pocket, or on a known bounding box for the ligand. Here we describe a general induced fit docking protocol that requires only one initial pocket conformation and identifies most of the correct ligand positions as the lowest score. We expanded a previously used diverse “cross-docking” benchmark to thirty ligand-protein pairs extracted from different crystal structures. The algorithm systematically scans pairs of neighbouring side chains, replaces them by alanines, and docks the ligand to each ‘gapped’ version of the pocket. All docked positions are scored, refined with original side chains and flexible backbone and re-scored. In the optimal version of the protocol pairs of residues were replaced by alanines and only one best scoring conformation was selected from each ‘gapped’ pocket for refinement. The optimal SCARE (SCan Alanines and REfine) protocol identifies a near native conformation (under 2 Å RMSD) as the lowest rank for 80% of pairs if the docking bounding box is defined by the predicted pocket envelope, and for as many as 90% of the pairs if the bounding box is derived from the known answer with ~5 Å margin as used in most previous publications. The presented fully automated algorithm takes about two hours per pose of a single processor time, requires only one pocket structure and no prior knowledge about the binding site location. Furthermore, the results for conformationally conserved pockets do not deteriorate due to substantial increase of the pocket variability.
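The scanning step of such a protocol, in which pairs of neighbouring side chains are replaced by alanines to produce 'gapped' pockets, can be sketched as below. This is only an illustration of the enumeration, not the SCARE or ICM implementation: the neighbour criterion (side-chain centroids within a 6 Å cutoff), the data layout and all names are assumptions, and the docking and refinement steps are left out.

```python
import math
from itertools import combinations

def neighbouring_pairs(centroids, cutoff=6.0):
    """Return residue-id pairs whose side-chain centroids lie within `cutoff` angstroms."""
    return [
        (a, b)
        for a, b in combinations(sorted(centroids), 2)
        if math.dist(centroids[a], centroids[b]) <= cutoff
    ]

def gapped_pockets(residue_names, centroids, cutoff=6.0):
    """Yield (pair, pocket) where `pocket` is a copy with both residues set to ALA."""
    for a, b in neighbouring_pairs(centroids, cutoff):
        gapped = dict(residue_names)
        gapped[a] = gapped[b] = "ALA"
        yield (a, b), gapped

# Hypothetical three-residue pocket: residues 10 and 11 are neighbours, 50 is far away.
demo_names = {10: "PHE", 11: "LEU", 50: "TYR"}
demo_centroids = {10: (0.0, 0.0, 0.0), 11: (4.0, 0.0, 0.0), 50: (20.0, 0.0, 0.0)}
demo_gapped = list(gapped_pockets(demo_names, demo_centroids))
print(demo_gapped)  # [((10, 11), {10: 'ALA', 11: 'ALA', 50: 'TYR'})]
```

In a full protocol each gapped pocket would then be passed to a docking run, and the best-scoring pose from each would be refined with the original side chains restored.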
Scanning Docking; Cross Docking; ICM; Internal Coordinate Mechanics; Induced Fit; Receptor Flexibility; Drug Binding; Structure Based Drug Design
Docking and virtual screening (VS) reach maximum potential when the receptor displays the structural changes needed for accurate ligand binding. Unfortunately, these conformational changes are often poorly represented in experimental structures or homology models, debilitating their docking performance. Recently, we have shown that receptors optimized with our LiBERO method (Ligand-guided Backbone Ensemble Receptor Optimization) were able to better discriminate active ligands from inactives in flexible-ligand VS docking experiments. The LiBERO method relies on the use of ligand information for selecting the best performing individual pockets from ensembles derived from normal mode analysis or Monte Carlo. Here we present ALiBERO, a new computational tool that has expanded the pocket selection from single to multiple, allowing for automatic iteration of the sampling-selection procedure. The selection of pockets is performed by a dual method that uses exhaustive combinatorial search plus individual addition of pockets, selecting only those that maximize the discrimination of known active compounds from decoys. The resulting optimized pockets showed increased VS performance when later used in much larger unrelated test sets consisting of biologically active and inactive ligands. In this paper we will describe the design and implementation of the algorithm, using as a reference the human estrogen receptor alpha.
Optimizing amino acid conformation and identity is a central problem in computational protein design. Protein design algorithms must allow realistic protein flexibility to occur during this optimization, or they may fail to find the best sequence with the lowest energy. Most design algorithms implement side-chain flexibility by allowing the side chains to move between a small set of discrete, low-energy states, which we call rigid rotamers. In this work we show that allowing continuous side-chain flexibility (which we call continuous rotamers) greatly improves protein flexibility modeling. We present a large-scale study that compares the sequences and best energy conformations in 69 protein-core redesigns using a rigid-rotamer model versus a continuous-rotamer model. We show that in nearly all of our redesigns the sequence found by the continuous-rotamer model is different and has a lower energy than the one found by the rigid-rotamer model. Moreover, the sequences found by the continuous-rotamer model are more similar to the native sequences. We then show that the seemingly easy solution of sampling more rigid rotamers within the continuous region is not a practical alternative to a continuous-rotamer model: at computationally feasible resolutions, using more rigid rotamers was never better than a continuous-rotamer model and almost always resulted in higher energies. Finally, we present a new protein design algorithm based on the dead-end elimination (DEE) algorithm, which we call iMinDEE, that makes the use of continuous rotamers feasible in larger systems. iMinDEE guarantees finding the optimal answer while pruning the search space with close to the same efficiency of DEE. Availability: Software is available under the Lesser GNU Public License v3. Contact the authors for source code.
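The difference between rigid and continuous rotamers can be illustrated with a one-dimensional toy model. The sketch below is not iMinDEE and uses no real force field: the energy function, the rotamer angles, the ±9° window and the fine grid scan (standing in for a proper continuous minimization) are all assumptions chosen for illustration.

```python
import math

def toy_energy(chi_deg):
    """Hypothetical 1-D side-chain energy with its minimum at chi = 68 degrees."""
    return 1.0 - math.cos(math.radians(chi_deg - 68.0))

def rigid_best(rotamers, energy):
    """Lowest (energy, chi) evaluated only at the discrete rotamer angles."""
    return min((energy(chi), chi) for chi in rotamers)

def continuous_best(rotamers, energy, window=9.0, step=0.1):
    """Lowest (energy, chi) found by scanning a window around each rotamer.

    A fine grid scan is used here as a simple stand-in for local minimization.
    """
    n_steps = int(round(2 * window / step))
    candidates = (
        (energy(chi0 - window + k * step), chi0 - window + k * step)
        for chi0 in rotamers
        for k in range(n_steps + 1)
    )
    return min(candidates)

rotamers = [60.0, 180.0, 300.0]  # idealized chi angles, assumed for this example
rigid_e, rigid_chi = rigid_best(rotamers, toy_energy)
cont_e, cont_chi = continuous_best(rotamers, toy_energy)
print(f"rigid: E={rigid_e:.4f} at {rigid_chi:.1f}; continuous: E={cont_e:.4f} at {cont_chi:.1f}")
```

Because each scanned window contains its rigid rotamer, the continuous result can never be higher in energy than the rigid one in this sketch; the interesting cases in design are those where the lower energies change which sequence is optimal.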
Computational protein design is a promising field with many biomedical applications, such as drug design, or the redesign of new enzymes to perform nonnatural chemical reactions. An essential feature of any protein design algorithm is the ability to accurately model the flexibility that occurs in real proteins. In enzyme design, for example, an algorithm must predict how the designed protein will change during binding and catalysis. In this work we present a large-scale study of 69 protein redesigns that shows the necessity of modeling more realistic protein flexibility. Specifically, we model the continuous space around low-energy conformations of amino acid side chains, and compare it against the standard rigid approach of modeling only a small discrete set of low-energy conformations. We show that by allowing the side chains to move in the continuous space around low energy conformations during the protein design search, we obtain very different sequences that better match real protein sequences. Moreover, we propose a new protein design algorithm that, contrary to conventional wisdom, shows that we can search the continuous space around side chains with close to the same efficiency as algorithms that model only a discrete set of conformations.
Inhibitors of the transmembrane protein sarco/endoplasmic reticulum calcium ATPase (SERCA) are invaluable tools for the study of the enzyme’s physiological functions and they have been recognized as a promising new class of anticancer agents. For the discovery of novel enzyme inhibitors, small molecule docking for virtual screens of large compound libraries has become increasingly important. Since the performance of various docking routines varies considerably, depending on the target and the chemical nature of the ligand, we critically evaluated the performance of four frequently used programs – GOLD, AutoDock, Surflex-Dock, and FRED – for the docking of SERCA inhibitors based on the structures of thapsigargin, di-tert-butylhydroquinone, and cyclopiazonic acid. Evaluation criteria were docking accuracy using crystal structures as references, docking reproducibility, and correlation between docking scores and known bioactivities. The best overall results were obtained by GOLD and FRED. Docking runs with conformationally flexible binding sites produced no significant improvement of the results.
computational docking; scoring function; inhibitory potency; calcium pump; thapsigargin; di-tert-butylhydroquinone; cyclopiazonic acid; inhibitor binding site
A database consisting of 780 ligand-receptor complexes, termed SB2010, has been derived from the Protein Data Bank to evaluate the accuracy of docking protocols for regenerating bound ligand conformations. The goal is to provide easily accessible community resources for development of improved procedures to aid virtual screening for ligands with a wide range of flexibilities. Three core experiments using the program DOCK, which employ rigid (RGD), fixed anchor (FAD), and flexible (FLX) protocols, were used to gauge performance by several different metrics: (1) global results, (2) ligand flexibility, (3) protein family, and (4) crossdocking. Global spectrum plots of successes and failures vs rmsd reveal well-defined inflection regions, which suggest that the commonly used 2 Å criterion is a reasonable choice for defining success. Across all 780 systems, success tracks with the relative difficulty of the calculations: RGD (82.3%) > FAD (78.1%) > FLX (63.8%). In general, failures due to scoring strongly outweigh those due to sampling. Subsets of SB2010 grouped by ligand flexibility (7-or-less, 8-to-15, and 15-plus rotatable bonds) reveal that success degrades linearly for the FAD and FLX protocols, in contrast to RGD, which remains constant. Despite the challenges associated with FLX anchor orientation and on-the-fly flexible growth, success rates for the 7-or-less (74.5%) and, in particular, the 8-to-15 (55.2%) subsets are encouraging. Poorer results for the very flexible 15-plus set (39.3%) indicate substantial room for improvement. Family-based success appears largely independent of ligand flexibility, suggesting a strong dependence on the binding site environment. For example, zinc-containing proteins are generally problematic despite moderately flexible ligands.
Finally, representative crossdocking examples, for carbonic anhydrase, thermolysin, and neuraminidase families, show the utility of family-based analysis for rapid identification of particularly good or bad docking trends, and the type of failures involved (scoring/sampling), which will likely be of interest to researchers making specific receptor choices for virtual screening. SB2010 is available for download at http://rizzolab.org
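The 2 Å success criterion mentioned above can be made concrete with a short Python sketch. It assumes that the docked pose and the crystallographic reference share the receptor frame and the same atom ordering, so no superposition is performed, and it ignores ligand symmetry. Those simplifications, together with the function names and toy coordinates, are illustrative assumptions and not a description of how the SB2010 results were computed.

```python
import math


def rmsd(pose, reference):
    """Root-mean-square deviation (in Å) between matched atom coordinates."""
    if len(pose) != len(reference):
        raise ValueError("pose and reference must list the same atoms")
    squared = sum((a - b) ** 2
                  for p, q in zip(pose, reference)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(pose))


def success_rate(pairs, cutoff=2.0):
    """Fraction of (pose, reference) pairs whose rmsd is within the cutoff."""
    hits = sum(1 for pose, ref in pairs if rmsd(pose, ref) <= cutoff)
    return hits / len(pairs)


# Toy two-atom ligand (invented coordinates, in Å).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
near_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # shifted 1 Å along z
far_pose = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]   # shifted 3 Å along x
print(success_rate([(near_pose, reference), (far_pose, reference)]))
```

Applied per protocol, per flexibility subset, or per protein family, the same function yields the kind of success percentages reported in the abstract.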
Virtual and high-throughput screens (HTS) should have complementary strengths and weaknesses, but studies that prospectively and comprehensively compare them are rare. We undertook a parallel docking and HTS screen of 197,861 compounds against cruzain, a thiol protease target for Chagas disease, looking for reversible, competitive inhibitors. On workup, 99% of the hits were eliminated as false positives, yielding 146 well-behaved, competitive ligands. These fell into five chemotypes: two were prioritized by scoring among the top 0.1% of the docking-ranked library, two were prioritized by behavior in the HTS and by clustering, and one chemotype was prioritized by both approaches. Determination of an inhibitor/cruzain crystal structure and comparison of the high-scoring docking hits to experiment illuminated the origins of docking false negatives and false positives. Prioritizing molecules that are both predicted by docking and HTS-active yields well-behaved molecules, relatively unobscured by the false positives to which both techniques are individually prone.