Virtual screening is becoming an important tool for drug discovery. However, the application of virtual screening has been limited by the lack of accurate scoring functions. Here, we present a novel scoring function, MedusaScore, for evaluating protein-ligand binding. MedusaScore is based on models of physical interactions that include van der Waals, solvation and hydrogen bonding energies. To ensure the best transferability of the scoring function, we do not use any protein-ligand experimental data for parameter training. We then test the MedusaScore for docking decoy recognition and binding affinity prediction and find superior performance compared to other widely used scoring functions. Statistical analysis indicates that one source of inaccuracy of MedusaScore may arise from the unaccounted entropic loss upon ligand binding, which suggests avenues of approach for further MedusaScore improvement.
Based on a statistical mechanics-based iterative method, we have extracted a set of distance-dependent, all-atom pairwise potentials for protein-ligand interactions from the crystal structures of 1300 protein-ligand complexes. The iterative method circumvents the long-standing reference state problem in knowledge-based scoring functions. The resulting scoring function, referred to as ITScore 2.0, has been tested with the CSAR (Community Structure-Activity Resource, 2009 release) benchmark of 345 diverse protein-ligand complexes. ITScore 2.0 achieved a Pearson correlation of R2 = 0.54 in binding affinity prediction. A comparative analysis has been done on the scoring performances of ITScore 2.0, the van der Waals (VDW) scoring function, the VDW with heavy atoms only, and the force field (FF) scoring function of DOCK, which consists of a VDW term and an electrostatic term. The results reveal several important factors that affect the scoring performances, which could be helpful for the improvement of scoring functions.
scoring function; molecular docking; CSAR benchmark; ligand-protein interactions; knowledge-based
Two sets of ligand binding decoys have been constructed for the CSAR (Community Structure-Activity Resource) benchmark by using the MDock and DOCK programs for rigid-ligand and flexible-ligand docking, respectively. The decoys generated for each complex in the benchmark thoroughly cover the binding site and also contain a certain number of near-native binding modes. A few scoring functions have been evaluated using the ligand binding decoy sets for their abilities of predicting near-native binding modes. Among them, ITScore achieved a success rate of 86.7% for the rigid-ligand decoys and 79.7% for the flexible-ligand decoys, under the common definition of a successful prediction as RMSD < 2.0 Å from the native structure if the top-scored binding mode was considered. The decoy sets may serve as benchmarks for binding mode prediction of a scoring function, which are available at the CSAR website (http://www.csardock.org/).
molecular docking; scoring function; CSAR benchmark; binding mode; knowledge-based
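The success criterion quoted in the abstract above (top-scored binding mode within 2.0 Å RMSD of the native structure) can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: the function names, the `(score, rmsd_to_native)` tuple layout, and the convention that a lower score is better are all assumptions, and the RMSD shown does no symmetry correction or atom matching.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two poses given as equal-length lists of (x, y, z) tuples.

    Assumes atoms are already in matching order; no symmetry correction.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must have the same non-zero number of atoms")
    sq = sum((a - b) ** 2
             for p, q in zip(coords_a, coords_b)
             for a, b in zip(p, q))
    return math.sqrt(sq / len(coords_a))

def success_rate(complexes, cutoff=2.0):
    """Fraction of complexes whose top-scored decoy lies within `cutoff` of native.

    `complexes` is a list of decoy lists; each decoy is (score, rmsd_to_native).
    Lower score = better is an assumption of this sketch.
    """
    if not complexes:
        raise ValueError("no complexes given")
    hits = 0
    for decoys in complexes:
        top_score, top_rmsd = min(decoys, key=lambda d: d[0])
        if top_rmsd < cutoff:
            hits += 1
    return hits / len(complexes)
```

For a scoring function where higher values are better, the scores would need to be negated before calling `success_rate`.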
The Community Structure-Activity Resource (CSAR) datasets are used to develop and test a Support Vector Machine-based scoring function in regression mode (SVR). Two scoring functions (SVR-KB and SVR-EP) are derived with the objective of reproducing the trend of the experimental binding affinities provided within the two CSAR datasets. The features used to train SVR-KB are knowledge-based pairwise potentials, while SVR-EP is based on physico-chemical properties. SVR-KB and SVR-EP were compared to seven other widely-used scoring functions, including Glide, X-score, GoldScore, ChemScore, Vina, Dock and PMF. Results showed that SVR-KB trained with features obtained from three-dimensional complexes of the PDBbind dataset outperformed all other scoring functions, including the best-performing X-score, by nearly 0.1 using three correlation coefficients, namely Pearson, Spearman and Kendall. It was interesting that higher performance in rank-ordering did not translate into greater enrichment in virtual screening assessed using the 40 targets of the Directory of Useful Decoys (DUD). To remedy this situation, a variant of SVR-KB (SVR-KBD) was developed by following a target-specific tailoring strategy that we had previously employed to derive SVM-SP. SVR-KBD showed much higher enrichment, outperforming all other scoring functions tested, and was comparable in performance to our previously-derived scoring function SVM-SP.
A major goal in drug design is the improvement of computational methods for docking and scoring. The Community Structure Activity Resource (CSAR) aims to collect available data from industry and academia which may be used for this purpose (www.csardock.org). Also, CSAR is charged with organizing community-wide exercises based on the collected data. The first of these exercises was aimed to gauge the overall state of docking and scoring, using a large and diverse data set of protein–ligand complexes. Participants were asked to calculate the affinity of the complexes as provided and then recalculate with changes which may improve their specific method. This first data set was selected from existing PDB entries which had binding data (Kd or Ki) in Binding MOAD, augmented with entries from PDBbind. The final data set contains 343 diverse protein–ligand complexes and spans 14 pKd. Sixteen proteins have three or more complexes in the data set, from which a user could start an inspection of congeneric series. Inherent experimental error limits the possible correlation between scores and measured affinity; R2 is limited to ∼0.9 when fitting to the data set without over parametrizing. R2 is limited to ∼0.8 when scoring the data set with a method trained on outside data. The details of how the data set was initially selected, and the process by which it matured to better fit the needs of the community are presented. Many groups generously participated in improving the data set, and this underscores the value of a supportive, collaborative effort in moving our field forward.
A major goal in drug design is the improvement of computational methods for docking and scoring. The Community Structure Activity Resource (CSAR) has collected several data sets from industry and added in-house data sets that may be used for this purpose (www.csardock.org). CSAR has currently obtained data from Abbott, GlaxoSmithKline, and Vertex and is working on obtaining data from several others. Combined with our in-house projects, we are providing a data set consisting of 6 protein targets, 647 compounds with biological affinities, and 82 crystal structures. Multiple congeneric series are available for several targets with a few representative crystal structures of each of the series. These series generally contain a few inactive compounds, usually not available in the literature, to provide an upper bound to the affinity range. The affinity ranges are typically 3–4 orders of magnitude per series. For our in-house projects, we have had compounds synthesized for biological testing. Affinities were measured by Thermofluor, Octet RED, and, for the most soluble compounds, isothermal titration calorimetry. This allows the direct comparison of the biological affinities for those compounds, providing a measure of the variance in the experimental affinity. It appears that there can be considerable variance in the absolute value of the affinity, making the prediction of the absolute value ill-defined. However, the relative rankings within the methods are much better, and this fits with the observation that predicting relative ranking is a more tractable problem computationally. For those in-house compounds, we also have measured the following physical properties: logD, logP, thermodynamic solubility, and pKa. This data set also provides a substantial decoy set for each target consisting of diverse conformations covering the entire active site for all of the 58 CSAR-quality crystal structures. The CSAR data sets (CSAR-NRC HiQ and the 2012 release) provide substantial, publicly available, curated data sets for use in parametrizing and validating docking and scoring methods.
As part of the Community Structure-Activity Resource (CSAR) center, a set of 343 high-quality, protein–ligand crystal structures were assembled with experimentally determined Kd or Ki information from the literature. We encouraged the community to score the crystallographic poses of the complexes by any method of their choice. The goal of the exercise was to (1) evaluate the current ability of the field to predict activity from structure and (2) investigate the properties of the complexes and methods that appear to hinder scoring. A total of 19 different methods were submitted with numerous parameter variations for a total of 64 sets of scores from 16 participating groups. Linear regression and nonparametric tests were used to correlate scores to the experimental values. Correlation to experiment for the various methods ranged R2 = 0.58–0.12, Spearman ρ = 0.74–0.37, Kendall τ = 0.55–0.25, and median unsigned error = 1.00–1.68 pKd units. All types of scoring functions—force field based, knowledge based, and empirical—had examples with high and low correlation, showing no bias/advantage for any particular approach. The data across all the participants were combined to identify 63 complexes that were poorly scored across the majority of the scoring methods and 123 complexes that were scored well across the majority. The two sets were compared using a Wilcoxon rank-sum test to assess any significant difference in the distributions of >400 physicochemical properties of the ligands and the proteins. Poorly scored complexes were found to have ligands that were the same size as those in well-scored complexes, but hydrogen bonding and torsional strain were significantly different. These comparisons point to a need for CSAR to develop data sets of congeneric series with a range of hydrogen-bonding and hydrophobic characteristics and a range of rotatable bonds.
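The four statistics reported in the abstract above (Pearson R2, Spearman ρ, Kendall τ, and median unsigned error) can be computed with the standard library alone. The sketch below is illustrative and makes several assumptions: all function names are hypothetical, the Kendall coefficient is the simple tau-a variant without tie correction, and the unsigned error is taken directly between predicted and experimental values rather than after the linear-regression fit the exercise describes.

```python
import math
from statistics import mean, median

def pearson_r(x, y):
    """Pearson correlation coefficient; square it to obtain R2."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = shared
        i = j + 1
    return out

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson_r(ranks(x), ranks(y))

def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

def median_unsigned_error(predicted, experimental):
    """Median of |predicted - experimental|, e.g. in pKd units."""
    return median(abs(p - e) for p, e in zip(predicted, experimental))
```

The rank-based coefficients depend only on ordering, which is why they can remain informative for scoring functions whose raw scores are not on an affinity scale.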
Highly efficient and specific biomolecular recognition requires both affinity and specificity. Previous quantitative descriptions of biomolecular recognition were mostly driven by improving the affinity prediction, but lack of quantification of specificity. We developed a novel method SPA (SPecificity and Affinity) based on our funneled energy landscape theory. The strategy is to simultaneously optimize the quantified specificity of the “native” protein-ligand complex discriminating against “non-native” binding modes and the affinity prediction. The benchmark testing of SPA shows the best performance against 16 other popular scoring functions in industry and academia on both prediction of binding affinity and “native” binding pose. For the target COX-2 of nonsteroidal anti-inflammatory drugs, SPA successfully discriminates the drugs from the diversity set, and the selective drugs from non-selective drugs. The remarkable performance demonstrates that SPA has significant potential applications in identifying lead compounds for drug discovery.
Poor performance of scoring functions is a well-known bottleneck in structure-based virtual screening, which is most frequently manifested in the scoring functions’ inability to discriminate between true ligands and known non-binders (therefore designated as binding decoys). This deficiency leads to a large number of false positive hits resulting from virtual screening. We have hypothesized that filtering out or penalizing docking poses recognized as non-native (i.e., pose decoys) should improve the performance of virtual screening in terms of improved identification of true binders. Using several concepts from the field of cheminformatics, we have developed a novel approach to identifying pose decoys from an ensemble of poses generated by computational docking procedures. We demonstrate that the use of a target-specific pose (-scoring) filter in combination with a physical force field-based scoring function (MedusaScore) leads to significant improvement of hit rates in virtual screening studies for 12 of the 13 benchmark sets from the clustered version of the Directory of Useful Decoys (DUD). This new hybrid scoring function outperforms several conventional structure-based scoring functions, including XSCORE∷HMSCORE, ChemScore, PLP, and Chemgauss3, in six out of 13 data sets at the early stage of VS (up to 1% decoys of the screening database). We compare our hybrid method with several novel VS methods that were recently reported to have good performances on the same DUD data sets. We find that the retrieved ligands using our method are chemically more diverse in comparison with two ligand-based methods (FieldScreen and FLAP∷LBX). We also compare our method with FLAP∷RBLB, a high-performance VS method that also utilizes both the receptor and the cognate ligand structures. Interestingly, we find that the top ligands retrieved using our method are highly complementary to those retrieved using FLAP∷RBLB, hinting at effective directions for best VS applications.
We suggest that this integrative virtual screening approach combining cheminformatics and molecular mechanics methodologies may be applied to a broad variety of protein targets to improve the outcome of structure-based drug discovery studies.
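Early-stage hit rates of the kind discussed in this abstract are often summarized as an enrichment factor. The sketch below shows one common definition and is an assumption rather than the authors' protocol: it cuts the ranked list at a fraction of the whole screening database (the abstract defines the early stage by the fraction of decoys retrieved), the function name and the `(score, is_active)` layout are hypothetical, and lower scores are assumed to rank first.

```python
def enrichment_factor(scored, fraction=0.01):
    """Enrichment of actives in the top `fraction` of a ranked library.

    `scored` is a list of (score, is_active) pairs; lower score ranks first
    (an assumption of this sketch). Returns the hit rate in the top slice
    divided by the hit rate of the whole library.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(scored, key=lambda pair: pair[0])
    n_total = len(ranked)
    n_actives = sum(1 for _, active in ranked if active)
    if n_actives == 0:
        raise ValueError("library contains no actives")
    n_top = max(1, round(n_total * fraction))
    top_actives = sum(1 for _, active in ranked[:n_top] if active)
    return (top_actives / n_top) / (n_actives / n_total)
```

A value of 1.0 corresponds to random selection; the maximum attainable value is bounded by the inverse of the overall active fraction.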
Computational methods for predicting protein-ligand binding free energy continue to be popular as a potential cost-cutting method in the drug discovery process. However, accurate predictions are often difficult to make as estimates must be made for certain electronic and entropic terms in conventional force field based scoring functions. Mixed quantum mechanics/molecular mechanics (QM/MM) methods allow electronic effects for a small region of the protein to be calculated, treating the remaining atoms as a fixed charge background for the active site. Such a semi-empirical QM/MM scoring function has been implemented in AMBER using DivCon and tested on a set of 23 metalloprotein-ligand complexes, where QM/MM methods provide a particular advantage in the modeling of the metal ion. The binding affinity of this set of proteins can be calculated with an R2 of 0.64 and a standard deviation of 1.88 kcal/mol without fitting, and with an R2 of 0.71 and a standard deviation of 1.69 kcal/mol with fitted weighting of the individual scoring terms. In this study we explore using various methods to calculate terms in the binding free energy equation, including entropy estimates and minimization standards. From these studies we found that using the rotatable bond estimate of ligand entropy results in a reasonable R2 of 0.63 without fitting. We also found that using the ESCF energy of the proteins without minimization resulted in an R2 of 0.57, when using the rotatable bond entropy estimate.
Empirical scoring functions used in protein-ligand docking calculations are typically trained on a dataset of complexes with known affinities with the aim of generalizing across different docking applications. We report a novel method of scoring-function optimization that supports the use of additional information to constrain scoring function parameters, which can be used to focus a scoring function’s training towards a particular application, such as screening enrichment. The approach combines multiple instance learning, positive data in the form of ligands of protein binding sites of known and unknown affinity and binding geometry, and negative (decoy) data of ligands thought not to bind particular protein binding sites or known not to bind in particular geometries. Performance of the method for the Surflex-Dock scoring function is shown in cross-validation studies and in 8 blind test cases. Tuned functions optimized with a sufficient amount of data exhibited either improved or undiminished screening performance relative to the original function across all eight complexes. Analysis of the changes to the scoring function suggest that modifications can be learned that are related to protein-specific features such as active-site mobility.
The Community Structure–Activity Resource (CSAR) recently held its first blinded exercise based on data provided by Abbott, Vertex, and colleagues at the University of Michigan, Ann Arbor. A total of 20 research groups submitted results for the benchmark exercise where the goal was to compare different improvements for pose prediction, enrichment, and relative ranking of congeneric series of compounds. The exercise was built around blinded high-quality experimental data from four protein targets: LpxC, Urokinase, Chk1, and Erk2. Pose prediction proved to be the most straightforward task, and most methods were able to successfully reproduce binding poses when the crystal structure employed was co-crystallized with a ligand from the same chemical series. Multiple evaluation metrics were examined, and we found that RMSD and native contact metrics together provide a robust evaluation of the predicted poses. It was notable that most scoring functions underpredicted contacts between the hetero atoms (i.e., N, O, S, etc.) of the protein and ligand. Relative ranking was found to be the most difficult area for the methods, but many of the scoring functions were able to properly identify Urokinase actives from the inactives in the series. Lastly, we found that minimizing the protein and correcting histidine tautomeric states positively trended with low RMSD for pose prediction but minimizing the ligand negatively trended. Pregenerated ligand conformations performed better than those that were generated on the fly. Optimizing docking parameters and pretraining with the native ligand had a positive effect on the docking performance as did using restraints, substructure fitting, and shape fitting. Lastly, for both sampling and ranking scoring functions, the use of the empirical scoring function appeared to trend positively with the RMSD. Here, by combining the results of many methods, we hope to provide a statistically relevant evaluation and elucidate specific shortcomings of docking methodology for the community.
The human pregnane X receptor (PXR) is a transcriptional regulator of many genes involved in xenobiotic metabolism and excretion. Reliable prediction of high affinity binders with this receptor would be valuable for pharmaceutical drug discovery to predict potential toxicological responses.
Materials and Methods
Computational models were developed and validated for a dataset consisting of human PXR (PXR) activators and non-activators. We used support vector machine (SVM) algorithms with molecular descriptors derived from two sources, Shape Signatures and the Molecular Operating Environment (MOE) application software. We also employed the molecular docking program GOLD in which the GoldScore method was supplemented with other scoring functions to improve docking results.
The overall test set prediction accuracy for PXR activators with SVM was 72% to 81%. This indicates that molecular shape descriptors are useful in classification of compounds binding to this receptor. The best docking prediction accuracy (61%) was obtained using 1D Shape Signature descriptors as a weighting factor to the GoldScore. By pooling the available human PXR data sets we revealed those molecular features that are associated with human PXR activators.
These combined computational approaches using molecular shape information may assist scientists to more confidently identify PXR activators.
docking; hybrid methods; machine learning; pregnane X receptor; shape signatures descriptors; support vector machine
OPLS all atom force field parameters were developed in order to model a diverse set of novel rhenium based estrogen receptor ligands whose relative binding affinities (RBA) to the estrogen receptor alpha isoform (ERα) with respect to 17β-Estradiol were available. The binding properties of these novel rhenium based organometallic complexes were studied with a combination of Comparative Molecular Similarity Indices Analysis (CoMSIA) and docking. A total of 29 estrogen receptor ligands consisting of 11 rhenium complexes and 18 organic ligands were docked inside the ligand-binding domain (LBD) of ERα utilizing the program Gold. The top ranked pose was used to construct CoMSIA models from a training set of 22 of the estrogen receptor ligands which were selected at random. In addition scoring functions from the docking runs and the polar volume (PV) were also studied to investigate their ability to predict RBA ERα. A partial least-squares analysis consisting of the CoMSIA steric, electrostatic and hydrophobic indices together with the polar volume proved sufficiently predictive having a correlation coefficient, r2, of 0.94 and a cross-validated correlation coefficient, q2, utilizing the leave one out method of 0.68. Analysis of the scoring functions from Gold showed particularly poor correlation to RBA ERα which did not improve when the rhenium complexes were extracted to leave the organic ligands. The combined CoMSIA and polar volume model ranked correctly the ligands in order of increasing RBA ERα, illustrating the utility of this method as a prescreening tool in the development of novel rhenium based estrogen receptor ligands.
steroid; docking; estrogen receptor; rhenium; CoMSIA
Accurately predicting the binding affinities of large sets of diverse protein-ligand complexes is an extremely challenging task. The scoring functions that attempt such computational prediction are essential for analysing the outputs of Molecular Docking, which is in turn an important technique for drug discovery, chemical biology and structural biology. Each scoring function assumes a predetermined theory-inspired functional form for the relationship between the variables that characterise the complex, which also include parameters fitted to experimental or simulation data, and its predicted binding affinity. The inherent problem of this rigid approach is that it leads to poor predictivity for those complexes that do not conform to the modelling assumptions. Moreover, resampling strategies, such as cross-validation or bootstrapping, are still not systematically used to guard against the overfitting of calibration data in parameter estimation for scoring functions.
We propose a novel scoring function (RF-Score) that circumvents the need for problematic modelling assumptions via non-parametric machine learning. In particular, Random Forest was used to implicitly capture binding effects that are hard to model explicitly. RF-Score is compared with the state of the art on the demanding PDBbind benchmark. Results show that RF-Score is a very competitive scoring function. Importantly, RF-Score’s performance was shown to improve dramatically with training set size and hence the future availability of more high quality structural and interaction data is expected to lead to improved versions of RF-Score.
Engineering specific interactions between proteins and small molecules is extremely useful for biological studies, as these interactions are essential for molecular recognition. Furthermore, many biotechnological applications are made possible by such an engineering approach, ranging from biosensors to the design of custom enzyme catalysts. Here, we present a novel method for the computational design of protein-small ligand binding named PocketOptimizer. The program can be used to modify protein binding pocket residues to improve or establish binding of a small molecule. It is a modular pipeline based on a number of customizable molecular modeling tools to predict mutations that alter the affinity of a target protein to its ligand. At its heart it uses a receptor-ligand scoring function to estimate the binding free energy between protein and ligand. We compiled a benchmark set that we used to systematically assess the performance of our method. It consists of proteins for which mutational variants with different binding affinities for their ligands and experimentally determined structures exist. Within this test set PocketOptimizer correctly predicts the mutant with the higher affinity in about 69% of the cases. A detailed analysis of the results reveals that the strengths of PocketOptimizer lie in the correct introduction of stabilizing hydrogen bonds to the ligand, as well as in the improved geometric complementarity between ligand and binding pocket. Apart from the novel method for binding pocket design, we also introduce a much needed benchmark data set for the comparison of affinities of mutant binding pockets, which we use to assess programs for in silico design of ligand binding.
The effects of solvation and entropy play a critical role in determining the binding free energy in protein-ligand interactions. Despite the good balance between speed and accuracy, no current knowledge-based scoring functions account for the effects of solvation and configurational entropy explicitly due to the difficulty in deriving the corresponding pair potentials and the resulting double counting problem. In the present work, we have included the solvation effect and configurational entropy in the knowledge-based scoring function by an iterative method. The newly developed scoring function has yielded a success rate of 91% in identifying near-native binding modes with Wang et al.’s benchmark of 100 diverse protein-ligand complexes. The results have been compared with the results of 15 other scoring functions for validation purpose. In binding affinity prediction, our scoring function has yielded a correlation of R2 = 0.76 between the predicted binding scores and the experimentally measured binding affinities on the PMF validation sets of 77 diverse complexes. The results have been compared with R2 of four other well-known knowledge-based scoring functions. Finally, our scoring function was also validated on the large PDBbind database of 1299 protein-ligand complexes and yielded a correlation coefficient of 0.474. The present computational model can be applied to other scoring functions to account for solvation and entropic effects.
scoring function; ligand-protein interactions; knowledge-based; desolvation; entropy
The implementation of a novel sequential computational approach that can be used effectively for virtual screening and identification of prospective ligands that bind to trypanothione reductase (TryR) is reported. The multistep strategy combines a ligand-based virtual screening for building an enriched library of small molecules with a docking protocol (AutoDock, X-Score) for screening against the TryR target. Compounds were ranked by an exhaustive conformational consensus scoring approach that employs a rank-by-rank strategy by combining both scoring functions. Analysis of the predicted ligand−protein interactions highlights the role of bulky quaternary amine moieties for binding affinity. The scaffold hopping (SHOP) process derived from this computational approach allowed the identification of several chemotypes, not previously reported as antiprotozoal agents, which includes dibenzothiepine, dibenzooxathiepine, dibenzodithiepine, and polycyclic cationic structures like thiaazatetracyclo-nonadeca-hexaen-3-ium. Assays measuring the inhibiting effect of these compounds on T. cruzi and T. brucei TryR confirm their potential for further rational optimization.
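The rank-by-rank consensus mentioned in the abstract above can be illustrated with a short sketch: each compound is ranked separately under each scoring function, and compounds are then ordered by their mean rank. The function name, the dictionary layout, and the score directions in the usage note are assumptions for illustration, not details taken from the study.

```python
def rank_by_rank(scores_a, scores_b, a_higher_is_better=False,
                 b_higher_is_better=False):
    """Order compounds by the mean of their ranks under two scoring functions.

    `scores_a` and `scores_b` map compound name -> score and must cover the
    same compounds. Ties in mean rank are broken alphabetically.
    """
    if scores_a.keys() != scores_b.keys():
        raise ValueError("both score sets must cover the same compounds")

    def to_ranks(scores, higher_is_better):
        ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
        return {name: position + 1 for position, name in enumerate(ordered)}

    ranks_a = to_ranks(scores_a, a_higher_is_better)
    ranks_b = to_ranks(scores_b, b_higher_is_better)
    mean_rank = {name: (ranks_a[name] + ranks_b[name]) / 2 for name in scores_a}
    return sorted(mean_rank, key=lambda name: (mean_rank[name], name))
```

For example, with docking energies where lower is better and affinity-like scores where higher is better, `rank_by_rank(energies, affinities, b_higher_is_better=True)` returns the compound names from best to worst consensus rank.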
The binding between the major histocompatibility complex and the presented peptide is an indispensable prerequisite for the adaptive immune response. There is a plethora of different in silico techniques for the prediction of the peptide binding affinity to major histocompatibility complexes. Most studies screen a set of peptides for promising candidates to predict possible T cell epitopes. In this study we ask the reverse question: which peptides have the highest binding affinities to a given major histocompatibility complex according to certain in silico scoring functions?
Since a full screening of all possible peptides is not feasible in reasonable runtime, we introduce a heuristic approach. We developed a framework for Genetic Algorithms to optimize peptides for the binding to major histocompatibility complexes. In an extensive benchmark we tested various operator combinations. We found that (1) selection operators have a strong influence on the convergence of the population while recombination operators have minor influence and (2) that five different binding prediction methods lead to five different sets of "optimal" peptides for the same major histocompatibility complex. The consensus peptides were experimentally verified as high affinity binders.
We provide a generalized framework to calculate sets of high affinity binders based on different previously published scoring functions in reasonable runtime. Furthermore we give insight into the different behaviours of operators and scoring functions of the Genetic Algorithm.
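A genetic algorithm of the general kind described in this abstract combines a selection operator, a recombination operator, and mutation. The sketch below is a generic illustration, not the published framework: tournament selection, one-point crossover, elitism, and every parameter value are assumptions, and `toy_score` is a hypothetical stand-in for a real binding predictor.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def toy_score(peptide):
    """Hypothetical scoring function: counts hydrophobic residues."""
    return sum(1 for aa in peptide if aa in "ILMFVWY")

def tournament(population, score, rng, k=3):
    """Selection operator: best of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=score)

def crossover(parent_a, parent_b, rng):
    """Recombination operator: one-point crossover of equal-length peptides."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]

def mutate(peptide, rng, rate=0.1):
    """Replace each residue with a random amino acid with probability `rate`."""
    return "".join(rng.choice(AMINO_ACIDS) if rng.random() < rate else aa
                   for aa in peptide)

def optimize(score, length=9, pop_size=30, generations=40, seed=0):
    """Return (best peptide, best score at the start of each generation)."""
    rng = random.Random(seed)
    population = ["".join(rng.choice(AMINO_ACIDS) for _ in range(length))
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(population, key=score)
        history.append(score(elite))
        children = [elite]  # elitism: the current best always survives
        while len(children) < pop_size:
            child = crossover(tournament(population, score, rng),
                              tournament(population, score, rng), rng)
            children.append(mutate(child, rng))
        population = children
    return max(population, key=score), history
```

Swapping `toy_score` for a different predictor changes the fitness landscape, which is consistent with the abstract's observation that different prediction methods lead to different sets of "optimal" peptides.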
Quantum mechanical semiempirical comparative binding energy analysis calculations have been carried out for a series of protein kinase B (PKB) inhibitors derived from fragment- and structure-based drug design. These protein−ligand complexes were selected because they represent a consistent set of experimental data that includes both crystal structures and affinities. Seven scoring functions were evaluated based on both the PM3 and the AM1 Hamiltonians. The optimal models obtained by partial least-squares analysis of the aligned poses are predictive as measured by a number of standard statistical criteria and by validation with an external data set. An algorithm has been developed that provides residue-based contributions to the overall binding affinity. These residue-based binding contributions can be plotted in heat maps so as to highlight the most important residues for ligand binding. In the case of these PKB inhibitors, the maps show that Met166, Thr97, Gly43, Glu114, Ala116, and Val50, among other residues, play an important role in determining binding affinity. The interaction energy map makes it easy to identify the residues that have the largest absolute effect on ligand binding. The structure−activity relationship (SAR) map highlights residues that are most critical to discriminating between more and less potent ligands. Taken together the interaction energy and the SAR maps provide useful insights into drug design that would be difficult to garner in any other way.
Understanding the underlying physics of the binding of small molecule ligands to protein active sites is a key objective of computational chemistry and biology. It is widely believed that displacement of water molecules from the active site by the ligand is a principal (if not the dominant) source of binding free energy. Although continuum theories of hydration are routinely used to describe the contributions of the solvent to the binding affinity of the complex, it is still an unsettled question as to whether or not these continuum solvation theories describe the underlying molecular physics with sufficient accuracy to reliably rank the binding affinities of a set of ligands for a given protein. Here we develop a novel, computationally efficient, descriptor of the contribution of the solvent to the binding free energy of a small molecule and its associated receptor that captures the effects of the ligand displacing the solvent from the protein active site with atomic detail. This descriptor quantitatively predicts (R2=0.81) the binding free energy differences between congeneric ligand pairs for the test system factor Xa, elucidates physical properties of the active site solvent that appear to be missing in most continuum theories of hydration, and identifies several features of the hydration of the factor Xa active site relevant to the structure-activity-relationship of its inhibitors.
A central problem in de novo drug design is determining the binding affinity of a ligand with a receptor. A new scoring algorithm is presented that estimates the binding affinity of a protein-ligand complex given a three-dimensional structure. The method, LISA (Ligand Identification Scoring Algorithm), uses an empirical scoring function to describe the binding free energy. Interaction terms have been designed to account for van der Waals (VDW) contacts, hydrogen bonding, desolvation effects and metal chelation to model the dissociation equilibrium constants using a linear model. Atom types have been introduced to differentiate the parameters for VDW, H-bonding interactions and metal chelation between different atom pairs. A training set of 492 protein-ligand complexes was selected for the fitting process. Different test sets have been examined to evaluate its ability to predict experimentally measured binding affinities. By comparing with other well known scoring functions, the results show that LISA has advantages over many existing scoring functions in simulating protein-ligand binding affinity, especially metalloprotein-ligand binding affinity. Artificial Neural Network (ANN) was also used in order to demonstrate that the energy terms in LISA are well designed and do not require extra cross terms.
Empirical scoring function; Artificial Neural Network
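As an illustration of what a linear empirical scoring function of this kind looks like, the following minimal sketch combines four interaction terms into a single affinity estimate. It is not the LISA implementation: the weights, the intercept, the term values and the names `InteractionTerms` and `linear_score` are all hypothetical, and the atom-type-specific parameters described in the abstract are collapsed into one weight per term for brevity.

```python
from dataclasses import dataclass


@dataclass
class InteractionTerms:
    """Precomputed interaction terms for one protein-ligand complex.

    The values are placeholders; how each term is computed is not shown.
    """
    vdw: float
    hbond: float
    desolvation: float
    metal: float


# Hypothetical weights and intercept. In a real empirical scoring function
# these would be fitted by regression against measured affinities.
WEIGHTS = {"vdw": 0.004, "hbond": 0.25, "desolvation": 0.1, "metal": 0.5}
INTERCEPT = 2.0


def linear_score(terms, weights=WEIGHTS, intercept=INTERCEPT):
    """Return intercept + sum_i w_i * term_i, read here as a pKd-like estimate."""
    return intercept + sum(w * getattr(terms, name) for name, w in weights.items())


if __name__ == "__main__":
    example = InteractionTerms(vdw=100.0, hbond=4.0, desolvation=5.0, metal=1.0)
    print(f"estimated affinity: {linear_score(example):.2f}")
```

Because the model is linear in its terms, it contains no cross terms by construction; the ANN comparison mentioned in the abstract is one way to check whether such cross terms would have helped.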
The prediction of the binding free energy between a ligand and a protein is an important component in the virtual screening and lead optimization of ligands for drug discovery. To determine the quality of current binding free energy estimation programs, we examined FlexX, X-Score, AutoDock and BLEEP for their performance in binding free energy prediction in various situations, including co-crystallized complex structures, cross docking of ligands to their non-co-crystallized receptors, docking of thermally unfolded receptor decoys to their ligands and complex structures with “randomized” ligand decoys. In no case was there a satisfactory correlation between the experimental and estimated binding free energies over all the datasets tested. Meanwhile, the correlation between ligand molecular weight and binding affinity was found to correlate strongly with the correlation between experimental and predicted binding affinities. Sometimes the programs also correctly ranked ligands’ binding affinities even though native interactions between the ligands and their receptors were essentially lost due to receptor deformation or ligand randomization, and the programs could not decisively discriminate randomized ligand decoys from their native ligands. This suggests that the tested programs miss important components for the accurate capture of specific ligand binding interactions.
cross docking; binding free energy; AutoDock; X-Score; FlexX; BLEEP; rigid-receptor docking; unfolded receptor decoy; randomized ligand decoy
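The molecular-weight observation above can be checked on any dataset by comparing two Pearson correlations: experimental versus predicted affinity, and experimental affinity versus ligand molecular weight. The sketch below assumes nothing about the four programs tested; the function name `pearson` and all numbers in the example are hypothetical placeholders, not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical values for five ligands of one receptor.
    experimental = [5.1, 6.3, 7.0, 7.8, 8.4]       # e.g. pKd
    predicted = [5.9, 6.0, 6.8, 7.1, 7.9]          # scoring-function output
    mol_weight = [210.0, 265.0, 310.0, 355.0, 420.0]

    r_score = pearson(experimental, predicted)
    r_mw = pearson(experimental, mol_weight)
    print(f"score vs experiment: {r_score:.2f}")
    print(f"molecular weight vs experiment: {r_mw:.2f}")
```

If the scoring function does not clearly outperform the molecular-weight baseline, an apparently good ranking may reflect ligand size rather than specific interactions.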
Binding affinity prediction is one of the most critical components of computer-aided structure-based drug design. Despite advances in first-principles methods for predicting binding affinity, empirical scoring functions that are fast but only moderately accurate are still widely used in structure-based drug design. With the increasing availability of X-ray crystallographic structures in the Protein Data Bank and the continuing application of biophysical methods such as isothermal titration calorimetry to measure thermodynamic parameters contributing to binding free energy, sufficient experimental data now exist for scoring functions to be derived by separating enthalpic (ΔH) and entropic (TΔS) contributions to binding free energy (ΔG). PHOENIX, a scoring function to predict binding affinities of protein-ligand complexes, utilizes the increasing availability of experimental data to improve binding affinity predictions by the following: model training and testing using high-resolution crystallographic data to minimize structural noise; independent models of enthalpic and entropic contributions fitted to thermodynamic parameters assumed to be thermodynamically biased to calculate binding free energy; and use of shape and volume descriptors to better capture entropic contributions. A set of 42 descriptors and 112 protein-ligand complexes were used to derive functions using partial least squares for change of enthalpy (ΔH) and change of entropy (TΔS) to calculate change of binding free energy (ΔG), resulting in a predictive r² (r²pred) of 0.55 and a standard error (SE) of 1.34 kcal/mol. External validation using the 2009 version of the PDBbind “refined set” (n = 1612) resulted in a Pearson correlation coefficient (Rp) of 0.575 and a mean error (ME) of 1.41 pKd units. Enthalpy and entropy predictions were of limited accuracy individually. However, their difference resulted in a relatively accurate binding free energy.
While the development of an accurate and applicable scoring function was an objective of this study, the main focus was evaluation of the use of high-resolution X-ray crystal structures with high-quality thermodynamic parameters from isothermal titration calorimetry for scoring function development. With the increasing application of structure-based methods in molecular design, this study suggests that using high-resolution crystal structures, separating enthalpy and entropy contributions to binding free energy, and including descriptors to better capture entropic contributions may prove to be effective strategies towards rapid and accurate calculation of binding affinity.
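The thermodynamic bookkeeping behind this approach is the standard relation ΔG = ΔH − TΔS, together with ΔG = −RT ln(10) · pKd for converting between free energy and pKd. The sketch below only illustrates these two textbook relations; it is not the PHOENIX model, and the function names, the default temperature of 298.15 K and the example values are assumptions made for illustration.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def binding_free_energy(delta_h, t_delta_s):
    """Combine separately predicted terms: dG = dH - T*dS (all in kcal/mol)."""
    return delta_h - t_delta_s


def delta_g_to_pkd(delta_g, temperature=298.15):
    """Convert a binding free energy (kcal/mol) to pKd via dG = -RT ln(10) pKd."""
    return -delta_g / (R_KCAL * temperature * math.log(10))


def pkd_to_delta_g(pkd, temperature=298.15):
    """Inverse conversion: pKd to binding free energy in kcal/mol."""
    return -pkd * R_KCAL * temperature * math.log(10)


if __name__ == "__main__":
    # Hypothetical predictions for one complex, in kcal/mol.
    dh_pred, tds_pred = -12.0, -3.5
    dg = binding_free_energy(dh_pred, tds_pred)
    print(f"dG = {dg:.2f} kcal/mol, pKd = {delta_g_to_pkd(dg):.2f}")
```

At 298.15 K one pKd unit corresponds to roughly 1.36 kcal/mol, which is useful when comparing errors quoted in kcal/mol with errors quoted in pKd units.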
Current scoring functions are not very successful in protein-ligand binding affinity prediction despite their popularity in structure-based drug design. Here, we propose a general knowledge-guided scoring (KGS) strategy to tackle this problem. Our KGS strategy computes the binding constant of a given protein-ligand complex based on the known binding constant of an appropriate reference complex. A good training set that includes a sufficient number of protein-ligand complexes with known binding data needs to be supplied for finding the reference complex. The reference complex is required to share a similar pattern of key protein-ligand interactions with the complex of interest. Thus, some uncertain factors in protein-ligand binding may cancel out, resulting in a more accurate prediction of absolute binding constants.
In our study, an automatic algorithm was developed for summarizing key protein-ligand interactions as a pharmacophore model and identifying the reference complex with a maximal similarity to the query complex. Our KGS strategy was evaluated in combination with two scoring functions (X-Score and PLP) on three test sets, containing 112 HIV protease complexes, 44 carbonic anhydrase complexes, and 73 trypsin complexes, respectively. Our results obtained on crystal structures as well as computer-generated docking poses indicated that application of the KGS strategy produced more accurate predictions especially when X-Score or PLP alone did not perform well.
Compared to other targeted scoring functions, our KGS strategy does not require any re-parameterization or modification of current scoring methods, and its application is not tied to certain systems. The effectiveness of our KGS strategy should in theory grow with the ever-increasing body of experimental protein-ligand binding data. Our KGS strategy may serve as a more practical remedy for current scoring functions to improve their accuracy in binding affinity prediction.
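The idea of scoring relative to a reference complex can be sketched as follows. This is an illustrative reading of the strategy, not the published algorithm: it assumes the prediction takes the form pKd(query) = pKd_exp(reference) + [score(query) − score(reference)], that scores are on a pKd-like scale, and that interaction patterns are compared by Tanimoto similarity on sets of feature labels rather than by the pharmacophore model described above. All names and values are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of interaction-feature labels."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pick_reference(query_features, training_set):
    """Return the training complex whose interaction pattern is most similar."""
    return max(training_set, key=lambda c: tanimoto(query_features, c["features"]))


def kgs_predict(query_score, query_features, training_set):
    """Shift the reference's known affinity by the computed score difference.

    Errors common to the query and the reference cancel in the difference.
    """
    ref = pick_reference(query_features, training_set)
    return ref["pkd_exp"] + (query_score - ref["score"])


if __name__ == "__main__":
    # Hypothetical training complexes with known binding data.
    training = [
        {"features": {"hb:ASP25", "hb:ILE50"}, "score": 7.0, "pkd_exp": 8.5},
        {"features": {"metal:ZN", "hb:THR199"}, "score": 5.0, "pkd_exp": 6.0},
    ]
    query = {"hb:ASP25", "hb:ILE50", "hydrophobic:VAL82"}
    print(f"predicted pKd: {kgs_predict(7.6, query, training):.2f}")
```

Since the underlying scoring function is called unchanged on both complexes, a scheme of this form needs no re-parameterization; its quality depends on how close the best available reference is to the query.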