Binding hot spots, protein sites with high-binding affinity, can be identified using X-ray crystallography or NMR by screening libraries of small organic molecules that tend to cluster at such regions. FTMAP, a direct computational analog of the experimental screening approaches, globally samples the surface of a target protein using small organic molecules as probes, finds favorable positions, clusters the conformations and ranks the clusters on the basis of the average energy. The regions that bind several probe clusters predict the binding hot spots, in good agreement with experimental results. Small molecules discovered by fragment-based approaches to drug design also bind at the hot spot regions. To identify such molecules and their most likely bound positions, we extend the functionality of FTMAP (http://ftmap.bu.edu/param) to accept any small molecule as an additional probe. In its updated form, FTMAP identifies the hot spots based on a standard set of probes, and for each additional probe shows representative structures of nearby low energy clusters. This approach helps to predict bound poses of the user-selected molecules, detects if a compound is not likely to bind in the hot spot region, and provides input for the design of larger ligands.
Motivation: The binding sites of proteins generally contain smaller regions that provide major contributions to the binding free energy and hence are the prime targets in drug design. Screening libraries of fragment-sized compounds by NMR or X-ray crystallography demonstrates that such ‘hot spot’ regions bind a large variety of small organic molecules, and that a relatively high ‘hit rate’ is predictive of target sites that are likely to bind drug-like ligands with high affinity. Our goal is to determine the ‘hot spots’ computationally rather than experimentally.
Results: We have developed the FTMAP algorithm that performs global search of the entire protein surface for regions that bind a number of small organic probe molecules. The search is based on the extremely efficient fast Fourier transform (FFT) correlation approach which can sample billions of probe positions on dense translational and rotational grids, but can use only sums of correlation functions for scoring and hence is generally restricted to very simple energy expressions. The novelty of FTMAP is that we were able to incorporate and represent on grids a detailed energy expression, resulting in a very accurate identification of low-energy probe clusters. Overlapping clusters of different probes are defined as consensus sites (CSs). We show that the largest CS is generally located at the most important subsite of the protein binding site, and the nearby smaller CSs identify other important subsites. Mapping results are presented for elastase whose structure has been solved in aqueous solutions of eight organic solvents, and we show that FTMAP provides very similar information. The second application is to renin, a long-standing pharmaceutical target for the treatment of hypertension, and we show that the major CSs trace out the shape of the first approved renin inhibitor, aliskiren.
Availability: FTMAP is available as a server at http://ftmap.bu.edu/.
Supplementary information: Supplementary Material is available at Bioinformatics online.
Fragment based drug design (FBDD) starts with finding fragment-sized compounds that are highly ligand efficient and can serve as a core moiety for developing high affinity leads. Although the core-bound structure of a protein facilitates the construction of leads, effective design is far from straightforward. We show that protein mapping, a computational method developed to find binding hot spots and implemented as the FTMap server, provides information that complements the fragment screening results and can drive the evolution of core fragments into larger leads with a minimal loss or, in some cases, even a gain in ligand efficiency. The method places small molecular probes, the size of organic solvents, on a dense grid around the protein, and identifies the hot spots as consensus clusters formed by clusters of several probes. The hot spots are ranked based on the number of probe clusters, which predicts the binding propensity of the subsites and hence their importance for drug design. Accordingly, with a single exception the main hot spot identified by FTMap binds the core compound found by fragment screening. The most useful information is provided by the neighboring secondary hot spots, indicating the regions where the core can be extended to increase its affinity. To quantify this information, we calculate the density of probes from mapping, which describes the binding propensity at each point, and show that the change in the correlation between a ligand position and the probe density upon extending or repositioning the core moiety predicts the expected change in ligand efficiency.
Protein mapping; protein docking; drug design; ligand efficiency; affinity prediction
Computational solvent mapping globally samples the surface of target proteins using molecular probes – small molecules or functional groups – to identify potentially favorable binding positions. The method is based on X-ray and NMR screening studies showing that the binding sites of proteins also bind a large variety of fragment-sized molecules. We have developed the multi-stage mapping algorithm FTMap (available as a server at http://ftmap.bu.edu/) based on the fast Fourier transform (FFT) correlation approach. Identifying regions of low free energy rather than individual low energy conformations, FTMap reproduces the available experimental mapping results. Applications to a variety of proteins show that the probes always cluster in important subsites of the binding site, and the amino acid residues that interact with many probes also bind the specific ligands of the protein. The “consensus” sites at which a number of different probes cluster are likely to be “druggable” sites, capable of binding drug-size ligands with high affinity. Due to its sensitivity to conformational changes the method can also be used for comparing the binding sites in different structures of a protein.
Protein structure; protein-ligand interactions; binding site; binding hot spots; fragment-based ligand design; druggability; binding site comparison; docking
In the context of protein-protein interactions, the term “hot spot” refers to a residue or cluster of residues that makes a major contribution to the binding free energy, as determined by alanine scanning mutagenesis. In contrast, in pharmaceutical research a hot spot is a site on a target protein that has high propensity for ligand binding and hence is potentially important for drug discovery. Here we examine the relationship between these two hot spot concepts by comparing alanine scanning data for a set of 15 proteins with results from mapping the protein surfaces for sites that can bind fragment-sized small molecules. We find the two types of hot spots are largely complementary; the residues protruding into hot spot regions identified by computational mapping or experimental fragment screening are almost always themselves hot spot residues as defined by alanine scanning experiments. Conversely, a residue that is found by alanine scanning to contribute little to binding rarely interacts with hot spot regions on the partner protein identified by fragment mapping. In spite of the strong correlation between the two hot spot concepts, they fundamentally differ, however. In particular, while identification of a hot spot by alanine scanning establishes the potential to generate substantial interaction energy with a binding partner, there are additional topological requirements to be a hot spot for small molecule binding. Hence, only a minority of hot spots identified by alanine scanning represent sites that are potentially useful for small inhibitor binding, and it is this subset that is identified by experimental or computational fragment screening.
The identification of hot spots, i.e. binding regions that contribute substantially to the free energy of ligand binding, is a critical step for structure-based drug design. Here we present the application of two fragment-based methods to the detection of hot spots for DJ-1 and glucocerebrosidase (GCase), targets for the development of therapeutics for Parkinson’s and Gaucher’s diseases respectively. While the structures of these two proteins are known, binding information is lacking. In this study we employ both the multiple solvent crystal structures (MSCS) method and the FTMap algorithm to identify regions suitable for the development of pharmacological chaperones for DJ-1 and GCase. Comparison of data derived via MSCS and FTMap also shows that FTMap, a computational method for the identification of fragment binding hot spots, is an accurate and robust alternative to the performance of expensive and difficult MSCS experiments.
fragment-based drug design; structure-based drug design; hot spot identification; DJ-1; glucocerebrosidase; Parkinson’s disease; Gaucher’s disease; pharmacological chaperones
To address the problem of specificity in G-protein coupled receptor (GPCR) drug discovery, there has been tremendous recent interest in allosteric drugs that bind at sites topographically distinct from the orthosteric site. Unfortunately, structure-based drug design of allosteric GPCR ligands has been frustrated by the paucity of structural data for allosteric binding sites, making a strong case for predictive computational methods. In this work, we map the surfaces of the β1 (β1AR) and β2 (β2AR) adrenergic receptor structures, to detect a series of five potentially druggable allosteric sites. We employ the FTMAP algorithm to identify “hot spots” with affinity for a variety of organic probe molecules corresponding to drug fragments. Our work is distinguished by an ensemble-based approach, whereby we map diverse receptor conformations taken from Molecular Dynamics (MD) simulations totalling ~0.5 μs. Our results reveal distinct pockets formed at both solvent-exposed and lipid-exposed cavities, which we interpret in the light of experimental data and which may constitute novel targets for GPCR drug discovery. This mapping data can now serve to drive a combination of fragment-based and virtual screening approaches for the discovery of small molecules that bind at these sites and which may offer highly selective therapies.
molecular dynamics; allosteric; GPCR; docking; fragment-based
To address the problem of specificity in G-protein coupled receptor (GPCR) drug discovery, there has been tremendous recent interest in allosteric drugs that bind at sites topographically distinct from the orthosteric site. Unfortunately, structure-based drug design of allosteric GPCR ligands has been frustrated by the paucity of structural data for allosteric binding sites, making a strong case for predictive computational methods. In this work, we map the surfaces of the β1 (β1AR) and β2 (β2AR) adrenergic receptor structures to detect a series of five potentially druggable allosteric sites. We employ the FTMAP algorithm to identify ‘hot spots’ with affinity for a variety of organic probe molecules corresponding to drug fragments. Our work is distinguished by an ensemble-based approach, whereby we map diverse receptor conformations taken from molecular dynamics (MD) simulations totaling approximately 0.5 μs. Our results reveal distinct pockets formed at both solvent-exposed and lipid-exposed cavities, which we interpret in light of experimental data and which may constitute novel targets for GPCR drug discovery. This mapping data can now serve to drive a combination of fragment-based and virtual screening approaches for the discovery of small molecules that bind at these sites and which may offer highly selective therapies.
allosteric; docking; fragment-based; GPCR; molecular dynamics
We have recently discovered an allosteric switch in Ras, bringing an additional level of complexity to this GTPase whose mutants are involved in nearly 30% of cancers. Upon activation of the allosteric switch, there is a shift in helix 3/loop 7 associated with a disorder to order transition in the active site. Here, we use a combination of multiple solvent crystal structures and computational solvent mapping (FTMap) to determine binding site hot spots in the “off” and “on” allosteric states of the GTP-bound form of H-Ras. Thirteen sites are revealed, expanding possible target sites for ligand binding well beyond the active site. Comparison of FTMaps for the H and K isoforms reveals essentially identical hot spots. Furthermore, using NMR measurements of spin relaxation, we determined that K-Ras exhibits global conformational dynamics very similar to those we previously reported for H-Ras. We thus hypothesize that the global conformational rearrangement serves as a mechanism for allosteric coupling between the effector interface and remote hot spots in all Ras isoforms. At least with respect to the binding sites involving the G domain, H-Ras is an excellent model for K-Ras and probably N-Ras as well. Ras has so far been elusive as a target for drug design. The present work identifies various unexplored hot spots throughout the entire surface of Ras, extending the focus from the disordered active site to well-ordered locations that should be easier to target.
Ras isoforms; drug target; binding site hot spots; Ras dynamics; allosteric switch
The study of protein-protein interactions is becoming increasingly important for biotechnological and therapeutic reasons. We can define two major areas therein: the structural prediction of protein-protein binding mode, and the identification of the relevant residues for the interaction (so called 'hot-spots'). These hot-spot residues have high interest since they are considered one of the possible ways of disrupting a protein-protein interaction. Unfortunately, large-scale experimental measurement of residue contribution to the binding energy, based on alanine-scanning experiments, is costly and thus data is fairly limited. Recent computational approaches for hot-spot prediction have been reported, but they usually require the structure of the complex.
We have applied here normalized interface propensity (NIP) values derived from rigid-body docking with electrostatics and desolvation scoring for the prediction of interaction hot-spots. This parameter identifies hot-spot residues on interacting proteins with predictive rates that are comparable to other existing methods (up to 80% positive predictive value), and the advantage of not requiring any prior structural knowledge of the complex.
The NIP values derived from rigid-body docking can reliably identify a number of hot-spot residues whose contribution to the interaction arises from electrostatics and desolvation effects. Our method can propose residues to guide experiments in complexes of biological or therapeutic interest, even in cases with no available 3D structure of the complex.
We present a new database of computational hot spots in protein interfaces: HotSprint. Hot spots are residues comprising only a small fraction of interfaces yet accounting for the majority of the binding energy. HotSprint contains data for 35 776 protein interfaces among 49 512 protein interfaces extracted from the multi-chain structures in Protein Data Bank (PDB) as of February 2006. The conserved residues in interfaces with certain buried accessible solvent area (ASA) and complex ASA thresholds are flagged as computational hot spots. The predicted hot spots are observed to correlate with the experimental hot spots with an accuracy of 76%. Several machine-learning methods (SVM, Decision Trees and Decision Lists) are also applied to predict hot spots, results reveal that our empirical approach performs better than the others. A web interface for the HotSprint database allows users to browse and query the hot spots in protein interfaces. HotSprint is available at http://prism.ccbb.ku.edu.tr/hotsprint; and it provides information for interface residues that are functionally and structurally important as well as the evolutionary history and solvent accessibility of residues in interfaces.
Systematic mutagenesis studies have shown that only a few interface residues termed hot spots contribute significantly to the binding free energy of protein-protein interactions. Therefore, hot spots prediction becomes increasingly important for well understanding the essence of proteins interactions and helping narrow down the search space for drug design. Currently many computational methods have been developed by proposing different features. However comparative assessment of these features and furthermore effective and accurate methods are still in pressing need.
In this study, we first comprehensively collect the features to discriminate hot spots and non-hot spots and analyze their distributions. We find that hot spots have lower relASA and larger relative change in ASA, suggesting hot spots tend to be protected from bulk solvent. In addition, hot spots have more contacts including hydrogen bonds, salt bridges, and atomic contacts, which favor complexes formation. Interestingly, we find that conservation score and sequence entropy are not significantly different between hot spots and non-hot spots in Ab+ dataset (all complexes). While in Ab- dataset (antigen-antibody complexes are excluded), there are significant differences in two features between hot pots and non-hot spots. Secondly, we explore the predictive ability for each feature and the combinations of features by support vector machines (SVMs). The results indicate that sequence-based feature outperforms other combinations of features with reasonable accuracy, with a precision of 0.69, a recall of 0.68, an F1 score of 0.68, and an AUC of 0.68 on independent test set. Compared with other machine learning methods and two energy-based approaches, our approach achieves the best performance. Moreover, we demonstrate the applicability of our method to predict hot spots of two protein complexes.
Experimental results show that support vector machine classifiers are quite effective in predicting hot spots based on sequence features. Hot spots cannot be fully predicted through simple analysis based on physicochemical characteristics, but there is reason to believe that integration of features and machine learning methods can remarkably improve the predictive performance for hot spots.
Structures of the influenza A virus M2 proton channel have been determined by X-ray crystallography in the open conformation, and by NMR in the closed state. Whereas the X-ray structure shows a single inhibitor molecule in the middle of the channel, four inhibitor molecules bind the channel’s outer surface in the NMR structure. Although in both structures the strongest hot spots (i.e., regions which substantially contribute to the free energy of binding any potential ligand) lie inside the pore, hot spots also are found at exterior locations. By considering all available models, we propose the primary drug binding site is inside the pore, but that exterior binding also occurs under appropriate conditions.
Hot spots are residues contributing the most of binding free energy yet accounting for a small portion of a protein interface. Experimental approaches to identify hot spots such as alanine scanning mutagenesis are expensive and time-consuming, while computational methods are emerging as effective alternatives to experimental approaches.
In this study, we propose a semi-supervised boosting SVM, which is called sbSVM, to computationally predict hot spots at protein-protein interfaces by combining protein sequence and structure features. Here, feature selection is performed using random forests to avoid over-fitting. Due to the deficiency of positive samples, our approach samples useful unlabeled data iteratively to boost the performance of hot spots prediction. The performance evaluation of our method is carried out on a dataset generated from the ASEdb database for cross-validation and a dataset from the BID database for independent test. Furthermore, a balanced dataset with similar amounts of hot spots and non-hot spots (65 and 66 respectively) derived from the first training dataset is used to further validate our method. All results show that our method yields good sensitivity, accuracy and F1 score comparing with the existing methods.
Our method boosts prediction performance of hot spots by using unlabeled data to overcome the deficiency of available training data. Experimental results show that our approach is more effective than the traditional supervised algorithms and major existing hot spot prediction methods.
The KFC Server is a web-based implementation of the KFC (Knowledge-based FADE and Contacts) model—a machine learning approach for the prediction of binding hot spots, or the subset of residues that account for most of a protein interface's; binding free energy. The server facilitates the automated analysis of a user submitted protein–protein or protein–DNA interface and the visualization of its hot spot predictions. For each residue in the interface, the KFC Server characterizes its local structural environment, compares that environment to the environments of experimentally determined hot spots and predicts if the interface residue is a hot spot. After the computational analysis, the user can visualize the results using an interactive job viewer able to quickly highlight predicted hot spots and surrounding structural features within the protein structure. The KFC Server is accessible at http://kfc.mitchell-lab.org.
Alanine scanning mutagenesis is a powerful experimental methodology for investigating the structural and energetic characteristics of protein complexes. Individual amino-acids are systematically mutated to alanine and changes in free energy of binding (ΔΔG) measured. Several experiments have shown that protein-protein interactions are critically dependent on just a few residues ("hot spots") at the interface. Hot spots make a dominant contribution to the free energy of binding and if mutated they can disrupt the interaction. As mutagenesis studies require significant experimental efforts, there is a need for accurate and reliable computational methods. Such methods would also add to our understanding of the determinants of affinity and specificity in protein-protein recognition.
We present a novel computational strategy to identify hot spot residues, given the structure of a complex. We consider the basic energetic terms that contribute to hot spot interactions, i.e. van der Waals potentials, solvation energy, hydrogen bonds and Coulomb electrostatics. We treat them as input features and use machine learning algorithms such as Support Vector Machines and Gaussian Processes to optimally combine and integrate them, based on a set of training examples of alanine mutations. We show that our approach is effective in predicting hot spots and it compares favourably to other available methods. In particular we find the best performances using Transductive Support Vector Machines, a semi-supervised learning scheme. When hot spots are defined as those residues for which ΔΔG ≥ 2 kcal/mol, our method achieves a precision and a recall respectively of 56% and 65%.
We have developed an hybrid scheme in which energy terms are used as input features of machine learning models. This strategy combines the strengths of machine learning and energy-based methods. Although so far these two types of approaches have mainly been applied separately to biomolecular problems, the results of our investigation indicate that there are substantial benefits to be gained by their integration.
It is known that binding free energy of protein-protein interaction is mainly contributed by hot spot (high energy) interface residues. Here, we
investigate the characteristics of hot spots by examining inter-atomic sidechain-sidechain interactions using a dataset of 296 alanine-mutated interface
residues. Results show that hot spots participate in strong and energetically favorable sidechain-sidechain interactions. Subsequently, we describe a novel,
yet simple ‘hot spot’ prediction model with an accuracy that is similar to many available approaches. The model is also shown to efficiently distinguish
specific protein-protein interactions from non-specific interactions.
protein-protein interaction; interface analysis; hot spot residues; inter-atomic interaction
A traditional technique for structure-based drug design (SBDD) is mapping protein surfaces with probe molecules to identify “hot spots” where key functional groups can best complement the receptor. Common methods, such as minimizing probes or calculating grids, use a fixed protein structure in the gas phase, ignoring both protein flexibility and proper competition between the probes and water. As a result, the potential surface is quite rugged and many spurious, local minima are identified. Here, we compare rigid and fully flexible protein in mixed-solvent molecular dynamics (MixMD), which allows for flexibility and full solvent effects. We were surprised to find that that the many local minima are still found when a protein's conformational sampling is restricted; the dynamic averaging of probes and competition with water does not smooth the potential surface as one might expect. Only when a protein is allowed to be fully flexible in the simulation are the proper minima located and the spurious ones eliminated. Our results indicate that inclusion of full protein flexibility is critical to accurate hot-spot mapping for SBDD.
It is well known that most of the binding free energy of protein interaction is contributed by a few key hot spot residues. These residues are crucial for understanding the function of proteins and studying their interactions. Experimental hot spots detection methods such as alanine scanning mutagenesis are not applicable on a large scale since they are time consuming and expensive. Therefore, reliable and efficient computational methods for identifying hot spots are greatly desired and urgently required.
In this work, we introduce an efficient approach that uses support vector machine (SVM) to predict hot spot residues in protein interfaces. We systematically investigate a wide variety of 62 features from a combination of protein sequence and structure information. Then, to remove redundant and irrelevant features and improve the prediction performance, feature selection is employed using the F-score method. Based on the selected features, nine individual-feature based predictors are developed to identify hot spots using SVMs. Furthermore, a new ensemble classifier, namely APIS (A combined model based on Protrusion Index and Solvent accessibility), is developed to further improve the prediction accuracy. The results on two benchmark datasets, ASEdb and BID, show that this proposed method yields significantly better prediction accuracy than those previously published in the literature. In addition, we also demonstrate the predictive power of our proposed method by modelling two protein complexes: the calmodulin/myosin light chain kinase complex and the heat shock locus gene products U and V complex, which indicate that our method can identify more hot spots in these two complexes compared with other state-of-the-art methods.
We have developed an accurate prediction model for hot spot residues, given the structure of a protein complex. A major contribution of this study is to propose several new features based on the protrusion index of amino acid residues, which has been shown to significantly improve the prediction performance of hot spots. Moreover, we identify a compact and useful feature subset that has an important implication for identifying hot spot residues. Our results indicate that these features are more effective than the conventional evolutionary conservation, pairwise residue potentials and other traditional features considered previously, and that the combination of our and traditional features may support the creation of a discriminative feature set for efficient prediction of hot spot residues. The data and source code are available on web site http://home.ustc.edu.cn/~jfxia/hotspot.html.
Hot spots are energetically important residues at protein interfaces and they are not randomly distributed across the interface but rather clustered. These clustered hot spots form hot regions. Hot regions are important for the stability of protein complexes, as well as providing specificity to binding sites. We propose a database called HotRegion, which provides the hot region information of the interfaces by using predicted hot spot residues, and structural properties of these interface residues such as pair potentials of interface residues, accessible surface area (ASA) and relative ASA values of interface residues of both monomer and complex forms of proteins. Also, the 3D visualization of the interface and interactions among hot spot residues are provided. HotRegion is accessible at http://prism.ccbb.ku.edu.tr/hotregion.
A protein binding hot spot is a small cluster of residues tightly packed at the center of the interface between two interacting proteins. Though a hot spot constitutes a small fraction of the interface, it is vital to the stability of protein complexes. Recently, there are a series of hypotheses proposed to characterize binding hot spots, including the pioneering O-ring theory, the insightful 'coupling' and 'hot region' principle, and our 'double water exclusion' (DWE) hypothesis. As the perspective changes from the O-ring theory to the DWE hypothesis, we examine the physicochemical properties of the binding hot spots under the new hypothesis and compare with those under the O-ring theory.
The requirements for a cluster of residues to form a hot spot under the DWE hypothesis can be mathematically satisfied by a biclique subgraph if a vertex is used to represent a residue, an edge to indicate a close distance between two residues, and a bipartite graph to represent a pair of interacting proteins. We term these hot spots as DWE bicliques. We identified DWE bicliques from crystal packing contacts, obligate and non-obligate interactions. Our comparative study revealed that there are abundant unique bicliques to the biological interactions, indicating specific biological binding behaviors in contrast to crystal packing. The two sub-types of biological interactions also have their own signature bicliques. In our analysis on residue compositions and residue pairing preferences in DWE bicliques, the focus was on interaction-preferred residues (ipRs) and interaction-preferred residue pairs (ipRPs). It is observed that hydrophobic residues are heavily involved in the ipRs and ipRPs of the obligate interactions; and that aromatic residues are in favor in the ipRs and ipRPs of the biological interactions, especially in those of the non-obligate interactions. In contrast, the ipRs and ipRPs in crystal packing are dominated by hydrophilic residues, and most of the anti-ipRs of crystal packing are the ipRs of the obligate or non-obligate interactions.
These ipRs and ipRPs in our DWE bicliques describe a diverse binding features among the three types of interactions. They also highlight the specific binding behaviors of the biological interactions, sharply differing from the artifact interfaces in the crystal packing. It can be noted that DWE bicliques, especially the unique bicliques, can capture deep insights into the binding characteristics of protein interfaces.
The energy distribution along the protein–protein interface is not homogenous; certain residues contribute more to the binding free energy, called ‘hot spots’. Here, we present a web server, HotPoint, which predicts hot spots in protein interfaces using an empirical model. The empirical model incorporates a few simple rules consisting of occlusion from solvent and total knowledge-based pair potentials of residues. The prediction model is computationally efficient and achieves high accuracy of 70%. The input to the HotPoint server is a protein complex and two chain identifiers that form an interface. The server provides the hot spot prediction results, a table of residue properties and an interactive 3D visualization of the complex with hot spots highlighted. Results are also downloadable as text files. This web server can be used for analysis of any protein–protein interface which can be utilized by researchers working on binding sites characterization and rational design of small molecules for protein interactions. HotPoint is accessible at http://prism.ccbb.ku.edu.tr/hotpoint.
We present a partial-differential-equation (PDE)-constrained approach for optimizing a molecule’s electrostatic interactions with a target molecule. The approach, which we call reverse-Schur co-optimization, can be more than two orders of magnitude faster than the traditional approach to electrostatic optimization. The efficiency of the co-optimization approach may enhance the value of electrostatic optimization for ligand-design efforts–in such projects, it is often desirable to screen many candidate ligands for their viability, and the optimization of electrostatic interactions can improve ligand binding affinity and specificity. The theoretical basis for electrostatic optimization derives from linear-response theory, most commonly continuum models, and simple assumptions about molecular binding processes. Although the theory has been used successfully to study a wide variety of molecular binding events, its implications have not yet been fully explored, in part due to the computational expense associated with the optimization. The co-optimization algorithm achieves improved performance by solving the optimization and electrostatic simulation problems simultaneously, and is applicable to both unconstrained and constrained optimization problems. Reverse-Schur co-optimization resembles other well-known techniques for solving optimization problems with PDE constraints. Model problems as well as realistic examples validate the reverse-Schur method, and demonstrate that our technique and alternative PDE-constrained methods scale very favorably compared to the standard approach. Regularization, which ordinarily requires an explicit representation of the objective function, can be included using an approximate Hessian calculated using the new BIBEE/P (boundary-integral-based electrostatics estimation by preconditioning) method.
The steroid and xenobiotic-responsive human pregnane X receptor (PXR) binds a broad range of structurally diverse compounds. The structures of the apo and ligand-bound forms of PXR are very similar, in contrast to most promiscuous proteins that generally adapt their shape to different ligands. We investigated the structural origins of PXR's recognition promiscuity using computational solvent mapping, a technique developed for the identification and characterization of hot spots, i.e., regions of the protein surface that are major contributors to the binding free energy. Results reveal that the smooth and nearly spherical binding site of PXR has a well-defined hot spot structure, with four hot spots located on four different sides of the pocket and a fifth close to its center. Three of these hot spots are already present in the ligand-free protein. The most important hot spot is defined by three structurally and sequentially conserved residues, W299, F288, and Y306. This largely hydrophobic site is not very specific, and interacts with all known PXR ligands. Depending on their sizes and shapes, individual PXR ligands extend into 2, 3, or 4 more hot spot regions. The large number of potential arrangements within the binding site explains why PXR is able to accommodate a large variety of compounds. All five hot spots include at least one important residue, which is conserved in all mammalian PXRs, suggesting that the hot spot locations have remained largely invariant during mammalian evolution. The same side chains also show a high level of structural conservation across hPXR structures. However, each of the hPXR hot spots also includes residues with moveable side chains, further increasing the size variation in ligands that PXR can bind. Results also suggest a unique signal transduction mechanism between the PXR homodimerization interface and its co-activator binding site.
Protein-protein interactions are critically dependent on just a few ‘hot spot’ residues at the interface. Hot spots make a dominant contribution to the free energy of binding and they can disrupt the interaction if mutated to alanine. Here, we present HSPred, a support vector machine(SVM)-based method to predict hot spot residues, given the structure of a complex. HSPred represents an improvement over a previously described approach (Lise et al, BMC Bioinformatics 2009, 10:365). It achieves higher accuracy by treating separately predictions involving either an arginine or a glutamic acid residue. These are the amino acid types on which the original model did not perform well. We have therefore developed two additional SVM classifiers, specifically optimised for these cases. HSPred reaches an overall precision and recall respectively of 61% and 69%, which roughly corresponds to a 10% improvement. An implementation of the described method is available as a web server at http://bioinf.cs.ucl.ac.uk/hspred. It is free to non-commercial users.