Actin is one of the most abundant proteins in eukaryotic cells, where it plays key roles in cell shape, motility, and regulation. Actin occurs in the cell in globular (G) and filamentous (F) forms. The helical filament formed by the sequential polymerization of monomeric G-actin molecules is called F-actin. Recently, the crystal structure of an actin dimer has been reported, which details the molecular interface in F-actin. In this study, a computational prediction model of the actin–actin complex was constructed based on the atomic model structure of G-actin. To this end, a docking simulation was carried out using predictive docking tools to obtain modeled structures of the actin–actin complex. Following molecular dynamics refinement, hot-spot interactions at the protein interface were identified that were predicted to contribute substantially to the free energy of binding. These provided a detailed prediction of key amino acid interactions at the protein–protein interface. The obtained model can be used in future experimental and computational studies to draw biological and functional conclusions. In addition, the identified interactions will be used to design further studies aimed at understanding the formation of the F-actin structure.
F-actin; G-actin; Protein–protein interaction; Docking; Hot spots
A good scoring function is essential for molecular docking computations. In conventional scoring functions, energy terms modeling pairwise interactions are cumulatively summed, and the best docking solution is selected. Here, we propose to transform protein-ligand interactions into three-dimensional geometric networks, from which recurring network substructures, or network motifs, are selected and used to provide probability-ranked interaction templates with which to score docking solutions.
A novel scoring function for protein-ligand docking, MotifScore, was developed. It is non-energy-based, and docking is, instead, scored by counting the occurrences of motifs of protein-ligand interaction networks constructed using structures of protein-ligand complexes. MotifScore has been tested on a benchmark set established by others to assess its ability to identify near-native complex conformations among a set of decoys. In this benchmark test, 84% of the highest-scored docking conformations had root-mean-square deviations (rmsds) below 2.0 Å from the native conformation, which is comparable with the best of several energy-based docking scoring functions. Many of the top motifs, which comprise a multitude of chemical groups that interact simultaneously and make a highly significant contribution to MotifScore, capture recurrent interacting patterns beyond pairwise interactions.
While providing quite good docking scores, MotifScore is quite different from conventional energy-based functions. MotifScore thus represents a new, network-based approach for exploring problems associated with molecular docking.
3D-partner is a web tool that predicts interacting partners and binding models of a query protein sequence through structure complexes and a new scoring function. 3D-partner first utilizes IMPALA to identify homologous structures (templates) of a query from a heterodimer profile library. The interacting-partner sequence profiles of these templates are then used to search for interacting candidates of the query in protein sequence databases (e.g. SwissProt) by PSI-BLAST. We developed a new scoring function, which includes the contact-residue interacting score (e.g. steric, hydrogen-bond, and electrostatic interactions) and the template consensus score (e.g. couple-conserved residue and template similarity scores), to evaluate the interfaces between the query and the interacting candidates. Based on this scoring function, 3D-partner provides the statistical significance, the binding models (e.g. hydrogen bonds and conserved amino acids) and functional annotations of interacting partners. The correlation between experimental energies and predicted binding affinities of our scoring function is 0.91 on 275 mutated residues from the ASEdb. The average precision of the server is 0.72 on 563 queries and the execution time of this server for a query is ∼15 s on average. These results suggest that the 3D-partner server can be useful in protein-protein interaction predictions and binding model visualizations. The server is available online at: http://3D-partner.life.nctu.edu.tw.
Exhaustive exploration of molecular interactions at the level of complete proteomes requires efficient and reliable computational approaches to protein function inference. Ligand docking and ranking techniques show considerable promise in their ability to quantify the interactions between proteins and small molecules. Despite the advances in the development of docking approaches and scoring functions, the genome-wide application of many ligand docking/screening algorithms is limited by the quality of the binding sites in theoretical receptor models constructed by protein structure prediction. In this study, we describe a new template-based method for the local refinement of ligand-binding regions in protein models using remotely related templates identified by threading. We designed a Support Vector Regression (SVR) model that selects correct binding site geometries in a large ensemble of multiple receptor conformations. The SVR model employs several scoring functions that impose geometrical restraints on the Cα positions, account for the specific chemical environment within a binding site and optimize the interactions with putative ligands. The SVR score is well correlated with the RMSD from the native structure; in 47% (70%) of the cases, the Pearson’s correlation coefficient is >0.5 (>0.3). When applied to weakly homologous models, the average heavy atom, local RMSD from the native structure of the top-ranked (best of top five) binding site geometries is 3.1 Å (2.9 Å) for roughly half of the targets; this represents a 0.1 (0.3) Å average improvement over the original predicted structure. Focusing on the subset of strongly conserved residues, the average heavy atom RMSD is 2.6 Å (2.3 Å). Furthermore, we estimate the upper bound of template-based binding site refinement using only weakly related proteins to be ~2.6 Å RMSD. This value also corresponds to the plasticity of the ligand-binding regions in distant homologues. 
The Binding Site Refinement (BSR) approach is available to the scientific community as a web server that can be accessed at http://cssb.biology.gatech.edu/bsr/.
Ligand-binding site refinement; protein threading; protein structure prediction; ligand-binding site prediction; ensemble docking; molecular function
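The correlation statistic reported for the SVR score in the abstract above (the share of targets whose Pearson correlation with RMSD exceeds a threshold) can be illustrated with a short sketch. This is not the BSR implementation; the function names and the per-target data layout (a list of `(scores, rmsds)` pairs) are assumptions made for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)

def fraction_above(per_target, threshold):
    """Fraction of targets whose score/RMSD correlation exceeds the threshold.

    per_target: list of (scores, rmsds) pairs, one per target (assumed layout).
    """
    coefficients = [pearson(scores, rmsds) for scores, rmsds in per_target]
    return sum(r > threshold for r in coefficients) / len(coefficients)
```

Calling `fraction_above(per_target, 0.5)` and `fraction_above(per_target, 0.3)` would give the two kinds of percentages quoted in the abstract, provided the conformational ensembles and their scores are available.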
We employ ensemble docking simulations to characterize the interactions of two enantiomeric forms of a Ru-complex compound (1-R and 1-S) with three protein kinases, namely PIM1, GSK-3β, and CDK2/cyclin A. We show that our ensemble docking computational protocol adequately models the structural features of these interactions and discriminates between competing conformational clusters of ligand-bound protein structures. Using the determined X-ray crystal structure of PIM1 complexed to the compound 1-R as a control, we discuss the importance of including the protein flexibility inherent in the ensemble docking protocol, for the accuracy of the structure prediction of the bound state. A comparison of our ensemble docking results suggests that PIM1 and GSK-3β bind the two enantiomers in similar fashion, through two primary binding modes: conformation I, which is very similar to the conformation presented in the existing PIM1/compound 1-R crystal structure; conformation II, which represents a 180° flip about an axis through the NH group of the pyridocarbazole moiety, relative to conformation I. In contrast, the binding of the enantiomers to CDK2 is found to have a different structural profile including a suggested bound conformation, which lacks the conserved hydrogen bond between the kinase and the ligand (i.e., ATP, staurosporine, Ru-complex compound). The top scoring conformation of the inhibitor bound to CDK2 is not present among the top-scoring conformations of the inhibitor bound to either PIM1 or GSK-3β and vice-versa. Collectively, our results help provide atomic-level insights into inhibitor selectivity among the three kinases.
Small molecular kinase inhibitor; Protein kinase; Inhibitor selectivity; Ruthenium-based organometallic compound; Molecular dynamics simulation; Molecular docking; Protein flexibility; Ensemble molecular docking
To determine the structures of protein-protein interactions, protein docking is a valuable tool that complements experimental methods to characterize protein complexes. While protein docking can often produce a near-native solution within a set of global docking predictions, there are sometimes predictions that require refinement to elucidate correct contacts and conformation. Previously, we developed the ZRANK algorithm to rerank initial docking predictions from ZDOCK, a docking program developed by our lab. In this study, we have applied the ZRANK algorithm toward refinement of protein docking models, in conjunction with the protein docking program RosettaDock. This was performed by reranking global docking predictions from ZDOCK, performing local side chain and rigid-body refinement using RosettaDock, and selecting the refined model based on ZRANK score. For comparison, we examined using RosettaDock score instead of ZRANK score, and a larger perturbation size for the RosettaDock search, and determined that the larger RosettaDock perturbation size with ZRANK scoring was optimal. This method was validated on a protein-protein docking benchmark. For refining docking benchmark predictions from the newest ZDOCK version, this led to improved structures of top-ranked hits in 20 of 27 cases, and an increase from 23 to 27 cases with hits in the top 20 predictions. Finally, we optimized the ZRANK energy function using refined models, which provides a significant improvement over the original ZRANK energy function. Using this optimized function and the refinement protocol, the numbers of cases with hits ranked at number one increased from 12 to 19 and from 7 to 15 for two different ZDOCK versions. This shows the effective combination of independently developed docking protocols (ZDOCK/ZRANK, and RosettaDock), indicating that using diverse search and scoring functions can improve protein docking results.
Molecular docking is widely used to obtain binding modes and binding affinities of a molecule to a given target protein. Despite considerable efforts, however, prediction of both properties by docking remains challenging, mainly due to the structural flexibility of proteins and the inaccuracy of scoring functions. Here, an integrated approach has been developed to improve the accuracy of binding mode and affinity prediction, and tested on small-molecule MDM2 and MDMX antagonists. In this approach, initial candidate models selected from docking are subjected to equilibration MD simulations to further filter the models. Free energy perturbation molecular dynamics (FEP/MD) simulations are then applied to the filtered ligand models to improve the prediction of the near-native ligand conformation. The calculated binding free energies for MDM2 complexes are overestimated compared to experimental measurements, mainly due to the difficulties in sampling highly flexible apo-MDM2. Nonetheless, the FEP/MD binding free energy calculations are more promising for discriminating binders from nonbinders than docking scores. In particular, the comparison between the MDM2 and MDMX results suggests that apo-MDMX has lower flexibility than apo-MDM2. In addition, the FEP/MD calculations provide detailed information on the different energetic contributions to ligand binding, leading to a better understanding of the sensitivity and specificity of protein-ligand interactions.
Molecular dynamics simulation; free energy perturbation; protein-protein interaction; docking; computer-aided drug design
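As background for the FEP/MD step in the abstract above, the basic free energy perturbation identity (exponential averaging of the energy difference sampled in the reference state) can be written in a few lines. This is only a sketch of that textbook identity, not the staged FEP/MD protocol used in the study; the function name, the units (kcal/mol) and the default temperature are assumptions.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def fep_delta_g(delta_u, temperature=300.0):
    """Free energy difference from exponential averaging.

    delta_u: samples of U1 - U0 (kcal/mol) collected in state 0.
    Implements dG = -kT * ln(mean(exp(-dU / kT))), shifted by the minimum
    sample so the exponentials stay numerically well behaved.
    """
    if not delta_u:
        raise ValueError("need at least one energy-difference sample")
    beta = 1.0 / (KB * temperature)
    shift = min(delta_u)
    total = sum(math.exp(-beta * (du - shift)) for du in delta_u)
    return shift - math.log(total / len(delta_u)) / beta
```

Because the exponential average is dominated by the lowest energy differences, the estimate always lies between the smallest sample and the arithmetic mean, which is one reason poor sampling of a flexible apo state can bias the result.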
The distance-dependent knowledge-based DrugScorePPI potentials, previously developed for in silico alanine scanning and hot spot prediction on given structures of protein-protein complexes, are evaluated as a scoring and objective function for the structure prediction of protein-protein complexes. When applied for ranking “unbound perturbation” (“unbound docking”) decoys generated by Baker and coworkers, a 4-fold (1.5-fold) enrichment of acceptable docking solutions in the top ranks compared to a random selection is found. When applied as an objective function in FRODOCK for bound protein-protein docking on 97 complexes of the ZDOCK benchmark 3.0, DrugScorePPI/FRODOCK finds up to 10% (15%) more high-accuracy solutions in the top 1 (top 10) predictions than the original FRODOCK implementation. When used as an objective function for global unbound protein-protein docking, fair docking success rates are obtained, which improve by ∼2-fold to 18% (58%) for an at least acceptable solution in the top 10 (top 100) predictions when performing knowledge-driven unbound docking. This suggests that DrugScorePPI balances well several different types of interactions important for protein-protein recognition. The results are discussed in view of the influence of crystal packing and the type of protein-protein complex docked. Finally, a simple criterion is provided with which to estimate a priori if unbound docking with DrugScorePPI/FRODOCK will be successful.
Two sets of ligand binding decoys have been constructed for the CSAR (Community Structure-Activity Resource) benchmark by using the MDock and DOCK programs for rigid-ligand and flexible-ligand docking, respectively. The decoys generated for each complex in the benchmark thoroughly cover the binding site and also contain a certain number of near-native binding modes. A few scoring functions have been evaluated using the ligand binding decoy sets for their ability to predict near-native binding modes. Among them, ITScore achieved a success rate of 86.7% for the rigid-ligand decoys and 79.7% for the flexible-ligand decoys, under the common definition of a successful prediction as RMSD < 2.0 Å from the native structure if the top-scored binding mode was considered. The decoy sets, which are available at the CSAR website (http://www.csardock.org/), may serve as benchmarks for assessing the binding mode prediction of a scoring function.
molecular docking; scoring function; CSAR benchmark; binding mode; knowledge-based
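The success criterion used in the abstract above (top-scored binding mode within 2.0 Å RMSD of the native structure) can be made concrete with a small sketch. It is illustrative only: the function names, the decoy layout (a list of `(score, rmsd_to_native)` pairs per complex) and the convention that a lower score is better are assumptions, not part of the CSAR benchmark or of ITScore.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two identically ordered lists of (x, y, z) atoms.

    No superposition is applied, which suits ligand poses already expressed
    in the frame of the same receptor.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((a - b) ** 2
                  for p, q in zip(coords_a, coords_b)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(coords_a))

def success_rate(decoy_sets, cutoff=2.0):
    """Fraction of complexes whose best-scored decoy lies below the RMSD cutoff.

    decoy_sets: one list of (score, rmsd_to_native) pairs per complex;
    lower score is assumed to mean a better pose.
    """
    hits = 0
    for decoys in decoy_sets:
        best = min(decoys, key=lambda d: d[0])
        if best[1] < cutoff:
            hits += 1
    return hits / len(decoy_sets)
```

For a scoring function where higher values are better, the `min` in `success_rate` would need to become `max`.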
Near-native selections from docking decoys have proved challenging especially when unbound proteins are used in the molecular docking. One reason is that significant atomic clashes in docking decoys lead to poor predictions of binding affinities of near native decoys. Atomic clashes can be removed by structural refinement through energy minimization. Such an energy minimization, however, will lead to an unrealistic bias toward docked structures with large interfaces. Here, we extend an empirical energy function developed for protein design to protein–protein docking selection by introducing a simple reference state that removes the unrealistic dependence of binding affinity of docking decoys on the buried solvent accessible surface area of interface. The energy function called EMPIRE (EMpirical Protein-InteRaction Energy), when coupled with a refinement strategy, is found to provide a significantly improved success rate in near native selections when applied to RosettaDock and refined ZDOCK docking decoys. Our work underlines the importance of removing nonspecific interactions from specific ones in near native selections from docking decoys.
knowledge-based potential; energy score functions; reference state; binding affinity; docking decoys
Protein heterodimer complexes are often involved in catalysis, regulation, assembly, immunity and inhibition. This involves the formation of stable interfaces between the interacting partners. Hence, it is of interest to describe heterodimer interfaces using known structural complexes. We use a non-redundant dataset of 192 heterodimer complex structures from the Protein Data Bank (PDB) to identify interface residues and describe the interfaces in terms of amino acid residue property preferences. Analysis of the dataset shows that heterodimer interfaces are often abundant in polar residues. The analysis also shows the presence of two classes of interfaces in heterodimer complexes. The first class (class A), with more polar residues than the core but fewer than the surface, is already known. These interfaces are more hydrophobic than surfaces, and protein-protein binding is largely hydrophobic. The second class (class B), with more polar residues than both the core and the surface, is shown here. These interfaces are more polar than surfaces, and binding is mainly polar. Thus, these findings provide insights into protein-protein interactions.
protein-protein interaction (PPI); heterodimer; interface; surface; core; polar abundance
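The two interface classes described in the abstract above can be expressed as a simple comparison of polar-residue fractions. The following is a minimal sketch: the set of residues counted as polar and the function names are assumptions, and the study may define polarity differently.

```python
# Assumed polar/charged residue set (one-letter codes); the original study
# may use a different definition.
POLAR = set("STNQYCDEKRH")

def polar_fraction(residues):
    """Fraction of residues (one-letter codes) that fall in the polar set."""
    if not residues:
        raise ValueError("residue list is empty")
    return sum(r in POLAR for r in residues) / len(residues)

def classify_interface(interface, core, surface):
    """Return 'A', 'B' or 'other' by comparing polar fractions.

    Class A: interface more polar than the core but less polar than the surface.
    Class B: interface more polar than both the core and the surface.
    """
    f_int = polar_fraction(interface)
    f_core = polar_fraction(core)
    f_surf = polar_fraction(surface)
    if f_int > f_core and f_int > f_surf:
        return "B"
    if f_core < f_int < f_surf:
        return "A"
    return "other"
```

The `other` label covers ties and interfaces less polar than the core, cases the abstract does not discuss.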
As protein–protein interactions are crucial in most biological processes, it is valuable to understand how and where protein pairs interact. We developed a web server, HOMCOS (Homology Modeling of Complex Structure, http://biunit.naist.jp/homcos), to predict interacting protein pairs and interacting sites by homology modeling of complex structures. The server provides three services. The first is modeling heterodimers from two query amino acid sequences posted by users. The server performs BLAST searches to identify homologous templates in the latest representative dataset of heterodimer structures generated from the PQS database. Structure validity is evaluated by the combination of sequence similarity and knowledge-based contact potential energy as previously described. The server generates a sequence-replaced model PDB file and a MODELLER script to build full atomic models of complex structures. The second service is modeling homodimers from one query sequence. The third service is identification of potentially interacting proteins for one query sequence. The server searches the dataset of heterodimer structures for a homologous template and outputs candidate interacting sequences from the UniProt database that are homologous to the interacting partner proteins of the template. These features are useful for a wide range of researchers to predict putative interaction sites and interacting proteins.
Molecular docking is widely used to predict novel lead compounds for drug discovery. Success depends on the quality of the docking scoring function, among other factors. An imperfect scoring function can mislead by predicting incorrect ligand geometries or by selecting nonbinding molecules over true ligands. These false-positive hits may be considered “decoys”. Although these decoys are frustrating, they potentially provide important tests for a docking algorithm; the more subtle the decoy, the more rigorous the test. Indeed, decoy databases have been used to improve protein structure prediction algorithms and protein–protein docking algorithms. Here, we describe 20 geometric decoys in five enzymes and 166 “hit list” decoys–i.e., molecules predicted to bind by our docking program that were tested and found not to do so–for β-lactamase and two cavity sites in lysozyme. Especially in the cavity sites, which are very simple, these decoys highlight particular weaknesses in our scoring function. We also consider the performance of five other widely used docking scoring functions against our geometric and hit list decoys. Intriguingly, whereas many of these other scoring functions performed better on the geometric decoys, they typically performed worse on the hit list decoys, often highly ranking molecules that seemed to poorly complement the model sites. Several of these “hits” from the other scoring functions were tested experimentally and found, in fact, to be decoys. Collectively, these decoys provide a tool for the development and improvement of molecular docking scoring functions. Such improvements may, in turn, be rapidly tested experimentally against these and related experimental systems, which are well-behaved in assays and for structure determination.
The efficient and accurate quantification of protein-ligand interactions using computational methods is still a challenging task. Two factors strongly contribute to the failure of docking methods to predict free energies of binding accurately: the insufficient incorporation of protein flexibility coupled to ligand binding and the neglected dynamics of the protein-ligand complex in current scoring schemes. We have developed a new methodology, named the ‘ligand-model’ concept, to sample protein conformations that are relevant for binding structurally diverse sets of ligands. In the ligand-model concept, molecular-dynamics (MD) simulations are performed with a virtual ligand, represented by a collection of functional groups that binds to the protein and dynamically changes its shape and properties during the simulation. The ligand model essentially represents a large ensemble of different chemical species binding to the same target protein. Representative protein structures were obtained from the MD simulation, and docking was performed into this ensemble of protein conformations. Similar binding poses were clustered, and the averaged score was utilized to re-rank the poses. We demonstrate that the ligand-model approach yields significant improvements in predicting native-like binding poses and quantifying binding affinities compared to static docking and ensemble docking simulations into protein structures generated from an apo MD simulation.
Ligand-model concept; protein-ligand interactions; protein flexibility; induced-fit; docking; holo; apo
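The final step of the ligand-model protocol above (clustering similar binding poses and re-ranking by the averaged score) can be sketched as follows. The clustering scheme is an assumption: a simple greedy procedure with a 2.0 Å RMSD cutoff stands in for whatever method the study used, lower scores are assumed to be better, and all names are hypothetical.

```python
import math

def rmsd(a, b):
    """RMSD between two identically ordered coordinate lists (no superposition)."""
    return math.sqrt(sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q)) / len(a))

def cluster_and_rerank(poses, cutoff=2.0):
    """Cluster poses greedily and rank clusters by their mean score.

    poses: list of (score, coords) collected over all protein conformations.
    Poses are visited from best to worst score; each joins the first cluster
    whose representative lies within the cutoff, otherwise it starts a new one.
    Returns a list of (mean_score, representative_pose), best cluster first.
    """
    clusters = []  # each cluster is a list of poses; element 0 is the representative
    for pose in sorted(poses, key=lambda p: p[0]):
        for members in clusters:
            if rmsd(pose[1], members[0][1]) <= cutoff:
                members.append(pose)
                break
        else:
            clusters.append([pose])
    ranked = [(sum(p[0] for p in m) / len(m), m[0]) for m in clusters]
    ranked.sort(key=lambda c: c[0])
    return ranked
```

Averaging means that a binding mode scoring consistently well across the ensemble can outrank an isolated pose with a single very favorable score.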
The identification of near-native protein-protein complexes among a set of decoys remains highly challenging. A strategy for improving the success rate of near-native detection is to enrich near-native docking decoys in a small number of top-ranked decoys. Recently, we found that a combination of three scoring functions (energy, conservation and interface propensity) can predict the location of binding interface regions with reasonable accuracy. Here, these three scoring functions are modified and combined into a consensus scoring function called ENDES for enriching near-native docking decoys. We found that all individual scores result in enrichment for the majority of the 28 targets in the ZDOCK2.3 decoy set and the 22 targets in Benchmark 2.0. Among the three scores, the interface propensity score yields the highest enrichment in both sets of protein complexes. When these scores are combined into the ENDES consensus score, a significant increase in enrichment of near-native structures is found. For example, when 2000 docking decoys are reduced to 200 decoys by ENDES, the fraction of near-native structures in docking decoys increases by a factor of about six on average. ENDES was implemented in a computer program that is available for download at http://sparks.informatics.iupui.edu.
Protein Docking; Near Native Decoy Selection; Energy Score; Interface Propensity; Conservation Score
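The enrichment quoted in the abstract above (the change in the fraction of near-native structures when 2000 decoys are reduced to 200) corresponds to a ratio that is easy to compute. The sketch below is not the ENDES program; the function name, the `(score, is_near_native)` layout and the assumption that higher scores are better are illustrative choices.

```python
def enrichment(decoys, keep):
    """Enrichment factor of near-native decoys after keeping the top `keep`.

    decoys: list of (score, is_near_native); higher score is assumed better.
    Returns (near-native fraction in the kept subset) divided by
    (near-native fraction in the full set).
    """
    if not 0 < keep <= len(decoys):
        raise ValueError("keep must be between 1 and the number of decoys")
    total_hits = sum(1 for _, near_native in decoys if near_native)
    if total_hits == 0:
        raise ValueError("no near-native decoys in the set")
    top = sorted(decoys, key=lambda d: d[0], reverse=True)[:keep]
    top_hits = sum(1 for _, near_native in top if near_native)
    return (top_hits / keep) / (total_hits / len(decoys))
```

With 2000 decoys and `keep=200`, a return value of about 6 would match the average improvement reported for ENDES; the maximum possible value in that setting is 10.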
Protein-protein docking, which aims to predict the structure of a protein-protein complex from its unbound components, remains an unresolved challenge in structural bioinformatics. An important step is the ranking of docked poses using a scoring function, for which many methods have been developed. There is a need to explore the differences and commonalities of these methods with each other, as well as with functions developed in the fields of molecular dynamics and homology modelling.
We present an evaluation of 115 scoring functions on an unbound docking decoy benchmark covering 118 complexes for which a near-native solution can be found, yielding top 10 success rates of up to 58%. Hierarchical clustering is performed, so as to group together functions which identify near-natives in similar subsets of complexes. Three set theoretic approaches are used to identify pairs of scoring functions capable of correctly scoring different complexes. This shows that functions in different clusters capture different aspects of binding and are likely to work together synergistically.
All functions designed specifically for docking perform well, indicating that functions are transferable between sampling methods. We also identify promising methods from the field of homology modelling. Further, differential success rates by docking difficulty and solution quality suggest a need for flexibility-dependent scoring. Investigating pairs of scoring functions, the set theoretic measures identify known scoring strategies as well as a number of novel approaches, indicating promising augmentations of traditional scoring methods. Such augmentation and parameter combination strategies are discussed in the context of the learning-to-rank paradigm.
Docking; Scoring functions; Binding energy; Ranking; SwarmDock
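The idea of finding pairs of scoring functions that succeed on different complexes can be illustrated with basic set operations. The abstract above does not specify its three set-theoretic measures, so the union and symmetric-difference sizes used here are stand-ins, and the function name and input layout are assumptions.

```python
from itertools import combinations

def complementary_pairs(solved):
    """Rank pairs of scoring functions by how well they complement each other.

    solved: dict mapping a function name to the set of complex identifiers
    for which that function places a near-native solution in its top ranks.
    Returns tuples (name_a, name_b, union_size, symmetric_difference_size),
    sorted so that pairs covering the most complexes come first.
    """
    rows = []
    for a, b in combinations(sorted(solved), 2):
        covered = solved[a] | solved[b]      # complexes solved by either function
        exclusive = solved[a] ^ solved[b]    # complexes solved by exactly one
        rows.append((a, b, len(covered), len(exclusive)))
    rows.sort(key=lambda r: (-r[2], -r[3]))
    return rows
```

A pair with a large union but a small intersection is a natural candidate for combination, which is the intuition behind the synergy between clusters mentioned in the abstract.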
Protein-RNA interactions play fundamental roles in many biological processes. Understanding the molecular mechanism of protein-RNA recognition and formation of protein-RNA complexes is a major challenge in structural biology. Unfortunately, the experimental determination of protein-RNA complexes is tedious and difficult, both by X-ray crystallography and NMR. For many interacting proteins and RNAs the individual structures are available, enabling computational prediction of complex structures by computational docking. However, methods for protein-RNA docking remain scarce, in particular in comparison to the numerous methods for protein-protein docking.
We developed two medium-resolution, knowledge-based potentials for scoring protein-RNA models obtained by docking: the quasi-chemical potential (QUASI-RNP) and the Decoys As the Reference State potential (DARS-RNP). Both potentials use a coarse-grained representation for both RNA and protein molecules and are capable of dealing with RNA structures with posttranscriptionally modified residues. We compared the discriminative power of DARS-RNP and QUASI-RNP for selecting rigid-body docking poses with the potentials previously developed by the Varani and Fernandez groups.
In both bound and unbound docking tests, DARS-RNP showed the highest ability to identify native-like structures. Python implementations of DARS-RNP and QUASI-RNP are freely available for download at http://iimcb.genesilico.pl/RNP/
RNA; protein; RNP; macromolecular docking; complex modeling; structural bioinformatics
Accommodating backbone flexibility continues to be the most difficult challenge in computational docking of protein-protein complexes. Towards that end, we simulate four distinct biophysical models of protein binding in RosettaDock, a multi-scale Monte-Carlo based algorithm that uses a quasi-kinetic search process to emulate the diffusional encounter of two proteins and identify low energy complexes. The four binding models are: 1) key-lock model (KL) using rigid-backbone docking, 2) conformer selection model (CS) using a novel ensemble docking algorithm, 3) induced fit model (IF) using energy gradient-based backbone minimization, and 4) a combined conformer selection/induced fit model (CS/IF). Backbone flexibility was limited to the smaller partner of the complex, structural ensembles were generated using Rosetta refinement methods, and docking consisted of local perturbations around the complexed conformation using unbound component crystal structures for a set of 21 target complexes. The lowest-energy structure contained more than 30% of the native residue-residue contacts for 9, 13, 13, and 14 targets for KL, CS, IF and CS/IF docking respectively. When applied to 15 targets using NMR ensembles of the smaller protein, the lowest-energy structure recovered at least 30% native residue contacts in 3, 8, 4 and 8 targets for KL, CS, IF and CS/IF docking respectively. CS/IF docking of the NMR ensemble performed equally well or better than KL docking with the unbound crystal structure in 10 of 15 cases. The marked success of CS and CS/IF docking shows that ensemble docking can be a versatile and effective method for accommodating conformational plasticity in docking and serves as a demonstration for the conformer selection theory - that binding-competent conformers exist in the unbound ensemble and can be selected based on their favorable binding energies.
protein-protein docking; flexible docking; ensemble docking; conformer selection; NMR ensembles
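The success measure in the abstract above (lowest-energy structure containing more than 30% of the native residue-residue contacts) reduces to a set comparison once contacts have been enumerated. In this sketch the contact representation, a set of `(residue_in_partner_1, residue_in_partner_2)` pairs, and the function names are assumptions; how contacts are detected (distance cutoff, atom selection) is left out.

```python
def native_contact_fraction(native_contacts, model_contacts):
    """Fraction of native residue-residue contacts reproduced by a model.

    Both arguments are sets of (residue_in_partner_1, residue_in_partner_2).
    """
    if not native_contacts:
        raise ValueError("native contact set is empty")
    return len(native_contacts & model_contacts) / len(native_contacts)

def count_successes(targets, threshold=0.30):
    """Number of targets whose model recovers more than `threshold` of the contacts.

    targets: list of (native_contacts, model_contacts) pairs, where the model
    is the lowest-energy structure for that target.
    """
    return sum(1 for native, model in targets
               if native_contact_fraction(native, model) > threshold)
```

Running `count_successes` once per docking protocol (KL, CS, IF, CS/IF) on the same targets would yield counts of the kind listed in the abstract.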
With the development of many computational methods that predict the structural models of protein-protein complexes, there is a pressing need to benchmark their performance. As was the case for protein monomers, assessing the quality of models of protein complexes is not straightforward. An effective scoring scheme should be able to detect substructure similarity and estimate its statistical significance. Here, we focus on characterizing the similarity of the interfaces of the complex and introduce two scoring functions. The first, the interfacial Template Modeling score (iTM-score), measures the geometric distance between the interfaces, while the second, the Interface Similarity score (IS-score), evaluates their side chain contact similarity in addition to their geometric similarity. We first demonstrate that the IS-score is more suitable for assessing docking models than the iTM-score. The IS-score is then validated in a large-scale benchmark test on 1,562 dimeric complexes. Finally, the scoring function is applied to evaluate docking models submitted to the Critical Assessment of PRediction of Interactions (CAPRI) experiments. While the results according to the new scoring scheme are generally consistent with the original CAPRI assessment, the IS-score identifies models whose significance was previously underestimated.
docking; protein-protein interaction; protein-protein interface; structure prediction; TM-score; IS-score; CAPRI
Molecular docking is an important method for studying protein-protein interaction and recognition. A protein can be considered as a network when its residues are treated as nodes. With the contact energy between residues as the link weight, a weighted residue network is constructed in this paper. Two weighted parameters (strength and weighted average nearest neighbors’ degree) are introduced into this model at the same time. The stability of a protein is characterized by its strength. The global topological properties of the protein-protein complex are reflected by the weighted average nearest neighbors’ degree. Based on this weighted network model and these two parameters, a new docking scoring function is proposed. The bound and unbound docking results of 42 systems are scored and ranked with this new scoring function. Comparing the results of the new scoring function with those of a pair-potential scoring function, we found that the new function performs similarly on some measures and achieves a better success rate. The new scoring function is easy to calculate, and its scoring and ranking results are acceptable. This work can help us better understand the mechanisms of protein-protein interaction and recognition.
residue network; weighted parameter; protein-protein docking; scoring function
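The two network parameters named in the abstract above have standard definitions in the weighted-network literature: the strength of node i is s_i = Σ_j w_ij, and the weighted average nearest neighbors' degree is k_nn,i = (1/s_i) Σ_j w_ij k_j, where k_j is the degree of neighbor j. The sketch below computes both for an undirected network. It assumes positive link weights (for example, magnitudes of contact energies) and does not reproduce the paper's scoring function; the function name and input layout are hypothetical.

```python
def strength_and_knn(weights):
    """Node strength and weighted average nearest neighbors' degree.

    weights: dict mapping an undirected edge (i, j) to a positive weight.
    Returns two dicts keyed by node: strength s_i and weighted k_nn,i.
    """
    adjacency = {}
    for (i, j), w in weights.items():
        adjacency.setdefault(i, {})[j] = w
        adjacency.setdefault(j, {})[i] = w
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    strength = {node: sum(nbrs.values()) for node, nbrs in adjacency.items()}
    knn_w = {node: sum(w * degree[m] for m, w in nbrs.items()) / strength[node]
             for node, nbrs in adjacency.items()}
    return strength, knn_w
```

A scoring function along the lines described in the abstract would combine these per-residue values over the docked complex; how they are combined is not stated there and is left open here.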
Structural details of protein–protein interactions are invaluable for understanding and deciphering biological mechanisms. Computational docking methods aim to predict the structure of a protein–protein complex given the structures of its single components. Protein flexibility and the absence of robust scoring functions pose a great challenge in the docking field. Due to these difficulties most of the docking methods involve a two-tier approach: coarse global search for feasible orientations that treats proteins as rigid bodies, followed by an accurate refinement stage that aims to introduce flexibility into the process. The FireDock web server, presented here, is the first web server for flexible refinement and scoring of protein–protein docking solutions. It includes optimization of side-chain conformations and rigid-body orientation and allows a high-throughput refinement. The server provides a user-friendly interface and a 3D visualization of the results. A docking protocol consisting of a global search by PatchDock and a refinement by FireDock was extensively tested. The protocol was successful in refining and scoring docking solution candidates for cases taken from docking benchmarks. We provide an option for using this protocol by automatic redirection of PatchDock candidate solutions to the FireDock web server for refinement. The FireDock web server is available at http://bioinfo3d.cs.tau.ac.il/FireDock/.
DNA–protein interactions are involved in many essential biological
activities. Because there is no simple mapping code between DNA base pairs and
protein amino acids, the prediction of DNA–protein interactions is a
challenging problem. Here, we present a novel computational approach for
predicting DNA-binding protein residues and DNA–protein interaction
modes without knowing the protein's specific DNA target sequence. Given the structure of a
DNA-binding protein, the method first generates an ensemble of complex
structures obtained by rigid-body docking with a nonspecific canonical B-DNA.
Representative models are subsequently selected through clustering and ranking
by their DNA–protein interfacial energy. Analysis of these encounter
complex models suggests that the recognition sites for specific DNA binding are
usually favorable interaction sites for the nonspecific DNA probe and that
nonspecific DNA–protein interaction modes exhibit some similarity to
specific DNA–protein binding modes. Although the method requires as
input the knowledge that the protein binds DNA, in benchmark tests, it achieves
better performance in identifying DNA-binding sites than three previously
established methods, which are based on sophisticated machine-learning
techniques. We further apply our method to protein structures predicted through
modeling and demonstrate that our method performs satisfactorily on protein
models whose root-mean-square Cα deviation from the native structure is up to
5 Å. This study provides valuable
structural insights into how a specific DNA-binding protein interacts with a
nonspecific DNA sequence. The similarity between the specific
DNA–protein interaction mode and nonspecific interaction modes may
reflect an important sampling step in the search for specific DNA targets by a
DNA-binding protein.
Many essential biological activities require interactions between DNA and
proteins. These proteins usually use certain amino acids, called DNA-binding
sites, to recognize their specific DNA targets. To facilitate the search for its
specific DNA targets, a DNA-binding protein often associates with nonspecific
DNA and then diffuses along the DNA. Due to the weak interactions between
nonspecific DNA and the protein, structural characterization of nonspecific
DNA–protein complexes is experimentally challenging. This paper
describes a computational modeling study on nonspecific DNA–protein
complexes and comparative analysis with respect to specific
DNA–protein complexes. The study found that the specific DNA-binding
sites on a protein are typically favorable for nonspecific DNA and that
nonspecific and specific DNA–protein interaction modes are quite
similar. This similarity may reflect an important sampling step in the search
for the specific DNA target sequence by a DNA-binding protein. On the basis of
these observations, a novel method was proposed for predicting DNA-binding sites
and binding modes of a DNA-binding protein without knowing its specific DNA
target sequence. Ultimately, the combination of this method and protein
structure prediction may lead the way to high throughput modeling of
ATP participates in numerous biochemical reactions in living cells, where it interacts with different proteins. Molecular docking simulations provide considerable insight into the molecular recognition of this substrate. To improve the selection of correct ATP poses among those generated by docking algorithms, we propose a post-docking reranking criterion. The method is based on a detailed analysis of the intermolecular interactions in 50 high-resolution 3D structures of ATP-protein complexes. A distinctive new feature of the proposed method is that the ligand molecule is divided into fragments that differ in their physical properties. The placement of each fragment in the binding site is judged separately by different criteria, which avoids undesirable averaging of the scoring function terms and highlights the terms relevant for particular fragments. The scoring performance of the new criteria was tested on docking solutions for ATP-protein complexes, and a significant improvement in the selection of correct docking poses was observed compared with the standard scoring function.
ATP recognition; molecular hydrophobicity potential; molecular lipophilicity potential; stacking
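The fragment-wise reranking idea described above can be sketched as follows. ATP is split into its adenine, ribose, and triphosphate parts, each pose is checked against a separate criterion per fragment, and poses are reordered by how many fragment criteria they satisfy. Everything specific in this sketch is a hypothetical placeholder: the descriptor names, the thresholds, the tie-breaking rule, and the assumption that a lower docking score is better. The abstract does not give the actual criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    name: str
    docking_score: float        # standard scoring function; lower assumed better
    adenine_hydrophobic: float  # hypothetical hydrophobic-match term, 0..1
    ribose_hbonds: int          # hypothetical count of ribose hydrogen bonds
    phosphate_contacts: int     # hypothetical count of phosphate polar contacts


# One criterion per fragment. Thresholds are illustrative placeholders,
# not values taken from the original work.
FRAGMENT_CRITERIA = {
    "adenine": lambda p: p.adenine_hydrophobic >= 0.5,
    "ribose": lambda p: p.ribose_hbonds >= 1,
    "triphosphate": lambda p: p.phosphate_contacts >= 2,
}


def fragments_satisfied(pose):
    """Count how many fragments are placed acceptably in this pose."""
    return sum(1 for check in FRAGMENT_CRITERIA.values() if check(pose))


def rerank(poses):
    """Sort by satisfied fragment criteria (descending), then docking score."""
    return sorted(poses, key=lambda p: (-fragments_satisfied(p), p.docking_score))


poses = [
    Pose("pose_a", -9.1, 0.2, 0, 1),
    Pose("pose_b", -8.4, 0.8, 2, 3),
    Pose("pose_c", -8.9, 0.7, 0, 2),
]
ranked = rerank(poses)
```

In this toy input, pose_a has the best raw docking score but satisfies none of the fragment criteria, so the fragment-wise check moves it to the bottom of the list. Judging fragments separately keeps a well-placed fragment from masking a badly placed one, which is the averaging effect the abstract aims to avoid.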
To understand the interactions within the CLAVATA (CLV) complex, we predicted the 3D structures of CLV3/ESR-related (CLE) peptides and the ectodomains of their potential receptor proteins/kinases, and built docking models of these molecules. The results show that the ectodomain of CLV1 can form homodimers and that the 12-/13-amino-acid CLV3 peptide fits into the binding clefts of the CLV1 dimers. Our results also demonstrate that the receptor domain of CORYNE (CRN), a recently identified receptor-like kinase, binds tightly to the ectodomain of CLV2, which likely increases the possibility of docking with CLV1. Furthermore, our docking models reveal that two CRN-CLV2 ectodomain heterodimers are able to form a tetramer receptor complex. Peptides of CLV3, CLE14, CLE19, and CLE20 are also able to bind a potential CLV2-CRN heterodimer or heterotetramer complex. Using a cell-division reporter line, we found that synthetic 12-amino-acid CLE14 and CLE20 peptides irreversibly inhibit root growth by reducing cell division rates in the root apical meristem, resulting in a short-root phenotype. Intriguingly, we observed that exogenous application of cytokinin can partially rescue the short-root phenotype induced by over-expression of either CLE14 or CLE20 in planta. However, cytokinin treatment does not rescue the short-root phenotype caused by exogenous application of the synthetic CLE14/CLE20 peptides, suggesting a requirement for a condition provided only in living plants. These results therefore imply that the CLE14/CLE20 peptides may act through the CLV2-CRN receptor kinase, and that their availability and/or abundance may be affected by cytokinin activity in planta.
CLE; 3D structures; Docking models; Root apical meristem; Cytokinin
Protein-protein interactions are involved in most cellular processes, and their detailed physico-chemical and structural characterization is needed in order to understand their function at the molecular level. In-silico docking tools can complement experimental techniques, providing three-dimensional structural models of such interactions at atomic resolution. In several recent studies, protein structures have been modeled as networks (or graphs), where the nodes represent residues and the connecting edges their interactions. From such networks, it is possible to calculate different topology-based values for each of the nodes, and to identify protein regions with high centrality scores, which are known to positively correlate with key functional residues, hot spots, and protein-protein interfaces.
Here we show that this correlation can be efficiently used for the scoring of rigid-body docking poses. When integrated into the pyDock energy-based docking method, the new combined scoring function significantly improved the results of the individual components as shown on a standard docking benchmark. This improvement was particularly remarkable for specific protein complexes, depending on the shape, size, type, or flexibility of the proteins involved.
The network-based representation of protein structures can be used to identify protein-protein binding regions and to efficiently score docking poses, complementing energy-based approaches.
protein interactions; small-world networks; binding site prediction; protein-protein docking; pyDock
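The network-based scoring idea above can be sketched in a few lines. The sketch computes a centrality value for each residue of a toy residue network, scores a docking pose by the mean centrality of the residues it places at the interface, and combines that value with an energy term. Several choices here are assumptions made for illustration: closeness centrality as the centrality measure, the linear combination and its weight, the toy graph, and the convention that a lower combined score is better. The abstract does not state which centrality or which combination the pyDock integration uses.

```python
from collections import deque

# Toy residue network (hypothetical): a path A-B-C-D-E of residue contacts.
ADJACENCY = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}


def closeness_centrality(adjacency):
    """Closeness centrality per node from breadth-first shortest paths.

    Computed as (reachable nodes) / (sum of shortest-path distances).
    """
    centrality = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total = sum(dist.values())
        centrality[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return centrality


def combined_score(energy, interface, centrality, weight=5.0):
    """Energy minus a weighted mean interface centrality (lower is better).

    The weight is an arbitrary illustrative value.
    """
    network_term = sum(centrality[r] for r in interface) / len(interface)
    return energy - weight * network_term


centrality = closeness_centrality(ADJACENCY)

# Hypothetical poses: (energy, residues placed at the interface).
POSES = {
    "pose1": (-10.0, ["A", "E"]),
    "pose2": (-9.5, ["B", "C", "D"]),
}
scores = {
    name: combined_score(energy, interface, centrality)
    for name, (energy, interface) in POSES.items()
}
best = min(scores, key=scores.get)
```

Here pose1 has the lower energy alone, but pose2 buries the more central residues, so the combined score prefers pose2. This mirrors the complementarity between network-based and energy-based terms that the abstract reports.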