Prediction of physical protein-protein interactions represents a key challenge in computational systems biology. This study provides a proof-of-principle that high-throughput in silico protein docking results can be used to predict interaction partners.
Deciphering the whole network of protein interactions for a given proteome (‘interactome’) is the goal of many experimental and computational efforts in Systems Biology. Separately, the prediction of the structure of protein complexes by docking methods is a well-established scientific area. To date, docking programs have not been used to predict interaction partners. We provide a proof of principle for such an approach. Using a set of protein complexes representing known interactors in their unbound form, we show that a standard docking program can distinguish the true interactors from a background of 922 non-redundant potential interactors. We additionally show that true interactions can be distinguished from non-likely interacting proteins within the same structural family. Our approach may be put in the context of the proposed ‘funnel-energy model’; the docking algorithm may not find the native complex, but it distinguishes binding partners because of the higher probability of favourable models compared with a collection of non-binders. The potential exists to develop this proof of principle into new approaches for predicting interaction partners and reconstructing biological networks.
interactome; protein docking; protein–protein interaction
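The partner-prediction idea above can be illustrated with a minimal sketch. It assumes each candidate partner has already been docked and that a list of model energies (lower meaning more favourable) is available per candidate. The function names, the mean-of-best-models summary and the `top_k` parameter are illustrative assumptions, not the scoring scheme used in the study.

```python
from statistics import mean


def partner_score(model_energies, top_k=10):
    """Summarise one candidate by the mean of its top_k lowest model energies.

    Assumption: lower energy means a more favourable docking model.
    """
    if not model_energies:
        raise ValueError("no docking models for this candidate")
    best = sorted(model_energies)[:top_k]
    return mean(best)


def rank_candidates(energies_by_candidate, top_k=10):
    """Return candidate names ordered from most to least favourable summary score."""
    return sorted(
        energies_by_candidate,
        key=lambda name: partner_score(energies_by_candidate[name], top_k),
    )


if __name__ == "__main__":
    # Hypothetical energies for three candidate partners of one query protein.
    docking_runs = {
        "candidate_A": [-5.0, -4.0, -1.0],
        "candidate_B": [-2.0, -1.0, 0.0],
        "candidate_C": [-3.0, -3.0, -2.5],
    }
    print(rank_candidates(docking_runs, top_k=2))
```

Summarising each candidate by several favourable models, rather than by the single best one, reflects the point that a true binder is expected to yield favourable models more often even when the native complex itself is not found.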
The CAPRI experiment (Critical Assessment of Predicted Interactions) simulates realistic and diverse docking challenges, each case having specific properties that may be exploited by docking algorithms. Motivated by the different CAPRI challenges, we developed and implemented a comprehensive suite of docking algorithms. These were incorporated into a dynamic docking protocol, consisting of four main stages: (1) Biological and bioinformatics research aiming to predict the binding site residues, to define distance constraints between interface atoms and to analyze the flexibility of molecules; (2) Rigid or flexible docking, performed by the PatchDock or FlexDock method, which utilizes the information gathered in the previous step. Symmetric complexes are predicted by the SymmDock method; (3) Flexible refinement and re-ranking of the rigid docking solution candidates, performed by FiberDock; and finally, (4) clustering and filtering the results based on energy funnels. We analyzed the performance of our docking protocol on a large benchmark and on recent CAPRI targets. The analysis has demonstrated the importance of biological information gathering prior to docking, which significantly increased the docking success rate, and of the refinement and re-scoring stage that significantly improved the ranking of the rigid docking solutions. Our failures were mostly a result of mishandling backbone flexibility, inaccurate homology modeling, or incorrect biological assumptions. Most of the methods are available at http://bioinfo3d.cs.tau.ac.il/.
Structural insight from transcription factor-DNA (TF-DNA) complexes is of paramount importance to our understanding of the affinity and specificity of TF-DNA interaction, and to the development of structure-based prediction of TF binding sites. Yet the majority of the TF-DNA complexes remain unsolved despite the considerable experimental efforts being made. Computational docking represents a promising alternative to bridge the gap. To facilitate the study of TF-DNA docking, carefully designed benchmarks are needed for performance evaluation and identification of the strengths and weaknesses of docking algorithms.
We constructed two benchmarks for flexible and rigid TF-DNA docking respectively using a unified non-redundant set of 38 test cases. The test cases encompass diverse fold families and are classified into easy and hard groups with respect to the degrees of difficulty in TF-DNA docking. The major parameters used to classify expected docking difficulty in flexible docking are the conformational differences between bound and unbound TFs and the interaction strength between TFs and DNA. For rigid docking in which the starting structure is a bound TF conformation, only interaction strength is considered.
We believe these benchmarks are important for the development of better interaction potentials and TF-DNA docking algorithms, which has important implications for structure-based prediction of transcription factor binding sites and for drug design.
In recent years, large-scale proteomics studies have generated a wealth of information on biomolecular complexes. Adding the structural dimension to the resulting interactomes represents a major challenge that classical structural experimental methods alone will have difficulty meeting. Complementary modeling techniques such as docking are thus needed. Among the current docking methods, HADDOCK (High Ambiguity-Driven DOCKing) distinguishes itself from others by the use of experimental and/or bioinformatics data to drive the modeling process and has shown a strong performance in the critical assessment of prediction of interactions (CAPRI), a blind experiment for the prediction of interactions. Although most docking programs are limited to binary complexes, HADDOCK can deal with multiple molecules (up to six), a capability that will be required to build large macromolecular assemblies. We present here a novel web interface of HADDOCK that allows the user to dock up to six biomolecules simultaneously. This interface allows the inclusion of a large variety of experimental and/or bioinformatics data and supports several types of cyclic and dihedral symmetries in the docking of multibody assemblies. The server was tested on a benchmark of six cases, containing five symmetric homo-oligomeric protein complexes and one symmetric protein-DNA complex. Our results reveal that, in the presence of bioinformatics and/or experimental data, HADDOCK shows an excellent performance: in all cases, HADDOCK was able to generate good to high quality solutions and ranked them at the top, demonstrating its ability to model symmetric multicomponent assemblies. Docking methods can thus play an important role in adding the structural dimension to interactomes.
However, although the current docking methodologies were successful for a vast range of cases, considering the variety and complexity of macromolecular assemblies, inclusion of some kind of experimental information (e.g. from mass spectrometry, nuclear magnetic resonance, cryoelectron microscopy, etc.) will remain highly desirable to obtain reliable results.
In modeling ligand-protein interactions, the representation and role of water is of great importance. We introduce a forcefield and hydration docking method that enables the automated prediction of waters mediating the binding of ligands with target proteins. The method presumes no prior knowledge of the apo or holo protein hydration state, and is potentially useful in the process of structure-based drug discovery. The hydration forcefield accounts for the entropic and enthalpic contributions of discrete waters to ligand binding, improving energy estimation accuracy and docking performance. The forcefield has been calibrated and validated on a total of 417 complexes (197 training set; 220 test set), then tested in cross-docking experiments, for a total of 1649 ligand-protein complexes evaluated. The method is computationally efficient and was used to model up to 35 waters during docking. The method was implemented and tested using unaltered AutoDock4 with new forcefield tables.
Synaptic vesicles dock to the plasma membrane at synapses to facilitate rapid exocytosis. Docking was originally proposed to require the soluble N-ethylmaleimide–sensitive fusion attachment protein receptor (SNARE) proteins; however, perturbation studies suggested that docking was independent of the SNARE proteins. We now find that the SNARE protein syntaxin is required for docking of all vesicles at synapses in the nematode Caenorhabditis elegans. The active zone protein UNC-13, which interacts with syntaxin, is also required for docking in the active zone. The docking defects in unc-13 mutants can be fully rescued by overexpressing a constitutively open form of syntaxin, but not by wild-type syntaxin. These experiments support a model for docking in which UNC-13 converts syntaxin from the closed to the open state, and open syntaxin acts directly in docking vesicles to the plasma membrane. These data provide a molecular basis for synaptic vesicle docking.
Like Olympic swimmers crouched on their starting blocks, synaptic vesicles prepare for fusion with the neuronal plasma membrane long before the starting gun fires. This preparation enables vesicles to fuse rapidly, synchronously, and in the correct place when the signal finally arrives. A well-known but poorly understood part of vesicle preparation is docking, in which vesicles prepare for release by attaching to the plasma membrane at the eventual site of release. Here, we outline a molecular mechanism for docking. Using a combination of genetics and electron microscopy, we find that docking requires two proteins: the cytoplasmic protein UNC-13 and the plasma membrane protein syntaxin. Syntaxin is known to form two configurations, closed and open. We find that the open form of syntaxin can bypass the docking function of UNC-13, while the closed form cannot. These experiments suggest that docking is the attachment of synaptic vesicles to syntaxin; that syntaxin must be open for this attachment to occur; and that UNC-13’s role in docking is to promote open syntaxin.
Experiments in C. elegans support a model for synaptic vesicle docking in which the active zone protein UNC-13 converts syntaxin from the closed to the open state, and open syntaxin acts directly in docking vesicles to the plasma membrane.
Small molecule docking predicts the interaction of a small molecule ligand with a protein at atomic-detail accuracy, including the position and conformation of the ligand as well as conformational changes of the protein upon ligand binding. While successful in the majority of cases, docking algorithms including RosettaLigand fail in some cases to predict the correct protein/ligand complex structure. In this study we show that simultaneous docking of explicit interface water molecules greatly improves Rosetta’s ability to distinguish correct from incorrect ligand poses. This result holds true for both protein-centric water docking, wherein waters are located relative to the protein binding site, and ligand-centric water docking, wherein waters move with the ligand during docking. Protein-centric docking is used to model 99 HIV-1 protease/protease inhibitor structures. We find protease inhibitor placement improving at a ratio of 9∶1 when one critical interface water molecule is included in the docking simulation. Ligand-centric docking is applied to 341 structures from the CSAR benchmark of diverse protein/ligand complexes. Across this diverse dataset we see up to 56% recovery of failed docking studies when waters are included in the docking simulation.
Protein-RNA interactions play an important role in many biological processes. The ability to predict the molecular structures of protein-RNA complexes from docking would be valuable for understanding the underlying chemical mechanisms. We have developed a novel non-redundant benchmark dataset for protein-RNA docking and scoring. The diverse dataset of 72 targets consists of 52 unbound-unbound test complexes, and 20 unbound-bound test complexes. Here, unbound-unbound complexes refer to cases in which both binding partners of the co-crystallized complex are either in apo form or in a conformation taken from a different protein-RNA complex, whereas unbound-bound complexes are cases in which only one of the two binding partners has another experimentally determined conformation. The dataset is classified into three categories according to the interface RMSD and the percentage of native contacts in the unbound structures: 49 easy, 16 medium, and 7 difficult targets. The bound and unbound cases of the benchmark dataset are expected to benefit the development and improvement of docking and scoring algorithms for the docking community. All the easy-to-view structures are freely available to the public at http://zoulab.dalton.missouri.edu/RNAbenchmark/.
Benchmarking; protein-RNA interactions; molecular docking; scoring function; molecular recognition
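One of the two difficulty criteria named above, the percentage of native contacts, can be sketched as follows. This is a simplified illustration that assumes one representative coordinate per residue or nucleotide and a hypothetical 5.0 Å contact cutoff. The benchmark's actual contact definition and its easy/medium/difficult thresholds are not stated in the text, so no classification is attempted here.

```python
import math


def interface_contacts(protein_coords, rna_coords, cutoff=5.0):
    """Return the set of (protein_index, rna_index) pairs closer than cutoff.

    Assumption: one 3D point per residue/nucleotide; cutoff is in angstroms
    and is an illustrative value, not the benchmark's definition.
    """
    return {
        (i, j)
        for i, p in enumerate(protein_coords)
        for j, r in enumerate(rna_coords)
        if math.dist(p, r) <= cutoff
    }


def fraction_native_contacts(native_contacts, model_contacts):
    """Fraction of the native interface contacts that are also present in the model."""
    if not native_contacts:
        raise ValueError("native complex has no interface contacts")
    return len(native_contacts & model_contacts) / len(native_contacts)


if __name__ == "__main__":
    protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    rna_bound = [(3.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
    rna_unbound = [(3.0, 0.0, 0.0), (30.0, 0.0, 0.0)]  # second nucleotide displaced
    native = interface_contacts(protein, rna_bound)
    model = interface_contacts(protein, rna_unbound)
    print(fraction_native_contacts(native, model))
```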
The transition of epithelial cells from their normal non-motile state to a motile one requires the coordinated action of a number of small GTPases. We have previously shown that epithelial cell migration is stimulated by the coordinated activation of Arf and Rac GTPases. This crosstalk depends upon the assembly of a multi-protein complex that contains the Arf-activating protein cytohesin 2/ARNO and the Rac activating protein Dock180. Two scaffolding proteins that bind directly to cytohesin 2 organize this complex.
We have now found that Rac activation in response to hepatocyte growth factor (HGF) requires cytohesin 2 and Dock180. GRASP/Tamalin is one of the scaffolds that builds the complex containing cytohesin 2 and Dock180. We determine here that the Ala/Pro rich region of GRASP directly interacts with the SH3 domain of Dock180. By binding to both cytohesin 2/ARNO and Dock180, GRASP bridges the guanine nucleotide exchange factors (GEFs) that activate Arf and Rac, thereby promoting Arf-to-Rac signaling. Furthermore, we find that knockdown of GRASP impairs HGF-stimulated Rac activation and HGF-stimulated epithelial migration.
GRASP binds directly both cytohesin 2 and Dock180 to coordinate their activities, and by doing so promotes crosstalk between Arf and Rac.
Cytohesin; GRASP; Tamalin; Dock180; Arf6 and Rac1
Motivation: An effective docking algorithm for antibody–protein antigen complex prediction is an important first step toward design of biologics and vaccines. We have recently developed a new class of knowledge-based interaction potentials called Decoys as the Reference State (DARS) and incorporated DARS into the docking program PIPER based on the fast Fourier transform correlation approach. Although PIPER was the best performer in the latest rounds of the CAPRI protein docking experiment, it is much less accurate for docking antibody–protein antigen pairs than other types of complexes, in spite of incorporating sequence-based information on the location of the paratope. Analysis of antibody–protein antigen complexes has revealed an inherent asymmetry within these interfaces. Specifically, phenylalanine, tryptophan and tyrosine residues highly populate the paratope of the antibody but not the epitope of the antigen.
Results: Since this asymmetry cannot be adequately modeled using a symmetric pairwise potential, we have removed the usual assumption of symmetry. Interaction statistics were extracted from antibody–protein complexes under the assumption that a particular atom on the antibody is different from the same atom on the antigen protein. The use of the new potential significantly improves the performance of docking for antibody–protein antigen complexes, even without any sequence information on the location of the paratope. We note that the asymmetric potential captures the effects of the multi-body interactions inherent to the complex environment in the antibody–protein antigen interface.
Availability: The method is implemented in the ClusPro protein docking server, available at http://cluspro.bu.edu.
Supplementary data are available at Bioinformatics online.
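The difference between a symmetric and an asymmetric pairwise potential comes down to how contact statistics are keyed. The sketch below shows only that bookkeeping step, using hypothetical residue-level type labels. It does not reproduce the DARS reference state or the PIPER implementation.

```python
from collections import Counter


def count_pairs(contacts, symmetric):
    """Count interface contacts given as (antibody_type, antigen_type) tuples.

    With symmetric=True the pair is treated as unordered, so (TYR, LYS) and
    (LYS, TYR) share one bin. With symmetric=False the antibody side and the
    antigen side are kept distinct.
    """
    counts = Counter()
    for antibody_type, antigen_type in contacts:
        if symmetric:
            key = tuple(sorted((antibody_type, antigen_type)))
        else:
            key = (antibody_type, antigen_type)
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical contacts: tyrosine is common on the paratope side only.
    observed = [("TYR", "LYS"), ("TYR", "LYS"), ("LYS", "TYR")]
    print(count_pairs(observed, symmetric=True))
    print(count_pairs(observed, symmetric=False))
```

In the asymmetric counts, a tyrosine on the antibody contacting a lysine on the antigen is recorded separately from the reverse arrangement, which is the distinction a symmetric table averages away.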
The functions of proteins are often realized through their mutual interactions. Determining a relative transformation for a pair of proteins and their conformations which form a stable complex, reproducible in nature, is known as docking. It is an important step in drug design, structure determination and understanding function and structure relationships. In this paper we extend our non-uniform fast Fourier transform docking algorithm to include an adaptive search phase (both translational and rotational) and thereby speed up its execution. We have also implemented a multithreaded version of the adaptive docking algorithm for even faster execution on multicore machines. We call this protein-protein docking code F2Dock (F2 = Fast Fourier). We have calibrated F2Dock based on an extensive experimental study on a list of benchmark complexes and conclude that F2Dock works very well in practice. Though all docking results reported in this paper use shape complementarity and Coulombic potential based scores only, F2Dock is structured to incorporate Lennard-Jones potential and re-ranking docking solutions based on desolvation energy.
Computational Structural Biology; Protein-Protein Interactions; Fast Fourier Methods; Algorithms; Docking; Redocking
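The core idea behind Fourier-based docking is that a correlation score over all translations can be obtained from a product in frequency space. The toy sketch below shows this in one dimension with a plain radix-2 FFT on a uniform grid. It is an assumption-laden illustration only: F2Dock itself uses a non-uniform FFT in three dimensions with rotational sampling, none of which is reproduced here.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalised); input length must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def correlate_fft(receptor, ligand):
    """Circular correlation c[t] = sum_i receptor[(i + t) % n] * ligand[i] via FFT."""
    n = len(receptor)
    product = [r * l.conjugate() for r, l in zip(fft(receptor), fft(ligand))]
    return [v.real / n for v in fft(product, inverse=True)]


def correlate_direct(receptor, ligand):
    """Same correlation computed by the O(n^2) definition, for comparison."""
    n = len(receptor)
    return [sum(receptor[(i + t) % n] * ligand[i] for i in range(n)) for t in range(n)]


if __name__ == "__main__":
    receptor_grid = [0, 0, 1, 1, 0, 0, 0, 0]  # toy occupancy of a binding groove
    ligand_grid = [1, 1, 0, 0, 0, 0, 0, 0]    # toy ligand shape
    scores = correlate_fft(receptor_grid, ligand_grid)
    best_shift = max(range(len(scores)), key=lambda t: scores[t])
    print(best_shift, [round(s) for s in scores])
```

The FFT route gives the same scores as the direct sum but costs O(n log n) per rotation instead of O(n^2), which is what makes exhaustive translational scans practical on 3D grids.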
Bacterial type I polyketide synthases assemble structurally diverse natural products of significant clinical value from simple metabolic building blocks. The synthesis of these compounds occurs in a processive fashion along a large multi-protein complex. Transfer of the acyl intermediate across inter-polypeptide junctions is mediated, at least in large part, by N- and C-terminal docking domains. We report here a comprehensive analysis of the binding affinity and selectivity for the complete set of discrete docking domain pairs in the pikromycin and erythromycin PKS systems. Despite disconnection from their parent module, each cognate pair of docking domains retained exquisite binding selectivity. Further insights were obtained by X-ray crystallographic analysis of the PikAIII/PikAIV docking domain interface. This new information revealed a series of key interacting residues that enabled development of a structural model for the recently proposed H2–T2 class of polypeptides involved in PKS intermodular molecular recognition.
Structural details of protein–protein interactions are invaluable for understanding and deciphering biological mechanisms. Computational docking methods aim to predict the structure of a protein–protein complex given the structures of its single components. Protein flexibility and the absence of robust scoring functions pose a great challenge in the docking field. Due to these difficulties most of the docking methods involve a two-tier approach: coarse global search for feasible orientations that treats proteins as rigid bodies, followed by an accurate refinement stage that aims to introduce flexibility into the process. The FireDock web server, presented here, is the first web server for flexible refinement and scoring of protein–protein docking solutions. It includes optimization of side-chain conformations and rigid-body orientation and allows a high-throughput refinement. The server provides a user-friendly interface and a 3D visualization of the results. A docking protocol consisting of a global search by PatchDock and a refinement by FireDock was extensively tested. The protocol was successful in refining and scoring docking solution candidates for cases taken from docking benchmarks. We provide an option for using this protocol by automatic redirection of PatchDock candidate solutions to the FireDock web server for refinement. The FireDock web server is available at http://bioinfo3d.cs.tau.ac.il/FireDock/.
The mammalian DOCK180 protein belongs to an evolutionarily conserved protein family, which together with ELMO proteins, is essential for activation of Rac GTPase-dependent biological processes. Here, we have analyzed the DOCK180-ELMO1 interaction and mapped direct interaction interfaces to the N-terminal 200 amino acids of DOCK180, and to the C-terminal 200 amino acids of ELMO1, comprising the ELMO1 PH domain. Structural and biochemical analysis of this PH domain reveals that it is incapable of phospholipid binding, but instead structurally resembles FERM domains. Moreover, the structure revealed an N-terminal amphipathic α-helix, and point mutants of invariant hydrophobic residues in this helix disrupt ELMO1-DOCK180 complex formation. A secondary interaction between ELMO1 and DOCK180 is conferred by the DOCK180 SH3 domain and proline-rich motifs at the ELMO1 C-terminus. Mutation of both DOCK180-interaction sites on ELMO1 is required to disrupt the DOCK180-ELMO1 complex. Significantly, although this does not affect DOCK180 GEF activity toward Rac in vivo, Rac signaling is impaired, implying additional roles for ELMO in mediating intracellular Rac signaling.
To determine the structures of protein-protein interactions, protein docking is a valuable tool that complements experimental methods to characterize protein complexes. While protein docking can often produce a near-native solution within a set of global docking predictions, there are sometimes predictions that require refinement to elucidate correct contacts and conformation. Previously, we developed the ZRANK algorithm to rerank initial docking predictions from ZDOCK, a docking program developed by our lab. In this study, we have applied the ZRANK algorithm toward refinement of protein docking models, in conjunction with the protein docking program RosettaDock. This was performed by reranking global docking predictions from ZDOCK, performing local side chain and rigid-body refinement using RosettaDock, and selecting the refined model based on ZRANK score. For comparison, we examined using RosettaDock score instead of ZRANK score, and a larger perturbation size for the RosettaDock search, and determined that the larger RosettaDock perturbation size with ZRANK scoring was optimal. This method was validated on a protein-protein docking benchmark. For refining docking benchmark predictions from the newest ZDOCK version, this led to improved structures of top-ranked hits in 20 of 27 cases, and an increase from 23 to 27 cases with hits in the top 20 predictions. Finally, we optimized the ZRANK energy function using refined models, which provides a significant improvement over the original ZRANK energy function. Using this optimized function and the refinement protocol, the numbers of cases with hits ranked at number one increased from 12 to 19 and from 7 to 15 for two different ZDOCK versions. This shows the effective combination of independently developed docking protocols (ZDOCK/ZRANK, and RosettaDock), indicating that using diverse search and scoring functions can improve protein docking results.
Peptide–protein interactions are among the most prevalent and important interactions in the cell, but a large fraction of those interactions lack detailed structural characterization. The Rosetta FlexPepDock web server (http://flexpepdock.furmanlab.cs.huji.ac.il/) provides an interface to a high-resolution peptide docking (refinement) protocol for the modeling of peptide–protein complexes, implemented within the Rosetta framework. Given a protein receptor structure and an approximate, possibly inaccurate model of the peptide within the receptor binding site, the FlexPepDock server refines the peptide to high resolution, allowing full flexibility to the peptide backbone and to all side chains. This protocol was extensively tested and benchmarked on a wide array of non-redundant peptide–protein complexes, and was proven effective when applied to peptide starting conformations within 5.5 Å backbone root mean square deviation from the native conformation. FlexPepDock has been applied to several systems that are mediated and regulated by peptide–protein interactions. This easy to use and general web server interface allows non-expert users to accurately model their specific peptide–protein interaction of interest.
Most life science processes involve, at the atomic scale, recognition between two molecules. The prediction of such interactions at the molecular level, by so-called docking software, is a non-trivial task. Docking programs have a wide range of applications ranging from protein engineering to drug design. This article presents SwissDock, a web server dedicated to the docking of small molecules on target proteins. It is based on the EADock DSS engine, combined with setup scripts for curating common problems and for preparing both the target protein and the ligand input files. An efficient Ajax/HTML interface was designed and implemented so that scientists can easily submit dockings and retrieve the predicted complexes. For automated docking tasks, a programmatic SOAP interface has been set up and template programs can be downloaded in Perl, Python and PHP. The web site also provides access to a database of manually curated complexes, based on the Ligand Protein Database. A wiki and a forum are available to the community to promote interactions between users. The SwissDock web site is available online at http://www.swissdock.ch. We believe it constitutes a step toward generalizing the use of docking tools beyond the traditional molecular modeling community.
DNA–protein interactions are involved in many essential biological
activities. Because there is no simple mapping code between DNA base pairs and
protein amino acids, the prediction of DNA–protein interactions is a
challenging problem. Here, we present a novel computational approach for
predicting DNA-binding protein residues and DNA–protein interaction
modes without knowing the protein's specific DNA target sequence. Given the structure of a
DNA-binding protein, the method first generates an ensemble of complex
structures obtained by rigid-body docking with a nonspecific canonical B-DNA.
Representative models are subsequently selected through clustering and ranking
by their DNA–protein interfacial energy. Analysis of these encounter
complex models suggests that the recognition sites for specific DNA binding are
usually favorable interaction sites for the nonspecific DNA probe and that
nonspecific DNA–protein interaction modes exhibit some similarity to
specific DNA–protein binding modes. Although the method requires as
input the knowledge that the protein binds DNA, in benchmark tests, it achieves
better performance in identifying DNA-binding sites than three previously
established methods, which are based on sophisticated machine-learning
techniques. We further apply our method to protein structures predicted through
modeling and demonstrate that our method performs satisfactorily on protein
models whose root-mean-square Cα deviation from their native structures is up
to 5 Å. This study provides valuable
structural insights into how a specific DNA-binding protein interacts with a
nonspecific DNA sequence. The similarity between the specific
DNA–protein interaction mode and nonspecific interaction modes may
reflect an important sampling step in search of its specific DNA targets by a
Many essential biological activities require interactions between DNA and
proteins. These proteins usually use certain amino acids, called DNA-binding
sites, to recognize their specific DNA targets. To facilitate the search of its
specific DNA targets, a DNA-binding protein often associates with nonspecific
DNA and then diffuses along the DNA. Due to the weak interactions between
nonspecific DNA and the protein, structural characterization of nonspecific
DNA–protein complexes is experimentally challenging. This paper
describes a computational modeling study on nonspecific DNA–protein
complexes and comparative analysis with respect to specific
DNA–protein complexes. The study found that the specific DNA-binding
sites on a protein are typically favorable for nonspecific DNA and that
nonspecific and specific DNA–protein interaction modes are quite
similar. This similarity may reflect an important sampling step in the search
for the specific DNA target sequence by a DNA-binding protein. On the basis of
these observations, a novel method was proposed for predicting DNA-binding sites
and binding modes of a DNA-binding protein without knowing its specific DNA
target sequence. Ultimately, the combination of this method and protein
structure prediction may lead the way to high throughput modeling of
Computational approaches to protein-protein docking typically include scoring aimed at improving the rank of the near-native structure relative to the false-positive matches. Knowledge-based potentials improve modeling of protein complexes by taking advantage of the rapidly increasing amount of experimentally derived information on protein-protein association. An essential element of knowledge-based potentials is defining the reference state for an optimal description of the residue-residue (or atom-atom) pairs in the non-interaction state.
The study presents a new Distance- and Environment-dependent, Coarse-grained, Knowledge-based (DECK) potential for scoring of protein-protein docking predictions. Training sets of protein-protein matches were generated based on bound and unbound forms of proteins taken from the DOCKGROUND resource. Each residue was represented by a pseudo-atom in the geometric center of the side chain. To capture the long-range and the multi-body interactions, residues in different secondary structure elements at protein-protein interfaces were considered as different residue types. Five reference states for the potentials were defined and tested. The optimal reference state was selected and the cutoff effect on the distance-dependent potentials investigated. The potentials were validated on the docking decoys sets, showing better performance than the existing potentials used in scoring of protein-protein docking results.
A novel residue-based statistical potential for protein-protein docking was developed and validated on docking decoy sets. The results show that the scoring function DECK can successfully identify near-native protein-protein matches and thus is useful in protein docking. In addition to the practical application of the potentials, the study provides insights into the relative utility of the reference states, the scope of the distance dependence, and the coarse-graining of the potentials.
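Two ingredients described above, the side-chain pseudo-atom and the conversion of pair statistics into an energy, can be sketched generically. The inverse-Boltzmann form used below, and the pseudocount that guards against empty bins, are standard assumptions for knowledge-based potentials. The sketch does not encode any of the five reference states tested in the study, nor the DECK distance bins or residue types.

```python
import math


def geometric_center(side_chain_atoms):
    """Pseudo-atom position: the unweighted mean of the side-chain atom coordinates."""
    if not side_chain_atoms:
        raise ValueError("side chain has no atoms")
    n = len(side_chain_atoms)
    return tuple(sum(atom[axis] for atom in side_chain_atoms) / n for axis in range(3))


def pair_energy(n_observed, n_reference, pseudocount=1e-6):
    """Inverse-Boltzmann energy in kT units: -ln(observed / reference).

    n_reference is the count expected under the chosen reference state
    (an input here; how it is derived is outside this sketch).
    """
    return -math.log((n_observed + pseudocount) / (n_reference + pseudocount))


if __name__ == "__main__":
    # Hypothetical side chain with two atoms.
    print(geometric_center([(0.0, 0.0, 0.0), (2.0, 2.0, 2.0)]))
    # A pair seen twice as often as the reference expects gets a negative energy.
    print(pair_energy(20, 10))
```

The choice of `n_reference` is exactly where reference states differ, which is why the same observed counts can yield quite different potentials.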
Low-affinity ligands can be efficiently optimized into high-affinity drug leads by structure based drug design when atomic-resolution structural information on the protein/ligand complexes is available. In this work we show that the use of a few, easily obtainable, experimental restraints improves the accuracy of the docking experiments by two orders of magnitude. The experimental data are measured in nuclear magnetic resonance spectra and consist of protein-mediated NOEs between two competitively binding ligands. The methodology can be widely applied as the data are readily obtained for low-affinity ligands in the presence of non-labelled receptor at low concentration. The experimental inter-ligand NOEs are efficiently used to filter and rank complex model structures that have been pre-selected by docking protocols. This approach dramatically reduces the degeneracy and inaccuracy of the chosen model in docking experiments, is robust with respect to inaccuracy of the structural model used to represent the free receptor and is suitable for high-throughput docking campaigns.
Electronic supplementary material
The online version of this article (doi:10.1007/s10858-011-9590-5) contains supplementary material, which is available to authorized users.
NMR; INPHARMA; NOE; Docking; Drug design
A good scoring function is essential for molecular docking computations. In conventional scoring functions, energy terms modeling pairwise interactions are cumulatively summed, and the best docking solution is selected. Here, we propose to transform protein-ligand interactions into three-dimensional geometric networks, from which recurring network substructures, or network motifs, are selected and used to provide probability-ranked interaction templates with which to score docking solutions.
A novel scoring function for protein-ligand docking, MotifScore, was developed. It is non-energy-based, and docking is, instead, scored by counting the occurrences of motifs of protein-ligand interaction networks constructed using structures of protein-ligand complexes. MotifScore has been tested on a benchmark set established by others to assess its ability to identify near-native complex conformations among a set of decoys. In this benchmark test, 84% of the highest-scored docking conformations had root-mean-square deviations (rmsds) below 2.0 Å from the native conformation, which is comparable with the best of several energy-based docking scoring functions. Many of the top motifs, which comprise a multitude of chemical groups that interact simultaneously and make a highly significant contribution to MotifScore, capture recurrent interacting patterns beyond pairwise interactions.
While its docking performance is good, MotifScore differs substantially from conventional energy-based functions. It thus represents a new, network-based approach for exploring problems associated with molecular docking.
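The idea of scoring a pose by counting network motifs rather than summing pairwise energies can be sketched as follows. This is a deliberately reduced illustration and not the MotifScore algorithm: it assumes that a motif is a triangle of mutually close atoms spanning protein and ligand, that atoms carry a simple type label, and that motif weights are supplied in a hypothetical `weights` table.

```python
import math
from collections import Counter
from itertools import combinations

def triangle_motifs(atoms, cutoff=4.0):
    """Count triangles of mutually close atoms that span protein and ligand.

    atoms: list of (side, label, (x, y, z)) with side "P" (protein) or
    "L" (ligand). The triangle motif and the 4.0 A cutoff are assumptions
    of this sketch.
    """
    counts = Counter()
    for trio in combinations(atoms, 3):
        # Keep only trios that contain both protein and ligand atoms.
        if {a[0] for a in trio} != {"P", "L"}:
            continue
        if all(math.dist(a[2], b[2]) <= cutoff
               for a, b in combinations(trio, 2)):
            motif = tuple(sorted((a[0], a[1]) for a in trio))
            counts[motif] += 1
    return counts

def motif_score(atoms, weights, cutoff=4.0):
    """Score a pose as the weighted number of motif occurrences."""
    counts = triangle_motifs(atoms, cutoff)
    return sum(weights.get(m, 0.0) * n for m, n in counts.items())

pose = [
    ("P", "N", (0.0, 0.0, 0.0)),
    ("P", "O", (3.0, 0.0, 0.0)),
    ("L", "O", (1.5, 2.0, 0.0)),
    ("L", "C", (20.0, 0.0, 0.0)),  # too far away to join any motif
]
motif = (("L", "O"), ("P", "N"), ("P", "O"))
pose_score = motif_score(pose, {motif: 2.5})
```

Because each motif involves three groups at once, a score of this kind can reward interaction patterns that a sum over independent pairs would not distinguish.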
Docking to the plasma membrane prepares vesicles for rapid release. Here, we describe a mechanism for dense core vesicle docking in neurons. In Caenorhabditis elegans motor neurons, dense core vesicles dock at the plasma membrane but are excluded from active zones at synapses. We have found that the calcium-activated protein for secretion (CAPS) protein is required for dense core vesicle docking but not synaptic vesicle docking. In contrast, we see that UNC-13, a docking factor for synaptic vesicles, is not essential for dense core vesicle docking. Both the CAPS and UNC-13 docking pathways converge on syntaxin, a component of the SNARE (soluble N-ethyl-maleimide–sensitive fusion protein attachment receptor) complex. Overexpression of open syntaxin can bypass the requirement for CAPS in dense core vesicle docking. Thus, CAPS likely promotes the open state of syntaxin, which then docks dense core vesicles. CAPS function in dense core vesicle docking parallels UNC-13 in synaptic vesicle docking, which suggests that these related proteins act similarly to promote docking of independent vesicle populations.
Repositioning existing drugs for new therapeutic uses is an efficient approach to drug discovery. We have developed a computational drug repositioning pipeline to perform large-scale molecular docking of small molecule drugs against protein drug targets, in order to map the drug-target interaction space and find novel interactions. Our method emphasizes removing false positive interaction predictions using criteria from known interaction docking, consensus scoring, and specificity. In all, our database contains 252 human protein drug targets that we classify as reliable-for-docking as well as 4621 approved and experimental small molecule drugs from DrugBank. These were cross-docked, then filtered through stringent scoring criteria to select top drug-target interactions. In particular, we used MAPK14 and the kinase inhibitor BIM-8 as examples where our stringent thresholds enriched the predicted drug-target interactions with known interactions up to 20 times compared to standard score thresholds. We validated nilotinib as a potent MAPK14 inhibitor in vitro (IC50 40 nM), suggesting a potential use for this drug in treating inflammatory diseases. The published literature indicated experimental evidence for 31 of the top predicted interactions, highlighting the promising nature of our approach. Novel interactions discovered in this way may allow a drug to be repositioned as a treatment for the disease associated with its off-target, and may add insight into the drug's mechanism of action and its side effects.
Most drugs are designed to bind to and inhibit the function of a disease target protein. However, drugs are often able to bind to ‘off-target’ proteins due to similarities in the protein binding sites. If an off-target is known to be involved in another disease, then the drug has potential to treat the second disease. This repositioning strategy is an alternative and efficient approach to drug discovery, as the clinical and toxicity histories of existing drugs can greatly reduce drug development cost and time. We present here a large-scale computational approach that simulates three-dimensional binding between existing drugs and target proteins to predict novel drug-target interactions. Our method focuses on removing false predictions, using annotated ‘known’ interactions together with scoring and ranking thresholds. Thirty-one of our top novel drug-target predictions were validated through a literature search, demonstrating the utility of our method. We were also able to identify the cancer drug nilotinib as a potent inhibitor of MAPK14, a target in inflammatory diseases, which suggests a potential use for the drug in treating rheumatoid arthritis.
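The filtering step described above, in which cross-docking results are passed through scoring and ranking thresholds, might be sketched as below. The data layout, the two-score consensus rule, the per-target cut-offs, and the rank limit are all assumptions introduced for illustration; the actual criteria and threshold values of the pipeline are not given in this text.

```python
def select_hits(scores, thresholds, max_rank=3):
    """Keep drug-target pairs that pass consensus and rank filters.

    scores: {(drug, target): (score_a, score_b)} from two hypothetical
        scoring functions, where lower values mean better predicted binding.
    thresholds: {target: (cut_a, cut_b)}, e.g. derived from docking known
        binders of that target (an assumption of this sketch).
    """
    # Collect the first score of every drug per target, for ranking.
    by_target = {}
    for (drug, target), (a, _b) in scores.items():
        by_target.setdefault(target, []).append((a, drug))

    hits = []
    for (drug, target), (a, b) in scores.items():
        cut_a, cut_b = thresholds[target]
        # Consensus: both scoring functions must beat their cut-off.
        consensus = a <= cut_a and b <= cut_b
        rank = sorted(by_target[target]).index((a, drug)) + 1
        if consensus and rank <= max_rank:
            hits.append((drug, target))
    return sorted(hits)

scores = {
    ("d1", "t1"): (-10.0, -9.0),
    ("d2", "t1"): (-10.5, -3.0),  # fails the second scoring function
    ("d3", "t1"): (-4.0, -8.0),   # fails the first scoring function
}
hits = select_hits(scores, {"t1": (-8.0, -7.0)}, max_rank=2)
```

Requiring agreement between independent scoring functions trades recall for precision, which matches the stated emphasis on removing false positive predictions.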
Protein-RNA interactions play fundamental roles in many biological processes. Understanding the molecular mechanism of protein-RNA recognition and formation of protein-RNA complexes is a major challenge in structural biology. Unfortunately, the experimental determination of protein-RNA complexes is tedious and difficult, both by X-ray crystallography and NMR. For many interacting proteins and RNAs the individual structures are available, enabling prediction of complex structures by computational docking. However, methods for protein-RNA docking remain scarce, in particular in comparison to the numerous methods for protein-protein docking.
We developed two medium-resolution, knowledge-based potentials for scoring protein-RNA models obtained by docking: the quasi-chemical potential (QUASI-RNP) and the Decoys As the Reference State potential (DARS-RNP). Both potentials use a coarse-grained representation for both RNA and protein molecules and are capable of dealing with RNA structures with posttranscriptionally modified residues. We compared the discriminative power of DARS-RNP and QUASI-RNP for selecting rigid-body docking poses with the potentials previously developed by the Varani and Fernandez groups.
In both bound and unbound docking tests, DARS-RNP showed the highest ability to identify native-like structures. Python implementations of DARS-RNP and QUASI-RNP are freely available for download at http://iimcb.genesilico.pl/RNP/
RNA; protein; RNP; macromolecular docking; complex modeling; structural bioinformatics
Is it possible to identify the best solution of a docking program? The usual answer is the highest-scoring solution, but interactions between proteins are dynamic processes, and the interaction regions are often wide enough to permit protein-protein interactions with different orientations and/or interaction energies. In some cases, as in a multimeric protein complex, several interaction regions are possible among the monomers. These dynamic processes involve interactions with surface displacements between the proteins to finally achieve the functional configuration of the protein complex. Consequently, there is no single, static solution for the interaction between proteins; rather, there are several important configurations that also have to be analyzed.
To extract those representative solutions from the docking output datafile, we have developed an unsupervised and automatic clustering application named DockAnalyse. This application is based on the existing DBSCAN clustering method, which searches for continuities among the clusters generated by the docking output data representation. The DBSCAN clustering method is very robust and, moreover, solves some of the inconsistency problems of classical clustering methods, such as the treatment of outliers and the dependence on a predefined number of clusters.
DockAnalyse eases the interpretation of docking solutions through graphical and visual representations, guiding the user to the representative solutions. We have applied our new approach to analyze several protein interactions and model the dynamic protein interaction behavior of a protein complex. DockAnalyse might also be used to describe interaction regions between proteins and, therefore, guide future flexible dockings. The application (implemented in the R package) is accessible.
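To make the clustering step concrete, the following is a minimal pure-Python sketch of density-based clustering in the style of DBSCAN. It is not the DockAnalyse code, which is described as an R application. It assumes that each docking solution has been reduced to a coordinate tuple (for example, a ligand centre position), and the parameter names `eps` and `min_pts` follow the usual DBSCAN convention.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels

# Hypothetical 2D representation of seven docking solutions.
solutions = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(solutions, eps=1.5, min_pts=3)
```

The number of clusters is not specified in advance, and isolated solutions are labelled as noise rather than forced into a cluster, which corresponds to the two advantages named above. A representative solution per cluster could then be chosen, for instance, as the member with the best docking score.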