G protein-coupled receptors (GPCRs) represent a large family of signaling proteins that includes many therapeutic targets; however, progress in identifying new small molecule drugs has been disappointing. The past four years have seen remarkable progress in the structural biology of GPCRs, raising the possibility of applying structure-based approaches to GPCR drug discovery efforts. Of the various structure-based approaches that have been applied to soluble protein targets, such as proteases and kinases, in silico docking is among the most readily applicable to GPCRs. Early studies suggest that GPCR binding pockets are well suited to docking, and docking screens have identified potent and novel compounds for these targets. This review will focus on the current state of in silico docking for GPCRs.
Predicting absolute protein-ligand binding affinities remains a frontier challenge in ligand discovery and design. This becomes more difficult when ionic interactions are involved, because of the large opposing solvation and electrostatic attraction energies. In a blind test, we examined whether alchemical free energy calculations could predict binding affinities of 14 charged and 5 neutral compounds previously untested as ligands for a cavity binding site in Cytochrome C Peroxidase. In this simplified site, polar and cationic ligands compete with solvent to interact with a buried aspartate. Predictions were tested by calorimetry, spectroscopy, and crystallography. Of the 15 compounds predicted to bind, 13 were experimentally confirmed, while four compounds were false negative predictions. Predictions had an RMSE of 1.95 kcal/mol to the experimental affinities, and predicted poses had an average RMSD of 1.7 Å to the crystallographic poses. This test serves as a benchmark for these thermodynamically rigorous calculations at predicting binding affinities for charged compounds, and gives insights into the existing sources of error, which are primarily electrostatic interactions inside proteins. Our experiments also provide a useful set of ionic binding affinities in a simplified system for testing new affinity prediction methods.
free energy calculations; electrostatics; ligand binding; molecular dynamics
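The error metric quoted in the abstract above (an RMSE in kcal/mol between predicted and experimental affinities) can be illustrated with a minimal sketch. The function names, the temperature default, and the example values in the usage note are illustrative assumptions and are not taken from the study; the conversion uses the standard relation ΔG° = RT ln(Kd) with a 1 M standard state.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872  # gas constant in kcal/(mol*K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)

def rmse(predicted, experimental):
    """Root-mean-square error between two equal-length lists of affinities."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))
```

For example, `rmse([-5.0, -6.0], [-4.0, -7.0])` compares two hypothetical predictions against two hypothetical measurements, each off by 1 kcal/mol.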
Drug efficacy does not always increase
sigmoidally with concentration,
which has puzzled the community for decades. Unlike standard sigmoidal
curves, bell-shaped concentration–response curves suggest more
complex biological effects, such as multiple-binding sites or multiple
targets. Here, we investigate a physical property-based mechanism
for bell-shaped curves. Beginning with the observation that some drugs
form colloidal aggregates at relevant concentrations, we determined
concentration–response curves for three aggregating anticancer
drugs, formulated both as colloids and as free monomer. Colloidal
formulations exhibited bell-shaped curves, losing activity at higher
concentrations, while monomeric formulations gave typical sigmoidal
curves, sustaining a plateau of maximum activity. Inverting the question,
we next asked if molecules with bell-shaped curves, reported in the
literature, form colloidal aggregates at relevant concentrations.
We selected 12 molecules reported to have bell-shaped concentration–response
curves and found that five of these formed colloids. To understand
the mechanism behind the loss of activity at concentrations where
colloids are present, we investigated the diffusion of colloid-forming
dye Evans blue into cells. We found that colloidal species are excluded
from cells, which may explain the mechanism behind toxicological screens
that use Evans blue, Trypan blue, and related dyes.
Proteins fluctuate between alternative conformations, which presents a challenge for ligand discovery because such flexibility is difficult to treat computationally owing to problems with conformational sampling and energy weighting. Here, we describe a flexible-docking method that samples and weights protein conformations using experimentally-derived conformations as a guide. The crystallographically refined occupancies of these conformations, which are observable in an apo receptor structure, define energy penalties for docking. In a large prospective library screen, we identified new ligands that target specific receptor conformations of a cavity in Cytochrome c Peroxidase, and we confirm both ligand pose and associated receptor conformation predictions by crystallography. The inclusion of receptor flexibility led to ligands with new chemotypes and physical properties. By exploiting experimental measures of loop and side chain flexibility, this method can be extended to the discovery of new ligands for hundreds of targets in the Protein Data Bank where similar experimental information is available.
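One hedged way to picture how refined occupancies could define docking energy penalties is a Boltzmann inversion, E = -kT ln(occupancy). The abstract does not state the functional form or any scaling factor, so the relation, the function names, and the kT value below are assumptions made for illustration only.

```python
import math

KT_KCAL_PER_MOL = 0.593  # approximate kT near 298 K, in kcal/mol (assumed)

def occupancy_penalty(occupancy, kt=KT_KCAL_PER_MOL):
    """Hypothetical penalty (kcal/mol) for docking to a conformation of given occupancy."""
    if not 0.0 < occupancy <= 1.0:
        raise ValueError("occupancy must be in (0, 1]")
    return -kt * math.log(occupancy)

def relative_penalties(occupancies):
    """Penalties relative to the most populated conformation (which gets zero)."""
    raw = {name: occupancy_penalty(occ) for name, occ in occupancies.items()}
    lowest = min(raw.values())
    return {name: value - lowest for name, value in raw.items()}
```

Under this assumption, a conformation observed at low occupancy carries a larger penalty than the dominant one, so a ligand must gain more interaction energy to be ranked well against it.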
A substantial challenge for genomic enzymology is the reliable annotation for proteins of unknown function. Described here is an interrogation of uncharacterized enzymes from the amidohydrolase superfamily using a structure-guided approach that integrates bioinformatics, computational biology and molecular enzymology. Previously, Tm0936 from Thermotoga maritima was shown to catalyze the deamination of S-adenosylhomocysteine (SAH) to S-inosylhomocysteine (SIH). Homologues of Tm0936 were identified, and substrate profiles were proposed by docking metabolites to modeled enzyme structures. These enzymes were predicted to deaminate analogues of adenosine including SAH, 5’-methylthioadenosine (MTA), adenosine (Ado), and 5’-deoxyadenosine (5’-dAdo). Fifteen of these proteins were purified to homogeneity and the three-dimensional structures of three proteins were determined by X-ray diffraction methods. Enzyme assays supported the structure-based predictions and identified subgroups of enzymes with the capacity to deaminate various combinations of the adenosine analogues, including the first enzyme (Dvu1825) capable of deaminating 5’-dAdo. One subgroup of proteins, exemplified by Moth1224 from Moorella thermoacetica, deaminates guanine to xanthine and another subgroup, exemplified by Avi5431 from Agrobacterium vitis S4, deaminates two oxidatively damaged forms of adenine: 2-oxoadenine and 8-oxoadenine. The sequence and structural basis of the observed substrate specificities was proposed and the substrate profiles for 834 protein sequences were provisionally annotated. The results highlight the power of a multidisciplinary approach for annotating enzymes of unknown function.
Proteins of unknown function belonging to cog1816 and cog0402 were characterized. Sav2595 from Streptomyces avermitilis MA-4680, Acel0264 from Acidothermus cellulolyticus 11B, Nis0429 from Nitratiruptor sp. SB155-2 and Dr0824 from Deinococcus radiodurans R1 were cloned, purified, and their substrate profiles determined. These enzymes were previously incorrectly annotated as adenosine deaminases or chlorohydrolases. It was shown here that these enzymes actually deaminate 6-aminodeoxyfutalosine. The deamination of 6-aminodeoxyfutalosine is part of an alternative menaquinone biosynthetic pathway that involves the formation of futalosine. 6-Aminodeoxyfutalosine is deaminated by these enzymes with catalytic efficiencies greater than 10⁵ M⁻¹ s⁻¹, Km values of 0.9 to 6.0 μM and kcat values of 1.2 to 8.6 s⁻¹. Adenosine, 2′-deoxyadenosine, thiomethyladenosine, and S-adenosylhomocysteine are deaminated at least an order of magnitude slower than 6-aminodeoxyfutalosine. The crystal structure of Nis0429 was determined and the substrate, 6-aminodeoxyfutalosine, was positioned in the active site, based on the presence of adventitiously bound benzoic acid. In this model Ser-145 interacts with the carboxylate moiety of the substrate. The structure of Dr0824 was also determined, but a collapsed active site pocket prevented docking of substrates. A computational model of Sav2595 was built based on the crystal structure of adenosine deaminase and substrates were docked. The model predicted a conserved arginine after β-strand 1 to be partially responsible for the substrate specificity of Sav2595.
Fragment screens have successfully identified new scaffolds in drug discovery, often with relatively high hit rates (5%) using small screening libraries (1,000–10,000 compounds). This raises two questions: would other noteworthy chemotypes be found were one to screen all commercially available fragments (> 300,000), and does the success rate imply low specificity of fragments? We used molecular docking to screen large libraries of fragments against CTX-M β-lactamase. We identified ten millimolar-range inhibitors from the 69 compounds tested. The docking poses corresponded closely to the crystallographic structures subsequently determined. Notably, these initial low-affinity hits showed little specificity between CTX-M and an unrelated β-lactamase, AmpC, which is unusual among β-lactamase inhibitors. This is consistent with the idea that the high hit rates among fragments correlate to a low initial specificity. As the inhibitors were progressed, both specificity and affinity rose together, yielding to our knowledge the first micromolar-range noncovalent inhibitors against a class A β-lactamase.
Of the over 22 million protein sequences in the nonredundant TrEMBL database, fewer than 1% have experimentally confirmed functions. Structure-based methods have been used to predict enzyme activities from experimentally determined structures; however, for the vast majority of proteins, no such structures are available. Here, homology models of a functionally uncharacterized amidohydrolase from Agrobacterium radiobacter K84 (Arad3529) were computed based on a remote template structure. The protein backbone of two loops near the active site was remodeled, resulting in four distinct active site conformations. Substrates of Arad3529 were predicted by docking of 57672 high-energy intermediate (HEI) forms of 6440 metabolites against these four homology models. Based on docking ranks and geometries, a set of modified pterins were suggested as candidate substrates for Arad3529. The predictions were tested by enzymology experiments, and Arad3529 deaminated many pterin metabolites (substrate, kcat/Km [M⁻¹ s⁻¹]): formylpterin, 5.2 × 10⁶; pterin-6-carboxylate, 4.0 × 10⁶; pterin-7-carboxylate, 3.7 × 10⁶; pterin, 3.3 × 10⁶; hydroxymethylpterin, 1.2 × 10⁶; biopterin, 1.0 × 10⁶; D-(+)-neopterin, 3.1 × 10⁵; isoxanthopterin, 2.8 × 10⁵; sepiapterin, 1.3 × 10⁵; folate, 1.3 × 10⁵; xanthopterin, 1.17 × 10⁵; 7,8-dihydrohydroxymethylpterin, 3.3 × 10⁴. While pterin is a ubiquitous oxidative product of folate degradation, genomic analysis suggests that the first step of an undescribed pterin degradation pathway is catalyzed by Arad3529. Homology model-based virtual screening, especially with modeling of protein backbone flexibility, may be broadly useful for enzyme function annotation and discovering new pathways and drug targets.
The substrate specificities of two incorrectly annotated enzymes belonging to cog3964 from the amidohydrolase superfamily (AHS) were determined. This group of enzymes is currently misannotated as either dihydroorotase or adenine deaminase. Atu3266 from Agrobacterium tumefaciens C58 and Oant2987 from Ochrobactrum anthropi ATCC 49188 were determined to catalyze the hydrolysis of acetyl-R-mandelate and similar esters with values of kcat/Km that exceed 10⁵ M⁻¹ s⁻¹. These enzymes do not catalyze the deamination of adenine or the hydrolysis of dihydroorotate. Atu3266 was crystallized and the structure determined to a resolution of 2.62 Å. The protein folds as a distorted (β/α)₈-barrel and binds two zincs in the active site. The substrate profile was determined via a combination of computational docking to the three-dimensional structure of Atu3266 and screening of a highly focused library of potential substrates. The initial weak hit was the hydrolysis of N-acetyl-D-serine (kcat/Km = 4 M⁻¹ s⁻¹). This was followed by the progressive identification of acetyl-R-glycerate (kcat/Km = 4 × 10² M⁻¹ s⁻¹), acetyl glycolate (kcat/Km = 1.3 × 10⁴ M⁻¹ s⁻¹) and ultimately acetyl-R-mandelate (kcat/Km = 2.8 × 10⁵ M⁻¹ s⁻¹).
To compare virtual and high-throughput screening in an unbiased way, 50,000 compounds were docked into the 3-dimensional structure of dihydrofolate reductase prospectively, and the results were compared to a subsequent experimental screening of the same library. Undertaking these calculations demanded careful database curation and control calculations with annotated inhibitors. These ultimately led to a ranked list of more likely and less likely inhibitors and to the prediction that relatively few inhibitors would be found in the empirical screen. The latter prediction turned out to be correct, with arguably no validated inhibitors found experimentally. Subsequent retesting of high-scoring docked molecules may have found 2 true inhibitors, although this remains uncertain due to experimental ambiguities. The implications of this study for screening campaigns are considered. (Journal of Biomolecular Screening 2005:667-674)
high-throughput screening; HTS; virtual screening; molecular docking; database preparation
This paper takes advantage of similarities between the C. elegans and human pharmacopeia to identify and validate pharmacological targets that regulate C. elegans feeding rates.
Phenotypic screens can identify molecules that are at once penetrant and active on the integrated circuitry of a whole cell or organism. These advantages are offset by the need to identify the targets underlying the phenotypes. Additionally, logistical considerations limit screening for certain physiological and behavioral phenotypes to organisms such as zebrafish and C. elegans. This further raises the challenge of elucidating whether compound-target relationships found in model organisms are preserved in humans. To address these challenges we searched for compounds that affect feeding behavior in C. elegans and sought to identify their molecular mechanisms of action. Here, we applied predictive chemoinformatics to small molecules previously identified in a C. elegans phenotypic screen likely to be enriched for feeding regulatory compounds. Based on the predictions, 16 of these compounds were tested in vitro against 20 mammalian targets. Of these, nine were active, with affinities ranging from 9 nM to 10 µM. Four of these nine compounds were found to alter feeding. We then verified the in vitro findings in vivo through genetic knockdowns, the use of previously characterized compounds with high affinity for the four targets, and chemical genetic epistasis, which is the effect of combined chemical and genetic perturbations on a phenotype relative to that of each perturbation in isolation. Our findings reveal four previously unrecognized pathways that regulate feeding in C. elegans with strong parallels in mammals. Together, our study addresses three inherent challenges in phenotypic screening: the identification of the molecular targets from a phenotypic screen, the confirmation of the in vivo relevance of these targets, and the evolutionary conservation and relevance of these targets to their human orthologs.
Many beneficial pharmacological interventions were first discovered by observing the effects of perturbation of intact biological systems by small organic molecules without a priori knowledge of their targets. This forward pharmacological approach has the advantage of directly identifying new pharmacological agents that are active on complex biological processes. However, because of experimental feasibility, systematic application of this approach is generally limited to small animals such as the roundworm C. elegans and zebrafish, raising the question of whether use of these animals could identify compounds that act on orthologous mammalian targets. A significant challenge in addressing this question is the determination of the molecular identities of the compounds' targets responsible for the desired phenotypic outcomes. Here we describe a computational approach for target identification based on structural similarities of newly identified compounds to known ligand interactions with mostly mammalian targets. For several of the compounds emerging from a C. elegans phenotypic screen, we predict and confirm mammalian targets using in vitro binding assays. Using genetic and pharmacological assays, we then demonstrate that a subset of these compounds alter C. elegans feeding rates through the C. elegans counterparts of the predicted mammalian targets.
Molecular docking remains an important tool for structure-based screening to find new ligands and chemical probes. As docking ambitions grow to include new scoring function terms, and to address ever more targets, the reliability and extendability of the orientation sampling, and the throughput of the method, become pressing. Here we explore sampling techniques that eliminate stochastic behavior in DOCK3.6, allowing us to optimize the method for regularly variable sampling of orientations. This also enabled a focused effort to optimize the code for efficiency, with a three-fold increase in the speed of the program. This, in turn, facilitated extensive testing of the method on the 102 targets, 22,805 ligands and 1,411,214 decoys of the Directory of Useful Decoys - Enhanced (DUD-E) benchmarking set, at multiple levels of sampling. Encouragingly, we observe that as sampling increases from 50 to 500 to 2000 to 5000 to 20000 molecular orientations in the binding site (and so from about 1×10¹⁰ to 4×10¹⁰ to 1×10¹¹ to 2×10¹¹ to 5×10¹¹ mean atoms scored per target, since multiple conformations are sampled per orientation), the enrichment of ligands over decoys monotonically increases for most DUD-E targets. Meanwhile, including internal electrostatics in the evaluation of ligand conformational energies, and restricting aromatic hydroxyls to low energy rotamers, further improved enrichment values. Several of the strategies used here to improve the efficiency of the code are broadly applicable in the field.
A docking screen identified reversible, non-covalent inhibitors (e.g. 1) of the parasite cysteine protease cruzain. Chemical optimization of 1 led to a series of oxadiazoles possessing interpretable SAR and potencies as much as 500-fold greater than 1. Detailed investigation of the SAR series subsequently revealed that many members of the oxadiazole class (and surprisingly also 1) act via divergent modes of inhibition – competitive or via colloidal aggregation – depending on the assay conditions employed.
Protein classification typically uses structural, sequence, or functional similarity. Here we introduce an orthogonal method that organizes proteins by ligand similarity, focusing here on the class A G protein-coupled receptor (GPCR) protein family. Comparing a ligand-based dendrogram to a sequence-based one, we sought examples of GPCRs that were distantly linked by sequence but neighbors by ligand similarity. Experimental testing of compounds predicted to link three of these new pairs confirmed the predicted association, with potencies ranging from the low-nanomolar to low-micromolar. We then identified hundreds of non-GPCRs closely related to GPCRs by ligand similarity, including the CXCR2 chemokine receptor to Casein kinase I, the cannabinoid receptors to epoxide hydrolase 2, and the α2 adrenergic receptor to phospholipase D. These, too, were confirmed experimentally. Ligand similarities among these targets may reflect a chemical integration in the time domain of molecular signaling.
A key challenge in structure-based discovery is accounting for modulation of protein-ligand interactions by ordered and bulk solvent. To investigate this, we compared ligand binding to a buried cavity in Cytochrome c Peroxidase (CcP), where affinity is dominated by a single ionic interaction, versus a cavity variant partly opened to solvent by loop deletion. This opening had unexpected effects on ligand orientation, affinity, and ordered water structure. Some ligands lost over ten-fold in affinity and reoriented in the cavity, while others retained their geometries, formed new interactions with water networks, and improved affinity. To test our ability to discover new ligands against this opened site prospectively, a 534,000 fragment library was docked against the open cavity using two models of ligand solvation. Using an older solvation model that prioritized many neutral molecules, three such uncharged docking hits were tested, none of which was observed to bind; these molecules were not highly ranked by the new, context-dependent solvation score. Using this new method, another 15 highly-ranked molecules were tested for binding. In contrast to the previous result, 14 of these bound detectably, with affinities ranging from 8 µM to 2 mM. In crystal structures, four of these new ligands superposed well with the docking predictions but two did not, reflecting unanticipated interactions with newly ordered water molecules. Comparing recognition between this open cavity and its buried analog begins to isolate the roles of ordered solvent in a system that lends itself readily to prospective testing and that may be broadly useful to the community.
Penicillin-binding protein 6 (PBP6) is one of the two main dd-carboxypeptidases in Escherichia coli, which are implicated in maturation of bacterial cell wall and formation of cell shape. Here, we report the first X-ray crystal structures of PBP6, capturing its apo state (2.1 Å), an acyl-enzyme intermediate with the antibiotic ampicillin (1.8 Å), and for the first time for a PBP, a preacylation complex (a “Michaelis complex”, determined at 1.8 Å) with a peptidoglycan substrate fragment containing the full pentapeptide, NAM-(l-Ala-d-isoGlu-l-Lys-d-Ala-d-Ala). These structures illuminate the molecular interactions essential for ligand recognition and catalysis by dd-carboxypeptidases, and suggest a coupling of conformational flexibility of active site loops to the reaction coordinate. The substrate fragment complex structure, in particular, provides templates for models of cell wall recognition by PBPs, as well as substantiating evidence for the molecular mimicry by β-lactam antibiotics of the peptidoglycan acyl-d-Ala-d-Ala moiety.
A large library virtual screen against an activated
β2-adrenergic receptor (β2AR) structure returned potent
agonists to the exclusion of inverse-agonists, providing the first
complement to the previous virtual screening campaigns against inverse-agonist-bound
G protein coupled receptor (GPCR) structures, which predicted only
inverse-agonists. In addition, two hits recapitulated the signaling
profile of the co-crystal ligand with respect to the G protein and
arrestin mediated signaling. This functional fidelity has important
implications in drug design, as the ability to predict ligands with
predefined signaling properties is highly desirable. However, the
agonist-bound state provides an uncertain template for modeling the
activated conformation of other GPCRs, as a dopamine D2 receptor (DRD2)
activated model templated on the activated β2AR structure returned
few hits of only marginal potency.
Normal cilia length and motility are critical for proper cellular function. Prior studies of the regulation of ciliary structure and length have primarily focused on the intraflagellar transport machinery and motor proteins required for ciliary assembly and disassembly. However, several mutants with abnormal length flagella highlight the importance of signaling proteins as well. In this study, an unbiased chemical screen was performed to uncover signaling pathways that are critical for ciliogenesis and length regulation using flagella of the green alga Chlamydomonas reinhardtii as a model. The annotated Sigma LOPAC1280 chemical library was screened for effects on flagellar length, motility and severing as well as cell viability. Assay data were clustered to identify pathways regulating flagella. The target most frequently found to be involved in flagellar length regulation was the family of dopamine binding G-protein coupled receptors (GPCRs). In mammalian cells, cilium length could indeed be altered with expression of the dopamine D1 receptor. Our screen thus reveals signaling pathways that are potentially critical for ciliary formation, resorption, and length maintenance, which represent candidate targets for therapeutic intervention of disorders involving ciliary malformation and malfunction.
PDE4 is one of eleven known cyclic nucleotide phosphodiesterase families and plays a pivotal role in mediating hydrolytic degradation of the important cyclic nucleotide second messenger, cyclic 3′5′ adenosine monophosphate (cAMP). PDE4 inhibitors are known to have anti-inflammatory properties, but their use in the clinic has been hampered by mechanism-associated side effects that limit maximally tolerated doses. In an attempt to initiate the development of better-tolerated PDE4 inhibitors we have surveyed existing approved drugs for PDE4-inhibitory activity. With this objective, we utilised a high-throughput computational approach that identified moexipril, a well tolerated and safe angiotensin-converting enzyme (ACE) inhibitor, as a PDE4 inhibitor. Experimentally we showed that moexipril and two structurally related analogues acted in the micromolar range to inhibit PDE4 activity. Employing a FRET-based biosensor constructed from the nucleotide binding domain of the type 1 exchange protein activated by cAMP, EPAC1, we demonstrated that moexipril markedly potentiated the ability of forskolin to increase intracellular cAMP levels. Finally, we demonstrated that the PDE4 inhibitory effect of moexipril is functionally able to induce phosphorylation of the small heat shock protein, Hsp20, by cAMP dependent protein kinase A. Our data suggest that moexipril is a bona fide PDE4 inhibitor that may provide the starting point for development of novel PDE4 inhibitors with an improved therapeutic window.
Phosphodiesterase inhibitor; Protein kinase A (PKA); PDE4; Catechol ether; Cyclic 3′5′ adenosine monophosphate (cAMP)
Model binding sites allow one to isolate entangled terms
in molecular energy functions. Here, we investigate the effects on
ligand recognition of the introduction of a histidine into a hydrophobic
cavity in lysozyme. We docked 656040 molecules and tested 26 highly
and nine poorly ranked. Twenty-one highly ranked molecules bound and
five were false positives, while three poorly ranked molecules were
false negatives. In the 16 X-ray complexes now known, the docking
predictions overlaid well with the crystallographic results. Although
ligand enrichment was high, the false negatives, the false positives,
and the inability to rank order illuminated weaknesses in our scoring,
particularly overweighted apolar and underweighted polar terms. Adjusting
these led to new problems, reflecting the entangled nature of docking
scoring functions. Changes in ligand affinity relative to other lysozyme
cavities speak to the subtleties of molecular recognition even in
these simple sites and to their relevance for testing different models
Two enzymes of unknown function from the cog1735 subset of the amidohydrolase superfamily (AHS), LMOf2365_2620 (Lmo2620) from Listeria monocytogenes str. 4b F2365 and Bh0225 from Bacillus halodurans C-125, were cloned, expressed and purified to homogeneity. The catalytic functions of these two enzymes were interrogated by an integrated strategy encompassing bioinformatics, computational docking to three-dimensional crystal structures, and library screening. The three-dimensional structure of Lmo2620 was determined at a resolution of 1.6 Å with two phosphates and a binuclear zinc center in the active site. The proximal phosphate bridges the binuclear metal center and is 7.1 Å away from the distal phosphate. The distal phosphate hydrogen bonds with Lys-242, Lys-244, Arg-275 and Tyr-278. Enzymes within cog1735 of the AHS have previously been shown to catalyze the hydrolysis of substituted lactones. Computational docking of the high energy intermediate (HEI) form of the KEGG database to the three-dimensional structure of Lmo2620 highly enriched anionic lactones versus other candidate substrates. The active site structure and the computational docking results suggested that probable substrates would likely include phosphorylated sugar lactones. A small library of diacid sugar lactones and phosphorylated sugar lactones was synthesized and tested for substrate activity with Lmo2620 and Bh0225. Two substrates were identified for these enzymes, d-lyxono-1,4-lactone-5-phosphate and l-ribono-1,4-lactone-5-phosphate. The kcat/Km values for the cobalt-substituted enzymes with these substrates are ~10⁵ M⁻¹ s⁻¹.
Applications in structural biology and medicinal chemistry require protein-ligand scoring functions for two distinct tasks: (i) ranking different poses of a small molecule in a protein binding site; and (ii) ranking different small molecules by their complementarity to a protein site. Using probability theory, we developed two atomic distance-dependent statistical scoring functions: PoseScore was optimized for recognizing native binding geometries of ligands from other poses and RankScore was optimized for distinguishing ligands from nonbinding molecules. Both scores are based on a set of 8,885 crystallographic structures of protein-ligand complexes, but differ in the values of three key parameters. Factors influencing the accuracy of scoring were investigated, including the maximal atomic distance and non-native ligand geometries used for scoring, as well as the use of protein models instead of crystallographic structures for training and testing the scoring function. For the test set of 19 targets, RankScore improved the ligand enrichment (logAUC) and early enrichment (EF1) scores computed by DOCK 3.6 for 13 and 14 targets, respectively. In addition, RankScore performed better at rescoring than each of seven other scoring functions tested. Accepting both the crystal structure and decoy geometries with all-atom root-mean-square errors of up to 2 Å from the crystal structure as correct binding poses, PoseScore gave the best score to a correct binding pose among 100 decoys for 88% of all cases in a benchmark set containing 100 protein-ligand complexes. PoseScore accuracy is comparable to that of DrugScoreCSD and ITScore/SE, and superior to 12 other tested scoring functions. Therefore, RankScore can facilitate ligand discovery, by ranking complexes of the target with different small molecules; PoseScore can be used for protein-ligand complex structure prediction, by ranking different conformations of a given protein-ligand pair. 
The statistical potentials are available through the Integrative Modeling Platform (IMP) software package (http://salilab.org/imp/) and the LigScore web server (http://salilab.org/ligscore/).
statistical potential; reference state; binding pose; ligand enrichment
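The early-enrichment metric (EF1) mentioned in the scoring-function abstract above can be sketched as follows. The abstract does not define the metric, so this uses one common convention as an assumption: the ligand fraction in the top-ranked 1% of the scored library divided by the ligand fraction in the whole library, with lower scores ranked first. The function name and arguments are hypothetical.

```python
def enrichment_factor(scores, is_ligand, fraction=0.01):
    """Enrichment of known ligands in the top-ranked fraction of a scored library.

    scores    -- one score per molecule; lower is assumed to be better
    is_ligand -- parallel list of booleans (True for known ligands)
    fraction  -- top fraction of the ranked list to inspect (0.01 for EF1)
    """
    if len(scores) != len(is_ligand) or not scores:
        raise ValueError("inputs must be non-empty and of equal length")
    total_ligands = sum(1 for flag in is_ligand if flag)
    if total_ligands == 0:
        raise ValueError("at least one ligand is required")
    # Rank molecules from best (lowest) to worst (highest) score.
    ranked = sorted(zip(scores, is_ligand), key=lambda pair: pair[0])
    n_top = max(1, int(round(fraction * len(ranked))))
    top_ligands = sum(1 for _, flag in ranked[:n_top] if flag)
    return (top_ligands / n_top) / (total_ligands / len(ranked))
```

A value above 1 means ligands are over-represented among the best-scoring molecules relative to random selection; other conventions, such as counting the top fraction over decoys only, would give somewhat different numbers.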
Discovering the unintended “off-targets” that predict adverse drug reactions (ADRs) is daunting by empirical methods alone. Drugs can act on multiple protein targets, some of which can be unrelated by traditional molecular metrics, and hundreds of proteins have been implicated in side effects. We therefore explored a computational strategy to predict the activity of 656 marketed drugs on 73 unintended “side effect” targets. Approximately half of the predictions were confirmed, either from proprietary databases unknown to the method or by new experimental assays. Affinities for these new off-targets ranged from 1 nM to 30 μM. To explore relevance, we developed an association metric to prioritize those new off-targets that explained side effects better than any known target of a given drug, creating a Drug-Target-ADR network. Among these new associations was the prediction that the abdominal pain side effect of the synthetic estrogen chlorotrianisene was mediated through its newly discovered inhibition of the enzyme COX-1. The clinical relevance of this inhibition was borne out in whole human blood platelet aggregation assays. This approach may have wide application to de-risking toxicological liabilities in drug discovery.
The Enzyme Function Initiative (EFI) was recently established to address the challenge of assigning reliable functions to enzymes discovered in bacterial genome projects; in this Current Topic we review the structure and operations of the EFI. The EFI includes the Superfamily/Genome, Protein, Structure, Computation, and Data/Dissemination Cores that provide the infrastructure for reliably predicting the in vitro functions of unknown enzymes. The initial targets for functional assignment are selected from five functionally diverse superfamilies (amidohydrolase, enolase, glutathione transferase, haloalkanoic acid dehalogenase, and isoprenoid synthase), with five superfamily-specific Bridging Projects experimentally testing the predicted in vitro enzymatic activities. The EFI also includes the Microbiology Core that evaluates the in vivo context of in vitro enzymatic functions and confirms the functional predictions of the EFI. The deliverables of the EFI to the scientific community include: 1) development of a large-scale, multidisciplinary sequence/structure-based strategy for functional assignment of unknown enzymes discovered in genome projects (target selection, protein production, structure determination, computation, experimental enzymology, microbiology, and structure-based annotation); 2) dissemination of the strategy to the community via publications, collaborations, workshops, and symposia; 3) computational and bioinformatic tools for using the strategy; 4) provision of experimental protocols and/or reagents for enzyme production and characterization; and 5) dissemination of data via the EFI’s website, enzymefunction.org. The realization of multidisciplinary strategies for functional assignment will begin to define the full metabolic diversity that exists in nature and will impact basic biochemical and evolutionary understanding, as well as a wide range of applications of central importance to industrial, medicinal and pharmaceutical efforts.
Target identification is a core challenge in chemical genetics. Here we use chemical similarity to predict computationally the targets of 586 compounds active in a zebrafish behavioral assay. Of 20 predictions tested, 11 had activities ranging from 1 to 10,000 nM on the predicted targets. The role of two of these targets was tested in the original zebrafish phenotype. Prediction of targets from chemotype is rapid and may be generally applicable.
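As a rough illustration of predicting targets from chemical similarity, the sketch below ranks candidate targets by the highest Tanimoto coefficient between a query compound and each target's known ligands. The abstract does not specify the similarity measure or statistical model used, so this nearest-neighbour ranking is a swapped-in simplification; the feature-set representation, function names, and data layout are all hypothetical.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient between two collections of structural features."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def rank_targets(query_features, target_ligands):
    """Rank targets by the best similarity of the query to any known ligand.

    target_ligands -- dict mapping a target name to a list of feature sets,
                      one per known ligand of that target (assumed layout)
    """
    scores = {}
    for target, ligands in target_ligands.items():
        similarities = [tanimoto(query_features, ligand) for ligand in ligands]
        scores[target] = max(similarities, default=0.0)
    # Highest similarity first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

In practice the features would come from a molecular fingerprint, and a raw similarity ranking would normally be followed by a significance estimate before any prediction is tested experimentally.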