In silico methods for drug discovery are becoming increasingly powerful and useful. Combined with increasing computer processing power, in our case supplied by a novel distributed computing grid, this has enabled us to greatly enhance our virtual screening efforts. Herein we review some of these efforts, which use both receptor- and ligand-based virtual screening with the goal of finding new anticancer agents. In particular, nucleic acids are a neglected set of targets, especially the different morphologies of duplex, triplex, and quadruplex DNA, many of which have increasing biological relevance. We also review examples of molecular modeling used to understand receptors and of virtual screening against G-protein coupled receptor membrane proteins.
Virtual screening; drug discovery; membrane protein; G-protein coupled receptor; telomere; quadruplex; DNA
Because of advances in high-throughput screening technology, identifying a hit that binds to a target protein has become a relatively easy task; however, the subsequent hit-to-lead and lead optimization stages of drug discovery remain challenging. In a typical hit-to-lead and lead optimization process, analogues of the most promising hits are synthesized to develop a structure-activity relationship (SAR) analysis, which in turn guides further synthesis during lead optimization. The synthesis processes are usually long and labor-intensive. In silico searching has become an alternative approach to exploring SAR, especially now that millions of compounds are ready to be screened and most of them can be easily obtained. Here, we report our discovery of fifteen new Dishevelled PDZ domain inhibitors by using such an approach. In our studies, we first developed a pharmacophore model based on NSC668036, an inhibitor previously identified in our laboratory; based on the model, we then screened the ChemDiv database by using an algorithm that combines similarity search and docking procedures; finally, we selected potent inhibitors based on docking analysis and examined them by using NMR spectroscopy. NMR experiments showed that all fifteen compounds we chose bound to the PDZ domain more tightly than NSC668036.
Chemical space; NMR; PDZ domain inhibitors; SAR; Virtual screening; Wnt signaling pathway
Src plays various roles in tumour progression, invasion, metastasis, angiogenesis and survival. It is one of the multiple targets of multi-target kinase inhibitors in clinical use and in trials for the treatment of leukemia and other cancers. These successes, together with the emergence of drug resistance in some patients, have raised significant interest in discovering new Src inhibitors. Various in silico methods have been used in some of these efforts. It is desirable to explore additional in silico methods, particularly those capable of searching large compound libraries at high yields and reduced false-hit rates.
We evaluated support vector machines (SVM) as virtual screening tools for searching Src inhibitors from large compound libraries. SVM trained and tested on 1,703 inhibitors and 63,318 putative non-inhibitors correctly identified 93.53% to 95.01% of inhibitors and 99.81% to 99.90% of non-inhibitors in 5-fold cross-validation studies. SVM trained on 1,703 inhibitors reported before 2011 and 63,318 putative non-inhibitors correctly identified 70.45% of the 44 inhibitors reported since 2011, and predicted as inhibitors 44,843 (0.33%) of 13.56M PubChem compounds, 1,496 (0.89%) of 168K MDDR compounds, and 719 (7.73%) of 9,305 MDDR compounds similar to the known inhibitors.
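The percentages quoted above follow directly from the reported counts. The short sketch below recomputes them; the function name `percentage` is ours, and the count of 31 correctly identified inhibitors is only the value implied by 70.45% of 44, since the abstract gives the percentage and not the count.

```python
def percentage(count: int, total: int) -> float:
    """Return count/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Virtual-hit counts and library sizes as reported in the abstract.
libraries = {
    "PubChem": (44_843, 13_560_000),
    "MDDR": (1_496, 168_000),
    "MDDR compounds similar to known inhibitors": (719, 9_305),
}

for name, (virtual_hits, size) in libraries.items():
    print(f"{name}: {percentage(virtual_hits, size):.2f}% predicted as inhibitors")

# Assumption: 31 of 44 is the count implied by the reported 70.45% yield.
print(f"Yield on post-2011 inhibitors: {percentage(31, 44):.2f}%")
```

A low percentage of virtual hits in a general library such as PubChem is what the abstract uses as its proxy for a low false-hit rate, because most compounds in such a library are presumed not to be Src inhibitors.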
SVM showed comparable yield and reduced false-hit rates in searching large compound libraries compared to the similarity-based and other machine-learning VS methods developed from the same set of training compounds and molecular descriptors. We tested three virtual hits sharing the same novel scaffold, taken from in-house chemical libraries and not previously reported as Src inhibitors; one of them showed moderate activity. SVM shows potential for searching Src inhibitors from large compound libraries at low false-hit rates.
Src; c-src; Computer aided drug design; Kinase inhibitor; Virtual screening; Support vector machine
Hepatitis C virus (HCV) NS5B polymerase is a key target for the development of therapeutic agents aimed at the treatment of HCV infections. Here we report on the identification of novel allosteric inhibitors of HCV NS5B through a combination of structure-based virtual screening, synthesis and structure-activity relationship (SAR) optimization. Virtual screening of 260,000 compounds from the ChemBridge database against the tetracyclic indole inhibitor binding pocket of NS5B (allosteric pocket-1, AP-1) sequentially down-sized the library by 4 orders of magnitude to yield 23 candidates. In vitro evaluation of the NS5B inhibitory activity of the in silico selected compounds resulted in a 17% hit rate, identifying two novel chemotypes. Of these, compound 3, bearing the rhodanine scaffold, proved amenable to productive SAR exploration and synthetic modification. As a result, 25 derivatives that exhibited IC50 values ranging from 7.7 to 68.0 μM were developed. Docking analysis of lead compound 28 within the tetracyclic indole- and benzylidene-binding allosteric pockets (AP-1 and AP-3, respectively) of NS5B revealed topological similarities between these two pockets. Compound 28, a novel rhodanine analog with NS5B inhibitory potency in the low micromolar range, may be a promising lead for future development of more potent NS5B inhibitors.
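The size of the screening funnel described above can be checked with a line of arithmetic. The sketch below is illustrative only; the function names are ours, and the number of active compounds is the value implied by a 17% hit rate on 23 candidates, which the abstract does not state explicitly.

```python
import math


def orders_of_magnitude(initial: int, final: int) -> float:
    """Base-10 logarithm of the reduction factor from initial to final."""
    return math.log10(initial / final)


def implied_hits(n_tested: int, hit_rate_percent: float) -> int:
    """Nearest whole number of actives implied by a reported hit rate."""
    return round(n_tested * hit_rate_percent / 100)


reduction = orders_of_magnitude(260_000, 23)
print(f"Library reduced by about {reduction:.1f} orders of magnitude")

# Assumption: the abstract reports only the 17% rate, not the count.
print(f"Implied number of actives: {implied_hits(23, 17)}")
```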
NS5B polymerase; Virtual screening; Rhodanine; Imidazocoumarin; SAR
G protein-coupled receptors (GPCRs) represent a large family of signaling proteins that includes many therapeutic targets; however, progress in identifying new small molecule drugs has been disappointing. The past four years have seen remarkable progress in the structural biology of GPCRs, raising the possibility of applying structure-based approaches to GPCR drug discovery efforts. Of the various structure-based approaches that have been applied to soluble protein targets, such as proteases and kinases, in silico docking is among the most readily applicable to GPCRs. Early studies suggest that GPCR binding pockets are well suited to docking, and docking screens have identified potent and novel compounds for these targets. This review will focus on the current state of in silico docking for GPCRs.
In silico drug target identification, which includes many distinct algorithms for finding disease genes and proteins, is the first step in the drug discovery pipeline. When the 3D structures of the targets are available, the problem of target identification is usually converted to finding the best interaction mode between the potential target candidates and small molecule probes. A pharmacophore, the spatial arrangement of features essential for a molecule to interact with a specific target receptor, offers an alternative to molecular docking for achieving this goal. The PharmMapper server is a freely accessible web server designed to identify potential target candidates for given small molecules (drugs, natural products or other newly discovered compounds with unidentified binding targets) using a pharmacophore mapping approach. PharmMapper hosts a large in-house pharmacophore database (PharmTargetDB) annotated from all the target information in TargetBank, BindingDB, DrugBank and the potential drug target database, including over 7000 receptor-based pharmacophore models (covering over 1500 drug targets). PharmMapper automatically finds the best mapping poses of the query molecule against all the pharmacophore models in PharmTargetDB and lists the top N best-fitted hits with appropriate target annotations, together with the corresponding aligned poses of the molecule. Thanks to its highly efficient and robust triangle hashing mapping method, PharmMapper has high throughput and takes only about 1 h on average to screen the whole of PharmTargetDB. The protocol successfully found the proper targets among the top 300 pharmacophore candidates in a retrospective benchmarking test with tamoxifen. PharmMapper is available at http://220.127.116.11/pharmmapper.
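The general idea behind triangle hashing can be illustrated with a toy example. The sketch below is not PharmMapper's implementation: it ignores feature types, uses a fixed distance bin width, and ranks models by a simple vote count, all of which are our assumptions. The model names and coordinates are hypothetical.

```python
from itertools import combinations
from math import dist


def triangle_key(p, q, r, bin_width=1.0):
    """Quantise the sorted side lengths of a triangle into a hashable key."""
    sides = sorted((dist(p, q), dist(q, r), dist(p, r)))
    return tuple(int(s // bin_width) for s in sides)


def build_index(models, bin_width=1.0):
    """Map each triangle key to the names of the models that contain it."""
    index = {}
    for name, points in models.items():
        for tri in combinations(points, 3):
            index.setdefault(triangle_key(*tri, bin_width), set()).add(name)
    return index


def rank_models(query_points, index, top_n=3, bin_width=1.0):
    """Rank models by how many query triangles they share, best first."""
    votes = {}
    for tri in combinations(query_points, 3):
        for name in index.get(triangle_key(*tri, bin_width), ()):
            votes[name] = votes.get(name, 0) + 1
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical pharmacophore models given as lists of 3D feature positions.
models = {
    "model_A": [(0, 0, 0), (3, 0, 0), (0, 4, 0), (0, 0, 5)],
    "model_B": [(0, 0, 0), (1, 0, 0), (0, 1, 0)],
}
# Query features: a translated copy of the 3-4-5 triangle in model_A.
query = [(10, 10, 10), (13, 10, 10), (10, 14, 10)]

print(rank_models(query, build_index(models)))
```

Because the key depends only on inter-feature distances, it is unchanged by translation or rotation of the query, so candidate models can be found by dictionary lookup before any explicit alignment is attempted.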
The current drug R&D pipeline for most neglected diseases remains weak and is unlikely to support registration of novel drug classes that meet desired target product profiles in the short term. This calls for sustained investment as well as greater emphasis on risky upstream drug discovery. Access to technologies, resources, and strong management as well as clear compound progression criteria are factors in the successful implementation of any collaborative drug discovery effort. We discuss how some of these factors have impacted drug discovery for tropical diseases within the past four decades, and highlight new opportunities and challenges through the virtual North–South drug discovery network as well as the rationale for greater participation of institutions in developing countries in product innovation. A set of criteria designed to facilitate compound progression from screening hits to drug candidate selection is presented to guide ongoing efforts.
The recognition process between a protein and a partner represents a significant theoretical challenge. In silico structure-based drug design carried out with nothing more than the three-dimensional structure of the protein has led to the introduction of many compounds into clinical trials and numerous drug approvals. Central to guiding the discovery process is the ability to distinguish active from inactive compounds. While large-scale computer simulations of compounds taken from a library (virtual screening) or designed de novo are highly desirable in the post-genomic era, many technical problems remain to be adequately addressed. This article presents an overview and discusses the limits of current computational methods for predicting the correct binding pose and accurate binding affinity. It also presents the performance of the most popular algorithms for exploring binary and multi-body protein interactions.
flexibility; binding affinity; protein–ligand/protein interactions; drug design; computational methods
A collection of 26 polyammonium cyclophane-type macrocycles with a large structural diversity has been screened for G-quadruplex recognition. A two-step selection procedure based on the FRET-melting assay was carried out, enabling identification of macrocycles of high affinity (ΔT1/2 up to 30°C) and high selectivity for the human telomeric G-quadruplex. The four selected hits possess sophisticated architectures; in particular, the presence of a pendant side-arm and a particular topological arrangement appear to be strong determinants of quadruplex binding. These compounds are thus likely to create multiple contacts with the target that may be at the origin of their high selectivity, suggesting that this class of macrocycles offers unique advantages for targeting G-quadruplex DNA.
High-throughput fluorescent intercalator displacement (HT-FID) was adapted to the semi-automated screening of a commercial compound library containing 60,000 molecules, resulting in the discovery of cytotoxic DNA-targeted agents. Although commercial libraries are routinely screened in drug discovery efforts, the DNA binding potential of the compounds they contain has largely been overlooked. HT-FID led to the rapid identification of a number of compounds for which DNA binding properties were validated through demonstration of concentration-dependent DNA binding and increased thermal melting of A/T- or G/C-rich DNA sequences. Selected compounds were assayed further for cell proliferation inhibition in glioblastoma cells. Seven distinct compounds emerged from this screening procedure, representing structures not previously known to target DNA and cause cell death. These agents may represent structures worthy of further modification to optimally explore their potential as cytotoxic anti-cancer agents. In addition, the general screening strategy described may find broader impact toward the rapid discovery of DNA-targeted agents with biological activity.
DNA binding; fluorescent intercalator displacement (FID); high-throughput screening; cytotoxicity
Limited structural information on drug targets, the cellular toxicity of lead compounds, and the large number of potential leads are the major issues facing the design-oriented approach to discovering new leads. In an attempt to tackle these issues, we have developed a process of virtual screening based on the observation that conformational rearrangements of the dengue virus envelope protein are essential for the mediation of viral entry into host cells via membrane fusion. Screening was based solely on the structural information of the dengue virus envelope protein and was focused on a target site that is presumably important for the conformational rearrangements necessary for viral entry. To circumvent the issue of lead compound toxicity, we performed screening based on molecular docking using structural databases of medical compounds. To enhance the identification of hits, we further categorized and selected candidates according to their novel structural characteristics. Finally, the selected candidates were subjected to a biological validation assay to assess inhibition of dengue virus propagation in mammalian host cells using a plaque formation assay. Among the 10 compounds examined, rolitetracycline and doxycycline significantly inhibited plaque formation, demonstrating their inhibitory effect on dengue virus propagation. Both compounds are tetracycline derivatives, with IC50 values estimated to be 67.1 µM and 55.6 µM, respectively. Their docked conformations displayed common hydrophobic interactions with critical residues that affect membrane fusion during viral entry. These interactions would therefore position the tetracyclic ring moieties of both inhibitors to bind firmly to the target and, subsequently, disrupt conformational rearrangement and block viral entry. This process can be applied to other drug targets in which conformational rearrangement is critical to function.
Many drug candidates fail in clinical development due to their insufficient selectivity that may cause undesired side effects. Therefore, modern drug discovery is routinely supported by computational techniques, which can identify alternate molecular targets with a significant potential for cross-reactivity. In particular, the development of highly selective kinase inhibitors is complicated by the strong conservation of the ATP-binding site across the kinase family. In this paper, we describe X-ReactKIN, a new machine learning approach that extends the modeling and virtual screening of individual protein kinases to a system level in order to construct a cross-reactivity virtual profile for the human kinome. To maximize the coverage of the kinome, X-ReactKIN relies solely on the predicted target structures and employs state-of-the-art modeling techniques. Benchmark tests carried out against available selectivity data from high-throughput kinase profiling experiments demonstrate that for almost 70% of the inhibitors, their alternate molecular targets can be effectively identified in the human kinome with a high (>0.5) sensitivity at the expense of a relatively low false positive rate (<0.5). Furthermore, in a case study, we demonstrate how X-ReactKIN can support the development of selective inhibitors by optimizing the selection of kinase targets for small-scale counter-screen experiments. The constructed cross-reactivity profiles for the human kinome are freely available to the academic community at http://cssb.biology.gatech.edu/kinomelhm/
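The two benchmark quantities quoted above, sensitivity and false positive rate, can be computed for a single inhibitor as sketched below. This is a generic illustration, not X-ReactKIN code; the kinase labels `K1` to `K8` and the predicted and known target sets are hypothetical.

```python
def sensitivity_and_fpr(predicted: set, known: set, universe: set):
    """Return (sensitivity, false positive rate) for one inhibitor.

    predicted: kinases predicted as alternate targets
    known:     kinases experimentally confirmed as targets
    universe:  all kinases profiled (must contain both sets)
    """
    tp = len(predicted & known)             # confirmed targets recovered
    fn = len(known - predicted)             # confirmed targets missed
    fp = len(predicted - known)             # predictions not confirmed
    tn = len(universe - predicted - known)  # correctly left unpredicted
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return sens, fpr


# Hypothetical profile for one inhibitor across eight kinases.
universe = {f"K{i}" for i in range(1, 9)}
known = {"K1", "K2", "K3", "K4"}
predicted = {"K1", "K2", "K4", "K5"}

sens, fpr = sensitivity_and_fpr(predicted, known, universe)
print(f"sensitivity = {sens:.2f}, false positive rate = {fpr:.2f}")
```

Under the thresholds quoted in the abstract, this hypothetical inhibitor would count among those whose alternate targets are effectively identified, since its sensitivity exceeds 0.5 and its false positive rate is below 0.5.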
X-ReactKIN; human kinome; kinase functional space; kinome structural coverage; kinase inhibitors; drug development; drug off-targets; Chemical Systems Biology
The human genome contains thousands of regions, including the telomere, that have the potential to form quadruplex structures. Many of these regions are potential targets for therapeutic intervention. There are many different folding patterns for quadruplex DNAs, and the loops exhibit much more variation than do the quartets. The successful targeting of a particular quadruplex structure requires distinguishing that structure from all of the other quadruplex structures that may be present. A mix-and-measure fluorescence screening method has been developed that uses multiple reporter molecules that bind to different features of quadruplex DNA. The reporter molecules are used in combination with DNAs that have a variety of quadruplex structures. The screening is based on observing the increase or decrease in the fluorescence of the reporter molecules. The selectivity of a set of test molecules has been determined by this approach.
Shape is a fundamentally important molecular feature that often determines the fate of a compound in terms of molecular interactions with preferred and non-preferred biological targets. Complementarity of binding in small molecule-protein, peptide-receptor, antigen-antibody and protein-protein interactions is key to life and survival, but also to targeting molecules with bioactivity. We review the application of shape in various biological systems such as substrate recognition, ligand specificity / selectivity and antibody recognition in the context of computational methods such as docking, quantitative structure activity relationships, classification models and similarity search algorithms. These in silico pharmacology methods have recently demonstrated the importance and applicability of determining molecular shape in drug discovery, virtual screening and predictive toxicology. The results from recently published studies show that shape and shape-based descriptors are at least as useful as other traditional molecular descriptors.
Antibody; Depth; Descriptors; Dopamine receptors; Molecular shape; Nuclear hormone receptor; Pharmacophore
Heat shock protein 90 (Hsp90) is an important target in cancer because of its role in maintaining transformation and has recently become the focus of several drug discovery and development efforts. While compounds with different modes of action are known, the focus of this review is on those classes of compounds which inhibit Hsp90 by binding to the N-terminal ATP pocket. These include natural product inhibitors such as geldanamycin and radicicol and synthetic inhibitors comprising purines, pyrazoles, isoxazoles and other scaffolds. The synthetic inhibitors have been discovered by structure-based design, by high-throughput screening and, more recently, by fragment-based design and virtual screening techniques. This review will discuss the discovery of these different classes, as well as their development as potential clinical agents.
Geldanamycin; Radicicol; Purines; PU-class; Pyrazoles; Isoxazoles; Cancer
Recent drug discovery efforts have utilized high-throughput screening (HTS) of large chemical libraries to identify compounds that modify the activity of discrete molecular targets. The molecular target approach to drug screening is widely used in the pharmaceutical and biotechnology industries because of the amount of knowledge now available regarding protein structure that has been obtained by computer simulation. The molecular target approach requires that the structures of the target molecules be known and their physiological functions understood. This approach to drug discovery may, however, limit the identification of novel drugs. As an alternative, the phenotypic- or pathway-screening approach to drug discovery is gaining popularity, particularly in the academic sector. This approach not only provides the opportunity to identify promising drug candidates, but also enables novel information regarding biological pathways to be unveiled. Reporter assays are a powerful tool for the phenotypic screening of compound libraries. Of the various reporter genes that can be used in such assays, those encoding secreted proteins enable the screening of hit molecules in both living cells and animals. Cell- and animal-based screens enable simultaneous evaluation of drug metabolism or toxicity with biological activity. Therefore, drug candidates identified in these screens may have increased biological efficacy and a lower risk of side effects in humans. In this article, we review the reporter bioassay systems available for phenotypic drug discovery.
drug development; high throughput screening; reporter mice; age-related disorders
Hepatocyte growth factor (HGF) is an important regulator of normal development and homeostasis, and dysregulated signaling through the HGF receptor, Met, contributes to tumorigenesis, tumor progression and metastasis in numerous human malignancies. The development of selective small-molecule inhibitors of oncogenic tyrosine kinases (TK) has led to well-tolerated, targeted therapies for a growing number of cancer types. To identify selective Met TK inhibitors, we used a high-throughput virtual screen of the 13.5 million compound ChemNavigator database to find compounds most likely to bind to the Met ATP binding site and to form several critical interactions with binding site residues predicted to stabilize the kinase domain in its inactive conformation. Subsequent biological screening of 70 in silico hit structures using cell-free and intact cell assays identified three active compounds with micromolar IC50 values. The predicted binding modes and target selectivity of these compounds are discussed and compared to other known Met TK inhibitors.
A multi-target effort to discover new drugs for Alzheimer’s disease was conducted, seeking to inhibit amyloid oligomer formation and to prevent radical formation. The tryptoline and tryptamine cores of BACE1 inhibitors previously identified by virtual screening were modified in silico for additional modes of action. These core structures were readily linked to different side chains using 1,2,3-triazole rings as bridges by copper-catalyzed azide-alkyne cycloaddition reactions. Three of the sixteen designed compounds exerted multifunctional activities, including β-secretase inhibitory action, anti-amyloid aggregation, metal chelating and antioxidant effects, at micromolar levels. The neuroprotective effects of the multifunctional compounds 6h, 12c and 12h on Aβ1–42-induced neuronal cell death at 1 μM were significantly greater than those of the potent single-target compound, BACE1 inhibitor IV, and were comparable to those of curcumin. The observed synergistic effect resulting from the reduction of the Aβ1–42 neurotoxicity cascade substantiates the validity of our multifunctional strategy in drug discovery for Alzheimer’s disease.
multifunction drugs; BACE1 inhibitor; anti-amyloid aggregation; chelator; antioxidant; neuroprotection
Importance to the field
Virtual screening is a computer-based technique for identifying promising compounds to bind to a target molecule of known structure. Given the rapidly increasing number of protein and nucleic acid structures, virtual screening continues to grow as an effective method for the discovery of new inhibitors and drug molecules.
Areas covered in this review
We describe virtual screening methods that are available in the AutoDock suite of programs, and several of our successes in using AutoDock virtual screening in pharmaceutical lead discovery.
What the reader will gain
A general overview of the challenges of virtual screening is presented, along with the tools available in the AutoDock suite of programs for addressing these challenges.
Take home message
Virtual screening is an effective tool for the discovery of compounds for use as leads in drug discovery, and the free, open source program AutoDock is an effective tool for virtual screening.
virtual screening; computer-aided drug design; computational docking; AutoDock
The accurate prediction of protein druggability (propensity to bind high-affinity drug-like small molecules) would greatly benefit the fields of chemical genomics and drug discovery. We have developed a novel approach to quantitatively assess protein druggability by computationally screening a fragment-like compound library. In analogy to NMR-based fragment screening, we dock ∼11000 fragments against a given binding site and compute a computational hit rate based on the fraction of molecules that exceed an empirically chosen score cutoff. We perform a large-scale evaluation of the approach on four datasets, totaling 152 binding sites. First, we demonstrate that computed hit rates correlate with hit rates measured experimentally in a previously published NMR-based screening method. Second, we show that the in silico fragment screening method can be used to distinguish known druggable and non-druggable targets, including both enzymes and protein-protein interaction sites. Finally, we explore the sensitivity of the results to different receptor conformations, including flexible protein-protein interaction sites. Besides its original aim of assessing the druggability of different protein targets, this method could be used to identify druggable conformations of flexible binding sites for lead discovery and to suggest strategies for growing or joining initial fragment hits to obtain more potent inhibitors.
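The computational hit rate described above is simply the fraction of docked fragments whose score passes a cutoff. The sketch below illustrates the calculation; the sign convention (more negative scores are better), the cutoff value, and the example scores are all assumptions made for illustration and are not taken from the study.

```python
def computational_hit_rate(scores, cutoff, lower_is_better=True):
    """Fraction of docked fragments whose score passes the cutoff."""
    if not scores:
        raise ValueError("scores must not be empty")
    if lower_is_better:
        hits = sum(1 for s in scores if s <= cutoff)
    else:
        hits = sum(1 for s in scores if s >= cutoff)
    return hits / len(scores)


# Hypothetical docking scores for eight fragments against one binding site.
site_scores = [-9.1, -7.4, -6.8, -5.0, -4.2, -3.9, -8.3, -2.1]

rate = computational_hit_rate(site_scores, cutoff=-7.0)
print(f"Computational hit rate: {rate:.3f}")
```

Comparing this fraction across binding sites, with the same fragment library and the same cutoff, is what allows sites to be ranked by predicted druggability.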
Computational methods involving virtual screening could potentially be employed to discover new biomolecular targets for an individual molecule of interest (MOI). However, existing scoring functions may not accurately differentiate proteins to which the MOI binds from a larger set of macromolecules in a protein structural database. An MOI will most likely have varying degrees of predicted binding affinity to many protein targets, so correctly interpreting a docking score as a hit for the MOI docked to any individual protein can be problematic. In our method, which we term “Virtual Target Screening (VTS)”, a set of small drug-like molecules is docked against each structure in the protein library to produce benchmark statistics. This calibration provides a reference for each protein so that hits can be identified for an MOI. VTS can then be used as a tool for drug repositioning (repurposing), specificity and toxicity testing, identifying potential metabolites, probing protein structures for allosteric sites, and testing focused libraries (collections of MOIs with similar chemotypes) for selectivity. To validate our VTS method, twenty kinase inhibitors were docked to a collection of calibrated protein structures. Here we report our results, in which VTS predicted protein kinases as hits in preference to other proteins in our database. Concurrently, a graphical interface for VTS was developed.
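One way to picture per-protein calibration is to convert each raw docking score into a z-score against that protein's benchmark distribution. The sketch below assumes this z-score criterion, a cutoff of -2, and a convention in which more negative scores are better; the abstract says only that benchmark statistics are used, so these choices, the protein names, and all scores are hypothetical.

```python
from statistics import mean, stdev


def calibrate(benchmark_scores):
    """Per-protein mean and standard deviation of benchmark docking scores."""
    return {p: (mean(s), stdev(s)) for p, s in benchmark_scores.items()}


def vts_hits(moi_scores, calibration, z_cutoff=-2.0):
    """Proteins where the MOI scores unusually well relative to the benchmark."""
    hits = []
    for protein, score in moi_scores.items():
        mu, sigma = calibration[protein]
        z = (score - mu) / sigma  # assumes sigma > 0
        if z <= z_cutoff:
            hits.append((protein, round(z, 2)))
    return sorted(hits, key=lambda h: h[1])


# Hypothetical benchmark scores from docking drug-like molecules.
benchmark = {
    "protein_A": [-6.0, -7.0, -8.0, -7.5, -6.5],
    "protein_B": [-9.0, -10.0, -11.0, -10.5, -9.5],
}
# Hypothetical MOI scores: the raw score is better for protein_B.
moi = {"protein_A": -9.5, "protein_B": -10.2}

print(vts_hits(moi, calibrate(benchmark)))
```

In this toy case the MOI has the better raw score against protein_B, yet only protein_A is flagged, because protein_B gives similarly favourable scores to the whole benchmark set. This is the interpretive problem that calibration is meant to address.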
virtual target screening; virtual counter-screening; computational inverse docking; in silico inverse docking; drug retargeting; drug repurposing; off-target effects
CDP-ME kinase (IspE) contributes to the non-mevalonate or deoxy-xylulose phosphate (DOXP) pathway for isoprenoid precursor biosynthesis found in many species of bacteria and apicomplexan parasites. IspE has been shown to be essential by genetic methods and, since it is absent from humans, it constitutes a promising target for antimicrobial drug development. Using in silico screening directed against the substrate binding site and in vitro high-throughput screening directed against both the substrate and co-factor binding sites, non-substrate-like IspE inhibitors have been discovered and structure-activity relationships derived. The best inhibitors in each series have high ligand efficiencies and favourable physico-chemical properties, rendering them promising starting points for drug discovery. Putative binding modes of the ligands are suggested that are consistent with the established structure-activity relationships. The applied screening methods were complementary in discovering hit compounds, and a comparison of both approaches highlights their strengths and weaknesses. It is noteworthy that compounds identified by virtual screening methods provided the controls for the biochemical screens.
A novel chemocentric approach to identifying cancer-relevant targets is introduced. Starting with a large chemical collection, the strategy uses the list of small molecule hits arising from a differential cytotoxicity screening on tumor HCT116 and normal MRC-5 cell lines to identify proteins associated with cancer emerging from a differential virtual target profiling of the most selective compounds detected in both cell lines. It is shown that this smart combination of differential in vitro and in silico screenings (DIVISS) is capable of detecting a list of proteins that are already well accepted cancer drug targets, while complementing it with additional proteins that, targeted selectively or in combination with others, could lead to synergistic benefits for cancer therapeutics. The complete list of 115 proteins identified as being hit uniquely by compounds showing selective antiproliferative effects for tumor cell lines is provided.
Disrupting protein-protein interactions with small organic molecules is now a promising strategy for blocking protein targets involved in different pathologies. However, structural changes occurring at the binding interfaces complicate drug discovery by structure-based drug design/virtual screening approaches. Here we focused on two homologous calcium-binding proteins, calmodulin and human centrin 2, which are involved in different cellular functions via protein-protein interactions and are known to undergo important conformational changes upon ligand binding.
In order to find suitable protein conformations of calmodulin and centrin for further structure-based drug design/virtual screening, we performed in silico structural/energetic analysis and molecular docking of terphenyl (an alpha-helix-mimicking molecule known to inhibit protein-protein interactions of calmodulin) into X-ray and NMR ensembles of calmodulin and centrin. We employed several scoring methods in order to find the best protein conformations. Our results show that docking on NMR structures of calmodulin and centrin can be very helpful in taking into account conformational changes occurring at protein-protein interfaces.
The NMR structures of protein-protein complexes now available could be exploited efficiently in further structure-based drug design/virtual screening processes aimed at designing small molecule inhibitors of protein-protein interactions.
We demonstrated a method to screen for binders to a particular G-quadruplex sequence using easily designed short peptides consisting of naturally occurring amino acids, with binding data mined using statistical methods such as hierarchical clustering analysis (HCA). Despite the small size of the library used in this study, candidate specific binders were identified. In addition, a selected peptide stabilized the G-quadruplex structure of a DNA oligonucleotide derived from the promoter region of the proto-oncogene c-MYC. This study illustrates how a peptide library can be designed and presents a screening guideline for the construction of G-quadruplex binders. Such G-quadruplex peptide binders could be functionally modified to enable switching, cellular penetration, and organelle targeting for cell and tissue engineering.
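Hierarchical clustering of binding profiles can be sketched in a few lines. The example below uses average-linkage agglomerative clustering with Euclidean distance, which is one common variant and not necessarily the one used in the study; the peptide names and the three-target binding profiles are hypothetical.

```python
from math import dist


def average_linkage(a, b, profiles):
    """Mean pairwise Euclidean distance between members of two clusters."""
    total = sum(dist(profiles[i], profiles[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def hierarchical_clusters(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in sorted(profiles)]
    while len(clusters) > max(n_clusters, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], profiles)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return sorted(clusters)


# Hypothetical normalised binding signals of four peptides to three DNA targets.
profiles = {
    "pep1": (0.9, 0.1, 0.1),
    "pep2": (0.8, 0.2, 0.1),
    "pep3": (0.1, 0.9, 0.8),
    "pep4": (0.2, 0.8, 0.9),
}

print(hierarchical_clusters(profiles, n_clusters=2))
```

Peptides that fall into a cluster dominated by strong signal for a single target are the natural candidates for specific binders, which is the kind of grouping HCA is used to reveal.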