Finding new uses for existing drugs, or drug repositioning, has been used as a strategy for decades to get drugs to more patients. As the ability to measure molecules in high-throughput ways has improved over the past decade, it is logical that such data might be useful for enabling drug repositioning through computational methods. Many computational predictions for new indications have been borne out in cellular model systems, though extensive validation in animal models and clinical trials is still pending. In this review, we show that computational methods for drug repositioning can be classified along two axes: drug based, where discovery initiates from the chemical perspective, or disease based, where discovery initiates from the clinical perspective of disease or its pathology. Newer algorithms for computational drug repositioning will likely span these two axes, will take advantage of newer types of molecular measurements, and will certainly play a role in reducing the global burden of disease.
Pharmaceutical research and development productivity, measured as the number of new drugs approved per dollar spent, has declined significantly in recent decades. By conservative estimates, it now takes ~15 years and $800 million to $1 billion to bring a single drug to market. There are two major reasons for this decline in the total number of safe and effective new drugs reaching the market. The first is that prevalent drug development strategies within pharmaceutical companies remain conservative, typically oriented toward the discovery of a new therapeutic target combined with a search for a novel therapeutic compound that modulates the activity of the identified target. This is followed by a slow, costly and risky process of experimental and clinical validation.
The second major reason for reduced productivity is the lack of systematic evaluation of additional indications that each drug can target, both during the drug’s development phase and after its arrival on the market. Some of the most profitable and successful pharmaceuticals did not begin development for their current indications, but instead were repurposed or repositioned for new uses. Accidental discovery, unintended side effects or obvious follow-on indications have led to new uses of such drugs. Classic examples include Minoxidil (originally tested for hypertension; now indicated for hair loss), Viagra (originally tested for angina; now indicated for erectile dysfunction and pulmonary hypertension), Avastin (originally indicated for metastatic colon cancer and non-small-cell lung cancer; later approved for metastatic breast cancer) and Rituxan (originally indicated for non-Hodgkin’s lymphoma; later approved for chronic lymphocytic leukemia and rheumatoid arthritis).
Revenues generated by repositioned drugs can reach into the billions of dollars: sales of thalidomide, repositioned for multiple myeloma, reached US $271 million in 2003 alone; sildenafil, repositioned for erectile dysfunction, had annual sales of US $1.88 billion in 2003. While the revenues generated by repositioned drugs have been substantial, the real incentive for repositioning is the clear benefit for patients. For example, thalidomide’s antiangiogenic properties have provided therapeutic benefits to multiple myeloma patients, who otherwise had few treatment options for their disease, while the central dopamine agonist properties of bromocriptine recently led to its approval in the USA for a new indication of Type 2 diabetes.
Drug repositioning or repurposing (i.e. finding a new use for an existing drug) can provide solutions to the problems facing pharmaceutical companies. Such efforts have spanned the spectrum from traditional blind screening of chemical libraries against specific cell lines or cellular organisms [8, 9], to serial testing in animal models, to the newer data-driven approaches involving computational methods. The latter category typically takes advantage of the fact that a single molecule can act on multiple targets and could be beneficial in indications where the additional targets are relevant (the known compound-new target approach). In fact, there is strong evidence that such off-target interactions, or polypharmacology, are common among many approved drug compounds. Repositioning efforts also leverage the fact that mechanisms and targets are shared between diseases or biological processes, so that a drug that works on a target in one disease may also work in another (known targets in a new indication). Drug repositioning has several advantages compared to the traditional approach of developing a drug de novo.
The drug development cycle for a repositioned drug can be as short as 3–12 years compared to the traditional 10–17 years required to bring a new chemical entity to market. This is because several steps of the drug development pipeline can be eliminated during repurposing efforts. However, the discovery of a new use of a drug for a new condition can be a haphazard process, as illustrated by the examples given above, where new indications were found through side effects, or through exploiting useful properties of these drugs, as demonstrated by the utility of arsenic for acute promyelocytic leukemia, amphotericin B for leishmaniasis and thalidomide for multiple myeloma. While physicians and pharmaceutical and biotechnology companies have manual methods and prior knowledge that enable repositioning of drugs through clinical trials, these occurrences are often serendipitous and rare. One challenge in drug repositioning, therefore, lies in predicting and choosing new therapeutic indications to prospectively test for a drug of interest. This review will focus on computational approaches for guiding and selecting new indications for drugs.
Many computational strategies for drug repositioning have been published. One way to classify these methods is by categorizing them as either ‘drug based’, where discovery of repositioning opportunities initiates from the chemical or pharmaceutical perspective, or ‘disease based’, where discovery initiates from the perspective of disease management, symptomatology or pathology (Figure 1). Drug-based approaches might be preferred if there is interest or expertise in modeling or understanding more precise pharmacological properties leading to repurposing opportunities, or if rich pharmacological or chemical data for drugs are available. Disease-based approaches may be preferred to overcome missing knowledge in the pharmacology of a drug (e.g. unknown or additional targets), or if repositioning efforts are to be focused on a specific disease or therapeutic category. While each of these approaches presents unique informatics challenges, successful repositioning strategies often incorporate elements from both drug- and disease-based methods. Here, we discuss several specific examples of computational approaches developed from these perspectives.
The structure and chemical properties of a drug compound itself are evidently associated with its ultimate effective use as a therapy. It is, therefore, possible to explore repositioning opportunities for drug compounds based on shared chemical characteristics. The rational basis for this approach is rooted in known quantitative structure–activity relationships (QSAR), which relate chemical structures to biochemical activity. Although similar structures do not always behave the same in biological systems, the degrees of similarity that exist can be exploited using computational approaches for drug repositioning. The computational basis of chemical similarity approaches is to extract a set of chemical features for each drug in a set of drugs, and then to relate the drugs directly to each other by clustering or constructing networks based on the extracted features. Drug repositioning opportunities can then be inferred by simple chemical association, or by looking for particular biological features, such as known drug targets, enriched in the resulting relationships.
As an example, to discover novel targets for metabotropic glutamate receptor (mGluR) antagonists, Noeske et al. extracted standardized pharmacophore descriptors for a collection of known mGluR antagonists and also for a broader set of known drug compounds with diverse known drug targets. These descriptors were then used to project the compounds on to a self-organizing map (SOM), which revealed distinct subclusters of mGluR antagonists, and also overlapping localization with ligands known to bind to histamine (H1R), dopamine (D2R) and several other targets. These predicted interactions were subsequently confirmed by experimental validation, which showed weak but significant binding affinities between mGluR antagonists and the predicted off-targets.
Keiser et al. took an integrated chemical similarity approach to drug repositioning that incorporated both the structural similarity between drug compounds and knowledge of established compound-target relationships. In this approach, drug targets were represented by the set of ligands known to bind to them. To evaluate a possible novel association between an established drug (query compound) and an off-target, a score was derived by calculating the sum of the structural similarity between the query compound and each member of the set of ligands known to bind to the target. The pair-wise similarity score was computed as the Tanimoto coefficient, representing the structural similarity between two compounds based on 2D chemical fingerprint descriptors of their chemical structure. A statistical model based on the extreme value distribution was then used to determine a significant score, and scores surpassing the significance threshold indicated a probable association between the query drug and the target. Several of the off-target interactions predicted by this method were experimentally confirmed by binding assays, and a predicted interaction between N,N-dimethyltryptamine and serotonergic receptors was confirmed in a knockout mouse model. Other sources of chemical information can also be used to make associations.
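The scoring step described above can be illustrated with a minimal sketch. It assumes that each 2D fingerprint is represented as a set of "on" bit indices, and it computes only the raw sum of Tanimoto coefficients; the extreme value distribution model used to assess significance is not reproduced. The function names and the toy fingerprints are hypothetical and are not taken from the published method.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def raw_target_score(query: frozenset, target_ligands: list) -> float:
    """Sum of Tanimoto similarities between a query compound and every
    ligand known to bind a target (raw score only, no significance model)."""
    return sum(tanimoto(query, ligand) for ligand in target_ligands)


# Toy, made-up fingerprints (assumption: bit indices stand in for 2D features).
query_drug = frozenset({1, 2, 3})
ligands_of_target = [frozenset({1, 2, 3}), frozenset({2, 3, 4})]

score = raw_target_score(query_drug, ligands_of_target)
```

In practice the raw score would be compared against a background distribution before any association is called, which is the role the statistical model plays in the approach described above.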
Limitations of the chemical similarity approach for drug repositioning largely stem from the fact that many structures and other chemical properties of known drug compounds contain errors, or are withheld as proprietary information. Furthermore, many physiological effects cannot be predicted by chemical properties alone, because drugs undergo complex, and largely uncharacterized, metabolic and other pharmacokinetic transformations as they are metabolized and physiologically distributed.
Another way in which drugs can be related to other drugs and disease states for the purpose of repositioning is by computational assessment of similarities in molecular profiles. When a pharmacologically active compound is exposed to a biological system, the result is a perturbation of the biological system through the compound’s mechanism of action (MOA). Although the precise MOA is not well-understood for many approved compounds, high-throughput molecular measurement techniques, such as gene expression microarrays, can be used to measure and represent the total molecular activity of a compound in a biological system. In this way, it becomes possible to construct a ‘signature’ of the molecular activity of a drug compound acting in a biological system. These signatures of molecular activity can then be compared to establish therapeutic relationships between drugs and diseases even in cases when a drug’s MOA or even primary target is unknown.
One of the most comprehensive and systematic approaches toward leveraging the molecular activity approach for drug repositioning is the Connectivity Map project . The Connectivity Map currently provides a reference collection of gene expression based molecular activity profiles for 1309 compounds, which were obtained by systematically exposing the compounds to a few key cancer cell lines and measuring the genome-wide transcriptional response. The molecular activity profile of each drug in the reference collection contains, for each gene measured, a rank-based measure of the change in transcriptional activity after exposure to the drug compound. These profiles can be used as the basis of comparison to connect drugs to other drugs and diseases based on shared molecular activity.
As one example of this utility, Iorio et al.  computed the pair-wise similarity between the molecular activity signatures of all drug compounds represented in the Connectivity Map using a novel, rank-based metric based on Gene Set Enrichment Analysis (GSEA). Drugs were then organized into a network using the resulting similarity scores, and a network partitioning strategy based on ‘affinity propagation’  was applied to cluster the network into coherent ‘communities’ of drugs. The resulting drug communities comprised drugs with similar MOAs, which often shared canonical targets and pathways. Through this approach, repositioning opportunities are revealed by co-location of drugs within the network clusters, which suggests a shared molecular activity with other drugs in the cluster. The authors used this approach to infer previously unknown cellular autophagy activity for the rho-kinase inhibitor Fasudil, which was further supported by experimental validation of predicted target levels.
Another way the Connectivity Map could be used for drug repositioning is to directly compare the molecular activity signatures of drugs with those of a disease state. Since disease pathology can similarly be viewed as a perturbation of a biological state, the same approach can be applied to measure the genome-wide transcriptional changes in the disease condition and generate a signature or profile of its molecular activity. The resulting disease signature provides a common basis by which the molecular activity of a disease can be compared to the molecular activity of a drug to identify novel therapeutic opportunities. Wei et al. successfully applied this approach to identify the mTOR inhibitor rapamycin as a modulator of glucocorticoid (GC) resistance in acute lymphoblastic leukemia (ALL). They first derived a gene expression signature of GC resistance by comparing gene expression profiles of GC-resistant and GC-sensitive ALL samples. They then compared the GC-resistance signature with the drug compound molecular profiles in the Connectivity Map using the nonparametric GSEA method, which revealed a significant enrichment of concordant gene expression changes between gene transcripts in the GC-resistance signature and those in the molecular activity profile of rapamycin.
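The idea of comparing a disease signature with a rank-based drug profile can be sketched as follows. This is a deliberately simplified stand-in, not the GSEA statistic used with the Connectivity Map: it assumes a drug profile is a list of genes ordered from most up-regulated to most down-regulated, and it scores a signature by the difference in mean rank position between its down-regulated and up-regulated genes. The function name and gene labels are hypothetical.

```python
def concordance_score(ranked_genes, up_genes, down_genes):
    """Simplified rank-based agreement between a signature and a drug profile.

    ranked_genes: genes ordered from most up- to most down-regulated by the drug.
    Returns a value in [-1, 1]; positive means the drug shifts the signature's
    genes in the same direction, negative means the opposite direction.
    """
    n = len(ranked_genes)
    if n < 2:
        raise ValueError("need at least two ranked genes")
    # Normalized position: 0.0 = most up-regulated, 1.0 = most down-regulated.
    position = {gene: i / (n - 1) for i, gene in enumerate(ranked_genes)}
    up = [position[g] for g in up_genes if g in position]
    down = [position[g] for g in down_genes if g in position]
    if not up or not down:
        raise ValueError("signature genes are missing from the drug profile")
    return sum(down) / len(down) - sum(up) / len(up)


# Toy example with made-up gene labels.
drug_profile = ["A", "B", "C", "D", "E"]
same_direction = concordance_score(drug_profile, {"A", "B"}, {"D", "E"})
opposite_direction = concordance_score(drug_profile, {"D", "E"}, {"A", "B"})
```

A real analysis would use an enrichment statistic with a permutation-based significance estimate; the sketch only shows how a signature's direction is read against a ranked profile.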
Strategies for drug repositioning based on molecular activity similarity are not limited to analysis of transcriptional response, and may incorporate any number of high-dimensional molecular characterizations of drug effect, such as chemical screening assays or high-throughput gene knockout assays. Chen et al. used data from the PubChem bioassay repository to create molecular activity profiles for the drug compounds represented there, which were organized into similarity networks based on their bioassay activity profiles. The resulting bioassay network was then mapped on to biological networks constructed from metabolic pathways and protein–protein interactions using a bipartite mapping scheme that considered the sequence similarity between protein targets in the assay nodes and the protein sequence nodes in the biological network. The resulting mappings between the PubChem bioassay and biological networks provide representation and interpretation of the biological activity of compounds within the context of biological systems, with the goal of understanding how the compounds might perturb a biological system toward efficacy or adverse effects.
The primary limitation of drug repositioning strategies based on molecular activity similarity is their heavy reliance on the quality and assumptions of the means used to derive molecular activity profiles. For example, the Connectivity Map is derived by exposing the whole drug compound to isolated cell lines, which may not accurately reflect the biological activity of the drug in a complete physiological system. Many drugs undergo chemical transformations after they are metabolized in vivo, and in fact the drug metabolites often provide the eventual therapeutic effect. Furthermore, the pathology of many disease conditions, including metabolic diseases such as Type 2 diabetes, spans multiple tissues and organ systems; therefore, it might be difficult to represent and compare such diseases on the basis of a single molecular activity signature.
Molecular docking comprises a set of computational methods aimed at discovering novel relationships between chemical ligands and targets through simulation and modeling of their direct physical interaction. Molecular docking methods can enable drug repositioning by attempting to predict physical interactions between existing compounds and novel therapeutic targets. If a drug is predicted to physically interact with a previously unknown target, then the drug might be considered as a possible repositioning candidate for disease conditions in which the predicted target is known or suspected to play a role, or to perturb the molecular pathology of the disease. Molecular docking approaches can be used on a target-by-target basis to look for repositioning strategies for a particular target of interest, or to establish networks of ligand–target interactions to explore drug repositioning opportunities across systems of predicted drug–target interactions.
Zahler et al. describe a virtual ‘inverse’ screening approach to identify novel targets of the kinase inhibitor indirubin. Beginning with indirubin as a chemical of interest, they sequentially screened it as a ligand against a database of kinase receptor structures through molecular docking to discover and validate a novel interaction between an indirubin derivative and PDK1. Kinnings et al. employed molecular docking in a chemical systems biology approach to reposition Entacapone, a catechol-O-methyl transferase (COMT) inhibitor used to treat Parkinson’s disease, as a treatment for multi-drug resistant (MDR) tuberculosis. They began by extracting target-binding sites for approved drugs from the 3D structures of their known targets, and then performed a computational search to identify putative off-target proteins with similar ligand binding sites. They then used molecular docking to evaluate candidates from the binding site similarity search for physical interaction with the associated drug ligand, retaining predicted interactions as putative novel off-targets of the drug. Focusing on off-target predictions for the protein enoyl–acyl carrier protein reductase (InhA), which is involved in synthesis of the bacterial cell wall, they used this approach to identify and validate antagonistic ligands of a binding site extracted from COMT. Taking this approach further, Chang et al. have recently shown that structure-based predictions can also be filtered based on other data and knowledge sources, such as metabolic networks, tissue localization and gene expression patterns.
There are some potential limitations to the use of molecular docking in drug repositioning. Foremost, the approach typically requires that the 3D structures of both the chemical ligand and the protein target be well resolved. At present, the structures of many physiologically important proteins are not fully resolved, including whole families of G-protein coupled receptors (GPCRs), which are the targets of many approved drugs. Additionally, the results of molecular docking are known to incur high false-positive rates, due to errors in resolved protein structures and incomplete modeling of atomic and molecular interactions. Despite these challenges, it is clear that there are tens of thousands of drugs with available information that are waiting for an indication, and multiple drug-based approaches are likely to be fruitful.
One approach to leveraging disease-based information for drug repurposing is to utilize knowledge of drug indications for disease. Chiang et al. used a ‘guilt by association’ approach to discover novel drug indications based on the similarity of their efficacious indications. Diseases were deemed similar to each other if they already shared a significant number of therapies. Across each pair of similar diseases, the remaining drugs that were currently used against only one of the pair were then considered logical candidates for the other disease in the pair. Further novel drug-indication associations could then be inferred by expanding from simple pairs into network clusters.
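The pairwise step of this ‘guilt by association’ reasoning can be sketched in a few lines. The sketch assumes that known indications are available as a mapping from disease to the set of drugs used against it, and it replaces the notion of a "significant number" of shared therapies with a simple count threshold, `min_shared`, which is an assumption made here for illustration. Disease and drug names are made up, and the expansion into network clusters is not shown.

```python
from itertools import combinations


def suggest_indications(treatments, min_shared=2):
    """Suggest (drug, disease) pairs by transferring drugs between diseases
    that already share at least `min_shared` therapies.

    treatments: dict mapping disease name -> set of drug names.
    """
    suggestions = set()
    for d1, d2 in combinations(sorted(treatments), 2):
        shared = treatments[d1] & treatments[d2]
        if len(shared) < min_shared:
            continue  # diseases not considered similar under this threshold
        # Drugs used for only one disease of the pair become candidates
        # for the other disease.
        for drug in treatments[d1] - shared:
            suggestions.add((drug, d2))
        for drug in treatments[d2] - shared:
            suggestions.add((drug, d1))
    return suggestions


# Toy, made-up indication data.
known = {
    "disease_X": {"drug_a", "drug_b", "drug_c"},
    "disease_Y": {"drug_a", "drug_b", "drug_d"},
    "disease_Z": {"drug_e"},
}
candidates = suggest_indications(known)
```

With these toy inputs, `disease_X` and `disease_Y` share two drugs, so each disease's remaining drug is proposed for the other, while `disease_Z` contributes nothing.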
However, repositioning strategies based on the associative transfer of indications are limited by the varied and complex relationships through which a drug comes to be indicated for a particular disease condition. For example, many drugs are indicated as palliative treatments for various cancers, and these are difficult to distinguish from drugs indicated as primary chemotherapeutic treatments. At present, there is no comprehensive, systematic representation of known drug indications that would enable such fine-scale delineation of types of drug–disease relationships; however, efforts to construct systematized drug ontologies and other resources are underway.
An implicit assumption in drug repositioning is that a drug can be repositioned from one indication to another because the two indications share some aspect of underlying molecular pathophysiology that is responsive to the therapeutic effect of a drug. Therefore, computational strategies for assessing molecular relationships between distinct disease pathologies can serve as a means for drug repositioning. In this approach, repositioning opportunities exist when diseases are found to exhibit similarity at the molecular level (even without similarity at the phenotypic or clinical level), suggesting that drugs might be shared among diseases with high degrees of molecular similarity.
Hu and Agarwal  created a disease-similarity network using publicly available gene expression profiles, and integrated this network with molecular profiles and knowledge of drugs and drug targets to infer drug repositioning opportunities and suggest molecular targets and mechanisms underlying drug effects. They began by identifying and acquiring disease-related experiments in the NCBI Gene Expression Omnibus (GEO) and computed differential gene expression profiles between classes represented in the experiments (e.g. affected versus healthy). The resulting disease profiles were compared using a correlation-based similarity metric and organized into a network to reveal novel disease relationships based on genome-wide transcriptional response. This network was further integrated with drug molecular profiles derived from the Connectivity Map to create a drug–disease network where clusters of drugs and diseases suggest shared drug mechanisms and molecular disease pathology. Similar efforts using the genetics and known pathways involved in these diseases have also been successful [36, 37].
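A correlation-based disease network of the kind described above can be sketched as follows. The sketch assumes that each disease profile is a list of differential expression values over the same ordered set of genes, uses the Pearson correlation as the similarity metric, and keeps an edge when the correlation reaches a cutoff, `min_corr`. The specific metric, the cutoff and the toy values are assumptions for illustration, not details taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def disease_network(profiles, min_corr=0.8):
    """Return {(disease1, disease2): correlation} for pairs whose
    differential expression profiles correlate at or above `min_corr`."""
    edges = {}
    for d1, d2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[d1], profiles[d2])
        if r >= min_corr:
            edges[(d1, d2)] = r
    return edges


# Toy, made-up differential expression profiles over four genes.
toy_profiles = {
    "disease_A": [1.0, 2.0, 3.0, 4.0],
    "disease_B": [2.0, 4.0, 6.0, 8.0],
    "disease_C": [4.0, 3.0, 2.0, 1.0],
}
network = disease_network(toy_profiles)
```

Drug profiles expressed over the same genes could be added to `profiles` in the same way to obtain a combined drug–disease network, although a strongly negative correlation between a drug and a disease may be the more interesting signal in that setting.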
Suthram et al. integrated a larger set of molecular profiles of diseases with protein–protein interaction (PPI) data to infer protein functional modules and networks that were shared among many diseases. Using a human PPI network that was organized into ‘modules’ of functionally interacting proteins, they applied a statistical approach to evaluate the molecular signatures of diseases for functional module activity, whereby the activity of a module was determined as the mean normalized transcriptional activity of its component genes in the disease molecular profile. A network of disease–disease relationships was then formed on the basis of functional module activity shared between diseases. Within the networks found to be common across diseases were multiple known drug targets. Curiously, drugs that hit these targets were more often already known to be effective against multiple indications than drugs hitting other targets in other networks; the authors termed these pluripotent drug targets. This work also leveraged publicly available experiments from GEO to create molecular profiles of disease, which were restricted to include only those experiments comparing disease-affected tissues to healthy controls in order to create standardized disease signatures.
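The module activity measure described above, the mean normalized transcriptional activity of a module's component genes, is simple enough to sketch directly. The sketch assumes that a disease profile is a mapping from gene to an already normalized expression value and that modules are given as sets of gene names; genes absent from the profile are skipped, which is a choice made here rather than a detail from the study. All names and values are made up.

```python
def module_activity(profile, modules):
    """Mean normalized expression of each module's genes in one disease profile.

    profile: dict mapping gene -> normalized transcriptional activity.
    modules: dict mapping module name -> set of member genes.
    Modules with no measured genes are left out of the result.
    """
    activity = {}
    for name, genes in modules.items():
        values = [profile[g] for g in genes if g in profile]
        if values:
            activity[name] = sum(values) / len(values)
    return activity


# Toy, made-up profile and modules.
disease_profile = {"g1": 2.0, "g2": 1.0, "g3": -1.0}
ppi_modules = {"m1": {"g1", "g2"}, "m2": {"g3", "g4"}, "m3": {"g5"}}

activities = module_activity(disease_profile, ppi_modules)
```

Computing this vector for every disease gives a module-level representation in which diseases can be compared, which is the basis on which the disease–disease network was formed.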
The limitations of using shared molecular pathology approaches are similar to those implicit in the shared molecular activity approach for drug-based repositioning. The utility of these approaches will be limited by the ability to measure and represent the molecular pathology underlying the disease. As disease pathology often incorporates a multitude of molecular entities, tissues and organ systems, it can be quite challenging to model the molecular state of diseases such that they can be easily compared using computational approaches. Promising network-based approaches to overcoming the challenge of modeling and comparing complex molecular disease states have been proposed by Schadt et al. [39–41].
Another way to connect drugs to clinical effects for drug repurposing is through the side effects of drugs, which represent unintended consequences of the drug action. Side effects provide a means to connect drugs to diseases, because they encode the physiological consequence of a drug compound’s biological activity. Furthermore, the phenotypic expression of a side effect can be similar to that of a disease, implying that the underlying pathways or physiological systems may be similarly perturbed by both the drug and the disease condition. This provides the basis to relate drugs to other drugs or diseases by side effect profiles, even in cases where the precise pharmacological mechanism facilitating the side effect is unknown.
Campillos et al.  performed a systematic analysis to identify novel drug–target relationships for 746 approved drugs using a side effect similarity approach. For each drug, they extracted side effects from the drug package insert and mapped them to standardized medical symptom terms using the Unified Medical Language System (UMLS) Metathesaurus . The side effect terms were given weights using a scheme that incorporated their frequency and correlation across all drugs in the set, and similarity scores were computed between a pair of drugs based on the sum of the respective weights of their common side effects. A randomization approach was used to establish the significance of the side effect similarity scores, which were further incorporated with a measure of the structural similarity between drugs to increase predictive power. The resulting drug–drug relationships were shown to recapitulate many shared target relationships between drugs, and several predicted novel drug–target relationships were experimentally confirmed.
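The core of such a side effect similarity score, a sum of weights over the side effects two drugs have in common, can be sketched as follows. The weighting used here is a plain inverse-frequency weight (the negative log of the fraction of drugs listing the side effect), which is an assumption standing in for the published scheme; the correlation between side effects, the randomization test and the structural similarity term are not reproduced. Drug names and side effect terms are made up.

```python
from math import log


def side_effect_weights(profiles):
    """Weight each side effect by -log(fraction of drugs that list it),
    so that rare side effects count more than ubiquitous ones."""
    n_drugs = len(profiles)
    counts = {}
    for effects in profiles.values():
        for effect in effects:
            counts[effect] = counts.get(effect, 0) + 1
    return {effect: -log(c / n_drugs) for effect, c in counts.items()}


def side_effect_similarity(drug_a, drug_b, profiles, weights):
    """Sum of the weights of the side effects shared by two drugs."""
    return sum(weights[e] for e in profiles[drug_a] & profiles[drug_b])


# Toy, made-up side effect profiles (terms assumed already standardized).
toy = {
    "drug_1": {"nausea", "rash"},
    "drug_2": {"nausea", "rash"},
    "drug_3": {"nausea"},
    "drug_4": {"nausea", "tremor"},
}
w = side_effect_weights(toy)
sim_12 = side_effect_similarity("drug_1", "drug_2", toy, w)
sim_13 = side_effect_similarity("drug_1", "drug_3", toy, w)
```

In this toy set every drug lists nausea, so it carries no weight, and the similarity between `drug_1` and `drug_2` comes entirely from the rarer shared side effect.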
The most apparent limitation of the side effect similarity approach is the necessity of having well-defined side effect profiles for a drug. Despite rigorous preclinical assessment, the side effect profile for a newly approved drug may only be fully discerned after years of clinical use and postmarket surveillance. In addition, the assumption that similar phenotypic expression of a drug side effect implies a common pathophysiological basis may not always hold. For example, the side effect of ‘hair loss’ can arise when a drug interferes with hormonal systems that regulate hair growth, or alternatively when it disrupts immune function in a manner that harms the cells comprising the hair follicle.
With continued difficulties in accelerating the rate at which new chemical entities reach regulatory approval, it appears certain that the strategy of repositioning existing pharmaceuticals will only gain in acceptance. Computational methods for repositioning are probably the most efficient way to yield novel indications for these drugs, and the power of these methods will only increase as more molecular measurements of increasingly diverse types become available. However, in general, extensive animal model studies and clinical trials have not yet been launched based on these computational predictions; these studies are now needed to fully demonstrate the utility of computational drug repositioning.
Regardless of whether computational methods become the standard for drug repositioning, it is clear that many other undiscovered uses of drugs do exist. Finding these new uses is an important and necessary step towards reducing the burden of disease.
This work was supported by funding from the National Library of Medicine Biomedical Informatics Training Grant (T15 LM007033) and NIGMS (R01 GM079719).
Joel Dudley is a Bioinformatics Specialist at Stanford, developing genomics-based classifications of human disease and studying medical evolution. Joel was previously at Arizona State University, involved with developing MEGA.
Tarangini Deshpande holds a PhD from Purdue University, and is a founder of NuMedii, a company using a comprehensive genomics computational platform for new indications discovery.
Atul Butte is an Assistant Professor and Chief of the Division of Systems Medicine in the Department of Pediatrics at Stanford, and has authored over 100 publications.