The primary results of the study fall into two categories. The first relates to what can be predicted about a compound’s potential biological effects based purely on molecular similarity, and whether the use of 2D or 3D similarity methods influences the types of inferences that can be made. The second involves a census of the pharmacological congruence between pairs of drugs, where the pairs are defined by the characteristics of their 2D and 3D similarities. The principal observation from the first category is that both 2D and 3D similarity (and their combination) are able to predict biological targets, but 3D similarity is more likely to identify effects that are not obvious from knowledge of pre-existing molecular pharmacology. The principal observation from the second category is that molecules sharing high similarity in both a 2D and a 3D sense are much more likely to exhibit highly similar target profiles than molecules that exhibit topological variation but retain high 3D similarity.
Predicting On- and Off-Target Effects
For each of the 358 drugs (see Methods and Data), we computed a log-odds score for each of the 44 targets that had 5 or more annotated drugs, as either primary or secondary modulators, to serve as positive examples. Figure 6 shows the results of the computation for both primary targets (top) and secondary targets (bottom). Nearly all computed log-odds scores were positive (about 90% for all methods), indicating greater similarity than dissimilarity to example sets of compounds for the 2D, 3D, and 2D+3D log-odds computations alike. This was the desired result, since the definitions of targets differentiated between different binding sites on the same protein assemblies, so ligands within a given set of modulators were known to bind competitively.
Figure 6. Proportion of drug targets correctly predicted. These plots indicate the proportion of drug targets correctly predicted by the three similarity methods at various log-odds thresholds. Using 2D (red line), 3D (green line), or a combination of the similarity ...
For primary target predictions, performance of the methods was 2D < 3D < 2D+3D, but the degree of improvement in moving beyond 2D ranged from 9 percentage points at a log-odds threshold of 6.0 to 15 percentage points at a log-odds threshold of 20.0. The added value of combining the two similarity approaches yielded typical gains of 10 percentage points over a broad range of log-odds values. At a threshold of 6.0, the combination of 2D+3D similarity methods was able to identify a majority (59%) of all primary target annotations. As mentioned earlier, and as we have previously reported, the relatively limited gains of 3D over 2D are explained directly by human design bias.1
The new observation here is that the effect holds in the forward predictive direction: when one has a set of ligands with known activity, 2D similarity works quite well in assigning primary targets to new molecules. For secondary target prediction, the same qualitative performance was observed, but the performance gains for 3D over 2D were 24 percentage points (log-odds threshold of 6.0) to 30 points or greater for log-odds thresholds of 10.0 or more. The combination of methods yielded only a marginal gain, as with primary targets, of typically 10 percentage points or less, identifying 68% of the secondary targets at a log-odds threshold of 6.0.
The highlighted circles in Figure 6 (bottom) mark specific secondary target predictions, examples of which are shown in Figures 7, 8, and 9. Figure 7 shows the case of promethazine, whose primary target is the H1 receptor, and whose off-targets include multiple dopamine receptor subtypes. The drugs promazine and trifluoperazine are examples of the degree of structural concordance that can occur, allowing for predictions of targets by essentially any method for computing molecular similarity. For these two molecules compared to promethazine, both 2D and 3D approaches produced p-values less than 0.01 (the most extreme bin from ), and combined with p-values from 33 other dopamine D2 receptor drug comparisons, yielded log-odds scores of 10, 15, and 22 for 2D, 3D, and 2D+3D, respectively. These drugs were all synthesized as part of the medicinal investigation of what were then termed “anti-histaminic phenothiazines,” many of which had anti-psychotic properties.21
These properties were due to a host of effects on different brain receptors, but are thought to derive primarily from modulation of dopamine receptors of multiple subtypes. Relatively subtle changes in structure (e.g. from promazine to promethazine) yield sufficient differences in target potencies to shift the primary indication from anti-psychotic for promazine to anti-histamine for promethazine. However, the shifts in potency are not so dramatic as to abrogate the multiple target effects entirely.
Figure 7. A 2D similarity method can sometimes correctly predict a target. Shown above are the 2D structures and 3D similarity overlays for the drug promethazine (a histamine H1 receptor antagonist) compared to two dopamine receptor antagonists, promazine and trifluoperazine. ...
Figure 8. Our 3D similarity method more accurately predicts off-targets. Shown above are the 2D structures and 3D similarity overlays for the drug thioridazine (a dopamine receptor antagonist) compared to two muscarinic receptor antagonists, oxybutynin and diphenidol. ...
Figure 9. A combination of 2D and 3D similarity methods makes a small improvement over 3D alone. Shown above are the 2D structures and 3D similarity overlays for the drug nefazodone (a 5HT reuptake transporter inhibitor) compared to two alpha1 adrenergic receptor ...
Figure 8 shows another example of a phenothiazine anti-psychotic, thioridazine, whose primary effects derive from dopamine receptor modulation. Here, however, some of the more significant side-effects are those mediated by muscarinic antagonism, including dry mouth and blurred vision. In this case, the 2D log-odds was just 3. Whereas 2D similarity did not produce a low p-value when comparing thioridazine to either oxybutynin or diphenidol (two potent anti-muscarinics), 3D similarity yielded much lower p-values. Coupled with those derived from comparisons to 61 other M1 drugs, the 3D log-odds score was 39, allowing very confident assignment of muscarinic targeting to thioridazine. In this case, the addition of 2D similarity to 3D produced a slight reduction in the computed log-odds score. Log-odds scores are not additive; additional observations affect the combinatorics such that a collection of p-values which alone yield a marginally positive log-odds score may diminish the score derived from a collection of p-values that produced a high score.
Figure 9 shows a case where the combination of 2D and 3D similarity produced a log-odds score greater than 6.0 where neither alone met that threshold. Nefazodone yields its primary effects through modulation of multiple reuptake transporters, but it has a significant side-effect of postural hypotension deriving from modulation of alpha-adrenergic receptors. In this case, for the alpha-1A receptor, 2D alone yielded a log-odds of 2.4, with 3D yielding 5.5. Comparisons to dapiprazole and doxazosin produced no extreme p-values using either 2D or 3D, but all four scores leaned in favor of similarity to nefazodone. Along with 50 other drug comparisons, the combined log-odds for adrenergic effects was 6.5.
Excess Targets: False Positive Predictions
The framework we have developed allows for the combination of multiple sources of information to yield a single scalar value associated with a class prediction. In such a situation, it is both customary and desirable to estimate not only true positive success rates but also the corresponding false positive rates (e.g. with a receiver operating characteristic (ROC) analysis). Here, we were able to identify primary and secondary targets about 60–70% of the time at a combination log-odds score threshold of 6.0. However, at that threshold, there are targets suggested for drugs for which no annotation is known. At a log-odds threshold yielding a true positive rate of 60%, the typical ratio of excess predicted targets relative to the total number of known primary and secondary targets was roughly two to three, depending on the class of drugs involved. Larger numbers of excess targets were observed for drugs whose primary targets were among the aminergic GPCRs. The difficulty in interpreting this observation is that public data do not exist that systematically profile small molecule drugs in biochemical assays.
As a surrogate for biochemical data in unknown drug-target relationships, we manually assessed package insert and related information to determine whether muscarinic side-effects were both present and drug-related. These included dry mouth, urinary retention, blurred vision, drowsiness, mydriasis, and other effects. Of the 358 drugs, 294 had no formal annotations of muscarinic target effects. We surveyed a random subset of slightly more than half of these (180 drugs total), finding 84 with muscarinic side-effects and 96 without. We also surveyed 29 of the 64 drugs that we had previously annotated as binding muscarinic receptors. All 29 of the previously annotated muscarinic modulators showed clear, drug-related side-effects (90% exhibited dry mouth effects, 69% drowsiness, and a majority also showed urinary retention, blurred vision, and dizziness). For the 64 drugs with annotated muscarinic target effects, the mean log-odds score for muscarinic receptors was 25.8, with 92% scoring higher than 6.0.
Overall, using the side-effect assessments as a binary class label for the 180 surveyed drugs that had not been annotated as muscarinic modulators, the log-odds score produced an ROC area of 0.88 (95% confidence interval of 0.83–0.93). The enrichment for drugs with muscarinic side-effects among the top 1% log-odds scores was 19-fold. Of the surveyed drugs, 90% of those with a log-odds score of 6.0 or greater showed classic muscarinic side-effects (38/42 surveyed drugs). Even at a threshold of just 2.0, 85% were positive (55/65). Above a threshold of 26.0, all surveyed drugs (16 total) showed such side-effects. Conversely, below a log-odds threshold of -6.0, just 6% (3/47 surveyed) showed potentially muscarinic effects. Below a threshold of -16.0, no drugs showed such effects (26 total).
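The two summary statistics used in this paragraph, ROC area and the fraction of positives above a score threshold, can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names and the toy scores and labels are hypothetical, and the ROC area is computed by the standard pairwise (Mann-Whitney) formulation with ties counted as one half.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """ROC area as the probability that a random positive outscores a
    random negative, counting tied scores as half a win."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def positive_fraction_at_or_above(scores: Sequence[float],
                                  labels: Sequence[bool],
                                  threshold: float) -> float:
    """Fraction of items scoring >= threshold that carry the positive label."""
    selected = [y for s, y in zip(scores, labels) if s >= threshold]
    if not selected:
        raise ValueError("no scores at or above the threshold")
    return sum(selected) / len(selected)


# Hypothetical toy data: log-odds-like scores and side-effect labels.
toy_scores = [30.0, 20.0, 12.0, 7.0, 5.0, 1.0, -3.0, -8.0, -16.0]
toy_labels = [True, True, True, False, True, False, True, False, False]

print(roc_auc(toy_scores, toy_labels))
print(positive_fraction_at_or_above(toy_scores, toy_labels, 6.0))
```

Counting ties as one half matters for the method comparisons later in the text, where large blocks of tied scores make ROC areas harder to interpret.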
Three examples of drugs that had lacked muscarinic annotations are particularly informative. Amoxapine, an anti-depressant working primarily through the norepinephrine reuptake transporter, received a log-odds score of 4.8. Prescribing information indicated that the most frequent side-effects included dry mouth, constipation, and blurred vision. It has also been shown biochemically to bind muscarinic receptors.22–24
Orphenadrine, an anti-histamine prescribed to relieve muscular pain, received a score of 42.9. Prescribing information indicates that “dryness of the mouth is usually the first adverse effect to appear.” The drug has been shown to antagonize muscarinic receptors with a Ki
Mesoridazine received a score of 37.1, had clear muscarinic side-effects, and also has a Ki of 69 nM against the M1 receptor.22 Notably, it received a log-odds score of 6.4 against the HERG potassium channel, though it had not been annotated for such activity. It was withdrawn from the US market in 2004 due to HERG-mediated cardiac side-effects.26
This survey of muscarinic side-effects among previously unannotated drugs makes three points. First, the empty cells of the annotation matrix of drug-target interactions cannot be thought of as indicating no effect. Second, the log-odds scores were both sensitive and specific with respect to muscarinic target annotation. Third, the lack of systematic profiling of drugs for which ample human data exist represents a large gap in our knowledge. Manual curation of this depth requires on the order of 30 minutes to 1 hour per drug per side-effect, after establishing the relationship between a particular target and the relevant human pharmacology down to specific terms and variations. We are exploring automated means to consider databases of side-effect terms and their relationship to predicted on- and off-targets in order to carry out a more comprehensive study.
Approaches for semi-automatic curation such as SIDER12 are challenged by variations in language such as “dryness of the mouth” instead of “dry mouth.” The MedDRA dictionary,27 for example, lists the latter as a defined medical term (but not the former), and relatively sophisticated language parsing is required to relate the two. In the case of orphenadrine, one of the 84 drugs with clear muscarinic side-effects, SIDER misses the dry mouth effect, which is clinically the most prominent. Even with much more extensive synonym mapping, cases exist where side-effects are listed as not being present, or are listed as being present but then dismissed as not different from placebo, which is challenging to assess without expert manual curation.
Relationship to Other Methods
Two relatively recent approaches to data fusion involving molecular similarity are particularly relevant to our log-odds scoring approach. Muchmore and Hajduk’s Belief Theory approach5 and the Similarity Ensemble Approach (SEA)3 introduced by Shoichet’s group both offer the means to make predictions about a given molecule’s activity based upon its relationships to other molecules.
The Belief Theory approach makes use of Hooper’s Rule, which was devised in the late 1600s by George Hooper, predating the Bayesian belief approaches later popularized by Laplace and his adherents.28 The rule was devised to address the credibility of a report of some fact when simultaneously attested by N reporters, each with credibility p (high p implying high credibility). This rule formalizes the notion that multiple partially credible sources strengthen one another’s credibility. In the original report applying this rule to predictions of molecular activity based on similarity, the definitions of positive and negative pairs of molecules differed from the current work: pairs considered positive shared not only a target but also similar potency against that target. Similarity descriptors were converted into probability functions by considering a large set of positive and negative pairs and counting the number of times that a pair with some level of similarity was a positive example. Evidence from multiple similarity methods concerning pairs of molecules was combined using Hooper’s Rule. A key distinction from our log-odds approach is that the Belief Theory formalism always increases belief, no matter how marginal an additional source’s belief may be. In the log-odds approach, N very low p-values coupled with N symmetrically high p-values yield a log-odds of zero for associating a target with a ligand. The Belief Theory approach treats cases of very low similarity as attesting in favor of the proposition that the query molecule will hit the target in question, but with low belief. One might argue that such a value is better interpreted as a reporter attesting against a fact rather than giving it marginal support, making the application of Hooper’s Rule a matter of empirical choice rather than pure logic.
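The contrast drawn here can be made concrete with a small sketch. Hooper’s Rule is implemented in its standard product form. The second function is only a symmetric stand-in chosen to reproduce the cancellation property described above; it is an assumption for illustration and is not the log-odds computation used in this work, which is defined in Methods and is not additive. All names and p-values are hypothetical.

```python
import math
from typing import Sequence


def hooper_joint_belief(beliefs: Sequence[float]) -> float:
    """Hooper's Rule: joint belief = 1 - product of the individual doubts."""
    remaining_doubt = 1.0
    for b in beliefs:
        remaining_doubt *= 1.0 - b
    return 1.0 - remaining_doubt


def symmetric_logit_score(p_values: Sequence[float]) -> float:
    """Toy symmetric score (NOT the paper's log-odds): sum of
    log10((1 - p) / p). Requires 0 < p < 1 for every p-value."""
    return sum(math.log10((1.0 - p) / p) for p in p_values)


# Hypothetical p-values: two strong similarities and their mirror images.
strong = [0.01, 0.02]
weak = [1.0 - p for p in strong]

belief_strong_only = hooper_joint_belief([1.0 - p for p in strong])
belief_with_weak = hooper_joint_belief([1.0 - p for p in strong + weak])

# Under Hooper's Rule the weak evidence still nudges belief upward.
print(belief_strong_only, belief_with_weak)

# Under the symmetric stand-in, mirrored evidence cancels to (nearly) zero.
print(symmetric_logit_score(strong + weak))
```

The point of the comparison is only the direction of the effect: poor similarity values raise the joint belief slightly under Hooper’s Rule, whereas a symmetric score lets them count against the proposition.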
The SEA method3 uses a framework for estimating probabilities that is similar to that used for comparing sequence similarity of proteins, with likelihoods represented as E-values (a p-value multiplied by a large, arbitrary constant representing a database size). SEA makes use of 2D topological similarity to compute pairwise similarities between sets of molecules. By choosing a threshold below which to ignore similarity values, the pairwise sum of all similarities between two sets of unrelated molecules was shown to fit an extreme value distribution. So, to compare one (or several) molecules against a set with known activity, the magnitude of the raw similarity set comparison score is compared with that expected from unrelated sets, a probability is derived, and an E-value is produced. In contrast with the Belief Theory approach, in this formulation, the presence of poor similarity values yields poorer E-values.
Both the Belief Theory and SEA approaches treat raw similarity values as being equivalent regardless of the specific molecules or molecule sets in question. Our observation of both the GSIM and Surflex-Sim methods, which we believe will also hold for other methods such as ROCS29 (3D) and Daylight fingerprint-based similarity30 (2D), is that the probability of observing some raw value varies depending on the particular structure involved. As seen in , p-values associated with narrow similarity ranges included extremely significant values as well as clearly random ones. For similar molecules, the distributions of observed similarities to the background set tend to be close (see for an example). However, for a small and simple molecule, such as acetaminophen, the required similarity score to reach a p-value of 0.01 is higher (8.3) than for more complex molecules, such as azithromycin, where the required similarity is lower (5.8). Clearly, the particular values depend on the composition of the background molecule set, but we do not believe it is possible to construct a non-degenerate background set against which all molecules will exhibit congruent similarity distributions. By assessing differences in the likelihood of observing different similarity levels within the context of each specific molecule pair, it is likely that the associated log-odds scores better reflect the underlying similarity relationships than approaches that take a coarser-grained view. Of course, it is also possible to make use of global similarity distributions with the log-odds approach, but it is difficult to justify doing so.
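The idea of molecule-specific p-values can be illustrated with a short sketch: each molecule has its own distribution of similarity scores against a background set, so the raw score needed to reach a given p-value differs from molecule to molecule. The functions and the two synthetic backgrounds below are hypothetical and are not the data or code behind the values quoted above.

```python
from bisect import bisect_left
from typing import Sequence


def empirical_p_value(score: float, background: Sequence[float]) -> float:
    """Fraction of background similarity scores that are >= score."""
    ordered = sorted(background)
    n_at_or_above = len(ordered) - bisect_left(ordered, score)
    return n_at_or_above / len(ordered)


def score_needed_for_p(target_p: float, background: Sequence[float]) -> float:
    """Smallest observed background score whose empirical p-value is <= target_p."""
    for candidate in sorted(set(background)):
        if empirical_p_value(candidate, background) <= target_p:
            return candidate
    raise ValueError("no background score reaches the requested p-value")


# Synthetic backgrounds (assumption): a small, simple molecule tends to
# score fairly high against everything; a large, complex one scores lower.
small_molecule_background = [5.0 + 0.04 * i for i in range(100)]
large_molecule_background = [3.0 + 0.03 * i for i in range(100)]

# The same raw score carries very different significance for each molecule.
print(empirical_p_value(6.5, small_molecule_background))
print(empirical_p_value(6.5, large_molecule_background))

# The raw score required for p <= 0.01 is molecule-specific.
print(score_needed_for_p(0.01, small_molecule_background))
print(score_needed_for_p(0.01, large_molecule_background))
```

A single global distribution would assign both molecules the same p-value for a raw score of 6.5, which is the coarser-grained treatment the paragraph argues against.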
Quantitative Comparison to Other Approaches
The muscarinic side-effect prediction task offers the opportunity for direct comparison of our approach to Belief Theory and to SEA. We computed joint beliefs regarding muscarinic activity for the 180 drugs, and used these beliefs to assess ROC area. Recall that the 180 drug set consisted of 86 positives and 94 negatives based on the presence of side-effects, with similarities for each computed against 64 known muscarinic modulating drugs. For Belief Theory, the formula for combining evidence is given by B = 1 − (1 − B_1)(1 − B_2) … (1 − B_N), where B_1 … B_N are the separate beliefs associated with the assertion that a given molecule has a particular activity. The most direct comparison to Muchmore and Hajduk’s formulation is made by setting each B_i = 1 − p_i, with each p_i derived from the 3D similarity computations used above for the log-odds approach. Using a single global distribution to obtain p-values from the similarities, we observed an ROC area of 0.61 ± 0.05 (95% confidence interval), which was significantly worse than for the log-odds approach (0.88 ± 0.05). Using empirically determined p-values for each molecular comparison (as the log-odds approach does), the performance improved to 0.72 ± 0.05, but was still significantly worse than the log-odds result.
Note, however, that the ROC area comparisons are somewhat misleading due to the degeneracy in the Belief Theory evidence rule. If a single belief is 1.0 (p-value of 0.0), the overall joint belief will be 1.0 no matter what the other belief values may be. For the muscarinic side-effect prediction task, this results in a large proportion of joint beliefs for the 180 drugs to be exactly 1.0. This degeneracy stems from the definition of Hooper’s Rule, but its effect can be ameliorated by scaling down all beliefs by a constant factor. The best result we were able to obtain for Belief Theory was an ROC area of 0.85 ± 0.05 (nominally indistinguishable from log-odds), using B = 0.5 * (1−p), with empirically determined p-values for each pairwise molecular comparison. Even with this augmentation, there were a significant number of tied values of high belief, covering nearly 10% of the 180 ligands. The maximal enrichment for Belief Theory, in this most favorable (and artificial) formulation, was 6.4, corresponding to a true-positive (TP) rate of 55% and false-positive (FP) rate of 9%. Much better early enrichment was possible with the log-odds approach, since there is no multiplicative degeneracy involving strict interpretation of p-values. We obtained maximal enrichment of 20-fold at a false-positive rate of just 1% using 3D log-odds.
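The degeneracy and the scaling remedy described in this paragraph can be sketched as follows. The rule is applied to beliefs B_i = scale × (1 − p_i), with scale = 1.0 giving the direct formulation and scale = 0.5 the scaled variant mentioned above. The function name and the two p-value lists are hypothetical.

```python
from typing import Sequence


def joint_belief_from_p_values(p_values: Sequence[float],
                               scale: float = 1.0) -> float:
    """Hooper's Rule applied to beliefs B_i = scale * (1 - p_i)."""
    remaining_doubt = 1.0
    for p in p_values:
        remaining_doubt *= 1.0 - scale * (1.0 - p)
    return 1.0 - remaining_doubt


# Hypothetical drugs: each has one comparison with p = 0.0, but the rest of
# the evidence is weak for drug_a and strong for drug_b.
drug_a = [0.0, 0.9, 0.9, 0.9]
drug_b = [0.0, 0.001, 0.001, 0.001]

# Unscaled: a single belief of 1.0 forces the joint belief to exactly 1.0,
# so the two drugs are tied regardless of their other evidence.
print(joint_belief_from_p_values(drug_a), joint_belief_from_p_values(drug_b))

# Scaled by 0.5: no single belief can saturate the product, and the
# remaining evidence separates the two drugs.
print(joint_belief_from_p_values(drug_a, scale=0.5),
      joint_belief_from_p_values(drug_b, scale=0.5))
```

The scale factor is an empirical device rather than part of Hooper’s Rule itself, which is why the text describes this formulation as artificial.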
For the SEA approach, a direct performance comparison (with the same set of 64 annotated muscarinic ligands used here) was not possible using the web-based SEA interface (sea.bkslab.org). However, the annotations underpinning SEA predictions are far more extensive than those used here, with over 1000 ligands having muscarinic target activity (including exact matches for 50% of the 64 used here, and close analogs for over 85%). We queried the 180 drugs for SEA predictions, which were reported for target predictions with E-values < 10.0 (recall that such E-values are generally thought to be significant when less than 10^−10.0). For each drug, we recorded the most extreme E-value against any muscarinic subtype. Those molecules with no predicted muscarinic targets were assigned an E-value of 100.0. The corresponding ROC area was 0.57 ± 0.05, significantly worse than the log-odds approach. As with the Belief Theory approach, interpretation of ROC areas is problematic due to tied values. With SEA, the tied values were at the low end of the ranking, since the majority of drugs received no muscarinic target predictions at all. Maximal enrichment for the SEA approach occurred within the non-tied value range at an E-value cutoff of 10^−1.2, allowing for a direct comparison. Maximal enrichment was 3.8-fold. This corresponded to an FP rate of 4% and a TP rate of 15%. Three direct comparisons between the log-odds approach and SEA are particularly meaningful: 1) the maximal enrichment, which was 20-fold for log-odds vs. 4-fold for SEA; 2) corresponding TP rates at the same 4% FP rate, 48% vs. 15%, respectively; and 3) corresponding FP rates at the same 15% TP rate, 0% vs. 4%.
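The enrichment figures quoted above can be related to TP and FP rates with a short sketch. It assumes that enrichment is the ratio of TP rate to FP rate at a cutoff. The text does not state the definition explicitly, but the SEA values (15% TP, 4% FP, 3.8-fold) are consistent with it, and small discrepancies elsewhere may reflect rounding of the reported rates. Function names and toy data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def tp_fp_rates(scores: Sequence[float], labels: Sequence[bool],
                cutoff: float) -> Tuple[float, float]:
    """TP and FP rates when everything scoring >= cutoff is called positive.
    Scores must be oriented so that higher means stronger evidence
    (for E-values one could use -log10(E))."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cutoff)
    return tp / n_pos, fp / n_neg


def enrichment(tp_rate: float, fp_rate: float) -> float:
    """Assumed definition: TP rate divided by FP rate (unbounded when FP = 0)."""
    if fp_rate == 0.0:
        return math.inf if tp_rate > 0.0 else math.nan
    return tp_rate / fp_rate


# Rates reported in the text for SEA at its best cutoff.
print(round(enrichment(0.15, 0.04), 2))

# Hypothetical toy ranking to show the two functions working together.
toy_scores = [9.0, 8.0, 7.0, 6.0, 5.0, 4.0]
toy_labels = [True, True, False, True, False, False]
toy_tp, toy_fp = tp_fp_rates(toy_scores, toy_labels, cutoff=7.0)
print(toy_tp, toy_fp, enrichment(toy_tp, toy_fp))
```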
We believe that the inherent degeneracy in Hooper’s Rule favoring high beliefs makes it inappropriate to use in a situation where belief values cannot be fully trusted. Given a single spurious annotation or a single similarity method yielding an inappropriately high confidence in a single molecular comparison, Belief Theory will produce incorrectly high belief in a prediction. In the case of SEA, we believe that the fundamental divergence of 2D similarity methods from the direct biophysical underpinnings of molecular activity limits the degree to which one can identify surprising off-target effects with high specificity.
Off-Target Prediction: Detection of Surprising Effects
The distinctions among different methods for data fusion, while clearly important, are not as critical as the distinctions among similarity methods that provide information to the data fusion computations themselves. Those similarity approaches whose scores are derived from directly relevant biophysical features (like surface shape and electrostatics) will yield different inferences than those that are less directly related to physical characteristics but which may be closely related to design ancestry. Two particularly telling examples of the distinction involve methadone and imipramine, compounds whose long history allows us to understand not only what the compounds do pharmacologically, but also why they were synthesized and tested to begin with.
Figure 10 illustrates the historic context of the synthesis and testing of methadone and imipramine. Methadone was synthesized during WWII as part of an effort to develop anti-cholinergics for use as nerve gas antidotes,21 due to the limited availability of the natural product atropine (nerve gas results in an accumulation of acetylcholine by inhibition of acetylcholinesterase, leading to spasm and death). On testing in animals, the surprising finding was that methadone (and demerol as well) produced the Straub-tail effect, indicative of opioid analgesic activity. In a similar serendipitous story,31 the compound G-22,355, which became known as imipramine, was selected for testing as an antipsychotic. Roland Kuhn, a psychiatrist at the Cantonal Mental Hospital of Münsterlingen, and Robert Domenjoz, a medicinal chemist at Geigy Pharmaceuticals, identified it as being structurally similar to chlorpromazine (Thorazine). Kuhn tested the compound with no success on psychotic patients, but prior to returning the supply, it was tested on a small number of depressive patients. The effects were sufficiently dramatic after just three patients to suggest the compound had unique properties and warranted further testing. Imipramine established a new class of drugs,32 which ultimately came to be understood as acting primarily through the serotonin reuptake transporter.
Figure 10. The design intention and surprising effects of some older drugs are known. Methadone and demerol were synthesized in an effort to make synthetically scalable anti-cholinergics as nerve gas antidotes by the Nazis in WWII. Their opioid effects were discovered ...
Methadone’s surprising on-target activity could have been predicted by the 3D log-odds approach based on the structures of morphinan-based opioids such as hydrocodone and codeine that had been identified well before methadone’s synthesis. These had very low p-values using the Surflex-Sim approach. For hydrocodone, codeine, morphine, and oxycodone, the 3D p-values were, respectively: 0.007, 0.048, 0.057, and 0.060. The 2D GSIM p-values were, respectively: 0.35, 0.35, 0.35, and 0.63. Clearly, 3D structural comparisons would be required in order to predict the opioid activity. The case of imipramine cannot be considered in this pseudo-prospective fashion, since its synthesis and testing led subsequently to the identification of its primary biological mechanism of action as well as to the line of chemical inquiry that produced selective agents such as citalopram. However, if we consider citalopram’s relationship to the eight serotonin-reuptake inhibitors from our set of 358 that pre-dated it (imipramine, clomipramine, trimipramine, amitriptyline, trazodone, paroxetine, fluvoxamine, and fluoxetine), we see that 3D similarity yielded p-values ≤ 0.05 for all eight, but 2D similarity yielded p-values ≤ 0.05 for only three.
Overall, for known secondary targets (most of which can be considered surprises to some degree), the 3D log-odds scores were, on average, 9.3 log units higher than the 2D scores. For known primary targets (where relatively fewer can be considered surprises), the difference was 4.0 in favor of 3D log-odds over 2D. Relationships that can be deduced through 3D molecular similarity include those that genuinely are surprising, not just those that would be obvious to someone knowledgeable of molecular pharmacology in a particular area.
Recent Off-Target Predictions
Given these anecdotes, there clearly can be differences between the types of inferences that can be drawn from 2D and 3D molecular similarity methods. The supporting information behind predictions such as these is important because the natural application of computational methods for predicting off-target effects is to identify those that someone intimately involved in a particular pharmacological area could not reasonably guess. What we have seen is that, in cases where we are able to understand both the reasoning behind molecular design and the serendipitous discoveries about activity, 2D methods uncover effects that the historical design reasoning anticipated, whereas 3D methods also find the surprises.
In 2006, we observed that methadone, based on 3D molecular similarity, co-segregated with muscarinic and histamine receptor antagonists, echoing its genesis more than sixty years earlier.2 We did not show biochemically that methadone was a muscarinic antagonist, but we pointed out that its side-effects included those associated with muscarinic antagonism: dry mouth, urinary retention, sweating, and reduced bowel motility. Subsequently, a biochemical assay showed that methadone has a Ki of 1.0 μM for the M3 receptor.3 Using Surflex-Sim 3D similarity, methadone could be properly associated with the mu opioid receptor. What our study lacked was the perspective that 2D similarity provides as to what should have been considered obvious in this case: the basic reasoning behind the synthesis of methadone was topological analogy to atropine and its analogs. Keiser et al.3 directly showed that the SEA 2D similarity approach could reveal the off-target muscarinic effect of methadone (but not the on-target opioid effect). The 2D SEA approach successfully detected the association between methadone and the muscarinic receptor because attempts to create antimuscarinics from 2D analogy to atropine eventually succeeded, resulting in compounds such as adiphenine, diphenidol, tolterodine, oxybutynin, dicyclomine, and many others with a clear 2D similarity to methadone.
We observed this same pattern involving scaffold ancestry in a recently published application of the SEA approach.4 In it, a set of predictions was correctly made for four drugs, where each of the predicted off-targets was unrelated by sequence or structure to the primary targets of the drugs. Figure 11 shows two of the drugs, their primary canonical targets, their predicted off-targets, and an example of a previously published33–38 high-affinity ligand of each off-target protein that shares a scaffold with each drug. In each case, the scaffold in question had been actively probed in medicinal chemistry exercises for the predicted off-target effect. The specificity of the highlighted scaffold for the off-target in question among CHEMBL39 annotations was over 40-fold for tetrabenazine, and the highlighted scaffold for the delavirdine prediction was over 1000-fold greater for H4 compared with any other target. Two other sets of predictions were made on drugs which target the NMDA receptor: ifenprodil and a simple analog thereof. The predicted and verified activities included reuptake transporters (5HTT and NET), opioid receptors (mu and kappa), and the D4 receptor. These activities shared the same pattern as those in Figure 11 with respect to the presence of previously published high-affinity analogs against the predicted targets (data not shown). The more general point relates to experimental molecular pharmacology. In 1991, ifenprodil was investigated for activity in addition to the NMDA and adrenergic ones already known,15 and potent activity was reported for the sigma and 5HT1a receptors. Established pharmacological crosstalk among ligands of sigma receptors and the opioid mu, delta, and kappa subtypes13, 40 anticipated weak opioid activities for ifenprodil and its analog. Crosstalk between ligands of the adrenergic and 5HT receptors and reuptake transporters14 anticipated these activities as well. Complex specificity patterns across multiple reuptake transporters and multiple receptor subtypes of sigma, NMDA, opioid, adrenergic, and serotonin have been probed for many years.
Figure 11. At left, two drugs are shown for which off-targets were identified through application of the SEA 2D similarity approach. Potencies for off-target effects were much weaker than for the on-target drug effects (shown below the drug names). The off-target ...
The presence of many published ligand/target relationships provides data for computational inferences that parallel pharmacological knowledge. For predictions to have high practical utility, they must identify off-target effects for drugs automatically, reliably, and with high specificity, and ideally they must identify effects that are truly surprising. Evaluating computational methods is challenging, since even nominally prospective predictions can be driven by the evolutionary history of drugs. One can “predict” an activity for a ligand based on the fact that someone thought of the activity in connection with the ligand’s scaffold before, causing analogs to be developed and probed for that activity. In such cases, tools that ferret out such information will be useful only to the extent that they are either more effective than someone knowledgeable in molecular pharmacology or that they are facile to apply automatically and have a low rate of false predictions. Developers of predictive methods should disclose the reasons why a method made a particular prediction. Usually this requires only the provision of typical molecular structures that underpinned an inference. Special care must be taken in the case of methods for predicting off-target effects, since the goal is to identify those effects that might otherwise derail a clinical candidate, and it is reasonable to believe that the more obvious potential effects would have been extensively investigated.
Relationship of Structural Novelty to Pharmacology
From the foregoing discussion and our previous work,1 it is clear that the drug design process relies in part on topological reasoning about the biological activity expected from a particular molecular structure. It is also clear that clinically relevant surprises occur with respect to both primary and secondary targets. To assess the degree to which chemical structural novelty was directly related to novelty in pharmacological effect, we computed the pairwise similarity of all 358 drugs and split the pairs into four groups: pairs with high 2D and high 3D similarity, low 2D but high 3D, low 3D but high 2D, and low 2D and low 3D. Figure 12 shows the proportions of molecule pairs within each group that had identical annotated targets (blue bars), overlapping primary targets but with some differences in overall target effects (orange), non-overlapping primary targets but some overlap among secondary targets (green), and completely non-overlapping targets (purple). It is important to understand that the annotation of target effects includes only those for which sufficient experimentation exists to localize an effect to a specific binding site on a particular protein assembly. So, as we saw above with the analysis of muscarinic side-effects, many unannotated drug-target relationships may well exist.
Figure 12 Drug pairs were segregated based on 2D and 3D similarity p-values into the 4 quadrants shown above (number of pairs per quadrant shown in parentheses). Conservative structural modifications are much more likely to yield highly similar pharmacology. Drug (more ...)
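The quadrant assignment and the four target-overlap categories described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used in the study: the p-value cutoff `P_CUTOFF`, the `Drug` class, and the function names are all assumptions, and the rule that "secondary overlap" means any shared target when the primary targets are disjoint is one plausible reading of the text.

```python
from dataclasses import dataclass

# Assumed cutoff: the text says pairs were segregated by similarity p-values
# but does not state the threshold, so this value is purely illustrative.
P_CUTOFF = 0.01


@dataclass(frozen=True)
class Drug:
    """Hypothetical container for a drug's annotated targets."""
    name: str
    primary: frozenset    # annotated primary targets
    secondary: frozenset  # annotated secondary targets


def quadrant(p_2d: float, p_3d: float, cutoff: float = P_CUTOFF) -> str:
    """Assign a drug pair to one of the four 2D/3D similarity quadrants.

    A small p-value is treated as high similarity.
    """
    level_2d = "high" if p_2d <= cutoff else "low"
    level_3d = "high" if p_3d <= cutoff else "low"
    return f"{level_2d} 2D / {level_3d} 3D"


def overlap_category(a: Drug, b: Drug) -> str:
    """Classify the pharmacological congruence of a drug pair."""
    if a.primary == b.primary and a.secondary == b.secondary:
        return "identical targets"
    if a.primary & b.primary:
        return "overlapping primary targets"
    # Assumption: any shared target counts once the primaries are disjoint.
    if (a.primary | a.secondary) & (b.primary | b.secondary):
        return "secondary overlap only"
    return "no shared targets"


# Placeholder drugs and target labels, not real annotations.
drug_a = Drug("drug_a", frozenset({"T1"}), frozenset({"T2", "T3"}))
drug_b = Drug("drug_b", frozenset({"T1"}), frozenset({"T2"}))
print(quadrant(0.001, 0.2), "->", overlap_category(drug_a, drug_b))
```

Tallying `overlap_category` over all pairs within each `quadrant` would give proportions of the kind plotted in the figure, subject to the assumptions noted above.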
In the case of high 3D and high 2D similarity (upper right), nearly 80% of drug pairs show some degree of target overlap, with nearly 40% having identical targets and nearly 70% sharing primary targets. With the same level of 3D similarity but with low 2D similarity (upper left), slightly less than half of the drug pairs share targets, and just 10% have identical targets. The converse case (high 2D, low 3D, bottom right) produces somewhat similar proportions but with 70% having no common targets. As expected, molecules sharing no molecular similarity shared no targets nearly 95% of the time. Figure 13 shows examples from each quadrant. The case of imipramine and its fast follow-on compound amitriptyline fell into the identical target set; indeed they have very little to differentiate them in terms of pharmacology even beyond specific targets.41 However, the structural creativity shown by citalopram relative to imipramine (high 3D, low 2D) produced much more specificity with respect to the serotonin reuptake transporter, and citalopram along with other SSRIs came to dominate anti-depressant therapy. A typical case for low 3D but high 2D similarity is the pair bupropion and ketorolac, which share no targets. The overwhelmingly common case for low 2D and low 3D similarity is exemplified by albuterol and imipramine, again sharing no common targets. About 2% of the time, drugs with some overlapping targets share no similarity at all. The case of sildenafil and tadalafil is a particularly striking example: both bind PDE5 within the same volume but exhibit no molecular similarity, either by eye or through computational means.1 Note, however, that while the annotated targets were identical for the pair, their detailed pharmacology is significantly different, particularly with respect to half-life.
Figure 13 Imipramine and amitriptyline differ by only 1 atom, share identical on- and off-targets, and were approved as antidepressants in 1959 and 1961, respectively. In contrast, citalopram has significantly lower 2D similarity to imipramine, has fewer off-targets (more ...)
These findings are not surprising in a qualitative sense: nearly identical molecules should share very similar biological effects more frequently than molecules that begin to differ. What we believe is striking is the degree of deviation in effects. There is a four-fold difference in the a priori likelihood that two drugs will share identical pharmacological targets when shifting from high 2D and high 3D similarity to a case that shares only high 3D congruence. Consider the case of designing a new drug with knowledge of the structures of existing drugs within a therapeutic category. In the modern research environment, it is likely that one will be able to guarantee that the desired target is among those that will exhibit pharmacological effects, but one cannot know that these effects will be the dominating ones. For this analysis, we consider only the molecule pairs from Figure 12 that share some targets. By designing a me-too analog (high 2D and 3D similarity to existing drugs), one has about a 47% chance of showing the same target profile as the incumbent compound versus showing either a difference in secondary targets or overlapping targets with different primary effects. By designing a structurally novel compound (high 3D but low 2D), one has a 23% chance of showing the same target profile. In the me-too case, chances are roughly even (53%:47%) in terms of seeing novelty at the level of target specificity, but in the case of a structurally novel drug scaffold, the chances are about 3:1 in favor of novelty (77%:23%). Two things are worth noting. First, with modern 3D molecular similarity and 3D QSAR methods, design of such compounds is tractable. Second, the development risks associated with novelty are almost certainly higher, since one cannot know a priori with full confidence what the precise biological effect differences might be, only that one is more likely to encounter them.
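The odds quoted above follow directly from the two stated probabilities of matching the incumbent's target profile (47% and 23%). The short sketch below reproduces that arithmetic; the function name `novelty_odds` is an illustrative choice and not part of the original analysis.

```python
def novelty_odds(p_same_profile: float) -> float:
    """Odds of a differing target profile versus an identical one.

    p_same_profile is the probability that a new compound shows the same
    target profile as the incumbent, among pairs that share some targets.
    """
    if not 0.0 < p_same_profile < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return (1.0 - p_same_profile) / p_same_profile


# Probabilities as stated in the text.
me_too = novelty_odds(0.47)  # high 2D and high 3D similarity
novel = novelty_odds(0.23)   # high 3D but low 2D similarity

print(f"me-too analog:  {me_too:.2f} : 1 in favor of novelty")
print(f"novel scaffold: {novel:.2f} : 1 in favor of novelty")
```

The me-too case comes out close to even odds (53:47), while the structurally novel case comes out somewhat above 3:1 (77:23), consistent with the rounded figures given in the text.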