The 656 drugs were computationally screened for their likelihood to bind to 73 targets (Supplementary Table S2) using SEA [25–27]. The targets belong to the Novartis in vitro safety panels, based on their association with ADRs [22,28]. Here we insisted that they also be described in ChEMBL [29], enabling correspondence with SEA predictions (Supplementary Table S2). ChEMBL annotates over 285,000 ligands modulating over 1,500 different human targets with affinities better than 30 μM. SEA calculated the similarity of each drug versus each set of ligands for the 73 targets, comparing the overall set similarity to a model of the similarity expected at random. For instance, the sodium channel blocker aprindine loosely resembled the set of histamine H1 ligands; though no single H1 ligand was strongly similar to the drug, the overall similarity of the set was much above that expected at random, leading to a highly significant SEA expectation value (E-value) of 5×10⁻²⁶ between aprindine and H1 receptor ligands. Only 1,644 of the over 47,000 possible drug-target pairs had significant E-values. Of these, 403 were already known in ChEMBL, and so were trivially confirmed; we do not consider these further. Of the remaining 1,241 predictions, 348 (28%) were unknown to ChEMBL, but could be found in proprietary ligand-target databases that were unavailable to SEA (Methods). The remaining 893 predictions represented previously unexplored drug-target associations.
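The aprindine example turns on a set-level effect: many weak pairwise similarities can add up to a score well above chance. The sketch below illustrates only that idea and is not the SEA statistical model, which is described in the cited references. The fingerprints are hand-made sets of "on" bits, the background scores are invented placeholders, and all function names are hypothetical.

```python
from statistics import mean, stdev


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def raw_set_score(drug_fp, ligand_fps):
    """Sum of the drug's Tc to every ligand in one target's ligand set."""
    return sum(tanimoto(drug_fp, fp) for fp in ligand_fps)


def z_score(observed, background_scores):
    """Distance of the observed score from the background mean, in sample standard deviations."""
    return (observed - mean(background_scores)) / stdev(background_scores)


# Toy data (assumption): a drug with 10 features and four ligands that each
# share 3 features with it, so every pairwise Tc is only 3/17.
drug = frozenset(range(10))
ligands = [
    frozenset({0, 1, 2}) | frozenset(range(100 + 10 * i, 107 + 10 * i))
    for i in range(4)
]

best_single = max(tanimoto(drug, fp) for fp in ligands)
set_score = raw_set_score(drug, ligands)

# Invented placeholder scores standing in for the same drug scored against
# random ligand sets of equal size; a real background would be sampled.
background = [0.10, 0.20, 0.30]

print(f"best single-ligand Tc: {best_single:.3f}")
print(f"set-level raw score:   {set_score:.3f}")
print(f"z-score vs background: {z_score(set_score, background):.2f}")
```

In this toy case no single ligand is a close analog of the drug, yet the summed score sits several standard deviations above the placeholder background.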
New drug-off-target predictions confirmed by in vitro experiment. Representative, confirmed predictions are shown.
Of these 893 predictions, 694 were tested at Novartis. For 478, activity was less than 25% at 30 μM; these were considered disproved. For another 65 predictions, activity was between 25 and 50% at 30 μM; these were considered ambiguous. Finally, for 151 of the new drug-target predictions, IC50 values of less (better) than 30 μM were measured in concentration-response curves (Supplementary Figure S1). In 125 cases, the drugs had an IC50 value better than 10 μM, and in 48 cases activities were sub-micromolar (Supplementary Table S3, Supplementary Figure S1). In summary, of the 1,042 predictions that were tested (694 by assay, 348 by databases), 48% were confirmed, either in proprietary databases unknown to the method and to those undertaking the SEA calculation or in Novartis assays in full concentration response, and just under 46% were disproved.
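The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the variable names are ours.

```python
# Prediction funnel, using the counts given in the text.
drugs, targets = 656, 73
possible_pairs = drugs * targets                      # "over 47,000" pairs

significant = 1644
known_in_chembl = 403
remaining = significant - known_in_chembl             # 1,241 predictions
in_proprietary_dbs = 348
unexplored = remaining - in_proprietary_dbs           # 893 predictions

# Outcomes of the predictions assayed at Novartis.
assay_outcomes = {"disproved": 478, "ambiguous": 65, "confirmed": 151}
assayed = sum(assay_outcomes.values())                # 694 predictions

tested = assayed + in_proprietary_dbs                 # 1,042 predictions
confirmed = assay_outcomes["confirmed"] + in_proprietary_dbs

confirmed_pct = 100 * confirmed / tested
disproved_pct = 100 * assay_outcomes["disproved"] / tested

print(f"possible pairs: {possible_pairs}")
print(f"remaining: {remaining}, unexplored: {unexplored}")
print(f"tested: {tested}, confirmed: {confirmed_pct:.1f}%, disproved: {disproved_pct:.1f}%")
```

The confirmed fraction is 499 of 1,042 (47.9%), and the disproved fraction is 478 of 1,042 (45.9%), matching the rounded values in the text.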
Predicting off-targets, and their novelty
In assessing these results, one would like to compare the true positives to both the false positives and the false negatives. Whereas this work offers guidance on the first comparison, we can only address false negatives for a few compounds (Supplementary Results). Among these was astemizole, which had affinities ranging from 0.1 to 9 μM on the 5-HT2A receptors, as measured in other projects at Novartis. These targets were missed owing to a charge post-filter, separate from SEA itself, which excluded compounds with net charge dissimilar from the reference ligands [30]. Astemizole was improperly assigned [31] a charge of +2, wrongly differentiating it from the known ligands; the SEA E-values linking astemizole to these targets were themselves between 10⁻²⁵. Other failures could be attributed to SEA itself. For instance, promazine bound to the histamine H1 receptors with low to mid-nanomolar affinities, but the SEA E-values, at 10⁻⁴, did not meet our significance cutoff. This work was undertaken with ChEMBL2 as a source of ligand-target association; had we used the more recent ChEMBL10, H1 would have been predicted with an E-value of 10⁻⁹, and had we used ChEMBL12 and a newer version of SEA, both targets would have been predicted. Clearly, with its reliance on topology and on inference from known ligand-target associations, SEA will have false negatives.
A key question is whether the new predictions were in any way surprising. One way to evaluate this is to compare the similarity of drugs predicted for new targets to the closest previously known ligand for that target. We used Tanimoto coefficients (Tc), which compare the groups in common between two molecules, here represented by ECFP_4 fingerprints. Tc values between nearest molecules were small, often less than 0.4 [32]; visual inspection of these pairs confirms the dissimilarity suggested by the low Tc values. More systematically, SEA may be compared to a method that predicts targets based only on one nearest neighbor (a 1NN model). For close analogs (Tc values > 0.7), the fraction of true positives was comparable between 1NN and SEA. But across most similarity thresholds, SEA substantially outperformed 1NN, by nearly two-fold in the low similarity range. For example, for the Rho kinase inhibitor fasudil, SEA predicted only the adrenergic α2A receptor, with an E-value of 1.1×10⁻⁷, which was experimentally confirmed (IC50 = 4 μM). This occurred despite the low similarity of the closest known α2 ligand, which had a Tc value of 0.37 to fasudil. By contrast, at this similarity threshold the 1NN model predicted nine targets, only three of which were confirmed (Supplementary Table S4). For chlorotrianisene, two of the three targets predicted by SEA were confirmed, whereas at its 0.31 Tc for cyclooxygenase-1 (COX-1) the 1NN model predicted ten targets, only two of which were confirmed.
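To make the 1NN comparison concrete, the sketch below computes Tc on fingerprints represented as sets of "on" bits and predicts every target whose single closest known ligand meets a similarity threshold. It assumes toy fingerprints rather than real ECFP_4 output, which would come from a cheminformatics toolkit, and the function and target names are hypothetical.

```python
def tanimoto(a, b):
    """Tc = |A ∩ B| / |A ∪ B| for fingerprints given as sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbor_tc(drug_fp, ligand_fps):
    """Tc between the drug and its single most similar ligand for one target."""
    return max(tanimoto(drug_fp, fp) for fp in ligand_fps)


def predict_targets_1nn(drug_fp, target_ligands, threshold):
    """Predict each target whose closest known ligand has Tc >= threshold."""
    return sorted(
        target
        for target, ligand_fps in target_ligands.items()
        if nearest_neighbor_tc(drug_fp, ligand_fps) >= threshold
    )


# Toy fingerprints (assumption): not derived from real molecules.
drug = {1, 2, 3, 4}
target_ligands = {
    "target_A": [{1, 2, 3, 9}],   # Tc to drug = 3/5
    "target_B": [{1, 7, 8, 9}],   # Tc to drug = 1/7
}

print(predict_targets_1nn(drug, target_ligands, threshold=0.37))
print(predict_targets_1nn(drug, target_ligands, threshold=0.10))
```

Lowering the threshold admits more targets on the strength of one weakly similar ligand each, which is the regime where the text reports that 1NN produced many unconfirmed predictions.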
We also investigated how often the new off-target would have been obvious based on sequence similarity of the targets [25,26,33]. We calculated the BLAST sequence similarity of predicted targets to any known target of a drug (Supplementary Table S3). Of the 151 new off-target predictions, 39 (26%) had BLAST E-values greater (worse) than 10⁻⁵, suggesting the previously known targets shared no sequence similarity with the new off-targets (Supplementary Table S3). For example, the anesthetic dyclonine was shown to bind the histamine H2 receptor (HRH2), while the closest known target was the Nav1.8 channel (SCN10A), which has no significant sequence similarity (BLAST E-value > 1) and is functionally unrelated to H2. Similarly, the anti-nausea drug alosetron antagonized the 5-HT2B receptor with an IC50 of 18 nM, though 5-HT2B has no sequence similarity to the ion channel targets of this drug. Chlorotrianisene potently inhibits the enzyme COX-1, which is unrelated by sequence to the primary nuclear hormone receptor of this drug, the estrogen receptor.
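The sequence-novelty criterion above amounts to a threshold test on the best BLAST E-value between a new off-target and any previously known target of the drug. The sketch below assumes the E-values have already been computed elsewhere; it does not run BLAST. The function name is hypothetical, and the example E-values are placeholders chosen only to fall on either side of the 10⁻⁵ cutoff.

```python
def is_sequence_unrelated(evalues_to_known_targets, cutoff=1e-5):
    """True if even the best (smallest) BLAST E-value is worse than the cutoff.

    Expects a non-empty list of E-values, one per previously known target.
    """
    if not evalues_to_known_targets:
        raise ValueError("need at least one known target to compare against")
    return min(evalues_to_known_targets) > cutoff


# Placeholder E-values (assumption), not taken from the paper.
print(is_sequence_unrelated([5.0]))           # closest known target has E-value > 1
print(is_sequence_unrelated([3.0, 1e-40]))    # one known target is a clear homolog

# Fraction of new off-targets reported as unrelated by sequence.
unrelated, confirmed_new = 39, 151
print(f"{100 * unrelated / confirmed_new:.0f}% of new off-targets")
```

Taking the minimum over all known targets reflects the comparison to "any known target of a drug": a single homologous known target is enough to make the new off-target unsurprising by sequence.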