The term “virtual screening” is fairly new. A SciFinder search suggests that the phrase first appeared in the 1990s,(10) but the idea has been around for a long time. The use of 3D similarity (sometimes based on shape alone, sometimes on atom typing, i.e., assignment of chemical character to an atom or group of atoms, or on the fields emanating from molecules) as a basis for virtual screening is integral to a computational chemist’s tool chest. As Clark suggested,(10) companies that were pioneers in this area have many success stories. Indeed, Merck Research Laboratories (MRL) has been developing virtual screening methods for decades(11−13) and has several published examples where 3D similarity was applied in virtual screening and projects were thereby advanced.(14−16) Most of these early efforts helped bridge the gap from a peptide-like lead to a drug-like lead.
MRL’s first published application of virtual screening was in the non-peptide fibrinogen receptor antagonist program.(14) Starting from the endogenous Arg-Gly-Asp motif, a virtual screen of the corporate database identified many non-peptides that mimicked this group. Some of these were tested, and one turned out to be a 27 μM (IC50) lead. In another example, a query was constructed from key amino acids of somatotropin release-inhibiting factor (SRIF) plus additional “sphere points” that defined salient electrostatic and volume regions. A virtual screen of Merck’s flexibase(17) of over 1 million compounds using this query identified compound 1 (Figure 2), which was found to have measurable activity.(16,18) This compound was ranked 41 out of >1 million compounds by SQW (SemiQuantitative reWrite).(13) SQW is the second generation of a proprietary 3D similarity/superposition program written in-house at MRL. The original program, SQ (for SemiQuantitative), was written in the late 1990s. SQ/SQW operates on rigid molecules represented as heavy atoms that have been classified into seven physicochemical types (cation, anion, etc.). First, a clique matching algorithm is used to generate many orientations of a candidate molecule onto a target molecule. Second, a Nelder−Mead simplex algorithm adjusts the orientation of the candidate molecule to optimize the score. An analogue of compound 1 was the first potent and selective small molecule somatostatin receptor 2 (SST2) agonist reported at the time. The superposition of compound 1 (green) with the probe ligand (white) is shown in Figure 2.
Figure 2 Compound 1 (left-hand image) and superposition of compound 1 (green) with probe ligand (white) using SQW.(13)
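The first stage described above, clique matching, can be illustrated with a minimal Python sketch. It is a generic clique-detection formulation and not the SQ/SQW implementation: the atom-type labels, the distance tolerance `tol`, the toy coordinates, and the function names are all assumptions made for illustration. Atoms of the same type in the two molecules are paired, pairs whose intramolecular distances agree are linked, and each maximal clique is a mutually consistent set of atom correspondences from which a starting orientation could be derived.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def correspondence_graph(mol_a, mol_b, tol=0.5):
    """Build the correspondence graph of two typed point sets.

    Each molecule is a list of (atom_type, (x, y, z)) tuples.
    A node (i, j) pairs atom i of A with atom j of B when their types match.
    Two nodes are linked when the A-A and B-B distances agree within `tol`
    (in angstroms; the tolerance is a hypothetical choice).
    """
    nodes = [(i, j)
             for i, (type_a, _) in enumerate(mol_a)
             for j, (type_b, _) in enumerate(mol_b)
             if type_a == type_b]
    adj = {node: set() for node in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # each atom may take part in only one correspondence
        d_a = dist(mol_a[i][1], mol_a[k][1])
        d_b = dist(mol_b[j][1], mol_b[l][1])
        if abs(d_a - d_b) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the Bron-Kerbosch algorithm (pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield set(r)
            return
        pivot = max(p | x, key=lambda n: len(adj[n] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


# Toy example: molecule B is molecule A translated by 10 angstroms along x,
# listed in a different order, plus one unrelated donor atom.
mol_a = [("donor", (0.0, 0.0, 0.0)),
         ("acceptor", (3.0, 0.0, 0.0)),
         ("cation", (0.0, 4.0, 0.0))]
mol_b = [("cation", (10.0, 4.0, 0.0)),
         ("donor", (10.0, 0.0, 0.0)),
         ("acceptor", (13.0, 0.0, 0.0)),
         ("donor", (20.0, 20.0, 20.0))]

best = max(maximal_cliques(correspondence_graph(mol_a, mol_b)), key=len)
print(sorted(best))  # [(0, 1), (1, 2), (2, 0)]
```

In a full pipeline, each clique would seed a rigid-body superposition that is then refined by the simplex step; that refinement is not shown here.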
MRL was not the only group to develop shape algorithms primarily for lead identification, but we did develop one of the earliest 3D superposition methods, called SEAL,(11) which took into account charges and volumes. Numerous independent software vendors (ISVs) and academic groups have subsequently published in this area, and we refer you to the following references for an overview.(10,19) In 2004 we undertook a large-scale comparison of purchasable shape-based methods for use in virtual screening.(10,20) In retrospect, we found, as others in the literature have suggested, that many pitfalls and “gotchas” in method comparison make it hard to arrive at robust conclusions. We refer interested readers to the special Journal of Computer-Aided Molecular Design issue “Evaluation of Computational Methods: Insights, Philosophies and Recommendations”(21) for many suggestions on how to properly conduct an evaluation.
The most important conclusion from our study is that, within the limits of retrospective screening, knowing the structure of an active ligand is better than knowing the atomic structure of its receptor. This holds if what one cares about is how many actives are retrieved and one does not, for instance, need to find a plausible docking mode. We are not the first to say this; our conclusions agree with earlier findings on this topic.(22) It seems to be generally true regardless of which database one screens(20,23) or which ligand or protein structure one uses for the virtual screen.(24) As time went on we realized, based on valid critiques of our study, that we needed to change the way we compared methods. The most important changes concerned the set of targets we used: (1) we sought to have enough targets to minimize the uncertainties due to the composition of the target set, and (2) we chose only those targets for which the number of actives was fairly large. Simulation studies at MRL by Truchon and Bayly(25) also reinforced the need for more actives. We therefore developed a set of 47 small molecule targets such that the number of diverse actives in the MDDR was >20. The majority of the targets have cocrystallized ligands in the Protein Data Bank (PDB),(26) but for some the 3D geometries were derived using CORINA.
Given this new set, we carried out a number of studies(27) comparing various 2D and 3D similarity methods as virtual screening engines. Since it is apropos for this venue, we will focus on ROCS (rapid overlay of chemical structures) and SQW, which are both 3D similarity methods. In our hands, 3D similarity methods seem to embody the best combination of finding the most actives in virtual screening and having those actives be diverse.(20) ROCS is considered a state-of-the-art 3D similarity method. ROCS searches for optimal shape overlays, as illustrated in Figure . It uses atom-centered Gaussians to accurately represent volumes because such functions are much smoother than discrete “inside/outside” representations, e.g., molecules as fused spheres. As a consequence, the number of overlap maxima is much reduced, enabling approximations to the global maximum to be found quickly. It also includes the facility to match chemical types by representing atoms or groups of atoms as Gaussians of a given “type” or “color”, for instance, rings, hydrogen bond donors, and acceptors. ROCS shares its core concepts with SQW (atoms typed as hydrogen-bond donor, acceptor, etc.; atoms represented as Gaussian functions, i.e., a soft, extended function, rather than a one or zero function corresponding to a hard sphere), but the detailed implementation differs. One interesting addition in ROCS is the inclusion of a “ring” term, where molecular superpositions get extra credit if ring centroids are superimposed, regardless of the type of ring. We find that while SQW and ROCS do not perform the same on any given target, their average performance over the 47 targets is surprisingly similar (Figure 3).
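The Gaussian representation of volume described above can be made concrete with a short Python sketch. This is a minimal illustration of the general idea, not the ROCS implementation: the amplitude `P`, the 1.7 Å radii, the rule that ties each Gaussian's exponent to a hard-sphere radius, and the restriction to pairwise (first-order) overlap terms are all assumptions made here. The only result relied upon is the analytic overlap integral of two spherical Gaussians, which follows from the Gaussian product theorem.

```python
from math import dist, exp, pi  # Python 3.8+

P = 2.7  # Gaussian amplitude; an illustrative value, not a ROCS setting


def alpha(radius):
    """Exponent for which the Gaussian integrates to the hard-sphere volume."""
    return pi * (3.0 * P / (4.0 * pi)) ** (2.0 / 3.0) / radius ** 2


def pair_overlap(center_1, radius_1, center_2, radius_2):
    """Analytic overlap integral of two atom-centered spherical Gaussians."""
    a1, a2 = alpha(radius_1), alpha(radius_2)
    d2 = dist(center_1, center_2) ** 2
    return P * P * exp(-a1 * a2 * d2 / (a1 + a2)) * (pi / (a1 + a2)) ** 1.5


def overlap(mol_a, mol_b):
    """First-order overlap: sum over atom pairs (higher-order terms ignored)."""
    return sum(pair_overlap(ca, ra, cb, rb)
               for ca, ra in mol_a
               for cb, rb in mol_b)


def shape_tanimoto(mol_a, mol_b):
    """Tanimoto similarity built from self- and cross-overlaps."""
    o_ab = overlap(mol_a, mol_b)
    return o_ab / (overlap(mol_a, mol_a) + overlap(mol_b, mol_b) - o_ab)


def shifted(mol, dx):
    """Return a copy of `mol` translated by dx angstroms along x."""
    return [((x + dx, y, z), r) for (x, y, z), r in mol]


# Two-atom toy molecule: each entry is ((x, y, z), radius in angstroms).
mol = [((0.0, 0.0, 0.0), 1.7), ((1.5, 0.0, 0.0), 1.7)]

for dx in (0.0, 1.0, 3.0):
    print(dx, round(shape_tanimoto(mol, shifted(mol, dx)), 3))
```

Because the overlap is a smooth function of the displacement, the similarity falls off gradually as the copy is moved away, which is the property that makes the overlap surface easier to optimize than one built from hard spheres.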
Figure 3 Enrichment factors (EF) at 1% for ROCS and SQW over 47 unique targets. Enrichment here is the ratio of the number of actives found at a given percent of the database to the expected number of actives at that percent.
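The enrichment factor defined in this caption can be computed from a ranked screening list as in the following Python sketch. The function name, the boolean-list input, and the rounding rule for the size of the top slice are assumptions made for illustration; the expected count is taken to be what a random ordering would place in that slice.

```python
def enrichment_factor(ranked_is_active, fraction=0.01):
    """Enrichment factor at the top `fraction` of a ranked database.

    `ranked_is_active` lists, best score first, whether each compound is active.
    EF = (actives found in the top slice) / (actives expected in that slice
    if actives were spread uniformly through the ranking).
    """
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list containing at least one active")
    n_top = max(1, round(n_total * fraction))  # size of the top slice
    found = sum(ranked_is_active[:n_top])
    # found / (n_actives * n_top / n_total), arranged to avoid rounding error
    return found * n_total / (n_actives * n_top)


# Toy ranking: 1000 compounds, 10 actives, 3 of them in the top 10 (top 1%).
ranking = [True] * 3 + [False] * 7 + [True] * 7 + [False] * 983
print(enrichment_factor(ranking, 0.01))  # 30.0
```

With 10 actives in 1000 compounds, the maximum possible EF at 1% is 100 (all 10 actives in the top 10), so EF values are best compared across targets with similar active fractions.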
Although 3D similarity methods perform very well, they are by no means a panacea. We often state in regard to virtual screening methods that “everything works on something; nothing works on everything”.(28) If one is in the lead-finding stage of a program, 3D similarity may be the most straightforward method to obtain diverse leads.(29−31) However, if the protein target can adopt multiple conformations (because of inherent flexibility), one may be less successful in retrieving a novel, active ligand using 3D similarity methods. This is the case for β-secretase (BACE), which is implicated in Alzheimer’s disease. For instance, if one used the hydroxyethylamine ligand from PDB code 2B8L (which interacts directly with both catalytic aspartic acids, including Asp32, and occupies regions P1, P2, P3, and P1′) as a probe for virtual screening, it would be nearly impossible to identify in the top rankings a spiropiperidine (PDB code 3FKT), which interacts with the catalytic aspartic acids via a water molecule and does not occupy regions P2 or P3 at all. Furthermore, the best ROCS superposition of these two ligands (Figure 4a) does not even qualitatively overlay the ligands in the way one observes crystallographically (Figure 4b). Why is this? 3D similarity methods do not take into account the influence of the receptor and, by design, maximize volume overlap between two molecules. This contrasts with how the BACE ligands are bound in their cognate sites (Figure 4b), where they occupy different spatial regions of the enzyme. This is a failure of the assumptions behind 3D similarity methods, and protein flexibility makes the issue more severe. That said, one should not abandon 3D shape-based similarity in such cases; rather, such a virtual screen should be complemented with additional computational methods such as docking.
Figure 4 (a) Highest ranked ROCS superposition by Combo score (sum of the ST and raw color overlap) of ligands from PDB codes 3FKT (white) and 2B8L (green). (b) Aligned enzyme structures (3FKT and 2B8L) with cognate ligands. The key piperidine−water mediated ...
In contrast to the failure exemplified by BACE, there are numerous examples where ligand shape, coupled with electrostatics to capture the polarity of the atoms, has identified novel, noncongeneric hits via virtual screening.(14−16,29−31) Are the retrieved hits found for the “right reason”? For instance, if overlaid in the active site of the protein or enzyme, would the correct chemical features align, in contrast to the BACE example depicted in Figure 4a? In the case where ROCS was used to find inhibitors of ZipA-FtsZ,(31) the authors subsequently crystallized one of the hits in ZipA and determined that ROCS had correctly predicted the binding mode. Induced fit, manifested as enzyme flexibility, was not integral to this binding motif, and the result demonstrated that the hit retrieved in that study was found for the “right reason”: when overlaid in the active site of the protein, the correct chemical features align.
The emergence of public data sets such as DUD allows us to ask whether there are trends in the proficiency of shape tools; for instance, do they work worse with active compounds that are flexible, or better for active sites that are small? Preliminary evidence is that shape is fairly robust with respect to operational parameters. However, some trends can be observed. Figure 5 shows the performance of ROCS, measured by the area under the receiver operating characteristic (ROC) curve (AUC), over the cocrystal structures in DUD. Each symbol represents a query molecule in DUD, with its own particular set of decoys. We specifically asked how (A) heavy atom count, (B) the ratio of ROCS’ so-called “color” atoms to heavy atoms, and (C) intrinsic ligand affinity affect the ability of this shape-based method to retrieve active compounds. A number of other properties, such as charge, ligand flexibility, number of color atoms, and a measure of site polarity, were also tested and found to have little or no correlation with performance. The decreasing performance with increasing number of non-hydrogen atoms in the query seems reasonable; i.e., perhaps the conformational space of larger molecules is harder to search. On the other hand, there seems to be no correlation of AUC with the number of rotatable bonds in the query. Investigation of the ratio of color atoms to heavy atom count in the query suggests that query molecules without enough color points tend to perform more poorly, which is consistent with observations that ROCS “shape” alone tends to be poorer than ROCS with color. Finally, there does not seem to be any significant relationship between the potency of the query on its particular target and the ability of ROCS to select more actives with the same activity.
The results in Figure 5 are by no means final, as we have not formulated a data set in which each set of probe ligands is of similar ligand affinity, molecular weight, etc.; however, a study focused on understanding those physicochemical properties is warranted. We have merely scratched the surface in showing that some of these features may affect performance.
Figure 5 (a−c) Performance of ROCS over the DUD data set. DUD consists of a set of 40 protein−ligand systems, 38 of which have crystallographic coordinates. The data presented here are the results from taking the crystallographic ligand as the ROCS ...
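The AUC used as the performance measure here can be computed directly from the scores of actives and decoys, as in the following Python sketch. It uses the standard equivalence between the area under the ROC curve and the probability that a randomly chosen active outscores a randomly chosen decoy; the function name and the convention that higher scores are better are assumptions made for illustration.

```python
def roc_auc(scores, is_active):
    """Area under the ROC curve for one query and its decoy set.

    Computed as the fraction of (active, decoy) pairs in which the active
    has the higher score; tied pairs count as half. Higher score = better.
    """
    actives = [s for s, flag in zip(scores, is_active) if flag]
    decoys = [s for s, flag in zip(scores, is_active) if not flag]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = sum((a > d) + 0.5 * (a == d) for a in actives for d in decoys)
    return wins / (len(actives) * len(decoys))


# Toy example: two actives and two decoys; one decoy outranks one active.
print(roc_auc([0.9, 0.8, 0.7, 0.6], [True, False, True, False]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to every active being ranked above every decoy. The pairwise loop is quadratic in the list size, which is acceptable for a sketch but would normally be replaced by a rank-based calculation for large decoy sets.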
In conclusion, the application of 3D virtual screening methods has resulted in the identification of many active compounds in drug discovery programs. We and others have repeatedly demonstrated that 3D similarity-based virtual screening maintains enrichment in actives and increases the diversity in the compounds discovered. As a consequence, shape is integral to any drug discovery program. Finally, shape is a necessary requirement to facilitate the identification of both active and novel ligands in drug discovery, but it is not ultimately sufficient in the entire life cycle of a program where other factors, e.g., pharmacokinetic properties, may become more important and where shape has yet to play a significant role. In addition, there are clearly situations where shape is less useful. For instance, flexibility of the target enzyme or receptor can reduce the effectiveness of a 3D shape-based virtual screen, i.e., if there is less shape “coherency” between active molecules. In general practice, however, knowledge of merely one active compound can often outweigh the presumed advantages of an existing protein structure. Applications to molecules of a more fragment-like nature, where shape discrimination is subtler, are considered later in this Perspective in the context of library design.