The binding of a small-molecule drug to its target protein in a cell is much more complex than a single docking calculation. For example, an ATP-competitive kinase drug has hundreds of ATP-binding sites to choose from, owing to the large size of the kinome. Cancer drugs such as sunitinib are now known to potently inhibit many more kinase targets than previously expected. In addition, non-kinase targets of kinase drugs have also been found: NQO2 was the first non-kinase target discovered for imatinib, and several cytotoxic LIM kinase inhibitors were found to actually be inhibiting tubulin. Such studies imply that the target search space for any inhibitor should be the entire druggable proteome.
Our strategy was to find novel targets of existing drugs by computationally screening the druggable proteome. For this purpose, we chose molecular docking for its speed, low cost, and detailed three-dimensional simulation. Moreover, because docking is virtual, it can evaluate any protein with a solved structure, without the need to tailor enzymatic assays or obtain drugs in solution. However, docking is known to have a high false-positive rate, owing to limitations such as incomplete binding-pocket prediction, inadequate sampling of ligand conformations, inaccurate scoring functions, lack of protein flexibility, and the absence of water and cofactor molecules during the simulation. As evidenced in this study, only 31% of the 3570 known interactions docked with a good score. One review states that 10–50% of a set of diverse compounds can be expected to dock correctly for a given target. We are well within this range, and we believe our method performs quite well considering the large variety of protein targets involved and the automated nature of the pipeline. The remaining 69% of known interactions, however, were not predicted, owing to these docking limitations.
Our method attempted to address these limitations. First, we manually included binding pockets that were present in PDB structure complexes but not predicted by the binding-pocket search. Second, we docked each interaction 10 times to better sample ligand conformations. Third, we applied consensus score and rank criteria to further narrow down top-scoring docking hits. Fourth, we used all available structures of a protein (rather than choosing one representative structure), allowing a simple view of protein flexibility. We did not incorporate water and cofactor molecules in our docking simulations because of the computational complexity involved. However, by selecting proteins for which at least one known drug docked and scored well, we selected proteins for which the limitations of the docking software were not critical for a good prediction. In short, assuming the docked conformation of the known ligand was correct, we used only proteins for which the binding pocket was genuine, the scoring functions were adequate, the protein was in a conformation amenable to drug inhibition, and the lack of water or cofactor molecules did not drastically affect the prediction.
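The fourth step above, using all available structures of a protein, amounts to taking the best score over every replicate run of every structure. A minimal sketch of that aggregation follows; the function name and data layout are illustrative assumptions, not the actual pipeline code:

```python
# Illustrative sketch: aggregate repeated docking runs across every available
# crystal structure of a protein, keeping the single best (most negative)
# score. The data layout and function name are assumptions, not the pipeline.

def best_docking_score(scores_by_structure):
    """scores_by_structure maps a PDB ID to the list of scores from its
    repeated docking runs (e.g. 10 replicates per structure).
    Returns the best (most negative) score over all structures and runs."""
    return min(score
               for runs in scores_by_structure.values()
               for score in runs)

# Example: two structures of one protein, three replicate runs each.
scores = {"1ABC": [-28.1, -31.5, -29.9],
          "2XYZ": [-33.2, -30.0, -27.4]}
```

Under this scheme, a drug-protein pair is judged by its best-performing structure, which is one simple way to tolerate protein conformations unfavorable to the ligand.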
Virtual screening studies typically involve docking large chemical databases to one protein target, selecting compounds that score within the top 0.5–1% of the database and then further prioritizing them by visual examination. When experimentally validating these top candidates, a 5% hit rate can be considered a successful endeavor (where a good hit is a predicted compound showing an experimental binding affinity in the µM or lower range) 
. Depending on the target, the crystal structure, the software used, post-docking criteria (such as chemical clustering), and even the individual performing the visual examination, the hit rate can be improved to 10–40% (Cavasotto et al. had a 14% hit rate from 50 tested compounds; Sabio et al. had a 36% hit rate from 56 tested compounds).
In our case, neither the standard score threshold nor the known-inhibitor score was sufficient. With a normal score threshold of −30, docking 4621 drugs against 252 proteins resulted in 104,625 predicted interactions. This is roughly 1% of the docked interactions, so even selecting the top 1% of docking hits for validation becomes prohibitive for large-scale studies. It is important to note that each protein has different physicochemical properties: for some proteins, hundreds of compounds pass the −30 cut-off, while for others none pass. Using the known-inhibitor score as a cut-off therefore provides a threshold tailored to each protein; however, this method still predicted ~8000 interactions at its most stringent setting. Our consensus threshold allowed us to pick the top 1% (or any x%) of docked compounds with the best icm- and pmf-scores for each protein and to filter further from there. Through testing many combinations, we found that combining the consensus score with rank information gave the highest PPV (nearly 50%) and the highest enrichment factor (50 times better than the standard −30 score threshold, and 490 times better than random selection). This high enrichment for known interactions suggests that many of the other predictions that have not yet been experimentally tested may be true binding interactions.
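The per-protein consensus filter can be sketched as follows. The DockingHit structure, field names, and the exact intersection rule are assumptions for illustration, not the authors' implementation:

```python
# Hedged sketch of a per-protein consensus score-and-rank filter: keep only
# drugs ranked in the top `fraction` of a protein's hits by BOTH scores.
# Field names and the intersection rule are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class DockingHit:
    drug: str
    icm_score: float  # ICM docking score; more negative is better
    pmf_score: float  # potential-of-mean-force score; more negative is better

def consensus_top_fraction(hits, fraction=0.01):
    """Return the hits whose drug ranks within the top `fraction` of this
    protein's docked compounds by both the icm-score and the pmf-score."""
    n_keep = max(1, int(len(hits) * fraction))
    top_icm = {h.drug for h in sorted(hits, key=lambda h: h.icm_score)[:n_keep]}
    top_pmf = {h.drug for h in sorted(hits, key=lambda h: h.pmf_score)[:n_keep]}
    return [h for h in hits if h.drug in top_icm and h.drug in top_pmf]
```

Because the cut is a rank (a fraction of each protein's own hit list) rather than a fixed score, it adapts to proteins whose pockets systematically score high or low, which is the property the paragraph above attributes to the consensus threshold.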
There are limitations to this scoring scheme. Because the pmf-score is a statistical score comparing the docked interaction to known interactions in the PDB, a chemical with a different scaffold or a novel binding conformation may receive a poor pmf-score and be missed as a false negative. However, our foremost goal in this study was to eliminate as many false-positive predictions as possible and to obtain a high enrichment of true positives in our predicted interaction set; accepting some false negatives was therefore a reasonable trade-off. In addition, the consensus score is quite simple, with a linear separation method, and may not be as informative as a machine-learning algorithm trained on the docking scores of known ligands. However, we wanted an automated scoring method that did not depend on the existence of known ligands: even if a protein structure had only one known binder, or none, our method would still be able to select the top 1% of docking hits.
To date, cross-docking of proteins to compounds has generally been used for small datasets. For example, Huang et al. docked 40 targets against 40 compounds to check whether their docking method could distinguish a target's cognate ligands from those of the other targets. In our large-scale cross-docking study, by contrast, the use of a 1000-processor cluster was essential to completing the tens of millions of docking simulations in a timely manner. In addition, the large number of crystal structures and binding pockets involved required that much of the docking pipeline be automated.
High-throughput computational screening of drug-target interactions is a parallel approach to high-throughput experimental screening. Owing to differences in experimental methods, assay settings, and protein panels, different studies may present differing results. For example, small-molecule affinity purification methods that use whole-cell lysates would give different results from in vitro kinase assays that use a specific panel of proteins. In the case of gefitinib, two such studies showed distinct differences in their proposed cellular targets. Differences in methods are further compared in a study by Manley et al. We presented an example for BIM-8, which binds to PDPK1 differently in two similar in vitro experiments. For MAPK14, the experimental results for nilotinib also varied. We experimentally tested two purchasable approved drugs against MAPK14 and found that nilotinib was a strong nanomolar inhibitor and that zafirlukast was also an inhibitor, though not as potent. Thus, interactions that are computationally predicted to be very likely inhibitors may merit extra study even if experimental tests are initially negative.
In short, we have developed a computational pipeline that can run large-scale cross-docking of compounds to targets, together with stringent criteria that filter out a large proportion of false-positive interactions. The two case studies presented were selected on the basis of known experimental binding-assay data, so as to demonstrate the notable enrichment of known interactions achieved by our scoring and ranking criteria. We hypothesized that predicting a set of interactions with a higher PPV (enrichment of known interactions) would also lend confidence to the novel interactions in the set. This appears to have worked: we found validation for 31 predicted drug-target interactions that were not previously annotated in DrugBank, and we validated two other inhibitors of MAPK14. Other predicted drug-target interactions are currently undergoing experimental validation; the novel interactions discovered are not only potential drug-repositioning candidates but also provide insight into a drug's mechanism of action and adverse-effect profile.