We recently discovered a new class of thiazole analogs that are highly potent against melanoma cells. To expand the structure-activity relationship study and to explore potential new molecular scaffolds, we performed extensive ligand-based virtual screening against a compound library containing 342,910 small molecules. Two virtual screening approaches were carried out using the structure of our lead molecule: 1) a connectivity-based search using Scitegic Pipeline Pilot from Accelrys and 2) a molecular shape similarity search using Schrodinger software. In a test compound library, both approaches ranked similar compounds very high and dissimilar compounds very low, thus validating our screening methods. Structures identified from these searches were analyzed, and selected compounds were tested in vitro to assess their activity against melanoma cell lines. Several molecules showed good anticancer activity. While none of the identified compounds showed better activity than our lead compound, they provided important insight into structural modifications for our lead compound and also provided novel platforms on which we can optimize new classes of anticancer compounds. One of the newly synthesized analogs based on this virtual screening has improved potency and selectivity against melanoma.
Melanoma is the most deadly form of skin cancer, and its incidence is rapidly increasing, particularly in developed countries.1–3 Around 160,000 new cases of melanoma are diagnosed worldwide each year, and it is more frequent in males and Caucasians.4 According to a World Health Organization report, about 48,000 melanoma-related deaths occur worldwide per year.5 While early stages of melanoma can usually be cured by surgical removal, late-stage melanoma is known to be highly resistant to all current therapies.2 Despite tremendous efforts and significant progress in melanoma research in the past few decades,6–10 dacarbazine (DTIC) remains the only Food and Drug Administration-approved small-molecule drug for advanced melanoma.11–13 Even this benchmark drug produces a response in fewer than 15% of patients.10, 14 With the ever-increasing incidence of melanoma, there is an urgent need to develop more efficacious drugs.
In our ongoing efforts to search for small molecules as potential therapeutic agents for advanced melanoma, we recently discovered a new series of thiazole analogs that showed very potent activity against melanoma cells in vitro.15 One of the best compounds in this series has an IC50 value below 60 nanomolar. Screening results from the National Cancer Institute (NCI-60 screening) for our lead compound, LY-1-100, indicated nanomolar antiproliferative activity for all the cancer cell lines tested. Preliminary mechanism of action studies on this series of compounds indicated that they may interact with microtubules.15 In vivo testing with melanoma tumors showed substantial growth inhibitory activity for this series of compounds (unpublished data). To further expand our understanding of structure-activity relationships and to potentially identify new platforms for active compounds, we explored a compound library from the University of Cincinnati’s (UC) Drug Discovery Center, which contains 342,910 small molecules. All compounds are available to us for testing via an established agreement. Usually, compounds are shipped within 2 days of request. Therefore, we can obtain any compound we select easily for biological testing. Although other compound libraries may have more entries than does the UC library, often availability of the compounds is an issue. Therefore, we chose this library for our current studies.
We report herein two ligand-based virtual screening approaches using the structure of our lead molecule (Figure 1): 1) similarity search based on atom connectivity using Scitegic Pipeline Pilot software (Accelrys Software, Inc., San Diego, CA) and 2) similarity search based on molecular shape using Schrodinger software (Schrodinger, Inc., New York, NY). We showed that these two approaches are highly complementary and lead to different active molecular structures. These structures are quite suitable for further structural modification and provide new platforms for our anticancer drug discovery efforts.
To validate the connectivity similarity search approach, we first established a relatively small test compound library. This library contains 22 known compounds similar to LY-1-100, 10 known dissimilar compounds, and a diverse set of 2000 compounds pulled from the University of Cincinnati Drug Discovery Center Compound Library (342,910 compounds in total). These 2000 compounds were selected to be “druglike” in that they adhered to Lipinski rules and were filtered from a wide variety of functional groups. We seeded the 32 known compounds into the small test library so that they were evenly distributed through the database file. Then we conducted a connectivity similarity search for lead compound LY-1-100 against the 2032-compound test library. The small compound library was subjected to five similarity filters in parallel using the ECFP2, ECFP4, ECFC6, FCFP4, and FCFP6 property sets and Tanimoto distances, with LY-1-100 as the lead structure.16–18 The top 400 compounds most similar to LY-1-100 from each of these operations were ranked by calculated similarity; then an average rank for each compound across the methods was calculated. The detailed algorithm of the search is shown in Figure S1.
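The ranking-and-averaging step described above can be illustrated with a minimal Python sketch. It is not the Pipeline Pilot protocol: fingerprints are represented as plain sets of integer feature IDs, every compound is ranked by every method (the top-400 cutoff is omitted), ties in average rank are broken alphabetically, and all compound names and feature sets are hypothetical.

```python
from typing import Dict, List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient between two sets of fingerprint feature IDs."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(query: Set[int], library: Dict[str, Set[int]]) -> Dict[str, int]:
    """Rank library compounds for one fingerprint type (1 = most similar)."""
    ordered = sorted(library, key=lambda name: tanimoto(query, library[name]), reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def consensus_order(per_method_ranks: List[Dict[str, int]]) -> List[str]:
    """Order compounds by their average rank across all methods."""
    names = per_method_ranks[0].keys()
    average = {n: sum(r[n] for r in per_method_ranks) / len(per_method_ranks) for n in names}
    # Assumption: ties in average rank are broken by compound name.
    return sorted(average, key=lambda n: (average[n], n))


# Hypothetical toy data: two fingerprint types, three compounds.
query_fp1 = {1, 2, 3, 4}
library_fp1 = {"cmpd_A": {1, 2, 3, 4}, "cmpd_B": {1, 2, 9}, "cmpd_C": {7, 8}}
query_fp2 = {10, 11, 12}
library_fp2 = {"cmpd_A": {10, 11}, "cmpd_B": {10, 11, 12}, "cmpd_C": {13}}

ranks = [
    rank_by_similarity(query_fp1, library_fp1),
    rank_by_similarity(query_fp2, library_fp2),
]
order = consensus_order(ranks)
```

Averaging ranks rather than raw similarity values keeps the five property sets on a common scale, since Tanimoto values computed from different fingerprints are not directly comparable.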
A protocol was designed within Accelrys’s Pipeline Pilot in the UC Drug Discovery Center. The entire compound library was subjected to the same five similarity filters in parallel as described in the validation process. Then an average rank for each compound was also calculated. Our experience suggests that each of the property sets has some inherent strengths and weaknesses with respect to any given structure, so we used the above multiple, parallel similarity analysis to ensure that the optimal compounds by any method were not missed. The detailed algorithm of the search is shown in Figure S2.
We also tested our shape-based virtual screening using the same test database. The lead compound, LY-1-100, has a relatively rigid structure. Therefore, we chose its lowest energy conformer at an environment of pH 7.4 as an active conformation, which is consistent with our recently reported crystal structure of LY-1-100.15
All structures in this database were first prepared with the Ligprep software module (Schrodinger, Inc., New York, NY). During this preparation step, hydrogen atoms were explicitly added, all possible ionization states were generated between pH 5.0 and pH 9.0 using the Ionizer module, and the 3D molecular structures were minimized with the OPLS-2005 force field in the Schrodinger software suite. Tautomers were also generated in this step. All structures were then subjected to a shape similarity search using the Phase software module. Each structure was allowed to sample up to 100 conformers, for which molecular shapes were calculated and compared with that of LY-1-100. Similarity scores were ranked from most similar to least similar.
To make the intensive computation of shape-based virtual screening manageable for these 342,910 compounds, the database was first broken down into 17 smaller databases. Each small database contained about 20,000 molecules and was subsequently prefiltered to remove molecules containing reactive groups. Molecules in each small database were then prepared by the Ligprep software module to generate proper 3D structures with all possible ionization states between pH 5.0 and pH 9.0. Typically 30,000 to 50,000 structures were produced from this step for each database. Subsequently, 100 conformations were generated for each of the structures in the database, and the shape of each conformation was then compared to that of LY-1-100. A normalized shape similarity value was computed for each molecular conformation relative to that of LY-1-100, with 0 being the most dissimilar and 1 the exact same shape. We set the threshold for a hit as a shape similarity greater than 0.7, regardless of surface properties, to retain the maximum number of hits. All hits were ranked in descending order of similarity. These calculations were repeated for each small database until all the molecules in the original database had been processed.
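The bookkeeping around this step (splitting the library, applying the 0.7 threshold, and ranking the hits) might look like the following Python sketch. It does not call Ligprep or Phase. The even split, the use of the best-scoring conformer to represent each compound, and all identifiers and scores are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


def split_evenly(items: Sequence, n_parts: int) -> List[Sequence]:
    """Split a sequence into n_parts contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(len(items), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(items[start:end])
        start = end
    return parts


def select_hits(conformer_scores: Iterable[Tuple[str, float]],
                threshold: float = 0.7) -> List[Tuple[str, float]]:
    """Keep compounds whose best conformer scores above the threshold, most similar first."""
    best: Dict[str, float] = {}
    for compound_id, score in conformer_scores:
        # Assumption: a compound is represented by its best-scoring conformer.
        if score > best.get(compound_id, float("-inf")):
            best[compound_id] = score
    hits = [(cid, s) for cid, s in best.items() if s > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# 342,910 library entries split into 17 sub-databases of about 20,000 each.
sub_databases = split_evenly(range(342_910), 17)

# Hypothetical (compound, conformer score) pairs from one sub-database.
scores = [("mol_1", 0.65), ("mol_1", 0.82), ("mol_2", 0.70), ("mol_3", 0.91)]
hits = select_hits(scores)
```

In this toy input, `mol_2` is excluded because its score is exactly 0.70 and the threshold is a strict "greater than".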
Hits from the two similarity searches were analyzed and their structures examined. The top hits were requested from the UC Drug Discovery Center for a subsequent two-step in vitro assay. The first step was a fixed-concentration screening assay on two melanoma cell lines. Compounds that killed at least 50% of the cells in both melanoma cell lines at 10 µM were selected for the second step, in which their IC50 values on the two melanoma cell lines and one control cell line (fibroblast cells) were determined. Human malignant melanoma A375 cells and mouse melanoma B16-F1 cells were used to evaluate the compounds’ in vitro anticancer activity. Both A375 cells and B16-F1 cells were obtained from American Type Culture Collection (Manassas, VA). All cell lines were cultured in DMEM media (Cellgro Mediatech, Inc., Herndon, VA), supplemented with 10% FBS (Cellgro Mediatech, Inc.), 1% antibiotic/antimycotic mixture (Sigma-Aldrich, Inc., St. Louis, MO), and bovine insulin (5 µg/ml; Sigma-Aldrich). Cultures were maintained at 37°C in a humidified atmosphere containing 5% CO2.
For the initial screening step, A375 and B16-F1 cells were seeded in 96-well plates at a density of 4000 to 5000 cells/well. After cells adhered to the plate, media were changed and test compounds were added together with media at a fixed concentration of 10 µM in duplicate. After 48-h incubation, cell viability was evaluated by using the CellTiter 96® AQueous One Solution Cell Proliferation Assay.19 This assay is based on the bioreduction of MTS [3-(4,5-dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-(4-sulfophenyl)-2H-tetrazolium]. This tetrazolium dye is converted by living cells into a colored formazan product that is soluble in tissue culture medium and, therefore, provides a sensitive readout of cell viability that can be monitored spectroscopically. After incubating cells with compounds for 48 h, 20 µl of CellTiter 96 AQueous One Solution Reagent was pipetted into each well, which contained the samples in 100 µl of culture medium. Plates were incubated for another 1.5 h at 37°C in a humidified, 5% CO2 atmosphere. Absorbance was recorded at 490 nm in a BioTek EL800 96-well plate reader (BioTek Instruments, Inc., Winooski, VT). Each compound’s cell killing rate was normalized against no-treatment controls.
The second step was to measure the IC50 values of the active compounds selected in the initial step. We used the activity on fibroblast cells as a control to determine the selectivity of these compounds between cancer cells and normal cells. Human dermal fibroblast cells were purchased from Cascade Biologics, Inc. (Portland, OR) and cultured under the same conditions as the two melanoma cell lines. After cells were seeded on 96-well plates, they were exposed to a wide range of eight concentrations of each compound in quadruplicate to determine the IC50 values. Then the CellTiter 96 AQueous One Solution Cell Proliferation Assay was used to measure cell viability. IC50 values were calculated by nonlinear regression analysis using GraphPad Prism (GraphPad Software, San Diego, CA).20, 21 Each assay was repeated three times on different occasions.
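As a rough illustration of the normalization and of the 50% selection rule used in the first step, the Python sketch below converts 490 nm absorbance readings into percent cell death relative to untreated wells. The optional blank subtraction and all absorbance values are assumptions, not data from this study.

```python
from statistics import mean
from typing import Dict, Sequence


def percent_cell_death(treated: Sequence[float], untreated: Sequence[float],
                       blank: float = 0.0) -> float:
    """Percent cell death from absorbance readings, relative to untreated wells."""
    # Assumption: an optional medium-only blank is subtracted from both readings.
    viability = (mean(treated) - blank) / (mean(untreated) - blank)
    return 100.0 * (1.0 - viability)


def passes_first_step(death_by_cell_line: Dict[str, float], cutoff: float = 50.0) -> bool:
    """True when the compound kills at least `cutoff` percent of cells in every line."""
    return all(death >= cutoff for death in death_by_cell_line.values())


# Made-up absorbance readings for one hypothetical compound (duplicate wells).
death = {
    "A375": percent_cell_death([0.30, 0.34], [0.80, 0.80]),
    "B16-F1": percent_cell_death([0.45, 0.43], [0.80, 0.80]),
}
selected = passes_first_step(death)
```

With these illustrative readings the compound kills about 60% of A375 cells but only about 45% of B16-F1 cells, so it would not advance to the IC50 step.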
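To make the idea of estimating an IC50 by nonlinear regression concrete, here is a small Python sketch. It is a stand-in, not the GraphPad Prism procedure: it assumes a two-parameter Hill model with viability running from 1 to 0, and it replaces an iterative least-squares optimizer with a brute-force grid search. The concentrations and the synthetic viabilities are hypothetical.

```python
from typing import Sequence, Tuple


def hill_viability(conc: float, ic50: float, slope: float) -> float:
    """Fraction of viable cells under a two-parameter Hill model (assumed model)."""
    return 1.0 / (1.0 + (conc / ic50) ** slope)


def fit_ic50(concs: Sequence[float], viabilities: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of (IC50, slope) by exhaustive search over a fixed grid."""
    best = None
    for i in range(-300, 301):        # log10(IC50 in µM) from -3 to 3 in 0.02 steps
        ic50 = 10 ** (i / 50)
        for j in range(2, 41):        # Hill slope from 0.2 to 4.0 in 0.1 steps
            slope = j / 10
            sse = sum((hill_viability(c, ic50, slope) - v) ** 2
                      for c, v in zip(concs, viabilities))
            if best is None or sse < best[0]:
                best = (sse, ic50, slope)
    return best[1], best[2]


# Eight hypothetical concentrations (µM) and noise-free synthetic viabilities.
concentrations = [0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0, 30.0]
synthetic = [hill_viability(c, ic50=1.0, slope=1.0) for c in concentrations]
fitted_ic50, fitted_slope = fit_ic50(concentrations, synthetic)
```

On real, noisy quadruplicate data, a dedicated curve-fitting routine with confidence intervals would be preferable; the grid search is used here only to keep the example dependency-free.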
Results from the connectivity search using the test database showed that the 19 top-ranked compounds all came from the 22 seeded similar compounds according to the consensus score (average rank). The lowest-ranked seeded similar compound was 31st in the search results. The highest-ranking “dissimilar” compound was ranked 131st. This is approximately 6.5% of the way through the small 2032-compound database, making it analogous to the 22,106th-ranked compound from the original search of the 342,910-compound library. This validation exercise clearly showed that the connectivity-based methods we used rank similar compounds well above dissimilar compounds.
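The scaling from the test library to the full library is simple proportional arithmetic, which the short Python check below reproduces. The variable names are ours; the numbers are those quoted in the text.

```python
test_library_size = 2032        # 22 similar + 10 dissimilar + 2000 diverse compounds
full_library_size = 342_910     # UC Drug Discovery Center library
best_dissimilar_rank = 131      # highest-ranking seeded dissimilar compound

# Fraction of the test library ranked at or above the best dissimilar compound.
fraction = best_dissimilar_rank / test_library_size

# Equivalent position in the full library, truncated to a whole rank.
equivalent_rank = best_dissimilar_rank * full_library_size // test_library_size
```

Here `fraction` is a little under 0.065 and `equivalent_rank` evaluates to 22106, matching the figures above.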
The connectivity searches produced a combined total of 1292 compounds. Each property set identified a somewhat different set of similar structures: only 13 compounds were in common among all five searches, 49 in common among four searches, and 160 in common among three searches. Results were consistent with a limited number of highly similar compounds and a fairly rapid decline to relatively low-similarity compounds. Out of these 1292 compounds, we selected the top-ranked 110 compounds, middle-ranked 110 compounds, and bottom-ranked 110 compounds to screen their activity on the two cancer cell lines: A375 and B16-F1. We selected compounds from all three tiers not only to ensure testing of any highly similar (and therefore likely to be active) compounds (top-ranked) but also to search for active compounds of more divergent structure (middle- and bottom-ranked). Each of the 330 compounds was incubated with cells at a concentration of 10 µM for 48 h in duplicate. Percentages of cell death induced by these compounds are shown in Figure 2. Three active compounds were identified from the 110 top-ranked compounds (UC-297549, UC-791794, and UC-791475), one active compound from the 110 middle-ranked compounds (UC-406551), and four active compounds from the 110 bottom-ranked compounds (UC-791814, UC-791792, UC-792257, and UC-193691). IC50 values for these active compounds were subsequently measured on A375, B16-F1, and fibroblast cells (Table 1). The best compound identified, UC-297549, had an IC50 value less than 1 µM on A375 cells and good selectivity between cancer cells and normal cells. Notably, almost all of these compounds contained a basic terminal amine group close to an aromatic ring. This feature is absent from the lead compound structure (LY-1-100), suggesting that adding a properly positioned amine group may enhance the activity of LY-1-100.
Another important feature revealed by Table 1 is that a basic terminal amine group may help increase selectivity. Among the eight active compounds identified, only one did not have the basic terminal amine group (UC-406551). The selectivity of this compound was worse than that of the lead compound. The other seven active compounds, which have the basic terminal amine group, all had better selectivity than our lead compound. Finally, a basic terminal amine group is also expected to have the added benefit of a reduced logP value and improved water solubility.
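Two bookkeeping steps in this paragraph lend themselves to a short Python sketch: counting how many searches each compound appears in, and taking the top, middle, and bottom slices of a ranked list. The hit lists below are hypothetical, "in common among k searches" is read as "appearing in exactly k", and the placement of the middle slice (centered in the list) is an assumption, since the text does not say how it was chosen.

```python
from collections import Counter
from typing import Dict, Iterable, List, Sequence, Tuple


def overlap_histogram(hit_lists: Iterable[Iterable[str]]) -> Dict[int, int]:
    """Map k to the number of compounds that appear in exactly k of the hit lists."""
    appearances: Counter = Counter()
    for hits in hit_lists:
        appearances.update(set(hits))   # count each compound once per search
    return dict(Counter(appearances.values()))


def top_middle_bottom(ranked: Sequence, n: int) -> Tuple[Sequence, Sequence, Sequence]:
    """Return the first n, a centered block of n, and the last n entries."""
    mid_start = (len(ranked) - n) // 2  # assumption: middle block is centered
    return ranked[:n], ranked[mid_start:mid_start + n], ranked[-n:]


# Hypothetical hit lists from three similarity searches.
searches: List[List[str]] = [["a", "b", "c"], ["a", "b", "d"], ["a", "e"]]
histogram = overlap_histogram(searches)

# Ranks 1..1292 standing in for the combined, consensus-ranked hit list.
top, middle, bottom = top_middle_bottom(list(range(1, 1293)), 110)
```

Sampling all three tiers, rather than only the top of the list, is what allows structurally divergent actives to surface in the assay.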
With this in mind, we synthesized a new analog of LY-1-100 in which an amino group was added to the para-position of the phenyl ring (compound LY-2-183H). To our great satisfaction, LY-2-183H was more active and selective than LY-1-100 was (Table 1). As expected, LY-2-183H had much better water solubility, which is an important factor for future in vivo animal testing. Further optimization of this structure is currently in progress.
Encouraged by the results of the initial connectivity screening, and using the same protocol as described above in combination with substructural analysis, we selected an additional 40 compounds whose structures were close to the 8 active compounds. When we tested the activity of these 40 compounds on A375 and B16-F1 cells, 4 active compounds (10% hit rate, Figure 3) were identified (UC-792247, UC-792341, UC-98514, and UC-831104). Their IC50 values against all three cell lines are shown in Table 2. While these four compounds were not as active as UC-297549, they contained more structural features than LY-1-100 and may provide more opportunities for further structural modifications.
During the test compound library preparation step, the Ligprep process generated 6767 distinct structures from the original 2032 entries. Thirty-five ionization states were generated for the 32 seeded molecules, including 24 similar structures (protonated and unprotonated forms for both LY-2-103 and LY-2-84), and 11 dissimilar structures (Figure S3, seeded similar structures are highlighted in yellow, and dissimilar structures are highlighted in red). Validation search results for the seeded structures are listed in Figure S3. From these results, we can see that the top 26 entries from this search contained 20 of the 24 seeded similar structures (83% of the similar structures within the top 0.38% of the total database). The four remaining similar structures ranked at 107, 255, 281, and 1508. Twenty-three out of 24 (96%) similar structures were found within the top 4.2% of the total structures in this sample database. The saturated thiazole ring and the amide bond in LY-2-74 are responsible for its low similarity ranking of 1508 when compared with the molecular shape of LY-1-100 (Figure 1). This structure also showed the worst similarity rank in the connectivity-based similarity search (31st). In contrast, the ranks for the 10 dissimilar structures spanned from 332 to 6712 out of the 6767 structures. In summary, we found that this shape similarity search approach could identify similar compounds, which provides reasonable validation for screening the entire database.
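The enrichment figures in this validation can be recomputed with a few lines of Python. Only the four lowest ranks of the seeded similar structures are quoted in the text; the individual ranks of the other 20 are listed in Figure S3 and are not reproduced here, so the sketch uses 26 as a placeholder upper bound for each of them.

```python
from typing import Sequence


def recall_within_top(ranks_of_similar: Sequence[int], k: int) -> float:
    """Fraction of seeded similar structures ranked at position k or better."""
    return sum(1 for rank in ranks_of_similar if rank <= k) / len(ranks_of_similar)


total_structures = 6767

# Placeholder: 20 structures are known only to fall within the top 26,
# so each is entered as 26; the last four ranks are those quoted in the text.
ranks = [26] * 20 + [107, 255, 281, 1508]

recall_top_26 = recall_within_top(ranks, 26)
recall_top_281 = recall_within_top(ranks, 281)
database_fraction_281 = 281 / total_structures
```

The placeholder ranks affect only cutoffs below 26; the recall values at 26 and 281 depend solely on how many structures fall at or above each cutoff.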
The classical concept of lock and key interaction between a receptor and its ligand indicates that molecular shape is a critical factor for high binding affinity. Similarities in connectivity and molecular shape have overlap but also have their own individual spaces. To explore potentially new platforms based on the molecular shape of LY-1-100, we performed shape similarity searches against this database. These searches produced a little over 5000 structures that had a similarity score larger than 0.7 and did not contain any reactive groups. We selected the 88 top-ranked compounds and performed an initial single concentration screening (Figure 4). Two active compounds (UC-521092 and UC-398535), which are very close analogs, were identified, and their IC50s are shown in Table 3. The tight criterion (similarity score must be at least 0.7) was likely responsible for limiting greater variations for identified structures.
Although still not as active as LY-1-100, these two compounds were more potent than compounds identified from connectivity-based virtual screening, with IC50 values in the nanomolar range. One important hint revealed by this search was that a properly constructed amide linkage between the five-membered ring and the trimethoxyphenyl ring may be very beneficial to anticancer activity. Previously, we made one compound (LY-2-30) containing an amide bond that was not active. Based on the results of this screening, we prepared another analog in which the direction of the amide bond was reversed to directly connect the carbonyl group to the trimethoxyphenyl ring (LY-2-173b-OTs). The activity of this compound was greatly improved (Table 3).
The high activity of compounds identified from this shape similarity search likely resulted from the very close structural features to LY-1-100. Obviously, this similarity also limited their structural variations; therefore, a dramatically different scaffold from LY-1-100 will be difficult to obtain. Hence, at least for this database, shape similarity search will not provide completely new molecular structures. On the other hand, depending on the criteria of the connectivity search, completely novel structural features and new molecular platforms could be identified, but loss of activity may be expected. These two approaches are therefore highly complementary.
Generally speaking, the connectivity similarity search gave us more diverse bioactive scaffolds than did the shape similarity search. For example, we obtained eight bioactive compounds in the first round of whole-library connectivity similarity search and four bioactive compounds in the follow-up search. These included highly varied scaffolds: in compound UC-193691, the trimethoxyphenyl ring of the lead compound is replaced by an isothiourea group, and in compound UC-98514, the phenyl ring of the lead compound is replaced by an amidine group. In the shape similarity search, by contrast, we obtained only two bioactive compounds, and both shared very close features with the lead compound. On the other hand, the shape similarity search gave us more potent compounds than did the connectivity search. The two hits from the shape similarity search had IC50 values on melanoma cell lines in the nanomolar range, while hits from the connectivity search were mostly in the micromolar range. This difference was largely the result of the inherent algorithmic differences between the two screening approaches.
We did not choose receptor-based virtual screening because we had not confirmed the target for this new class of compounds at the initial stage. Later mechanism of action studies revealed that the target is the tubulin colchicine-binding site. We will report our results for this receptor-based virtual screening in the near future.
Using both connectivity- and shape similarity-based virtual screening against a large database, we identified 14 new molecules from the UC compound library that are active against melanoma cells. These molecules provide several new functional groups and structural features compared with the lead structure. In summary, the novel pharmacophoric findings from these virtual screening exercises are as follows: a) a terminal basic amine group can improve activity and selectivity; b) a properly constructed amide linker between the five-membered ring and the trimethoxyphenyl ring is beneficial to bioactivity; c) the thiazole ring in the lead compound is not necessary for bioactivity, and it can be replaced with an N-methylene hydrazine linker; d) an isothiourea group can replace the trimethoxyphenyl ring of the lead compound; and e) an amidine group can replace the phenyl ring of the lead compound.
This information is very helpful for understanding the structure-activity relationship and for further improving compound activity. While the shape similarity search did not provide diverse active structures, the active compounds it identified were generally much more potent than those identified by the connectivity search. The connectivity and shape similarity search techniques therefore proved complementary in our ligand-based drug discovery efforts. Further modification of the lead compound is in progress based on the information obtained from these virtual screenings.
Supporting Information Available: Flowcharts (Figure S1 and Figure S2) for connectivity-based virtual screening, structures for the 32 seeded molecules (Figure S3), and the ranking of these 32 molecules in shape similarity screening (Figure S3). This material is available free of charge via the Internet at http://pubs.acs.org
This research was supported by NIH grant R15CA125623. Zhao Wang acknowledges the generous support of the Alma and Hal Reagan Fellowship provided by the College of Graduate Health Sciences, University of Tennessee Health Science Center. Corporate agreement to access the UC Drug Discovery Center compound library was provided by funds from the College of Pharmacy, University of Tennessee Health Science Center. We thank Dr. David Armbruster at the University of Tennessee Health Science Center library for his editorial assistance.