|Home | About | Journals | Submit | Contact Us | Français|
In order to discover non-nucleoside inhibitors of HIV-1 reverse transcriptase (NNRTIs) that are effective against both wild-type (WT) virus and variants that encode the clinically troublesome Tyr181Cys (Y181C) RT mutation, virtual screening by docking was carried out using three RT structures and more than 2 million commercially available compounds. Two of the structures are for WT-virus with different conformations of Tyr181, while the third structure incorporates the Y181C modification. Eventually nine compounds were purchased and assayed. Three of the compounds show low-micromolar anti-viral activity towards either or both the wild-type and Y181C HIV-1 strains. The study illustrates a viable protocol to seek anti-HIV agents with enhanced resistance profiles.
Since the recognition of AIDS in 1981 the associated pandemic has claimed more than 25 million lives, and ca. 33 million people are currently infected with the causative agent, human immunodeficiency virus (HIV).1 AZT, which was introduced in 1987, was the first drug to show some success in combating HIV infection, though significant chemotherapeutic progress was not achieved until 1996 with the advent of triple combination therapy, HAART. Though further improvements have made the dosing more tolerable, HAART is only palliative, it is not globally available, serious side-effects are common, and, owing to the HIV’s high mutation rate, resistance to any drug regimen can be expected in time. Coupled with the lack of success in vaccine development,1 it is essential to seek new anti-HIV agents that feature efficacy against a broad spectrum of HIV variants, low cost, easy storage and administration, and reduced side effects.
Though inhibition of multiple HIV proteins is being pursued for suppression of viral replication, HIV reverse transcriptase (RT) has been the key target.2 The enzyme converts single-stranded viral RNA into double-stranded DNA, which is incorporated into the host cell’s genome by HIV integrase. The nucleoside class of RT inhibitors (NRTIs) including AZT are faulty substrates whose incorporation into the product DNA causes premature strand termination, while the non-nucleoside RT inhibitors (NNRTIs) are true inhibitors that bind to an allosteric site ca. 10-Å away from the polymerase active site.2 Notably, the widely prescribed Atripla is a once-a-day coformulation of two NRTIs, tenofovir and emtricitabine, and the NNRTI efavirenz.2
Our own anti-HIV efforts have been directed at discovery of new NNRTIs.3,4 Numerous compounds in multiple series have been found that are both potent against the wild-type (WT) virus and that have auspicious computed pharmacological properties. However, improvement in the performance against some common HIV strains that encode RT mutations is needed. In particular, though some of the NNRTIs are effective against the troublesome Lys103Asn (K103N) variant, the Tyr181Cys (Y181C) mutation has been consistently problematic.3,4 This RT modification arises early upon administration of most NNRTIs and confers resistance to many of them including nevirapine, the first approved drug in the class.2 Consequently, as reported here, virtual screening has been performed to seek leads that show activity against both wild-type HIV as well as a variant that encodes the Y181C mutation. More than two million compounds were screened in the docking calculations using three crystal structures of RT, a conventional WT structure, one with an alternative conformation for Y181, and a structure of Y181C RT. Though only nine compounds were ultimately purchased and assayed, one compound shows low-micromolar activity against both viral strains, while two others show similar activity against either the WT or Y181C strain. The study illustrates a viable protocol to seek leads that are active against viral mutants, and it has provided specific lead series that are being optimized with expectations for enhanced resistance profiles.
In apo HIV-RT structures such as 1rtj, shown in white in Figure 1, the side chains of both Tyr181 and Tyr188 are pointed “down” toward the assumed entrance of the NNRTI binding site.5 However, in the more than seventy reported crystal structures of HIV-RT in complex with NNRTIs, almost all have Tyr181 and Tyr188 rotated “up”, lining the π-rich binding pocket along with Trp229 and Phe227. Typically, the aromatic side chains of the two tyrosines and tryptophan engage in aryl-aryl interactions with the NNRTIs. At the time of this study there were also six crystal structures of RT-NNRTI complexes (1fko, 1rt3, 1rti, 1tv6, 2b5j, and 2be2) in the Protein Data Bank showing Tyr181 in the down orientation as for the apo enzyme.6-10 The structures were obtained from three different laboratories and they were co-crystallized with different ligands. The alternative conformations for Tyr181 and Tyr188 are illustrated in Figure 1.
It was originally thought that the Y181-down orientation was associated with weak inhibitors;11 however, our interest in this binding mode was reignited in 2005 by the report of the 2be2 structure and its ligand, R221239, which is a very potent NNRTI against both WT HIV (IC50 = 2 nM) and a Tyr181Cys variant (IC50 = 5 nM).9 A possible complication here was that the Y181-down orientation can also arise for complexes when the crystallization is performed at low pH, as in the 1fko case.12 However, examination of the experimental details indicated that this was not an issue for at least the 1tv6, 2b5j and 2be2 structures, which involved crystallizations near pH 7. Thus, the 2be2 RT structure emerged as a desirable one for virtual screening. In particular, Tyr181 does not appear to be interacting significantly with the ligand in the 2be2 structure (Figure 2); the aryl-aryl contacts are between the 3,5-dimethylphenoxy group of the ligand and Tyr188 and Trp229. Consequently, mutation to Cys181 is expected to be relatively benign. So, the idea was that other ligands that are well-accommodated in the 2be2 structure and that avoid contact with Tyr181 might also be effective towards Y181C variants.
In addition, to enhance the possibility of finding robust leads that could adapt to alternative RT structures, a conventional RT structure and one for a Y181C mutant were included in the virtual screening. The conventional 1rt4 structure,13 which was used in a prior docking exercise,14 and the high-resolution (2.5-Å) 1jla structure15 containing the Y181C mutation were chosen. The NNRTI binding sites for the three structures are overlaid in Figure 2. The three ligands, UC-781 (1rt4), TNK-651 (1jla), and R221239 (2be2), which were removed for the docking calculations, are illustrated in Scheme 1 in orientations similar to the manner in which they are overlaid in the binding site.
Our previous, reported effort in the virtual screening of NNRTIs used the 1rt4 structure and covered ca. 70,000 compounds from the Maybridge catalog.14 After an initial similarity filter and docking using Glide 3.5 with standard precision (SP),16,17 the top 500 compounds were re-docked in extra-precision (XP) mode.18 The top 100 of these were post-scored with an MM-GB/SA method that provided high correlation between predicted and observed activities.14 Though assaying of ca. 20 high-scoring library compounds failed to yield any actives, a viable core was identified that did lead to anti-HIV agents, which were potent against WT virus, but inactive towards the Y181C variant.4 In the present case, the latter problem has been approached by use of the three crystal structures and the screening has been expanded to cover the more than 2 million compounds in the ‘drug-like’ subset of the ZINC database, which was downloaded from zinc.docking.org.19 The ZINC collection has been compiled from catalogs of commercially available compounds.
For the three RT structures (1rt4, 1jla, 2be2),9,13,15 the crystallographic water molecules and ligands were removed, and the proteins were prepared using Schrödinger’s Maestro and Glide 3.5 software.16-18 The receptors were subjected to the standard protein preparation workflow, which involves a restrained, partial energy-minimization using the OPLS-AA force field.20 The three structures have less than 2.5-Å Cα root-mean-square deviation from each other in all three pair-wise cases. The atoms of the 17 residues that comprise the binding-site deviate by, at most, 1.2 Å across the three structures.
In our previous study,14 it was shown that Glide did well predicting the structures of the complexes of six NNRTIs with the wild-type 1rt4 structure. In order to investigate the ligand binding modes generated for the two other receptors, the ligands for 1jla and 2be2 were self-docked back into their protein structures using Glide 3.5 in SP mode. The results were essentially perfect, as illustrated in Figure 3. The average all-atom RMSD for comparing the crystal structures and Glide poses for the three ligands is 0.81 Å, with a maximum deviation of 1.03 Å for the 2be2 case.
All of the ZINC compounds were then docked into each of the three RT structures using Glide SP.16,17 The value in using multiple protein structures for virtual screening has been noted, though optimal ways to proceed and combine results are unclear.21,22 Presently, two approaches were taken, as summarized in Figure 4. First, the top 5,000 structures from each SP screen were submitted to extra precision (XP) scoring18 in a ‘parallel’ fashion. The XP processing takes about 15 minutes per compound on a 3-GHz Pentium processor, so it is typical to apply it only to compounds that score well using the SP mode. Among the 15,000 structures that passed the SP filter, there were only 17 structures in common, i.e., that were among the top-5000 SP scores for all three receptors.
The second approach started with the common compounds among the top 100,000 in each SP screen. These 4,684 compounds were then processed in XP mode. This ‘serial’ approach relies less on the rank produced from the SP scoring and more on the commonality of ranking reasonably well for all three receptors. The goal is to retain compounds that might end up with very favorable XP scores. In the end, XP scores were obtained for a total of nearly 20,000 compounds. Much additional filtering was still needed, as the intention was to purchase and assay roughly 10-20 compounds. This limit was set by cost considerations for materials and labor.
The three poses for compounds which ranked in the top 500 by XP scoring for each receptor were graphically displayed. Docked conformations from XP Glide frequently showed unreasonably short non-bonded contacts or twisted amides and esters; their rejection eliminated many of the top-ranking structures. Ligands often seem overly large for the binding sites and they end up too compressed. An example is provided in Figure 5 for the ZINC compound which ranked first for both the 2be2 and 1jla structures, and third for 1rt4 using the serial protocol. The poses have favorable interactions such as hydrogen bonding between the oxazinone and the backbone of Lys101 or Lys103, and stacking of the naphthalene moiety with the side chain of Tyr188; however, in each case the ester group is impossibly strained with an O=C-O-C dihedral angle near 90° and with associated overly short 1,6-H,H distances.23
Compounds with promising XP poses for multiple receptors as well as interesting and novel binding modes were sought. As further criteria, compounds with functionality that is prone to hydrolysis such as oximes, hydrazones, esters, and tertiary alcohols were avoided along with ones containing electrophilic functionality such as α,β-unsaturated ketones and other Michael acceptors. Such potentially reactive compounds are interesting to contemplate as a source of potentially viable cores, but they are not optimal for purchase given limited resources. With these considerations, compounds 1 - 4 in Figure 6 emerged as top choices. Additionally, after thorough visual inspection of the poses for the 17 molecules ranked in the top 5000 by SP for all three receptors, compound 5 was selected. Finally, top-scoring compounds for the individual receptors were considered, which led to picking compounds 6-9. Several additional compounds were of high interest, but they turned out to be unavailable for purchase. Analogues of some of the compounds also scored well, but to maximize diversity they were not selected except in the case of 1 and 2. Generally, for multiple analogues with similar scores, the smallest ones were given preference in view of the propensity for the docking to favor larger ligands and in order to begin potentially at a lower molecular weight in a lead series. For example, 3 and its 6-benzo-1,3-dioxolyl replacing furanyl analogue both scored well, as was the case for 1, 2, and 1-naphthyl replacing substituted phenyl analogues. Further details on the rankings for 1 - 9 are provided in the results section.
As a final check, predictions for pharmacologically important properties were computed using QikProp for the selected compounds.23 When QikProp is run on 1700 known oral drugs, 90% have MW < 470, QP log P < 5.0, QP log S > -5.7, and QP PCaco > 22 nm/s.3c,25 The rms errors for QikProp predictions are 0.5–0.6 log unit. Since the results for 1 – 9 in Table 1 compare well with the above limits and with the results for other NNRTIs, the compounds were purchased. The computed solubilities for 6, 7, and 9 are less than optimal for lead compounds, so this would need to be monitored if their series were pursued.
Compounds 1 – 9 and three known NNRTIs (Scheme 2), which are approved drugs, were obtained from commercial vendors. 1, 2, 3, and 8 were acquired and assayed as racemic mixtures. The chemical structures and purity of the compounds were verified by 1H NMR (Bruker DRX-400 and DRX-500) and mass spectroscopy (Waters Micromass ZQ) at the Yale Chemical Instrumentation Center. Activities against the wild-type IIIB strain of HIV-126 were determined using MT-2 human T-cells27 at an MOI of 0.1; EC50 values were obtained as the dose required to achieve 50% protection of the infected cells using the MTT colorimetric method. The CC50 for inhibition of MT-2 cell growth by 50% was obtained simultaneously.28 An analogous assay was performed using a variant strain of the virus that encodes the Y181C mutant form of HIV-RT.29
The assay results are listed in Table 2. The results for the reference compounds are similar to prior reports using similar assays.4b,30 Among the purchased compounds, 3 shows low-micromolar activity towards both the WT and Y181C-variant HIV-1, while 4 has 7.5-μM activity against the Y181C form, and 5 is a 4.8-μM NNRTI towards the WT virus. In view of the small number of purchased compounds and the prior difficulties in obtaining activity against the Y181C variant, the present virtual screening approach was very successful.
It may also be expected that the core structures for some of the remaining compounds in Figure 6 are viable; i.e., modification of substituents could lead to active analogues.4 For example, 1 and 2 bear some resemblance to PNU142721, which has 1-nM potency towards WT HIV-1, though only 1-μM activity towards a Y181C variant.31 The similarity is apparent when the compounds are aligned as in Scheme 3, which reflects the conformation of PNU142721 in the 1ikx crystal structure with K103N HIV-RT.32 This structure is 1rt4-like, having Y181 and Y188 both “up”, and the ligand’s furanopyridine moiety is well-stacked with the phenol of Y181, likely accounting for the loss in activity towards the Y181C mutant. The novel, internal hydrogen bond in the bound conformations of 1 and 2 and facile synthesis of alternative, substituted phenyl-ring derivatives were features that promoted their selections.
Glide SP and XP scores and rankings from the different protocol are reported in Table 3 for 1 - 9, while Table 4 contains SP and XP scores for the reference NNRTIs. In order to be in the top 100,000 compounds in each SP screen, a Glide score lower than ca. -9 was needed. Among the known NNRTIs in Table 4, only efavirenz would have reached this threshold for any receptor. Also, the purchased compounds had one or more SP scores at least as favorable as -10.0 with the best score being -11.8 for 6 with the 2be2 structure. The XP scores for the purchased compounds are also generally much better than those for the known NNRTIs with most purchased compounds having one or more XP scores in the -19 to -20 range. The XP results for active compound 3 were particularly auspicious since they were in this range for all three receptors. However, 3 would have been missed using just the parallel protocol (Figure 4) since it was not in the top 5000 compounds for any receptor with the SP scoring (Table 3). Similarly, 4 would have been overlooked in the absence of the serial protocol. 5 is the only active compound that passed the parallel filter, and it did yield extremely low XP scores with the 1jla and 1rt4 receptors. Finally, 6 – 9 were chosen in large part owing to their strong XP showing for one receptor. Though the results here are limited, it appears that this was not a productive strategy. Instead, seeking consensus on good performance for multiple receptors especially via the serial route appears to be the most promising protocol. Again, the serial approach involved performing the XP scoring on the 4,684 compounds that scored in the top 100,000 by SP for all three receptors. One could consider increasing the latter limit, which further deemphasizes the SP ranking, and then skip the parallel approach.
The Glide XP poses for the active compounds 3 – 5 with the three receptors are shown in Figure 7. 3 adopts a similar conformation and position in the binding site for all three RT structures, while 4 shows a different structure with 2be2 than the other two RT forms, and 5 has a different orientation with the 1rt4 receptor. When one of the binding modes is different, it is usually for the 1rt4 structure with both Tyr181 and Tyr188 “up”. For favorable aryl-aryl interactions between a ligand and Tyr181, the “up” orientation is preferred. This reinforces the sense that targeting the 1jla and 2be2 structures is appropriate for seeking NNRTIs that do not gain binding affinity via interaction with Tyr181 and are, therefore, resilient to the Y181C mutation.
The predicted structures of the complexes for 3 are reasonable based on our prior computations and lead optimization for oxadiazoles and oxazoles such as 10 and 11, which are 130-nM and 13-nM NNRTIs.4,14 The hydrogen bonds between the anilinyl NH and nitrile N of 3 and the backbone oxygen and NH of Lys101 were expected; however, it was not obvious that the larger naphthalene subunit would fit into the Tyr181-Tyr188-Trp229 π-box in place of the dihalobenzyl groups. Subsequent energy minimizations for complexes of the R and S enantiomers of 3 revealed a ca. 2 kcal/mol preference for the S enantiomer based on protein-ligand interaction energies. The results also indicated that the naphthyl group is well accommodated either heading “out” as in the 1jla and 1rt4 structures in Figure 7 or “in” as in the 2be2 structure. This variability could be helpful for maintaining activity against Tyr181 and Tyr188 variants.
For 4, the computed structures for the 1jla and 1rt4 complexes are unusual with the phenylthienopyrimidine fragment spanning the binding site diagonally with the phenyl ring near Trp229. The hydrogen bond between the amino group and the oxygen of Lys101 is typical, while the hydrogen bond between the ligand’s hydroxyl group and the backbone oxygen of Tyr188 is novel. The lowest XP score is for 4 with the 1jla structure, which is consistent with its activity towards the Y181C variant (Table 2).
For 5, the complexes with the 1jla and 2be2 models of RT are reminiscent of the actual complexes with TNK-651 and R221239 in these crystal structures (Figures (Figures22 and and3).3). The benzyl groups of 5 and TNK-651 overlay with the position of the dimethylphenoxy group of R221239 in the π-box. Furthermore, the uracil ring and attached CCO chain are positioned similarly to the COC-phenyl fragment of TNK-651 and CSC-furan unit of R221239, projecting into the channel between Phe227 and Pro236. Though there are also hydrogen bonds between the uracil fragment and the backbone oxygen and NH of Lys103 in the 2be2 pose, the pose received a poor XP score (Table 3). In comparison, the predicted pose for 5 with the 1rt4 structure has little precedent. In this case, the central ring is in the π-box, the uracil ring is exiting the binding site near Glu138, and there is an overly short intramolecular contact between the ether oxygen and an ortho hydrogen of the terminal phenyl ring. In view of the suggested similarities between 5 and R221239, it would be surprising if 5 did not bind to WT HIV-RT in a manner similar to that in the 2be2 structure.
Fundamentally, the present virtual screening exercise provided three compounds that are suitable starting points for lead optimization of NNRTIs that are expected to be active against both WT HIV-1 and Y181C RT variants. It is likely that simple modifications of some of the inactive compounds in Figure 6 can also provide entry into other active series. Since the variety of analogues that can be purchased is limited, our approach to lead optimization is focused synthesis guided by results of free-energy perturbation calculations.3,4 In fact, progress has been made with series based on 1, 3, and 5, as will be described in due course.
Virtual screening was performed by docking ca. 2 million commercially available compounds with HIV reverse transcriptase (RT). Three crystal structures of RT were used in order to find NNRTIs that would be effective against both wild-type HIV-1 and variants possessing the clinically troublesome Y181C RT mutation. Two of the structures, 1rt4 and 2be2, are for WT-virus with different conformations of Tyr181, while the third structure, 1jla, incorporates the Y181C modification. Two protocols were applied for the screening using standard-precision (SP) scoring and the more computational intensive extra-precision (XP) scoring with the Glide program. In the parallel mode, XP scoring was performed on the top 5000 compounds from the SP analysis for each RT structure. In the serial mode, XP scoring was applied to the common 4,684 structures that scored in the top 100,000 in the three SP runs. After much additional filtering by visual inspection of the top-ranked ca. 1500 complexes, the nine compounds in Figure 6 were eventually purchased and assayed. Three of the compounds showed low-micromolar anti-viral activity towards either or both the wild-type and Y181C HIV-1 strains. Though the number of assayed compounds was small, the serial screening strategy appears to be more promising than the parallel one in that it is less biased by the standard-precision scoring. The approach can clearly be expanded to include structures encoding additional resistance mutations. Optimization of the lead compounds and others suggested by the present virtual screening exercise is being pursued.
Gratitude is expressed to the National Institutes of Health (AI44616, GM32136, and GM49551) for support of this research. Receipt of the following reagents through the NIH AIDS Research and Reference Reagent Program, Division of AIDS, NIAID, NIH is also greatly appreciated: MT-2 cells, catalog #237, and nevirapine-resistant HIV-1 (N119), catalog #1392, from Dr. Douglas Richman; HTLV-IIIB/H9, #398, from Dr. Robert Gallo; and HIV-1IIIB (A17 Variant) from Dr. Emilio Emini.