|Home | About | Journals | Submit | Contact Us | Français|
Reliable and robust prediction of binding affinity for drug molecules continues to be a daunting challenge. We have simulated the binding interactions and free energy of binding of nine protease inhibitors (PIs) with wild-type and various mutant proteases by performing GBSA simulations, in which each PI’s partial charge was determined by quantum mechanics (QM) and the partial charge accounts for the polarization induced by the protease environment. We employed a hybrid solvation model that retains selected explicit water molecules in the protein with surface generalized Born (SGB) implicit solvent. We examined the correlation of the free energy with antiviral potency of PIs with regard to amino acid substitutions in protease. The GBSA free energy thus simulated showed strong correlations (r > 0.75) with antiviral IC50 values of PIs when amino acid substitutions were present in the protease active site. We also simulated the binding free energy of PIs with P2-bis-tetrahydrofuranylurethane (bis-THF) or related cores, utilizing a bis-THF-containing protease crystal structure as a template. The free energy showed a strong correlation (r = 0.93) with experimentally determined anti-HIV-1 potency.
The present data suggest that the presence of selected explicit water in protein, and protein polarization-induced quantum charges for the inhibitor, compared to lack of explicit water and a static force field-based charge model, can serve as an improved lead optimization tool, and warrants further exploration.
Virtual screening has been successful in the discovery of certain novel inhibitors, and a number of these inhibitors have advanced to clinical trials.1 When the structure of a target protein is available, virtual screening involves docking potential inhibitors against the protein and ranking the inhibitors by their predicted affinity using a scoring function. Molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) or Molecular Mechanics Generalized Born surface area (MM-GBSA) have been used in some instances in the post-processing and re-ranking of results from molecular docking.2 Of note, docking and scoring have currently been an integral part of drug discovery efforts and have produced documented successes, however, there is an urgent need for improvement of the accuracy of docking and scoring results.3 With this regard, Clark has described four areas of improvement, i.e., better scoring functions, treatment of protein flexibility, treatment of water molecules, and improved technology for data analysis of virtual screening results.1 The scoring functions fail if they do not properly account for solvation, entropy, or polarizability.1, 4
Water molecules form polar interactions with both proteins and ligands, fill empty spaces in cavities, and serve as an important component of molecular recognition. Lu et al. have analyzed water molecules present at the interfaces of 392 X-ray crystal structures of protein-ligand complexes and have reported high correlations between the polar van der Waals surface area of ligands and the number of ligand-bound water molecules in the crystal structures.5 In some instances, as many as twenty-one water molecules are bound to a ligand, with the average being 4.6.5 Despite their importance, the treatment of water molecules in docking calculations have not been widespread because of methodological limitations and poor understanding of how many and which water molecules are to be included in the simulation. By sampling multiple water positions during docking, Huang and Shoichet have recently assessed the ligand enrichment against twenty-four targets.6 Inclusion of water molecules increased enrichment against twelve targets, while remaining largely unaffected for the others.6 Fornabaio et al. have reported that waters play a significant role in the energetics of binding and have performed a hydropathic analysis of HIV-1 protease complexes.7 They have reported a significant improvement of the correlation between their HINT free energy scores and experimentally determined binding constants when appropriate bridging water molecules were taken into account.7
Most of the studies measure the accuracy of scoring functions by their ability to correctly rank the activity of a congeneric set of ligands. The prediction of activity of a ligand against mutant proteins is equally important in light of drug resistance in several diseases including acquired immune deficiency syndrome (AIDS) and cancers. In the present study, we focus on the resistance mutations of HIV-1 protease. HIV-1 protease acquires amino acid substitutions under the selection pressure of protease inhibitors (PIs), rendering HIV-1 resistant to such PIs.8 For example, an Asp30Asn (D30N) substitution causes resistance against nelfinavir. Some amino acid substitutions, while being initially selected under drug pressure against one inhibitor, confer on HIV-1 cross-resistance against other inhibitors.8 One example of such a substitution is M46I which is a primary indinavir-resistance-associated substitution, but M46I-containing HIV-1 is resistant to other inhibitors such as ritonavir, nelfinavir, and atazanavir.9 Analysis of the crystal structures of interactions of PIs with mutant proteases have shown that a number of drug resistance-associated mutations, such as G48V, V82A and I84V, occur in the catalytic active site of protease.10-12 Analyses of crystal structures of mutant proteases has revealed that there are, in general, no major conformational changes to the backbone conformation in such proteases and the changes in binding interactions from the wild-type may involve different polar interactions with a mutant side chain(s) or loss of favorable van der Waals contacts.13-15 Structural interactions, which are sometimes able to provide rational explanation of the mechanism of resistance, are not able to predict a priori, for example, whether V82A causes a higher resistance for ritonavir compared to DRV. More reliable predictions of the potency of inhibitors against protease with drug-resistant mutations would be of use in the design of novel and more potent inhibitors.
The free energy of binding of ligands to proteins can be simulated by methods such as free energy perturbation and linear interaction energy (LIE) approximation.16, 17 LIE is a semi-empirical method and is based on a linear approximation of polar and non-polar free energy contributions from molecular dynamics simulation averages.16 The LIE method has recently been used in calculating the binding free energy of N-sulphonyl-glutamic acid inhibitors to MurD ligase, and in probing the DNA replication fidelity.18, 19
In the current study we simulated the binding free energies of nine protease inhibitors against wild-type (PROWT) and mutant proteases (PROMT) with standard and hybrid GBSA protocols. While a number of water molecules are present in the X-ray crystal structures of protease-inhibitor complexes, a water molecule that mediates hydrogen bond interactions of the protease inhibitors with Ile50 and Ile50′ in the flap is common across several different inhibitor-protease complexes, and is present in the complexes for eight FDA-approved PIs. In this work, we explicitly incorporated water molecule bridging hydrogen bonds with the protease flap. For inhibitors nelfinavir and atazanavir, two additional water molecules that mediate hydrogen bonds between these inhibitors and other protease residues were also explicitly included. We compared the GBSA free energy of binding obtained from simulations with selected explicit water molecules in implicit solvation (a hybrid solvation model) with free energies that did not have the water molecule explicitly present. Furthermore, in the simulations, the inhibitor atoms had either forcefield-derived fixed partial charges or quantum mechanics-based partial charges that accounted for the polarization induced by the surrounding protein environment (a hybrid charge model). We also analyzed the correlation of the GBSA free energies obtained by the simulations with antiviral potency data (IC50 values). Our current data suggest that selective inclusion of explicit water molecule(s) and protein polarization effects may improve the robustness of GBSA free energy simulations and aid the design of inhibitors that are potent against both wild-type and multi-drug-resistant HIV-1 variants.
We explored various wild-type protease crystal structures from the protein data bank as starting templates for docking and subsequent free energy simulations. For convenience of protein expression and crystallization, some of the structures deposited in the protein data bank as wild-type structures have several mutations such as Q7K, K14R, R41K, L63P, I64V that are distant from the inhibitor binding site.15, 20, 21 These non-active site mutations may not drastically alter the conformation of the protease and its interactions with inhibitors compared to a pristine wild-type protease of HIV-1LAI or HIV-1NL4-3. While one non-active site mutant, depending on the residue and location, may not necessarily affect the binding affinity comparisons, crystal structures with four or five non-active site mutations are unsuitable to be used for free energy simulations, especially when comparing the simulation data with antiviral potency against wild-type HIV-1. We used 2FDE, obtained from the protein data bank, as the starting template for docking against darunavir (DRV), amprenavir (APV), GRL-98065, GRL-02031, and GRL-06579. 2FDE is a co-crystal of brecanavir and HIV-1LAI wild-type protease, and brecanavir (BCV) has a bis-THF ligand as a core.22 The PDB IDs of the crystal structures used for our simulations of the other inhibitors are as follows: 1HXB23 for saquinavir (SQV); 2O4P24 for TPV; 1OHR25 for nelfinavir (NFV); 1MUI26 for lopinavir (LPV), and 2AQU27 for atazanavir (AZV). Waters were not modeled in the crystal structure of LPV,26 but were present in all other structures.
Our goal was to explore the prediction of free energy of binding once a correct binding mode was obtained. In the present study, we demonstrate that the correct binding mode was reliably obtained when a ligand was docked against a protease structure obtained with a similar core. To decrease uncertainty arising due to cross docking of ligands to different proteases, we docked ligands against the native protease crystal or against a protease structure obtained with a similar core. It is important to keep in mind that protease side chains may undergo subtle conformational changes to accommodate protease inhibitors of different shapes and sizes (the molecular weights of the PIs in the current study range from 506 to 705) – and these changes might be difficult to capture by simple minimization following ligand docking to non-native crystal structures.
The interactions of protease inhibitors with wild-type HIV-1 protease were examined using computational structural modeling and molecular docking. Besides accounting for the conformational flexibility of the inhibitor, the polarization induced in the inhibitor by the protease was taken into consideration by employing polarizable quantum charges in the docking computations. The use of polarizable quantum charges has recently been shown to substantially improve the prediction of protein-ligand complex structures.28 The QM-polarized ligand docking protocol utilizing Glide version 4.5, QSite version 4.5, Jaguar version 7.0, and Maestro version 8.5 (Schrödinger, LLC, New York, NY 2007) was used as described below. The crystal coordinates described above were obtained from the Protein data bank (http://www.rcsb.org/) and used as starting templates. Hydrogens were optimized with constraints on the heavy atoms. The crystal water that mediates the interactions between protease inhibitors and the protease flap was retained, and all other crystal waters were deleted. Close interactions in the protease were annealed, and the docking grid was set-up. Polarizable ligand charges were determined at B3LYP/6-31G* level. The extra-precision mode of Glide,29, 30 which has a higher penalty for unphysical interactions, was used. For each docking simulation, up to five final poses were retained and were compared with available X-ray structures to verify that the conformations were reasonable. It was particularly important that the correct ring conformations were obtained during docking. LPV produced ring conformations that were different than the conformations obtained from crystal complexes in some docking solutions, and such conformations were discarded for the subsequent GBSA simulations.
The general principle of a GBSA model has been described before. The free energy of binding, ΔGbind is calculated as,2
where Ecomplex, Eprotein, and Eligand are the minimized energies of the protease-inhibitor complex, protease, and inhibitor, respectively;
where Gsolv(complex), Gsolv(protein), and Gsolv(ligand) are the solvation free energies of the complex, protein, and the inhibitor, respectively.
where GSA(complex), GSA(protein), and GSA(ligand) are the surface area energies for the complex, protease, and the inhibitor, respectively. The simulations were carried out using the GBSA continuum model31 in Prime, version 2.0 (Schrödinger, LLC, New York, NY, 2008). Prime uses a surface generalized Born (SGB) model employing a Gaussian surface instead of a van der Waals surface for better representation of the solvent accessible surface area.31
GBSA simulations were carried out for the protease-ligand complex structures obtained by molecular docking. The simulations were carried out in four different scenarios: i) no water molecules were retained in the protease, and ligand atoms have fixed charges based on the OPLS force field. This is the standard MM-GBSA simulation carried out in implicit solvation. The change in free energy obtained is denoted by ΔGmm, and the correlation coefficients are denoted by rmm. ii) no water molecules were retained in the protease and the ligand has protein polarized QM charges at B3LYP/6-31G* level. The protein polarized charge on the ligand is determined from the docked complex, and is used while computing Ecomplex, Gsolv(complex), GSA(complex) as well as for Eligand, Gsolv(ligand), and GSA(ligand). The change in free energy obtained is denoted by ΔGqm, and the correlation coefficients are denoted by rqm. iii) the bridging water molecule mediating the hydrogen bond interactions of inhibitors DRV, GRL-98065, APV, GRL-02031, GRL-06579, NFV, SQV, and AZV with Ile50 and Ile50′ in the flap was explicitly retained. This is a hybrid solvation model since implicit GBSA solvation terms for the whole system were used. For tipranavir (TPV), GBSA with the hybrid solvation model was performed by retaining a water molecule that bridges hydrogen bond interactions with Gly48 of one monomer of the protease. NFV and AZV were observed to have two additional bridging water molecules, and additional calculations in the presence of three explicit water molecules were performed for NFV and AZV. In the hybrid solvation model, the inhibitors either had MM charges (change in free energy and correlation coefficient denoted by ΔGmm/wat and rmm/wat, respectively) or iv) protein polarized QM (B3LYP/6-31G*) charges (change in free energy and correlation coefficient denoted by ΔGqm/wat and rqm/wat, respectively). In all simulations, the protease has OPLS charges. The strain energies of the ligands were taken into account.
DRV, GRL-98065, and GRL-02031 were synthesized as described previously.32-35 SQV and ritonavir (RTV) were kindly provided by Roche Products Ltd. (Welwyn Garden City, United Kingdom) and Abbott Laboratories (Abbott Park, IL), respectively. APV was a kind gift from Glaxo-Wellcome, Research Triangle Park, NC. NFV and indinavir (IDV) were kindly provided by Japan Energy Inc, Tokyo, Japan. LPV was synthesized by previously published methods.36 AZV was a kind gift from Bristol-Myers Squibb (New York, NY). TPV was obtained through the AIDS Research and Reference Reagent Program, Division of AIDS, NIAID, National Institutes of Health.
To generate HIV-1 clones carrying desired mutations, site-directed mutagenesis using the QuikChange Site-directed Mutagenesis Kit (Stratagene, La Jolla, CA) was performed, and the mutation-containing genomic fragments were introduced to pHIV-1NLSma, as previously described.35,37 Determination of the nucleotide sequences of plasmids confirmed that each clone had the desired mutations but no unintended mutations. Each recombinant plasmid was transfected into 293T cells using Lipofectoamine 2000 Transfection Reagent (Invitrogen, Carlsbad, CA), and thus generated infectious virions were harvested 48 h after transfection and stored at −80°C until use. HIV-1carrying D30N substitution (HIVD30N) was generated since residue-30 is in the active site (Fig. 1) and the D30N substitution is known to cause primary drug resistance against the FDA-approved protease inhibitor NFV.38 HIVI50V was generated since Ile50 is in the flap region (Fig. 1) and interacts with various PIs through a bridging water molecule.39 The I50V mutation has been associated with resistance to APV, LPV, and RTV.40 HIVV82I/I85V was also generated since Val82 is located in the active site and its substitution to Ile is associated with HIV-1 resistance to most PIs, presumably due to the expansion of the active site and loss of favorable van der Waals contact.15, 20, 40, 41 We recently reported the emergence of I85V as a resistant mutation against a PI, GRL-98065 and chose to study the combined effect of V82A/I85V.32 HIV-1-infected patients who failed to respond to PI-containing regimens often have HIV-1 variants carrying both active site and non-active site mutations in protease, and we chose to explore such clinical HIV-1 isolates, HIV2840 and HIV2841. The former contained L10R, M46I, L63P, V82T and I84V substitutions, while the latter contained M46I, L63P, V82T and I84V substitutions.
To determine the drug susceptibilities of certain laboratory HIV-1 strains, MT-4 cells were employed as target cells, as described previously,37 with minor modifications. In brief, MT-4 cells (105/ml) were exposed to 100 TCID50s of drug-resistant HIV-1 strains in the presence or the absence of various concentrations of drugs and were incubated at 37°C. On day 7 of culture, the supernatants were harvested and the amounts of the p24 Gag protein were determined by using a fully automated chemiluminescent enzyme immunoassay system (Lumipulse F; Fujirebio Inc., Tokyo, Japan).42 The drug concentrations that suppressed the production of p24 Gag protein by 50% (IC50) were determined by comparison of the amount of p24 Gag protein produced in drug-treated cell cultures with the level of p24 Gag protein produced in a drug-free control cell culture. All assays were performed in duplicate or triplicate on more than three different occasions and the data are shown as means ± 1 S.D.
Since the number of ligand-bound water molecules in high resolution crystal structures varies widely,5 we first analyzed the number of water molecules in the five protease-inhibitor crystal structures used in the current study. The total number of water molecules ranged from 51 (protease-NFV complex,25 PDB ID 1OHR) to 124 (protease-TPV complex,24 PDB ID 2O4P). Within 4Å of the bound ligand, the number of water molecules present in the complexes with PDB ID: 1HXB, 2FDE, 1OHR, 2O4P, and 2AQU were 3, 5, 5, 10, and 10, respectively. We then analyzed the number of water molecules bridging hydrogen bond interactions between the protease and the ligands. The water molecule forming tetra-coordinated hydrogen bond interactions with Ile50 and Ile50′ in the flap was the only water molecule interacting with the ligand in the protease complexes of SQV (PDB ID: 1HXB), BCV (PDB ID: 2FDE), and APV (PDB ID: 1HPV43). Since this tetra-coordinated water molecule is considered an important pharmacophore for protease-inhibitor interactions,44 it was explicitly included in our docking and GBSA free energy simulations. Three water molecules formed bridging hydrogen bond interactions in the NFV-protease (PDB ID: 1OHR) and AZV-protease (PDB ID: 2AQU) complexes (Figs. 2c, and q), and simulations were carried out with both one and three bridging water molecules explicitly present for these inhibitors. TPV directly formed hydrogen bonding with Ile50 and Ile50′, and a water molecule bridged hydrogen bonding with Gly48 in the flap (Fig. 2g),24, 45, 46 and was explicitly included in the simulations. Simulations involving LPV did not include any crystal waters since none was present in the native LPV-protease complex.26
We determined the susceptibility of six recombinant infectious HIV-1 clones to each of the PIs that were chosen in the present study: 7 clinical available PIs (SQV, NFV, APV, TPV, DRV, and AZV) and 2 experimental PIs (GRL-02031 and GRL-98065) in the HIV-1 p24 Gag production inhibition assay, as previously described.35, 37 As illustrated in Table 1, most of the recombinant clones showed reduced susceptibility to the PIs examined by up to 16.5-fold. However, it was also noted that HIVD30N and HIV2841 had increased susceptibility to certain PIs. The increased susceptibility of HIVD30N to TPV with 33.3-fold was notable, although HIVD30N was also less susceptible to SQV, LPV, and NFV (Table 1). Both HIVI50V and HIV2840 were also less susceptible to most of the PIs (Table 1).
We next determined and analyzed the binding modes of nine different PIs with PROWT and a protease with an amino acid substitution at position 30 from an aspartic acid to asparagine (PROD30N). SQV has four hydrogen bond interactions with Asp29 and Asp30 in the S2 site of the wild-type protease, but has no hydrogen bonds with Asp29′, or Asp30′ in the S2′ site (Fig. 2a). When protease acquires the D30N mutation, SQV loses two hydrogen bonds with Asp29 and Asp30, and does not form any new and compensating hydrogen bonds with other protease residues (Fig. 2b). Comparison of antiviral data of SQV shown in Table 1 indicates that there was a 3.9-fold decrease in antiviral potency with the D30N mutation. It is possible that the decrease of antiviral potency of SQV for D30N mutant is due to the loss of hydrogen bonds with residues 29 and 30 for the mutant. Examining the hydrogen bonds in the S2 site for NFV against PROWT and PROD30N mutant protease (Figs. 2c-d), one observes that NFV has more hydrogen bonds with Asn30 of PROD30N compared to Asp30 of PROWT. An X-ray crystal structure has also demonstrated that NFV has a larger number of hydrogen bonds with PROD30N than with PROWT.11
However, D30N is a major amino acid substitution38 associated with NFV resistance of HIV-1 and HIVD30N showed a 5.3-fold increase in IC50 values in antiviral assays (Table 1). Examining the structural interactions of LPV and TPV against PROWT and PROD30N mutants (Figs. 2e-h), we observed that both the inhibitors have more hydrogen bonds with residues 29 and 30 of PROD30N than they have for those of PROWT. Antiviral data show that while LPV has a 3.1 fold increase in IC50, HIVD30N was about 30 times more sensitive to TPV (Table 1). Comparison of the structural interactions of DRV, TPV, GRL-02031, and GRL-98065 with PROWT and PROD30N (Figs. 2k-p) revealed that all of them have more hydrogen bond interactions with PROD30N. APV, which is 5-fold more potent against HIVD30N (Table 1), has three hydrogen bonds with Asp29, Asp30, and Asp30′ for PROWT, and forms three hydrogen bonds with Asp29 and Asn30 for PROD30N in structural models (Figs. 2i-j). Thus, it is clear that while the number of hydrogen bonds may provide an intuitive understanding of binding and/or antiviral potency, it may not always explain why certain PIs show a decrease in antiviral potency with D30N substitution while other PIs show an increase.
Since the number of hydrogen bonds between PIs and protease does not always help predict the potency of PIs as discussed above, we examined the free energies under four different simulation conditions: with and without explicit water(s); and with QM or with MM charges on the inhibitor (Table 2, Supplementary Tables S1 and S2). It was assumed that an increase in the change of free energy of binding (ΔΔG more positive) is to be expected for a decrease in antiviral activity, and vice versa. With the D30N mutation in PROD30N, SQV showed a reduction in antiviral activity by 3.9-fold. With the bridging water and QM charges on SQV, the free energy change (ΔΔGqm/wat) of the SQV-protease complex increased by 4 kcal/mol for the D30N mutation (Table 2). HIVD30N was resistant to NFV by 5.3-fold compared to HIVWT and ΔΔGqm/wat showed an increase in the free energy of binding by +7 kcal/mol. HIV-1 containing PROD30N was more sensitive to APV and TPV, and ΔΔGqm/wat for PROD30N with APV and TPV was −12 and −5 kcal/mol, respectively (Table 2). The ΔΔGqm/wat values showed that both TPV and APV had a higher affinity for PROD30N than for PROWT (Table 2) and correlated with the increase in sensitivity to these inhibitors.
We next compared the free energy changes (ΔΔGmm/wat) simulated using the bridging water molecule but with force field based fixed MM charges on the inhibitors. The ΔΔGmm/wat values of +4 kcal/mol and +9 kcal/mol for SQV and NFV, respectively, showed that the simulation results in an increase in the free energy of binding correlating with the reduction of antiviral activity. However, the ΔΔGmm/wat value showed a wrong trend for APV (+12 kcal/mol) and was alike for TPV (−2 kcal/mol). The ΔΔGqm and ΔΔGmm values were negative for SQV and NFV, respectively, while it was positive for APV. Thus ΔΔGqm and ΔΔGmm values, which did not incorporate the bridging water molecule explicitly, simulated inaccurate changes in the free energy of binding for SQV, NFV, and APV. The crystal structure for LPV (PDB ID: 1MUI) did not have any water molecules present, and all simulations involving LPV were carried out with implicit water. The ΔΔGqm and ΔΔGmm values for LPV were +17 kcal/mol and +21 kcal/mol, respectively. The increase in the change in the free energy of binding of LPV with PROD30N was consistent with its decrease in antiviral potency with D30N substitution. For TPV, the negative ΔΔGqm and ΔΔGmm values indicated favorable free energy of binding for PROD30N compared to PROWT and correlated with the increase in antiviral potency of TPV with the D30N mutant.
In summary, the ΔΔGqm/wat values provided consistent trends of the change in free energy of binding for PROD30N, which is resistant to SQV and NFV, but is more sensitive to APV and TPV. ΔΔGqm and ΔΔGmm did not always provide the correct trend of change in the free energies of binding.
The GBSA free energies were simulated under four conditions: i) with implicit solvation terms and MM charges on both ligand and protein (ΔGmm); ii) with an explicit water and implicit solvation terms (hybrid solvation model) with MM charges on both ligand and protein (ΔGmm/wat); iii) with implicit solvation terms and protein polarized QM charges on the ligand and MM charges on the protein (ΔGqm); and iv) with an explicit water and implicit solvation terms (hybrid solvation model) with QM charges on the ligand and MM charges on the protein (ΔGqm/wat). All the free energy values determined are shown in supplementary Tables S1 and S2.
We analyzed the correlation of the free energies thus computed with the experimentally-determined antiviral potency data (IC50), and the resultant correlation coefficients are shown in Table 3, where Set-1 refers to the correlation coefficient for PROWT, PROD30N, PROI50V; Set-2 refers to the correlation coefficient for PROWT, PROD30N, PROI50V and PROV82I/I85V; and Set-3 refers to the correlation coefficients for PROWT, PROD30N, PROI50V, PROV82I/I85V, PRO2840, and PRO2841. For Set-1, rmm (correlation coefficient of ΔGmm vs IC50) showed a strong correlation for only TPV, LPV, and AZV (Table 3). The rmm value was poor for the other PIs and indicates the difficulty of obtaining reasonable correlations between free energies and antiviral potency. The rqm (correlation coefficient of ΔGqm vs IC50) values that represented correlation coefficients when the free energies were simulated with polarized QM charges on the ligands showed significant improvement and a strong correlation for DRV, APV, GRL-02031, and LPV. However, both rmm and rqm values were poor for GRL-98065 and SQV.
We next determined, for Set-1, the correlation obtained by the hybrid water model that has an explicit bridging water molecule between the inhibitor and the protease flap. The explicit water was treated as a part of the protein, and implicit solvation terms were used. The rmm/wat value (correlation coefficient of ΔGmm/wat vs IC50) represented a greater correlation than rmm for all PIs except TPV and AZV. TPV directly formed hydrogen bonds with Ile50, and Ile50′, and the water molecule included in this calculation formed hydrogen bonds with Gly48 of one monomer of the protease dimer. For other PIs, the bridging water molecule formed hydrogen bonds with the flaps from both monomers. The rqm/wat value (correlation coefficient of ΔGqm/wat vs IC50) had a high degree of correlation for all PIs except AZV. Thus, the explicit inclusion of the water molecule bridging hydrogen bonds with the flap and protein polarized QM charges for the inhibitors provided strong correlation (r > 0.75) for seven out of eight inhibitors. The correlation coefficient rqm/wat for NFV with three bridging waters was 0.97, a significant improvement over the correlation coefficient of 0.77 obtained with one bridging water molecule. The rqm/wat value for AZV also improved from 0.16 to 0.64 with the inclusion of three bridging water molecules.
We also determined rmm, rqm, rmm/wat, and rqm/wat values for Set-2, which included PROV82I/I85V as well (Table 3). The rmm value was poor for all PIs except TPV, while the rqm values showed good correlations for DRV, LPV and TPV. The rmm/wat value showed strong correlation for only SQV, and good correlation (0.55 < r < 0.75) for NFV. The rqm/wat value showed strong correlations for DRV, APV, and SQV, and good correlations for GRL-98065 and TPV. The rqm/wat value for NFV jumped from 0.59 to 0.92 with the incorporation of three bridging water molecules instead of one.
The correlation coefficients with a hybrid water model and with QM polarized ligand charges (rqm/wat) on the PIs were higher for most of the PIs compared to the correlation coefficients obtained without any explicit water molecule and MM charges (rmm). The only exception was for TPV, which was the only non-peptidomimetic inhibitor among the PIs examined. TPV displaces the tetra-coordinated water molecule and interacts directly with Ile50 and Ile50′ in the flap.46 The hydrogen bond interaction of the bridging water molecule with TPV and Gly48 of one chain might not be an important contributor to its potency. Also, in general, the rqm/wat values provided better correlations than rqm.
We next analyzed the correlations of the free energies with the antiviral potency (IC50 values) for PROWT, PROD30N, PROI50V, PROV82I/I85V, PRO2840 that contains L10R, M46I, L63P, V82T, and I84V and PRO2841 that contains M46I, L63P, V82T, and I84V substitutions (Set-3 in Table 3). The analysis of PRO2840 and PRO2841 was substantially complex since both proteases contained non-active site substitutions, but it was worth examining the ability of the GBSA energy function to correlate with antiviral activity when substitutions distant from the inhibitor were present. In general, the correlation coefficients for Set-3 turned out to be low, indicating a lower correlation between the free energies and antiviral potencies when non-active site mutants were present. For DRV, the rmm, rqm, and rmm/wat values indicated that the corresponding free energies had no correlation with IC50 values. However, the value of rqm/wat for DRV was 0.83, showing a strong correlation for simulations with polarized QM charges on the inhibitor with a hybrid water model (Table 3, Fig. 3a). For SQV, strong correlations were obtained for both ΔGmm with IC50 (rmm = 0.78) and ΔGqm/wat with IC50 (rqm/wat = 0.84). For APV and TPV, rqm/wat values were 0.60, and 0.50, respectively; but there was no correlation when other GBSA protocols were used for these inhibitors. The rqm/wat values were greater than 0.75 for two inhibitors, and were greater than 0.55 for five inhibitors. The rqm, and rmm/wat values were greater than 0.55 for two inhibitors. Thus, the hybrid water model and inclusion of polarization effects, compared to the other protocols, simulated free energies with better correlation with anti-viral IC50 for more inhibitors.
Obtaining a correct relative rank of activity for inhibitors that had potency in the nanomolar range has been a real challenge for scoring methods.3 Scoring methods providing sufficiently high correlation with potency may serve as a lead optimization tool. In our data set, DRV and GRL-98065 have a bis-THF group as the core; GRL-06579 and GRL-02031 have a Cp-THF as the core, and APV has a THF group as the core. All these inhibitors were extremely potent against PROWT, and had a narrow range of activity ranging from 0.3 to 28 nM (Table 1). We computed ΔGexp from the antiviral IC50 values (Table-4). Such transformations have recently enhanced the understanding of the binding of N-sulphonyl-glutamic acid inhibitors to MurD ligase, and in understanding the efficiency of DNA catalysis.19, 47 Substitution of the THF group of APV with the bis-THF group (DRV) resulted in a −1.3 kcal/mol improvement in the free energy of binding. GRL-98065 has a 1,3-benzodioxole group as P2′ ligand compared to an aniline group in DRV which resulted in ΔGexp of GRL-98065 being lower (i.e. better binding affinity) by −1.4 kcal/mol than DRV. Both GRL-06579 and GRL-02031 have a Cp-THF as P2-ligand but have different substituents interacting with the S1′ and S2′ locations in the protease active site. The ΔGexp of GRL-06579 is lower by 1.4 kcal/mol than GRL-02031. Amongst these five PIs, APV and GRL-02031 have the worst ΔGexp, and GRL-98065 has the best ΔGexp. The absolute magnitudes of our simulated GBSA free energies are larger than those measures experimentally due to the force field parameters, and the form of the energy function, but trends and correlation with experiments can be deduced. Our ΔGqm/wat values simulated the correct trend by predicting that the worst binding free energy was for APV and GRL-02031, and the best binding free energy was for GRL-98065 (Supplementary Table S1). None of the other GBSA protocols predicted the trend correctly (Supplementary Table S1 and S2). We further analyzed the correlation of the free energies calculated by our methods against IC50 values for these five PIs. The rmm and rmm/wat values were 0.10 and 0.03, respectively indicating that there was no correlation of GBSA free energies simulated by using MM charges, with or without explicit inclusion of water. The difficulty of obtaining good correlation of free energies against antiviral activity is evident from this small but non-trivial data set. The rqm value was 0.74 indicating that incorporation of protein polarized QM charges on the inhibitors substantially improved the correlation. Including the bridging water molecule with protein polarized QM charges on the PIs resulted in a rqm/wat value of 0.93 for PROWT (Fig. 3b). Thus a simulation using the hybrid water model and protein polarized QM charges on the ligands resulted in a strong correlation while there was no correlation for simulations with fixed MM charges.
For our docking and subsequent GBSA simulations, we used the crystal coordinates of BCV-protease complex (PDB ID: 2FDE)22 as our starting template. BCV, DRV and GRL-98065 had a bis-THF moiety as the core ligand, and APV, GRL-06579, and GRL-02031 had cores that had a high similarity with bis-THF. The high correlation obtained with ΔGqm/wat-IC50 indicated that free energies obtained with a hybrid water model and polarized QM charges on the ligands would be a promising approach for lead optimization. Currently, we are exploring the utility of GBSA free energy simulations once a correct binding mode is obtained, and we wanted to use native or native-like crystal structure as an template as a proof-of-principle study. We have not explored docking ligands against non-native protease crystal structures in the current study. A more exhaustive examination that includes poses obtained by cross docking needs to be performed.
Progress has been made in improving the robustness of scoring functions to determine the relative affinity of inhibitors for target proteins, but there are vast scopes where significant improvement can be made.1, 3 Part of the reason for the inaccuracies in the scoring functions arise because i) the current methodologies largely account for enthalpic changes while completely ignoring entropic changes; ii) they do not properly treat protein flexibility; iii) they do not properly account for solvation and desolvation effects; and iv) they do not account for the polarization induced by the protein and the ligand on each other.1, 3, 7 Most of the studies on scoring functions deal with rank-ordering the activity of a congeneric set of ligands. The prediction of activity of a ligand against mutant proteins is very important in light of drug resistance mutations that emerge in many therapeutic areas.
We have explored the correlation of GBSA free energies with antiviral potency (IC50 values) of nine different PIs against wild-type and mutant proteases. Correlation with enzymatic Ki would result in similar conclusions since these PIs have good cell penetration and are highly protease-specific inhibitors. Further more, these PIs do not form aggregates, and do not have problematic properties discussed by Shoichet et al.48, 49 The GBSA free energies were simulated by four different protocols: i) fixed MM charges in implicit solvent (ΔGmm); ii) protein polarized QM charges of PIs in implicit solvent (ΔGqm); iii) fixed MM charges in a hybrid solvent (ΔGmm/wat); and iv) protein polarized QM charges of PIs in a hybrid solvent (ΔGqm/wat). The hybrid solvent protocols have retained an explicit water molecule(s) that bridges hydrogen bonding with the protease in an otherwise implicit solvent environment. The protein polarized ligand charges were determined during ligand docking in the protein environment at B3LYP/6-31G* level, and these polarized charges were maintained for the ligand for the full GBSA simulation cycle. These enable us to analyze the effect of a different charge model compared to the static charges from a force-field (supplementary figures S3 and S4). Free energies from a successful GBSA protocol should provide a good correlation with the antiviral activity against wild-type and mutant protease. Resistance mutations in the protease active site arise primarily due to loss of favorable binding interactions. Resistance caused by non-active site mutations is more difficult to understand and rationalize although some attempts to elucidate the mechanism have recently been made.12, 50-52
We initially compared the correlation coefficients for the PROWT, PROD30N, and PROI50V (Set-1). The antiviral potency of the inhibitors for wild-type and mutant proteases are shown in Table 1 and the inhibitors have potency (IC50 values) in the nanomolar range. The rqm/wat value was higher than 0.75 for 7 out of 8 PIs for Set-1, with rqm/wat being more than 0.90 for 4 PIs (Table 3). PROD30N was associated with SQV-, NFV-, and LPV-resistance of HIV-1 and increased susceptibility to APV and TPV (Table 1). It is noteworthy that rqm/wat showed substantial correlation values of greater than 0.75 even though the fold-change in antiviral activity for PROD30N is non-monotonic. AZV is the only PI that did not show a correlation of ΔGqm/wat with the IC50 value with one bridging water molecule. By including two additional bridging water molecules for NFV and AZV, rqm/wat improved to 0.97 and 0.64 respectively. Selective inclusion of explicit water molecules needs to be explored and validated. For Set-1, rmm/wat, rqm and rmm had strong correlation for 4, 3 and 3 PIs respectively.
Correlations for Set-2 included the mutant PROV82I/I85V. While V82I is located in the protease active site, and represents one of the major resistance-associated mutations for PIs, I85V was selected as a resistance-associated mutation for GRL-98065 even though it does not form a direct van der Waals contact.32 For Set-2, the number of PIs that showed strong correlation coefficients for rqm/wat and rmm/wat were 4 and 1 respectively, while rqm and rmm did not show strong correlations for any PI. Set-3 included Set-2 as well as PRO2840 and PRO2841, which had mutations distant from the active site. We included such mutations because they are seen in drug-resistant HIV-1-harboring patients, and we wanted to test the ability of the GBSA energy function to correlate with the antiviral IC50 for such protease substitutions. The rqm/wat values showed strong correlation coefficients for only DRV and SQV for Set-3, and moderate correlation for 3 other PIs. The correlation coefficients, rmm/wat and rqm were from 0.55 to 0.75 for two PIs in Set-3, suggesting that those two PIs had a moderate correlation between the free energies and the IC50 values.
Analysis of correlation coefficients in Table 3 indicated that rqm/wat had strong (r > 0.75) and moderate (0.55 < r < 0.75) correlation for more PIs than rmm/wat, rqm or rmm. This suggested that the GBSA free energies simulated with a hybrid water model with protein polarized QM charges on the inhibitors had a higher correlation with antiviral IC50 than the other free energy simulation protocols. Others1, 3 have suggested that improved treatment of solvation and polarizability may improve the robustness of scoring functions, and we have demonstrated that our use of selected explicit water molecule(s) and protein polarized QM partial charges on the inhibitor provided greater correlation with antiviral potency. Further improvement might be achieved by improving upon the hybrid water model, by accounting for the polarization induced by the inhibitors on the protein atoms, and by including changes in entropy.
This work was supported in part by the Intramural Research Program of Center for Cancer Research, National Cancer Institute, National Institutes of Health (DD and HM), a grant from the National Institutes of Health (GM 53386 to AKG), a grant from the Kumamoto University Global Center of Excellence Program, Global Education and Research Center Aiming at the control of AIDS (HM) supported by the Ministry of Education, Culture, Sports, Science, and Technology (Monbu-Kagakusho), a Grant-in-Aid for Scientific Research (Priority Areas to HM) from the Ministry of Education, Culture, Sports, Science, and Technology (Monbu-Kagakusho) of Japan (HM), and a Grant for Promotion of AIDS Research from the Ministry of Health, Labor and Welfare (Kosei-Rodosho) of Japan (HM). The work utilized the computational resources of the Biowulf cluster at the NIH.
Supporting Information Available: The simulated GBSA free energies are given in Supplementary Tables S1 and S2. The charges on DRV for OPLS2005 force field (MM level) and protein polarized charges at the B3LYP/6-31G* level are given in supplementary figures S3 and S4. This material is available free of charge via the Internet at http://pubs.acs.org.