|Home | About | Journals | Submit | Contact Us | Français|
Design of selective ligands for closely related targets is becoming one of the most important tasks in the drug development. New tools, more precise than fast scoring functions and less demanding than sophisticated Free Energy Perturbation methods, are necessary to help accomplish this goal. The methods of intermediate complexity, characterizing individual contributions to the binding energy, have been an area of intense research in the past few years. Our recently developed quantum mechanical/molecular mechanical (QM/MM) modification of the Linear Response (LR) method describes the binding free energies as the sum of empirically weighted contributions of the QM/MM interaction energies and solvent-accessible surface areas for the time-averaged structures of hydrated complexes, obtained by molecular dynamics (MD) simulations. The method was applied to published data on 27 inhibitors of matrix metalloproteinase-3 (MMP-3). The two descriptors explained 90% of variance in the inhibition constants with RMSE of 0.245 log units. The QM/MM treatment is indispensable for characterization of the systems lacking suitable force-field expressions. In this case, it provided characteristics of H-bonds of the inhibitors to Glu202, charges of binding site atoms, and accurate coordination geometries of the ligands to catalytic zinc. The geometries were constrained during the MD simulations, which characterized conformational flexibility of the complexes and helped in the elucidation of the binding differences for related compounds. A comparison of the presented QM/MM LR results with those previously published for inhibition of MMP-9 by the same set of ligands showed that the QM/MM LR approach was able to distinguish subtle differences in binding affinities for MMP-3 and MMP-9, which did not exceed one order of magnitude. This precision level makes the approach a useful tool for design of selective ligands to similar targets, because the results can be safely extrapolated to maximize selectivity.
Numerous approaches to the prediction of binding affinities have been developed, ranging in complexity from high-speed scoring function to computationally expensive Free Energy Perturbation (FEP) and Thermodynamic Integration (TI) methods. Each group of methods has its own area of application.
The scoring functions1 are classified as force-field based approaches,2,3 empirical scoring methods4–7 utilizing simplified energy functions, and knowledge-based scoring functions,8–10 which are derived from the prevalencies of atom-atom contact pairs in complexes of known structures. The scoring functions were developed for virtual screening and docking applications where the speed is critical. Simplifications introduced to achieve the speed, e.g. the omission of both a thorough conformational sampling and description of some interactions, on the other hand, limit the accuracy of the binding affinity predictions to 2–3 orders of magnitude, which frequently results in ranking violations.
The other end of the spectrum is formed by FEP,11 TI,12 and similar13 methods, representing the most rigorous way of calculating binding free energy through slow, step-wise transformation of the ligand parts.14 FEP is used with molecular dynamics (MD) or Monte Carlo (MC) simulations to produce extensive conformational sampling. In order for the FEP results to converge, a large number of pair-wise interaction evaluations and small perturbation steps are required. Most of the computational time is spent on searching unimportant conformations, making the methods less attractive for applications, which require estimation of binding affinity of a large set of compounds.
The methods of intermediate complexity, bridging the gap between scoring functions and FEP/TI methods, have recently become the subject of intense research. Conformational sampling, as one of the most time-consuming processes, needs to have the extent carefully optimized, to speed up the computations. The Mining Minima approach15 deploys a specialized algorithm16 to find relevant low-energy conformations, which are then used in the computation of the configuration integral. The LR (Linear Response) approximation17–19 and the MM-PBSA approach20,21 methods perform MD conformational sampling only for the initial and final states of the complex formation.
In MM-PBSA method, the binding free energy is calculated as the sum of molecular mechanical (MM) energy, solvation free energy, and entropic contributions averaged over a series of snapshots from MD trajectories. The electrostatic contribution to the solvation term is calculated by solving the Poisson-Boltzmann (PB) equation, and the non-polar contribution to solvation is assumed to be proportional to the solvent-accessible surface area (SA or SASA in Equation (1)).
The term Δ for the energies is calculated as the difference of the energy ensemble averages (denoted by angle brackets) of the complex and free components (ligand and free protein). In the QM/MM LR approach, the van der Waals and electrostatic energies, separately scaled by adjustable coefficients α and β, were replaced by the single QM/MM interaction energy for the time-averaged structure of the protein-ligand complex, scaled by α.25 The third Δ-term is the difference in the ensemble averages of SASA for the solvated ligand in the bound and free states. Protein-solvent and solvent-solvent interactions are included in adjustable coefficients α, β, and γ.26–28 The van der Waals coefficient α depends on the studied compounds and/or the used force field (e.g. different values were obtained for thrombin23,26,29). The magnitude of α was analyzed with respect to hydrophobicity of the binding site.26 The coulombic scaling coefficient β, initially assumed to be equal to ½, based on the linear response of the surroundings to electric fields,17–19 varies with the ligand nature and ligand surroundings (protein or water).30–32
Our replacement of the MM-based van der Waals and electrostatic terms by the QM/MM term25 opened a new avenue of applications for the LR approach, especially for the ligand-protein interactions mediated by coordination, covalent, or highly polarized weak bonds. The approach, when applied to zinc-dependent matrix metalloproteinase (MMP) 9, significantly improved description and prediction ability of the LR approach as given by Equation (1). Consequently, it was interesting to examine whether the approach consistently achieves a high performance. In the present study, the QM/MM LR approach was used to describe inhibition of MMP-3 by the same set of ligands as previously used for MMP-9.33 Both MMPs are involved in etiology of cancer invasion, metastasis, inflammation, and necrosis, and MMP-3 is involved in arthritis. Maximization of the differences in binding is required for the development of selective inhibitors. The isoenzymes have quite similar binding sites,34,35 so the additional goal was to examine whether the QM/MM LR approach is capable of distinguishing small differences in binding. The majority of known MMP inhibitors are active against several MMPs and the development of selective inhibitors is a nontrivial task.34 From a broader viewpoint, the development of selective inhibitors for similar isoenzymes is becoming one of the main tasks of computational drug design. This task requires a new class of methods because high-throughput docking and scoring are too coarse to capture the subtle variation in binding to similar targets.
Published inhibitory potencies, characterized by the inhibition constants Ki at 37 °C, of a series of 27 hydroxamate derivatives33 towards MMP-3 were used (Table I below). No experimental errors were published for the inhibitory potencies. Our experience with similar analyses of MMP inhibition shows that better than 90% reproducibility can be achieved in carefully executed experiments with purified enzymes. Coordinates of the MMP-3 catalytic domain were taken from the X-ray crystal structure of the MMP-3 in the complex with N-[[2-methyl-4-hydroxycarbamoyl]but-4-yl-N]-benzyl-P-[phenyl]-P-[methyl] phosphonamide, as deposited in the PDB (file 1B3D).36 Among the 21 available PDB files of MMP-3 inhibitor complexes, the ligand in this file is most similar to the studied compounds.
Ligand structures were sketched and minimized using the Sybyl 7.1 suite of programs37 running under Irix 6.5, and using the Tripos force field.38 After full geometry optimization using DFT/B3LYP-6-31G** approach,39 the Mulliken atomic charges40 were calculated. The FlexX program4,41 was used to dock the inhibitors into the active site of MMP-3 that was prepared as described below. The bound conformation in the active site was selected from the top 30 poses using, as the primary criterion, the distances between both hydroxamate oxygens and the catalytic zinc atom (2 Å preferred), and the FlexX ranking as the secondary criterion.42
Following hydrogen addition to the catalytic domain of MMP-3 (file 1B3D), the protein structure was relaxed by energy minimization with the coordinates of heavy atoms constrained. The complex was solvated with two layers of TIP3P water molecules,38 the positions of which were optimized by 10,000 cycles of conjugate gradient minimization, with constrained protein and ligand atoms. In the solvent equilibration phase, water molecules were subjected to MD simulation for 20 ps, keeping the protein and ligand atoms fixed. These steps allowed the solvent molecules to create a reasonable hydration shell of the ligand-receptor complex. The flexible regions was set to include protein atoms and water molecules closer than 5 Å to the ligand and minimized using MM with a distance-dependent dielectrics and conjugate gradient algorithm with a convergence criterion of 0.001 kcal/(mol Å).
The QM/MM approach43 was used for the optimization of the best docked structures (Step 2), and for the calculation of the single-point QM/MM energies of the time-averaged structures, to be utilized in QM/MM LR correlations (Step 4). In both cases, the QM region consisted of side chains of His205 and His211, the backbone atoms and side chains of His201 and Glu202, the entire inhibitor, and the catalytic zinc ion. The backbone atoms were included to obtain valid QM/MM cuts. The rest of the protein was treated with OPLS-AA force field,44 including charges, with the protein and water outside 5 Å of the superimposed ligands frozen. For the QM part of the QM/MM calculations, DFT functional B3LYP45 was deployed. The 6-31G* basis set was used for the interface atoms between the QM and MM regions. The LAV3P** basis set was employed for geometry optimization. For Zn, S, and P atoms, this means the Los Alamos effective core potential (ECP)46,47 with all the s functions and the last p and d Gaussian uncontracted; for the remaining atoms, it implies 6-31G** basis set. All calculation converged by reaching the root mean squared change in density matrix elements of 5.0×10−6 or below. The B3LYP functional was selected for structure optimization, because it provides as good or better geometries and energies as those from correlated ab initio methods for the first-row transition metal complexes.48
Sybyl with Tripos force field38 was used to perform the MD simulations under isothermal/isobaric (NPT) conditions The QM/MM-optimized coordination bonds of the catalytic zinc, slightly different for each ligand, were restrained with a harmonic potential with the force constant 200 kcal/(mol×Å2) and power 2. The Mulliken charges40 for the QM/MM optimized structures were used for the QM region, and the Gasteiger-Hükel49charges were used for the rest of the protein and water. The time step of the simulations was 1 fs, with the nonbonded interactions updated every 25 fs, using the cutoff of 12 . All residues within 6 of any ligand atom were mobile, and the remaining part of the protein and water was kept frozen.
During the heating phase, the temperature of the system was raised from 0 to 300 K in 15 ps. After 100-ps equilibration run, the production phase was carried out at 300 K for 200 ps. To characterize flexibilities of the binding site and those of the ligands, the root mean square deviations (RMSD) of the c-α carbons and the inhibitor atoms, both with respect to the initial conformations, were calculated throughout the MD simulations. The time-averaged structures, obtained from the readings recorded in 100-fs intervals, were collected at appropriate times. A brief minimization, using the Tripos force field with a distance-dependent dielectrics and the Powell conjugate gradient algorithm with a convergence criterion of 0.001 kcal/(mol Å), produced structures with practically standard bond lengths and angles, with the dihedrals representing the ensemble. In each simulation run, at least one reading showed the geometry of the system that was very close to the relaxed time-averaged structure.
The time-averaged structures were previously shown to result in similar LR correlations as those using ensemble averages.50,51 The single-point QM/MM energy calculations on the time-averaged structures provided the substitute of the ensemble averages of the QM/MM energies, which was obtained as the difference between the QM/MM energies of the complex and the free components (ligand and protein). The QM and MM regions were defined in the time-averaged structures in the same way as in the previous part. The SASA terms were calculated using the ProsSat option in the Homology module of the Insight II modeling package.52 The 1.4-Å probe was used, although recent studies recommend smaller probes.53 Molecular images were produced using Chimera54 (Figure 1) and Sybyl38 (Figure 4).
The experimental potencies were correlated with the LR terms using the linear least-squares fits55 based on Equation (1). The robustness and predictive abilities of the regression equations were probed by cross-validation. For this purpose, the fits to the potency data were generated leaving out one or more inhibitors from the calibration process. The resulting equation for each fit was used to predict the potencies of the omitted compounds. The predictive ability of Equation (1) was characterized by the root mean square errors (RMSE) of predicted potencies. We used the leave-one-out (LOO) approach and the leave-several-out (LSO) approach, where six inhibitors were randomly omitted, and the process was repeated 200 times.
The aim of this study was to construct the QM/MM LR model for the inhibitory potencies of 27 hydroxamate inhibitors of MMP-3 (Table I) and compare the results with the previously established QM/MM LR correlation25 for MMP-9 inhibition by the same set of ligands. Both enzymes have very similar binding sites, as suggested by a high correlation (r2=0.849) of the experimental activities of the investigated inhibitors (see below).
Our previously developed four-tier approach was applied,25 comprising: (1) docking of ligands with the selection of the bound pose based on the metal binding geometry; (2) QM/MM minimization of the complexes, (3) MD simulation of the complexes, with the zinc coordination bonds harmonically restrained to the optimized geometry from Step 2; and (4) calculation of the single-point QM/MM energy and the SASA term for the time-averaged structures from Step 3. The outcomes of Step 4 were correlated with experimental data using Equation (1).
The inhibitors33 of MMP-3 were docked into the active site of MMP-3 (PDB file 1B3D)56 using FlexX.4,41 The ranking of poses was based upon the distance between the catalytic zinc and both hydroxamate oxygens as the primary criterion, and FlexX score as the secondary criterion. The average distances from catalytic zinc to carbonyl (O1) and hydroxyl (O2) oxygens for the selected complexes were 1.874±0.071 and 1.973±0.088, respectively. To reduce the QM/MM convergence time, for the top complexes of each ligand, the region within 6Å of the ligand was briefly MM-optimized by the conjugate gradient minimization.
The QM/MM geometry optimization resulted in remarkably similar structures of the hydroxamate groups of inhibitors bound to the catalytic zinc. The average bond lengths and average Mulliken charges of zinc and the coordinating atoms are shown in Figure 1. The zinc charge was also fairly similar in all complexes, with the average value of 1.160 |e|. In Toba’s57 and our25 work, the catalytic zinc center adopted a trigonal-bipyramidal geometry, and the charge on zinc was found to be +0.8 |e| and 1.059 |e| respectively. Ryde,58 Hou,59 and Hoop et al.60 found the partial charge on the zinc atom to be 0.488 |e|, 0.549 |e| and |0.688 |e|, respectively. These differences are expected and may be due to different enzyme-inhibitor system, and use of different Hamiltonians or different basis sets in the QM calculations.
The catalytic zinc in the X-ray structure of MMP-3 (PDB file 1B3D) is coordinated to the nitrogens of three histidines, His201Nε2 (N2), His205Nε2 (N3), and His211Nε2 (N4), as seen in Figure 1, with the coordination bond lengths close to 2.1 Å. In the QM/MM optimized structures, the bond lengths of the Zn-O1 and Zn-O2 bonds (Table I) were 2.072±0.033 Å and 2.135±0.067 Å, respectively, indicating similar roles of the two oxygens in zinc coordination. The average coordination bond lengths to histidine nitrogens were even more similar (Figure 1). The hydroxamate O2-bound hydrogen (H2) was in all cases transferred to Glu202 but remained H-bonded to hydroxamate.
To assess coordination geometry, the bond angles (Table I) were compared with ideal values for two possible penta-coordinated geometries: trigonal bipyramid (TB) and square-based pyramid (SP). The average angles O1-Zn-N2, O1-Zn-N3, and O1-Zn-N4 were 110.17±2.06°, 157.46±13.41°, and 84.97±3.27°, respectively (ideal values for TB: 120°, 120°, 90°; and for SP: 90°, 180°, 90°). The average angles O2-Zn-N2, -N3, and -N4 were 101.57±5.47°, 94.28±3.47°, and 156.02±4.79°, respectively (both TB and SP ideally adopt the values 90°, 90°, 180°). The average bond angle of O2-Zn-O1 is 77.90±2.06° , approaching the ideal value of 90° for both TB and SP configurations. The average bond angles for N2-Zn-N3, N3-Zn-N4 and N2-Zn-N4 are 89.85±4.58°, 98.90±4.60° and 98.74±3.81°, respectively (ideal values are 120°, 90°, 90° for TB; and 90°, 90°, 90° for SP). These data indicate that coordination geometry of the catalytic zinc is closer to SP, with N2 forming the out-of-plane vertex, than to TB.
For the free/bound hydroxamate groups, the average bond lengths were (in Å): C1=O1 1.233/1.260, C1-N1 1.357/1.323, N1-O2 1.400/1.376, and O2-H2 0.979/1.406, respectively. In the QM/MM geometry optimization of the complex, the proton from the hydroxamate OH group was transferred to the Glu202 oxygen but remained H-bonded to the parent oxygen. The H-bonds in MMP-3 complexes seem to be stronger than in other MMP complexes: the average distance between the Glu202 oxygen and O2 oxygen of the hydroxamate obtained here (2.461 Å) is shorter than equivalent distances for the hydroxamate complexes with MMP-7,61 MMP-8,62,63 and MMP-9,25 which all exceeded 2.6 Å. The average C1=O1, C1-N1, and N1-O2 bond lengths for the bound state obtained here (1.260, 1.323, and 1.376 Å) are closer to deprotonated hydroxamate (1.294, 1.317, and 1.378 Å) than to neutral hydroxamate with respective distances of 1.279, 1.337, and 1.416 Å.64 This fact is due to the H-bond formation and does not indicate that hydroxamates bind as ionized molecules. The same behavior was observed for the hydroxamate binding to MMP-9.25 The QM/MM approach handled the zinc-ligand charge transfer,65 bond length changes, polarization, and ionization66 upon binding to zinc, which would be difficult to describe by even advanced force fields.66–72
The QM/MM optimized complexes were subjected to MD simulation using the harmonic bond length and angle constraints applied to the zinc coordination sphere. The average root-mean square deviation (RMSD) values of the backbone α-carbons, as compared to the initial conformation, for all the complexes were rather small, 0.150±0.030 Å per α-carbon, throughout the simulation (Figure 2). For all inhibitors, the average RMSD for the center of mass of all inhibitor atoms was 1.494±1.069 Å. Standard deviations (SD) of the individual RMSD values characterized flexibility of individual bound inhibitors. The RMSD and SD values for individual inhibitors are summarized in Table I and plotted in Figure 2. The RMSD values illustrate conformational sampling on the QM/MM optimized structures. The extents of movements, characterized by the SD values, of the α-carbons and of the inhibitor molecules are not correlated (the squared correlation coefficient r2 = 0.009). While inhibitor 24 (Table I) is the most moving ligand, the α-carbons exhibited the greatest mobility when ligand 3 was bound in the active site.
After 200-ps MD simulations for complexes, free ligands, and free protein were completed, the differences, between the complex and free components (ligand and protein), in the van der Waals and electrostatic energies, as well as the differences in the SASA terms of the bound and free ligands, were calculated for the time-averaged structures. In the correlations using Equation (1), the results obtained after 5 ps were used and are listed in Table I. Longer simulation times provided similar samplings that did not improve the correlations using Equation (1) and the results are not shown. The time-averaged structures were in all studied cases similar to at least one structure that was sampled in the MD simulation. For the same time-averaged structures, single-point QM/MM interaction energies were calculated and are summarized in Table I. These quantities explain inhibitor potency trends for groups of closely related analogs.
Compounds 1, 9, and 14 differ in R2 substituents (4-methoxyphenyl, phenyl, and pyridine, respectively), resulting in the activity decreasing in this order. For all three compounds, both the QM/MM and SASA terms increase in this order (Table I).
Compounds 9 – 11 (Ki = 34.9, 72.9, and 64.4 nM, respectively) are structural analogs of compound 1 but exhibit lower activity as compared to 1 (Ki = 5.20 nM). The 4-methoxyphenyl group in R2 position of 1 is replaced with phenyl (9), 4-methylphenyl (10), and 4-fluorophenyl (11) substituents. Compounds 9 – 11 have higher QM/MM energy values than 1; however, they also exhibit diminished burial of SASA (Table I).
The binding affinity decreases stepwise on moving from ortho (13) to meta (12) and para (11) fluorine substituents. A similar trend was also observed for the SASA and QM/MM energy values, which are most favorable for 13 and least favorable for 11 (Table I). However, the fluorine substituent of 13 forms no H-bond interactions, in contrast to fluorines in 12 and 11, which are H-bonded with a water molecule and the backbone NH group of Tyr223, respectively.
The LR correlations with experimental binding affinities using Equation (1) for both QM/MM and MD energies are summarized in Table II, along with the correlations from the preceding steps. Estimation of binding affinities based on simple linear regression of the FlexX scores with experimental activities resulted in poor correlation (r2=0.057). The use of the MM energies obtained from the minimized complexes in Step 1 increased the correlation to r2=0.425, with statistically significant van der Waals and electrostatic coefficients. The QM/MM energy minimization slightly improved the correlation in Step 2 to r2=0.465, thanks to the dominating SASA terms, because the QM/MM energies were not statistically significant. Inclusion of the QM/MM energy for the optimized geometry did not improve the correlation (r2=0.465), and the pertinent regression coefficient α had a negative sign as well as an inflated error. The cross-correlation between the QM/MM energy and the SASA term was weak (r2=0.228).
The use of the energy terms from the MD conformational sampling in Step 3 increased the correlation to r2=0.817 (Table II). The van der Waals coefficient α had a negative value. For both α and β coefficients, the error terms were almost equal to the coefficient estimates. Refitting Equation (1) with only the SASA term after Step 3 resulted in correlation of r2=0.797. Compounds 1, 2, 4, 15, 16, and 25 (Table I), which saw the greatest description improvement in Step 3 (0.4 – 1.2 log units) compared to step 2, exhibit a significant change in the conformation of the bound ligand upon MD sampling. The change in the binding conformation is accompanied by a large increase of the SASA term, except for compounds 15 and 25. The cross-correlation of the van der Waals and SASA terms was very weak (r2=0.045).
The best model, obtained for Step 4, contained just the single point QM/MM energy for the time-averaged structures and the SASA term. This treatment was most beneficial for compounds 6, 7, 22–24 and 26, resulting in improvement of residuals by 0.2 – 0.4 log units. The correlation for all 27 compounds was characterized by r2 = 0.903 and the standard deviation SD = 0.245 log units (Table II), reflecting a good agreement between actual and calculated values (Table I). For each coefficient, the probability grater than the F ratio was less than 0.0001, implying a negligible likelihood of the random occurrence of a significant coefficient magnitude. The cross-correlation between the QM/MM energies and SASA terms was weak as indicated by the r2 value of 0.342. The dominance of the SASA terms, clearly seen in Table I, is probably reflecting the effect of the inhibitor burial in the binding site. This phenomenon was described previously by Luque and Freire in the analysis of binding energies of several ligand-protein complexes.73
The adjustable coefficient κ in Equation (1) yields an attractive term of about −5.228 log units (Table II), providing a base value for the inhibitors, which is then modulated by the QM/MM interaction and SASA terms. The values of the QM/MM terms (Table I) are negative and the associated positive coefficient α (Table II) implies that a strong interaction between the inhibitor and the binding site is important for inhibition. The SASA terms (Table I) are negative, implying burial of the surface area upon binding. The positive value of the associated coefficient γ (Table II) indicates that the removal of mostly hydrophobic surface area from the contact with water upon binding promotes the binding, as could be expected for the hydrophobic effect.74 The obtained values of γ (Table II: 0.0058-0.011) are in the same range as the slopes of the linear dependencies of solvation free energies of various compounds on SASA: 0.006,22 0.007,75 and 0.016.76 The separation of the overall SASA into polar and non-polar components resulted in a slightly better correlation (r2=0.912, Step 4); with the coefficient values (1.106±0.340)×10−3 and (4.570±1.080)×10−3 for polar and nonpolar components, respectively.
The robustness of the regression equations and their predictive abilities were extensively probed by cross-validation. For this purpose, the fits to the potency data are generated leaving out one or more inhibitors from the calibration process. The resulting equation for each fit is used to predict the potencies of the omitted compounds. The leave-one-out (LOO) procedure and especially the leave-several-out (LSO) procedure with 200 random selections of 6-member test sets provided a thorough evaluation. The RMSE values using LOO (0.262) and LSO (0.264) were only slightly higher than the RMSE value of the complete data set (0.245). It appears that the conformational sampling included in the time-averaged structures (Step 3) and a good description of the zinc coordination bonds (Step 4) are jointly required for a good correlation with experimental inhibitory potencies. The improving quality of correlations by addition of individual Steps is documented in Figure 3.
A similar study as described above for MMP-3 was performed earlier for the same set of compounds inhibiting MMP-9.25 The protocols were slightly different: here the free ligands were simulated separately, while in the former study, the ligand ensembles from the MD simulation of the complexes were used, as previously suggested.21,77 When structures of entire binding sites are compared, they are quite similar for both isoenzymes.34,35 The superposition of the binding sites of MMP-3 and MMP-9 using C-α carbon atoms is shown in Figure 4.
As follows from the docking studies, the R1 substituents bind to subsite S1, and the R2 substituents occupy subsite S2’ in all cases. The X group is an extension of the zinc-binding group, and interacts with subsite S1. Most of the inhibitors reported in the literature bind extensively on the primed side. The non-primed side seems to be more open, with less specific binding pockets, where the inhibitor-binding features could be more diverse. This behavior can also be seen in studied hydroxamate complexes. Beyond subsite S1, the protein cleft is wide open, and the inhibitor binding is more flexible and less specific. Binding on the non-primed side may improve overall binding and enhance inhibitor potency.
For MMP-3, subsite S1 consists of residues Gly161, Asn162, Val163, Leu164, Ala165, His166, Ala167, and Tyr168. The residues forming subsite S2’ are Ala217, Leu218, Met219, Tyr220, Pro221, Leu222, and Tyr223. MMP-9 differs from MMP-3 in the following substitutions. In subsite S1, which is the most different of all six subsites, Gly161, Asn162, Val163, and Tyr168 in MMP-3 were replaced with Asp185, Gly186, Leu187, and Phe192, respectively, in MMP-9. In subsite S2’, the only difference is that Leu222 was replaced with Met422.
In spite of the relative similarity of the binding sites and mutual dependence of the inhibitory potencies in the two isoenzymes (Figure 5, statistical indices below), the QM/MM energies and SASA terms are not correlated: r2 = 0.353 and 0.669, respectively. A major difference is seen in the range of the QM/MM energies, which is about an order of magnitude larger for MMP-3 than for MMP-9. In addition to the slightly different protocols, possible reasons include shorter and stronger H-bonds of the inhibitors with Glu202, and different coordination geometries (SP in MMP-3 and TB in MMP-9), resulting in different coordination bond energies. Additional H-bonds of some ligands with the MMP-3 binding site also contribute to the difference in the QM/MM energies. The hydroxamate NH group forms a H-bond with the backbone carbonyl oxygen of Ala165 for all studied compounds except 15 and 17. Compounds 17, 21, and 27 have the NH group of hydroxamate engaged in an intramolecular H-bond with the P=O group. The P-O group of 20 and 21, and the P=O group of all the ligands except 15, 17, 20, and 21 form a H-bond with the backbone amide group of Leu164. The P=O group is also H-bonded with the backbone amide of Ala165 except for 4, 8, 13, 17, 22, 23, and 27. The side chain amino group and alkoxy oxygen of 15 form H-bonds with the backbone carbonyl of Leu222 and the backbone amide group of Leu164, respectively. The pyridine nitrogen of 14 and alkoxy oxygen of 17 form H-bonds with the backbone amide of Tyr223.
The magnitude of the coefficient α that is scaling the QM/MM energies in Equation (1) is much smaller (1.27×10−4, Table 2) for MMP-3 than the magnitude for MMP-9 (3.59×10−3) that was obtained for the same set of inhibitors using a slightly different protocol. This difference precludes an assessment of the transferability of the energy-scaling coefficient α between the MMP-3 and MMP-9 systems. The desolvation-characterizing coefficients γ remain remarkably similar in the MMP-3 and MMP-9 systems, and fit the range of the experimentally observed values for desolvation of organic molecules.22,75,76
Prediction of the MMP-3 data set using Equation (1) with the coefficients obtained from MMP-9 resulted in RMSE of 0.447 log units, whereas prediction of the MMP-9 data set with MMP-3 coefficients resulted in RMSE of 0.480 log units. The combined model (N=57) using 27 MMP-3 and 28 MMP-9 (the energy and SASA terms values in Table 1 of the original report25) inhibitors exhibits an intermediate agreement with experiment: r2 = 0.759 and RMSE = 0.429.
The ability of the QM/MM LR approach to truly capture small differences of binding affinities for similar metalloproteins is illustrated in Figure 5. The predicted affinities for MMP-9 are scattered around the dotted identity line. The experimental affinities for MMP-3 follow the same trend as the affinities for MMP-9, as indicated by the full line with the slope of 0.744±0.063 and the intercept of −1.845±0.509 (r2 = 0.849). In spite of this similarity, the predictions for MMP-3 (open points) are closely associated with the experimental MMP-3 data. The differences between affinities for MMP-3 and MMP-9 are most significant for the four best MMP-9 inhibitors, grouped in the low left corner (18 – 21, Table I). The predictions for MMP-3 and MMP-9 in this region do not show any tendency to averaging and clearly distinguish between the two targets. As follows from Figure 5, increased affinities to both enzymes can be expected to result in increased selectivities. In addition, selectivity may originate from other factors, as seen for the 3rd and 4th most active MMP-9 inhibitors (18 and 20 in Table I) In case of MMP-3, these inhibitors form H-bonds with Leu164 and Ala-165 whereas, no such H-bonds are observed for MMP-9 (Leu188 and Ala189, equivalent residues).
A predictive model for the binding affinities of MMP-3 inhibitors was developed using our QM/MM LR method that is based on a combined use of docking, QM/MM calculations, and MD simulations. Binding affinities are described as a linear combination of QM/MM interaction energies and SASA changes upon binding, both calculated for the relaxed time-averaged structures from MD simulations. The application of the approach to the published data on MMP-3 inhibition by 27 hydroxamates resulted in an excellent correlation (r2=0.903) between experimental and calculated values, as well as in stable and accurate predictions. A comparison with the previously developed QM/MM LR correlation for MMP-9 showed that the QM/MM and SASA terms for MMP-3 and MMP-9 are not correlated. The QM/MM energy terms for MMP-3 are about ten times larger than those for MMP-9, due to different coordination geometry of the catalytic zinc, additional or stronger H-bonds, and possibly due to slightly different computational protocols. The QM/MM LR approach captured the subtle differences of about one order of magnitude in the binding affinities for the most active inhibitors binding to both isoenzymes. This level of precision makes the approach a useful tool for the design of specific inhibitors for individual members of enzyme families.
This work was supported in part by the NIH NCRR grants 1P20RR15566 and 1P20RR16741, as well as by the access to resources of the Computational Chemistry and Biology Network and the Center for High Performance Computing, both at the North Dakota State University.