The papain/CLIK-148 coordinate system was employed as a model to study the interactions of a non-peptide thiocarbazate inhibitor of cathepsin L (1). This small molecule inhibitor, a thiol ester containing a diacyl hydrazine functionality and one stereogenic center, was most active as the S-enantiomer, with an IC50 of 56 nM; the R-enantiomer (2) displayed only weak activity (33 μM). Correspondingly, molecular docking studies with Extra Precision Glide revealed a correlation between score and biological activity for the two thiocarbazate enantiomers when a structural water was preserved. The molecular interactions between 1 and papain were very similar to the interactions observed for CLIK-148 (3a and 3b) with papain, especially with regard to the hydrogen bonding and lipophilic interactions of the ligands with conserved residues in the catalytic binding site. Subsequent docking of virtual compounds in the binding site led to the identification of a more potent inhibitor (5), with an IC50 of 7.0 nM. These docking studies revealed that favorable energy scores and correspondingly favorable biological activities could be realized when the virtual compound design included occupation of the S2, S3 and S1′ subsites by hydrophobic and aromatic functionalities of the ligand, and at least three hydrogen bonding contacts between the ligand and the conserved binding site residues of the protein.
Cathepsin L-like cysteine proteases have been implicated in various disease states including malaria, leishmaniasis, Chagas' disease, African trypanosomiasis, toxoplasmosis, and amoebiasis, among others.1-7 These illnesses have generally been neglected by many pharmaceutical companies since projected drug development costs outweigh the purchasing power of the afflicted populations.8 With the formation of the NIH Molecular Libraries Screening Centers Network9 (MLSCN), academic centers in the network are providing the biological community with a wide range of small molecule biological probes, and in addition, have had the opportunity to focus on the discovery of new lead compounds for neglected disease targets (NDTs).8 The Penn Center for Molecular Discovery10 has recently developed a selective chemical probe for human cathepsin L,11 a cysteine protease implicated in a variety of NDTs. Since cathepsin L is highly homologous to the cysteine proteases expressed by parasites associated with NDTs,12, 13 the development of cathepsin L inhibitors could aid in the design of critically needed drugs for these diseases.
Many inhibitors of the parasitic cysteine proteases form covalent bonds with the active site Cys25 nucleophilic sulfur atom of their targeted enzymes. Such covalent binding is often apparently irreversible (or very slowly reversible), and may be required in order for inhibition to be sustained at the cellular level. More readily reversible covalent cysteine protease inhibitors have also been described,18 as have inhibitors that bind noncovalently.14 Understanding the molecular requirements for covalent attachment is of clear importance for designing new agents that will bind via this mechanism.
The X-ray coordinate system for papain/CLIK-148 (1cvz.pdb), solved at 1.7 Å resolution, is a representative example of the structure of a covalent ligand-bound cysteine protease complex.15 Since most of the amino acid residues that are involved in the binding of CLIK-148 to papain are conserved in cathepsin L, this publicly available high resolution structure has provided an excellent model for the successful design of highly active and specific cathepsin L inhibitors.15, 16 An alternative model would have been the structure of the inhibitor E-64 complexed to cathepsin L, but unfortunately, the coordinates for this structure are not publicly available.17 The coordinates for the structure of cathepsin L complexed to a noncovalently bound tripeptide are publicly available,14 but are less pertinent to our work because of the difference in the shape of the binding site in the tripeptide/cathepsin L complex compared with that of the covalently bound small molecule in papain (1cvz.pdb). We also explored the possibility of a homology model of cathepsin L based on the coordinates of 1cvz.pdb; however, considerable error was associated with subsequent dockings of this theoretical model, especially with regard to the positioning of critical hydrophobic groups in the non-prime subsites.18 We thus employed the papain/CLIK-148 structure for this study.
While molecular docking routines, including Schrödinger's XP Glide,19 adequately address hydrogen bonding, van der Waals, electrostatic, hydrophobic, and solvation contributions to binding, these algorithms do not routinely address covalent interactions between the ligand and the protein; covalent binding can only be addressed with quantum mechanical routines. Consequently, none of the XP Glide binding scores reported in the present study incorporates the contribution from the covalent interaction between the Cys25 sulfur and the electrophilic atom of the ligand. Instead, the distance between the nucleophilic sulfur of Cys25 and the electrophilic atom of the modeled ligand undergoing attack is used to estimate the potential for covalent bond formation.
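The distance criterion described above can be illustrated with a minimal Python sketch. The coordinates are hypothetical placeholders chosen for the arithmetic, not values taken from 1cvz.pdb, and the function name is an assumption made for this illustration.

```python
import math


def attack_distance(sulfur_xyz, electrophile_xyz):
    """Euclidean distance (in angstroms) between the Cys25 sulfur and the
    electrophilic ligand atom, each given as an (x, y, z) tuple."""
    return math.sqrt(
        sum((s - e) ** 2 for s, e in zip(sulfur_xyz, electrophile_xyz))
    )


# Hypothetical coordinates in angstroms, for illustration only.
cys25_sg = (0.0, 0.0, 0.0)
ligand_carbon = (1.0, 2.0, 2.0)

print(f"S to C distance: {attack_distance(cys25_sg, ligand_carbon):.2f} angstrom")
```

In practice the two coordinate triples would be read from the docked pose; a shorter distance is taken here only as a qualitative indication that covalent bond formation is geometrically plausible.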
Although an activated electrophilic atom in the ligand is necessary for covalent binding to the protein, there are also non-covalent binding interactions that contribute significantly to biological activity. Such hydrogen bonding and hydrophobic interactions may be as important as the covalent bond formed between the ligand and the protein with regard to achieving high potency in vivo. Once the requisite hydrogen bonding atoms and hydrophobic functional groups on the ligand are correctly positioned within the binding site, the electrophilic center is placed in proximity to the Cys25 sulfur for bond formation. This binding mechanism is illustrated by the succinyl epoxides represented by CLIK-148 (3a and 3b)15 and E-64 (4).17, 20 These inhibitors have their origins in natural products, and have evolved to incorporate substituents for enzyme interactions beyond the electrophilic center that binds covalently to Cys25. The substituents (H-bond donors and acceptors as well as hydrophobic groups) may well play a pivotal role in positioning the ligand precisely for attack by the nucleophilic sulfur.
Following standard nomenclature,21 oligopeptides binding to a peptidase are cleaved between the S1 and S1′ subsites of the protein, with subsite S numbers increasing in both directions the farther the subsite is from the cleavage site. Occupation of the prime and non-prime subsites in the protein also affects selectivity within the cysteine protease family. For example, E-64c, a derivative of the naturally occurring inhibitor E-64, binds covalently via the Cys25 sulfur but is non-selective, inactivating most cysteine proteases. On the other hand, in the complex of CLIK-148 bound to papain (1cvz.pdb), CLIK-148, a selective cathepsin L inhibitor, is precisely positioned in the active site. Here, in addition to the formation of the Cys25 covalent bond, several papain residues participate both in hydrogen bonding and hydrophobic interactions with the succinyl epoxide inhibitor (CLIK-148), including Gln19, Cys25, Gly66, Asp158, Trp177, and Ser205 (Table 2). In addition, the inhibitor lies deeply buried in both the S2 and S1′ subsites. Disruptions of the interactions with the binding site residues can lead to significant loss of biological activity. Other diastereomers of the epoxy succinyl ligands, containing up to 4 stereogenic centers (16 possible stereoisomers), have been isolated and tested in selected cysteine proteases; not surprisingly, some are inactive when the stereogenicities of the ligands are changed. However, since the succinyl epoxide natural products contain at least three stereogenic centers, understanding the effect of stereoinversion on activity is not straightforward in these systems. In the present study, we have explored the effect of stereoinversion directly, through a multi-disciplinary approach employing chemical, computational, and biological techniques.
Molecular docking studies were performed using Maestro version 8.0 and version 4.5 of Extra Precision Glide (XP Glide).19 All ligands were docked flexibly to papain, which served as a model for cathepsin L. The coordinates for the papain/CLIK-148 complex (1cvz.pdb) were downloaded from the RCSB Protein Data Bank.22 In this structure, CLIK-148 is covalently bound to papain. To prepare the system for docking, the covalent bond between the Cys25 sulfur and the epoxide carbon was manually deleted. The protein was then prepared for subsequent grid generation and docking using the Protein Preparation Wizard tool supplied with Glide. Using this tool, all hydrogen atoms were added to papain, the protonation states for histidine residues were optimized, crystallographic waters not deemed to be important for ligand binding were deleted, and the entire protein was minimized. The structural water near Asp158 was preserved; it was re-oriented and adjusted using the commands within the Protein Preparation Wizard tool available within Glide, so that hydrogen bonding between the ligand and the protein could be established. In order to validate the XP Glide algorithm for subsequent docking studies of the cathepsin L inhibitors, the non-covalently bound CLIK-148 was extracted from the complex and prepared for single ligand docking using the LigPrep application with the OPLS_2005 force field. Next, a grid was prepared for re-docking CLIK-148 into papain using the Receptor Grid Generation tool in Glide. With non-covalently bound CLIK-148 in place, the centroid of the workspace ligand was chosen to define the grid box. The option to dock ligands similar in size to the workspace ligand was selected for determining the grid sizing. For this single ligand docking, the extra precision mode was selected. The default settings for scaling the van der Waals radii were selected: a scaling factor of 0.8 and a partial charge cutoff of 0.25. No constraints were defined for the docking runs.
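The idea of centering the grid box on the ligand centroid can be sketched in a few lines of Python. This is only an illustration of the geometric definition (the unweighted mean of the atomic coordinates); it does not reproduce how Glide computes or stores the grid center, and the coordinates are hypothetical.

```python
def centroid(atoms):
    """Unweighted mean position of a list of (x, y, z) atom coordinates."""
    if not atoms:
        raise ValueError("at least one atom is required")
    n = len(atoms)
    return tuple(sum(atom[axis] for atom in atoms) / n for axis in range(3))


# Hypothetical ligand heavy-atom coordinates in angstroms.
ligand_atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 3.0, 0.0)]
grid_center = centroid(ligand_atoms)
print("grid box center:", grid_center)
```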
The highest scoring docking pose (orientation plus conformation) returned for CLIK-148 was overlaid with the starting protein complex. For subsequent molecular docking of compounds 1, 2 and 5 in the binding site of papain, LigPrep was used for energy minimizations of these small molecules with the OPLS_2005 force field. Using the initial grid generated for papain, the extra precision docking was repeated for each molecule (1, 2, and 5), as described above. The orientation of the structural water near Asp158 was maintained for these docking studies.
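One common way to quantify such an overlay is the root-mean-square deviation (RMSD) between the re-docked pose and the crystallographic pose. The text does not state that an RMSD was computed, so the sketch below is an assumption offered for illustration; it presumes both poses share the receptor coordinate frame and list the same atoms in the same order, and it uses hypothetical coordinates.

```python
import math


def pose_rmsd(reference, docked):
    """In-place RMSD (angstroms) between two poses given as equal-length
    lists of (x, y, z) tuples with matching atom order. No superposition
    is performed, because both poses sit in the same receptor frame."""
    if len(reference) != len(docked) or not reference:
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(
        (r - d) ** 2
        for ref_atom, dock_atom in zip(reference, docked)
        for r, d in zip(ref_atom, dock_atom)
    )
    return math.sqrt(squared / len(reference))


# Hypothetical two-atom poses: the docked pose is shifted 1 angstrom along z.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
docked_pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0)]
print(f"RMSD: {pose_rmsd(crystal_pose, docked_pose):.2f} angstrom")
```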
Reaction progress curves for cathepsin L-catalyzed hydrolysis of Z-Phe-Arg-AMC at various inhibitor concentrations23 were used to estimate the rate constants for inhibition by compound 1. Baseline fluorescence readings (reaction mixture with no substrate) were fit to a cubic polynomial using the MATLAB polyfit function, and the resulting curve was subtracted from time course measurements for each inhibitor concentration. Raw fluorescence readings were linear with respect to AMC concentration. The time-zero reading for baseline fluorescence and the maximum average fluorescence reading were used as the minimum (0 μM) and maximum (1 μM) values for the linear scale, respectively.
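The linear conversion from fluorescence to AMC concentration can be written out as a short Python sketch. The readings are hypothetical numbers, the function name is an assumption, and the cubic baseline fit itself (done with MATLAB polyfit in the text) is not reproduced here.

```python
def fluorescence_to_amc(reading, f_min, f_max, c_max_um=1.0):
    """Map a fluorescence reading onto a linear AMC concentration scale.

    f_min is the reading assigned to 0 uM AMC and f_max the reading
    assigned to c_max_um (1 uM in the text)."""
    if f_max == f_min:
        raise ValueError("f_min and f_max must differ")
    return (reading - f_min) / (f_max - f_min) * c_max_um


# Hypothetical readings in arbitrary fluorescence units.
baseline_at_time_zero = 200.0
max_average_reading = 2200.0

for raw in (200.0, 1200.0, 2200.0):
    amc = fluorescence_to_amc(raw, baseline_at_time_zero, max_average_reading)
    print(f"{raw:7.1f} units -> {amc:.2f} uM AMC")
```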
Progress curves for each inhibitor concentration were fit to a 5-parameter inhibition kinetic model:
The best-fitting parameters were determined using APPSPACK optimization software with a linear least squares objective function.23 It was also necessary to estimate an additional parameter, representing the time-delay until the first experimental reading was taken (ca. 150 sec).
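The model scheme itself is not shown in this text. As a hedged illustration only, the Python sketch below assumes the simplest mechanism consistent with the five rate constants reported in the Results (k1, k-1, kcat, kon, koff): substrate binding and turnover, E + S ⇌ ES → E + P, in competition with reversible inhibitor binding, E + I ⇌ EI. The fixed-step Runge-Kutta integrator, the function names, and the starting concentrations are choices made for this example; the time-delay parameter and the APPSPACK fit are not modeled.

```python
def derivatives(state, k1, k_minus1, kcat, kon, koff):
    """Mass-action rates for the assumed scheme (concentrations in uM)."""
    e, s, es, p, i, ei = state
    substrate_binding = k1 * e * s - k_minus1 * es
    turnover = kcat * es
    inhibitor_binding = kon * e * i - koff * ei
    return (
        -substrate_binding + turnover - inhibitor_binding,  # d[E]/dt
        -substrate_binding,                                 # d[S]/dt
        substrate_binding - turnover,                       # d[ES]/dt
        turnover,                                           # d[P]/dt
        -inhibitor_binding,                                 # d[I]/dt
        inhibitor_binding,                                  # d[EI]/dt
    )


def rk4_step(state, dt, params):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * d for b, d in zip(base, slope))

    a = derivatives(state, *params)
    b = derivatives(shifted(state, a, dt / 2), *params)
    c = derivatives(shifted(state, b, dt / 2), *params)
    d = derivatives(shifted(state, c, dt), *params)
    return tuple(
        x + dt / 6 * (da + 2 * db + 2 * dc + dd)
        for x, da, db, dc, dd in zip(state, a, b, c, d)
    )


def simulate(e0, s0, i0, params, t_end, dt=0.05):
    """Integrate from t = 0 to t_end (seconds) and return the final state."""
    state = (e0, s0, 0.0, 0.0, i0, 0.0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, params)
    return state


# Rate constants as reported in the Results (uM and seconds).
PARAMS = (2.3, 0.30, 4.0, 0.024, 2.2e-5)

if __name__ == "__main__":
    # Starting concentrations are hypothetical, for illustration only.
    for inhibitor in (0.0, 1.0):
        product = simulate(3e-4, 1.0, inhibitor, PARAMS, t_end=600.0)[3]
        print(f"[I]0 = {inhibitor} uM -> [P] at 600 s = {product:.4f} uM")
```

Fitting would then amount to adjusting the five constants so that the simulated product curves match the scaled fluorescence traces at every inhibitor concentration.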
Analog activity analyses were conducted with the following assay buffer: 20 mM sodium acetate, 1 mM EDTA, and 5 mM DTT, pH 5.5. Confirmatory results were obtained under the following assay conditions, in which cysteine replaced DTT in the assay buffer. Compounds were serially diluted in DMSO and transferred into a 96-well Corning 3686 assay microplate to give 16 dilutions ranging from 50 μM to 1.5 nM. Human liver cathepsin L (Calbiochem 219402) was activated by incubating with assay buffer for 30 min. The assay buffer consisted of 20 mM sodium acetate, 1 mM EDTA, and 5 mM cysteine, pH 5.5. Upon activation, cathepsin L (300 pM) was incubated with 1 μM Z-Phe-Arg-AMC substrate and test compound in 100 μL of assay buffer for 1 h at room temperature. Fluorescence of AMC released by enzyme-catalyzed hydrolysis of Z-Phe-Arg-AMC was read on a PerkinElmer Envision microplate reader (excitation 355 nm, emission 460 nm). Data were scaled using internal controls and fit to a four-parameter logistic model (IDBS XLfit equation 205) to obtain IC50 values in triplicate.
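The dilution series and the dose-response model can be sketched in Python. Two assumptions are made here: the series is taken to be two-fold, which is consistent with 16 points spanning 50 μM to about 1.5 nM but is not stated in the text, and the logistic function is a generic four-parameter form rather than the exact parametrization of XLfit equation 205.

```python
def twofold_series(top_um, n_points):
    """Concentrations (uM) of an n-point two-fold serial dilution."""
    return [top_um / 2 ** k for k in range(n_points)]


def four_parameter_logistic(conc, bottom, top, ic50, hill):
    """Generic 4PL response: equals `top` at zero inhibitor and falls
    toward `bottom` as the concentration rises past the IC50."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


series = twofold_series(50.0, 16)
print(f"lowest concentration: {series[-1] * 1000:.2f} nM")

# Percent activity at the IC50 for a hypothetical curve (0 to 100 percent).
print(four_parameter_logistic(56.0, bottom=0.0, top=100.0, ic50=56.0, hill=1.0))
```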
We recently described the identification of a potent, non-peptide inhibitor of cathepsin L (1, Figure 1), with an IC50 of 56 nM under defined conditions.11 This thiocarbazate contains a diacyl hydrazine functionality and a single stereogenic center. The most active congener proved to be the S-enantiomer, with an IC50 of 56 nM. The R-enantiomer (2, Figure 1), described here, was only modestly active against cathepsin L (IC50 = 33 μM). Molecular docking studies were initiated on both enantiomers, in order to probe the importance of the elements that contribute to binding. As a validation study, we first examined the non-covalent docking of the ring-opened epoxide form of CLIK-148 (3b, Figure 2) in papain. Initially, the orientation of the water molecule near Asp158 was not suitable for hydrogen bonding with the ligand or with Asp158. Using the routines available in XP Glide, this water molecule was re-oriented so that adequate hydrogen bonding could be established (Figure 5). Then, after breaking the covalent bond to Cys25 and re-docking CLIK-148 into the binding site, the experimentally derived binding mode was reproduced (Figure 5). In this docking pose (orientation plus conformation), with a score of -9.27 kcal/mol, CLIK-148 hydrogen bonds to the protein through side chain and backbone atoms of Gly66, Cys25, Gln19, and Asp158. The pyridine group occupies a hydrophobic aromatic pocket in papain in the S1′ subsite near Trp177 and the phenylalanine of the ligand occupies the S2 subsite. Taken together, these features play a major role in positioning the ligand appropriately for covalent attachment to the protein. In the highest scoring pose, the distance between the Cys25 sulfur and the electrophilic carbon of the ligand (epoxide ring carbon) is about 3 Å.
Since the validation study revealed that XP Glide could accurately reproduce the experimentally derived binding mode of CLIK-148, thiocarbazate 1 was docked into the binding site of papain (Figures 6 and 9a); the di-imide functionality in 1 was maintained in the anti orientation with respect to the geometry of the NH bonds (Figure 6b). The highest scoring pose obtained had a binding energy of -9.03 kcal/mol, very close to the energy value observed in our validation study with CLIK-148/papain (-9.27 kcal/mol). Furthermore, the orientations of the ligands (CLIK-148 and 1) bound to papain were strikingly similar (Figure 6b). For the 1/papain complex, Gln19, Cys25, Gly66, Asp158, and Trp177 all participated in hydrogen bonding interactions with the ligand (Figure 6b), as observed for CLIK-148/papain. In addition, both the pyridine group of CLIK-148 and the 2-ethylphenyl anilide group of 1 occupied the S1′ subsite containing Trp177. The indole group of 1 occupied the S2 subsite, near Ser 205 (papain residue), in a fashion similar to that of the phenylalanine group of CLIK-148. The Ser 205 residue in papain corresponds to the Ala 214 residue in cathepsin L; this is the only residue in the binding site of cathepsin L that is not identical to the aligned residue in papain. However, both Ala and Ser side chains are small enough to accommodate the bulky hydrophobic groups that the two inhibitors place in this subsite (phenylalanine in CLIK-148 and indole in 1),15 so this difference would appear to be negligible. Both CLIK-148 and 1 fully occupy the S1′ and S2 subsites. The tert-butoxy group of the NHBoc in compound 1 sits in the S3 subsite, occupying a large hydrophobic cleft. In addition, the amino acid-derived NH of inhibitor 1 hydrogen bonds to the backbone carbonyl of Asp158, a conserved binding site residue.
Inhibitors that span from the S to the S1′ subsites have high potency and selectivity toward cysteine proteases.15, 24 Compound 1 demonstrated selectivity towards cathepsin L versus other cysteine proteases, including cathepsins V, S, B, and K, with the greatest selectivity index observed for cathepsin K (150).23 The details of a structure activity relationship study for cathepsin L inhibition in the carbazate series are being published separately,25 but reveal that removal of the NHBoc group causes a dramatic loss of cathepsin L inhibitory activity. This single synthetic change to the carbazate scaffold of 1 removes both the potential for hydrogen bonding with Asp158 and the hydrophobic contact of the tert-butoxy group in the S3 subsite, resulting in a 400-fold reduction in cathepsin L inhibition.
As presented above, the R-enantiomer 2 had a cathepsin L inhibitory activity of only 33 μM. In an attempt to understand why this enantiomer was virtually inactive, we docked 2 into the binding site of papain. The highest scoring pose for 2 obtained from this XP Glide docking study (depicted in Figure 7) had a docking score of -7.0 kcal/mol, about 2 kcal/mol less favorable than the score for the S-enantiomer 1 (-9.03 kcal/mol). In addition, key hydrogen bonding and hydrophobic contacts that are established in the complexes of the active S-enantiomer (1) with papain, and in CLIK-148 with papain, are completely disrupted for the R-enantiomer (2) in this binding site (Figures 8, 9a, and 9b). While the S-enantiomer 1 makes at least six hydrogen bonding contacts to the active site residues and large hydrophobic contacts within the S1′, S2 and S3 subsites (Figures 6a and 6b), the R-enantiomer forms only two hydrogen bonds to papain and does not occupy the S1′ subsite at all (Figures 8 and 9b). This change in stereogenicity from S (1) to R (2) reveals a dramatic shifting of the 2-ethylphenyl anilide group out of this critical subsite in these docking studies (cf. Figures 9a and 9b). The key hydrogen bond made between the NH of the Trp residue in the S-enantiomer 1 with Asp158 is also absent in the binding of the R-enantiomer 2 (Table 2). In addition, the indole of 2 makes fewer significant hydrophobic contacts in the S2 subsite (near Ser 205) than the indole of 1. These observed differences in binding interactions between 1 and 2 and the corresponding difference in docking scores provide a cogent rationale for the observed decrease in the cathepsin L inhibitory activity of 2 vs. 1 of almost three orders of magnitude (Table 1). Also noteworthy is the increase, to 4.4 Å, in the distance between the electrophilic carbonyl carbon in 2 and the Cys25 sulfur, suggesting that covalent bond formation might be less favorable.
These mechanistic insights into the binding site interactions of 1 suggested additional room within the S1′ subsite, which led to the design of compound 5. This oxocarbazate analog of 1 was designed to contain the requisite S stereogenicity, an oxygen in place of the thiol ester sulfur, and a large hydrophobic/aromatic group (quinoline) to occupy further the S1′ subsite. When this virtual compound (5, Figure 10) was docked into the binding site of papain, the best pose obtained had a score of -10.00, a score improved by 1 kcal/mol compared to that of our initial lead 1 (-9.03 kcal/mol). This compound was synthesized and found to be more potent than 1, with an enzyme inhibitory activity of 7 nM against cathepsin L. In the highest scoring docking pose for this compound, three hydrogen bonds are formed between 5 and the protein; moreover, the tetrahydroquinoline group on the ligand occupies the large hydrophobic pocket with Trp177 in the S1′ subsite (Figure 10). Changing the sulfur in 1 to an oxygen in 5 leads to a change in orientation of the ester bond, making a new interaction with His159 possible. This hydrogen bond is also observed in the binding of CLIK-148 to papain (Table 2). In both inhibitors (1 and 5), the carbazate carbonyl carbons are oriented for nucleophilic attack by Cys25, with the distances from the Cys sulfur to the carbonyl carbon in both ligands in the three angstrom range.
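The parallel between docking score and potency for compounds 1, 2, and 5 can be checked with simple arithmetic. The Python sketch below uses only the IC50 values and best XP Glide scores quoted in the text; with three compounds it illustrates that the rank orders agree and should not be read as a statistical correlation.

```python
import math

# IC50 values (nM) and best XP Glide scores (kcal/mol) quoted in the text.
ic50_nm = {"1": 56.0, "2": 33000.0, "5": 7.0}
xp_score = {"1": -9.03, "2": -7.0, "5": -10.00}


def fold_change(weaker, stronger):
    """Ratio of IC50 values, weaker inhibitor over stronger inhibitor."""
    return ic50_nm[weaker] / ic50_nm[stronger]


# Most potent first, and most favorable (most negative) score first.
rank_by_potency = sorted(ic50_nm, key=ic50_nm.get)
rank_by_score = sorted(xp_score, key=xp_score.get)

print("rank by IC50: ", rank_by_potency)
print("rank by score:", rank_by_score)

ratio_2_vs_1 = fold_change("2", "1")
print(f"2 vs 1: {ratio_2_vs_1:.0f}-fold, log10 = {math.log10(ratio_2_vs_1):.2f}")
print(f"1 vs 5: {fold_change('1', '5'):.0f}-fold")
```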
The structure of pro-cathepsin L (1mhw.pdb) was also explored in molecular docking studies with ligand 1. However, only very poor XP Glide scores could be obtained from these studies. The two highest scoring docking poses for 1 in the binding site of the pro-cathepsin L structure had scores of 1.15 and 6.74 kcal/mol. When the interactions between 1 and the pro-cathepsin L structure were examined, severe steric clashes between the indole of the ligand and the Leu 69 side chain were observed (a distance of 0.61 angstroms between the ligand and the Leu side chain). This residue corresponds to Tyr 67 in papain. However, in 1mhw.pdb, the Leu 69 side chain is pointing into the binding site cavity, whereas in papain, the Tyr 67 side chain hydroxyl is 6.11 angstroms removed from any atom in 1, and no unfavorable contacts are observed. Further unfavorable contacts were also observed between ligand 1 and the backbone atoms surrounding the Cys 25 residue in the pro-cathepsin L structure, and only one hydrogen bond was observed between the ligand and the conserved binding site residues. A homology model of cathepsin L based on the coordinates of CLIK-148 bound to papain was also generated (MOE software, CCG, Inc.). Docking scores for 1 in the binding site of the resulting theoretical model were somewhat better than those obtained for the pro-cathepsin L structure (-3.82 and -2.40 kcal/mol for the two highest scoring poses of 1 bound to the model structure), but these scores were still unfavorable. Since significantly better scores were realized for ligand dockings of 1 with the papain structure than with either the pro-cathepsin L structure or the theoretical model, this experimentally-derived system (1cvz.pdb) was used directly for all docking studies of the carbazate ligands.
To compare our docking analysis with the kinetic behavior of compound 1, we constructed a 5-parameter ODE model of reversible inhibitor binding and fit the model to reaction progress curves measured at various inhibitor concentrations (see Materials and Methods section). The best-fitting parameters were k1 = 2.3 μM⁻¹ s⁻¹, k-1 = 0.30 s⁻¹, kcat = 4.0 s⁻¹, kon = 0.024 μM⁻¹ s⁻¹, and koff = 2.2 × 10⁻⁵ s⁻¹. Most notably, the rate of inhibitor dissociation (koff) from cathepsin L was extremely slow, leading to a Ki = koff / kon = 0.890 nM.23 Alternative reaction schemes for steady-state and irreversible inhibitor binding were also tested, but did not fit the data as well as the 5-parameter model. Taken together, the results from the docking and kinetic analyses suggest a covalent but slowly reversible mechanism of inhibition that is aided by strong non-covalent interactions.
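The quantities that follow from these rate constants can be worked through in a short Python sketch. Using the rounded constants as printed gives a Ki of roughly 0.92 nM, close to the reported 0.890 nM; the small gap presumably reflects rounding of the published parameters. The dissociation half-life and the Km expression are additional derived values that assume simple mass-action kinetics and are not reported in the text.

```python
import math

# Rate constants as reported (uM and seconds).
k1 = 2.3         # uM^-1 s^-1, substrate association
k_minus1 = 0.30  # s^-1, substrate dissociation
kcat = 4.0       # s^-1, turnover
kon = 0.024      # uM^-1 s^-1, inhibitor association
koff = 2.2e-5    # s^-1, inhibitor dissociation

ki_nm = koff / kon * 1000.0                     # convert uM to nM
half_life_h = math.log(2) / koff / 3600.0       # first-order dissociation
km_um = (k_minus1 + kcat) / k1                  # assumes a Michaelis-Menten scheme

print(f"Ki from rounded constants: {ki_nm:.3f} nM")
print(f"enzyme-inhibitor dissociation half-life: {half_life_h:.2f} h")
print(f"Km implied by k1, k-1, kcat: {km_um:.2f} uM")
```

A dissociation half-life on the order of hours is what makes the inhibition appear nearly irreversible on the time scale of a one-hour assay, even though the fitted model is reversible.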
The molecular-level determinants of cathepsin L inhibitory activity have been identified via molecular docking experiments of cathepsin L inhibitors in the binding site of papain. Utilizing the papain system as a model for cathepsin L, we found that the docking scores paralleled the experimental bioactivities for cathepsin L inhibition, and could be employed as a guide in selecting new molecules for synthesis. To this end, compound 5, designed to incorporate the key hydrogen bonding, hydrophobic and aromatic elements necessary for cathepsin L activity, proved to be the most active compound in the series of ligands studied thus far.
The docking studies reported here provide an understanding of the importance of the non-covalent determinants of bioactivity in the cathepsin L enzyme, and as a result, we have correlated the binding energy scores of selected cathepsin L inhibitors with biological activity. We conclude that the potential for covalent attachment of the Cys25 sulfur to the electrophilic center in the ligand is, on its own, insufficient for generating high inhibitory potency against cathepsin L under our standard assay conditions. Our study supports this claim, given the extremely low potency of the R-enantiomer 2 (IC50 = 33 μM), and the observed loss of critical binding elements in the binding site model of 2 docked to papain. Even though the R-enantiomer contains the same activated thiocarbonyl group as found in the S-enantiomer, stereoinversion at a distant center in the molecule completely disrupts the other binding components that are critical for high potency. Notably, the 2-ethylphenyl anilide group in 2 does not make the necessary hydrophobic contacts in the S1′ aromatic pocket that are required for cathepsin L inhibition. Other key hydrogen bonding and hydrophobic contacts are also lost in the binding of 2 to papain. Additional derivatives in both the thiol ester and ester series that probe the binding features within this chemical series have been designed and synthesized, and are the subject of a companion chemistry publication describing the structure activity relationships.25
Supporting Information Available: The coordinate files (pdb format) for the papain coordinate system derived from 1cvz.pdb, with the Cys 25 sulfur to ligand bond manually deleted (papain_mpb.pdb), plus the coordinates for compounds 1 (mpb_compound1.pdb), 2 (mpb_compound2.pdb), and 5 (mpb_compound5.pdb) in the same coordinate system as papain, are provided. This material is available free of charge via the internet at pubs.acs.org.