We have established an expression and purification protocol resulting in high yields of activated cruzain. The optimized auto-induction protocol has not only assisted in completing recent crystallography efforts,13,24
but also allowed for the cost-effective production of isotopically labeled cruzain for use in NMR studies. Due to its inherent instability, active cruzain has not yet been crystallized in its apo form. All previously published cruzain SAR studies have relied on obtaining high-resolution X-ray crystal structures, usually with an inhibited protease. With the exception of compound 6,
none of the non-covalent inhibitors listed in have been successfully co-crystallized with cruzain. The existing crystallographic structures of cruzain will continue to be utilized for in silico
docking and modeling studies, and other high-throughput screening (HTS) methods such as fluorescence activity assays will help triage compound libraries to identify potential lead inhibitors. However, neither the docking nor the HTS studies will verify and characterize inhibitor binding sites or allow for more thorough biophysical studies of cruzain.
As we have illustrated here, with the advantages of selective isotopic labeling, binding modes of these inhibitors can be examined in a relatively fast manner using NMR-based methods. In particular, the use of selectively ¹⁵N-His and ¹³C-Met labeled cruzain was helpful in quickly verifying the binding sites of eight small-molecule inhibitors. With the exception of the non-interactors/aggregators, titration of each compound using these cruzain samples perturbed the chemical shifts of residues at or near the active site. Importantly, these NMR titrations verified previously reported in silico
docking results of selected non-covalently binding compounds,13
including those for which there are no existing crystallographic data sets. Using current NMR automation technology with samples contained in standard 5 mm NMR tubes, a binary binding experiment (i.e., apo versus a greater than 20-fold excess of compound) can be performed in 1 to 2 hours per sample. The rate of data acquisition can be further enhanced using NMR flow-cells and 96-well plate sampling attachments, which require smaller sample volumes.25
This has the added advantage of allowing for higher protein and inhibitor concentrations relative to the standard 5 mm tubes, and more importantly, faster data acquisition times.
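The binary apo-versus-excess-compound comparison described above is commonly scored as a combined chemical shift perturbation per residue. The following is a minimal sketch of that scoring, not the analysis used in this work: the function names, the heteronuclear scaling factor of 0.14 (a common convention for ¹⁵N; a different value would normally be chosen for ¹³C), the 0.05 ppm threshold, and the peak positions are all illustrative assumptions.

```python
import math

def combined_csp(h_apo, x_apo, h_bound, x_bound, hetero_scale=0.14):
    """Combined 1H/heteronucleus chemical shift perturbation (ppm).

    hetero_scale down-weights the heteronuclear dimension; 0.14 is an
    assumed, commonly used value for 15N and is not taken from the text.
    """
    d_h = h_bound - h_apo
    d_x = x_bound - x_apo
    return math.sqrt(d_h ** 2 + (hetero_scale * d_x) ** 2)

def flag_perturbed(peaks_apo, peaks_bound, threshold=0.05):
    """Return (residue, CSP) pairs above threshold, largest first.

    peaks_* map a residue label to (1H ppm, heteronucleus ppm).
    Residues missing from the bound spectrum (for example, peaks that
    broaden beyond detection) are skipped rather than scored.
    """
    hits = []
    for residue, (h_apo, x_apo) in peaks_apo.items():
        if residue not in peaks_bound:
            continue
        h_bound, x_bound = peaks_bound[residue]
        csp = combined_csp(h_apo, x_apo, h_bound, x_bound)
        if csp > threshold:
            hits.append((residue, round(csp, 3)))
    return sorted(hits, key=lambda item: item[1], reverse=True)

# Hypothetical peak positions, for illustration only.
apo = {"His162": (8.20, 118.5), "HisX": (7.90, 121.0)}
bound = {"His162": (8.32, 119.5), "HisX": (7.91, 121.0)}
print(flag_perturbed(apo, bound))
```

In this toy input only the residue labeled His162 exceeds the threshold. In practice the threshold would be set from the spectral resolution and the scatter seen in control titrations.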
Results from the NMR titrations can be used to verify whether these candidate compounds bind to cruzain and whether they are competitive inhibitors. In addition, NMR-based titrations may verify recently proposed allosteric binding sites located greater than 10 Å from the catalytic Cys25.26
Insights gained from the NMR titrations of potential inhibitors, especially those that exhibit weak binding affinities to cruzain, may also help determine whether efforts at optimizing the compounds for further SAR studies are warranted. Although the NMR titration assays are not able to detect subtle differences between various chemical scaffolds, such as relative orientation within the binding pocket, they may be useful in determining differences between chemical analogs. For example, clear differences in the HSQC spectra are observed between compounds 2
(a covalent binder) and 3
(a non-interactor/aggregator). Both share the same chemical scaffold, but differ in the functional group attached to the purine ring: a nitrile group in compound 2
and a chloro group in compound 3. In these cases, NMR spectroscopy and X-ray crystallography can be used in conjunction to identify and optimize potential therapeutic leads for cruzain.
There has been intense and long-standing interest in cysteine proteases, particularly due to their involvement in a wide range of human diseases.8,9,27,28
In particular, cysteine cathepsins, which make up 11 of the 15 canonical cathepsin family members, are known to play roles in extracellular matrix remodeling in humans, leading to the development of various pathologies including cancer, cardiovascular and inflammatory diseases.9,28
However, there are relatively few NMR-based studies that focus on their catalytic domains. One early effort focused on pH studies of active papain and relied on one-dimensional proton spectra.29
A second study presented two-dimensional NMR data of the zymogen procathepsin L, but did not report any useful resonance assignments.30
The majority of the previous NMR-based structure calculations have centered on protein inhibitors such as chagasin (cruzain)31
or p41icf (cathepsin L).32
More recently, NMR-based structural studies focusing on cysteine proteases such as Streptopain,33,34
Foot and Mouth Disease Virus Leader Protease,35
Ubiquitin C-terminal hydrolases,36,37
and SARS Coronavirus Main Protease (SARS CoV Mpro)
have appeared in the literature. Importantly, in these enzyme constructs the catalytic cysteine had been substituted with either serine or alanine, or in the case of SARS CoV Mpro, truncated, rendering the proteases inactive. To date, there are few examples of active papain-like cysteine proteases that have been extensively studied via NMR, including the Josephin domain of ataxin-3,39-41
the NlpC/P60 domain of lipoprotein Spr,42
and Sortase A.43
All of these proteases contain the catalytic cysteine and histidine residues in approximately the same relative positions as in papain. However, all have low overall sequence homology (< 15%) and structural homology (backbone RMSD > 3.5 Å) with respect to papain and other members of the cathepsin cysteine protease family. Thus, to our knowledge, the data presented herein represent the first NMR-based study focusing on the mature, active form of a papain/cathepsin-like cysteine protease in its wild-type state.
On a more basic level, production of NMR-ready cruzain will allow for more extensive study into structure-function relationships. For example, backbone dynamics of both the zymogen and mature cruzain can now be examined in order to more fully understand how conformational mobility influences proteolytic activity. In addition, selectively labeled samples may also be utilized to determine the pKa values of residues such as the catalytic Cys25 and His162 (Supporting Information, Supplemental Figs. S8). Such information may be useful in further dissecting the mechanism of the enzyme.
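As an illustration of how a pKa value might be extracted from a pH titration followed by NMR, the sketch below fits chemical shift versus pH to a single-site Henderson-Hasselbalch model. It is a hedged example rather than the procedure used here: the function names, the grid-search range, and the synthetic data are assumptions, and a single-pKa model may not be adequate if the ionizations of the catalytic cysteine and histidine are coupled.

```python
def fraction_deprotonated(ph, pka):
    """Henderson-Hasselbalch fraction of the deprotonated species."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

def fit_pka(ph_values, shifts, pka_min=2.0, pka_max=12.0, step=0.01):
    """Grid search over pKa with a linear least-squares solve for the
    limiting shifts of the protonated and deprotonated states.

    Model: shift = d_prot * (1 - f) + d_deprot * f, with f from
    fraction_deprotonated. Returns (pKa, d_prot, d_deprot).
    """
    best = None
    n_steps = int(round((pka_max - pka_min) / step))
    for i in range(n_steps + 1):
        pka = pka_min + i * step
        f = [fraction_deprotonated(ph, pka) for ph in ph_values]
        g = [1.0 - value for value in f]
        # Normal equations for the two-parameter linear fit.
        sgg = sum(a * a for a in g)
        sff = sum(b * b for b in f)
        sgf = sum(a * b for a, b in zip(g, f))
        sgy = sum(a * y for a, y in zip(g, shifts))
        sfy = sum(b * y for b, y in zip(f, shifts))
        det = sgg * sff - sgf * sgf
        if abs(det) < 1e-12:
            continue  # titration does not constrain both end states
        d_prot = (sgy * sff - sfy * sgf) / det
        d_deprot = (sfy * sgg - sgy * sgf) / det
        sse = sum((y - (d_prot * a + d_deprot * b)) ** 2
                  for y, a, b in zip(shifts, g, f))
        if best is None or sse < best[0]:
            best = (sse, pka, d_prot, d_deprot)
    if best is None:
        raise ValueError("data do not constrain the fit")
    _, pka, d_prot, d_deprot = best
    return round(pka, 2), d_prot, d_deprot

# Synthetic, noise-free titration generated with an assumed pKa of 6.5.
ph_points = [4.0 + 0.5 * k for k in range(11)]
synthetic = [8.6 * (1 - fraction_deprotonated(ph, 6.5))
             + 7.7 * fraction_deprotonated(ph, 6.5) for ph in ph_points]
pka_fit, shift_prot, shift_deprot = fit_pka(ph_points, synthetic)
print(pka_fit, round(shift_prot, 2), round(shift_deprot, 2))
```

The grid search keeps the example dependency-free. With real data, a nonlinear least-squares routine that also reports parameter uncertainties would be the more usual choice.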
Further insight into the mechanism of in vitro
procruzain self-activation is provided by our observation that K777 impedes, but does not completely abolish, cleavage of the pro-region from the catalytic domain. Examination of the structurally homologous Cys25 to Ser25 variant of procathepsin L (1CJL, backbone RMSD for the catalytic domain with 2OZ2 = 0.582 Å)44
shows a short helix positioned orthogonally at the active site occluding the catalytic residues. However, the 44 residues immediately preceding the N-terminus of the catalytic domain lie in the substrate binding cleft and are relatively unstructured. As has been suggested for all members of the papain/cathepsin cysteine protease family,44
the pro-region in cruzain may adopt a similar conformation and mode of auto-inhibition. The relatively unstructured state of the pro-region may allow for transient interactions with the catalytic domain, which in turn allow the active site to be exposed in a catalytically competent state. Once an “active” enzyme is available, it would be ready to proteolyze other procruzain molecules, initiating a cascade of auto-activation events.
The expression and purification strategies described herein may also be adapted to papain and the other eleven known structurally homologous cysteine cathepsins, opening the possibility of characterizing their solution states via NMR spectroscopy or other biophysical methods. For example, K777 was recently demonstrated to inhibit the activities of cathepsins B, L, and S isolated from pancreatic extracts.45
Results from these studies could further the understanding of the mechanisms that govern enzymatic activity, as well as assist in the discovery of new inhibitors for this large family of closely related cysteine proteases.