A secreted chlamydial protease designated CPAF (Chlamydial Protease/proteasome-like Activity Factor) degrades host proteins, enabling Chlamydia to evade host defenses and replicate. The mechanistic details of CPAF action, however, remain obscure. We used a computational approach to search the Protein Data Bank for structures that are compatible with the CPAF amino acid sequence. The results reveal that CPAF possesses a fold similar to that of the catalytic domains of the tricorn protease from Thermoplasma acidophilum, and that CPAF residues H105, S499, and E558 are structurally analogous to the tricorn protease catalytic triad residues H746, S965, and D1023. Substitution of these putative CPAF catalytic residues blocked CPAF from degrading substrates in vitro, while the wild type and a noncatalytic control mutant of CPAF remained cleavage-competent. Substrate cleavage also correlated with processing of CPAF into N-terminal (CPAFn) and C-terminal (CPAFc) fragments, suggesting that these putative catalytic residues may also be required for CPAF maturation.
Infection with chlamydial organisms, which have adapted to an obligate intracellular growth cycle, causes severe health problems in both humans and animals [2-8]. However, the pathogenic mechanisms of Chlamydia-induced diseases remain unclear. It is hypothesized that inflammatory responses induced during chlamydial intracellular replication contribute significantly to chlamydial pathogenesis [9-12]. We have previously identified a Chlamydia-secreted protease designated CPAF (Chlamydial Protease/proteasome-like Activity Factor) that may participate in chlamydial immune evasion and promote chlamydial intracellular survival [13-17].
CPAF was initially discovered through its ability to degrade RFX5 and USF-1, host transcription factors that are required for MHC antigen expression [13-15]. This activity also suggested a mechanism through which Chlamydia evades host immune detection. CPAF may also contribute to chlamydial inhibition of apoptosis by degrading various proapoptotic BH3-only proteins [16, 18-20]. To facilitate chlamydial vacuole expansion, CPAF can solubilize portions of intermediate filaments (IF) by cleaving cytokeratin 8, a major component of IF in epithelial cells. Recently, it was shown that CPAF can also cleave cyclin B and PARP, which may contribute to the blockade of host cell replication and repair efforts at the late stages of infection. It thus seems clear that CPAF can promote chlamydial pathogenesis via multiple means, ranging from evasion of host defense to facilitation of chlamydial vacuole expansion. CPAF therefore represents a bona fide virulence factor of Chlamydia.
Although full length CPAF is a 70 kDa protein, once secreted into the cytoplasm of the infected host cell it is cleaved into two shorter polypeptides of MW ~29 kDa (CPAFn) and MW ~35 kDa (CPAFc) [15, 17, 22, 23]. CPAFn and CPAFc remain associated as a complex designated CPAFn:CPAFc. Two CPAFn:CPAFc complexes subsequently come together to form the catalytically active “dimeric” molecule. Interestingly, full length CPAF is partially processed and acquires measurable proteolytic activity when expressed as a GST-fusion protein in bacterial but not eukaryotic cell expression systems [22, 24], which has provided a platform for further characterization of CPAF proteolytic activity. Biochemical studies have led to the suggestion that both the cleavage and the subsequent dimerization events are required for CPAF to degrade its host substrates [22, 24]. However, the mechanistic details of CPAF proteolytic activity remain unknown.
Here, we used a computational approach in which the CPAF amino acid sequence was used to search for protease homologs of known structure in order to identify putative catalytic residues. The results of this in silico analysis reveal that in three-dimensional (3-D) space, the H105, S499, and E558 residues of CPAF map to the catalytic residues H746, S965, and D1023 of the structure of the tricorn protease, a 720 kDa serine proteolytic complex from Thermoplasma acidophilum, strongly suggesting that these CPAF residues also play a catalytic role in CPAF proteolytic activity. This suggestion was tested using a site-directed mutagenesis approach, in which we confirmed that CPAF mutants with substitutions at these putative catalytic residues lost their ability to degrade substrate proteins, while the wild type and a control CPAF mutant retained the ability to cleave these substrates. Furthermore, the ability to cleave the substrate proteins correlated with the processing of CPAF into CPAFn and CPAFc fragments, suggesting that the putative catalytic residues are also required for CPAF self-processing.
We searched for possible structural homologs of CPAF using a variation of protein threading as implemented in the program HHPRED, which is available on a web server (http://toolkit.tuebingen.mpg.de/hhpred) [26, 27]. HHPRED uses pair-wise comparison of profile Hidden Markov Models (HMMs). Briefly, in the first step, an alignment of sequence homologs is built for the CPAF query sequence by multiple iterations of PSI-BLAST against the non-redundant sequence database from NCBI. In the second step, a single CPAF profile HMM is generated from the multiple sequence alignment, which contains a concise statistical description of the underlying alignment, including secondary structural information. For each column in the multiple alignment that has a residue in the query sequence, an HMM column is created that contains the probabilities of each of the 20 amino acids, plus 4 probabilities that describe how often amino acids are inserted and deleted at this position (insert open/extend, delete open/extend). These insert/delete probabilities are translated into position-specific gap penalties when an HMM is aligned to a sequence or to another HMM [26, 27].
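The per-column amino acid probabilities described above can be illustrated with a minimal sketch. This is not the HHPRED implementation: the toy alignment, the function name `profile_columns`, and the uniform pseudocount are assumptions made only for illustration, and the four insert/delete probabilities and the secondary structure information are omitted.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def profile_columns(alignment, pseudocount=1.0):
    """Return one {amino acid: probability} dict per query-occupied column.

    alignment[0] is the query sequence; '-' marks a gap. The uniform
    pseudocount is an illustrative choice, not the HHPRED scheme.
    """
    query = alignment[0]
    columns = []
    for i, query_residue in enumerate(query):
        if query_residue == "-":
            continue  # only columns with a residue in the query become HMM columns
        counts = Counter(seq[i] for seq in alignment if seq[i] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        columns.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return columns


# Hypothetical four-sequence alignment; the first row plays the role of the query.
toy_alignment = [
    "HDS-G",
    "HESAG",
    "HDT-G",
    "HDSAA",
]
profile = profile_columns(toy_alignment)
```

In this toy case the fourth alignment column is skipped because the query has a gap there, so `profile` holds four columns, each a probability distribution over the 20 amino acids.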
These same two steps are also performed for each sequence corresponding to a known structure in the Protein Data Bank (PDB) in order to generate a library of profile HMMs to which the query profile HMM can be compared. In the third step, the query profile HMM is compared to each profile HMM in the structural database and scored [26, 27]. The HHPRED output, which consists of an alignment of the sequence to be modeled with known related structures, is used as input for MODELLER, a program that automatically calculates a model of the query sequence containing all non-hydrogen atoms by satisfaction of spatial restraints.
The gene encoding the full-length CPAF wild type (Wt) enzyme was cloned into a pGEX6p-2 vector (Amersham Biosciences Corp, Piscataway, NJ) using Chlamydia trachomatis serovar D genomic DNA as template and the oligonucleotide sequences 5′-CGC.GGA.TCC.ATG.GGT.TTT.TGG.AGA.ACA.TCG (forward) and 5′-AAAAGGAAAAGCGGCCGC.TCA.AAA.ACT.ACC.ATC.TTC.CGC (reverse) as cloning primers. This vector allows genes of interest to be expressed as fusion proteins with a 26 kDa glutathione-S-transferase (GST) as the fusion partner at the N-terminus. The putative leader sequence (residues 1-24; http://stdgen.northwestern.edu/) in CPAF was not excluded during primer design, because GST is fused to the N-terminus and constructs with or without the putative leader sequence displayed no differences in expression level. A site-directed mutagenesis kit (cat# 200518, Stratagene, La Jolla, CA) was used to create various mutant CPAF proteins using the Wt full length CPAF gene as the template. Briefly, complementary primers incorporating the desired nucleotide substitutions were used to introduce the corresponding mutations. After primer extension and amplification using PfuUltra DNA polymerase, the parental methylated and hemimethylated DNA was digested with the nuclease DpnI. The mutated molecules were then transformed into competent bacterial cells for nick repair and gene expression.
Each mutant was created with one pair of complementary primers that changes a single codon to an alanine codon. For the H105 to A mutation (designated H105A), the primers were 5′-AAT.GAC.TTT.GCC.GCT.GGA.GTA (forward; the codon CAC coding for H at position 105 in Wt CPAF was changed to GCC, which codes for A) and 5′-TAC.TCC.AGC.GGC.AAA.GTC.ATT (reverse). For S499A, the primers were 5′-CAA.GAC.TTT.GCT.TGT.GCT.GAC (forward; TCT coding for S in Wt CPAF was mutated to GCT coding for A) and 5′-GTC.AGC.ACA.AGC.AAA.GTC.TTG (reverse). For E558A, the primers were 5′-GCC.TTC.ATT.GCC.AAC.ATC.GGA (forward; GAG coding for E in Wt CPAF was mutated to GCC coding for A) and 5′-TCC.GAT.GTT.GGC.AAT.GAA.GGC (reverse). For the control mutant K540A, the primers were 5′-ACT.GGA.ATA.GCA.ACT.TGT.TCT (forward; AAA coding for K in Wt CPAF was mutated to GCA coding for A) and 5′-AGA.ACA.AGT.TGC.TAT.TCC.AGT (reverse).
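As a quick consistency check on the primer sequences listed above, the following sketch verifies that each reverse primer is the reverse complement of its forward primer and reads out the central codon. The helper names are hypothetical, and the assumption that the mutated codon is the fourth (central) codon of each 21-nucleotide primer is inferred from the codon changes stated in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
ALANINE_CODONS = {"GCA", "GCC", "GCG", "GCT"}

# Forward and reverse mutagenesis primers, copied from the text (5' to 3').
PRIMERS = {
    "H105A": ("AAT.GAC.TTT.GCC.GCT.GGA.GTA", "TAC.TCC.AGC.GGC.AAA.GTC.ATT"),
    "S499A": ("CAA.GAC.TTT.GCT.TGT.GCT.GAC", "GTC.AGC.ACA.AGC.AAA.GTC.TTG"),
    "E558A": ("GCC.TTC.ATT.GCC.AAC.ATC.GGA", "TCC.GAT.GTT.GGC.AAT.GAA.GGC"),
    "K540A": ("ACT.GGA.ATA.GCA.ACT.TGT.TCT", "AGA.ACA.AGT.TGC.TAT.TCC.AGT"),
}


def clean(primer):
    """Strip the codon-separating dots used in the text."""
    return primer.replace(".", "").upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def check_primer_pair(forward, reverse):
    """Return (pair is complementary, central codon of the forward primer)."""
    fwd = clean(forward)
    is_complementary = reverse_complement(fwd) == clean(reverse)
    central_codon = fwd[9:12]  # assumed position of the mutated codon
    return is_complementary, central_codon


results = {name: check_primer_pair(*pair) for name, pair in PRIMERS.items()}
for name, (ok, codon) in results.items():
    print(name, "complementary:", ok, "central codon:", codon,
          "alanine:", codon in ALANINE_CODONS)
```

All four pairs pass the complementarity check, and the central codons (GCC, GCT, GCC, GCA) are alanine codons that match the substitutions stated in the text.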
All GST-CPAF constructs were expressed in E. coli XL1-Blue with IPTG as inducer as previously described [24, 29]. To reduce the amount of insoluble protein, all CPAF fusion constructs were induced at 30 °C for 3 hours. The GST-fusion proteins were released from bacteria by sonication on ice and purified using glutathione-conjugated beads (Amersham Biosciences Corp). The bead-immobilized fusion proteins were quantified, aliquoted and stored at -80 °C until the digestion experiments.
Cell-free degradation assays were carried out as described previously [15, 24]. Cytosolic extracts (CE) containing Puma and keratin 8 or nuclear extracts (NE) containing RFX5 and USF-1 were used as substrates. Each CE was prepared by following a protocol previously described [17, 19]. Briefly, 1-2 × 10^7 normal HeLa cells were pelleted and resuspended in 0.5 ml of NP-40 buffer containing 1 % NP-40 (v/v), 0.5 % Triton X-100 (v/v) and 0.15 M NaCl in 50 mM Tris (pH 8.0) plus a protease inhibitor cocktail [1 mM PMSF (cat# P7626), 20 μM leupeptin (L2884), 1.6 μM pepstatin A (P5318) and 1.7 μg/ml of aprotinin (A6279), all from Sigma, St. Louis, MO]. After gentle mixing, the extraction was carried out on ice for 20 min, followed by a microfuge centrifugation to pellet the cell ghosts. The supernatants were collected, aliquoted and stored at -80 °C until use. Each NE was prepared as previously described [22, 24] using the pellets remaining after CE extraction. The pellets were resuspended in 0.5 ml of high salt lysis buffer containing 0.5 M NaCl and 1 % Triton X-100 (v/v) in 20 mM Tris (pH 8.0) plus the same protease inhibitor cocktail used for preparing CE. The extraction was carried out on ice for 30 min, followed by a high-speed centrifugation. The supernatant was collected as NE. In some cases, the extractions were repeated a few times and the supernatants were pooled, aliquoted and stored at -80 °C until use. The cytosolic fractions from cells infected with C. trachomatis serovar L2 (L2S100) were used as the source of enzyme. Each L2S100 was prepared using an established protocol. Briefly, the infected cells were harvested via low speed centrifugation and the cell pellets were resuspended in a douncing buffer [10 mM KCl, 1.5 mM MgCl2, 1 mM EDTA, 1 mM DTT and 250 mM sucrose in 20 mM Hepes-KOH (pH 7.5) with a protease inhibitor cocktail as described above]. After limited douncing to ensure that >70 % of the cells were broken without damage to either the nuclei or the inclusions, the supernatants were harvested after a series of centrifugations, including a final airfuge centrifugation at 100,000 × g, and designated L2S100. For digestion, the enzyme preps and substrates were mixed and incubated for 1 h at 37 °C. The residual substrates after digestion were detected using a Western blot assay as described below.
The Western blot assays were carried out as we previously described [30, 31]. Briefly, the bead-bound fusion proteins or reaction mixtures from cell-free degradation assays were separated on SDS-polyacrylamide gels. After the resolved protein bands were blotted onto nitrocellulose membranes, primary antibodies were applied. These included mouse monoclonal antibodies (mAb) 100a for detecting CPAFc [15, 22] and M20 for cytokeratin 8 (C5301, Sigma, Saint Louis, MO), rabbit polyclonal antibodies for RFX5 (cat# 200-401-191, Rockland Immunochemicals Inc., Gilbertsville, PA) and for USF-1 (sc-229, Santa Cruz Biotechnology, Inc., Santa Cruz, CA), and a rabbit mAb against Puma (EP512Y, ab33906, Abcam, Cambridge, MA). Primary antibody binding was probed with the corresponding secondary antibodies conjugated with horseradish peroxidase (Jackson ImmunoResearch Laboratories, Inc., West Grove, PA), followed by standard enhanced chemiluminescence (Amersham Biosciences Corp).
Figure 1 shows a sequence alignment between CPAF and tricorn protease that includes secondary structural information and highlights the residues in both proteins that form the catalytic triad. The overall amino acid identity between the two proteins shown in Figure 1 is only 17 %, and due to multiple insertions in the CPAF sequence relative to that of the tricorn protease catalytic domain (CPAF residues 159-183, 217-249, 274-321, 335-345, 401-474), the relationship between CPAF and tricorn protease was not obvious from sequence alignment methods alone. Table 1 shows the statistics for the five top-scoring hits from the HHPRED analysis using the query CPAF amino acid sequence. The tricorn protease, the structure of which (PDB code 1K32) is shown in Figure 2A, scored highest for the query CPAF sequence. Tricorn protease consists of a trimer of dimers that associate to form a hexameric ring. The active sites of the tricorn protease are located at the dimer interfaces and are composed of residues from the catalytic domains of each subunit (boxed in Figure 2A). The highest scoring portions of the CPAF molecule map only to the catalytic domains of tricorn protease and are shown in cartoon format in Figures 2A and 2B. Expanded views of the active site of tricorn protease and of that predicted for CPAF based on the tricorn protease structure are shown in Figures 2C and 2D, respectively. The results of this computational analysis strongly suggest that CPAF amino acid residues H105, S499, and E558 correspond in 3-D space to H746, S965, and D1023, the catalytic triad of tricorn protease. These residues of the putative CPAF catalytic triad were chosen for further mutagenesis studies.
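To give a sense of how much of the CPAF sequence falls in the insertions listed above, the following sketch tallies the length of each inserted segment and checks that none of the three putative catalytic residues lies inside one. The residue ranges and positions come from the text; the helper names are hypothetical, and the ranges are assumed to be inclusive.

```python
# CPAF segments inserted relative to the tricorn protease catalytic domain,
# as (first residue, last residue), assumed inclusive.
INSERTIONS = [(159, 183), (217, 249), (274, 321), (335, 345), (401, 474)]

# Putative catalytic triad positions reported in the text.
PUTATIVE_TRIAD = {"H105": 105, "S499": 499, "E558": 558}


def span_length(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


def in_any_span(position, spans):
    """True if the position falls inside any inclusive (start, end) span."""
    return any(start <= position <= end for start, end in spans)


insertion_lengths = [span_length(start, end) for start, end in INSERTIONS]
total_inserted = sum(insertion_lengths)
triad_in_insertions = {
    name: in_any_span(position, INSERTIONS)
    for name, position in PUTATIVE_TRIAD.items()
}

print("insertion lengths:", insertion_lengths)
print("total inserted residues:", total_inserted)
print("triad residues inside an insertion:", triad_in_insertions)
```

Under these assumptions the five insertions span 25, 33, 48, 11 and 74 residues (191 in total), and H105, S499 and E558 all lie outside them, consistent with these residues mapping onto the tricorn protease catalytic domains.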
CPAF mutants with alanine substitutions of the putative catalytic residues (H105A, S499A, E558A), together with the wild type (Wt), a cleavage-defective control mutant (L281G) and an unrelated alanine substitution mutant (K540A), were expressed as GST fusion proteins. The fusion proteins purified on glutathione-conjugated agarose beads were checked for both quantity and quality on an SDS-polyacrylamide gel (Figure 3). Each clone expressed an equivalent amount of full-length GST-CPAF fusion protein migrating at the predicted molecular weight position (96 kDa). Free GST molecules were present in all fusion preps. We then used the bead-bound fusion proteins as the source of enzyme to digest cellular proteins extracted from normal HeLa cells in a cell-free assay. The residual substrate proteins were monitored with the corresponding antibodies on a Western blot (Figure 4). The transcription factors RFX5 (panel a) and USF-1 (panel b) were detected in the nuclear extract (lane 1) but were completely degraded by the endogenous CPAF in L2S100 (lane 2). Both the Wt CPAF and the CPAF carrying an alanine substitution of K540, a residue predicted not to participate in catalysis, also completely degraded the transcription factors. However, the three CPAF mutants each with a putative catalytic residue replaced by an alanine (H105A, S499A & E558A) failed to degrade either RFX5 or USF-1. The processing-deficient mutant L281G also failed to degrade these substrates, which is consistent with our previous findings. We further compared these CPAF preps for their ability to degrade cytosolic substrates, including the BH3-only protein Puma (panel c) and cytokeratin 8 (panel d). Similarly, the CPAF mutants with the putative catalytic residue replacements were not able to efficiently degrade these two cytosolic substrates compared to the Wt and the unrelated mutant. Together, these results demonstrate that the three residues H105, S499 & E558 are each required for optimal proteolytic activity of CPAF.
We have previously shown that CPAF activity is dependent on CPAF processing into two fragments [22, 24]. A substitution of L281 with glycine at the predicted cleavage site blocked both CPAF processing and proteolytic activity. CPAF processing has therefore been used to assess CPAF activity [21, 24]. We next tested whether mutation of the putative catalytic residues also affects CPAF processing. As expected, both the Wt and the unrelated control mutant CPAF preps generated free CPAFc fragments, indicating that these preps were at least partially processed (Figure 5), which is consistent with their ability to degrade substrate proteins as described above. Interestingly, CPAF mutants with replacements of either the catalytic residues (H105A, S499A & E558A) or a cleavage site residue (L281G) were not processed at all, since no free CPAFc fragments were detected in these preps. These observations suggest that the three putative catalytic residues also play critical roles in CPAF processing.
Since its discovery, much progress has been made toward understanding the biochemical properties of CPAF and the role of CPAF in chlamydial pathogenesis and vaccine development [21, 32-41]. However, the catalytic nature of CPAF had remained elusive. Because the relationship between CPAF and other proteases was not obvious using sequence alignment methods alone (see RESULTS section), we used computational methods in an effort to provide insight into possible CPAF mechanism(s). In traditional “protein threading” analyses, the amino acid sequence of an unknown structure is scanned against a database of known structures. For each known structure, which serves as a potential scaffold for the unknown protein's amino acid sequence, a scoring function assesses the compatibility of the sequence with the structural template. High scores yield possible 3-D models. Such methods have great utility because amino acid sequences diverge much more rapidly than 3-D structures, so although a given protein sequence may not possess significant amino acid identity with other proteins in the sequence database, it may still be quite compatible with a 3-D scaffold present in the structural database. In other words, two proteins may have very similar 3-D folds even though they possess very little sequence identity.
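For readers unfamiliar with how sequence identity is tallied from a pairwise alignment, a minimal sketch follows. The two aligned strings are hypothetical toy sequences, not CPAF or tricorn protease, and the choice to count only columns where both sequences have a residue is one of several conventions in use; it is an assumption here and not necessarily the one behind the 17 % figure reported above.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns in which both sequences contribute a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned toy sequences with one gap in each.
toy_a = "HDS-GAVL"
toy_b = "HETAG-IL"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f} % identity")
```

Here six columns are compared and three match, giving 50 % identity. Changing the denominator (for example, to the length of the shorter sequence) would change the value, which is why the convention should be stated whenever an identity figure is quoted.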
In the current study, we used the programs HHPRED [26, 27] and MODELLER to predict putative catalytic residues of CPAF and found an excellent alignment of CPAF residues with the structure of the catalytic domains and the catalytic triad of tricorn protease (Figure 1 and Table 1), a large serine protease from Thermoplasma acidophilum (Figure 2). The results strongly suggest that CPAF is also a serine protease with a catalytic triad consisting of amino acid residues H105, S499, and E558. Thus, S499 likely serves as the nucleophile attacking the carbonyl carbon atom of the residue to be cleaved in the substrate, while H105 serves to polarize the hydroxyl group of S499, and E558 orients H105 via a bound water molecule and makes it a better proton acceptor via electrostatic effects. Tricorn has been shown to exhibit both tryptic and chymotryptic specificities. The X-ray structure reveals that specificity for basic P1 residues (preceding the cleavable bond) is conferred by D936, which is provided by the dyad-related subunit (Figure 2C). As shown in Figure 2D, this specificity determinant (E394) also appears to be present in CPAF. Site-directed mutagenesis analyses have allowed us to confirm that these three residues are indeed critical for CPAF enzymatic activity in degrading four known CPAF substrate proteins. This conclusion is consistent with the observation that none of the point mutations appeared to affect the overall CPAF structure, since the solubility of all CPAF preps remained essentially the same. Furthermore, this conclusion is apparently supported by a recent crystal structure of CPAF, although at this writing the coordinates are not available to permit a comparison to our model.
Intriguingly, as shown in Figure 2D, a cysteine residue, C500, sits immediately adjacent to S499 of the predicted catalytic triad of CPAF, and it is tempting to speculate that CPAF may also function as a cysteine protease, endowing it with broader substrate specificity than that demonstrated by tricorn protease. It is also interesting to note that, as shown in Figure 1 and Supplementary Figure 1, the CPAF sequences that map to the 3-D structure of the catalytic domains of tricorn protease are interspersed with sequences that do not seem to map to any known structure in the Protein Data Bank. This suggests that these interspersed residues may be involved in the higher order assembly of CPAF rather than in catalysis, and that the overall oligomeric composition of CPAF differs from the hexameric arrangement of subunits in tricorn protease. In this context, it should be noted that the second-highest scoring molecule in the Protein Data Bank for the CPAF query sequence is the structure of the catalytic domains of the photosystem II D1 serine protease from Scenedesmus obliquus (PDB code 1FC6), which functions as a homodimer and not a homohexamer.
An additional intriguing finding is that mutation of the three putative catalytic residues also affects CPAF processing. Since CPAFc starts with the residue G284, the cleavage site has to be upstream of and close to this residue. Clearly, none of the three catalytic residues is located at the cleavage site. Therefore, we hypothesize that CPAF processing might depend on its own enzymatic activity. This hypothesis is consistent with the recent finding that artificially induced oligomerization of CPAF can lead to CPAF fusion protein processing and activation.
Besides its role in pathogenesis, CPAF has been found to induce protective immunity against chlamydial infection. CPAF is one of the most immunodominant antigens during C. trachomatis infection in humans in terms of antibody production [29, 33, 35]. Although the human serum antibodies can neutralize CPAF enzymatic activity in test tubes, antibodies may play a very limited role in regulating CPAF activity inside the Chlamydia-infected cells. Indeed, the CPAF-induced protective immunity in mice was found to depend on a Th1 dominant and IFNγ-mediated response [39-41]. The characterization of CPAF proteolytic activity may enable us to design small molecule inhibitors that are cell-permeable for blocking CPAF activity in the Chlamydia-infected cells so that the infected cells can be efficiently detected and attacked by T lymphocytes.
This work was supported in part by grants from the US National Institutes of Health (to G. Zhong) and from the Robert A. Welch Foundation AQ-1399 (to P.J. Hart). Support for the X-ray Crystallography Core Laboratory by the Office of the Vice President for Research at the University of Texas Health Science Center is also gratefully acknowledged.