Enterotoxigenic anaerobic Bacteroides fragilis is a significant source of inflammatory diarrheal disease and a risk factor for colorectal cancer. Two distinct metalloproteinase types (the homologous 1, 2, and 3 isoforms of fragilysin (FRA1, FRA2, and FRA3, respectively) and metalloproteinase II (MPII)) are encoded by the B. fragilis pathogenicity island. FRA was demonstrated to be important to pathogenesis, whereas MPII, also a potential virulence protein, remained completely uncharacterized. Here, we report the first extensive characterization of MPII in comparison with FRA3, a representative of the FRA isoforms. We employed a series of multiplexed peptide cleavage assays to determine the substrate specificity and proteolytic characteristics of MPII and FRA. These results enabled implementation of an efficient assay of MPII activity using a fluorescence-quenched peptide and contributed to structural evidence for the distinct substrate cleavage preferences of MPII and FRA. Our data imply that MPII specificity mimics the dibasic Arg↓Arg cleavage motif of furin-like proprotein convertases, whereas the cleavage motif of FRA (Pro-X-X-Leu-(Arg/Ala/Leu)↓) resembles that of human matrix metalloproteinases. To the best of our knowledge, MPII is the first zinc metalloproteinase with dibasic cleavage preferences, suggesting a high level of versatility of metalloproteinase proteolysis. Based on these data, we now suggest that the combined (rather than individual) activity of MPII and FRA is required for the overall B. fragilis virulence in vivo.
A variety of microbial communities (the microbiome) exists in the human body, playing fundamental roles in human health and disease (1, 2). Normally, bacteria outnumber human cells within an individual by at least an order of magnitude. The gastrointestinal tract is part of the direct interface between the human organism and the microbiome. In certain circumstances, the beneficial relationship between the microbiome and the gastrointestinal tract becomes disrupted, causing disturbances leading to disease.
Chronic inflammation affects all phases of carcinogenesis, from favoring the initial genetic alterations that drive cancer formation to acting as a tumor promoter by establishing conditions in the surrounding tissues that allow the tumor to progress and metastasize (3–6). For example, chronic hepatitis B or C virus infections frequently lead to liver cancer (7), and chronic Helicobacter pylori infection leads to gastric cancer in some patients (8–10). Increased cancer incidence is likewise found in experimental mouse models of both infection-induced and noninfectious inflammation (11, 12).
The role of infectious and inflammatory processes in colon carcinogenesis is of great interest. Enterotoxigenic Bacteroides fragilis is both a significant source of chronic inflammation (e.g. inflammatory diarrhea and ulcerative colitis) and a risk factor for colorectal cancer (CRC)2 (4, 13–19). B. fragilis typically constitutes only 0.5–2% of the cultured fecal flora (20–23) but causes over 80% of anaerobic infections (24). It is likely that the proinflammatory, protumorigenic roles of B. fragilis in CRC and of H. pylori in gastric cancer are similar (4, 19, 21, 23, 25).
There is a consensus among researchers that metalloproteinase activity is essential for B. fragilis virulence and that this activity is encoded by the 6-kb pathogenicity island in enterotoxigenic B. fragilis strains (14, 21, 26, 27). The island contains at least two metalloproteinase genes. These genes encode fragilysin (FRA; also termed B. fragilis toxin or BFT), demonstrated to be important to pathogenesis, and metalloproteinase II (MPII), also a potential virulence protein. FRA exists in three homologous isoforms (FRA1, -2, and -3) that share over 90% sequence identity. In contrast, the sequence identity between FRAs and MPII is only 25% (Fig. 1).
FRAs and MPII are secretory zinc metalloproteinases with a zinc-binding HEXXHXXGXXH motif and a characteristic Met turn. These structural features, especially when combined, indicate that both FRAs and MPII exhibit the matrix metalloproteinases (MMP)/a disintegrin and metalloproteinase (ADAM) fold (28–31). The overall level of homology between the catalytic domain of bacterial FRAs and MPII and mammalian MMPs/ADAMs, however, is low.
Although there is limited information about the structural-functional features of FRAs (17, 32), the biochemical characteristics of MPII remain completely unknown. Because MPII has not yet been characterized, it is unclear if MPII can facilitate the toxigenic effect of FRAs in causing diarrhea, inflammatory bowel disease, and CRC (14, 26, 30, 33, 34). As a result, we cannot decipher, at the molecular level, how the proteolytic activity of B. fragilis tailors the normal luminal epithelium for inflammation and disease onset. Understanding the substrate cleavage specificity of MPII relative to FRAs may help to determine how infection-associated inflammation enhances carcinogenesis in the affected organs and how we may find a means to fight the disease.
Here, we performed a comparative characterization of MPII and FRA3, a representative of the FRA isoforms. Our data imply that, in contrast with the FRA family members, the unconventional MPII cleavage preferences mimic those of furin-like proprotein convertases. To the best of our knowledge, MPII is the first zinc metalloproteinase with dibasic cleavage preferences, suggesting a high level of versatility of metalloproteinase proteolysis. Based on our results, we suggest that the combined (rather than individual) activity of MPII and FRAs is required for B. fragilis virulence.
The reagents were purchased from Sigma-Aldrich, unless indicated otherwise. 5-FAM-SLGRKIQIQK(QXL520)-NH2 fluorescence-quenched peptide substrate was acquired from AnaSpec. GM6001/Ilomastat, BB94/Batimastat, and AG3340/Prinomastat were obtained from EMD Millipore, Tocris Biosciences, and Allergan, respectively. Anthrax protective antigen-83 (PA83) was purchased from List Biological Laboratories. Recombinant human TIMP-2 was expressed in Madin-Darby canine kidney cells and then purified from conditioned medium as reported earlier (35). Human TIMP-1 and TIMP-3 were purchased from Invitrogen.
The frozen, deidentified tumor and matching normal tissue biopsies were obtained from our preexisting collection of proximal CRC specimens. Genomic DNA was extracted from the tissue samples using the DNeasy blood and tissue DNA purification system (Qiagen). The 501-bp fragment of the B. fragilis 16 S rRNA gene was amplified in 100-μl PCRs containing genomic DNA (100 ng), the forward and reverse primers (5′-ATAGCCTTTCGAAAGRAAGAT-3′ and 5′-CCAGTATCAACTGCAATTTTA-3′, respectively; 0.3 μM each), Crimson Taq DNA polymerase (1 unit), and 12.5 mM Tricine buffer, pH 8.5, supplemented with 42.5 mM KCl, 1.5 mM MgCl2, 6% dextran, and 0.2 mM dNTP mix. DNA amplification was performed by denaturing the samples at 95 °C for 5 min followed by 35 PCR cycles (95 °C for 30 s, 52 °C for 30 s, 72 °C for 1 min). The products were separated by 2% agarose gel electrophoresis. Amplified 501-bp products were purified and sequenced to confirm their authenticity and identity. A two-sided Fisher's exact test was used to evaluate the statistical significance of the association of the bacteria with colorectal cancer.
The full-length cDNA coding for the wild type MPII proenzyme (gi:3046922) and the FRA3 proenzyme (PDB accession code 3P24; gi:315583580) were synthesized by Genewiz. PCR with the 5′-CACCATGCACCATCACCATCACCATGGAGCCTGTGCCGATGACCTG-3′ and 5′-TCAATGGTGGTGATGGTGGTGCTTGTCATCGTCATCTTTGTAGTCCTTTTGGATGCACTCCAG-3′ oligonucleotides as the forward and reverse primers, respectively, was then used to insert the His6 tags (both N- and C-terminally) and the FLAG tag (C-terminally) into the MPII template. Similarly, the 5′-CACCATGCACCATCACCATCACCATGGAGCCTGCAGCAATGAGGCC-3′ and 5′-TCAATGGTGGTGATGGTGGTGCTTGTCATCGTCATCTTTGTAGTCACCATCTGCGATCTCCCAGCC-3′ as the forward and reverse primers, respectively, were used to incorporate two His6 tags and a single FLAG tag into the FRA3 construct.
The MPII E352A proteolytically inactive mutant, in which Ala replaced the catalytically essential Glu residue of the active site, was constructed using PCR mutagenesis with 5′-CCGTATACACTGGCACACGCGATCGGCCATCTGCTGGGC-3′ and 5′-GCCCAGCAGATGGCCGATCGCGTGTGCCAGTGTATACGG-3′ as the forward and reverse primers, respectively, and the wild type MPII template (mutant nucleotides are underlined). Similarly, 5′-GATGGCCCACGCACTGGGCCACATC-3′ and 5′-GATGTGGCCCAGTGCGTGGGCCATC-3′ as the forward and reverse primers, respectively, and the wild type FRA3 template were used to generate the FRA3 E349A inactive mutant (mutant nucleotides are underlined). The authenticity of the constructs was confirmed by DNA sequencing.
The wild type and mutant tagged constructs were recloned into the pET101 expression vector (Invitrogen). The recombinant pET101 plasmids were used to transform competent E. coli BL21 (DE3) Codon Plus cells (Stratagene). Transformed cells were grown at 30 °C in LB broth containing ampicillin (0.1 mg/ml). Cultures were induced with 0.6 mM IPTG for 16 h at 18 °C. Cells were collected by centrifugation, resuspended in Tris-HCl buffer, pH 8.0, containing 1 M NaCl, and disrupted by sonication. The pellet was removed by centrifugation (40,000 × g; 30 min). The constructs were then purified from the supernatant fraction on a Co2+-chelating Sepharose Fast Flow column (GE Healthcare), equilibrated with 20 mM Tris-HCl buffer, pH 8.0, supplemented with 1 M NaCl. After washing out the impurities using the same buffer supplemented with 35 mM imidazole, the bound material was eluted using a 35–500 mM gradient of imidazole. The FRA fractions were combined and dialyzed against 20 mM Tris-HCl, pH 8.0, containing 150 mM NaCl. The purified material was kept at −80 °C until use. The purity of the material was tested by SDS-gel electrophoresis (12% NuPAGE-MOPS, Invitrogen) followed by Coomassie staining and by Western blotting with an anti-FLAG tag antibody.
The MPII E352A and FRA3 constructs (3 μg; 3 μM each) were co-incubated in PBS for 1 h at 37 °C with increasing concentrations of trypsin. Similarly, FRA3 (3 μg; 3 μM) was co-incubated for 1 h at 37 °C with increasing concentrations of active MPII in 50 mM HEPES buffer, pH 8.0, containing 1 mM CaCl2 and 50 μM ZnCl2. The cleavage was stopped by adding a 5× SDS sample buffer to the reactions. The digested samples were analyzed by SDS-gel electrophoresis in 12% NuPAGE-MOPS gels.
Peptide pool preparation and the following in vitro cleavage assays were performed using a methodology similar to those described in our recent publications (36–38). Template DNA encoding 10-amino acid residue-long peptide substrates was transcribed in vitro to produce the corresponding mRNAs. In vitro translation was then used to generate peptides covalently attached to their mRNA templates (39). To increase their stability, the peptide-mRNA fusions were converted to the corresponding covalent peptide-cDNA fusions (40). The peptide-cDNA fusions were biotinylated at the N terminus and immobilized on streptavidin-coated magnetic beads. We also spiked six biotinylated oligonucleotides into the peptide-cDNA pool to use as internal standards for normalization after PCRs. The magnetic beads with the immobilized peptide-cDNA pool (2 pmol) were co-incubated with MPII and FRA3 (1.2 or 12 pmol) at 37 °C for 30 min in 3-μl reactions containing 50 mM HEPES, pH 6.8, 10 mM CaCl2, and 10 μM ZnCl2. Reactions without the proteinases were used as negative controls. To identify cleaved peptide substrates, the cDNA molecules released by the proteinase treatment were collected, the DNA adapters required for sequencing were installed by PCR (41), and the obtained DNA constructs were sequenced using a MiSeq sequencing instrument (Illumina). In this study, we produced and tested 316 peptide sequences.
Peptide abundance in solution was quantified by counts of DNA reads corresponding to each peptide sequence. The cleavage efficiencies were determined by comparing counts in the proteinase-treated samples versus proteinase-less, buffer-only controls (each in triplicate). Next, we normalized the data for each of six different biotinylated substrates (internal standard) across the whole data set. In order to calculate an enrichment ratio for each substrate, we considered counts (in triplicate) in a positive sample over the average of counts of a control sample (buffer-only, no proteinase) + 3 × S.D. An enrichment ratio (ER) = 1 indicates that counts in a positive sample are 3 × S.D. values above the “buffer-only, no proteinase” negative sample.
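The enrichment ratio described above can be illustrated with a minimal Python sketch. It assumes that the S.D. is the sample standard deviation of the control triplicate and that the counts have already been normalized to the internal standards; the function name and the read counts are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def enrichment_ratio(treated_counts, control_counts):
    """ER = mean(treated) / (mean(control) + 3 * S.D.(control)).

    Assumption: S.D. is the sample standard deviation of the
    buffer-only control replicates.
    """
    threshold = mean(control_counts) + 3 * stdev(control_counts)
    if threshold <= 0:
        raise ValueError("control threshold must be positive")
    return mean(treated_counts) / threshold


# Hypothetical normalized read counts (triplicates), not data from the study.
treated = [930, 1010, 1060]
control = [90, 100, 110]

er = enrichment_ratio(treated, control)
print(f"ER = {er:.2f}")
```

With this definition, ER = 1 corresponds to a treated-sample count that equals the control mean plus 3 × S.D., which matches the interpretation given in the text.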
Eleven 10-residue long peptides (GHSRRSRRSG, AGLRRAALGG, AGLRRASLGG, LRRKRSLSYS, GRHRRQIDRG, GNKRRGGTAG, SGHMHAALTA, SGPVSMRYTA, SGPRSLKSTA, SGPMSLRMTA, and PTKIYDNIYD) were synthesized by the Spyder Institute (Prague, Czech Republic). The peptides (1 μg; 50 μM each) were incubated for 1 h at 37 °C with MPII or FRA3 (0.4 μg; 0.5 μM) in 20 μl of 50 mM HEPES buffer, pH 8.0, containing 1 mM CaCl2, 0.5 mM MgCl2, and 10 μM ZnCl2. The molecular mass of the intact peptides and the digest products was determined by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) analysis using an Autoflex II mass spectrometer (Bruker Daltonics).
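The way a scissile bond can be inferred from digest masses is sketched below in Python. The sketch assumes unmodified linear peptides detected as singly protonated ions ([M+H]+) and uses standard monoisotopic residue masses; the function names, the mass tolerance, and the observed m/z value in the example are hypothetical.

```python
# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mh(seq):
    """Monoisotopic [M+H]+ of an unmodified linear peptide (assumption)."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER + PROTON


def match_cleavage(seq, observed_mz, tol=0.5):
    """Return the single-bond cleavages whose N- or C-terminal fragment
    matches any observed m/z value within the tolerance (Da)."""
    hits = []
    for i in range(1, len(seq)):
        n_frag, c_frag = seq[:i], seq[i:]
        expected = (peptide_mh(n_frag), peptide_mh(c_frag))
        if any(abs(mz - m) <= tol for mz in observed_mz for m in expected):
            hits.append(f"{n_frag}↓{c_frag}")
    return hits


# Hypothetical observed peak for one of the synthesized peptides.
print(match_cleavage("AGLRRAALGG", [416.26]))
```

A real analysis would also need to account for adducts, terminal modifications, and secondary cleavages, none of which are handled here.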
The sequence logos were obtained by calculating cleavage efficiency over the entire set of substrates and then selecting substrates with an ER value of >3. These substrates were considered susceptible to proteolysis. Substrates with an ER value of <3 were considered resistant to proteolysis. The resulting logos were created by the Web-based IceLogo program (42).
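The selection step that precedes logo generation can be sketched in Python as follows. This is a simplified illustration only: it applies the ER cutoff stated above and tabulates raw per-position residue frequencies, whereas IceLogo compares the selected set against a reference set. The function names and the ER values in the example are hypothetical; the peptide sequences are taken from the synthesized set.

```python
from collections import Counter

ER_CUTOFF = 3.0  # threshold stated in the text


def split_by_er(er_by_peptide, cutoff=ER_CUTOFF):
    """Split peptides into cleaved (ER > cutoff) and resistant (ER < cutoff)."""
    cleaved = [p for p, er in er_by_peptide.items() if er > cutoff]
    resistant = [p for p, er in er_by_peptide.items() if er < cutoff]
    return cleaved, resistant


def position_frequencies(peptides):
    """Per-position residue frequencies for peptides of equal length."""
    if not peptides:
        return []
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(p[i] for p in peptides)
        freqs.append({aa: n / len(peptides) for aa, n in counts.items()})
    return freqs


# Hypothetical ER values, not data from the study.
er_values = {"AGLRRAALGG": 8.2, "AGLRRASLGG": 5.1, "PTKIYDNIYD": 0.4}
cleaved, resistant = split_by_er(er_values)
for position, freq in enumerate(position_frequencies(cleaved), start=1):
    print(position, freq)
```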
Molecular modeling of the substrate complexes of MPII and FRA3 was initiated from the crystal structure of FRA3 (PDB entry 3P24) (30). In this structure, the β11 strand of the FRA3 prodomain directly interacts with and tightly fits into the active site of the catalytic domain. The strand, however, runs in the opposite orientation relative to the substrate. To model the PRPLRAWGAA substrate, the Leu186–Val201 segment of the β11 strand was deleted in silico. The substrate was then modeled by placing the substrate's side chain heavy atoms at the positions of the heavy atoms in the β11 strand. The further optimization of the main and side chain atoms of the substrate was performed using molecular mechanical minimization and short molecular dynamics simulations via the Amber12 modeling package and FF99SB force field and the generalized Born method for representing the solution environment implicitly (43–46). In the course of our molecular mechanical optimization procedure, the proper orientation of the carbonyl group of the scissile bond relative to the active site zinc ion was maintained, and acetyl and N-methyl groups for capping the N and C termini of the modeled substrate were used. The structure of MPII was prepared by homology modeling using PDB entry 3P24 as a template and FFAS03 (47, 48) and Modeler (49) software. The fold of the PGRLR↓RSGAA substrate mimics the fold of the PRPLR↓AWGAA substrate in the FRA3 structure. The structures of FRA1 and FRA2 were modeled using FRA3 (PDB entry 3P24) as a template and Modeler software. The final models were displayed using PyMOL (Schrödinger, LLC, New York).
The assay for MPII cleavage activity was performed in 50 mM HEPES buffer, pH 8.0, containing 1 mM CaCl2, 0.5 mM MgCl2, and 10 μM ZnCl2. The 5-FAM-SLGRKIQIQK(QXL520)-NH2 substrate and enzyme concentrations were 10 μM and 50 nM, respectively. The total assay volume was 0.2 ml. Initial reaction velocities were monitored continuously at λex = 488 nm and λem = 520 nm using a Spectramax Gemini EM fluorescence spectrophotometer (Molecular Devices). All assays were performed in triplicate in wells of a 96-well plate.
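One common way to obtain an initial velocity from a continuous fluorescence trace is a linear fit over the early, linear part of the progress curve. The Python sketch below illustrates that approach under the assumption that the supplied time points all lie within the linear phase; it is not necessarily the procedure used by the instrument software, and the function name and readings are hypothetical.

```python
def initial_velocity(times_s, fluorescence):
    """Least-squares slope of fluorescence versus time (RFU per second)."""
    n = len(times_s)
    if n < 2 or n != len(fluorescence):
        raise ValueError("need at least two paired time/fluorescence points")
    t_mean = sum(times_s) / n
    f_mean = sum(fluorescence) / n
    sxx = sum((t - t_mean) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (f - f_mean)
              for t, f in zip(times_s, fluorescence))
    return sxy / sxx


# Hypothetical triplicate wells (relative fluorescence units), not study data.
times = [0, 30, 60, 90, 120]
wells = [
    [100, 160, 220, 280, 340],
    [102, 165, 224, 290, 348],
    [98, 157, 219, 276, 335],
]
slopes = [initial_velocity(times, well) for well in wells]
print(f"mean initial velocity: {sum(slopes) / len(slopes):.3f} RFU/s")
```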
Where indicated, MPII (50 nM) was co-incubated for 30 min at 20 °C with increasing concentrations of natural tissue inhibitors of MMPs (TIMP-1, TIMP-2, and TIMP-3) and small molecule hydroxamate inhibitors of MMPs (GM6001/Ilomastat, BB94/Batimastat, and AG3340/Prinomastat). The residual activity of the samples was then measured as above. EC50 values were calculated by determining the concentrations of TIMPs and hydroxamate inhibitors needed to inhibit 50% of the MPII cleavage activity. GraphPad Prism was used as fitting software.
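As a rough illustration of how an EC50 can be read from a dose-response series, the Python sketch below interpolates linearly on a log-concentration axis between the two points that bracket 50% residual activity. This is a simplified stand-in and not the curve fitting performed in GraphPad Prism; the function name and the data points are hypothetical.

```python
import math


def ec50_by_interpolation(concs_nm, residual_activity_pct):
    """Concentration giving 50% residual activity, by log-linear
    interpolation between bracketing points. Concentrations must be > 0.
    Returns None if the series never crosses 50%."""
    points = sorted(zip(concs_nm, residual_activity_pct))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            frac = (a_lo - 50.0) / (a_lo - a_hi)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    return None


# Hypothetical inhibitor titration (nM, % residual activity), not study data.
concs = [1, 10, 100, 1000]
activity = [92, 60, 20, 4]
print(ec50_by_interpolation(concs, activity))
```

A full analysis would fit a sigmoidal dose-response model to all points, which also yields confidence intervals that this interpolation cannot provide.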
Anthrax PA83 (2 μg; ~1 μM) was co-incubated for 1 h at 37 °C with increasing concentrations of MPII in 50 mM HEPES, pH 8.0, containing 1 mM CaCl2, 0.5 mM MgCl2, and 10 μM ZnCl2. The total volume of the reactions was 20 μl. Where indicated, GM6001 (1 μM) was added to the reactions to inhibit MPII. The cleavage reaction was stopped by adding a 5× SDS sample buffer. The digest samples were analyzed by SDS-PAGE using 4–20% polyacrylamide gel.
To test if B. fragilis is more prevalent in the CRC tumor biopsies as compared with the matching normal tissue, we isolated the total genomic DNA from 30 tumor biopsies and 30 matching normal tissue specimens. We then amplified by PCR the 501-bp fragment of the B. fragilis 16 S rRNA gene using the isolated DNA samples. Selected amplified bands were sequenced to confirm their B. fragilis identity. Our data clearly demonstrated the presence of the B. fragilis 16 S rRNA gene in 22 of 30 tumor biopsies (~73%) but only in 13 of 30 matching normal tissue samples (~43%), thus suggesting a statistically significant association of the bacteria with CRC (Fig. 2). In addition, the bacterium was found only in those matching normal tissues that were derived from the patients with the B. fragilis-positive tumor samples.
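The two-sided Fisher's exact test named under "Experimental Procedures" can be reproduced for the counts reported above with a short Python sketch. It assumes the samples are treated as two independent groups arranged in a 2 × 2 table (positive/negative by tumor/normal) and uses the common convention of summing all table probabilities that do not exceed the probability of the observed table; the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Counts from the text: 22 of 30 tumor biopsies and 13 of 30 matching
# normal samples were positive for B. fragilis.
p_value = fisher_exact_two_sided(22, 8, 13, 17)
print(f"two-sided p = {p_value:.4f}")
```

Under these assumptions the p value falls below 0.05, in line with the association described in the text. Because the tumor and normal samples are matched, a paired test could also be considered; that choice is outside the scope of this sketch.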
To determine the potential proinflammatory, cancer-promoting role of B. fragilis toxin, we successfully cloned the full-length genes coding for B. fragilis MPII and FRA3. To facilitate their purification, the constructs were flanked by an N-terminal and C-terminal His6 tag and, additionally, by a C-terminal FLAG tag. We also constructed the catalytically inactive MPII and FRA3 (E352A and E349A mutants, respectively) in which the catalytically essential Glu residue of the active site was mutated into Ala. The constructs were expressed in and purified from lysates of E. coli (Fig. 3). MPII was readily self-activated, whereas the FRA3 proform was noticeably more stable, and only a minor fraction of the active enzyme accumulated in the purified FRA3 samples during storage. As judged by the mobility of the digest products in SDS gels and in agreement with the observations by others (30), trypsin proteolysis converted the FRA3 proenzyme and the MPII E352A inert mutant into the mature proteinase (data not shown). In turn, MPII was unable to convert the FRA3 proenzyme into the active proteinase (data not shown), suggesting that either human trypsin or an unidentified bacterial proteinase activates FRA3 in the gut in vivo.
In order to quickly characterize the substrate preferences of the FRA cleavage, we used recently published technology for screening of customizable peptide pools. The utility of peptide-cDNA fusion pools in multiplexed assays was demonstrated in our previous reports using multiple proteinases, including human furin, hepatitis C virus NS3/4A proteinase, and thrombin (36–38).
The design of the set of 316 10-mer peptides used in the assay was based on the known cleavage site of FRA3 in E-cadherin (17, 21) and on the consensus cleavage sites (Pro-X-X↓Leu) of human MMPs. The original cleavage data are presented in supplemental Table S1 and summarized in Table 1. To determine the position of scissile bonds, 11 representative individual peptides were resynthesized using standard peptide synthesis methodology. The individual peptides were then digested by MPII and FRA3. The digest reactions were analyzed by MALDI-TOF mass spectrometry to identify resulting digest fragments and, consequently, the scissile bonds. The MS data directly correlated with the results of the multiplex assay. The representative mass spectrometry analysis data are shown in Fig. 4A.
Of 316 tested peptide sequences, >80 and >230 were cleaved by MPII and FRA3, respectively (ER > 3). In order to identify the most probable cleavage sequences, we included only peptides with ER > 3 in our analysis. Although the peptide set used in the experiments was limited in size, it was sufficient to show a clear difference in substrate preferences of MPII versus FRA3 (Fig. 4B). Our data imply that MPII prefers Arg at both the P1 and P1′ positions. The presence of Leu at the P2 position, Arg or Gly at both the P3 and P3′ positions, and Gly at the P4 position are all characteristic of substrates efficiently cleaved by MPII. In turn, FRA3 prefers the presence of Leu and Pro at the P2 and P5 positions, respectively. Arg/Ala, Pro/Gly/Ser, and Arg/Ala/Leu are well tolerated at the P4, P3, and P1 positions, respectively. Overall, the resulting cleavage signature of FRA3 is most similar to the Pro-X-X-Leu consensus cleavage motif of MMPs.
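The subsite positions used above follow the standard protease nomenclature, in which residues are numbered outward from the scissile bond (P1, P2, ... toward the N terminus and P1′, P2′, ... toward the C terminus). The short Python sketch below makes the mapping explicit for the two model substrates used in our modeling (PGRLR↓RSGAA for MPII and PRPLR↓AWGAA for FRA3); the function name is hypothetical.

```python
def label_subsites(peptide, n_term_len):
    """Map each residue to its P/P' label, given the number of residues
    on the N-terminal side of the scissile bond."""
    if not 0 < n_term_len < len(peptide):
        raise ValueError("scissile bond must lie inside the peptide")
    labels = {}
    for i, aa in enumerate(peptide):
        if i < n_term_len:
            labels[f"P{n_term_len - i}"] = aa
        else:
            labels[f"P{i - n_term_len + 1}'"] = aa
    return labels


mpii = label_subsites("PGRLRRSGAA", 5)  # PGRLR↓RSGAA
fra3 = label_subsites("PRPLRAWGAA", 5)  # PRPLR↓AWGAA
print({k: mpii[k] for k in ("P4", "P3", "P2", "P1", "P1'", "P3'")})
print({k: fra3[k] for k in ("P5", "P4", "P3", "P2", "P1", "P1'")})
```

For the MPII substrate this gives Arg at both P1 and P1′, Leu at P2, and Gly at P4, and for the FRA3 substrate it gives Leu at P2 and Pro at P5, consistent with the preferences described in the text.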
Overall, our data suggest that there is a level of similarity between the dibasic cleavage signatures of MPII and the furin-like proprotein convertases (50). In agreement with this suggestion, anthrax PA83 treated with MPII appeared to be cleaved at the Arg193-Lys-Lys-Arg196 furin cleavage motif. As a result, MPII transformed PA83, albeit less efficiently than furin, into the C-terminal PA63 and the N-terminal PA20 fragments (Fig. 4C).
Our cleavage studies allowed us to elucidate a cleavage signature of MPII and FRA3 and thus to identify the commercially available fluorescence-quenched peptide 5-FAM-SLGRK↓IQIQK(QXL520)-NH2 as a substrate for MPII. This peptide employs a 5-FAM/QXL520 FRET pair for quantification of enzyme activity, providing a convenient direct assay for MPII activity. As we expected based on the multiplex assay results, this peptide is cleaved by MPII but not by FRA3. The availability of this peptide substrate allowed us to readily establish the fundamental enzymological features of the MPII activity, including its pH and temperature optimum dependence and effects of salts and glycerol (Fig. 5). Our results imply that MPII is a thermolabile protease with pH optimum at pH 8.0, that MPII is fairly resistant to high salt concentrations, and that the MPII activity is not enhanced by glycerol, which is normally used for enzyme stabilization and activity enhancement. Because we were not able to locate any commercially available fluorescent quenched peptide substrates for FRA3, we could not perform the similar enzymological studies with FRA3. Overall, we conclude that the cleavage preferences we determined for FRA3 roughly represent those of the entire FRA subfamily and that the cleavage preferences of MPII and FRAs are distinct.
In cells/tissues, the activity of MMPs/ADAMs is regulated by TIMPs (51, 52). In addition, small molecule inhibitors of the hydroxamate class are also readily available for the MMP/ADAM studies in vitro and in cell-based tests (53–56). These hydroxamates chelate the active site zinc atom and inactivate MMPs/ADAMs. Using the activity assay conditions we established, we identified compounds that inhibit MPII (Fig. 5C). The GM6001 hydroxamate inhibitor (N-((2R)-2-(hydroxamidocarbonylmethyl)-4-methylpentanoyl)-l-tryptophan methylamide; potent wide-range inhibitor of multiple individual MMPs/ADAMs; known also as Galardin and Ilomastat) (57–59) was capable of inhibiting MPII (EC50 = 31 nM). However, GM6001 is ~100-fold more potent in inhibiting MMPs (e.g. MMP-2 with an EC50 = 0.4 nM) (Table 2). Additional potent, broad spectrum hydroxamate inhibitors, including AG3340/Prinomastat (3(S)-2,2-dimethyl-4-(4-(pyridin-4-yloxy)-benzenesulfonyl)-thiomorpholine-3-carboxylic acid hydroxyamide) and BB-94/Batimastat ((2R,3S)-N4-hydroxy-N1-((1S)-2-(methylamino)-2-oxo-1-(phenylmethyl)ethyl)-2-(2-methylpropyl)-3-((2-thienylthio)methyl)butanediamide), were poorly effective against MPII. In principle, medicinal chemistry structure optimization and/or screening of the hydroxamate class of inhibitors could identify more potent FRA inhibitors. In turn, TIMP-1, TIMP-2, and TIMP-3 (51), which are potent, subnanomolar MMP inhibitors, are significantly less potent inhibitors of MPII (EC50 > 700, 540, and 275 nM, respectively).
To elucidate structural elements that guide the cleavage preferences of MPII and FRA3, especially at the P2-P1′ positions, we modeled the PRPLR↓AWGAA substrate bound to FRA3 using the recently reported structure of the FRA3 proenzyme (PDB entry 3P24) as a template (30). The sequence of the peptide substrate (PRPLR↓AWGAA) was based on the FRA3 IceLogo plot (Fig. 4B). The structure of MPII was prepared by homology modeling using PDB entry 3P24 as a template. The sequence of the peptide substrate (PGRLR↓RSGAA) was based on the MPII IceLogo plot (Fig. 4B). The fold of the PGRLR↓RSGAA substrate in MPII mimics the fold of the PRPLR↓AWGAA substrate in the FRA3 structure (Fig. 6A).
It becomes evident from this modeling that the large S1′ pocket of FRA3 can accommodate both small and large, hydrophobic and hydrophilic, and positively charged residues (but not the negatively charged Asp/Glu residues). This structural feature of the catalytic groove could explain the broad specificity of FRA3 at the P1′ position. The S1′ pocket of FRA3 is organized by the side chains of Leu314, Gly344, Thr371, and Tyr370 and by the backbone of Leu365 and Tyr367. The less sizable, charged S1′ pocket of MPII is probably organized by the side chains of Ser319, Tyr347, Tyr373, and Ser374 and by the backbone of Ile321 and Leu368. Arg of the peptide substrate (and, probably, Lys as well) matches the highly negatively charged S1′ pocket of MPII. The presence of Glu276 in S1 of MPII (versus Ala276 in the corresponding position of FRA3) explains the preference of MPII (but not FRA3) for the P1 Arg. This preference, however, does not exclude hydrolysis of the substrates with P1′ residues distinct from Arg, such as Gly and Ala, both of which are accepted, albeit less efficiently, by MPII (supplemental Table S1). Self-activation of MPII at the LSSR↓A site provides additional support for the above suggestion. The extended S2 site readily accommodates the hydrophobic Leu substrate residue in both MPII and FRA3. The 2.8 Å resolution x-ray crystal structure of MPII we recently solved correlates very closely with our modeled MPII structure, thus corroborating our modeling studies.3
Because of the high sequence homology among FRA1, FRA2, and FRA3, the modeled structures of FRA1 and FRA2 are highly similar to that of FRA3 (PDB entry 3P24). There are very few substitutions in FRA1 and especially in FRA2 that may affect their substrate binding mode and cleavage preferences as compared with those of FRA3 (Fig. 6B). Thus, the presence of Asp320 and Arg357 in FRA2 (versus Asn320 and Asn357 in FRA3) may have an insignificant effect on the P2 and P4 subsite specificity. In a similar way, the presence of Lys312, Met316, Glu357, Lys277, and Phe319 in FRA1 (versus Asn312, Ile316, Asn357, Asp277, and Leu319 in FRA3) may have a limited impact on the promiscuous P1–P4 and P2′ subsites that accept multiple residue types in FRA3 (Fig. 4).
The determination of the substrate recognition specificity of a protease is a necessary first step in developing druglike antagonists. Knowledge of an optimal peptide substrate greatly facilitates many of the subsequent steps in drug development and, in some instances, can provide a lead that is usually a few steps from a drug.
Enterotoxigenic B. fragilis is the most frequent disease-causing anaerobe (13, 21, 23). The production of the secretory metalloproteinases is essential for virulence of B. fragilis. There are two distinct secretory metalloproteinase types (MPII and FRA) encoded by the pathogenicity island in enterotoxigenic B. fragilis strains (21, 60–62). FRA exists in the three highly homologous isoforms (FRA1, FRA2, and FRA3), which exhibit over 90% sequence identity. In turn, there is only a low, 25%, identity of the peptide sequence of MPII with that of FRAs (21, 26, 30). Whereas FRAs have been the focus of multiple studies by others (13, 14, 19, 21, 30, 32, 63–69), the characteristics of MPII are completely unknown.
Here, we, for the first time, extensively characterized the purified recombinant MPII and FRA3, a representative isoform of the FRA subfamily. Based on the results of the high throughput multiplexed cleavage assay we recently developed (36–38), followed by digest of the individual selected peptides combined with the mass spectrometry analysis of the cleavage reactions, we revealed that MPII strongly prefers Arg residues at both the P1 and P1′ positions. The presence of Leu at the P2 position, Arg or Gly at both the P3 and P3′ positions, and Gly at the P4 position is also characteristic of the efficient substrates of MPII. From these perspectives, MPII mimics the dibasic Arg↓Arg cleavage motif of the furin-like proprotein convertases. We suspect that MPII interferes with the processing of those membrane and soluble precursor proteins, which are incompletely processed by proprotein convertases. As a result, MPII proteolysis may result in an abnormal precursor/mature protein ratio for the multiple targets of proprotein convertase processing, leading to pathological consequences (e.g. via aberrations of the furin-regulated functions, including cell-to-cell signaling, cell movement, Rho and Notch signaling, and many others) (37). Intriguingly, MPII appears to be the first zinc metalloproteinase with a dibasic cleavage preference, suggesting a high level of versatility of metalloproteinase proteolysis. Unfortunately, the readily available pyroglutamic acid-Arg-Thr-Lys-Arg-methyl-coumaryl-7-amide and other similar furin substrates cannot be used with MPII because the Arg-methyl-coumaryl-7-amide cleavage product is as quenched as the intact substrate.
In turn, FRA3 prefers the presence of Leu and Pro at the P2 and P5 positions of the substrate peptide, respectively. Arg/Ala, Pro/Gly/Ser, and Arg/Ala/Leu are well tolerated at the P4, P3, and P1 positions, respectively. Overall, the cleavage signature of FRA3 is related to the Pro-X-X-Leu consensus cleavage motif of MMPs. Based on the recently published crystal structure of FRA3 (PDB entry 3P24) and our follow-on in silico modeling of the MPII structure, we provided structural evidence for the specificity of MPII and FRA3. Our efforts identified the commercially available fluorescence-quenched peptide 5-FAM-SLGRK↓IQIQK(QXL520)-NH2 as a substrate for the routine cleavage studies of MPII. This substrate is efficiently cleaved by MPII but not by FRA3, for which no commercially available peptide substrate currently exists. Our results, however, should allow us to develop FRA3 substrates to facilitate laboratory studies of the toxin isoforms.
We appreciate that many proteases harbor exosites that can be important elements for substrate selectivity and specificity. Likewise, structural properties of the substrate can be relevant for proper exposure of cleavage sites. Naturally, the use of short peptides in protease screenings is hampered by the lack of this structural information. Nevertheless, our results represent the first step in the right direction, and cell-based assays in the search for natural substrates of MPII in colon epithelial cells will follow shortly.
Our data suggest that because of their inefficient binding capacity, TIMPs, natural inhibitors of MMPs, cannot neutralize the FRA activity in the course of B. fragilis infection (Table 2). More efficient inhibitors might limit FRA activity in a clinically beneficial manner. Therefore, peptide substrates that mimic the cleavage preference of the FRA isoforms are urgently required for inhibitor design and the follow-on inhibitory studies.
In sum, our results may help to determine, at the molecular level, generalized mechanisms by which the microbiota contributes to inflammation and increases cancer risk in infected patients (70). Our results may be useful in the development of tools to monitor the presence of bacterial toxigenic proteinases and the bacterium itself in infected patients. Such tools may help to determine where, how, and when B. fragilis fragilysins contribute to infection, to validate the value of novel inhibitors of these proteinases, and, last, to control these proteinases in a manner that is clinically beneficial for the infected patients.
We thank all employees of Prognosys Biosciences for help and support. In particular, we are thankful to Traci Kitaoka and Erina He for assistance with DNA sequencing, Mark Chee for useful discussions and reading of the manuscript, Wayne Delport for help with sequencing data analysis, and Brittan Van Beuge for administrative support.
*This work was supported, in whole or in part, by National Institutes of Health Grants R01CA83017, R01CA157328, and R01DE022757 (to A. Y. S.), R44GM085884 (to I. A. K.), and R01GM098835 (to P. Cieplak).
This article contains supplemental Table S1.
3S. A. Shiryaev, A. G. Remacle, I. A. Kozlov, M. Perucho, P. Cieplak, A. A. Aleshin, R. C. Liddington, and A. Y. Strongin, manuscript in preparation.
2The abbreviations used are: