Two genes in the Escherichia coli genome, ypdE and ypdF, have been cloned and expressed, and their products have been purified. YpdF is shown to be a metalloenzyme with Xaa-Pro aminopeptidase activity and limited methionine aminopeptidase activity. Genes homologous to ypdF are widely distributed in bacterial species. The unique feature in the sequences of the products of these genes is a conserved C-terminal domain and a variable N-terminal domain. Full or partial deletion of the N terminus in YpdF leads to the loss of enzymatic activity. The conserved C-terminal domain is homologous to that of the methionyl aminopeptidase (encoded by map) in E. coli. However, YpdF and Map differ in their preference for the amino acid next to the initial methionine in the peptide substrates. The implication of this difference is discussed. ypdE is the immediate downstream gene of ypdF, and its start codon overlaps with the stop codon of ypdF by 1 base. YpdE is shown to be a metalloaminopeptidase and has a broad exoaminopeptidase activity.
Aminopeptidases form an abundant enzyme family in microorganisms (11), and multiple aminopeptidases are found in most sequenced microbial genomes. Aminopeptidases play key roles in protein degradation (5) and protein maturation (4), etc. The expanded families of aminopeptidases, with distinct sequence signatures and biochemical function, match the diversity in the chemical composition of the substrate peptides (18). For most aminopeptidases found by computer search, their substrate specificities have not been determined. Attempts to deduce substrate specificity on the basis of sequence similarity are hampered by the lack of clear sequence signatures that correlate with experimentally determined function.
Through computer analysis of microbial genes, we suggested the existence of segmentally variable genes (SVGs), whose products have a modular architecture composed of highly variable domains and well-conserved domains (24). The variable domains are typically over 70 amino acids (aa) in length and lack conserved sequence features that might suggest their function. Among these, we noticed one SVG family whose members all encode proteins with a conserved C-terminal domain and a variable N-terminal domain (Fig. 1). Since it includes the gene ypdF from Escherichia coli, we call it the YpdF family. In the products of these genes, the conserved C-terminal domains show strong global similarity to the 264-amino-acid-long methionine aminopeptidase (Map) in E. coli (4). The variable N-terminal domains, with average lengths of about 100 aa, do not show detectable similarity to any known domains. It is known that the protein product of E. coli map is a metalloaminopeptidase and is activated in vitro by cobalt ions (19). The presence of an extra N-terminal domain plus a characteristic C-terminal domain similar to that of Map in the YpdF family resembles the domain structure observed in one of the two methionine aminopeptidases in Saccharomyces cerevisiae (6, 16). Most genes in the YpdF family are uncharacterized, with no experimental evidence on detailed substrate specificities. It seemed possible that members of this SVG family encode similar aminopeptidase activity.
Examining the genomic context of ypdF in the E. coli genome shows that the start codon of its downstream neighboring gene, ypdE, overlaps with the stop codon of ypdF by 1 base (Fig. 2). Hence, expression of these two genes may be coupled and they may encode functionally related gene products. A similarity search reveals that ypdE homologs are present in over 60 microbial genomes and that YpdE has a subtle similarity (~25% identity) to a previously reported archaeal-type deblocking aminopeptidase (17). Like ypdF, most genes from the YpdE family remain uncharacterized.
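The 1-base overlap described above can be checked from gene coordinates alone. The following is a minimal sketch, not the procedure used in this study: the coordinates are hypothetical placeholders on the coding strand, chosen only so that the ORF lengths match the reported protein lengths (361 aa and 345 aa) plus one stop codon each, and the names `Gene` and `overlap_length` are introduced here for illustration.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # 1-based position of the first base of the start codon
    end: int    # 1-based position of the last base of the stop codon


def overlap_length(upstream: Gene, downstream: Gene) -> int:
    """Return the number of bases shared by two ORFs on the same strand."""
    shared = min(upstream.end, downstream.end) - max(upstream.start, downstream.start) + 1
    return max(0, shared)


# Hypothetical coordinates (assumption): only the lengths are taken from the text.
# ypdF: (361 codons + stop) * 3 = 1086 bases; ypdE: (345 codons + stop) * 3 = 1038 bases.
ypdF = Gene("ypdF", start=1001, end=2086)
ypdE = Gene("ypdE", start=2086, end=3123)

# The last base of the ypdF stop codon is also the first base of the ypdE start codon.
print(overlap_length(ypdF, ypdE))
```

With real annotation, the same function could be applied to each pair of adjacent same-strand genes to see whether such overlaps occur in other genomes.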
In this study, we expressed the active gene products of both ypdE and ypdF, showed that each is a metalloaminopeptidase, and examined the substrate specificity of both enzymes.
For ypdF, the coding sequence was amplified by PCR from E. coli strain K-12 MG1655 genomic DNA using the forward primer b2385f and the reverse primer b2385r (Table 1). The purified PCR product was ligated into the pET28a (Novagen) plasmid between the EcoRI and HindIII sites. For ypdE, the coding sequence was amplified by PCR using the forward primer b2384f and the reverse primer b2384r (Table 1). The purified PCR product was ligated into the pET28a (Novagen) plasmid between the PstI and HindIII restriction sites.
Two N-terminal deletions of ypdF were made by amplifying the open reading frames (ORFs) using the forward primer b2385_d1f (starting from residue 103) or b2385_d2f (starting from residue 60) and the reverse primer b2385r. We first used pET28a to express the six-His-tagged recombinant protein; however, the yield was extremely low. We then cloned the purified PCR products into the plasmid pMALc2x (New England Biolabs) to produce MBP (maltose-binding protein)-fused protein. As a control, the intact ypdF gene was also cloned into the pMALc2x plasmid. To confirm the identity of all the cloned products, the inserts in the purified expression plasmids were analyzed by DNA sequencing (New England Biolabs).
pET28a carrying an ORF encoding a six-His-tagged protein was transformed into E. coli strain ER2566 (New England Biolabs). Transformed cells were cultured in LB medium supplemented with 100 μg/ml kanamycin at 37°C to mid-log phase. Protein expression was induced with 1 mM isopropyl-β-d-thiogalactopyranoside (IPTG), and the cells were incubated at 30°C overnight. Cells from 10 ml of culture were harvested by centrifugation at 4,000 × g for 20 min and stored at −20°C for 30 min. Frozen cells were resuspended in 0.7 ml of lysis buffer (300 mM NaCl, 50 mM NaH2PO4, 10 mM imidazole, pH 8.0) and then briefly sonicated on ice. The cell lysates were centrifuged at 14,000 × g for 20 min at 4°C, and the supernatant was loaded onto an Ni-nitrilotriacetic acid column. The column was then washed twice with 10 ml of washing buffer (300 mM NaCl, 50 mM NaH2PO4, 20 mM imidazole, pH 8.0). The column-bound protein was eluted in 0.7 ml of elution buffer (300 mM NaCl, 50 mM NaH2PO4, 250 mM imidazole, pH 8.0). The final protein concentrations, as determined by Bradford assays, were approximately 0.4 μg/μl for YpdF and 0.3 μg/μl for YpdE.
For pMALc2x constructs, transformed cells were cultured in LB medium supplemented with 100 μg/ml ampicillin at 37°C to mid-log phase. The remaining purification steps follow the same procedure as described in reference 12.
Purified recombinant protein was assayed on a panel of peptides (BACHEM; New England Biolabs) to test its substrate specificity. All substrate peptides were dissolved in Tris buffer (10 mM, pH 7.6) to 1 mg/ml. Reactions were performed in a total volume of 30 μl, with 25 μl of peptide substrate, 3 μl of purified enzyme, and 2 μl of metal ion (10 mM). The reaction mixture was incubated at 37°C for 30 min and then resolved by thin-layer chromatography (TLC) on a Silica Gel 60 plate (EMD Chemicals Inc.) using ethanol-isopropanol-water at a volume ratio of 1/2.1/0.9. The plate was then sprayed with ninhydrin (0.2% in ethanol) and heated. Images of the plate were taken under UV light at 366 nm.
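For orientation, the final concentrations in the 30-μl reaction follow from simple dilution arithmetic. The sketch below assumes that the stated 10 mM refers to the metal ion stock rather than the final concentration, and that the enzyme was added at the approximate concentrations given by the Bradford assays (0.4 μg/μl for YpdF, 0.3 μg/μl for YpdE). The function name is illustrative.

```python
def final_concentration(stock: float, added_ul: float, total_ul: float) -> float:
    """Concentration after dilution, in the same unit as `stock`."""
    return stock * added_ul / total_ul


TOTAL_UL = 30.0

# 25 ul of peptide at 1 mg/ml in a 30-ul reaction.
peptide_mg_per_ml = final_concentration(1.0, 25.0, TOTAL_UL)

# 2 ul of metal ion, assuming a 10 mM stock (assumption, see lead-in).
metal_mM = final_concentration(10.0, 2.0, TOTAL_UL)

# 3 ul of enzyme at the approximate Bradford concentrations (ug/ul).
ypdf_ug_per_reaction = 0.4 * 3.0
ypde_ug_per_reaction = 0.3 * 3.0

print(f"peptide: {peptide_mg_per_ml:.2f} mg/ml")
print(f"metal ion: {metal_mM:.2f} mM")
print(f"YpdF: {ypdf_ug_per_reaction:.1f} ug, YpdE: {ypde_ug_per_reaction:.1f} ug per reaction")
```

Under these assumptions each reaction contains roughly 0.83 mg/ml peptide, 0.67 mM metal ion, and about 1 μg of enzyme.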
We used a two-step PCR procedure (12) to generate mutations in YpdF targeted at the conserved motif in the N-terminal domain. Briefly, the segment encompassing the sequence for this motif was first deleted from ypdF and then replaced with a synthetic double-stranded oligonucleotide with the desired mutations. In the first step, a pET28a plasmid with the inserted ypdF ORF was used as the template; oligonucleotides b2385f and b2385_mr were used as the forward and reverse primers to obtain the 5′ region of ypdF; oligonucleotides b2385_mf and b2385r were used as primers to obtain the 3′ region of ypdF. In the second PCR step, a 1:1 mixture of purified 5′ and 3′ PCR products from the first step was used as the template; oligonucleotides b2385f and b2385r were used as the forward and reverse primers. Compared with intact ypdF, the PCR product (ypdFΔ) from the second PCR step has a short internal segment deleted. At the same time, a new recognition sequence for PmlI (CAC^GTG) was created at the deletion site. The purified PCR product was ligated into pET28a to generate pET28a-ypdFΔ. Next, we synthesized two complementary oligonucleotides with the desired amino acid changes and ligated them into the PmlI-digested plasmid pET28a-ypdFΔ as a replacement for the deleted segment. The resulting plasmids were then analyzed by DNA sequencing after purification.
As shown in Fig. 1, members of the YpdF family have conserved C-terminal domains of about 260 aa and variable N-terminal domains of about 100 aa. The conserved C-terminal domain shows overall similarity to the 264 aa encoded by E. coli map, and all five key residues involved in metal ion binding are conserved (24) (Fig. 1).
Genes showing strong similarity to ypdF are widely distributed in other sequenced genomes. Part of the list is shown in Fig. 1. Interestingly, several genomes have multiple ypdF homologs (Fig. 1). The percent identity in the N-terminal-domain sequences of the multiple copies within the same genome is usually not high (<40%) and much less than the percent identity in the C-terminal-domain sequences, suggesting that these duplicated genes may have diverged.
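The domain-level comparison above rests on percent identity computed separately for the N-terminal and C-terminal regions of an alignment. The text does not state the exact convention used, so the sketch below adopts one common choice as an assumption: identity is the fraction of identical residues among alignment columns in which neither sequence has a gap. The toy sequences are invented for illustration and are not YpdF sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy aligned fragments (hypothetical): 6 ungapped columns, 4 identical.
identity = percent_identity("MKT-LLA", "MRTALLG")
print(f"{identity:.1f}% identity")
```

Applying the same function to the N-terminal and C-terminal slices of an alignment of two paralogs would give the two values being contrasted.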
YpdE in E. coli shows moderate similarity (~24% identity) to the previously reported metal ion-dependent deblocking aminopeptidase (PH0519) in the archaeon Pyrococcus horikoshii (1, 17) and to an aminopeptidase in Haloarcula marismortui (~21% identity in an ~150-aa region) (8). It is known that PH0519 shows broad aminopeptidase activity on nonblocked peptides and on peptides blocked by an acyl group (1). All three residues which are suggested to be involved in metal ion binding (17) are conserved (data not shown). However, computational analysis suggests that YpdE is not clearly orthologous to PH0519 because there exist multiple ypdE paralogs in both the P. horikoshii and E. coli genomes. Although the ypdE and ypdF ORFs overlap by 1 base in E. coli (including strains K-12, CFT073, and O157:H7), this feature seems unique to E. coli and is not found in closely related species such as Salmonella enterica serovar Typhi or Shigella flexneri. In the E. coli genome, the genes flanking ypdEF (Fig. 2) encode a sugar phosphotransferase (PTS) system that is responsible for carbohydrate transport (9).
We expressed the active recombinant YpdE and YpdF proteins in E. coli. The six-His-tagged proteins were purified by one-step chromatographic procedures using an Ni-nitrilotriacetic acid column. YpdF has 361 aa with a calculated molecular mass of 39.6 kDa, and YpdE has 345 aa with a molecular mass of 37.4 kDa. The purified recombinant YpdF and YpdE proteins exhibit single bands on sodium dodecyl sulfate-polyacrylamide gels with estimated molecular masses of 41 kDa and 38 kDa (data not shown).
We found that YpdF has limited methionyl aminopeptidase activity when it was tested on a variety of peptides starting with l-methionine (Fig. 3a). A TLC assay shows that YpdF is capable of hydrolyzing the N-terminal methionine when the next amino acid is alanine, proline, or serine (Fig. 3a). It does not show detectable methionine-releasing activity when the second amino acid is glycine, lysine, or leucine, etc., among all tested peptides (Fig. 3a and Table 2) or when the first methionine residue is modified as formylmethionine or acetylmethionine (Table 2). Compared with the original methionyl aminopeptidase in E. coli (4), YpdF has a much narrower specificity in methionine cleavage. The map product in E. coli exhibits higher methionyl aminopeptidase activity when the second amino acid is glycine, alanine, proline, or serine, etc. (14). However, YpdF does not exhibit a similar property; for instance, it does not cleave methionine when the second amino acid is glycine (Fig. 3a).
Figure 3a reveals that the substrate preference of YpdF for methionyl aminopeptidase activity is Pro > Ala > Ser. We then tested if YpdF is able to cleave when residues other than methionine precede the second proline. As shown in Fig. 3b and Table 2, it is able to hydrolyze the Xaa-Pro peptide bond when the first amino acid is alanine (A), asparagine (N) (Fig. 3b), or methionine (M) (Fig. 3a) but not others (Table 2). Previous work has shown that aminopeptidase P (PepP) in E. coli is able to cleave essentially all Xaa-Pro peptides (23). Compared with E. coli PepP, YpdF has limited Xaa-Pro specificity. Other than the methionyl and Xaa-Pro aminopeptidase activity, YpdF does not show aminopeptidase activity on other peptides (Table 2), e.g., Ala-Ala-Ala or Ser-Ser-Ser, suggesting that it is not an aminopeptidase with broad specificity.
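The qualitative YpdF specificity reported above can be restated as a small lookup rule. This is only a summary of the outcomes described in the text for unblocked peptides, not a predictive model: it says nothing about rates, it does not cover residues or peptides that were not tested, and the function and constant names are introduced here for illustration.

```python
# One-letter codes. Second residues that allowed release of N-terminal Met.
MET_XAA_SECOND = {"A", "P", "S"}
# First residues that allowed cleavage of the Xaa-Pro bond.
XAA_PRO_FIRST = {"A", "N", "M"}


def ypdf_releases_first_residue(peptide: str) -> bool:
    """True if the text reports release of the N-terminal residue for this pattern."""
    if len(peptide) < 2:
        return False
    first, second = peptide[0], peptide[1]
    if first == "M" and second in MET_XAA_SECOND:
        return True  # limited methionyl aminopeptidase activity
    if second == "P" and first in XAA_PRO_FIRST:
        return True  # limited Xaa-Pro aminopeptidase activity
    return False


for peptide in ["MAS", "MPG", "MGMM", "AAA", "SSS"]:
    print(peptide, ypdf_releases_first_residue(peptide))
```

Written this way, the two activities visibly intersect at Met-Pro, the preferred substrate in the Pro > Ala > Ser ordering.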
We tested if YpdE has deblocking activity against an array of acetyl or formyl group-blocked oligopeptides using TLC and matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry (data not shown). However, no detectable deblocking aminopeptidase activity was observed. Instead, we noticed that YpdE has a broad aminopeptidase activity on nonblocked peptides by progressively cleaving amino acids off the peptide substrate (Fig. 4). Its aminopeptidase activity stops at the residue before the first proline in the peptide, which was also observed for the related deblocking aminopeptidase homolog in archaea (1). This may be due to the unique imide bond in proline, which restricts the overall N-terminal conformation of the oligopeptides and disrupts the contacts between residues and active sites of the enzyme. From a panel of oligopeptides we examined by MALDI-TOF assay, we observed that YpdE can cleave most amino acids in the N terminus of the peptide (Table 2). Note that it does not cleave when proline is the first N-terminal residue (Table 2).
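The stepwise behavior described for YpdE can be illustrated with a toy model. It assumes one reading of the text: residues are removed one at a time from the N terminus, digestion halts when the residue at position 2 is proline (so the residue preceding the first proline is retained), and nothing is released when proline is at position 1. Real digestion also depends on peptide length, metal ion, and time, none of which is modeled; the function name is hypothetical.

```python
from typing import List, Tuple


def ypde_digest(peptide: str) -> Tuple[List[str], str]:
    """Return (released residues in order, remaining peptide) under the toy rules."""
    released: List[str] = []
    remaining = peptide
    while len(remaining) > 1:
        # Stop at an N-terminal proline or at the residue just before a proline.
        if remaining[0] == "P" or remaining[1] == "P":
            break
        released.append(remaining[0])
        remaining = remaining[1:]
    return released, remaining


print(ypde_digest("MAS"))    # tripeptide without proline
print(ypde_digest("MASPG"))  # hypothetical peptide with an internal proline
print(ypde_digest("PGA"))    # hypothetical peptide starting with proline
```

For Met-Ala-Ser the model releases Met and Ala and leaves Ser, which corresponds to three free amino acids in total.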
The effects of Co2+, Mn2+, Mg2+, Ni2+, Ca2+, Fe2+, and Zn2+ on both YpdF and YpdE using the substrate Met-Ala-Ser were examined, as shown in Fig. 5. Mn2+ and Ni2+ can both substitute for Co2+ in assays for YpdF, while other divalent metal ions cannot. When metal ions are not present, YpdF loses its activity (Fig. 5a, lane 12).
In Fig. 5b, we show that YpdE can be activated by Co2+, Ni2+, Mn2+, and Cu2+. Notice that the substrate used in Fig. 5b is Met-Ala-Ser. YpdE is able to completely digest this tripeptide into individual amino acids in the presence of Co2+ and displays three spots on a TLC plate (Fig. 5b, lane 5).
We asked whether the conserved C-terminal domain in YpdF is a stand-alone domain for the aminopeptidase activity. We then constructed two forms of YpdF with the N-terminal domain completely or partially deleted (Fig. 1a). We expressed and purified them as MBP (maltose-binding protein) fusion proteins (unpublished data). For control purposes, we also expressed the intact ypdF and purified the product as an MBP fusion protein. Purified MBP-YpdF shows the same activity as six-His-tagged YpdF (data not shown). However, YpdF with the N-terminal domain completely or partially deleted (Fig. 1a) loses its original aminopeptidase activity (data not shown). This suggests that the N-terminal domain is essential for the in vitro function of YpdF. However, we cannot rule out the possibility that the loss of the N terminus disrupts the overall folding of the protein.
Using the motif finding program MEME (2) and the N-terminal sequences extracted from YpdF and its homologs, we found a short motif conserved within a subgroup of the YpdF homologs from a number of distantly related species (Fig. 6a). In the motif, two sites that are completely conserved are a negatively charged aspartate (D) and a positively charged arginine (R) (Fig. 6b). Another site appears to be dominated by aromatic residues: tyrosine (Y) or phenylalanine (F) (Fig. 6b). The fact that this motif is conserved across a diverse group of species in YpdF homologs suggests that it might be functionally important. We did not find any match to motifs of known function in motif databases. To investigate the possible role that this motif might play, we expressed a mutated YpdF protein in which the D, R, and Y sites were changed to A (DRY2A) and compared the mutated YpdF protein with the wild type. We found that the substrate specificity profile does not change between the wild type and this mutant. However, the mutated YpdF protein appears to have lost its methionyl aminopeptidase activity against the substrate Met-Ala-Ser while the activity against Met-Pro-Gly remains at the same level (data not shown). More work is needed to examine the functional role of this motif within the N terminus.
We have studied two new aminopeptidases in E. coli. YpdF has limited aminopeptidase activity on peptide substrates starting with Met-Xaa or Xaa-Pro but not on other peptides. It has been suggested that the E. coli methionyl aminopeptidase and proline aminopeptidase PepP belong to the same family and adopt a similar fold (3). YpdF further reveals an inherent relationship between the two.
During previous screens for methionyl aminopeptidase activity, YpdF may have been missed because of the limited range of peptides used (4, 22). Ben-Bassat et al. (4) used the peptide Met-Gly-Met-Met to screen an E. coli clone library, but this is not a substrate for YpdF. Similarly, polyproline was used in the screen for Xaa-Pro aminopeptidases (22). Again, this is not a substrate for YpdF because YpdF cannot process peptides with an N-terminal proline.
The cellular roles of both YpdF and YpdE remain elusive. In a recent study (7), neither mRNA nor protein product of ypdF and ypdE was detected in the crude cell lysate of E. coli. However, the fact that both genes are evolutionarily conserved across distantly related species suggests they may be functionally important (15). Among all substrates tested, the longest peptide that YpdE can digest has 18 residues and the shortest has 2. It has a broad aminopeptidase activity and therefore could be involved in the ATP-independent downstream processing in cytosolic protein degradation pathways.
The segmentally variable feature of the YpdF family suggests that the attachment of an extra N-terminal domain may provide a convenient way of evolving new functions from existing protein scaffolds. If this is generally true, we may see it in other peptidase families in the database. The exact function of the N-terminal domains of YpdF and its homologs remains elusive. The N-terminal domains of methionyl aminopeptidases in S. cerevisiae harbor zinc finger motifs and short stretches of basic amino acids, which are suggested to be responsible for binding to the ribosome (20). However, in YpdF and its homologs, we did not observe a similar compositional bias in the N termini. E. coli PepP is similar to the type Ib methionyl aminopeptidase in domain structure but has a much longer N terminus than that in most type Ib methionyl aminopeptidases. Could the length of the N-terminal domains be the leading factor that distinguishes the two, or have all of the PepP proteins, which seem to have evolved from the methionyl aminopeptidase family, preserved some of the methionyl aminopeptidase activity? This should be tested experimentally. The crystal structure is available for aminopeptidase P, and it forms a tetramer (dimer of dimers) (21). Part of the N-terminal domain is responsible for contacts between subunits. Again, this may suggest another possible function of the N terminus of the YpdF family of proteins.
Y.Z. thanks Shelley Cushing and Jack Benner at NEB for help with MALDI-TOF experiments and David Landry at NEB for help with TLC analysis. We thank two anonymous reviewers for helping us interpret our experimental results.
This work was supported by New England Biolabs Inc.