Aminopeptidases form an abundant enzyme family in microorganisms (11), and most sequenced microbial genomes encode multiple aminopeptidases. Aminopeptidases play key roles in processes such as protein degradation (5) and protein maturation (4). The expanded families of aminopeptidases, with distinct sequence signatures and biochemical functions, match the diversity in the chemical composition of their substrate peptides (18). For most aminopeptidases identified by computational searches, substrate specificity has not been determined. Attempts to deduce substrate specificity from sequence similarity are hampered by the lack of clear sequence signatures that correlate with experimentally determined function.
Through computer analysis of microbial genes, we proposed the existence of segmentally variable genes (SVGs), whose products have a modular architecture composed of highly variable domains and well-conserved domains (24). The variable domains are typically over 70 amino acids (aa) in length and lack conserved sequence features that might suggest their function. Among these, we noticed one SVG family whose members all encode proteins with a conserved C-terminal domain and a variable N-terminal domain (Fig. 1). Since it includes the gene ypdF from Escherichia coli, we call it the YpdF family. In the products of these genes, the conserved C-terminal domains show strong global similarity to the 264-amino-acid methionine aminopeptidase (Map) of E. coli. The variable N-terminal domains, with an average length of about 100 aa, show no detectable similarity to any known domain. The protein product of E. coli map is a metalloaminopeptidase that is activated in vitro by cobalt ions (19). The combination of an extra N-terminal domain with a characteristic Map-like C-terminal domain in the YpdF family resembles the domain structure of one of the two methionine aminopeptidases in Saccharomyces cerevisiae. Most genes in the YpdF family are uncharacterized, and there is no experimental evidence for their substrate specificities. It seemed possible that members of this SVG family encode a similar aminopeptidase activity.
FIG. 1. Schematic alignment of YpdF and its homologs in several selected completely sequenced microbial species. Black boxes represent conserved blocks reported by BLOCKS (13) and are numbered I to VII according to their sequential order. Conserved residues involved ...
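The idea of a segmentally variable gene can be illustrated with a minimal sketch that scans a per-column conservation profile for long, poorly conserved stretches. This is an illustration only and not the procedure of reference 24: the function name `variable_segments`, the conservation scale (0 to 1), and the threshold of 0.5 are assumptions introduced here. Only the minimum length of 70 aa comes from the text, and the toy profile merely mimics a variable N-terminal region of about 100 aa followed by a conserved C-terminal region.

```python
def variable_segments(conservation, threshold=0.5, min_len=70):
    """Return (start, end) pairs (0-based, end exclusive) for runs of
    alignment columns whose conservation score is below `threshold`
    and whose length is at least `min_len` columns.

    `threshold` is an assumed cutoff; `min_len` follows the ~70 aa
    lower bound quoted for variable domains.
    """
    segments = []
    run_start = None
    for i, score in enumerate(conservation):
        if score < threshold:
            if run_start is None:
                run_start = i  # a poorly conserved run begins here
        else:
            if run_start is not None and i - run_start >= min_len:
                segments.append((run_start, i))
            run_start = None
    # Close a run that extends to the end of the profile.
    if run_start is not None and len(conservation) - run_start >= min_len:
        segments.append((run_start, len(conservation)))
    return segments


# Toy profile (hypothetical values): 100 variable columns, then 260 conserved ones.
profile = [0.2] * 100 + [0.9] * 260
print(variable_segments(profile))
```

Short poorly conserved stretches, such as loops between conserved blocks, fall below `min_len` and are not reported, which is the property that separates a variable domain from ordinary local divergence in this sketch.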
Examination of the genomic context of ypdF in the E. coli genome shows that the start codon of its downstream neighbor, ypdE, overlaps the stop codon of ypdF by 1 base (Fig. 2). Hence, expression of these two genes may be coupled, and they may encode functionally related products. A similarity search reveals that ypdE homologs are present in over 60 microbial genomes and that YpdE has weak similarity (~25% identity) to a previously reported archaeal-type deblocking aminopeptidase (17). Like ypdF, most genes in the YpdE family remain uncharacterized.
FIG. 2. Genomic region in E. coli that includes the two genes studied in this paper (ypdE and ypdF, shaded). The numbers shown between the genes are intergenic distances in base pairs. The arrowheads of the boxes indicate the direction of transcription. ypdE ...
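A 1-base overlap between a stop codon and the following start codon can be checked from gene coordinates, as in the minimal sketch below. The helper names, the 1-based coordinate convention, and the toy sequence are assumptions made for illustration; the toy junction is not the actual ypdF/ypdE sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # common bacterial start codons


def overlap_length(upstream_end, downstream_start):
    """Number of bases shared by two genes on the same strand.

    upstream_end:     1-based position of the last base of the upstream stop codon
    downstream_start: 1-based position of the first base of the downstream start codon
    Returns 0 when the genes do not overlap.
    """
    return max(0, upstream_end - downstream_start + 1)


def junction_is_consistent(seq, upstream_end, downstream_start):
    """Check that the annotated coordinates really point at a stop and a start codon."""
    stop = seq[upstream_end - 3:upstream_end]
    start = seq[downstream_start - 1:downstream_start + 2]
    return stop in STOP_CODONS and start in START_CODONS


# Hypothetical junction: stop codon TGA at positions 4-6, start codon ATG at 6-8.
toy = "GCTTGATGAAA"
print(overlap_length(6, 6), junction_is_consistent(toy, 6, 6))
```

In the toy junction, the final A of TGA is also the first base of ATG, so the two genes share exactly one base and are read in different frames.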
In this study, we expressed active gene products of both ypdE and ypdF and showed that each gene encodes a metalloaminopeptidase. We also examined the substrate specificities of both enzymes.