Homology models were constructed for over 100 proteins in the MLE subgroup, including 82 sequences that clustered with the experimentally characterized E. coli and B. subtilis AEEs, by using the Protein Local Optimization Program, starting from a multiple sequence alignment, as described in Experimental Procedures. Only 65 of these proteins are shown in the phylogenetic tree (), which shows a representative subset sharing < 60% sequence identity. The template for all of the homology models was 1TKK, the AEE from B. subtilis in complex with the L-Ala-L-Glu substrate. Apo structures are also available for both the B. subtilis and E. coli AEEs, but the binding sites are partially open, making them poorly suited to our purposes (Kalyanaraman et al., 2005).
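How the representative subset sharing < 60% sequence identity was selected is not described here. As a rough illustration only, the sketch below shows one common way such a filter could be applied to an existing alignment: a greedy pass that keeps a sequence only if it falls below the cutoff against everything already kept. The function names, the definition of identity (over columns where both sequences have a residue), and the toy sequences are all assumptions made for this example.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def representative_subset(aligned: dict[str, str], cutoff: float = 60.0) -> list[str]:
    """Greedily keep a sequence only if it is < cutoff % identical to all kept ones."""
    kept: list[str] = []
    for name, seq in aligned.items():
        if all(percent_identity(seq, aligned[k]) < cutoff for k in kept):
            kept.append(name)
    return kept


# Toy alignment: made-up sequences, not taken from the study.
toy = {
    "seqA": "MKLAVDG-TR",
    "seqB": "MKLAVDGSTR",  # identical to seqA at every shared column, so dropped
    "seqC": "MRIGT-GQWE",  # divergent from seqA, so kept
}
print(representative_subset(toy))
```

A greedy filter depends on input order, so a different ordering of the same sequences can yield a different, equally valid subset.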
Phylogenetic Tree of a Representative Subset of the Dipeptide Epimerase Group of the Enolase Superfamily
In general, one challenge associated with metabolite virtual screening is that existing metabolite libraries are undoubtedly incomplete (for example, the specific dipeptides that are shown to be substrates here are not included in KEGG). We hypothesized that all of the proteins considered in this study () were likely to be dipeptide epimerases, based on a phylogenetic tree of a larger subgroup in which the AEEs form a single clade (Glasner et al., 2006), and the conservation of catalytic residues and a DxD motif involved in binding the NH₃⁺ terminus of the dipeptide in the AEEs. Accordingly, we restricted the virtual screening to the 400 possible L/L dipeptides. For computational efficiency, the protein was treated as rigid.
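The figure of 400 follows from pairing each of the 20 standard amino acids at the N-terminal position with each of the 20 at the C-terminal position. A minimal sketch of enumerating such a library is given below; the naming scheme and the helper names are illustrative assumptions, and no ligand preparation or 3D structure generation is attempted.

```python
from itertools import product

# The 20 standard amino acids, three-letter codes.
AMINO_ACIDS = [
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
]


def label(residue: str) -> str:
    """Prefix with 'L-' except for glycine, which is achiral."""
    return residue if residue == "Gly" else f"L-{residue}"


def ll_dipeptides() -> list[str]:
    """All ordered (N-terminal, C-terminal) pairs of standard amino acids."""
    return [f"{label(n)}-{label(c)}" for n, c in product(AMINO_ACIDS, repeat=2)]


library = ll_dipeptides()
print(len(library))              # 20 * 20 ordered pairs
print("L-Ala-L-Glu" in library)  # the known AEE substrate is part of the library
```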
In control docking calculations with the B. subtilis AEE structure, L-Ala-L-Glu ranked 8 out of the 400 dipeptides; most of the other top-ranked dipeptides also had Glu or Asp at the epimerized position, and a small amino acid (Gly, Cys, Ser, Ala) in the first position (). Docking against a homology model of the E. coli AEE (32% sequence identity) led to similar results. It should be noted that both the E. coli and B. subtilis AEEs also epimerize, albeit with slower kinetics, dipeptides other than L-Ala-L-Glu, which is believed to be the physiologically relevant substrate. For example, both epimerize L-Ser-L-Glu and L-Ala-L-Met, and the E. coli AEE, which is the less specific of the two, epimerizes substrates such as L-Ala-L-His and L-Ala-L-Gln (Schmidt et al., 2001). Kinetic constants have been measured for only selected substrates, but suggest roughly 1 order of magnitude slower kinetics for nonphysiological substrates; i.e., kcat/KM for epimerizing L-Ala-D-Glu is 7.7 × 10⁴ M⁻¹ s⁻¹ and 4.7 × 10⁴ M⁻¹ s⁻¹ for the E. coli and B. subtilis AEEs, respectively, whereas the corresponding rates for L-Ala-D-Met are 2.8 × 10³ M⁻¹ s⁻¹ and 2.2 × 10³ M⁻¹ s⁻¹.
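The "roughly 1 order of magnitude" comparison can be checked with a few lines of arithmetic. The sketch below assumes the quoted rate values are to be read as powers of ten (7.7 × 10^4, 4.7 × 10^4, 2.8 × 10^3, 2.2 × 10^3) and are on a common scale; the dictionary layout and variable names are illustrative only.

```python
import math

# Rate values quoted in the text for each enzyme (read as powers of ten).
rates = {
    "E. coli AEE": {"L-Ala-D-Glu": 7.7e4, "L-Ala-D-Met": 2.8e3},
    "B. subtilis AEE": {"L-Ala-D-Glu": 4.7e4, "L-Ala-D-Met": 2.2e3},
}

fold_change = {}
for enzyme, r in rates.items():
    # Ratio of the physiological substrate to the nonphysiological one.
    fold = r["L-Ala-D-Glu"] / r["L-Ala-D-Met"]
    fold_change[enzyme] = fold
    print(f"{enzyme}: {fold:.1f}-fold, {math.log10(fold):.2f} log10 units")
```

Both ratios come out between 20-fold and 30-fold, that is, a little more than one log unit.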
Top L/L Dipeptides from Docking against the Homology Models of TM0006, the E. coli AEE, and Four Other Representative Proteins, as Well as the Template Used for Those Models
For most of the other homology models, especially those clustering relatively closely with the E. coli and B. subtilis AEEs, the docking results were similar to those obtained with the E. coli and B. subtilis AEEs, and thus consistent with the AEE activity. That is, the top hits were dominated by compounds with small amino acids in the first position, and negatively charged amino acids in the second, epimerized position. Four representative examples are shown in .
However, for ~20 of the proteins, the predicted specificities were dramatically different. Two major classes of novel predicted specificity were observed: a small number of enzymes (6) were predicted to epimerize positively charged dipeptides, and a somewhat larger number (~15) were predicted to epimerize hydrophobic (in both C- and N-terminal positions) dipeptides. Of these, we have obtained extensive experimental results (kinetics and multiple crystal structures) for the protein from Thermotoga maritima (gi:15642781, TM0006), confirming the computational predictions. Screening and structural studies are underway for several others, and those studies will be reported in due course.
The docking results for the homology model of TM0006, which shares 27% sequence identity with the B. subtilis AEE, are shown in . In the C-terminal, epimerized position, the docking results suggested selectivity for primarily aromatic, hydrophobic amino acids, instead of the strong selectivity for Glu in B. subtilis AEE. In the N-terminal position, top hits included Ser/Thr/Cys as well as larger hydrophobic amino acids such as Ile.
Experimental screening of L/L dipeptide libraries by mass spectroscopy (MS) confirmed the specificity switch (). In the Gly-Xxx, Ala-Xxx, and Thr-Xxx libraries, the best substrates had Phe, Tyr, or Trp in the epimerized position. Aliphatic side chains (Met, Leu, Ile) were also tolerated, and Ala-His and Thr-His were good substrates. In the N-terminal position, any hydrophobic amino acid was tolerated in the Xxx-Phe, Xxx-Tyr, and Xxx-His libraries. Dipeptides with charged or most polar amino acids in the first position were usually poor substrates. Furthermore, the enzyme displayed no detectable muconate lactonizing enzyme (MLE) activity (results not shown), demonstrating that the GenBank and UniProt/TrEMBL annotations are incorrect. AEEs are part of a larger subgroup within the enolase superfamily, whose members are more similar to each other than to members of other subgroups within the superfamily. The known functions of the subgroup are MLE, o-succinylbenzoate synthase, and racemization of N-succinyl or N-acetyl amino acids. No activity for any of these other functions assigned within the MLE subgroup was observed (data not shown).
Experimental Screening of TM0006 with L/L Dipeptides by Mass Spectroscopy to Detect Incorporation of Deuterium as a Result of Epimerization
Although MS screening of dipeptide libraries allowed us to evaluate multiple substrates simultaneously and efficiently, it provided only a rough approximation of activity. However, taking the MS screening results as a whole allowed us to prioritize substrates for full kinetic assays. Kinetic constants were determined for selected dipeptide substrates by observing the change in optical rotation by polarimetry (). Using the E. coli and B. subtilis AEEs as standards, we expected that authentic substrates would exhibit values of kcat/KM in the 10⁴ M⁻¹ s⁻¹ range (Schmidt et al., 2001). Of the L-Ala-L-Xxx dipeptides assayed, L-Ala-L-Phe and L-Ala-L-His displayed values of (1.2 ± 0.2) × 10⁴ M⁻¹ s⁻¹ and (1.3 ± 0.6) × 10⁴ M⁻¹ s⁻¹, respectively. Although generally grouped with polar amino acids, histidine is also aromatic. Likewise, we found that L-Ala-L-Tyr was also epimerized with an appreciable efficiency of (9.1 ± 0.8) × 10³ M⁻¹ s⁻¹. In order to minimize the possibility that the authentic substrate was overlooked during mass spectroscopic screening, additional L-Ala-L-Xxx dipeptides were characterized. These dipeptides were specifically chosen to systematically sample the different classes of amino acid side chains in the second position, regardless of apparent turnover in MS assays. We found that L-Ala-L-Glu and L-Ala-L-Leu were epimerized with values of kcat/KM of (4.9 ± 1) × 10³ M⁻¹ s⁻¹ and (3.8 ± 1) × 10³ M⁻¹ s⁻¹, respectively. These results indicate that, although not optimal, negatively charged and aliphatic side chains can also be accommodated in the C-terminal position. Finally, low turnover of L-Ala-L-Lys, (3.6 ± 0.2) × 10² M⁻¹ s⁻¹, indicates that a positively charged group in the epimerized position is detrimental.
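To make the spread among the L-Ala-L-Xxx substrates easier to see, the values quoted above can be ranked and expressed relative to L-Ala-L-Phe. This is a bookkeeping sketch only: it assumes the quoted numbers are read as powers of ten and share one scale, it ignores the reported uncertainties, and the variable names are illustrative.

```python
# Central values quoted in the text for TM0006 (uncertainties omitted).
efficiency = {
    "L-Ala-L-Phe": 1.2e4,
    "L-Ala-L-His": 1.3e4,
    "L-Ala-L-Tyr": 9.1e3,
    "L-Ala-L-Glu": 4.9e3,
    "L-Ala-L-Leu": 3.8e3,
    "L-Ala-L-Lys": 3.6e2,
}

reference = efficiency["L-Ala-L-Phe"]

# Sort from most to least efficiently epimerized substrate.
ranked = sorted(efficiency.items(), key=lambda item: item[1], reverse=True)

relative = {name: value / reference for name, value in ranked}
for name, value in ranked:
    print(f"{name:12s} {value:9.1e}  {relative[name]:.2f} x L-Ala-L-Phe")
```

Because the reported uncertainties for L-Ala-L-Phe and L-Ala-L-His overlap, the ordering of those two substrates should not be over-interpreted.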
Kinetic Constants Obtained for Epimerization of Selected Dipeptide Substrates of TM0006
Kinetic constants were also determined for selected compounds from the L-Xxx-L-Phe and L-Xxx-L-His series. Although most of the dipeptides analyzed could serve as substrates, none had kinetic constants that approached the values of kcat/KM of 10⁴ M⁻¹ s⁻¹ observed for dipeptides with L-Ala in the first position. Some compounds such as L-Phe-L-Phe exhibited low values of kcat (0.21 s⁻¹), whereas others such as L-Lys-L-Phe and L-Ile-L-Phe had high values of KM. No detectable activity was observed for epimerization of L-Asp-L-Phe. Although L-Ala-L-His was a favored substrate with the value of kcat/KM essentially the same as that for L-Ala-L-Phe, other L-Xxx-L-His dipeptides were problematic substrates, with either no activity, inability to reach saturation, or evidence of substrate inhibition (). Taken together, the results support L-Ala as the optimal N-terminal residue.
Although the kinetic parameters determined for L-Ala-L-Phe, L-Ala-L-Tyr, and L-Ala-L-His at room temperature are in the range we expected for an authentic dipeptide epimerase, T. maritima is a hyperthermophile whose optimal growth occurs at 80°C. Although we were unable to perform the assays at this temperature, we were able to examine epimerization of L-Ala-L-Phe at 40°C and 50°C; the values of kcat/KM were found to be (4.1 ± 0.9) × 10⁴ M⁻¹ s⁻¹ and (5.4 ± 0.4) × 10⁴ M⁻¹ s⁻¹ at 40°C and 50°C, respectively. The values of kcat roughly double with each 10°C increase (from 16 ± 7 s⁻¹ at 28°C to 35 ± 6 s⁻¹ at 40°C, and to 76 ± 20 s⁻¹ at 50°C). From these results we conclude that the measured kinetic parameters likely underestimate the physiological efficiency of the enzyme. The physiologically relevant substrate is currently unknown, but we consider L-Ala-L-Phe, L-Ala-L-Tyr, and L-Ala-L-His to be the most likely candidates based on their kinetic constants.
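The statement that kcat about doubles per 10°C can be expressed as a temperature coefficient, Q10 = (k2/k1)^(10/(T2 − T1)). The sketch below applies this standard formula to the central kcat values quoted above; it ignores the reported uncertainties, which are large, and the function and variable names are illustrative assumptions.

```python
def q10(k1: float, t1: float, k2: float, t2: float) -> float:
    """Temperature coefficient: fold change in rate per 10 degree increase."""
    if t2 == t1:
        raise ValueError("temperatures must differ")
    return (k2 / k1) ** (10.0 / (t2 - t1))


# Central kcat values (s^-1) for L-Ala-L-Phe at each assay temperature (deg C).
kcat = {28.0: 16.0, 40.0: 35.0, 50.0: 76.0}

q_28_40 = q10(kcat[28.0], 28.0, kcat[40.0], 40.0)
q_40_50 = q10(kcat[40.0], 40.0, kcat[50.0], 50.0)
q_28_50 = q10(kcat[28.0], 28.0, kcat[50.0], 50.0)

print(f"Q10 (28 to 40 C): {q_28_40:.2f}")
print(f"Q10 (40 to 50 C): {q_40_50:.2f}")
print(f"Q10 (28 to 50 C): {q_28_50:.2f}")
```

All three estimates are close to 2, consistent with a doubling per 10°C over the measured range; no extrapolation to 80°C is attempted here.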
The homology model revealed the structural basis for the change in specificity (). One critical determinant of specificity in the B. subtilis and closely related AEEs is Arg24, which coordinates the Glu side chain of the L-Ala-L-Glu ligand. The corresponding residue in TM0006 is Ser25 (). Other members of the dipeptide epimerase group also have substitutions at this position, including the E. coli AEE, which has Gly24 at the equivalent position. The specificity for Glu in E. coli AEE and related proteins is provided by Arg and Lys side chains at other positions within the same pocket. The pocket in TM0006, however, is primarily hydrophobic, accounting for the change in specificity. With respect to the N-terminal position of the substrate, the ability to accommodate side chains larger than Ala/Ser/Thr is conferred in part by the substitution of Gly294 at the position equivalent to Ile298 in the B. subtilis AEE.
Stereo View Depictions of the Dipeptide-Binding Site in the B. subtilis AEE and TM0006
Portions of the Multiple Sequence Alignment of TM0006, Several of Its Closest Homologs Based on the Phylogenetic Tree, and the E. coli and B. subtilis AEEs
The crystal structure of TM0006 was subsequently determined as an apo structure as well as in complex with L-Ala-L-Phe, L-Ala-L-Leu, and L-Ala-L-Lys, at 1.9–2.3 Å resolution (). During the preparation of this manuscript, an apo structure for an ortholog of TM0006 was released in the PDB (2ZAD; currently unpublished). This structure was not available when this work was performed, and it agrees closely with the apo structure determined here. The experimentally determined structure of the L-Ala-L-Phe complex is superimposed on the model generated by homology modeling and docking in . The experimental structure confirmed the proposed binding mode; the ligands superimpose almost perfectly. The positions of most of the protein side chains in the immediate vicinity of the ligand were also predicted accurately, reflecting no major errors in the sequence alignment used to generate the homology model. The greatest discrepancy is between the predicted and observed position of Arg54, which forms a salt-bridging interaction with Glu242 in the crystal structure. In the computational model, Arg54 is swung out into solution. This error may be due to a slight shift in the backbone near Arg54 between the homology model and the crystal structure, to a limitation of the energy function used for constructing the homology model, or both. Arg54 may play some role in substrate specificity, because it comes within 4 Å of the Phe side chain of the dipeptide ligand in the crystal structure, possibly forming a favorable cation-pi interaction.
Overview of Structures of TM0006 Obtained by X-Ray Crystallography
Superposition of the Models of L-Ala-L-Phe Bound to TM0006, Based on Homology Modeling and Docking and Crystallography
The active sites of the other holo structures are shown in Supplemental Data (available online). The complex with L-Ala-L-Lys was determined to elucidate the structural basis for the relatively slow but detectable epimerization for this dipeptide, which is positively charged, in contrast to most of the other substrates, which are hydrophobic. The structure of the complex of TM0006 with L-Ala-L-Lys reveals that the positively charged nitrogen of the Lys side chain extends slightly out of the binding pocket through a narrow opening and is coordinated by water molecules.