We combined bioinformatics and experimental techniques to assess enzymatic properties and a possible physiological role for the TM1585 protein from T. maritima, the first glycerate 2-kinase of the GK-II family with a reported 3D structure (29). Members of the GK-II family are found in many diverse bacterial species, as well as in Archaea and Eukaryota, including mammals (Fig. ). GK-II is structurally distinct from the other two described families, GK-I (2), the most abundant family in bacteria, including E. coli, and GK-III, characteristic of plants (1) and cyanobacteria. Some representatives of the GK-II family, including GckA from M. extorquens (3), human GLYCK (9), and the enzyme from the archaeon P. torridus (24), have been characterized. Nevertheless, many GK-II family members in diverse bacterial genomes still have a misleading HPR annotation, a typical example of the error propagation in genomic databases originating from the incorrect functional assignment of the TtuD protein in A. vitis (4).
The detailed enzymatic analysis of the purified recombinant TM1585 presented here allowed us to rule out HPR activity and to confirm its GK activity with physiologically relevant steady-state kinetic parameters. The stringent reaction specificity observed for the TM1585 enzyme, namely the exclusive formation of 2PG (and not 3PG) as the product, together with previous reports for other GK-II family members, supports the unambiguous glycerate 2-kinase assignment for the entire family. Although the two other families, GK-I and GK-III, were originally described as 3PG-forming enzymes, recent data indicate that the GK-I family may, in fact, display a glycerate 2-kinase reaction specificity (12). The latter observation is consistent with the possibility of a common evolutionary origin of the GK-I and GK-II families, which show marginal sequence similarities within their respective Rossmann-like domains (2).
Unlike many other carbohydrate kinases that belong to large and extensively studied enzyme families, GK-I and GK-II comprise distinct structural families with as-yet-unknown mechanisms of action. Crystallographic analysis of TM1585 allowed us to take the first steps toward structure-function understanding of the GK-II family by predicting the active-site area (2, 29), including two conserved, positively charged residues, Lys47 and Arg325. These residues were tentatively implicated in interactions with negatively charged substrates, ATP, and, possibly, glycerate (2, 29).
Our site-directed mutagenesis and steady-state kinetic data (Table ) confirmed the functional importance of these residues. One of them, Lys47, appears to play a particularly important role in catalysis, as even the K47R mutant, which conserves the positive charge, displays a 180-fold drop in kcat without any appreciable change in the apparent Km value for either glycerate or ATP. The second and less-conserved Arg325 residue (replaced by proline in ~25% of all compared bacterial enzymes) may be involved in charge or side chain interactions with ATP (but not glycerate), as suggested by a fivefold increase in the apparent Km value for ATP observed for the R325A, but not the R325K, mutant. Although an accurate mechanistic interpretation of these data awaits further analysis, the observed functional conservation of these residues provides additional support for the presumed conservation of the biochemical function within the entire GK-II family.
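The kinetic argument can be made explicit with the standard Michaelis-Menten rate law. The block below is a textbook sketch rather than a fit to the measured data. It treats the reported values as apparent single-substrate parameters, and the assumption that R325A leaves kcat unchanged is introduced here for illustration only and is not stated in the text.

```latex
\begin{align*}
v &= \frac{k_{cat}\,[E]_0\,[S]}{K_m + [S]} \\[4pt]
\text{K47R:}\quad
  & k_{cat}' \approx \frac{k_{cat}}{180},\quad K_m' \approx K_m
  \;\Rightarrow\;
  \frac{k_{cat}'}{K_m'} \approx \frac{1}{180}\,\frac{k_{cat}}{K_m} \\[4pt]
\text{R325A:}\quad
  & K_{m,\mathrm{ATP}}' \approx 5\,K_{m,\mathrm{ATP}}
  \;\Rightarrow\;
  \frac{v'}{v} = \frac{K_{m,\mathrm{ATP}} + [\mathrm{ATP}]}{5\,K_{m,\mathrm{ATP}} + [\mathrm{ATP}]}
  \qquad (\text{assuming } k_{cat}' = k_{cat})
\end{align*}
```

Under these assumptions, the K47R substitution lowers the rate 180-fold at every substrate concentration, which points to a defect in the catalytic step. The R325A substitution lowers the rate up to fivefold at subsaturating ATP and has little effect at saturating ATP, which points to a defect in ATP binding.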
The remarkable homogeneity of substrate specificity across the GK-II family contrasts with that of other large families of carbohydrate kinases, e.g., the FGGY protein family (Pfam accession number PF00370), which contains representatives with widely different substrate preferences toward a broad range of distinct sugars, including the xylulose kinase TM0116 (EC 2.7.1.17), the ribulokinase TM0284 (EC 2.7.1.16), the glycerol kinase TM1430 (EC 2.7.1.30), the gluconokinase TM0443 (EC 2.7.1.12), and the rhamnulokinase TM1073 (EC 2.7.1.5) in T. maritima (unpublished results). In addition to the experimental evidence accumulated for several divergent members of the GK-II family, its functional homogeneity is strongly supported by the metabolic reconstruction and genome context analysis performed in this study. A significant fraction (~40%) of GK-II family genes in a collection of diverse bacterial genomes are clustered on the chromosome with genes involved in glycerate metabolism (Fig. ), providing strong support for their functional assignment and for the assertion of respective metabolic pathways.
The comparative analysis of bacterial genomes included in the “glycerate metabolism” subsystem (see Table S1 in the supplemental material) revealed a remarkable diversity of pathways that involve GK-II enzymes (Fig. ). Among the most frequent are the pathways of glyoxylate, serine, and tartrate utilization. The utilization of d-glucarate and d-glycerate, by contrast, is relatively rare for GK-II but quite common for GK-I family enzymes. The observed mosaic phylogenetic distribution of genes corresponding to all three enzyme families, GK-I, GK-II, and GK-III, which are often involved in the same pathways and even in similarly organized chromosomal clusters, supports the equivalence of their functional roles in vivo.
An inventory analysis of genes associated with all functional roles included in the subsystem allowed us to extend pathway assertions to many other genomes in which GK-II genes are not part of any suggestive chromosomal cluster. In the case of T. maritima, a remote operon encoding homologs of SAT and HPR enzymes (TM1400-TM1401) was deemed the only possible functional context for the GK-II enzyme (TM1585). The inferred three-step serine degradation pathway was further supported by the identification of conserved putative regulatory sites in the upstream region of the respective genomic loci in T. maritima and T. neapolitana (Fig. ).
Importantly, this analysis allowed us to suggest specific functional assignments for the TM1400 and TM1401 genes, which belong to large enzyme families (PLP-dependent aminotransferase and d-isomer-specific 2-hydroxyacid dehydrogenase, respectively) with wide variations in substrate specificity between individual characterized representatives. Due to the divergent phylogenetic placement of the T. maritima proteins, their precise substrate specificity could not be reliably predicted based solely on sequence similarity. Indeed, the current annotations of these proteins in most public archives are either imprecise (e.g., putative aminotransferase for TM1400) or incorrect (e.g., d-3-phosphoglycerate dehydrogenase for TM1401).
Both inferred activities, SAT and HPR, were confirmed for the purified recombinant proteins TM1400 and TM1401, respectively, using specific assays (Fig. ). Although glyoxylate is considered a major physiological cosubstrate for other SAT enzymes, previously described as serine-glyoxylate aminotransferases (EC 2.6.1.45), it does not appear to be a relevant intermediate in the T. maritima metabolic network. At the same time, pyruvate, an important intermediary metabolite in T. maritima, was shown to be an efficient transamination cosubstrate of TM1400, which should be formally classified as SAT (EC 2.6.1.51). The challenge of distinguishing between biochemical and physiologically relevant activities is rather common for enzyme families with broad substrate specificities. Likewise, HPR enzymes (EC 1.1.1.81) from several species were shown to display an appreciable glyoxylate reductase (EC 1.1.1.26) activity. Although both of these activities are displayed in vitro by TM1401, only one of them, HPR, appears to be physiologically relevant for T. maritima.
Finally, the results obtained for two overlapping pairs of coupled reactions (TM1400 plus TM1401 and TM1401 plus TM1585) provided us with an experimental validation of the inferred three-step serine degradation pathway (Table ). Despite the obvious shortcomings of in vitro data, these results, in combination with bioinformatic analysis, constitute sufficient evidence for confident inclusion of this pathway in the reconstruction of the T. maritima metabolic network. Taking into account the organotrophic lifestyle of T. maritima (13), a possible physiological role for this pathway may be in the utilization of exogenous serine from its environment, enriched with amino acids and other carbon and energy sources.
In summary, the key results of this study obtained by applying a subsystems-based approach (22) to the analysis of bacterial metabolic pathways that involve members of GK-I, GK-II, and GK-III enzyme families are as follows: (i) >1,000 individual genes from ~200 complete or nearly complete bacterial genomes and representing 12 distinct functional roles (mostly enzymes) were reliably and consistently annotated; (ii) a functional context of 76 members of the GK-II family identified in 65 diverse bacterial species was analyzed in detail, and specific pathways (or groups of pathways) were asserted in most of these species; (iii) in the T. maritima case study, a novel putative regulon was identified, covering an inferred serine degradation pathway and two other pathways, utilization of glycerol and a part of glycolysis, that share a common intermediary metabolite (and a likely effector) 2PG; and (iv) a proposed version of the serine degradation pathway in T. maritima implemented by the three enzymes, pyruvate-utilizing SAT (TM1400), HPR (TM1401), and GK-II (TM1585), was validated by in vitro reconstitution. A detailed experimental characterization of the glycerate 2-kinase TM1585, the main focus of this study, confirmed its substrate and reaction specificity and the functional importance of the two putative active-site residues, K47 and R325, largely conserved within the GK-II family.