We reassessed the propensity of the wild-type YkoF to crystallize using the commercially available Hampton I & II (Hampton Research), Wizard I & II (Emerald Biostructures), and Sigma (Sigma-Aldrich) sparse matrix screens, with a range of protein concentrations. All these attempts failed to yield crystals. Based on the surface entropy reduction concept,3 two double mutants were designed to generate surface patches potentially suitable for formation of crystal contacts. Both mutants, K112A/E114A and K33A/K34A, were used in all five screens, as described above for the wild-type protein. The K112A/E114A mutant remained recalcitrant to crystallization, but the K33A/K34A mutant readily crystallized in several conditions. Optimization revealed a pronounced preference for divalent ions in the crystallization conditions, with Ca2+ yielding the best crystals. When this work was completed, it was discovered that the second tier of crystallization screens at MCSG yielded crystals of wild-type, SeMet-labeled YkoF, which were used for independent structure determination.
Quality of the atomic models
The final crystallographic model of the K33A/K34A mutant was refined at 1.65 Å resolution to an R factor of 17.2% and Rfree of 22.4%. It consists of two independent monomers of YkoF, denoted A and B, containing a total of 366 residues, together with 390 water molecules, two acetate ions and one Ca2+. In the A molecule, eight N-terminal amino acid residues, three C-terminal residues, and three residues located in the long linker connecting the two internal repeats within the molecule are not seen in the electron density map. The corresponding numbers for the B molecule are eight, four and eight. The quality of the model was assessed with PROCHECK,4 which shows that 88.3% of the residues are in the most favored Ramachandran regions, with no residues in the disallowed regions. Crystallographic details for the mutant structure, as well as for the complex with thiamin, are given in the table below.
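For readers less familiar with the refinement statistics quoted above, the conventional R factor can be sketched as follows. This is an illustration only: the amplitudes are hypothetical, the function name is ours, and the calculated amplitudes are assumed to be already on the same scale as the observed ones. Rfree uses the same expression, evaluated over the reflections set aside from refinement.

```python
def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fo| - |Fc| |) / sum(|Fo|).

    Assumes f_calc is already scaled to f_obs (no scale factor is fitted here).
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Hypothetical structure-factor amplitudes, for illustration only.
example_f_obs = [100.0, 80.0, 60.0, 40.0]
example_f_calc = [90.0, 85.0, 66.0, 38.0]
print(f"R = {r_factor(example_f_obs, example_f_calc):.3f}")
```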
Crystallographic data for the YkoF K33A,K34A mutant with and without bound thiamin
YkoF monomer contains a tandem of ferredoxin-like βαββαβ motifs
Each of the two independent YkoF monomers in the asymmetric unit folds into an eight-stranded, antiparallel β-sheet, with the strands arranged in the order 23148576. The four connecting α-helices are stacked against one face of the β-sheet, leaving the other exposed. The monomer has an internal approximate 2-fold symmetry axis, reflecting an internal tandem repeat of a βαββαβ tertiary motif. The repeat is not readily identifiable at the amino acid sequence level. The two ferredoxin-like motifs form a side-to-side contiguous β-sheet via an antiparallel interaction between β-strands 4 and 8. Thus, the topology of the β-sheet may be described as 2314/4′1′3′2′ to reflect the pseudosymmetry of the tandem repeat ().
Figure 1 The tertiary structure of the YkoF molecule. (A) A diagram showing the overall fold, with the N-terminal βαββαβ module colored in red and the C-terminal module shown in blue. The missing residues within …
The first motif, consisting of residues 9 to 85, can be superposed onto the second motif, consisting of residues 116 to 190, with a root-mean-square difference of 1.7 Å for the 74 Cα pairs (). The most significant differences are associated with the conformation of the C termini of the structurally homologous helices A and A′. Helix A is shorter, consisting of only three turns, while helix A′ in the second motif has four turns followed by a short linker to the subsequent β-strand. The two motifs are connected by a 30-residue linker, which begins at the C terminus of β-strand 4 and folds over helix B to connect to the first β-strand of the second repeat. This long loop has no secondary structure elements and shows significant flexibility, exemplified by the lack of interpretable electron density for some of the residues in its center and by significantly higher isotropic displacement parameters (B factors) than those observed for the rest of the structure.
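The root-mean-square difference quoted for the superposed motifs can be illustrated with a minimal sketch. It assumes that the two sets of Cα coordinates have already been optimally superposed and matched pair by pair; the superposition step itself is not shown, the coordinates are hypothetical, and the function name is ours.

```python
import math


def rms_difference(coords_a, coords_b):
    """RMS distance between matched atom pairs in two pre-superposed sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical Calpha positions (in Angstrom), for illustration only.
motif_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
motif_2 = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(f"RMS difference: {rms_difference(motif_1, motif_2):.2f} A")
```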
Each of the two βαββαβ repeats contains its own well-defined, hydrophobic core. In the N-terminal motif the residues that make up the core include Phe15, Leu32, Val40, Leu51, Ile65 and Met78, while in the second repeat the core includes Phe119, Cys117, Leu156, Leu167 and Val170. The two cores are connected by a constellation of aromatic residues which lie at the interface of the two repeats, i.e. Phe13, Phe59, Tyr66, Phe82, Phe164, and Phe171 ().
The hydrophobic core structure of YkoF. The residues within the N-terminal module are shown in red, those within the C-terminal module in magenta, and those within the inter-module interface in yellow.
The two molecules in the asymmetric unit form a head-to-tail homodimer with an extensive face-to-face interface between the β-sheets. Each monomer buries 2504 Å² out of a total of 10,305 Å² of solvent-accessible surface. Thus, the assembled homodimer has an overall spherical appearance, with each monomer contributing one hemisphere (). A tight interface suggests a high association constant for the dimer, and gel filtration experiments (data not shown) confirmed that the protein exists as a dimer in solution.
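To put the interface area in proportion, the arithmetic behind the quoted figures can be sketched as below. The two areas are taken from the text; the assumption that both monomers bury the same area follows from the statement that "each monomer buries" 2504 Å², and the function and variable names are ours.

```python
def buried_fraction(buried_area, total_area):
    """Fraction of a monomer's solvent-accessible surface buried in the dimer."""
    if total_area <= 0 or not 0 <= buried_area <= total_area:
        raise ValueError("areas must satisfy 0 <= buried <= total and total > 0")
    return buried_area / total_area


BURIED_PER_MONOMER = 2504.0   # square Angstrom, from the text
TOTAL_PER_MONOMER = 10305.0   # square Angstrom, from the text

fraction = buried_fraction(BURIED_PER_MONOMER, TOTAL_PER_MONOMER)
dimer_buried = 2 * BURIED_PER_MONOMER  # both monomers contribute equally

print(f"Buried per monomer: {fraction:.1%} of the accessible surface")
print(f"Total buried on dimer formation: {dimer_buried:.0f} square Angstrom")
```

Roughly a quarter of each monomer's accessible surface is therefore buried in the interface.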
A diagrammatic representation of the quaternary structure of YkoF highlighting the key residues involved in the intermolecular interface.
Van der Waals interactions between hydrophobic side-chains, direct H-bonds and water-mediated H-bonds contribute to the stability of the homodimer. Most notable are the stacking interactions between symmetry-related residues Tyr18A and Tyr18B as well as between Tyr122A and Tyr122B. Close interactions are also seen between Met20A and Phe24B; Phe24A and Met20B; Val77A and Tyr18B; Tyr18A and Val77B. A water-filled channel, 40 Å long, runs through the center of the dimer.
The HMP/thiamin binding sites
A close inspection of the YkoF crystal structure revealed in each molecule the presence of two symmetrically located putative small molecule binding sites. They are found in analogous locations in the two ferredoxin-like motifs: each is at the end of the central β-sheet in between the hairpin formed by β-strands 2 and 3 and helix A (). We found this very suggestive, particularly because the location of the putative binding site within each repeat coincides with the serine-binding site found in the structurally similar ACT-domain of 3PGDH.5 In the 1.65 Å resolution structure of the double mutant, the site in the N-terminal repeat is occupied by a clearly resolved acetate ion and five water molecules, while the one in the C-terminal repeat contains residual electron density which cannot be unequivocally interpreted.
Given the genetic data implicating the YkoF gene in HMP transport for thiamin biosynthesis in B. subtilis and some related Gram-positive bacteria, we hypothesized that the protein binds HMP and/or thiamin. Co-crystallization experiments with thiamin yielded good quality crystals, which allowed for characterization of the structure of the complex. Inspection of the 2.3 Å resolution electron density map revealed the presence of thiamin molecules in both putative binding sites in each of the two molecules in the asymmetric unit ().
Figure 4 The thiamin-binding sites of YkoF. (A) The high affinity site in the N-terminal motif, and (B) the low affinity site in the C-terminal motif in molecule A. The omit electron density map is contoured at 2.8σ level. Hydrogen bonds are shown as broken …
The two identical binding pockets within the N-terminal motifs of each of the monomers are lined with the hydrophobic side-chains of Leu17, Tyr18, Phe24, and Leu32. In addition, the benzene ring of Phe15 forms the bottom of the pocket against which the C(2′) methyl substituent of the pyrimidine ring rests. The specificity of the interaction with the HMP moiety of thiamin is mediated by hydrogen bonds to all the pyrimidine nitrogen atoms. The main chain of Leu17 interacts with the pyrimidine ring so that the amide donates a hydrogen bond to N(3′), while the carbonyl oxygen accepts one from the N(4′)-amino group. The second hydrogen of the amino group is donated to a water molecule that is also in close contact (3.1 Å) with the C(2) atom of the thiazole moiety. The C(2)-bound proton is quite acidic, as it is adjacent to two electron-withdrawing atoms, i.e. N(3) and S(1), and it is therefore capable of donating a hydrogen bond to water, forming a weak C–H···O bond that stabilizes the conformation of thiamin. A similar interaction involving a Cl– has been observed in the crystal structure of 3-benzyl-5-(2-hydroxyethyl)-4-methyl-1,3-thiazolium chloride.6 Finally, the side-chain hydroxyl of Thr49 donates a hydrogen bond to the N(1′) atom of the pyrimidine ring. The conformation of Thr49 is stabilized by an additional H-bond with Thr44, so that the hydroxyl is poised favorably to bind the ligand.
The binding pockets within the C-terminal motifs are very similar to those in the N-terminal motifs. Their walls are formed by Ala120, Leu121, Tyr129, Met130, Ile133, Val137, and Tyr152, with Phe119 at the bottom. Although the pocket is formed primarily by one protein molecule, two residues from the B molecule of the homodimer, Cys86 and His180, also contribute to the thiamin-binding site and are in van der Waals contact with the thiazole ring. Specificity of binding is enforced by hydrogen bonds analogous to those observed in the N-terminal motif. Thus, the main chain of Leu121, which is analogous to Leu17, anchors N(3′) and the N(4′)-amino group, while the hydroxyl of Ser154 is within H-bonding distance of N(1′). The conspicuous difference between the two sites is the absence, in the C-terminal site, of the water molecule that bridges the N(4′)-amino group to C(2) of the thiazole.
To assess the affinities of the two sites for thiamin, we carried out microcalorimetric titration experiments, which confirmed the presence of two sites with different dissociation constants (). The high-affinity site has a KD in the low micromolar range, approximately 10 μM, whereas the second site has a much lower affinity, with a KD of about 250 μM. In both cases the affinity arises primarily from enthalpic changes, although in the case of the high-affinity site there is an unfavorable change in entropy, in contrast to a favorable entropic term in the low-affinity site. Assuming that this difference can be rationalized by bound water molecules (an ordered water bridging the N(4′)-amino group and C(2) is seen only in the N-terminal site), the high-affinity site would correspond to the N-terminal motif, while the low-affinity site is likely to be the one residing in the C-terminal motif.
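The two dissociation constants can be converted to standard binding free energies with ΔG° = RT ln KD. The sketch below does this for the approximate KD values quoted in the text. The temperature of 298.15 K and the 1 M standard state are assumptions made here for illustration (the temperature of the titration is not stated in this section), and the function name is ours.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = RT ln(KD / 1 M); the default temperature is an assumption.
    """
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


# Approximate KD values quoted in the text.
for label, kd in (("high-affinity site", 10e-6), ("low-affinity site", 250e-6)):
    print(f"{label}: KD = {kd * 1e6:.0f} uM, dG = {binding_free_energy(kd):.2f} kcal/mol")
```

Under these assumptions the two sites correspond to roughly −6.8 and −4.9 kcal/mol, so the 25-fold difference in KD amounts to about 1.9 kcal/mol.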
Figure 5 Calorimetric titration of YkoF with thiamin. Raw data (upper plot), and a plot of the integrated heat versus thiamin/protein ratio (lower plot). The fitted line is calculated assuming two binding sites per YkoF monomer, with the following thermodynamic …
The conformation of thiamin is defined by the two dihedral angles around the C(7′) methylene bridge, i.e. the ϕT and ϕP angles, defined as [C(5′)–C(7′)–N(3)–C(2)] and [N(3)–C(7′)–C(5′)–C(4′)], respectively.7 In the YkoF–thiamin complex, the ϕT and ϕP angles of both thiamin molecules in the high-affinity sites are ~–110° and ~75°, respectively. Those in the lower-affinity sites are marginally different, i.e. ~–98° and ~50°. These values are close to the so-called V conformation (ϕT = 90° and ϕP = 90°), in which the N(4′)-amino group is adjacent to the reactive C(2) on the thiazolium ring, and which was originally proposed to be the active form of the enzyme-bound coenzyme.8 The V conformation, which is unusual for free thiamin and its C(2)-substituted derivatives with the exception of thiamin thiazolone,9 is, however, observed in complexes with enzymes that utilize thiamin diphosphate for catalysis.10 The V conformation is typically supported by a bulky hydrophobic side-chain which is in van der Waals contact with both aromatic rings of thiamin.11 In YkoF, this role appears to be assumed by Leu28 and Leu133. The functional significance of this is unclear, unless YkoF has some catalytic function that is neither suggested by the fold, nor readily identifiable from the structure.
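The two torsion angles discussed above are ordinary four-atom dihedrals, and a minimal sketch of how such an angle is computed from atomic coordinates is given below. The coordinates are hypothetical and chosen only so that the result is easy to check by hand; the function name is ours. The sign follows the usual IUPAC convention (clockwise rotation, viewed along the central bond, is positive), which is assumed here rather than taken from the text.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p1, p2, p3, p4):
    """Signed dihedral angle p1-p2-p3-p4 in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    # atan2 form: y = |b2| * b1 . (b2 x b3), x = (b1 x b2) . (b2 x b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical positions standing in for C(5'), C(7'), N(3) and C(2).
c5p, c7p, n3, c2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)
print(f"phi_T (toy coordinates) = {dihedral_deg(c5p, c7p, n3, c2):.1f} degrees")
```

With real coordinates, ϕT would be obtained from C(5′), C(7′), N(3) and C(2), and ϕP from N(3), C(7′), C(5′) and C(4′), in the atom order given in the definitions above.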
Similarity to other proteins
A search for homologous and/or structurally related proteins using PSI-BLAST,12 and the MetaServer of the Polish Bioinformatics Site,14–16 revealed only two other protein sequences with a significant level of amino acid sequence similarity to YkoF, and a more distant relationship to the COG0011 and the related DUF77 domain families, with a total of 66 known sequences. Crystal structures of two members of the COG0011 family, i.e. the yeast protein YBL001c and the Methanobacterium thermoautotrophicum protein MTH1187, have recently been determined.17 The two close homologues, from Oceanobacillus iheyensis and Mesorhizobium loti, contain both βαββαβ repeats (). A majority of the YkoF residues involved in ligand binding, in the hydrophobic core and in the dimer interface are highly conserved among the three proteins. This pattern suggests that all three proteins share the same function. In contrast, all members of the COG0011 family, including the two proteins with known crystal structures,17 appear to contain a single repeat. However, we note that the putative binding cavity is largely preserved in both MTH1187 and YBL001c, and Thr47, which corresponds to Thr49 in YkoF and is involved in HMP recognition, is completely conserved.
Figure 6 Sequence alignment of the three YkoF proteins from bacterial species. Residues in the hydrophobic core are labeled with an asterisk, those involved in dimer formation are labeled with crosses and the residues important for thiamin recognition are indicated …
The ferredoxin-like βαββαβ fold is also observed among the ACT domains,18 which are related to the regulatory domain of the Escherichia coli 3-phosphoglycerate dehydrogenase (3PGDH),5 as well as in RAM domains involved in the regulation of amino acid metabolism in prokaryotes.19 All these structurally related proteins mediate allosteric regulation and ligand binding, and typically undergo oligomerization, although the architecture of the resulting oligomers varies significantly (). YkoF is unique in that it contains two βαββαβ motifs arranged side-by-side so that the terminal β-strands of both form an antiparallel sheet. Furthermore, YkoF generates a dimer with a face-to-face association of the exposed faces of the tandem's eight-stranded β-sheets, resulting in a pseudotetrameric arrangement of the constituent motifs. In contrast, the single βαββαβ motif of the ACT domain of 3PGDH forms a homodimer by a side-by-side association involving strand β2 rather than β4. The RAM domains show yet another homodimerization pattern, forming a β-barrel by a face-to-face association of two βαββαβ monomers. The only protein that we found to contain an internal repeat of the ferredoxin-like βαββαβ fold is the C-terminal, regulatory domain of the pyridoxal phosphate-dependent allosteric threonine deaminase,20 but even here the side-to-side arrangement of the βαββαβ motifs is mediated by the antiparallel association of the β2 and β2′ strands, thus reversing their order as compared to YkoF. A dimer is then formed so that the β-sheets face each other, but they do not interact directly, as they are separated by the residues from the inter-βαββαβ linkers.
Figure 7 Representative modes of oligomerization in the ACT/RAM superfamily. (A) The arrangement of modules in YkoF; (B) the ACT domains of d-3-phosphoglycerate dehydrogenase (1PSD.pdb) and (C) the RAM domains of the Lrp-like transcriptional regulator (1I1G.pdb). …
The crystal contacts
Each molecule in the homodimer is involved in two different inter-dimer contacts that synergistically form the crystal lattice. Along the Z axis, the dimers come into contact so that molecule A and molecule B form a symmetric interface mediated by the surface patches which contain the two mutated sites, Ala33 and Ala34, as well as Thr35 (). The solvent accessible main-chain carbonyl groups from Ala33 and Thr35 from symmetry-related molecules create a Ca2+ site complemented by three water molecules to generate a distorted pentagonal bipyramid (). This contact rationalizes our observation that the crystallization step was dependent on divalent cations, notably Ca2+.
Figure 8 The packing of the YkoF molecules in the crystal lattice. (A) The arrangement of non-crystallographic dimers, interacting via the engineered Ca-mediated crystal contact along the b axis; (B) the coordination of Ca2+ at the crystal contact containing the …
The second inter-dimer crystal contact involves residues Val147A, Lys141A, and Glu148A from one dimer, which form van der Waals contacts with Lys141B, Asp162B and Glu143B from the symmetry-related dimer. Three H-bonds (Val147A O–Nε2 Glu143B, Asp157A Oδ–Nε2 Glu176B and Lys141A Nζ–Oγ1 Asp162B) contribute to the stability of the contact.
Following the completion of this study, we learned of the independent structure determination of the wild-type YkoF, based on crystals that were obtained in a second tier of screening at the Midwest Center. This structure, refined at 2.2 Å resolution, has been deposited with the Protein Data Bank (PDB) and released (1S7H). The model consists of four independent monomers and 471 water molecules, and was refined to an R factor of 22.1% and Rfree of 28.1%. Its comparison with the K33A/K34A double mutant structure shows no significant difference between the two structures: a representative root-mean-square difference between the Cα atoms of molecule A in one and molecule A in the other is only 0.43 Å (the corresponding difference between molecules A and B in the 1.65 Å structure is 0.15 Å). Close inspection of the mutated patch reveals that its main-chain structure is virtually identical, except for very minor conformational shifts due to crystal packing and calcium binding. In the wild-type structure Lys33 protrudes from the surface, clearly impeding intermolecular contact at this site.