3.2. Overview of the structure and comparison with other FadR-family members
The crystal structure of TM0439 was determined by multiwavelength anomalous dispersion (MAD) using SeMet-labeled protein. The atomic model was refined to 2.2 Å resolution (Table 1; see §2). The protein has the canonical domain architecture of the GntR family, with an N-terminal WH domain and a C-terminal all-α-helical putative regulatory domain. The presence of only six α-helices within the C-terminal domain classifies TM0439 as a VanR member. Gel-filtration experiments (not shown) indicated that the protein was an obligate dimer in solution. The C2 space-group symmetry allows the formation of a head-to-head dimer via the crystallographic twofold axis, so that a large interface is buried between two C-terminal regulatory domains, with a resulting quaternary structure very close to that of FadR (van Aalten et al.). In contrast, the two WH domains do not interact with one another, although they make limited crystal contacts with neighboring molecules in the unit cell. A comparison of TM0439 with FadR and with the recently deposited structures CGL2915, RO03477 and PS5454 shows dramatic differences in local tertiary and quaternary architectures, even though the individual domains are remarkably similar (Fig. 1).
Figure 1 Overview of the structure and comparison with other FadR-superfamily transcription factors. VanR-group members are shown at the top of the figure and FadR-group members are shown at the bottom of the figure. The PDB codes for the proteins shown are TM0439, …
As pointed out above, TM0439, RO03477 and PS5454 can be classified in the VanR group based on secondary-structure prediction, which identifies only six α-helices in their C-terminal domains (Rigali et al.). In all three structures, a short linker connects the second β-strand of the WH domain directly to the α1 helix of the regulatory domain, so that the α0 helix seen in FadR is absent. In the TM0439 and RO03477 structures the mutual disposition of the WH and regulatory domains is similar, with the two WH domains in close proximity; in contrast, the structure of PS5454 is distinctly different, with the two WH domains at opposite ends of the homodimer. The two FadR-group proteins (i.e. FadR and CGL2915) contain an extra α0 helix at the N-terminus of the regulatory domain. In FadR, this helix contains a sharp kink which reverses its course in the center, wedging it between the WH and regulatory domains. Consequently, the mutual disposition of the two domains of FadR is distinctly different from both TM0439 and RO03477 owing to a rotation of the regulatory domain relative to the WH domain. In CGL2915, the α0 helix is straight and as a consequence the two regulatory domains are swapped between the monomers (Gao et al.).
The site of the three mutations made to enhance crystallizability is located in the loop between helices α2 and α3 of the C-terminal domain and is involved in a heterologous contact with a WH domain of a symmetry-related molecule. The site of the mutations is distant from functionally important structural elements.
3.3. The WH domain
The N-terminal portion of TM0439 (residues Val6–Val71) constitutes the winged-helix dsDNA-binding domain, with a canonical order of secondary-structure elements α1, α2, α3, β1, β2 (these are referred to henceforth as a1, a2, a3, b1, b2 in order to differentiate them from helices α0 onwards in the regulatory domain). The HTH (helix–turn–helix) motif is made up of helices a2 and a3 with the connecting loop; the antiparallel two-stranded β-sheet makes up the ‘wing’. Helix a1 provides a critical interface with the C-terminal regulatory domain in the same monomer. The WH domain is a hallmark of the GntR family. Not surprisingly, a structural comparison using DALI (Holm et al.) identified a number of known WH domains with similar structure. The top hits, with Z > 8.0, include all of the known putative GntR structures, but also the Zα domain of the viral E3L protein (PDB code 1sfu), double-stranded RNA-specific adenosine deaminase (PDB code 1qbj), catabolite gene-activator protein (CAP; PDB code 1i6f) and LexA repressor (PDB code 1jhf). The pairwise r.m.s.d. values for the Cα atoms are around 2.0 Å. The highest amino-acid sequence identity among proteins of known structure is observed for PDB entries 3c7j (PS5454) and 2di3 (CGL2915), at 35% and 32%, respectively.
Although all known structures of WH domains are very similar, their mode of interaction with dsDNA can vary considerably. While most of them use the second helix of the HTH motif to bind to the major groove of the cognate DNA sequence (Gajiwala & Burley, 2000), the FadR WH domain uses only the N-terminal fragment of this helix (Xu et al.). Interestingly, residues Arg35, Arg45, Arg49 and Gly66, which are indispensable for DNA binding in FadR, are completely conserved in CGL2915. These observations suggest that CGL2915 may bind to DNA in a manner similar to FadR, which binds to TGGTN₃ACCA (Xu et al.). In fact, an identical sequence was identified in the C. glutamicum genome in the promoter of cgl2917 (Gao et al.). However, in TM0439 the residue equivalent to Arg45 of FadR is Phe45, suggesting that the target DNA sequence for this protein is different. Both RO03477 and PS5454 also show differences from the putative dsDNA-binding consensus sequence (Fig. 2).
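As an illustration of how an operator of the form TGGT-N3-ACCA (N3 being any three nucleotides) might be located in a promoter region, the following minimal Python sketch scans a sequence with a regular expression and checks that the two half-sites are reverse complements of each other. The function names and the promoter fragment are hypothetical and were invented for this example; they are not taken from the C. glutamicum genome or from the authors' analysis.

```python
import re

# Motif quoted in the text: TGGT, any three bases (N3), then ACCA.
FADR_SITE = re.compile(r"TGGT[ACGT]{3}ACCA")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in FADR_SITE.finditer(seq.upper())]


# Hypothetical promoter fragment, made up purely for illustration.
promoter = "AATCTGGTCCGACCAGTT"
hits = find_sites(promoter)

# The half-sites are inverted repeats: the reverse complement of TGGT is ACCA,
# so the same site is also found when the opposite strand is scanned.
half_site_rc = reverse_complement("TGGT")
hits_on_other_strand = find_sites(reverse_complement(promoter))
```

Because the two half-sites form an inverted repeat, a dimeric regulator can place one WH domain on each half-site, which is why scanning either strand recovers the same site.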
Figure 2 The overall architecture of the HTH domain of TM0439, with putative DNA-binding residues shown. The DNA is modeled into this figure based on the superposition of the FadR–DNA complex (PDB code 1hw2) onto the HTH domain of TM0439.
3.4. The regulatory FCD domain
The FCD domain of TM0439, encompassing residues Glu76–Glu212, contains six α-helices, as predicted for the VanR group, arranged into an antiparallel bundle. The same tertiary fold is observed in the regulatory domains of RO03477 (PDB code 2hs5) and PS5454 (PDB code 3c7j), both of which are VanR-group members. The C-terminal domains of CGL2915 (PDB code 2di3) and FadR (PDB code 1hw1) also show a very similar fold, with the sole exception of the additional α0 helix characteristic of the FadR group (Fig. 3). Pairwise r.m.s. differences between Cα positions range from 2.2 to 2.9 Å. This structural similarity is particularly striking given the limited amino-acid sequence similarities of 18% between TM0439 and RO03477, 13% with PS5454, 17% with CGL2915 and only 11% with FadR. The FadR C-terminal domain is classified as a member of the FadR_C family (PF07840), while the remaining four domains belong to the FCD family (PF07729). Thus, the FadR and VanR groups are not equivalent to the FadR_C and FCD families, respectively, creating a confusing classification. We suggest that the FadR and VanR distinction should be discontinued.
Figure 3 The regulatory domain of TM0439 and comparison with other FCD/FadR domains. The overall domain structure and a close-up of the kinked helix α4 are shown for each protein on the right and left, respectively. In each domain the kinked α4 …
Although a fold comprising a six-helix antiparallel bundle is topologically simple, the FCD/FadR_C fold constitutes a unique family to the extent that DALI (Holm et al.) shows no other structurally related domains with a Z score higher than 6. It seems that the distinction between the FadR_C and FCD families made in the Pfam database is insignificant and a single family, e.g. FCD, should comprise all these proteins; in the following discussion, the term FCD shall refer to all members of the FCD/FadR_C fold.
An interesting structural feature of the FCD fold is a conserved kink in the α4 helix. This helix is noteworthy because its N-terminal part is intimately involved in the dimerization of the domain (see below), while the C-terminal portion constitutes the main interface with the WH domain of the same monomer. In TM0439, the α4 helix has six full turns and the kink occurs approximately after the first three. The kink results in a strained secondary conformation of Ile153 (ϕ = −107°, ψ = 11°), which leaves the amides of Asp155 and Arg156, as well as the carbonyl of Lys164, free from intra-helical hydrogen bonds. Instead, the side chain of Glu58 from the WH domain positions itself so that Oε1 ‘caps’ the main-chain amides of both Asp155 and Arg156 (Fig. 3). An almost identical structural perturbation occurs in the corresponding α-helix in CGL2915, in which the kink at Leu167 (ϕ = −86°, ψ = −12°) leaves the amides of Leu169 and Ser170, as well as the carbonyl of Ala166, free; here, Ser81 from the WH domain performs the capping function (Fig. 3). A similar stereochemistry is reproduced in FadR, in which Met168 is at the center of the kink (ϕ = −78°, ψ = −23°), leaving the amides of Gly170 and Leu171 and the carbonyl of Gly167 uncapped but with no substitute hydrogen-bonding partners from the WH domain (Fig. 3). In RO03477 a similar kink occurs after the first two turns, not three as in the previous structures. Met168 is at its center (ϕ = −84°, ψ = −8°) and the free amides of Ser170 and Val171, as well as the carbonyl of Val167, are not involved in any hydrogen bonds (Fig. 3). The PS5454 structure is the only one in which the α4 helix is straight. It is also the only structure in which the WH domains are set apart. We will return to this point later.
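The ϕ and ψ values quoted for the kink residues are backbone torsion angles: ϕ is the dihedral C(i−1)–N(i)–Cα(i)–C(i) and ψ is N(i)–Cα(i)–C(i)–N(i+1). A minimal Python sketch of how such a signed torsion angle can be computed from four atomic positions is given below. The function name and the toy coordinates are assumptions made for illustration; real values would be computed from the deposited coordinates, and this is not the software used by the authors.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle (degrees) about the p1-p2 bond.

    Positive when, looking from p1 towards p2, the p0-p1 bond must be
    rotated clockwise to eclipse the p2-p3 bond (the usual convention
    for phi and psi).
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)          # normal to the plane p0-p1-p2
    n2 = _cross(b1, b2)          # normal to the plane p1-p2-p3
    b1_len = math.sqrt(_dot(b1, b1))
    y = _dot(_cross(n1, n2), b1) / b1_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Toy coordinates (not from any PDB entry) that give simple reference angles.
gauche_plus = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
gauche_minus = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, -1, 1))
cis = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
```

For ϕ of residue i the four points would be C(i−1), N(i), Cα(i) and C(i); for ψ they would be N(i), Cα(i), C(i) and N(i+1).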
3.5. The FCD domain as a dimerization module
The FCD domains are responsible for the dimeric architecture of the FadR transcription factors. The crystal structures of FadR and CGL2915 show an almost identical disposition of the FCD domains in the homodimers and suggest that the mode of dimerization is conserved (Gao et al.). The TM0439 protein conforms to this paradigm. It forms a homodimer in which the interface is mediated exclusively by the α1 helix and the N-terminal portion of the α4 helix of the FCD domain. In each chain, 23 residues bury a surface of ~950 Å². The hydrophobic core of the interface is formed by Ile87, Met88, Met89, Phe92, Leu145, Leu146, Leu149 and Ile153. The residues that bury the largest solvent-exposed surface are Glu81, Glu84, Met88, Phe92, Asn143, Leu145, Leu149 and Lys152. A total of 14 hydrogen bonds and four salt bridges span the interface at its periphery (Fig. 4). Both the RO03477 and PS5454 structures have topologically very similar interfaces that are mediated by the α1 helices, although the buried solvent-accessible surfaces are smaller than in TM0439 (~780 and ~730 Å², respectively). The same overall architecture is also seen in FadR and CGL2915, but their FCD domains contain the additional α0 helix, which contributes significantly to the dimer contact. In FadR, the surface buried on dimerization is ~780 Å² per monomer, of which 112 Å² is contributed by Leu80, Ile82 and Leu83 from the α0 helix. In CGL2915, these buried surfaces are ~950 and ~145 Å², respectively; the latter surface is contributed by Ala79, Leu80, Ser83, Val84 and Gln87.
Figure 4 The dimerization interfaces of the FCD and FadR_C domains. For TM0439, two complete FCD domains are shown, with one monomer colored as in Fig. 3. Residues described in the text are represented as sticks. For the other structures only the helices …
Thus, the mode of dimerization of all FCD domains is highly conserved, notably in the absence of any significant amino-acid sequence similarities between the individual proteins. The unique nature of each interface suggests that heterodimerization is not possible within this family.
3.6. A novel metal-binding subfamily of FCD
Based on the FadR paradigm, it is thought that the regulatory domains of the FadR family bind small organic ligands and as a consequence undergo conformational changes that reorient the WH domains and affect their binding to cognate DNA. We were therefore interested in whether the structure of TM0439 might reveal a putative binding site for such a ligand. Indeed, we find an internal polar cavity in the FCD domain, at the bottom of which are three histidines (His134, His174 and His196) with imidazole groups arranged in a three-blade propeller with the Nε2 atoms pointing towards a strong peak of positive electron density. When a dummy atom was placed in this density and refined, it was found to be 2.0–2.2 Å from the three Nε2 atoms, which is consistent with the coordination stereochemistry of a metal ion.
Histidines primarily coordinate metal ions via Nε2 atoms (Chakrabarti, 1990b), even though they are preferentially protonated on these atoms in solution (Reynolds et al.). Thus, histidines within metal-binding sites typically donate hydrogen bonds through their Nδ1 atoms to carboxyl side chains or other hydrogen-bond acceptors (e.g. main-chain carbonyls) to stabilize the less favorable tautomeric form that is unprotonated on Nε2 (Argos et al.; Christianson & Alexander, 1989). In concert with this paradigm, two of the metal-binding histidines, i.e. His134 and His196, are stabilized in this form by hydrogen bonds to neighboring carboxylic acids (Glu173 Oε1 acts as an acceptor for His196 Nδ1 and Glu90 Oε1 for His134 Nδ1). In addition, His134 donates a Cδ2-H···O bond to the main-chain carbonyl of Asp130 (3.1 Å; Fig. 5). Similar C-H···O bonds involving the Cε1(H) group, which is modestly acidic, are commonly observed for histidines in proteins (Derewenda et al.), but those involving Cδ2(H) are rare.
Figure 5 Metal-binding sites of TM0439 (PDB code 3fms), CGL2915 (2di3) and PS5454 (3c7j). An OMIT map contoured at 5σ is shown for TM0439. This was generated by deleting the metal and acetate and truncating the histidines back to the Cβ atoms, …
The three imidazoles form a triangular propeller, with the angles at each Nε2 close to 60°. Further, the putative metal ion is elevated ~1.25 Å above the plane defined by the Nε2 atoms, as expected for tetrahedral coordination. The putative fourth position in the coordination sphere is unoccupied, and above it we find electron density consistent with a carbonate or an acetate ion, which may have originated from the crystallization mixture. The refined B value for the metal (36 Å²) was consistent with a divalent ion such as Zn2+. In order to identify the metal, we employed atomic absorption spectroscopy on the SeMet samples used for crystallization and found stoichiometric amounts of Ni2+. Metal removal was found to be kinetically impaired; greater than 48 h of dialysis against 10 mM EDTA and 2 mM DTT was required for its complete removal at 277 K. This slow removal may be a consequence of the inherently slow Ni2+ ligand-exchange kinetics as well as the relatively buried nature of the metal-binding site. We suspect that Ni2+ may have been inadvertently introduced during the purification protocol, i.e. Ni2+-affinity chromatography, and that Zn2+ is the physiological ligand; this is consistent with the tetrahedral coordination geometry, as well as the presence of histidines as coordinating residues, both of which favor Zn2+ (Dokmanić et al.).
Using tryptophan fluorescence, we measured the metal affinity of TM0439 for both Zn2+ and Ni2+. Fig. 6(a) shows the fluorescence emission spectrum upon excitation at 287 nm, with a characteristic tryptophan peak at λem = 340 nm. We find that Ni2+ binding is stoichiometric, 1:1, with Ka = (1.47 ± 0.01) × 10⁷ M⁻¹ (Kd = 68 ± 5 nM). Unexpectedly, Zn2+ binds with a stoichiometry of 2:1, with sequential binding constants of Ka1 ≥ (1.4 ± 0.1) × 10⁷ M⁻¹ (Kd1 ≤ 71 ± 5 nM) and Ka2 ≥ (4.5 ± 0.4) × 10⁵ M⁻¹ (Kd2 ≤ 2.0 ± 0.2 µM), respectively, with an approximately twofold increase in the Trp fluorescence (Fig. 6b). The origin of the second binding site is unknown and it is not clear whether the lower affinity site is of functional significance. We note that the protein contains a His6 tag which in principle could influence the apparent metal-binding affinities and stoichiometries. However, the N-terminal localization of the polyhistidine sequence virtually rules out any potential influence on the quantum yield of Trp154, which is located at the kink in the α4 helix of the C-terminal regulatory domain. Both Zn2+ and Ni2+ bind to synthetic histidine-rich sequences with affinities of ~10⁴ M⁻¹ (Whitehead et al.). Since the measured Zn2+-binding constants are lower limits (see legend to Fig. 6), it is unlikely that there is significant competition from the polyhistidine tail. Since we did not observe a secondary low-affinity Ni2+-binding site, it may be possible that it is masked by competition from the His6 tail. Taken together and considering the relative abundance of Zn2+ compared with Ni2+ for most organisms (Outten & O’Halloran, 2001), it is reasonable to hypothesize that TM0439 is a Zn2+-binding protein, although our analysis did not include other transition metals, e.g. Co or Mn, which in principle might also be involved.
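The relation between the association and dissociation constants quoted above, and the reason why a titration at micromolar protein concentration appears stoichiometric, can be checked with a short calculation. The Python sketch below assumes a simple 1:1 binding equilibrium solved exactly through the quadratic for the complex concentration. The function names are hypothetical, the protein concentration is the 5.3 µM given in the legend to Fig. 6, and this is not the fitting procedure used by the authors.

```python
import math


def kd_from_ka(ka_per_molar: float) -> float:
    """Dissociation constant (M) as the reciprocal of the association constant (M^-1)."""
    return 1.0 / ka_per_molar


def fraction_bound(protein_total: float, metal_total: float, kd: float) -> float:
    """Fraction of protein with metal bound for a single 1:1 site.

    Solves [PM]^2 - (Pt + Mt + Kd)[PM] + Pt*Mt = 0 for the physically
    meaningful (smaller) root; all concentrations are in mol/L.
    """
    s = protein_total + metal_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * protein_total * metal_total)) / 2.0
    return complex_conc / protein_total


# Ni2+ association constant quoted in the text.
kd_ni = kd_from_ka(1.47e7)          # roughly 6.8e-8 M, i.e. about 68 nM

protein = 5.3e-6                    # 5.3 uM TM0439 (legend to Fig. 6)
at_one_equivalent = fraction_bound(protein, 1.0 * protein, kd_ni)
at_two_equivalents = fraction_bound(protein, 2.0 * protein, kd_ni)
```

Because the protein concentration is almost two orders of magnitude above Kd, nearly all added metal is captured until one equivalent has been reached (about nine-tenths of the sites are filled at a 1:1 ratio under these assumptions), so the titration curve shows a sharp break at the binding stoichiometry rather than a gradual hyperbola.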
Figure 6 Metal binding by TM0439 monitored by Trp fluorescence: (a) 200 µM Ni2+ and (b) 200 µM Zn2+ titrated into 5.3 µM TM0439. The inset plots the emission (λ = 340 nm) versus the metal:protein …
Interestingly, the structures of both CGL2915 (PDB code 2di3) and PS5454 (PDB code 3c7j) also contain metals bound in stereochemically analogous sites. In CGL2915 the coordinating histidines are His148, His196 and His218 and their imidazoles are stabilized in the Nδ1-protonated tautomers by Glu106, Gln193 and Glu195, respectively. His148 is additionally stabilized by a C-H···O bond via Cδ2, as is the case for His134 of TM0439. However, another protein atom, Oδ1 of Asp144 (analogous to Asp130 in TM0439), serves as an axial ligand (distal to His218), resulting in slightly distorted trigonal bipyramidal coordination, with a water molecule completing the equatorial plane (Fig. 5). The same stereochemistry is preserved in the second, crystallographically independent, subunit. It is also interesting to note that Asp144 Oδ1 approaches the putative metal with the syn sp² orbital, as is usual in metal-binding sites (Chakrabarti, 1994). The ligand in CGL2915 is annotated as Zn2+ based on XAFS data (Gao et al.).
In the P. syringae regulator (PDB code 3c7j), the coordinating histidines are His148, His192 and His214, while the fourth ligand, equivalent to Asp144 in CGL2915, is Asn144. The His214 and Asn144 side chains serve as axial ligands and the latter is oriented with its side-chain O atom towards the metal. His192 and His214 are stabilized in the required tautomeric forms by hydrogen bonds from Nδ1 to Asp191 and Gln189, respectively. The His148 residue has the same interesting C-H···O bond to the carbonyl of Asn144 as its counterparts in CGL2915 and TM0439. In one subunit, a single water molecule is found in an equatorial plane, while in the second independent monomer two water molecules complete an octahedral coordination sphere (Fig. 5). The metal in this structure is annotated as Ni2+, consistent with the coordination preference and with reasonable B values.
Neither the FadR nor the RO03477 structure has a metal-binding site. In FadR, the three metal-coordinating histidines are replaced by Phe149, Tyr193 and Tyr215. In RO03477, one of the three histidines, His152, is present, but the other two are replaced by Asn196 and Tyr218, respectively, leaving no room for the metal.
An analysis of the genomic data for the FCD-domain family (PF07729) reveals that more than 2800 members have been identified to date in 402 species of eubacteria and four species of archaea. The amino-acid sequences show low average identity on full alignment (~21%). A majority (>70%) contain a complete set of motifs with all four putative metal-binding residues that together make up a consensus fingerprint, R-X
6-H, where Φ denotes a hydrophobic residue, typically Leu, Met or Ile, and residues involved in metal coordination are shown in bold. Because of poor amino-acid sequence conservation in this family, this fingerprint is not readily identifiable by automated sequence alignment.
Many bacterial species contain multiple FCD-family proteins: Mycobacterium smegmatis contains 46 of these regulators, Rhodococcus sp. RHA1 contains 49, Arthrobacter sp. (FB24) contains 28 and Agrobacterium tumefaciens contains 51. Interestingly, the sequences are very diverse within each species but in each case about two-thirds show conservation of all metal-binding amino acids. This situation is in stark contrast to the FadR_C family, for which there are only 71 annotated sequences in 70 species (with only one gene per organism) and an average amino-acid identity of 48%.
3.7. Functional implications
The structural evidence presented here strongly suggests that the majority of FCD domains and therefore the majority of FadR transcription regulators are metal (most likely Zn2+) dependent. What is not clear is whether these transcription factors are metal-sensing or whether the metal plays a structural role or perhaps is required for binding of other effector molecules through direct coordination bonds. Metal-sensing transcription factors are ubiquitous in prokaryotes, with seven major families characterized to date (Giedroc & Arunkumar, 2007). Five of these families, i.e. ArsR, MerR, CopY, Fur and DtxR, utilize WH domains, also found in the GntR regulators, for binding to dsDNA. Almost all of these proteins are dimeric and metals bind typically at or near dimer interfaces, enabling the metal-bound form of the regulators to repress, de-repress or activate the transcription of operons coding for metal-efflux pumps, transporters, redox machinery etc. (Giedroc & Arunkumar, 2007; Pennella & Giedroc, 2005; Silver & Phung, 2005). In the FCD domains, the metal-binding site is distinctly buried within an individual monomer and removal by dialysis takes a relatively long time, which would seem to argue against a role in sensing changes in metal concentration. It is therefore more plausible that the FCD domains bind carboxylic acids or small organic compounds containing carboxylic groups, so that the latter are buried and interact directly with the metal at the bottom of the ligand-binding cavity. The presence of acetate (or less likely carbonate) in the TM0439 structure is consistent with this hypothesis. However, the polar cavities observed inside the metal-binding FCD domains of TM0439 and CGL2915 are relatively small and do not appear to be able to bind larger organic compounds: calculations with a 1.4 Å probe yielded cavity volumes of only ~130 Å³ for TM0439 and ~72 Å³ for CGL2915. Interestingly, in PS5454 the volume of the cavity is difficult to estimate because one of the flanking loops is disordered in the crystal structure and the cavity appears to be open to bulk solvent. The loop that is disordered links the α4 helix with the α5 helix. We note that PS5454 is unique in that the α4 helix is straight, lacking the characteristic kink, and it is possible that the structure represents an ‘active’ conformer in which the cavities are open and able to bind a ligand, while the WH domains are ~68 Å apart, i.e. ideally positioned to bind to major grooves separated by two complete turns of the dsDNA.
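The correspondence between a ~68 Å separation and two helical turns can be verified with simple arithmetic. The Python sketch below assumes idealized B-form DNA with a rise of 3.4 Å per base pair and 10 base pairs per turn; both values are textbook approximations introduced here as assumptions and are not parameters reported in this work (a repeat of about 10.5 base pairs per turn is often quoted for DNA in solution).

```python
# Assumed idealized B-DNA parameters (not taken from this study).
RISE_PER_BP_ANGSTROM = 3.4
BP_PER_TURN = 10.0


def axial_distance(turns: float,
                   bp_per_turn: float = BP_PER_TURN,
                   rise: float = RISE_PER_BP_ANGSTROM) -> float:
    """Distance along the helix axis (in Angstrom) spanned by a number of turns."""
    return turns * bp_per_turn * rise


two_turns_ideal = axial_distance(2)                       # 10 bp per turn
two_turns_solution = axial_distance(2, bp_per_turn=10.5)  # 10.5 bp per turn
```

With these assumptions two turns span about 68 Å (or about 71 Å for a 10.5 bp repeat), which is in line with the spacing of the WH domains observed in PS5454.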
Further studies will be needed to fully characterize the new metal-binding subfamily of the FadR transcription regulators.