Geosmin (
1), whose name means “earth odor”, is a volatile microbial metabolite that is responsible for the characteristic smell of moist soil or freshly plowed earth
1,2. Geosmin is produced by a number of microorganisms, including most
Streptomyces and several species of cyanobacteria, myxobacteria, and fungi
4–10. The detection and elimination of
1, which has an exceptionally low threshold for human detection of less than 10 parts per trillion, are of considerable economic importance due to its association with undesirable musty or off-flavors in drinking water, wine, fish and other foodstuffs, and its resistance to removal by conventional water treatment
10,11.
The chemical origin of the characteristic odor of soil was first investigated in 1891 by Berthelot
12, but not until 1965 was the responsible agent, (−)-geosmin (
1), first isolated in pure form from the neutral extract of the fermentation broth of
Streptomyces griseus and the structure assigned
4,5. In 1981, incorporation of labeled acetate into geosmin using strains of
S. antibioticus suggested that this bicyclic C
12 metabolite might be a degraded sesquiterpene
13. Although there have been more than 800 papers dealing with the production, detection, and remediation of geosmin and other volatile metabolites in water supplies, aquaculture products, and wine, there were no further reports on the mechanism of microbial geosmin biosynthesis until five years ago
14.
Expression in
Escherichia coli of a 2181-bp gene from
S. coelicolor A3(2) (SCO6073) gives a 726-amino acid protein with significant similarity in both the N-terminal and C-terminal halves to the well-characterized sesquiterpene synthase, pentalenene synthase
15. The full length recombinant protein catalyzes the Mg
2+-dependent conversion of farnesyl diphosphate (FPP,
2) to a mixture of germacradienol (
3), germacrene D (
4), octalin
5, and geosmin (
1), without involvement of any cosubstrates or redox cofactors
3,15,16. Incubation of FPP (
2) with the closely related recombinant germacradienol–geosmin synthase from
S. avermitilis (SAV2163, GeoA) yields an essentially identical mixture of
3,
4,
5, and geosmin (
1)
17. Deletion of the
S. avermitilis geoA gene abolished both germacradienol and geosmin production, which could be restored by reintroduction of a copy of wild-type
geoA
17. Independently, researchers at the John Innes Institute have demonstrated that deletion of the
S. coelicolor SCO6073 gene abolishes geosmin formation
18.
Generation of germacradienol (
3) and germacrene D (
4) results from partitioning of a common intermediate, proposed to be carbocation
A, the initial product of ionization and cyclization of FPP (
2)
16. Formation of germacrene D (
4) involves a 1,3-hydride shift of H-1
si of FPP
16. The alternative formation of germacradienol (
3) involves competing loss of the original H-1
si proton of FPP (
2) and cyclization to a proposed enzyme-bound,
trans-fused bicyclic intermediate, isolepidozene (
6), a known compound previously isolated from liverworts
19 that would be converted to germacradienol by proton-initiated ring opening and capture of the resulting homoallyl cation by water. Further conversion of germacradienol to geosmin is proposed to involve protonation–cyclization, a novel retro-Prins type fragmentation resulting in loss of the 2-propanol side chain as acetone, and generation of the octalin intermediate
5. Reprotonation of
5 followed by a 1,2-hydride shift and quenching of the bridgehead cation by water will generate geosmin (
1)
20. The results of incubation reactions carried out in D
2O are also consistent with this mechanistic proposal
3.
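The atom bookkeeping behind this proposal can be checked with a short sketch. The molecular formulas below are the standard ones for these compounds and are an assumption of this example rather than values quoted in the text, which states only that geosmin is a C12 metabolite and that the side chain is lost as acetone. The helper names are hypothetical.

```python
from collections import Counter

# Standard molecular formulas (assumed for illustration, not quoted from the text).
GERMACRADIENOL = Counter({"C": 15, "H": 26, "O": 1})
ACETONE = Counter({"C": 3, "H": 6, "O": 1})
WATER = Counter({"H": 2, "O": 1})
OCTALIN = Counter({"C": 12, "H": 20})
GEOSMIN = Counter({"C": 12, "H": 22, "O": 1})
SESQUITERPENE_HYDROCARBON = Counter({"C": 15, "H": 24})  # e.g. germacrene D

# Integer (nominal) masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}


def nominal_mass(formula):
    """Return the nominal mass of a formula given as an element Counter."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


def is_balanced(reactants, products):
    """Return True if both sides contain the same number of each atom."""
    return sum(reactants, Counter()) == sum(products, Counter())


# Fragmentation step: germacradienol -> octalin + acetone.
fragmentation_ok = is_balanced([GERMACRADIENOL], [OCTALIN, ACETONE])
# Hydration step: octalin + water -> geosmin.
hydration_ok = is_balanced([OCTALIN, WATER], [GEOSMIN])
# Net conversion: germacradienol + water -> geosmin + acetone.
net_ok = is_balanced([GERMACRADIENOL, WATER], [GEOSMIN, ACETONE])

print(fragmentation_ok, hydration_ok, net_ok)
print(nominal_mass(GEOSMIN), nominal_mass(SESQUITERPENE_HYDROCARBON))
```

Under these assumed formulas, the fragmentation and hydration steps are each atom-balanced, so the net conversion of germacradienol to geosmin consumes one water and releases one acetone without any redox cofactor.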
Increasing either the concentration of germacradienol–geosmin synthase or the time of incubation enhances the relative proportion of geosmin to germacradienol as well as the absolute yield of
1.
3,17 These results are inconsistent with the formation of germacradienol and geosmin by partitioning of a series of exclusively enzyme-bound intermediates. Instead, a substantial fraction of the initially generated germacradienol must be released from the enzyme before rebinding to the protein and undergoing cyclization–fragmentation to generate geosmin. These earlier observations did not establish, however, whether germacradienol and geosmin formation take place at the same or at distinct active sites, or whether transient release of germacradienol is a mandatory event in geosmin biosynthesis. We now report conclusive evidence that germacradienol–geosmin synthase is a bifunctional enzyme possessing two independent active sites with distinct catalytic functions.
Two strictly conserved motifs found in all sesquiterpene and monoterpene synthases are essential to binding of the Mg
2+ cofactor: an aspartate-rich sequence
DDXX(
D/
E), that is found at amino acid (aa) residues 80–120 in microbial synthases and at aa 290–310 in plant synthases, and an NSE triad of residues, (
N/
D)DXX(
S/
T)XXX
E, generally found 140±5 aa downstream of the aspartate-rich motif
21–25. The vast majority of bacterial and fungal terpene synthases are 330–400 aa in length, corresponding to a subunit
Mr 35–45 kDa. The 726-aa germacradienol–geosmin synthase of
S. coelicolor is unusual in that it is about twice the size of a typical terpene synthase. Notably, both the N- and C-terminal halves of the SCO6073 protein show significant sequence similarity to the well-characterized sesquiterpene synthase, pentalenene synthase (28% identity, 41% similarity over the aa 1–319; 29% identity, 46% similarity over aa 407–726)
15. Both the N- and C-terminal regions of the protein harbor variants of the canonical aspartate-rich domain, with a
DDHFL
E motif in the N-terminal half and an unusual
DDYYP motif in the C-terminal half. Both halves also display typical NSE motifs at
NDLF
SYQR
E and
NDVF
SYQK
E. Curiously, the N-terminal half also appears to harbor an unusual repeat of the upstream NSE motif,
NDVL
TSRLHQF
E, located 38 aa downstream of the first. A 41-kDa truncated N-terminal mutant of germacradienol–geosmin synthase corresponding to aa 1–366 converts FPP to germacradienol, while a C-terminal truncation mutant (aa 383–726) catalyzes only slow solvolysis of FPP, with no detectable formation of cyclic products
15. Deletion of the N-terminal domain of SCO6073 also abolishes geosmin production by
S. coelicolor
18.
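The two consensus motifs described above, DDXX(D/E) and (N/D)DXX(S/T)XXXE, translate directly into regular expressions. The following is a minimal sketch, not the procedure used in this work: the function names are hypothetical, and the flanking residues in the usage line are invented placeholders. Only the motif strings themselves are taken from the text.

```python
import re

# Consensus motifs from the text, with X meaning any residue.
ASP_RICH = re.compile(r"DD..[DE]")          # DDXX(D/E)
NSE_TRIAD = re.compile(r"[ND]D..[ST]...E")  # (N/D)DXX(S/T)XXXE


def matches_consensus(pattern, segment):
    """Return True if the whole segment fits the strict consensus pattern."""
    return pattern.fullmatch(segment) is not None


def find_motifs(pattern, sequence):
    """Return (1-based position, matched text) for every hit, overlaps included."""
    lookahead = re.compile(f"(?=({pattern.pattern}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]


# Motif strings quoted in the text.
print(matches_consensus(NSE_TRIAD, "NDLFSYQRE"))   # N-terminal NSE motif
print(matches_consensus(NSE_TRIAD, "NDVFSYQKE"))   # C-terminal NSE motif
print(ASP_RICH.search("DDHFLE") is None)           # variant, one residue longer
print(ASP_RICH.search("DDYYP") is None)            # unusual C-terminal variant
print(NSE_TRIAD.search("NDVLTSRLHQFE") is None)    # unusual downstream repeat

# Usage on a toy sequence; the flanking alanines are placeholders.
print(find_motifs(NSE_TRIAD, "AAANDLFSYQREAAA"))
```

Under the strict consensus, both NSE motifs match, whereas DDHFLE, DDYYP, and the downstream NDVLTSRLHQFE repeat do not, which is consistent with their description here as variant or unusual sequences. A scan of a real sequence would need looser patterns to recover them.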
Based on the discovery that full-length recombinant SCO6073 protein can convert FPP to germacradienol and geosmin
3, we have sought to localize the geosmin synthase activity by examining the behavior of both N- and C-terminal truncation mutants. To this end, we generated a 56 aa-longer C-terminal construct using the
NcoI-
XhoI fragment of plasmid pRW31 encoding full-length SCO6073 as template for PCR amplification of the region corresponding to aa M327–H726 and purified the derived pJJ3 protein (
Mr 46,000), carrying a C-terminal His
6-tag, to >95% homogeneity by Ni
2+-NTA chromatography.
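The sizes quoted for the gene and the truncation constructs can be cross-checked with simple arithmetic. The sketch below assumes the common rule of thumb of about 110 Da per amino acid residue, which is not a value from this work, and ignores the His6 tag. The function names are hypothetical.

```python
# Rule-of-thumb average residue mass (an assumption, not a value from the text).
AVG_RESIDUE_MASS_DA = 110


def orf_length_aa(gene_bp):
    """Protein length encoded by an ORF of gene_bp bases, excluding the stop codon."""
    codons, remainder = divmod(gene_bp, 3)
    if remainder:
        raise ValueError("ORF length must be a multiple of 3")
    return codons - 1


def span_length(first_aa, last_aa):
    """Number of residues in an inclusive residue range."""
    return last_aa - first_aa + 1


def approx_mass_kda(n_residues):
    """Rough protein mass in kDa from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000


full_length = orf_length_aa(2181)          # 2181-bp SCO6073 gene
n_terminal = span_length(1, 366)           # N-terminal truncation, aa 1-366
c_terminal_short = span_length(383, 726)   # earlier C-terminal truncation
c_terminal_long = span_length(327, 726)    # pJJ3 construct, M327-H726

print(full_length)
print(c_terminal_long - c_terminal_short)
print(approx_mass_kda(n_terminal), approx_mass_kda(c_terminal_long))
```

With these assumptions, the 2181-bp gene encodes 726 residues, the new C-terminal construct is 56 residues longer than the aa 383–726 fragment, and the rough mass estimates fall within a few kilodaltons of the reported 41 kDa and Mr 46,000 values.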
Incubation of the N-terminal truncated mutant (pRW22p)
15 with 100 μM FPP (
2) generated a mixture of germacradienol (
3) and germacrene D (
4) accompanied by small quantities of octalin
5, but no detectable geosmin (
4), as determined by capillary gas chromatography–mass spectrometry (GC–MS) (
Supplementary Fig. 1a). Increasing the concentration of the N-terminal domain from 2.4 to 9.0 μM, conditions under which wild-type protein yields a mixture containing up to 15% geosmin, had minimal effect on the relative proportion of the three products. Significantly, addition of 2.4 μM recombinant C-terminal protein (pJJ3p) to the incubation mixture containing 5.4 μM truncated N-terminal protein and 100 μM FPP resulted in formation of geosmin (
1), with a concomitant decrease in the proportion of germacradienol (
3). Using 5.5 μM C-terminal protein increased the proportion of geosmin (
1) to 17% while the fraction of germacradienol dropped to 66%, with insignificant variation in the proportion of germacrene D (
4) in both cases and a doubling of the relative proportion of octalin
5 (
Supplementary Fig. 1b). Control experiments reconfirmed that incubation of FPP (
2) with the C-terminal protein alone generated only minor amounts of the acyclic solvolysis products (
E)-α-farnesene (
7) and nerolidol (
8) along with a trace of (
Z)-α-bisabolene (
9)
15.
| Table 1. Incubation of FPP with the N-terminal and C-terminal domains of S. coelicolor germacradienol–geosmin synthase. |
These results established either that the C-terminal protein directly catalyzes the formation of geosmin from germacradienol produced by the N-terminal domain or, less likely, that the C-terminal protein in some way modulates the catalytic activity of the N-terminal domain. To distinguish between these two possibilities, we incubated an 85:15 mixture of germacradienol (
3) and germacrene D (
4), containing ~1% octalin
5, obtained from incubation of FPP with N-terminal protein, with 1.6 μM C-terminal protein (pJJ3p) in the presence of 4 mM MgCl
2 for 6 h at 30 °C. GC–MS analysis indicated a 2% conversion to geosmin, which increased to 6% when the concentration of C-terminal mutant protein was raised to 3.1 μM (
Supplementary Fig. 2). Omission of Mg
2+ from the incubation mixtures completely abolished geosmin formation.
To explore further the role of the C-terminal domain in the formation of geosmin from FPP and germacradienol (
3), we used site-directed mutagenesis to target the conserved downstream aspartate-rich
DDYYP and the
NDVF
SYQK
E motifs of the full-length germacradienol–geosmin synthase (
Supplementary Table 1). When incubated for 11 h with 49 μM FPP, the D455N/D456N double mutant (19.5 μM) generated an 87:11 mixture of germacradienol (
3) and germacrene D (
4), accompanied by a trace of octalin
5, without any detectable geosmin. The N598L, D599L, S602A, and E606Q mutants gave similar product mixtures, again with complete suppression of geosmin formation. Importantly, geosmin production could be restored by pre-incubation of 16.1 μM of the D455N/D456N double mutant with 43 μM FPP for 5.5 h at 30 °C, followed by addition of 2.7 μM of the C-terminal truncation protein and incubation for a further 5.5 h, yielding an 89:8:1:2 mixture of
3,
4,
5, and geosmin (
1).
Site-directed mutagenesis of the presumptive Mg
2+-binding domains
21–25 in the N-terminal half of the germacradienol–germacrene D synthase resulted in decreases in catalytic efficiency as well as the generation of a variety of abnormal products that were characterized by GC–MS. Thus, the D86E mutant generated several aberrant products including the acyclic elimination products (
E)-α-farnesene (
7) and β-farnesene (
10) and a monocyclic sesquiterpene tentatively identified by retention index (RI) and mass spectrum as β-elemene (
11) (). Interestingly, the L90D mutant, with the canonical
DDHF
D sequence, was not only severely impaired catalytically, with a greater than 1,000-fold reduction in
kcat/
Km, but produced primarily a 1:6 mixture of β-elemene (
11) and γ-elemene (
12). Notably both the S272A and T271A mutants produced normal mixtures of
3,
4,
5, and geosmin (
1), with only relatively minor reductions in
kcat/
Km, suggesting that the unusual downstream NSE triad found in the N-terminal half of the protein plays no significant role in catalysis. Most interestingly, the S233A mutant, which was severely impaired catalytically (2,500-fold reduction in
kcat/
Km), produced, in addition to the normal products
1 (11%),
3 (57%),
4 (18%),
5 (3%), a new sesquiterpene hydrocarbon, identified as isolepidozene (
6) (11%,
m/z 204, RI 1483), on the basis of comparison of its mass spectrum and GC retention index with the data for authentic isolepidozene (
Supplementary Fig. 3).
| Table 2. Steady-state kinetic parameters and products for wild-type S. coelicolor (SCO6073) germacradienol–geosmin synthase and mutants. |
The experiments described above demonstrate unambiguously that the SCO6073 germacradienol–geosmin synthase of
S. coelicolor is a bifunctional enzyme in which the N-terminal half of the protein converts FPP (
2), the universal acyclic sesquiterpene precursor, to an 85:15 mixture of germacradienol (
3) and germacrene D (
4). The C-terminal half of the protein then rebinds the germacradienol and catalyzes proton-initiated cyclization–fragmentation of
3 to give geosmin (
1). Interestingly, the N-terminal domain also produces small amounts of octalin
5, thought to be an intermediate in the conversion of
3 to
1. Both the N- and C-terminal-catalyzed reactions have an absolute dependence on the divalent cation Mg
2+. Site-directed mutagenesis clearly implicates the Mg
2+-binding regions of each half of the protein in the conversion of FPP to germacradienol and thence to geosmin. While the role of Mg
2+ in terpene synthase-catalyzed cyclization of allylic diphosphate substrates is well understood
21, the role of the divalent cation in the electrophilic polyene-type cyclization is unclear, since the germacradienol substrate does not have a pyrophosphate moiety that might interact with the bound Mg
2+.
Although bifunctional sesquiterpene and monoterpene synthases have not previously been described, fungal
ent-kaurene synthases have two distinct active sites that catalyze successive electrophilic cyclization reactions
26,27. The first reaction in the sequence, known as a Type B reaction, is a proton-initiated polyene cyclization of geranylgeranyl diphosphate (GGPP,
13), catalyzed by the N-terminal domain of the protein, to generate the bicyclic intermediate copalyl diphosphate (CPP,
14) (
Supplementary Scheme 1). CPP is then rebound by the C-terminal domain of the diterpene synthase which catalyzes an ionization–cyclization–rearrangement to yield the tetracyclic diterpene
ent-kaurene (
15) (Type A reaction). Interestingly, the order of the Type A and Type B electrophilic cyclization reactions is reversed in these fungal diterpene synthases compared to
S. coelicolor germacradienol–geosmin synthase
26. A DVDD motif in the N-terminal domain of
ent-kaurene synthase is responsible for the proton-initiated polyene cyclization of the GGPP substrate, similar to the aspartate-rich DV
DDTA motif of squalene–hopene cyclase, in which the D376 residue initiates the Type B polyene cyclization, rather than acting as a Mg
2+-binding domain
28. One or both aspartate residues in the
DDYYP motif of SCO6073 may therefore play a similar role in the conversion of germacradienol to geosmin.
The genome sequences of a variety of geosmin-producing actinomycetes and cyanobacteria reveal at least ten additional presumptive geosmin synthases with a high level of sequence homology to SCO6073 (45–78% identity, 57–85% similarity over >700 aa) (
Supplementary Table 2). The active site
DDHFL
E,
NDLF
SYQR
E,
DDYYP, and
NDVF
SYQK
E motifs are especially highly conserved in all of these presumed geosmin synthases (
Supplementary Fig. 4), as is the unusual, duplicated downstream NSE triad in the N-terminal domain that plays no apparent functional role in FPP binding or catalysis.
The detection of free germacradienol, together with the demonstration that the C-terminal domain of SCO6073 converts germacradienol produced by the N-terminal domain to geosmin, establishes that the germacradienol intermediate is not channeled directly between the two active sites of the bifunctional protein. Instead, substrate transfer involves release of the initially generated intermediate 3 into the medium and diffusive rebinding to the active site of the C-terminal domain.
Previous
in vitro biochemical and
in vivo molecular genetic studies are in agreement that the N-terminal domain of
S. coelicolor SCO6073 is essential for the formation of geosmin
15,18. The definitive demonstration that the N-terminal domain converts FPP to germacradienol but not to geosmin, and that the C-terminal domain catalyzes the cyclization–fragmentation of germacradienol to geosmin, is at odds, however, with the earlier report that
S. coelicolor mutants carrying an in-frame deletion of aa 380–721 of the C-terminal domain of SCO6073 apparently could still produce geosmin
18.
The significant decreases in
kcat and increases in
Km as well as the formation of aberrant products as a consequence of mutations in the universally conserved Mg
2+-binding domains of terpene synthases are well precedented
24,25. Frequently the aberrant products result from premature quenching or derailment of the normal reaction cascade of enzyme-bound carbocationic intermediates. The
trans-fused bicyclic sesquiterpene isolepidozene (
6) derailment product generated by the S233A mutant is in fact a postulated intermediate in the conversion of the germacradienyl cation
A to germacradienol
3
3,17. The structure and stereochemistry of
6 are fully compatible with the observed regiochemistry and stereochemistry of labeling by chirally deuterated FPP of both germacradienol (
3) and geosmin (
1)
3,16. Further experiments to establish the mechanistic details of the conversion of germacradienol to geosmin and the role of the protein in mediating this fascinating transformation are in progress.