The genomes of sulfate-reducing bacteria remain poorly characterized, largely due to a paucity of experimental data and genetic tools. To meet this challenge, we generated an archived library of 15,477 mapped transposon insertion mutants in the sulfate-reducing bacterium Desulfovibrio alaskensis G20. To demonstrate the utility of the individual mutants, we profiled gene expression in mutants of six regulatory genes and used these data, together with 1,313 high-confidence transcription start sites identified by tiling microarrays and transcriptome sequencing (5′ RNA-Seq), to update the regulons of Fur and Rex and to confirm the predicted regulons of LysX, PhnF, PerR, and Dde_3000, a histidine kinase. In addition to enabling single mutant investigations, the D. alaskensis G20 transposon mutants also contain DNA bar codes, which enable the pooling and analysis of mutant fitness for thousands of strains simultaneously. Using two pools of mutants that represent insertions in 2,369 unique protein-coding genes, we demonstrate that the hypothetical gene Dde_3007 is required for methionine biosynthesis. Using comparative genomics, we propose that Dde_3007 performs a missing step in methionine biosynthesis by transferring a sulfur group to O-phosphohomoserine to form homocysteine. Additionally, we show that the entire choline utilization cluster is important for fitness in choline sulfate medium, which confirms that a functional microcompartment is required for choline oxidation. Finally, we demonstrate that Dde_3291, a MerR-like transcription factor, is a choline-dependent activator of the choline utilization cluster. Taken together, our data set and genetic resources provide a foundation for systems-level investigation of a poorly studied group of bacteria of environmental and industrial importance.
Sulfate-reducing bacteria contribute to global nutrient cycles and are a nuisance for the petroleum industry. Despite their environmental and industrial significance, the genomes of sulfate-reducing bacteria remain poorly characterized. Here, we describe a genetic approach to fill gaps in our knowledge of sulfate-reducing bacteria. We generated a large collection of archived, transposon mutants in Desulfovibrio alaskensis G20 and used the phenotypes of these mutant strains to infer the function of genes involved in gene regulation, methionine biosynthesis, and choline utilization. Our findings and mutant resources will enable systematic investigations into gene function, energy generation, stress response, and metabolism for this important group of bacteria.
Sulfate-reducing bacteria (SRB) are a diverse group of bacteria that can use sulfate as a terminal electron acceptor for growth. This method of energy conservation is considered to be an ancient form of respiration: it is estimated that SRB-mediated sulfate reduction has existed for ~3 billion years and was an important process during the early stages of life on earth (1). SRB are found in many diverse environments and contribute to the global sulfur and carbon cycles, including the mineralization of organic carbon in sea sediments (2). SRB are also common inhabitants of the human microbiome (3, 4), where they may play a role in inflammatory bowel disease (5).
SRB are important in a number of industries and applications. In the oil industry, SRB contribute to the souring of oil via the production of sulfides and corrosion of pipelines and wells (6). In wastewater treatment plants, SRB are used to remove sulfates and convert hydrogen sulfide by-products into precipitated heavy metals (7). Similarly, SRB play an important role in bioremediation by reducing and immobilizing heavy metals (8). Lastly, SRB hold potential for use in biological fuel cells to generate energy (9).
The SRB Desulfovibrio alaskensis G20, formerly known as Desulfovibrio desulfuricans G20, is derived from the G100A strain isolated from an oil well in California (10). Relative to the G100A strain, the D. alaskensis G20 strain is a spontaneously nalidixic acid-resistant mutant that is also cured of the native plasmid pBG1 (11). The sequenced D. alaskensis G20 genome has been annotated with proteomic and transcript data to improve gene calls (12). D. alaskensis G20 is quite distant from a well-studied SRB of the same genus, Desulfovibrio vulgaris Hildenborough; their 16S rRNA sequences share 90% sequence similarity and D. alaskensis G20 shares 1,873 of its 3,258 protein-coding genes with D. vulgaris Hildenborough. Genetic tools based on homologous recombination, including markerless deletions and epitope tagging, are available for D. vulgaris Hildenborough (13, 14), but such tools have yet to be developed in D. alaskensis G20. The D. alaskensis G20 genome is predicted to contain 133 transcription factors and sigma factors (15). To date, only four of these transcription factors have been characterized experimentally: ArsR (16), MreC (17), SahR (18), and Dde_1614 (19). However, using comparative genomics, DNA binding motifs and target genes have been predicted for 50 of these regulators (20–23).
Here, we describe the generation and preliminary analysis of a collection of D. alaskensis G20 transposon insertion mutants that have been tagged with DNA bar codes for high-throughput analysis of mutant fitness using competition assays. The transposon insertion location has been mapped for the entire collection, and the collection is archived to also allow single mutant investigations for the majority of genes. The D. alaskensis G20 transposon collection includes insertion mutants in 2,513 protein-coding genes and has already been used to investigate the suboptimality of gene expression in bacteria (24) and syntrophic growth of D. alaskensis G20 with methanogens (25). Another D. alaskensis G20 DNA bar-coded transposon collection has previously been described (26) and has been used to identify genes important for fitness in sediment (26, 27), H2 oxidation (28), and syntrophic growth with a methanogen (29). However, the Groh et al. collection is about a third of the size of our collection and has limited capacity for parallel analysis of mutant fitness because only 66 unique bar codes were used (26). In addition, only a fraction of the transposon insertions of the Groh et al. collection have been mapped, so the entire collection typically has to be screened for a phenotype before a follow-up study can begin (29).
In this paper, we highlight the utility of the D. alaskensis G20 transposon collection for generating insights into SRB gene essentiality, gene regulation, and metabolism. In addition to genes directly involved in sulfate reduction, we identified genes in folate, thiamine, and menaquinone synthesis as essential in D. alaskensis G20. To experimentally validate and update computationally predicted regulons, we measured gene expression in individual transposon mutants of regulatory genes. To augment the analysis of these expression data, we mapped the architecture of the D. alaskensis G20 transcriptome and identified 1,313 transcription start sites (TSSs) with high-density tiling microarrays and transcriptome sequencing (5′ RNA-Seq). Using the combined TSS and gene expression data, we updated the regulons of σ54, Fur, and Rex and confirmed the expected regulons of LysX, PerR, PhnF, and the histidine kinase Dde_3000. Taking advantage of DNA bar codes introduced into the transposon mutants, we generated two pools of D. alaskensis G20 transposon mutants for the parallel analysis of mutant fitness. Using the competitive fitness assay and single-gene validation, we demonstrated that the conserved hypothetical gene Dde_3007, which belongs to the uncharacterized family DUF39, is required for methionine synthesis and specifically for homocysteine synthesis. Lastly, we used the competitive fitness assay to verify that most genes of the choline utilization cluster are required for the anaerobic oxidation of choline and confirm, through expression analysis, that Dde_3291 regulates this gene cluster. As described here, our comprehensive collection of D. alaskensis G20 mutants is a valuable resource for the systems-level investigation of SRB physiology, both as single mutants and in a pooled fitness assay.
To enable systems-level investigation of a sulfate-reducing bacterium, we isolated and mapped the insertion location for 15,477 D. alaskensis G20 Tn5 mutants. Of the mutants, 14,834 were isolated on lactate-sulfate medium and the other 643 were isolated on lactate-sulfite medium. These mutants are maintained as an archived collection of individual strains and are available to the community for single-gene studies (see Data Set S1 in the supplemental material). As shown in Fig. 1A, the 15,477 transposon insertions are distributed roughly evenly across the chromosome but with some bias toward the origin. The 15,477 mapped mutants include insertions in 2,513 of the 3,258 (77%) protein-coding genes in the D. alaskensis G20 genome. For 2,314 genes, we mapped an insertion within the central portion (5 to 80%) of the gene (Fig. 1B).
Protein-coding genes with no mapped insertions may be essential for viability in the medium that we used to select the mutants (primarily lactate-sulfate). We categorized 337 D. alaskensis G20 genes with no insertions in the middle of the coding sequence (CDS) (defined as between 5 and 80% of CDS length) and with sequence similarity to a known essential gene in other bacteria as “expected essential” (see Materials and Methods). In addition, we identified 50 Desulfovibrio-specific essential genes that met the following criteria: the gene had to (i) have an ortholog in Desulfovibrio vulgaris Hildenborough, Desulfovibrio vulgaris Miyazaki, and Desulfovibrio desulfuricans ATCC 27774; (ii) not share significant homology to an essential gene contained in the OGEE database (30); (iii) be adjacent to and cotranscribed with another essential gene; and (iv) be at least 300 nucleotides in length. We considered genes conserved among these four members of the Desulfovibrio genus as functionally important and therefore more likely to be essential than less conserved genes without insertions. Short genes of less than 300 nucleotides, genes that contain repetitive elements that cannot be uniquely mapped, and genes without central insertions and without an ortholog to a known essential gene or adjacent to another essential gene in the same operon were not considered essential (Fig. 1B). We used the operon criterion as a filter to help identify essential genes because genes in the same operon often have similar functions. A complete list of expected and Desulfovibrio-specific essential genes is contained in Data Set S2 in the supplemental material.
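The classification rules described above can be summarized as a small decision procedure. The following Python sketch is illustrative only and is not the pipeline used in this study: the `Gene` record, its field names, and the label strings are hypothetical, and the homology, orthology, and operon calls are assumed to have been computed elsewhere. It also assumes that the 300-nucleotide length filter applies to both classes and it omits the filter for repetitive genes that cannot be uniquely mapped.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    """Hypothetical per-gene record; all flags are assumed to be precomputed."""
    locus: str
    length_nt: int
    insertion_fractions: List[float]    # insertion positions as a fraction of CDS length
    similar_to_known_essential: bool    # similarity to an essential gene in other bacteria
    conserved_in_desulfovibrio: bool    # ortholog present in the three comparison genomes
    cotranscribed_with_essential: bool  # adjacent to and cotranscribed with an essential gene


def has_central_insertion(gene, low=0.05, high=0.80):
    """True if any insertion falls within 5 to 80% of the coding sequence."""
    return any(low <= f <= high for f in gene.insertion_fractions)


def classify(gene):
    """Apply the criteria in order; label strings are illustrative."""
    if has_central_insertion(gene):
        return "central insertion mapped"
    if gene.length_nt < 300:
        return "not classified"  # assumption: short genes are excluded from both classes
    if gene.similar_to_known_essential:
        return "expected essential"
    if gene.conserved_in_desulfovibrio and gene.cotranscribed_with_essential:
        return "Desulfovibrio-specific essential"
    return "not classified"


# Synthetic examples (not real loci).
examples = [
    Gene("geneA", 1200, [], True, True, False),
    Gene("geneB", 900, [0.02, 0.93], False, True, True),
    Gene("geneC", 1500, [0.40], True, True, True),
    Gene("geneD", 240, [], False, True, True),
]
for g in examples:
    print(g.locus, classify(g))
```

In this sketch, a gene whose only insertions lie in the first 5% or last 20% of the coding sequence (geneB) is still treated as lacking a central insertion, matching the definition used for Fig. 1B.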
Some of the Desulfovibrio-specific essential genes were anticipated based on their known, vital role in energy conservation. These genes include (i) the quinone-interacting membrane-bound oxidoreductase qmoABC (Dde_1111:4) (31), (ii) the adenylyl-sulfate reductase component aprB (Dde_1109), (iii) dissimilatory sulfite reductase dsrAB (Dde_0526:7), and (iv) transmembrane electron carrier components dsrMKJP. dsrO also lacked insertions but shared enough homology with nrfC from Salmonella enterica serovar Typhimurium LT2 to be considered an expected essential. We mapped transposon insertions in two genes known to be essential for sulfate reduction in the fraction of the collection that we isolated in lactate-sulfite medium: sat (sulfate adenylyltransferase; Dde_2265) and aprA (adenylyl-sulfate reductase; Dde_1110).
We also classified a number of genes involved in the biosynthesis of the cofactors NAD (nadAC), folate (folCPKD, Dde_2197), and menaquinone as either expected or Desulfovibrio-specific essentials. nadB did not make either list of essentials but also lacks transposon insertions. Desulfovibrio genomes do not contain an annotated dihydroneopterin aldolase (encoded by the folB gene) of the typical folate synthesis pathway. However, it has been demonstrated that 6-pyruvoyl tetrahydrobiopterin synthase paralogs, including DVU1352 from D. vulgaris Hildenborough, functionally rescue Escherichia coli folB mutants (32). Consistent with these observations, we classified Dde_2197, a putative 6-pyruvoyl tetrahydrobiopterin synthase and ortholog of DVU1352, as essential. An alternative pathway for menaquinone synthesis that uses futalosine as an intermediate has been described in Streptomyces species (33), and orthologs of these Streptomyces enzymes were classified as putative essentials in D. alaskensis G20 (Dde_0796:0799, Dde_3188, Dde_3185, Dde_1392, Dde_1323, and Dde_0150).
To aid in the interpretation of the transposon mutant fitness data and regulon inference from gene expression profiling, as described below, we collected high-resolution tiling microarray and 5′ RNA-Seq data from D. alaskensis G20 to identify operons, promoter motifs, and transcriptional start sites (TSSs). A representative 7-kb window of the D. alaskensis tiling microarray and 5′ RNA-Seq data is illustrated in Fig. 2A. To identify D. alaskensis G20 promoter motifs, we examined the upstream regions of 1,172 preliminary TSSs identified from the 5′ RNA-Seq and tiling microarray data and found two significant motifs, for σ70 (642 instances, P < 10⁻⁴⁴⁰) and σ54 (RpoN; 20 instances, P < 10⁻¹⁵) (Fig. 2B and C). The D. alaskensis G20 σ70 motif is very similar to the σ70 motif that we previously identified in D. vulgaris Hildenborough (34). Compared to the E. coli σ70 motif, the D. alaskensis G20 σ70 motif has a shortened −10 box and a stronger −35 box, which confirms our previous findings in D. vulgaris Hildenborough (34).
To identify new RpoN targets in D. alaskensis G20, we scanned the sequences upstream of the 1,172 preliminary TSSs with the Desulfovibrio RpoN motif from RegPrecise (21). From this analysis, we identified 11 new RpoN-dependent promoters that were previously unannotated in RegPrecise: Dde_2287:Dde_2285, Dde_0420:Dde_0418, Dde_3398 (at codon 13 within the open reading frame [ORF]), Dde_0818:Dde_0819, Dde_0062, Dde_1408, Dde_1017, Dde_0645, Dde_1501, and two unannotated small RNAs starting at positions 3627439 on the plus strand and 86651 on the minus strand (Fig. 2A). In contrast to σ70 and σ54, we did not identify a motif that corresponds to the remaining D. alaskensis G20 sigma factor, RpoH, nor did we find TSSs at the expected locations given the predictions in RegPrecise. We speculate that RpoH is not active under the growth conditions that we used for transcriptome analysis.
Using the identified D. alaskensis G20 σ70 and σ54 promoter motifs in combination with the tiling microarray and 5′ RNA-Seq data, we applied a semisupervised machine learning approach to identify genuine TSSs (see Materials and Methods). At a false discovery rate of 3%, we identified 1,313 high-confidence TSSs in D. alaskensis G20 (see Data Set S3 in the supplemental material for a full list).
A key challenge in microbial systems biology is mapping and modeling the gene regulatory networks of environmental bacteria. Despite the success of comparative genomics for predicting gene regulation in Desulfovibrio (20, 21, 23), the majority of D. alaskensis G20 regulators remain without predictions; most computational predictions are not experimentally verified; and even if a motif prediction exists, all targets may not be identified, as we demonstrated for RpoN. To address these challenges and to demonstrate the utility of the archived transposon mutant collection for targeted single-gene investigations, we measured gene expression in mutants of lysX, fur, rex, Dde_3000, perR, and phnF to validate and expand their predicted regulons.
LysX (Dde_2665) is a putative regulator of lysine utilization (23), and our tiling data confirm that lysX is cotranscribed in an operon with lysA. In addition to lysXA, LysX is predicted to regulate the lysine transporter LysW and the uncharacterized protein Dde_2468. In a defined medium with no lysine present, we observed little effect of the lysX mutant on gene expression relative to wild-type D. alaskensis G20 (Fig. 3A). However, in a defined medium with lysine, the lysX mutant strain had strongly increased expression of lysXA and lysW (Fig. 3B). Therefore, in the presence of excess lysine, it appears that LysX represses the last step in lysine biosynthesis (LysA) and the uptake of lysine (LysW). As D. alaskensis G20 is not believed to catabolize lysine, repressing the uptake of excess lysine may be adaptive. The expression of Dde_2468 did not respond to the presence of lysine, but the expression of the divergently transcribed gene Dde_2469 was altered in the lysX mutant (Fig. 3B). It is possible that binding of LysX to the site between Dde_2468:Dde_2469 affects the expression of Dde_2469 and not Dde_2468.
In a mutant for fur (Dde_2676), the ferric uptake regulator, most of the RegPrecise-predicted targets are strongly upregulated (Fig. 3C). Using the expression data and high-confidence TSSs, we identified two new members of the Fur regulon, Dde_3146:Dde_3144 and Dde_1239 (Fig. 3C), which encode hypothetical proteins, are induced in the fur mutant, and have Fur sites near the TSSs. Two predicted Fur targets, Dde_0753 (fur3) and Dde_0133 (bfr), are downregulated in the fur mutant, but their respective TSSs are near the Fur sites, so these predictions are still likely to be correct. Alternatively, Fur3 is a paralog of Fur (46% identity), so its downregulation could indicate that fur3 (and possibly bfr) is actually regulated by Fur3 and that the activity of Fur3 increases in a fur mutant background. In our expression data, fur is expressed more highly than fur3, and fur, but not fur3, shows strong fitness effects (24), so we expect that Fur is the major regulator. Finally, the expression of the predicted Fur targets Dde_2805:Dde_2807 and Dde_2677:Dde_2676 did not change in the fur mutant. There is little expression of Dde_2805 in our transcriptomic data, so we cannot evaluate the Fur site relative to the TSS. The Fur site upstream of Dde_2677 is proximal to a TSS, but there is also read-through from the upstream genes, so we cannot draw a clear conclusion in this case either.
The redox-responsive repressor Rex regulates energy metabolism in a wide range of bacteria (22). In a rex mutant, we found that many predicted targets were upregulated as expected, but by less than 2-fold (“targets 1” in Fig. 3D). To confirm that these mild effects were specific to the rex mutant, we compared the expression data from the rex mutant to the expression data from other mutant strains. More precisely, we used linear regression to fit log2 expression levels in the rex mutant, using data from all of the other strains that were measured with the same array design (including wild-type D. alaskensis G20). Effects that cannot be predicted by this model are more likely to be directly due to the disruption of rex as opposed to subtle variations in growth conditions. A comparison of the model to the rex mutant data confirmed many of the expected targets. These include genes that are essential for sulfate reduction, namely, qmoABCD (Dde_1111:Dde_1114), sat (Dde_2265), adenylate kinase (Dde_2028), and pyrophosphatase (Dde_1178).
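The regression step described above can be illustrated with a short Python sketch. It is a minimal, dependency-free example under stated assumptions, not the analysis code used in this study: the function names are hypothetical, the expression values are synthetic, and ordinary least squares with an intercept is assumed as the fitting method. The log2 expression profile of one mutant is fitted as a linear combination of the profiles of the other strains, and genes with large residuals are those whose expression the other strains cannot predict.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (predictor profiles not collinear).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def regression_residuals(target, predictors):
    """Residuals of an ordinary least-squares fit of `target` on `predictors`.

    target: log2 expression per gene in the mutant of interest.
    predictors: one list of log2 expression values per other strain.
    """
    n_genes = len(target)
    columns = [[1.0] * n_genes] + [list(p) for p in predictors]  # intercept first
    xtx = [[sum(ci[g] * cj[g] for g in range(n_genes)) for cj in columns]
           for ci in columns]
    xty = [sum(ci[g] * target[g] for g in range(n_genes)) for ci in columns]
    beta = solve_linear_system(xtx, xty)  # normal equations
    fitted = [sum(b * col[g] for b, col in zip(beta, columns))
              for g in range(n_genes)]
    return [t - f for t, f in zip(target, fitted)]


# Synthetic log2 expression values for eight genes (not real data).
wild_type = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
other_mutant = [2.0, 1.0, 2.0, 5.0, 6.0, 5.0, 6.0, 9.0]
mutant_of_interest = [1.0, 2.0, 3.0, 6.0, 5.0, 6.0, 7.0, 8.0]  # gene 3 is induced

residuals = regression_residuals(mutant_of_interest, [wild_type, other_mutant])
top_gene = max(range(len(residuals)), key=lambda g: abs(residuals[g]))
print("gene with the largest residual:", top_gene)
```

With real arrays, the number of genes is far larger than the number of strains, so the fit is well determined; with only a handful of genes, as in this toy example, the residuals should be read qualitatively.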
Conversely, other energy-production genes in the predicted Rex regulon were not induced in the mutant (“targets 2” in Fig. 3D), including sulfite reductase (dsrABD), adenylyl-sulfate reductase (apsAB), transmembrane complex (dsrMKJOP) (35), and type 1 cytochrome c3:menaquinone oxidoreductase (qrcABCD) (36). This might suggest that these genes are not actually targets of Rex, but their Rex sites are well conserved in other Desulfovibrio species (21). Additionally, studies with purified Rex protein from D. vulgaris Hildenborough confirmed that Rex binds some of these sites in vitro (J. Wall, personal communication). Instead, the lack of a response for these genes in the rex mutant seems to indicate a more complex mechanism of regulation. Two predicted target operons, dhcA-rnfDGEABF (Dde_0580:Dde_0587) and hysBA (Dde_2134:Dde_2135), were downregulated in the rex mutant, but these are predicted to be under complex regulation by other regulators as well. We removed the gene downstream of hysBA, Dde_2136, from the Rex regulon, as the tiling microarray data suggested that it is transcribed separately. Finally, some of the genes induced in the rex mutant that were not in the original regulon prediction have strong hits to the Rex motif near their TSS. Therefore, we added Dde_0552, Dde_1140, Dde_1591:Dde_1590, and Dde_2058 to the Rex regulon.
Our tiling microarray data confirm that the putative histidine kinase Dde_3000 is cotranscribed in an operon with the DNA binding response regulator Dde_3003. The ortholog of Dde_3003 in D. vulgaris Hildenborough, DVU2934, has a single specific binding site upstream of lpxC (37). Furthermore, a binding motif for DVU2934 was identified and confirmed by gel shift assays, and this motif is present upstream of lpxC in D. alaskensis G20 (37). Thus, we predict that Dde_3000 signals to Dde_3003 to control the expression of lpxC (Dde_2986). Consistent with this view, lpxC was strongly upregulated in the Dde_3000 mutant strain (Fig. 3E). Another possibility is that the insertion of a transposon within Dde_3000 would decrease the expression of Dde_3003 in the mutant strain, but we did not observe any decrease in the expression of Dde_3003. After taking expression data from other mutant strains into account with a linear regression, lpxC seems to be the only gene that is strongly upregulated in the Dde_3000 mutant (Fig. 3F). We examined some of the other outliers but did not find any hits to the response regulator’s motif. Thus, we confirmed that Dde_3000 signals to Dde_3003, and it appears that lpxC is the only target of Dde_3003, as in D. vulgaris Hildenborough. Dde_3003 is a predicted σ54-dependent transcriptional activator. Consistent with this, lpxC has a σ54 binding site (CGGCACGATTATTGCT) just upstream of the TSS, and the predicted binding site of Rajeev et al. (37) for Dde_3003 (GTGTAAAAAACACACA) is centered at −101 relative to the TSS. Since lpxC is upregulated in the Dde_3000 mutant, this implies that during growth in LS4D, Dde_3000 reduces the activity of Dde_3003.
PerR (Dde_3674) is a peroxide-sensing repressor involved in the regulation of oxidative stress. In a D. vulgaris Hildenborough perR (DVU3095) mutant, the predicted targets were derepressed during lactate-sulfate growth (38). Similarly, we observed that all four of the predicted members of the PerR regulon of D. alaskensis G20 (Dde_1143, Dde_1222, Dde_1320, and Dde_3674) were strongly induced in the perR mutant grown in lactate-sulfate medium (Fig. 3G). While additional genes changed expression in the mutant, we did not find hits to the PerR motif upstream of these genes, and so these probably result from indirect effects.
We measured expression in two mutants of phnF (Dde_3327), which encodes a putative regulator of phosphonate utilization (20). Similarly, in Mycobacterium smegmatis, a homolog of PhnF represses phosphonate utilization genes (39). Expression data from the D. alaskensis G20 phnF mutants were poorly correlated with data from the wild type and hence were hard to interpret (Fig. 3H). After comparison of the expression data from the phnF mutants to data from all other mutant strains using the regression model, it appears that all of the expected PhnF targets (Dde_3328:Dde_3336) are expressed more highly in the phnF mutants, as expected (Fig. 3I). Thus, our data confirm that Dde_3327 encodes a repressor of phosphonate utilization genes in D. alaskensis G20.
In each of the above examples, either we validated the predicted regulons for repressors using our baseline medium or we took advantage of the predicted signal for the regulator to profile gene expression under a physiologically relevant condition (LysX). To extend this workflow to the de novo discovery of new regulons for activators, the relevant signal should be first identified prior to expression profiling of the single regulatory mutant strain. In instances where the signal(s) is unknown, high-throughput mutant fitness profiling, such as described below for the choline utilization regulator, can be used to identify these signals. Given the scale on which these mutant fitness assays can be performed (40), this general workflow holds promise for uncovering new regulons.
To characterize nonessential genes in D. alaskensis G20, we constructed two pools of mutants and performed competitive fitness assays to simultaneously measure the fitness of 2,369 genes (40, 41). To calculate “gene fitness” scores for each gene, we averaged the fitness values for the insertion strains of the same gene, as described previously (40). Negative gene fitness scores are indicative of genes whose mutations result in reduced fitness relative to the typical strain in the pools. To validate this approach in D. alaskensis G20, we compared the fitness of 2,369 genes in LS4D versus LS4D supplemented with Casamino Acids. As expected, the fitness defects of many predicted amino acid biosynthesis genes were rescued with the addition of Casamino Acids (Fig. 4A).
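The gene-level summary and the rescue comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than the published analysis code: per-strain fitness values are assumed to be already computed, the function names are hypothetical, and the two thresholds used to call a gene "rescued" are arbitrary choices made only for this example.

```python
from collections import defaultdict
from statistics import mean


def gene_fitness(strain_fitness, strain_to_gene):
    """Average the fitness values of all insertion strains in the same gene."""
    per_gene = defaultdict(list)
    for strain, value in strain_fitness.items():
        per_gene[strain_to_gene[strain]].append(value)
    return {gene: mean(values) for gene, values in per_gene.items()}


def rescued_genes(minimal, supplemented, defect=-1.0, recovered=-0.5):
    """Genes with a defect in minimal medium that disappears with the supplement.

    The thresholds `defect` and `recovered` are illustrative assumptions.
    """
    return sorted(
        gene for gene, value in minimal.items()
        if value <= defect and supplemented.get(gene, float("-inf")) >= recovered
    )


# Synthetic strains and fitness values (not real data).
strain_to_gene = {"s1": "geneA", "s2": "geneA", "s3": "geneB", "s4": "geneC"}
fitness_minimal = {"s1": -2.0, "s2": -3.0, "s3": -0.1, "s4": -2.5}
fitness_supplemented = {"s1": -0.2, "s2": 0.0, "s3": 0.1, "s4": -2.4}

minimal = gene_fitness(fitness_minimal, strain_to_gene)
supplemented = gene_fitness(fitness_supplemented, strain_to_gene)
print(rescued_genes(minimal, supplemented))
```

Here geneA behaves like an auxotroph that is rescued by the supplement, geneB has no defect in either medium, and geneC has a defect that the supplement does not relieve.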
Because the methionine synthesis pathway in Desulfovibrio is still unknown (42), we used the competitive, pooled mutant fitness assay to identify auxotrophs specifically rescued by the addition of methionine. In addition to the expected methionine biosynthesis genes hom (Dde_2731) and metH (Dde_2115), supplementation of minimal medium with methionine also rescued the fitness defects of the uncharacterized genes Dde_2711 and Dde_3007 (Fig. 4B). The D. alaskensis G20 MetH is missing the N-terminal “activation” domain [for reducing Co(II) to Co(I)] that is present in E. coli MetH. To identify this activity in D. alaskensis G20, we examined the new methionine auxotrophs identified by our fitness assay and found that Dde_2711 encodes a predicted ferredoxin and has homology to this missing activation domain of E. coli MetH. Dde_3007 encodes a conserved protein annotated as domain of unknown function DUF39. To determine if Dde_3007 is required for methionine biosynthesis, we complemented the methionine auxotrophy of a Dde_3007 mutant strain with a plasmid-carried copy of wild-type Dde_3007 (Fig. 4C). In the absence of the complementation plasmid, the addition of methionine or homocysteine also rescued the Dde_3007 mutant (Fig. 4D). In contrast, the addition of O-succinylhomoserine, l-homoserine, O-acetylhomoserine, or cystathionine did not rescue the methionine auxotrophy of the Dde_3007 mutant (data not shown). Taken together, these results suggest that Dde_3007 performs a step in methionine synthesis between l-homoserine and homocysteine.
We used these mutant fitness results to predict the methionine biosynthesis pathway in D. alaskensis G20 (Fig. 4E). Dde_2048 (lysC), an aspartate kinase, and Dde_0254 (asd), an aspartate-semialdehyde dehydrogenase, show only moderately reduced fitness in minimal medium (Fig. 4B), possibly due to redundancy in the D. alaskensis G20 genome (i.e., proAB [Dde_1633, Dde_2689] or argBC [Dde_2015, Dde_3455]). The uncertainty in the pathway remains between l-homoserine and homocysteine, as D. alaskensis G20 lacks the metB and metC genes of the classic methionine biosynthesis pathway from E. coli. In D. alaskensis G20, we propose that l-homoserine is activated to O-phosphohomoserine by an unknown enzyme(s). We have indirect evidence that O-phosphohomoserine is a metabolite in the D. alaskensis G20 methionine biosynthesis pathway: O-phosphohomoserine serves as a common metabolite for threonine and methionine synthesis in Methanococcus jannaschii (43), and D. alaskensis G20 contains ThrC (Dde_0171), the enzyme that converts O-phosphohomoserine to threonine. In addition, the new enzyme identified here as putatively part of the methionine biosynthesis pathway, Dde_3007, has a homolog in M. jannaschii. We propose that Dde_3007 performs a step in the methionine biosynthesis pathway between the activated l-homoserine and homocysteine intermediates (Fig. 4D; see below for full explanation). D. alaskensis G20 contains two predicted methionine synthase genes, a vitamin B12-independent enzyme encoded by metE (Dde_2328) and a vitamin B12-dependent enzyme encoded by metH (Dde_2115). metE does not have a significant phenotype in minimal medium and is probably not the predominant methionine synthase in D. alaskensis G20 under our growth conditions. In contrast, metH mutants have reduced fitness in minimal medium but are only moderately rescued by the addition of methionine (Fig. 4B). 
One potential reason for the incomplete rescue of the metH mutant with methionine is that there are not enough methyl groups in the mutant to obviate the need for the S-adenosyl-l-methionine (SAM) cycle.
Orthologs of Dde_3007 are found in other organisms which are known to synthesize methionine but which do not contain known genes for transforming l-homoserine to homocysteine, including DET0921 in Dehalococcoides ethenogenes 195 (44) and MJ0100 in Methanococcus jannaschii DSM 2661 (43). The ortholog of Dde_3007 in M. jannaschii, MJ0100, also contains a CBS domain that has been shown to sense SAM (45). This CBS domain is absent from Dde_3007, so we speculate that the enzyme is under feedback inhibition by SAM in M. jannaschii but not in D. alaskensis G20. Orthologs of Dde_3007 are sometimes found in close proximity to a putative homoserine kinase (Dester_DRAFT_0700 from Desulfurobacterium thermolithotrophum BSA, or ThenaDRAFT_1089 from Thermodesulfobium narugense Na82), which suggests that Dde_3007 is not the missing homoserine kinase but rather has another role. Furthermore, orthologs of Dde_3007 are often found adjacent to a ferredoxin domain or fused to it (i.e., THA_1098 in Thermosipho africanus). Dde_3007 orthologs are also often adjacent to COG2122; unfortunately, our mutant collection does not contain an insertion within the representative in D. alaskensis G20 (Dde_2535), but this family contains an ApbE-like domain that is probably a flavin transferase (46). The proximity to these genes suggests that Dde_3007 participates in a redox reaction. Indeed, a biochemical study of methionine synthesis in M. jannaschii suggested that methionine synthesis in that organism proceeds from O-phosphohomoserine and that protein-bound persulfide might be the sulfur source, with the sulfur being transferred via a redox reaction (43). However, Dde_3007 and its relatives do not have any conserved cysteines, so it is unlikely to be a persulfide carrier. Overall, we propose that Dde_3007 participates in the reductive transfer of a sulfur group to O-phosphohomoserine to form homocysteine.
Dde_3007 is part of a larger family, variously known as domain of unknown function 39 (DUF39), COG1900, or PF01837 (http://pfam.sanger.ac.uk/family/PF01837). Members of this family are sometimes annotated as IMP dehydrogenase, but according to the Pfam curators, this annotation is spurious. The genomes of some methanogens contain two members of this family, one of which may be an ortholog of Dde_3007 and the other of which is often in proximity to genes that are involved in the synthesis of coenzyme M. For example, in Methanoculleus marisnigri JR1, Memar_0110 is a member of DUF39 and is adjacent to genes encoding cysteate synthase (47) and sulfopyruvate decarboxylase (a fused ComDE) (48). Subsequent steps in coenzyme M synthesis involve the transfer of a sulfide group to sulfoacetaldehyde to form coenzyme M, but the genes involved are not known. We therefore propose that other members of DUF39 are involved in the transfer of a sulfide group to sulfoacetaldehyde to form coenzyme M.
D. alaskensis G20 can grow by coupling the oxidation of choline to the reduction of sulfate (10). Recently, Craciun and Balskus identified a lyase in D. alaskensis G20 (CutC; Dde_3282), which cleaves choline to form trimethylamine and the toxic metabolite acetaldehyde (49). The acetaldehyde is probably further oxidized to acetate, which is coupled to sulfate reduction. In addition, they used comparative genomics to identify a larger, 16-kb gene cluster (termed the choline utilization or cut cluster) containing cutC and a number of other genes predicted to be involved in choline oxidation, including components of a microcompartment thought to be necessary for acetaldehyde sequestration (49). To systematically identify D. alaskensis G20 genes required for choline oxidation, we compared fitness data from the mutant pools grown with either choline or lactate as the carbon source. As illustrated in Fig. 5A, 16 cut cluster genes are required for choline utilization in D. alaskensis G20 including aldehyde dehydrogenases, alcohol dehydrogenases, and several microcompartment shell proteins. Therefore, our results demonstrate genetically that a microcompartment and acetaldehyde detoxification are required for choline oxidation in D. alaskensis G20.
In addition to the cut cluster, we identified other genes important for choline utilization, including Dde_3288:Dde_3291, which are adjacent to and divergently transcribed from the cut cluster (Fig. 5A). The role of Dde_3291, a putative regulator, is described below. We also identified an acetaldehyde:ferredoxin oxidoreductase (Dde_2460), with a molybdenum or tungsten cofactor, which lies outside the cut cluster and shows a choline-specific fitness defect (Fig. 5A). This enzyme may be responsible for acetaldehyde detoxification outside the microcompartment. Alternatively, the D. alaskensis G20 microcompartment might disproportionate acetaldehyde to acetylphosphate and ethanol, as proposed for the ethanolamine utilization microcompartment of Salmonella (50). In this case, the soluble acetaldehyde dehydrogenase would be involved in reoxidizing the ethanol to acetate, which would be coupled to sulfate reduction.
Our mutant fitness data suggested that Dde_3291, a MerR family transcription factor, might be an activator of the choline utilization genes (Fig. 5A). Our tiling microarray data (collected with lactate as the carbon source) confirmed that Dde_3291 is part of an operon (Dde_3288:Dde_3291) that is expressed in the absence of choline, while the rest of the cut gene cluster (Dde_3284:Dde_3264) is weakly expressed during growth with lactate. By comparing sequences upstream of Dde_3284, Dde_3288, Dde_3291, and their homologs in Desulfovibrio salexigens and D. desulfuricans, we identified a palindromic motif, CnTTCCCCnnnnGGGGAAnG, with sites in D. alaskensis G20 upstream of Dde_3288 and Dde_3284. The motif upstream of Dde_3284 is centered at −23 relative to the TSS, which is expected for MerR-type activators that bind between the −10 and −35 boxes (51).
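The palindromic nature of the motif can be checked directly: a DNA palindrome equals its own reverse complement. The following is a minimal sketch, not the comparative-genomics procedure used in the study. The function names and the test sequence are hypothetical, and treating every `n` as "any base" is an assumption about how the consensus is written.

```python
import re

# Consensus reported in the text; 'n' is assumed to mean any base.
MOTIF = "CnTTCCCCnnnnGGGGAAnG"

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_palindromic(motif):
    """A motif is palindromic if it equals its own reverse complement."""
    return reverse_complement(motif) == motif.upper()


def motif_to_regex(motif):
    """Turn the consensus into a regular expression; N matches any base."""
    return re.compile("".join("[ACGT]" if b == "N" else b for b in motif.upper()))


def find_sites(sequence, motif=MOTIF):
    """Return 0-based start positions of exact consensus matches."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence.upper())]


# Hypothetical test sequence (not from the genome): one site flanked by filler.
example = "AAA" + "CATTCCCCACGTGGGGAATG" + "TTT"
sites = find_sites(example)
```

An exact-match scan like this would miss degenerate sites such as the weak hit reported upstream of Dde_3039; a position weight matrix would be the more usual tool for those.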
To test the hypothesis that Dde_3291 regulates the cut cluster, we collected expression data from a Dde_3291 transposon mutant and D. alaskensis G20 wild type after transfer to either a defined lactate-sulfate medium or choline-sulfate medium. By collecting expression data 1 h after transfer, we hoped to observe changes in gene expression without biasing the experiment by the reduced growth of the Dde_3291 mutant strain in choline-sulfate medium. Our results show that the Dde_3291 transposon mutant has greatly reduced expression of the cut cluster genes with choline as a carbon source (Fig. 5B), but not with lactate (Fig. 5C). Therefore, Dde_3291 activates the expression of choline utilization genes (cut cluster) in the presence of choline as a carbon source.
In wild-type D. alaskensis G20 cells with choline, we observed diminishing expression along the length of the putative Dde_3288:Dde_3264 (cut) operon (correlation of the position in the operon versus the log2 ratio, r = 0.93, P < 10⁻⁷). The expression of the downstream genes in the cut cluster was also less sensitive to the mutation in Dde_3291, with Dde_3267:Dde_3264 showing little downregulation in the mutant background with choline (Fig. 5B). We propose that Dde_3291 regulates the initiation of transcripts upstream of Dde_3284 and that nonspecific termination, along with weak transcription from internal promoters, leads to less of an effect on the expression of the far downstream genes.
The expression data also suggested that Dde_3039, a paralog of the choline-trimethylamine lyase cutC, might be regulated by Dde_3291 (Fig. 5B). The expression pattern of Dde_3039 does not seem to be an artifact of cross-hybridization, as the expression effect was just as strong even after removing the data from 28 (out of 125) potentially cross-hybridizing probes. Additionally, we found a weak hit to the Dde_3291 motif (gaacCCcTtCCCcTTAcGGGAgGGTtgc) upstream of Dde_3039. Overall, it seems likely that Dde_3291 directly regulates Dde_3039. However, the function of Dde_3039 remains unclear, as our fitness data show that it is not important for choline utilization (Fig. 5A).
Here, we present a comprehensive transposon mutant library of Desulfovibrio alaskensis G20 as a genetic resource for investigating gene function in sulfate-reducing bacteria. The transposon mutant collection enables targeted investigation of single genes, which we used to confirm the predicted regulons of LysX, PhnF, PerR, and Dde_3000 as well as to update the regulons of Fur and Rex. Additionally, because the transposon mutants were engineered to contain DNA bar codes, pooled mutant fitness assays with the D. alaskensis G20 mutants can be used to quickly generate lists of candidate genes, which can be followed up using the mapped and archived collection. We used this workflow to identify Dde_3007, a novel gene required for methionine biosynthesis, and Dde_3291, a regulator of choline utilization in D. alaskensis G20. Given the ease and scalability of the pooled mutant fitness assay, it is now feasible to assess the mutant fitness for each D. alaskensis G20 gene across hundreds of diverse conditions to globally infer gene function, as we have previously demonstrated in Shewanella oneidensis MR-1 (40). In summary, high-throughput and targeted investigations with the D. alaskensis G20 transposon mutant collection can be used to uncover key genes and pathways in this environmentally and industrially important but poorly studied group of bacteria.
Desulfovibrio alaskensis G20 was a gift of Judy Wall (University of Missouri). The E. coli conjugation donor strain WM3064 was a gift of William Metcalf (University of Illinois). D. alaskensis G20 was typically grown in an anaerobic chamber (Coy Laboratories, Grass Lake, Michigan) with an atmosphere of nitrogen, carbon dioxide, and hydrogen (90:5:5) at 30°C. For the mutant pool experiments, we grew the cultures in Hungate tubes that were filled and capped in the anaerobic chamber and incubated outside the chamber in the dark at 30°C. For culturing D. alaskensis G20 in lactate-sulfate medium, we used two variations of Postgate’s medium C (1): LS4D (52) and MOLS4 (31). LS4D (pH 7.2) contained 60 mM sodium lactate, 50 mM sodium sulfate, 8 mM magnesium chloride, 20 mM ammonium chloride, 2.2 mM potassium chloride (added after autoclaving), 0.6 mM calcium chloride, 30 mM PIPES [piperazine-N,N′-bis(2-ethanesulfonic acid)] buffer, trace minerals, and vitamins (53). For LS4D, we used resazurin as a redox indicator before autoclaving and titanium citrate as a reductant just prior to inoculation. To make the rich medium LS4, we supplemented LS4D with 0.1% (wt/vol) yeast extract. Rich lactate-sulfite medium (LS3) is identical to LS4, except that we reduced the concentration of sodium lactate to 15 mM, omitted the sodium sulfate, and added 10 mM sodium sulfite. MOLS4 (pH 7.2) contains 60 mM sodium lactate, 30 mM sodium sulfate, 8 mM magnesium chloride, 20 mM ammonium chloride, 2 mM potassium chloride, 0.6 mM calcium chloride, 30 mM Tris-HCl buffer (pH 7.4), trace minerals, iron(II) chloride (0.06 mM)-EDTA (0.12 mM) solution, and vitamins. We added 0.1% (wt/vol) yeast extract to MOLS4 to make the rich medium MOYLS4. MOCS4 is the same as MOLS4 except that we replaced the sodium lactate with 30 mM choline chloride and reduced the concentration of sodium sulfate to 15 mM. 
For MOLS4, MOYLS4, and MOCS4, we added hydrogen sulfide to a final concentration of 1 mM as a reductant just prior to inoculation. All media were autoclaved and moved to the anaerobic chamber before cooling. For plates, we used the same medium formulations except that we added agar to a final concentration of 1.5% (wt/vol). We placed agar plates in the anaerobic chamber for 1 day prior to use. For culturing the diaminopimelic acid (DAP) auxotroph WM3064, we supplemented LB with DAP to a final concentration of 300 µM.
We previously published detailed methods that describe the DNA bar code (TagModule) collection (54) and the use of these TagModules to generate DNA-bar-coded transposon mutants in Shewanella oneidensis MR-1 (40) and Zymomonas mobilis ZM4 (41). The same methods were used to generate the D. alaskensis G20 transposon mutant collection. Each TagModule contains two unique 20-bp DNA sequences, termed the UPTAG and DOWNTAG, which are flanked by common PCR priming sequences. We cloned these TagModules into the mini-Tn5 transposon delivery vector pRL27 (55), as previously described (54). We created the D. alaskensis G20 transposon mutant collection by conjugating wild-type D. alaskensis G20 with the E. coli donor strain WM3064 harboring the TagModule-marked pRL27 transposon delivery vectors. With minor modifications, we used a previously described conjugation protocol (26). Briefly, we combined mid-log-phase D. alaskensis G20 and WM3064 in a single Eppendorf tube, pelleted the cells by centrifugation, and resuspended the cell pellet in 20 µl of LS4 medium. This concentrated mixture of cells was conjugated for 16 h at 30°C on a nylon filter (0.2-µm pore size; Supelco) on an LS4 agar plate. Postconjugation, we transferred the nylon filter to 3 ml of LS4 medium, inverted the tubes several times to remove the cells from the filter, incubated the cells for 6 h at 30°C, and plated the cells on LS4 plates supplemented with 400 µg/ml G418. We picked single, G418-resistant colonies into the wells of a 96-well microplate containing 500 µl of LS4 and 800 µg/ml G418 per well. After growth to stationary phase, we added glycerol to a final concentration of 10% (vol/vol) for long-term storage of the transposon mutants at −80°C. For 643 mutants, we replaced LS4 medium with LS3 medium for all transposon mutagenesis steps. 
For each transposon mutant, we mapped the transposon insertion location and identified the TagModule using a two-step arbitrary PCR and sequencing protocol, as previously described (54). See Table S1 in the supplemental material for a list of all primers used in this study. In total, we picked 21,696 colonies for the D. alaskensis G20 collection and mapped the transposon insertion location for 15,477 mutant strains. See Data Set S1 for a complete list of the D. alaskensis G20 transposon collection.
We classified a D. alaskensis G20 protein-coding gene as an expected essential if (i) the gene had an ortholog in Desulfovibrio vulgaris Hildenborough, Desulfovibrio vulgaris strain Miyazaki, and Desulfovibrio desulfuricans ATCC 27774; (ii) no transposon was mapped to the central (5 to 80%) portion of the gene; and (iii) the gene had a significant BLAST hit (>30% identity) in the OGEE database of essential genes (30) or has an ortholog (using unique COGs or TIGRfam) of an essential gene in either E. coli (56) or Bacillus subtilis (57). We classified protein-coding genes as putative Desulfovibrio-specific essentials if the genes met the first two criteria described above and shared an operon with and were adjacent to another Desulfovibrio-specific or expected essential. Additionally, to be classified as a Desulfovibrio-specific essential, the gene had to be at least 300 nucleotides long. We used a gene length cutoff of 300 nucleotides because, given the number of mutants mapped and the length of the genome, we would expect one transposon insertion every 241 nucleotides on average.
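The arithmetic behind the 300-nucleotide cutoff can be sketched as follows. The number of mapped mutants comes from the text. The genome length (about 3.73 Mb) is an assumption, since this section does not state it. The Poisson model of uniformly random insertion is likewise an illustrative simplification that ignores any insertion-site bias; it is not a calculation reported by the authors.

```python
import math

MAPPED_MUTANTS = 15_477       # mapped insertions, from the text
GENOME_LENGTH_NT = 3_730_000  # ASSUMPTION: approximate genome size, not stated here


def expected_spacing(genome_length_nt, n_insertions):
    """Average distance between insertions if they were spread evenly."""
    return genome_length_nt / n_insertions


def prob_no_insertion(target_length_nt, spacing_nt):
    """Poisson probability of zero insertions in a target of the given length.

    ASSUMPTION: insertions land uniformly at random along the genome.
    """
    return math.exp(-target_length_nt / spacing_nt)


spacing = expected_spacing(GENOME_LENGTH_NT, MAPPED_MUTANTS)

# Central (5 to 80%) portion of a gene at the 300-nucleotide cutoff.
central_length = 300 * (0.80 - 0.05)
p_missed_by_chance = prob_no_insertion(central_length, spacing)
```

Under these assumptions, a nonessential gene of exactly 300 nucleotides still has a substantial chance of lacking a central insertion, which is consistent with the additional requirements for orthology and operon context in the classification.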
We performed D. alaskensis G20 tiling microarray (NimbleGen) experiments on mid-exponential-phase cultures grown in LS4D and LS4 media using techniques described previously (34). Briefly, after removing probes with a second-best BLAT hit of 50 or more nucleotides to avoid cross-hybridization, we collected data for over 2 million 60-mer probes that covered both strands of the genome with a 6-nucleotide step size. We computed normalized log levels with a model that takes into account a genomic control and nucleotide content, as described previously (34). After removing the probes with the lowest 1% intensities in the genomic DNA control, we adjusted the normalized expression values so that their median was 0.
We prepared a 5′ RNA-Seq library with mRNA from a mid-log-phase, LS4D culture of D. alaskensis G20, using previously described techniques (34). Briefly, we treated the mRNA with terminator 5′-phosphate-dependent exonuclease (Epicentre) to remove partially degraded transcripts, converted 5′ triphosphates to monophosphates, and ligated an RNA sequencing adaptor. After cDNA synthesis, we enriched for products that contained adaptors on both ends by PCR and purified the library using Ampure DNA XP beads (Beckman). We sequenced 40 nucleotides (Illumina GA IIx) and aligned 18 million reads to the D. alaskensis G20 genome with ELAND (Illumina).
To identify D. alaskensis G20 promoter sequence motifs, we analyzed a preliminary set of 1,172 TSSs that had at least 50 5′ RNA-Seq reads and showed a sharp rise in normalized log2 intensity in the tiling microarray data from LS4D (34, 58). For each preliminary TSS, we extracted the sequence from −40 to +1 on the transcribed strand and searched for motifs using MEME 3.5 with a motif width of 30 to 40 nucleotides and the zero-or-one-occurrence per site (zoops) model (59). We used Patser (60) to score every location in the genome for how well it matched the significant σ70 motif and the σ54 motif from RegPrecise (21).
We considered any location with at least 50 reads in the 5′ RNA-Seq data and with more reads than surrounding locations (up to 25 nucleotides away) as a potential transcription start site (TSS). To classify these 14,844 candidates as genuine TSSs, we considered the number of reads, whether the tiling data showed a sharp rise at that location (34, 58), and the strength of any promoter motif upstream of the TSS. We combined these sources of information with a semisupervised machine learning approach: to generate training data for each data source, we used the other two data sources to label potential TSS locations as likely or unlikely to be genuine TSSs (34). We used these training data to infer a statistical model for each source of information. Each statistical model converts the raw score(s), such as how well the TSS matches a promoter motif or the number of 5′ RNA-Seq reads, to an estimate of the log odds, log [P(Score|TSS)/P(Score|not TSS)], based on how often that score occurs in the likely-TSS or unlikely-TSS training sets. For each tiling experiment, we used two different features—the difference in log intensity between the regions on either side of the putative TSS and the local correlation to a step function (58)—so that we had four tiling features. We built a statistical model for each tiling feature separately and then combined the log odds for these features by finding the best-fitting linear combination (i.e., logistic regression). Then, we added the log odds from 5′ RNA-Seq, tiling, and promoter motifs (i.e., a naive Bayesian classifier). Finally, we chose an arbitrary cutoff (log odds >4) to identify high-confidence TSSs. Above this cutoff, we obtained 1,313 TSSs in the genuine data. When we shuffled the data, by computing tiling features and motif features for randomly selected locations, we obtained just 40 locations above our threshold (log odds >4).
This suggests that the high-confidence TSSs include about 40 false positives, or a false discovery rate of 3% (40/1,313).
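The final scoring step and the false discovery estimate can be illustrated with a short sketch. This is not the authors' implementation: the function names are hypothetical, the natural logarithm is an assumption (the text does not give the base), and the per-source models that produce each log-odds value are taken as given.

```python
import math

CUTOFF = 4.0  # log-odds threshold from the text


def log_odds(p_score_given_tss, p_score_given_not_tss):
    """log [P(Score|TSS) / P(Score|not TSS)]; natural log is an ASSUMPTION."""
    return math.log(p_score_given_tss / p_score_given_not_tss)


def combined_log_odds(rnaseq_lo, tiling_lo, motif_lo):
    """Naive Bayes: evidence treated as independent adds in log-odds space."""
    return rnaseq_lo + tiling_lo + motif_lo


def is_high_confidence(total_log_odds, cutoff=CUTOFF):
    """Apply the strict 'log odds > cutoff' rule."""
    return total_log_odds > cutoff


def estimated_fdr(hits_in_shuffled, hits_in_real):
    """Fraction of real hits expected to be false, judged from shuffled data."""
    return hits_in_shuffled / hits_in_real


# Hypothetical per-source scores for one candidate location.
candidate_total = combined_log_odds(rnaseq_lo=2.5, tiling_lo=1.5, motif_lo=0.5)

# Counts reported in the text: 40 shuffled hits versus 1,313 genuine hits.
fdr = estimated_fdr(40, 1313)
```

Summing log odds assumes the three evidence sources are conditionally independent, which is the simplification that makes the classifier "naive".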
We measured gene expression in wild-type D. alaskensis G20 and 18 different regulatory mutants. See Table S2 in the supplemental material for a list of these mutant strains and the growth conditions used for expression profiling. For each mutant, we verified the correct strain by PCR with a transposon and genome-specific primer pair. The regulatory mutants and wild-type D. alaskensis G20 were typically grown to mid-log phase and centrifuged at 4°C for 10 min at 10,000 × g, and the harvested cells were stored at −80°C. For strain JK05048 (transposon mutant in Dde_3291), we transferred late-log-phase cells growing in MOLS4 to either fresh MOLS4 or MOCS4 medium for 1 h before harvesting cells. For strain JK05162 (transposon mutant in Dde_2665; lysX), we transferred late-log-phase cells growing in MOLS4 to either fresh MOLS4, MOLS4 with 0.3 mM lysine, or MOYLS4 medium and incubated them for 1 h before harvesting cells. As controls for the Dde_3291 and Dde_2665 experiments, we did the same 1-h incubation experiments with wild-type D. alaskensis G20. RNA isolation, cDNA synthesis, labeling, hybridization to NimbleGen microarrays, and data analysis were performed as previously described (61). For each experiment, we set the median of the normalized log2 expression levels to zero.
We designed two pools of D. alaskensis G20 transposon mutants, pool 1 with 4,069 strains and pool 2 with 4,056 strains, such that within each pool, each strain contains a unique TagModule (54). We constructed and assayed two pools in order to maximize the number of unique transposon insertions, as we have more insertion mutants than TagModules. Individual transposon mutants were rearrayed from the glycerol stock microplates to new microplates with fresh LS4 medium supplemented with G418 (800 µg/ml) using a liquid handling robot (Beckman Biomek 3000) housed in the anaerobic chamber. The fresh cultures were grown for 2 days at 30°C, and all of the individual strains were combined using the robot. For each pool, we added glycerol to a final concentration of 10% (vol/vol) and stored multiple 1-ml aliquots at −80°C. During construction of the pools, any D. alaskensis G20 transposon mutant strains with E. coli contamination were excluded. Additionally, some mutants did not grow at all or grew poorly from the original glycerol stocks. For example, some of the mutants selected on lactate-sulfite medium did not grow in the lactate-sulfate medium used to construct the pools. Lastly, some transposon mutants likely have a wrongly assigned TagModule. For these reasons, we do not have fitness data for all of the strains in the original pool designs.
We performed pooled mutant fitness assays as previously described (40, 41). The two pools of mutants were grown separately to mid-log phase in LS4 at 30°C, and samples of each pool culture were collected as a “start” control. The remaining culture was pelleted, washed twice with phosphate buffer, and finally resuspended in the same volume of LS4D or phosphate buffer (for the choline experiment). We inoculated the pools in the selective medium at a starting optical density at 600 nm (OD600) of 0.02 in 10 ml of medium. After growth of the mutant pool reached saturation (4 to 6 population doublings), we collected “condition” samples. Genomic DNA extraction, DNA bar code amplification, and hybridization of the DNA tags to the GenFlex 16K_v2 microarray (Affymetrix) were performed as described previously (40, 62). For some experiments, we hybridized the UPTAGs from pool 1 and the DOWNTAGs from pool 2 to a single microarray because the two tags in the TagModule provide redundant data (54). In this study, we performed pooled fitness assays under the following five conditions: LS4D, LS4D with 0.2% (wt/vol) Casamino Acids, LS4D with 1 µM methionine, MOLS4 without vitamins, and MOCS4 without vitamins. We excluded the vitamins in the latter two experiments because our vitamin solution contained trace amounts of choline chloride.
Data processing, normalization, and calculation of strain and gene fitness were performed as described previously (40). Briefly, we calculated the fitness of a strain in the pool as the log2 ratio of its bar code signal intensity under the condition relative to the start. We averaged the fitness values from relevant strains to calculate gene fitness. If a gene had data from a central insertion (within the central 5 to 80% of the gene), then data from other, edge insertions were not included in the average. In this paper, we report only the averaged gene fitness values. We normalized the fitness values so that the typical gene had a fitness of zero under each condition.
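The fitness calculation described above can be sketched in a few lines. This is an illustrative reconstruction, not the published pipeline (40): the function names and example values are hypothetical, and using the median as the "typical gene" for normalization is an assumption.

```python
import math
from statistics import mean, median


def strain_fitness(condition_signal, start_signal):
    """log2 ratio of bar code signal under the condition relative to the start."""
    return math.log2(condition_signal / start_signal)


def is_central(insertion_fraction, low=0.05, high=0.80):
    """True if the insertion lies within the central 5 to 80% of the gene."""
    return low <= insertion_fraction <= high


def gene_fitness(strains):
    """Average strain fitness for one gene.

    strains: list of (insertion_fraction, fitness) pairs.
    If any central insertion exists, edge insertions are ignored.
    """
    central = [fit for frac, fit in strains if is_central(frac)]
    values = central if central else [fit for _, fit in strains]
    return mean(values)


def normalize(gene_values):
    """Shift values so the typical gene has fitness zero.

    ASSUMPTION: 'typical' is taken to be the median across genes.
    """
    center = median(gene_values.values())
    return {gene: value - center for gene, value in gene_values.items()}


# Hypothetical example: a strain whose signal drops fourfold has fitness -2.
example_fitness = strain_fitness(condition_signal=50, start_signal=200)
```

For instance, `gene_fitness([(0.5, -2.0), (0.9, 0.0)])` uses only the central insertion at 50% of the gene and ignores the edge insertion at 90%.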
We complemented the methionine auxotrophy of a transposon mutant in Dde_3007 (strain JK00771) by introducing a wild-type copy of Dde_3007 on plasmid pMO9075 (63). Our tiling array data suggested that the annotated start codon of Dde_3007 was incorrect, and comparative genomics suggested that the true start codon was at position 2993462. We cloned a copy of Dde_3007 with the revised start codon into pMO9075 using Gibson assembly and verified the clone, pJK2, by sequencing. Plasmids pMO9075 and pJK2 were introduced into wild-type D. alaskensis G20 and JK00771 by electroporation (16) and selected on MOYLS4 plates supplemented with 800 µg/ml spectinomycin.
All fitness data are available in MicrobesOnline (http://microbesonline.org/). The D. alaskensis G20 gene expression data from this study are available in the Gene Expression Omnibus (GEO) under the accession numbers in parentheses: tiling microarray data (GSE39471), 5′ RNA-Seq data (GSE49484), and the regulatory mutant data (GSE49530).
Primers used in this study.
Transposon mutant strains used for expression profiling.
Desulfovibrio alaskensis G20 transposon mutant collection.
List of putative Desulfovibrio alaskensis G20 essential genes.
List of high-confidence Desulfovibrio alaskensis G20 transcription start sites.
This work conducted by ENIGMA was supported by the Office of Science, Office of Biological and Environmental Research, of the U.S. Department of Energy under contract no. DE-AC02-05CH11231. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Citation Kuehl JV, Price MN, Ray J, Wetmore KM, Esquivel Z, Kazakov AE, Nguyen M, Kuehn R, Davis RW, Hazen TC, Arkin AP, Deutschbauer A. 2014. Functional genomics with a comprehensive library of transposon mutants for the sulfate-reducing bacterium Desulfovibrio alaskensis G20. mBio 5(3):e01041-14. doi:10.1128/mBio.01041-14.