Overview of Gene and Protein Features
The finished sequence for D. aromatica
reveals a single circular, closed chromosome of 4,501,104 nucleotides created from 130,636 screened reads, with an average G+C content of 60% and an extremely high level of sequence coverage (average depth of 24 reads/base [see Additional file 3
]). Specific probing for plasmids confirmed that no plasmid structure was present in the sequenced clonal isolate, which supports anaerobic benzene degradation. However, the presence of two tra
clusters (putative conjugal transfer genes; VIMSS582582-582597 and VIMSS582865-582880), as well as plasmid partitioning proteins, indicates that this microbial species is likely to be transformationally competent and thus able to support plasmid DNA structures.
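As a rough consistency check on the coverage figures above, the reported depth, chromosome length and read count together imply an average read length. The sketch below assumes that depth is approximately (number of reads × mean read length) / genome length. The function name is illustrative and is not part of the original analysis.

```python
def implied_mean_read_length(genome_length_nt, n_reads, mean_depth):
    """Mean read length implied by: depth ~= n_reads * read_length / genome_length."""
    return mean_depth * genome_length_nt / n_reads


# Values reported in the text for the finished D. aromatica chromosome.
read_len = implied_mean_read_length(
    genome_length_nt=4_501_104,
    n_reads=130_636,
    mean_depth=24,
)
print(f"implied mean read length: {read_len:.0f} nt")
```

Under this simplified model the implied mean read length is about 827 nt. The model ignores trimming and reads discarded during screening, so the figure is only an order-of-magnitude check.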
The Virtual Institute for Microbial Stress and Survival (VIMSS, http://www.microbesonline.org
) and the Joint Genome Institute http://genome.jgi-psf.org/finished_microbes/decar/decar.home.html
report 4170 and 4204 protein coding genes, respectively [see Additional file 3
]. Cross-database comparisons were done to maximize the probability of capturing candidate orfs for analysis. The majority of proteins are shared between data sets. Variations in N-terminal start sites were noted, both between the JGI and VIMSS datasets and between initial and later annotation runs (approximately 200 N-terminal differences between the two sets of orf predictions were noted for the initial two annotation runs: the Joint Genome Institute's, done at Oak Ridge National Laboratory (ORNL), and VIMSS's).
The most definitive functional classification, TIGRfams, initially defined approximately 10% of the proteins in this genome; as of this writing, 33% of predicted proteins in the D. aromatica
genome are covered by TIGRfams, leaving 2802 genes with no TIGRfam classification [see Additional file 4
]. Many proteins in the current and initial non-covered sets were investigated further using K. Sjölander's HMM building protocols (many of which are available at http://phylogenomics.berkeley.edu
), to supplement TIGRfams. The Clusters of Orthologous Genes (COG) assignments were used for classification in the families of signaling proteins, but specific function predictions for these proteins also required further analyses. The metabolic and signaling pathways are discussed below, and the identity of orthologs within these pathways is based on analysis of phylogenomic profiles of clusters obtained by HMM analysis, with comparison to proteins having experimentally defined function.
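The TIGRfam coverage figure can be checked against the gene counts reported by the two databases. The text does not say which gene total the 2802 unclassified genes refer to, so this sketch evaluates both. The variable and function names are illustrative.

```python
# Figures taken from the text above.
unclassified = 2802                          # genes with no TIGRfam classification
gene_counts = {"VIMSS": 4170, "JGI": 4204}   # protein-coding genes per database


def tigrfam_coverage(total_genes, unclassified_genes):
    """Fraction of genes that carry a TIGRfam classification."""
    return (total_genes - unclassified_genes) / total_genes


coverage = {db: tigrfam_coverage(n, unclassified) for db, n in gene_counts.items()}
for db, frac in coverage.items():
    print(f"{db}: {frac:.1%} of genes covered by TIGRfams")
```

Both totals give a coverage that rounds to 33%, so the stated percentage is consistent with either database's gene count.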
Anaerobic aromatic degradation – absence of known enzymes indicates novel pathways
One of the more striking findings is the absence of known key enzymes for monoaromatic degradation under anaerobic conditions. One of the primary metabolic capabilities of interest for this microbe is anaerobic degradation of benzene. Fumarate addition to toluene via benzylsuccinate synthase (BssABCD) is recognized as the common mechanism for anaerobic degradation by a phylogenomically diverse population of microbes [14
] and has been called "the paradigm of anaerobic hydrocarbon oxidation"[17
]. Benzoyl CoA is likewise considered a central intermediate in anaerobic degradation, and is further catabolized via benzoyl CoA reductase (BcrAB) [17
]. Populated KEGG maps in the IMG and VIMSS databases, based on BLAST analyses, indicate the presence of some of the enzymes previously characterized as belonging to the Bss pathway in D. aromatica
, yet more careful analysis shows the candidate enzymes to be members of a general family, rather than true orthologs of the enzymes in question. The majority of catabolic enzymes of interest for D. aromatica
are not covered by TIGRfams or COGs families. For this reason Flower Power clustering, SCI-PHY subfamily clade analysis, and HMM scoring were used to ascertain the presence or absence of proteins of interest (for a detailed description, see Additional file 1
). The most reliable function predictions for genomically sequenced protein orfs are obtained using the more computationally intensive HMM modelling and scoring utilities. These allow the protein in question to be assessed by phylogenetic alignment to protein families or subfamilies with experimentally known function, providing much more accurate predictions [18
To explore the apparent lack of anaerobic aromatic degradation pathways expected to be present in this genome, all characterized anaerobic aromatic degradation pathways from A. aromaticum
] were defined by HMMs to establish presence or absence of proteins in both the D. aromatica
and Azoarcus BH72 genomes (these three genomes comprise the nearest-neighbor species among currently sequenced genomes [see Additional file 1
]). In A. aromaticum
EbN1, ten major catabolic pathways have been found for anaerobic aromatic degradation, and nine of the ten converge on benzoyl-CoA [21
]. A key catalytic enzyme or subunit for each enzymatic step was used as a seed sequence to recruit proteins from a non-redundant set of Genbank proteins for phylogenetic analysis. Benzylsuccinate synthase, present in A. aromaticum
] as well as Thauera aromatica
], and Geobacter metallireducens
], is not present in either the D. aromatica
or Azoarcus BH72 genomes (see Table ). Benzoyl-CoA reductase, which like benzylsuccinate synthase was previously denoted as "central" to anaerobic catabolism of aromatics, is likewise absent. The set of recruited proteins for both benzylsuccinate synthase and benzoyl-CoA reductase indicates they are not as universally present as has been suggested. D. aromatica
does encode a protein in the pyruvate formate lyase family, but further analysis shows that it is more closely related to the E. coli
homolog of this protein (which is not involved in aromatic catabolism) than to BssA. Anaerobic oxidation of ethylbenzene is carried out by ethylbenzene dehydrogenase (EbdABCD1, 2) in A. aromaticum
. This complex belongs to the membrane bound nitrate reductase (NarDKGHJI) family. In D. aromatica
, this complex of proteins is only present as the enzymatically characterized perchlorate reductase (PcrABCD; [24
]) which utilizes perchlorate, rather than nitrate, as the electron acceptor. EbdABCD proteins in A. aromaticum
(VIMSS814904-814907 and VIMSS816928-816931) occur in operons that include (S)-1-phenylethanol dehydrogenases (Ped; VIMSS 814903 and 816927) [25
], both of which are absent from D. aromatica
, as is the acetophenone carboxylase that catalyzes ATP-dependent carboxylation of the acetophenone produced by Ped.
Anaerobic aromatic degradation enzymes in near-neighbor Aromatoleum aromaticum EbN1.
For all pathways except the ubiquitous phenylacetic acid catabolic cluster, which is involved in the aerobic degradation of phenylalanine, and the PpcAB phenylphosphate carboxylase enzymes involved in phenol degradation via 4-hydroxybenzoate, all key anaerobic aromatic degradation proteins present in A. aromaticum EbN1 are missing from the D. aromatica genome (Table ), and the majority are also not present in Azoarcus BH72. The lack of overlap for genes encoding anaerobic aromatic enzymes between these two species was completely unexpected, as both A. aromaticum EbN1 and D. aromatica are metabolically diverse degraders of aromatic compounds. In general, Azoarcus BH72 appears to share many families of proteins with D. aromatica that are not present in A. aromaticum EbN1 (e.g., signaling proteins, noted below).
Anaerobic degradation of benzene occurs at relatively sluggish reaction rates, indicating that the pathways incumbent in D. aromatica
for aromatic degradation under anaerobic conditions might serve in a detoxification role. Another intriguing possibility is that oxidation is dependent on intracellularly produced oxygen, which is likely to be a rate-limiting step. Alicycliphilus denitrificans
strain BC couples benzene degradation under anoxic conditions with chlorate reduction, utilizing the oxygen produced by chlorite dismutase in conjunction with a monooxygenase and subsequent catechol degradation for benzene catabolism [26
]. A similar mechanism may account for anaerobic benzene oxidation coupled to perchlorate and chlorate reduction in D. aromatica
. However, anaerobic benzene degradation coupled with nitrate reduction is also utilized by this organism, and remains enigmatic [5
The extremely high divergence of encoded protein families in this functional grouping differs from the general population of central metabolic and housekeeping genes: Azoarcus BH72, A. aromaticum EbN1 and D. aromatica are evolutionarily near-neighbors within currently sequenced genomes, as defined both by the high level of protein similarity within housekeeping genes (defined by the COG J family of proteins), and by 16S rRNA sequence. Azoarcus BH72 and A. aromaticum EbN1 display the highest percent similarity between housekeeping proteins within this triad, with 138 of the 156 COG J proteins in A. aromaticum EbN1 displaying highest similarity to their BH72 counterparts. On average these two genomes display 83.5% amino acid identity across shared COG J proteins. D. aromatica is an outlier in the triad, with higher similarity to Azoarcus BH72 than to A. aromaticum EbN1 (43 of D. aromatica's 169 COG J proteins are most similar to A. aromaticum EbN1 orthologs, with an average 71% identity, and 67 are most similar to Azoarcus BH72 orthologs, with an average 72% identity).
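The COG J best-hit counts above are easier to compare as fractions. The following sketch only restates the reported counts. The variable names are illustrative, and the text does not describe the D. aromatica proteins whose best hit is in neither near-neighbor genome.

```python
def fraction(part, whole):
    """Return part/whole as a float."""
    return part / whole


# A. aromaticum EbN1 COG J proteins whose closest match is in Azoarcus BH72.
ebn1_best_hit_bh72 = fraction(138, 156)

# D. aromatica COG J proteins (169 in total) by closest near-neighbor genome.
darom_best_hit_ebn1 = fraction(43, 169)
darom_best_hit_bh72 = fraction(67, 169)
darom_neither = 169 - 43 - 67  # not attributed to either neighbor in the text

print(f"EbN1 -> BH72:         {ebn1_best_hit_bh72:.1%}")
print(f"D. aromatica -> EbN1: {darom_best_hit_ebn1:.1%}")
print(f"D. aromatica -> BH72: {darom_best_hit_bh72:.1%}")
print(f"D. aromatica, neither neighbor: {darom_neither} proteins")
```

About 88% of the EbN1 COG J proteins have their closest match in BH72, whereas the two near-neighbors together account for the closest match of only about 65% of the D. aromatica COG J proteins.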
Comparative genomics has previously established that large amounts of DNA present in one species can be absent even from a different strain within the same species [27
]. In addition, the underestimation of the diversity of aromatic catabolic pathways (both aerobic and anaerobic) has been noted previously [28
], and a high level of enzymatic diversity has been seen for pathways that have the same starting and end products, including anaerobic benzoate oxidation [29
Aerobic aromatic degradation
D. aromatica encodes several aerobic pathways for aromatic degradation, including six groups of oxygenase clusters that each share a high degree of sequence similarity to the phenylpropionate and phenol degradation (Hpp and Mhp) pathways in Comamonas
]. The mhp
genes of E. coli
are involved in catechol and protocatechuate pathways for aromatic degradation via hydroxylation, oxidation, and subsequent ring cleavage of the dioxygenated species. Only one of the clusters in D. aromatica
encodes an mhpA
-like gene; it begins with VIMSS584143 MhpC, and is composed of orthologs of MhpABCDEF&R, and is in the same overall order and orientation as the Comamonas
cluster as well as the E. coli mhp
gene families [32
] (see Fig. , cluster 3). These pathways are also phylogenomically related to the biphenyl/polychlorinated biphenyl (Bph) degradation pathways in Pseudomonad
]. For Comamonas testosteroni
, this pathway is thought to be associated with lignin degradation [31
]. Hydroxyphenyl propionate (HPP), an alkanoic acid of phenol, is the substrate for Mhp, and is also produced by animals in the digestive breakdown of polyphenols found in seed components [33
]. Each gene cluster appears to represent a multi-component pathway, and is made up of five or more of various combinations of dioxygenase, hydroxylase, aldolase, dehydrogenase, hydratase, decarboxylase and thioesterase enzymes.
Figure 1 Aerobic degradation of aromatic compounds: multiple Mhp-like dioxygenase clusters. Each of the six mhp-like gene clusters in the D. aromatica genome is depicted. Recent gene duplications between individual proteins are shown by a purple connector between ...
The single predicted MhpA protein in D. aromatica (VIMSS584155), which is predicted to support an initial hydroxylation of a substituted phenol substrate, shares 64.4% identity with Rhodococcus OhpB 3-(2-hydroxyphenyl) propionate monooxygenase (GI:8926385) vs. 26.4% with the Comamonas testosteroni protein (GI:5689247), yet the remainder of the ohp genes in the Rhodococcus ohp clade do not share synteny with the D. aromatica mhp gene cluster.
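Percent identity values such as those quoted above are computed from a pairwise alignment. The text does not state which denominator was used, so the sketch below assumes one common convention: identical residues divided by aligned columns in which neither sequence has a gap. The toy sequences are hypothetical and are not real MhpA or OhpB sequence.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment, for illustration only.
toy_query = "MKT-AYLLG"
toy_ref = "MKSQAYL-G"
print(f"{percent_identity(toy_query, toy_ref):.1f}% identity")
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence length) give different values for the same alignment, so identities from different sources are only comparable when the definition matches.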
Other aromatic oxygenases
Two chromosomally adjacent monooxygenase clusters, syntenic to genes found in Burkholderia and Ralstonia spp, indicate that D. aromatica might have broad substrate hydroxylases that support the degradation of toluene, vinyl chlorides, and TCE (Fig. and Table ), and are thus candidates for benzene-activating enzymes in the presence of oxygen.
Figure 2 Catabolic oxygenases of aromatic compounds: Synteny between D. aromatica, P. mendocina, Burkholderia and R. eutropha. Orthologous gene clusters for P. mendocina, R. eutropha JMP134, Burkholderia JS150 and D. aromatica are shown. D. aromatica possesses ...
Aromatic degradation in D. aromatica: Mono- and Di-oxygenases.
One monooxygenase gene cluster, composed of VIMSS581514 to 581519 ('tbc2 homologs,' Fig. ), is orthologous to the tbuA1UBVA2C
gene families (from P. stutzeri, R. pickettii
, and Burkholderia
JS150). This gene cluster includes a transport protein that is orthologous to TbuX/TodX/XylN (VIMSS581520). Specificity for the initial monooxygenase is not established, but phylogenetic analysis places VIMSS581514 monooxygenase with near-neighbors TbhA [34
], reported as a toluene and aliphatic hydrocarbon monooxygenase (76.5% sequence identity), and BmoA [35
], a benzene monooxygenase of low regiospecificity (79.6% sequence identity). The high level of similarity to the D. aromatica
protein is notable. The region is also highly syntenic with, and homologous to, the tmo
AECDBF (AY552601) gene cluster responsible for P. mendocina
's ability to utilize toluene as a sole carbon and energy source [36
Just downstream on the chromosome is a phc/dmp/phh/phe/aph
-like cluster of genes, composed of the genes VIMSS812947 and VIMSS 581535 to 581540 ('tbc1 homologs,' Fig. ). Overall, chromosomal organization is somewhat different for D. aromatica
as compared to Ralstonia
. D. aromatica
has a fourteen gene insert that encodes members of the mhp
-like family of aromatic oxygenases between the tandem tbc 1 and 2-like oxygenase clusters (see Table ), with an inversion of the second region compared to R. eutropha
. Clade analysis indicates a broad substrate phenol degradation pathway in this cluster, with high sequence identity to the TOM gene cluster of Bradyrhizobium
, which has the ability to oxidize dichloroethylene, vinyl chlorides, and TCE [37
]. The VIMSS581522 response regulator gene that occurs between the two identified monooxygenase gene clusters shares 50.3% identity with the Thauera aromatica tutB
gene and 48.2% to the Pseudomonas
sp. Y2 styrene response regulator (occupying the same clade in phylogenetic analysis). VIMSS581522 is likely to be involved in the chemotactic response in conjunction with VIMSS581521 (histidine kinase) and VIMSS581523 (methyl accepting chemotaxis protein), which would confer the ability to display a chemotactic response to aromatic compounds.
Overall, several mono- and di-oxygenases were found in the genome, indicating D. aromatica has diverse abilities in the aerobic oxidation of heterocyclic compounds.
There are several gene clusters indicative of benzoate transport and catabolism. All recognized pathways are aerobic. The benzoate dioxygenase cluster BenABCDR is encoded in VIMSS582483-582487, and is very similar to (and clades with) the xylene degradation (xylXYZ) cluster of Pseudomonas.
There is also an hcaA oxygenase gene cluster, embedded in one of the mhp clusters (see cluster 5, Fig. ). Specificity of the large subunit of the dioxygenase (VIMSS582049) appears to be most likely for a bicyclic aromatic compound, as it shows highest identity to dibenzothiophene and naphthalene dioxygenases.
Cellular interactions with community/environment – secretion
Fifteen transport clusters include a TolC-like outer membrane component, and recent gene family expansion is noted within several families of ABC transporters for this genome. TolC was originally identified in E. coli
as the channel that exports hemolysin [43
], and hemolysin-like proteins are encoded in this genome. Two groups of ABC transporters occur as a cluster of five transport genes; these five-component transporters have been implicated in the uptake of external macromolecules [44
The presence of putative lytic factors, lipases, proteases, antimicrobials, invasins, hemolysins, RTXs and colicins near potential type I transport systems indicates that these might be effector molecules used by D. aromatica
for interactions with host cells (e.g., for cell wall remodeling). Iron acquisition is likely to be supported by a putative FeoAB protein cluster (VIMSS583997, 583998), as well as several siderophore-like receptors and a putative FhuE protein (outer membrane receptor for ferric iron uptake; VIMSS583312). Other effector-type proteins, likely to be involved in cell/host interactions (and which in some species have a role in pathogenicity [45
]), are present in this genome. Adhesins, haemagglutinins, and oxidative stress neutralizers are relatively abundant in D. aromatica
. A number of transporters occur near the six putative soluble lytic murein transglycosylases, indicating possible cell wall remodeling capabilities for host colonization in conjunction with the potential effector molecules noted above. Homologs of these transporters were shown to support invasin-type functions in other microbes [45
]. Interaction with a host is further implicated by: VIMSS581582, encoding a potential cell wall-associated hydrolase, VIMSS581622, encoding a predicted ATPase, and VIMSS3337824/formerly 581623, encoding a putative membrane-bound lytic transglycosylase.
Eleven tandem copies of a 672 nucleotide insert comprise a region of the chromosome that challenged the correct assembly of the genome, and finishing this region was the final step for the sequencing phase of this project (see Methods). Unexpectedly, analysis of this region revealed a potential open reading frame encoding a very large protein that has been variously predicted at 4854, 2519 or 2491 amino acids in size during sequential automated protein prediction analyses (VIMSS3337779/formerly 582095). This putative protein, even in its smallest configuration, contains a hemolysin-type calcium-binding region, a cadherin-like domain, and several RTX domains, which have been associated with adhesion and virulence. Internal repeats of up to 100 residues with multiple copies have also been found in proteins from Vibrio, Colwellia, Bradyrhizobium, and Shewanella spp. (termed "VCBS" proteins as defined by TIGRfam1965).
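The size of the tandem repeat array can be compared with the predicted protein lengths by simple arithmetic. The sketch below assumes, purely for illustration, that the whole array is translated in a single reading frame inside the predicted orf, which the text does not state. The constant names are illustrative.

```python
REPEAT_NT = 672   # length of one repeat unit (from the text)
N_COPIES = 11     # number of tandem copies (from the text)
predicted_sizes_aa = [4854, 2519, 2491]  # alternative orf predictions

# A 672 nt unit is a whole number of codons, so tandem copies stay in frame.
assert REPEAT_NT % 3 == 0

repeat_aa = REPEAT_NT // 3                 # residues encoded per repeat unit
repeat_region_nt = REPEAT_NT * N_COPIES    # total length of the array
repeat_region_aa = repeat_aa * N_COPIES    # residues if fully translated

# Upper bound on the share of each predicted protein encoded by the array.
share = {size: repeat_region_aa / size for size in predicted_sizes_aa}

print(f"repeat unit: {repeat_aa} aa; array: {repeat_region_nt} nt, {repeat_region_aa} aa")
for size, frac in share.items():
    print(f"{size} aa prediction: repeats could cover up to {frac:.0%}")
```

Each repeat unit corresponds to 224 codons, and eleven copies span 7392 nt. Under the stated assumption, this would account for most of the two smaller predicted proteins and about half of the largest.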
Other potential effector proteins include: three hemolysin-like proteins adjacent to type I transporters, eight proteins with a predicted hemolysin-related function, including VIMSS583067, a hemolysin activation/secretion protein, VIMSS580979, hemolysin A, VIMSS583372, phospholipase/hemolysin, VIMSS581868, a homolog of hemolysin III, predicted by TIGRfam1065 to have cytolytic capability, VIMSS582079, a transport/hemolysin, and VIMSS581408, a general hemolysin. Five predicted proteins have possible LysM/invasin domains, including: VIMSS580547, 581221, 581781, 582766, and 583769. One gene, VIMSS583068, encodes a putative 2079 amino acid filamentous haemagglutinin, as well as a hasA-like domain, making it a candidate for hasA-like function (hasA is a hemophore that captures heme for iron acquisition [46
Type II secretion
Besides the constitutive Sec and Tat pathways, D. aromatica
has several candidates for dedicated export secretons of unknown function, with 3–4 putative orthologs of PulDEFG interspersed with a lytic transglycosylase and a hemolysin (VIMSS582071-582085). The region from VIMSS581889 to VIMSS581897 includes pul
DEFG type subunits and an exe
A ATPase like protein. It is bracketed by signaling components comprised of a histidine kinase, adenylate cyclase, and a protein bearing similarity to the nitrogen response regulator gln
G (VIMSS581898), which has been shown to be involved in NH3
assimilation in other species [47
In addition, there is a nine-gene cluster that encodes several proteins related to toluene resistance (VIMSS581899 to 581906).
A pilus-like gene cluster (which can also be classified as type IV secretion) occurs in VIMSS580547-580553, encoding a putative lytic transglycosylase, ABC permease, cation transporter, pilin peptidase, pilin ATPase and PulF-type protein. This assembly resembles other pilin assemblies associated with attachment to a substrate, such as the pilus structure responsible for chitin/host colonization in Vibrio cholerae
Another large pilus-like cluster (VIMSS584160-584173) occurs in close proximity to the mhpCEFDBAR oxygenase genes (see, e.g., VIMSS584157, mhpR).
Type III secretion
D. aromatica has been shown to be chemotactic under various circumstances. The flagellar proteins (FliAEFGHIJKLMNOPQR, FlaABCDEFGHIJK) are followed by an additional cluster of 15 chemotaxis/signal transduction genes (VIMSS580462-580476), and homologs of FlhC and D regulatory elements required for the expression of flagellar proteins (VIMSS582640 and 582641) [49
], identified by phylogenetic clustering, are also present. Since D. aromatica
has a flagellum and displays chemotactic behavior, it is likely that the flagellar gene cluster is solely related to locomotion, though type III secretion systems can also encode dedicated protein translocation machineries that deliver bacterial pathogenicity proteins directly to the cytosol of eukaryotic host cells [50
Type IV secretion
There are two copies of a twenty-one gene cluster that includes ten putative conjugal transfer (Tra) sex-pilus type genes in the D. aromatica
genome (VIMSS582582-582601 and VIMSS582864-582884), indicating a type IV secretion structure that is related to non-pathogenic cell-cell interactions [51
Type VI secretion
A large cluster of transport proteins that is related to the virulence-associated genetic locus HSI-1 of Pseudomonas aeruginosa
and the VAS genes of V. cholerae
] includes homologs of hcp1, IcmF and clpV (as VIMSS583005, 582995 and 583009, respectively, in D. aromatica
[see Additional file 6
]). This IcmF-associated (IAHP) cluster has been associated with mediation of host interactions, via export of effector proteins that lack signal sequences [53
]. Further evidence for type VI secretion is found in the presence of three proteins containing a Vgr secretion motif modeled by TIGRfam3361, which is found only in genomes having type VI secretory apparatus. Though most bacteria that contain IcmF clusters are pathogenic agents that associate with eukaryotic cell hosts [54
], it has been reported that the host interactions supported by this cluster are not restricted to pathogens [55
The type IV pili systems might be involved in biofilm development, as interactions with biofilm surfaces are affected by force-generating motility structures, including type IV pili and flagella [56
]. Quorum sensing is a deciding input for biofilm formation, and the presence of an exopolysaccharide synthetic cluster lends further support for biofilm formation. Further, derivatives of nitrous oxide, which is an evident substrate for D. aromatica
, are a key signal for biofilm formation vs cell dispersion in the microbe P. aeruginosa
Cellular interactions with community – quorum sensing
Quorum sensing uses specific membrane bound receptors to detect autoinducers released into the environment. It is involved in both intra- and inter-species density detection [58
]. Cell density has been shown to regulate a number of cellular responses, including bioluminescence, swarming, expression of virulence factors, secretion, and motility (as reviewed in Withers et al. 2001 [60]).
D. aromatica encodes six histidine kinase receptor proteins that are similar to the quorum sensing protein QseC of E. coli
(VIMSS580745, 582451, 582897, 583274, 3337577 (formerly 583538), and 583893), five of which co-occur on the chromosome with homologs of the CheY like QseB regulator, and two of which appear to be the product of a recent duplication event (VIMSS583893 & 3337577). Of the six QseC homologs, phylogenetic analysis indicates VIMSS582451 is most similar to QseC from E. coli
, where the QseBC complex regulates motility via the FlhCD master flagellar regulators (VIMSS582640 and 582641). The presence of several qseC/B
gene pairs indicates the possibility of specific responses that are dependent on different sensing strategies. In other species, expression of ABC exporters is regulated by quorum sensing systems [46
]; gene family expansion is indicated in the ABC export gene pool as well as the qseC/B sensors in D. aromatica
N-acyl-homoserine lactone is the autoinducer typical for gram negative bacteria [61
], yet D. aromatica
lacks any recognizable AHL synthesis genes. Ralstonia
Betaproteobacteria likewise encode several proteins in the qse
C gene family and display a diversity of candidate cell density signaling compounds other than AHL [62
]. The utility of having a diverse array of quorum sensing proteins remains to be determined, but appears likely to be associated with a complex, and possibly symbiotic, lifestyle for D. aromatica
closely reflects several metabolic pathways of R. capsulatus
, which is present in the rhizosphere, and its assimilatory nitrate/nitrite reductase cluster is highly similar to the R. capsulatus
]. Encoded nitrate response elements also indicate a possible plant association for this microbe, as nitrate can act as a terminal electron acceptor in the oxygen-limited rhizosphere. Alternatively, nitric oxide (NO) reduction can indicate the ability to respond to anti-microbial NO production by a host (used by the host to mitigate infection [73
]). Several gene families are present that indicate interactions with a eukaryotic host species, including response elements that potentially neutralize host defense molecules, in particular nitric oxide and other nitrogenous species.
Nitrate is imported into the cytosol by NasDEF in Klebsiella pneumoniae
] and expression of nitrate and nitrite reductases is regulated by the nasT protein in Azotobacter vinelandii
]. A homologous set of these genes is encoded by the cluster VIMSS580377-580380 (NasDEFT), and a homolog of nar
K is immediately downstream at VIMSS580384, and is likely involved in nitrite extrusion. Upstream, a putative nas
BDC cluster (assimilatory nitrate and nitrite reduction) is encoded near the nar
XL-like nitrate response element. VIMSS580393 encodes a nitrate reductase that is homologous to the NasA cytosolic nitrate reductase of Klebsiella pneumoniae
]. Community studies have correlated the presence of NasA-encoding bacteria with the ability to use nitrate as the sole source of nitrogen [77
]. The large and small subunits of nitrite reductase (VIMSS580391 nir
B and VIMSS580390 nir
D) are immediately adjacent to a transporter with a putative nitrite transport function (VIMSS580389 NirC-like protein). The NirB orf is also highly homologous to both
NasB (nitrite reductase) and NasC (NADH reductase which passes electrons to NasA) of Klebsiella pneumoniae
. HMMs created from alignments seeded by the NasB and NasC genes scored at 3.2e-193
, respectively, to the VIMSS580391 NirB protein. D. aromatica
is similar to Methylococcus capsulatus, Ralstonia solanacearum, Polaromonas
, and Rhodoferax ferrireducens
B and nir
D gene clusters. However, the presence of the putative transporter nir
C (VIMSS580389) shares unique similarity to the E. coli
and Salmonella nir
Putative periplasmic, dissimilatory nitrate reduction, which is a candidate for denitrification capability [78
], is encoded by the nap
DABC genes (VIMSS 3337807/581796-581799). A probable cytochrome c', implicated in nitric oxide binding as protection against potentially toxic excess NO generated during nitrite reduction [79
], is encoded by VIMSS582015. Although most denitrifiers are free living, plant-associated denitrifiers do exist [80
]. There is no dissimilatory nitrate reductive complex nar
GHIJ, but rather, NarG and NarH-like proteins are found in the evolutionarily-related perchlorate reductase alpha and beta subunits [24
]. These proteins are present in the pcr
cluster, VIMSS582649-582652 and VIMSS584327, as previously reported for Dechloromonas
Ammonia incorporation appears to be metabolically feasible via a putative glu-ammonia ligase (VIMSS581081), an enzyme that incorporates free ammonia into the cell via ligation to a glutamic acid. An ammonium transporter and cognate regulator are likely encoded in the Amt and GlnK-like proteins VIMSS581101 and 581102.
Urea catabolism as a further source of nitrogen is suggested by two different urea degradation enzyme clusters. The first co-occurs with a urea ABC-transport system, just upstream of a putative nickel-dependent urea amidohydrolase (urease) enzyme cluster (VIMSS583666, 583671–583674, and VIMSS583677-583683; see Table ). The second pathway is suggested by a cluster of urea carboxylase/allophanate hydrolase enzymes (VIMSS581083-581085, described by TIGRfams 1891, 2712, 2713, 3424 and 3425), which comprise four proteins involved in urea degradation to ammonia and carbon dioxide in other species, as well as an amidohydrolase [82
Putative nitrogen fixation gene cluster in D. aromatica
Nitric oxide (NO) reductase
The chromosomal region around D. aromatica's two nosZ homologs is notably different from near-neighbors A. aromaticum EbN1 and Ralstonia solanacearum which encode a nosRZDFYL cluster. D. aromatica's nosRZDFYL operon lacks the nosRFYL genes, and displays other notable differences with most nitrate reducing microbes. In D. aromatica, two identical nosZ reductase-like genes (annotated as nosZ1 and nosZ2, VIMSS583543 and VIMSS583547) are adjacent to two cytochrome c553s, a ferredoxin, and a transport accessory protein, and are uniquely embedded within a histidine kinase/response regulator cluster and include nosD and a napGH-like pair that potentially couples quinone oxidation to cytochrome c reduction. This indicates the NO response might be involved in cell signaling and as a possible general detoxification mechanism for nitric oxide.
The epsilonproteobacterium Wolinella succinogenes
is quite similar to D. aromatica
for nitrous oxide reductase genes (both have two nosZ
genes, a nosD
gene and a napGH
pair in the same order and orientation [83
]), but the W. succinogenes
genome lacks the embedded signaling protein cluster. Further, nitric oxide reductase homologs NorDQEBC (VIMSS582097, 582100–582103), along with the cytochrome c' protein (VIMSS582015), which has been shown to bind nitric oxide (NO) prior to its reduction [79
], are all present, and potentially act in detoxification roles. It has been shown that formation of anaerobic biofilms of P. aeruginosa
(which cause chronic lung infections in cystic fibrosis) require NO reductase when quorum has been reached [84
], so a role in signaling and complex cell behavior is possible.
W. succinogenes shares other genome features with D. aromatica
. It encodes only 2042 orfs, yet has a large number of signaling proteins, histidine kinases, and GGDEF proteins relative to its genome size. It also encodes nif
genes, several genes similar to virulence factors, and similarity in the nitrous oxide enzyme cluster noted above. W. succinogenes
is evolutionarily related to two pathogenic species (Helicobacter pylori
and Campylobacter jejuni
), and displays eukaryotic host interactions, yet is not known to be pathogenic [85
]. The distinction between effector molecules causing a pathogenic interaction and a symbiotic one is unclear.
Nitrogen fixation capability in D. aromatica
is indicated by a complex of nif
-like genes (see Table ), that include putative nitrogenase alpha (NifD, VIMSS583693) and beta (NifK, VIMSS583694) subunits of the molybdenum-iron protein, an ATP-binding iron-sulfur protein (NifH, VIMSS583692), and the regulatory protein NifL (VIMSS583623), that share significant sequence similarity and synteny to the free-living soil microbe Azotobacter vinelandii
. D. aromatica
further encodes a complex that is likely to transport electrons to the nitrogenase, by using a six subunit rnf
ABCDGE-like cluster (VIMSS583616-583619, 583621 and 583622) that is phylogenomically related to the Rhodobacter capsulatus
complex used for nitrogen fixation [86
]. There is a second rnf
-like NADH oxidoreductase complex composed of VIMSS583911-583916, of unknown involvement (see Fig. ). A. aromaticum
EbN1 and Azoarcus
BH72 each encode two rnf
-like clusters as well.
Embedded in the putative nitrogen fixation cluster are two gene families involved in urea metabolism (Table ). These include the urea transport proteins (UrtABCDE) and the urea hydrolase enzyme family (Ure protein family).
Hydrogenases associated with nitrogen fixation
Uptake hydrogenase is involved in the nitrogen fixation cycle in root nodule symbionts, where it is thought to increase efficiency via oxidation of the co-produced hydrogen (H2) [87
]. D. aromatica
encodes a cluster of 13 predicted orfs (Hydrogenase-1 cluster, VIMSS581358-581370; Table ) that includes a hydrogenase cluster syntenic to the hox
KGZMLOQR(T)V genes found in Azotobacter vinelandii
, which reversibly oxidize H2
in that organism [88
]. This cluster is followed by a second hydrogenase (Hydrogenase-2 cluster, VIMSS581373-581383). The hydrogenase assembly proteins, hyp
ABF and CDE are included (VIMSS581368-581370 and 581380-581381, and VIMSS3337851 (formerly 581382)) as well as proteins related to the hydrogen uptake (hup
) genes of various rhizobial microbes [87
]. The second region, with the hyp
-like clusters, lacks overall synteny to any one genome currently sequenced. It does, however, display regions of genes that share synteny with Rhodoferax ferrireducens
, which shows the highest similarity across the cluster in terms of both synteny and protein identity.
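The notion of synteny used in these comparisons (genes occurring in the same order and orientation in two genomes) can be illustrated with a minimal sketch. It assumes each cluster is supplied as a list of (gene family, strand) pairs; the function names and the toy gene labels are hypothetical and do not reproduce the actual gene order of any genome discussed here.

```python
def flip(cluster):
    """Return the cluster as it would read from the opposite DNA strand."""
    return [(name, "-" if strand == "+" else "+")
            for name, strand in reversed(cluster)]

def same_order_and_orientation(cluster_a, cluster_b):
    """True if both clusters list the same genes in the same relative
    order and orientation, regardless of which strand was read."""
    return cluster_a == cluster_b or cluster_a == flip(cluster_b)

def longest_shared_block(cluster_a, cluster_b):
    """Longest run of consecutive genes present in both clusters,
    checking cluster_b on either strand."""
    best = []
    for candidate in (cluster_b, flip(cluster_b)):
        for i in range(len(cluster_a)):
            for j in range(len(candidate)):
                k = 0
                while (i + k < len(cluster_a) and j + k < len(candidate)
                       and cluster_a[i + k] == candidate[j + k]):
                    k += 1
                if k > len(best):
                    best = cluster_a[i:i + k]
    return best

# Toy clusters with made-up labels (illustrative assumption only)
genome_x = [("K", "+"), ("G", "+"), ("Z", "+"), ("M", "+"), ("L", "+")]
genome_y = [("L", "-"), ("M", "-"), ("Z", "-"), ("G", "-"), ("K", "-")]
genome_z = [("K", "+"), ("G", "+"), ("Q", "-"), ("M", "+"), ("L", "+")]

fully_syntenic = same_order_and_orientation(genome_x, genome_y)
partial_block = longest_shared_block(genome_x, genome_z)
```

In this toy case, genome_y is the same cluster read from the opposite strand, so it counts as fully syntenic, whereas genome_z shares only short blocks, which parallels the distinction drawn above between overall synteny and regional synteny.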
Hydrogenase clusters associated with nitrogen fixation.
VIMSS581384 encodes a homolog of the HoxA hydrogenase transcriptional regulator, which has been shown to be expressed only during symbiosis in some species [89
]. Regulation is indicated by homologs of NtrX (VIMSS581123) and NtrY (VIMSS581124); the NtrXY pathway comprises a two-component signaling system involved in the regulation of nitrogen fixation in Azorhizobium caulinodans.
Carbon Fixation via the Calvin-Benson-Bassham cycle
Genes indicative of carbon fixation via the Calvin cycle are present in the D. aromatica
genome. These include ribulose 1,5-bisphosphate carboxylase (RuBisCo, VIMSS581681), phosphoribulokinase (cbbP/PrkB, VIMSS581690), and a fructose bisphosphate aldolase (fba, VIMSS581693) of the Calvin cycle subtype. The RuBisCo cbbM
gene is of the fairly rare type II form. D. aromatica
CbbM displays a surprisingly high 77% amino acid identity to CbbM found in the deep-sea tube worm Riftia pachyptila
]. In a recent study of aquatic sediments, Rhodoferax fermentans, Rhodospirillum fulvum
and R. rubrum
were also found to possess the cbbM type II isoform of RuBisCo [92
]; this sub-type is shared by only a few microbial species.
Further putative Cbb proteins are encoded by VIMSS581680 and 581688, candidates for CbbR (regulator for the cbb operon) and CbbY (found downstream of RuBisCo in R. sphaeroides).
The presence of the cbbM gene suggests the ability to carry out the energetically costly fixation of CO2, though such functionality has yet to be observed, and carbon dioxide fixation capability has been found in only a few members of the microbial community.
There is a potential glycolate salvage pathway, indicated by the presence of two isoforms of phosphoglycolate phosphatase (gph, VIMSS583850 and 581830). In other organisms, phosphoglycolate results from the oxygenase activity of RuBisCo in the Calvin cycle when concentrations of carbon dioxide are low relative to oxygen. In Ralstonia (Alcaligenes) eutropha and Rhodobacter sphaeroides, the gph gene (cbbZ) is located in an operon along with other Calvin cycle enzymes, including RuBisCo. In D. aromatica, the gph candidates (VIMSS583850 and 581830) are located away from the other cbb genes on the chromosome; however, VIMSS581830 is adjacent to a homolog of ribulose-phosphate 3-epimerase (VIMSS581829, rpe).
The ccoSNOQP gene cluster codes for a cbb-type cytochrome oxidase that functions as the terminal electron donor to O2
in the aerobic respiration of Rhodobacter capsulatus
]. These genes are present in a cluster as VIMSS580484-580486 and VIMSS584273-584274; note that these genes are present in a large number of Betaproteobacteria.
Other carbon fixation pathways, such as the reverse TCA cycle and the Wood-Ljungdahl pathway, are missing critical enzymes in this genome and are therefore not present as such.
Sulfate and thiosulfate transport appear to be encoded in the gene cluster composed of an OmpA type protein (VIMSS581631) followed by orthologs of a sulfate/thiosulfate specific binding protein Sbp (VIMSS581632), a CysU or T sulfate/thiosulfate transport system permease T protein (VIMSS581633), a CysW ABC-type sulfate transport system permease component (VIMSS581634), and a CysA ATP-binding component of sulfate permease (VIMSS581635).
In addition, candidates for the transcriptional regulator of sulfur assimilation from sulfate are present and include: CysB, CysH, and CysI (VIMSS582364, 582360 and 582362, respectively).
A probable sulfur oxidation enzyme cluster is present and contains homologs of SoxFRCDYZAXB [95
], including a putative SoxCD sulfur dehydrogenase, SoxF sulfide dehydrogenase, and SoxB sulfate thiohydrolase; the cluster is predicted to support thiosulfate oxidation to sulfate (see Fig. ). Functional predictions are taken from Friedrich et al. [95
] [see Additional file 7
]. A syntenic sox
gene cluster is also found in Anaeromyxobacter dehalogenans
(although it lacks sox
FR) and Ralstonia eutropha
, but not in A. aromaticum
EbN1. Thiosulfate oxidation, however, has not been reported under laboratory conditions tested thus far, and experimental support for this physiological capability awaits further investigation.
Figure 5 Sulfur oxidation (thiosulfate to sulfate) candidates in R. eutropha, R. palustris, and D. aromatica. Proposed model for this periplasmic complex is as follows: SoxXA oxidatively links thiosulfate to SoxY; SoxB, potential sulfate thiohydrolase, interacts ...
Conversely, the cytoplasmic SorAB complex [96
] is not present in D. aromatica
or A. aromaticum
EbN1, although it is found in several other Betaproteobacteria, including R. metallidurans, R. eutropha
, R. solanacearum
, C. violaceum
, and B. japonicum.
Gene Family Expansion
To identify candidates for recent gene duplication events, extensive phylogenomic profile analyses were conducted for all sets of paralogs in the genome. FlowerPower recruitment and clustering against the non-redundant GenBank protein set was performed, and the resulting alignments were analyzed using the tree-building SCI-PHY or Belvu-based neighbor-joining utilities. Where two or more D. aromatica protein sequences grouped in a clade such that they displayed higher percent identity to each other than to orthologs present in other species, this was interpreted as an indication of a probable recent duplication event, either in the D. aromatica genome itself or in a progenitor species. Results of this analysis are shown in Table .
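The decision rule described above (paralogs more identical to each other than to orthologs in other species) can be sketched as follows. This is a minimal illustration that assumes a pre-computed alignment supplied as equal-length gapped strings; the function names, the toy sequences, and the simple identity measure (identical residues over columns where neither sequence has a gap) are assumptions made for the example and do not reproduce the FlowerPower, SCI-PHY, or Belvu procedures.

```python
from itertools import combinations

def percent_identity(a, b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)

def recent_duplication_candidates(paralogs, orthologs):
    """Return paralog pairs that are more identical to each other than
    either member is to any ortholog from another species.

    paralogs:  {gene_id: aligned sequence} from the genome of interest
    orthologs: {gene_id: aligned sequence} from other species
    """
    candidates = []
    for (id1, s1), (id2, s2) in combinations(sorted(paralogs.items()), 2):
        within = percent_identity(s1, s2)
        best_outside = max(
            (percent_identity(s, o) for s in (s1, s2) for o in orthologs.values()),
            default=0.0,
        )
        if within > best_outside:
            candidates.append((id1, id2, within))
    return candidates

# Toy, made-up aligned sequences with hypothetical identifiers
paralogs = {"geneA": "MKTLLVAG-S", "geneB": "MKTLLVAGAS", "geneC": "MRSVIVTGQT"}
orthologs = {"other_sp_1": "MKSLIVAGQS", "other_sp_2": "MRSVIVSGQT"}
result = recent_duplication_candidates(paralogs, orthologs)
```

In this toy data set only geneA and geneB are flagged, because geneC is closer to an ortholog from another species than to either paralog. A tree-based analysis, as used in the study, additionally accounts for clade structure, which a pairwise identity comparison does not capture.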
Potential gene family expansion is indicated in several functional groups, including the following: signaling proteins (including cAMP signaling, histidine kinases, and others), Mhp-like aromatic oxidation complexes, nitrogen metabolism proteins and transport proteins.
Most duplications indicate that a single gene, rather than a set of genes, was replicated. An exception is the Tra/Type IV transport cluster (VIMSS582581-582601 and VIMSS582864-582884) noted previously. In the histidine kinase/response regulator protein sets, duplication of the histidine kinase appears to occur without duplication of the adjacent response regulator. The paralogs created by recent duplication events are typically found well removed from one another on the chromosome, although some tandem repeats of single genes were noted. However, the highest percent identity was not found between pairs of genes in tandem repeats.
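The distinction between tandem and well-separated paralogs on a circular chromosome can be made concrete with a short sketch. The chromosome length is the value reported for the finished sequence; the function names, the gene start coordinates, and the 5,000-nucleotide cutoff are arbitrary assumptions for illustration and are not criteria taken from this study.

```python
CHROMOSOME_LENGTH = 4_501_104  # nucleotides in the finished circular chromosome

def circular_distance(pos_a, pos_b, length=CHROMOSOME_LENGTH):
    """Shortest distance between two positions on a circular chromosome."""
    d = abs(pos_a - pos_b) % length
    return min(d, length - d)

def classify_pair(pos_a, pos_b, tandem_cutoff=5_000):
    """Label a paralog pair 'tandem' or 'dispersed'.

    tandem_cutoff is a hypothetical threshold chosen for this example.
    """
    if circular_distance(pos_a, pos_b) <= tandem_cutoff:
        return "tandem"
    return "dispersed"

# Made-up start coordinates for two paralog pairs
near_origin = classify_pair(100, 4_501_000)        # pair spanning the origin
far_apart = classify_pair(1_000_000, 3_000_000)    # pair on distant regions
```

Measuring distance around the circle matters for genes that flank the sequence origin: the first pair is only 204 nucleotides apart even though its linear coordinates differ by millions.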