Members of the gram-negative, strictly aerobic genus Comamonas occur in various environments. Here we report the complete genome of Comamonas testosteroni strain CNB-2. Strain CNB-2 has a circular chromosome that is 5,373,643 bp long and has a G+C content of 61.4%. A total of 4,803 open reading frames (ORFs) were identified; 3,514 of these ORFs are functionally assigned to energy production, cell growth, signal transduction, or transportation, while 866 ORFs encode conserved hypothetical proteins and 423 ORFs encode purely hypothetical proteins. The CNB-2 genome has many genes for transportation (22%) and signal transduction (6%), which allow the cells to respond and adapt to changing environments. Strain CNB-2 does not assimilate carbohydrates due to the lack of genes encoding proteins involved in glycolysis and pentose phosphate pathways, and it contains many genes encoding proteins involved in degradation of aromatic compounds. We identified 66 Tct and nine TRAP-T systems and a complete tricarboxylic acid cycle, which may allow CNB-2 to take up and metabolize a range of carboxylic acids. This nutritional bias for carboxylic acids and aromatic compounds enables strain CNB-2 to occupy unique niches in environments. Four different sets of terminal oxidases for the respiratory system were identified, and they putatively function at different oxygen concentrations. This study conclusively revealed at the genomic level that the genetic versatility of C. testosteroni is vital for competition with other bacteria in its special niches.
The members of the genus Comamonas are gram-negative, strict aerobes and frequently occur in diverse habitats, including activated sludge, marshes, marine habitats, and plant and animal tissues (4, 12, 13). They grow on organic acids, amino acids, and peptone, but they rarely attack carbohydrates. Some species, such as Comamonas testosteroni, can also mineralize complex and xenobiotic compounds, such as testosterone (17) and 4-chloronitrobenzene (CNB) (54). Their diversified niches make Comamonas species environmentally important and also suggest that the genus Comamonas represents a group of bacteria that can adapt very well, both ecologically and physiologically, to environments.
To better understand how environmental microbes adapt to their environments, many well-known environmental microbes, such as Pseudomonas putida (53) and Rhodococcus sp. strain RHA1 (31), have been sequenced. The genome data for these organisms, as well as other environmental microbes, provide not only an understanding of physiological and environmental functions at the genetic level but also a starting point for systems biology analyses of these microbes. Until now, none of the Comamonas species has been sequenced, although these organisms represent an important group of environmental microbes.
C. testosteroni strain CNB-1 was isolated from CNB-contaminated activated sludge and grows with CNB as a sole source of carbon and nitrogen, and it has been used successfully for rhizoremediation of CNB-polluted soil (25). Strain CNB-1 has a circular chromosome and a large plasmid, and the genes involved in the degradation of CNB on plasmid pCNB1 were identified previously (28). In the present study, the genome of strain CNB-2, which was derived from strain CNB-1, was sequenced, and a genome analysis was performed in parallel with physiological experiments. The aim of this work was to obtain genetic insight into how C. testosteroni adapts to changing and diverse environments.
Strain CNB-2 is a mutant of C. testosteroni CNB-1 that lost the degrading plasmid pCNB1 (28). The physiological and biochemical properties of both CNB-1 and CNB-2 were determined by using API 20NE and API 50CH identification systems (bioMérieux, Hazelwood, MO) at 30°C.
The genome of CNB-2 was sequenced by using a massively parallel pyrosequencing technology (454 GS 20) (30). All useful reads were assembled into 545 contigs (>500 bp) covering the genome 25-fold. The gaps were closed by using a multiplex PCR technique (10, 48). Finally, the sequences were assembled using the PHRED_PHRAP_CONSED software package on a Unix system (10, 48). The final consensus quality score of each base was more than 30. Genome assembly was assessed by performing long-range PCR and GC skew analysis (see Fig. S1 in the supplemental material).
Coding sequences were predicted by combining results obtained with Glimmer packages and GeneMark online (6, 27). Translational products of the coding sequences were annotated by searching the GenBank database using the basic local alignment search tool (BLAST) for proteins. Putative functions of translation products were confirmed by using KEGG, TIGRFams, and Clusters of Orthologous Groups (COGs) (14, 33, 46). The tRNA genes were identified by using the tRNAScan-SE tool (26). The rRNA genes were identified by BLASTN using the 16S rRNA gene of strain CNB-1. Transporter genes were predicted by analyzing all annotated genes (putative proteins) with the Transport Classification Database (http://www.tcdb.org) (39). The metabolic pathway of CNB-2 was constructed in silico by using KEGG tools (33). The protein domains were predicted using the NCBI Conserved Domain Database (http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml) (29). Transmembrane domains and signal peptide regions were identified with TMHMM2.0 and SignalP3.0, respectively (2, 23).
The genome data for C. testosteroni KF-1 (accession no. NZ_AAUJ00000000), Escherichia coli ATCC 8739 (CP000946), and P. putida KT2440 (AE015451) were retrieved from a website (http://www.ncbi.nlm.nih.gov/genomes). BLASTP was employed for protein sequence analysis. Significant similarity of proteins was defined using an e value of <1e−05 and >30% identity.
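As an illustration of the similarity criterion above, the following minimal Python sketch filters BLASTP hits by e value and percent identity. It assumes that the hits were saved in the standard 12-column tabular format (BLAST+ `-outfmt 6`), which is not stated in this paper; the function names and the example identifiers (e.g., `KF1_0001`) are hypothetical.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent identity (column 3 of the tabular output)
    evalue: float    # e value (column 11 of the tabular output)


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse tab-separated BLAST hits, skipping blank and comment lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        yield Hit(fields[0], fields[1], float(fields[2]), float(fields[10]))


def significant(hits: Iterable[Hit],
                max_evalue: float = 1e-5,
                min_identity: float = 30.0) -> List[Hit]:
    """Keep hits with e value < max_evalue and identity > min_identity."""
    return [h for h in hits
            if h.evalue < max_evalue and h.identity > min_identity]


# Hypothetical example lines (not real data from this study).
example = [
    "CtCNB1_0001\tKF1_0001\t98.5\t300\t4\t0\t1\t300\t1\t300\t2e-150\t520",
    "CtCNB1_0002\tKF1_0417\t28.0\t120\t80\t3\t5\t124\t9\t128\t1e-08\t55.1",
    "CtCNB1_0003\tKF1_2210\t45.2\t90\t48\t1\t1\t90\t3\t92\t0.002\t38.9",
]
kept = significant(parse_tabular(example))
```

In this toy input only the first hit passes: the second fails the identity threshold and the third fails the e-value threshold.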
The complete genome sequence and annotation of C. testosteroni CNB-2 have been deposited in the GenBank database under accession number CP001220.
Strain CNB-2 has a circular chromosome that is 5,373,643 bp long and has a G+C content of 61.4% (Fig. 1). It contains 4,803 predicted open reading frames (ORFs), three rRNA operons, and 79 tRNA genes for all 20 amino acids (Table 1). Based on BLASTP searches (e value, <1e−5), 3,514 ORFs encode proteins with assigned functions, 866 ORFs encode conserved hypothetical proteins, and 423 ORFs encode hypothetical proteins. Functions were predicted for 4,649 ORFs using COG functional assignment. The gene coding sequences cover 86.4% of the genome, and the average length of the coding sequences is 963 bp. The likely origin of replication (oriC) was located between the ribosomal protein L34 and DnaA genes based on GC skew analysis (see Fig. S1 in the supplemental material). Four putative DnaA boxes (DnaA binding sites) were found between the L34 and DnaA genes.
The genome of CNB-2 contains large numbers of mobile genetic elements (MGEs) (Table 2). Three putative prophages (prophages I, II, and III) were identified. Remarkably, these prophages harbor accessory genes possibly acquired by horizontal gene transfer (HGT) from other bacterial species. For example, a gene cluster encoding putative terephthalate dioxygenase (CtCNB1_1493 to CtCNB1_1496) was identified in prophage I. A total of 39 ORFs encoding various transposases and integrases of the IS21, IS4, IS3, and IS110 families were discovered (Table 2). The presence of highly homologous or even identical ORFs among these 39 ORFs suggests that recent gene transposition or gene duplication events occurred in the genome. The MGEs in the CNB-2 genome are located mainly in the region near the terminal DNA replication site.
HGT might result in significant changes in the G+C content of the genome and provide additional physiological capacity to the cells. In the region from 2.75 to 2.97 Mbp, the G+C content is 57.0%, which is lower than the G+C content of the whole genome (61.4%). Based on the observation that there are genes encoding many putative transposases in this region (Fig. 1), it was deduced that strain CNB-2 acquired this region via HGT. Functionally, this region of the CNB-2 genome contains gene clusters putatively involved in xenobiotic compound degradation (e.g., protocatechuate degradation pathway) and heavy metal resistance (copper and arsenate resistance).
A different strain of C. testosteroni, strain KF-1, is being sequenced (http://www.ncbi.nlm.nih.gov). Based on 16S rRNA gene analysis, strains CNB-2 and KF-1 are close phylogenetic relatives (99% identity). In addition, they occur in similar niches (activated sludge) and are able to degrade xenobiotics. A comparison of the two genomes showed that they are highly similar to each other; the genetic evidence that these strains are ecologically and physiologically similar includes the finding that 4,272 of the 4,803 ORFs of strain CNB-2 are significantly similar to their counterparts in strain KF-1, whose genome contains 5,492 ORFs. However, the results also revealed differences between CNB-2 and KF-1 (Table 1). The difference in the sizes of the two genomes is ca. 0.7 Mbp. The three prophages of CNB-2 were not found in the KF-1 genome. Corresponding ORFs were not found in KF-1 for a total of 531 ORFs of CNB-2 that are frequently associated with MGEs. The genomic differences between CNB-2 and KF-1 are at least partially due to the abundance of MGEs in the CNB-2 genome. The results of a further comparative analysis of the genomes of strains CNB-2, KF-1, P. putida KT2440, and E. coli ATCC 8739 are shown in Fig. S2 in the supplemental material.
Strain CNB-2 grows on mineral medium, indicating that it is able to synthesize all its cellular components, such as fatty acids, purine and pyrimidine nucleotides, and the 20 amino acids. The results of the in silico genomic analysis supported this hypothesis. More information concerning general metabolism is shown in Fig. 2.
The genus Comamonas was defined as a genus that has a poor ability to assimilate carbohydrates. A total of 49 carbon sources were tested in this study, and CNB-2 was not able to assimilate any sugar except glycerol and gluconate. The glycolysis in glucose catabolism was incomplete due to the lack of hexokinase and glucokinase genes. Genes encoding glucose-6-phosphate 1-dehydrogenase and 6-phosphogluconolactonase of the pentose phosphate pathway were not found in the CNB-2 genome. Thus, oxidation of glucose via this pentose phosphate pathway was not possible (Fig. 2). However, the nonoxidative part of the pentose phosphate pathway was complete, and five-carbon sugars were generated for biosynthesis. An almost complete Entner-Doudoroff pathway was constructed in silico for gluconate assimilation (Fig. 2), and physiological tests showed that strain CNB-2 grew on gluconate. In contrast to the poor ability of Comamonas species to metabolize carbohydrates, members of the genus Pseudomonas, such as P. putida KT2440, are able to metabolize many sugars (19). This genome project examining a representative of Comamonas species revealed that these species do not have a complete sugar-metabolizing pathway. Based on these findings, it was proposed that Comamonas and Pseudomonas species, which are two groups of environmental microbes, might play different roles in the biogeochemical cycles of elements.
C. testosteroni was named based on its ability to metabolize testosterone. A cluster of genes (CtCNB1_1351 to CtCNB1_1362) similar to the previously identified genes responsible for testosterone degradation in C. testosteroni TA441 was found in the strain CNB-2 genome (1, 16-18) (Fig. 3).
As an important environmental microbe, strain CNB-2 is able to utilize many other kinds of compounds, such as aromatics and short-chain fatty acids, as carbon sources. In the CNB-2 genome, 37, 18, and 47 genes encoding putative dioxygenases, hydroxylases, and oxidoreductases, respectively, were annotated for aromatic and cyclic hydrocarbon degradation. Strain CNB-2 was able to grow on benzoate, gentisate, phenol, 3-hydroxybenzoate, 4-hydroxybenzoate, protocatechuate, and vanillate, and the corresponding gene clusters in the genome were identified (Fig. 3). Key genes involved in degradation of the following compounds were identified in the CNB-2 genome: catechol (CtCNB1_3144 to CtCNB1_3155), protocatechuate (CtCNB1_2740 to CtCNB1_2747), and gentisate (CtCNB1_2777 to CtCNB1_2780). A range of peripheral enzymes that direct aromatic compounds into central pathways were also discovered, such as benzoyl-coenzyme A (benzoyl-CoA) ligase (CtCNB1_0097), benzoyl-CoA oxygenase (CtCNB1_0065 to CtCNB1_0067), phenol hydroxylase (CtCNB1_3158 to CtCNB1_3162), 3-hydroxybenzoate hydroxylase (CtCNB1_3410), 4-hydroxybenzoate hydroxylase (CtCNB1_2156), and vanillate monooxygenase (CtCNB1_4192). Also, 66 Tct systems and nine TRAP-T systems (see Table S1 in the supplemental material) for transportation of carboxylic acids were identified. The carboxylic acids were metabolized through the tricarboxylic acid cycle to provide energy and key intermediates for cell growth.
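The G+C content and the GC skew analysis mentioned above can be illustrated with a minimal Python sketch. It uses the common cumulative GC skew approach, in which the minimum of the cumulative (G − C)/(G + C) curve often lies near the replication origin of a bacterial chromosome; whether this exact variant and window size were used in this study is an assumption, and the function names are hypothetical.

```python
from typing import List


def gc_content(seq: str) -> float:
    """Return the G+C content of a DNA sequence as a percentage."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def cumulative_gc_skew(seq: str, window: int = 1000) -> List[float]:
    """Cumulative sum of (G - C) / (G + C) over consecutive windows."""
    seq = seq.upper()
    total = 0.0
    curve = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        if g + c:
            total += (g - c) / (g + c)
        curve.append(total)
    return curve


def skew_minimum_position(seq: str, window: int = 1000) -> int:
    """End coordinate of the window where the cumulative skew is lowest."""
    curve = cumulative_gc_skew(seq, window)
    idx = min(range(len(curve)), key=curve.__getitem__)
    return min((idx + 1) * window, len(seq))


# Toy sequence: a C-rich half followed by a G-rich half.
toy = "ACCT" * 50 + "AGGT" * 50
toy_gc = gc_content(toy)
toy_min = skew_minimum_position(toy, window=40)
```

On the toy sequence the cumulative skew falls through the C-rich half and rises through the G-rich half, so the minimum sits at the junction between the two halves. On a real chromosome the predicted position would still need supporting evidence, such as the DnaA boxes described above.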
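The reasoning above rests on a local G+C content that is 61.4 − 57.0 = 4.4 percentage points below the genome average. A minimal Python sketch of how such regions could be flagged with a sliding window is given below; the window size, step, and threshold are illustrative assumptions rather than parameters reported in this study, and the function name is hypothetical.

```python
from typing import List, Tuple


def low_gc_windows(seq: str, window: int, step: int,
                   min_drop: float) -> List[Tuple[int, int, float]]:
    """Return (start, end, gc_percent) for windows whose G+C content is at
    least min_drop percentage points below the whole-sequence G+C content."""
    seq = seq.upper()

    def gc(s: str) -> float:
        return 100.0 * (s.count("G") + s.count("C")) / len(s) if s else 0.0

    genome_gc = gc(seq)
    flagged = []
    for start in range(0, len(seq) - window + 1, step):
        value = gc(seq[start:start + window])
        if genome_gc - value >= min_drop:
            flagged.append((start, start + window, value))
    return flagged


# Toy sequence: GC-rich flanks around an AT-only insert at 200-300.
toy = "GC" * 100 + "AT" * 50 + "GC" * 100
candidates = low_gc_windows(toy, window=100, step=100, min_drop=10.0)
```

A low G+C window alone does not demonstrate HGT; in the text above, the inference also relies on the presence of many putative transposase genes in the same region.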
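The ORF counts reported above can be cross-checked with simple arithmetic. The short Python sketch below uses only the numbers given in the text; the function and variable names are hypothetical.

```python
from typing import Tuple


def orf_comparison(total_orfs: int, with_counterpart: int) -> Tuple[int, float]:
    """Return the number of ORFs without a counterpart and the percentage
    of ORFs that have a significantly similar counterpart."""
    without = total_orfs - with_counterpart
    percent_shared = 100.0 * with_counterpart / total_orfs
    return without, percent_shared


# Counts reported for strain CNB-2 in the text.
cnb2_total = 4803
cnb2_with_kf1_counterpart = 4272
categories = {"assigned function": 3514, "conserved": 866, "hypothetical": 423}

without, percent_shared = orf_comparison(cnb2_total, cnb2_with_kf1_counterpart)
categories_total = sum(categories.values())
```

Here `without` equals the 531 CNB-2 ORFs lacking a KF-1 counterpart, about 88.9% of the CNB-2 ORFs have a counterpart, and the three annotation categories add up to the 4,803 predicted ORFs.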
Strain CNB-2 uses nitrate and ammonium as nitrogen sources, and three genes (CtCNB1_0130, CtCNB1_0202, and CtCNB1_0616) encoding ammonium channels and two gene clusters (CtCNB1_0586 to CtCNB1_0588 and CtCNB1_1234 to CtCNB1_1236) for ABC-type nitrate transporters were found in the genome. Inorganic nitrogen is incorporated into glutamine by glutamine synthetase (CtCNB1_3271). Additionally, genes encoding putative transporters for import of exogenous amino acids, oligopeptides, and branched-chain amino acids were identified in the CNB-2 genome. These compounds could be metabolized through the urea cycle and tricarboxylic acid cycle, which could supply not only an adequate nitrogen source but also a carbon source for cell growth.
Strain CNB-2 contains genes for a broad range of transport proteins (Fig. 2; see Table S1 in the supplemental material). A total of 22% of all genes were predicted to be involved in substrate transport. This value is higher than the values for E. coli, Haemophilus influenzae, and Helicobacter pylori, which live in relatively closed environments, but it is similar to the values for microbes such as P. putida that live in open environments. Ten genes were predicted to encode a channel transporter (TC #1.A) responsible for ammonia, magnesium-cobalt, and urea-amide. A total of 378 genes probably encode carriers (TC #2) for sugars, polyols, drugs, neurotransmitters, Krebs cycle metabolites, phosphorylated glycolytic intermediates, amino acids, peptides, osmolytes, siderophores, iron siderophores, nucleosides, organic anions, and inorganic anions. Thirteen predicted APC family transporters (TC #2.A.3) may be involved in amino acid transport in the cells. Thirty-six genes encode 12 protein complexes of the RND family, which catalyze drug or heavy metal efflux via an H+ antiport mechanism. In the genome of CNB-2, the group of transporters (TC #3.A) driven by P~P bond hydrolysis is the largest group, and such transporters export or import a large number of compounds, including amino acids, drugs, metal ions, inorganic anions, proteins, and aromatic compounds.
Effective transport is important for microbial evolution. Import of compounds into cells is vital for metabolic pathways, and export of toxic compounds out of cells is also vital for bacterial survival in harsh environments. As mentioned above, 22% of the genes in the CNB-2 genome were predicted to encode channels, electrochemical potential-driven transporters, ABC-type transporters, RND efflux systems, protein secretion systems, and other transport-related proteins based on the results obtained using BLASTP searches in the Transport Classification Database (e value, <1e−5). This value is similar to the values for bacteria living in soil or plants, including Streptomyces coelicolor (35), P. putida KT2440 (53), and Pseudomonas aeruginosa (24), but it is higher than the values for symbiotic and pathogenic bacteria, such as Mycobacterium leprae (51). Further analysis revealed that P. aeruginosa (24) and Agrobacterium tumefaciens (5) possess high numbers of transporters for sugars, amino acids, and peptides and that C. testosteroni CNB-2 and P. putida KT2440 (53) have many fewer transporters for sugars. Interestingly, P. putida utilizes many sugars for growth, but strains of Comamonas species, including strain CNB-2, assimilate few sugars.
Protein export that controls secretion of toxins, hydrolytic enzymes, and adhesins has an important role in Comamonas species living in various niches and colonizing different surfaces. Four types of protein secretion have been identified in gram-negative bacteria (40), but only two types were found when the CNB-2 genome was examined. The putative type I secretion system of strain CNB-2 consists of proteins encoded by five gene clusters (CtCNB1_2177 to CtCNB1_2179, CtCNB1_2281 to CtCNB1_2285, CtCNB1_2442 to CtCNB1_2446, CtCNB1_3019 to CtCNB1_3022, and CtCNB1_3183 to CtCNB1_3186), including a pore-forming outer membrane protein, a membrane fusion protein, and an inner membrane ATP-binding cassette protein. The type I secretion system is also involved in exporting arabinose, hemolysin, and drugs from the cell. The type II secretion system functions via Sec, Tat, and Pul export pathways (3, 52), and the type II secretion system genes are usually conserved in gram-negative bacteria (7). The Sec genes of CNB-2 were identified and were similar to those of E. coli, but they were not clustered, except for the SecD, SecF, and YajC genes (CtCNB1_4737 to CtCNB1_4739), which were organized consecutively and encoded a putative protein complex. The Tat genes, tatABC (CtCNB1_0735 to CtCNB1_0737), were located in a putative operon, but a tatE gene was not found. As suggested by previous studies (41), the function of TatE might be replaced by TatA. A gene cluster (CtCNB1_0617 to CtCNB1_0634) that encoded proteins similar to proteins of Klebsiella oxytoca (38) was identified in the CNB-2 genome.
In order to respond to environmental changes in a quick and efficient way, environmental bacteria have evolved various signal-sensing and transduction systems (36, 44). According to COG and Conserved Domain Database analyses (47), 300 ORFs in the CNB-2 genome were assigned to functions for signal transduction and regulation. Furthermore, 58 ORFs were predicted to encode putative response regulators, 46 ORFs were predicted to encode putative histidine kinases, and 10 ORFs were predicted to encode putative hybrid sensory kinases. These putative regulators and kinases were further grouped into 43 complete two-component signal transduction systems (see Table S2 in the supplemental material). Moreover, 64 ORFs that contained one output domain (50) and encoded proteins that putatively functioned as one-component signal transduction systems were identified in the CNB-2 genome (see Table S3 in the supplemental material). The high number of signal transduction and regulation genes in the genome of CNB-2 is comparable to the number in the P. aeruginosa PAO1 genome (45) and is higher than the number in the E. coli genome (32).
The comprehensive and diverse signal transduction and regulation systems enable strain CNB-2 to respond to various environmental changes. This conclusion is supported by analysis of functional domains of the signal transduction systems. Various input (signal-sensing) domains, such as HAMP, PAS, KdpD, and CHASE, of one- and two-component systems were discovered (see Tables S3 and S4 in the supplemental material) (8). In particular, the PAS domain (55) was frequently identified (see Tables S2 and S3 in the supplemental material), and three PAS domains were found in the putative CtCNB1_4372 protein (see Table S3 in the supplemental material). The CtCNB1_4372 protein, a putative one-component signal transduction system with a putative ammonium monooxygenase neighbor, might be involved in the response to changing oxygen concentrations and in regulation of ammonium assimilation by strain CNB-2.
Chemotaxis is an important microbial response to environmental signals, and chemotactic responses to aromatic compounds have been characterized for Pseudomonas and Ralstonia species (11, 34). Our results showed that strain CNB-2 exhibited chemotaxis toward succinate, while strain CNB-1 exhibited chemotaxis toward both succinate and CNB. Although the putative genes linking the sensing of chemical signals and cell movement could not be identified at this stage, some related genes were identified in the CNB-2 genome. One locus (CtCNB1_0471 to CtCNB1_0475) encoding three putative regulators and one putative kinase is located between the genes for flagellar biosynthesis and chemotaxis, and ORFs encoding putative proteins (CtCNB1_0475, CtCNB1_0476, and CtCNB1_0479) that were homologous to the CheA, CheY, and CheB proteins were found in the CNB-2 genome.
Several loci of the CNB-2 genome were predicted to be responsible for drug and heavy metal export (see Table S1 in the supplemental material). Three gene clusters (CtCNB1_2506 to CtCNB1_2509, CtCNB1_3138 to CtCNB1_3141, and CtCNB1_3971 to CtCNB1_3974) encode the system for exporting arsenate. Analysis of the CNB-2 genome revealed that this strain has six CzcA-type heavy metal efflux pumps and 11 RND-type efflux transporters for various drugs (Fig. 2; see Table S2 in the supplemental material). The genes encoding these pumps and transporters possibly conferred additional resistance to heavy metals and drugs to strain CNB-2. Moreover, strain CNB-2 has many chaperones that help the cell overcome toxicity. Toxic drugs and heavy metals could induce heat shock proteins (42, 49), and genes encoding heat shock proteins (HslJ [CtCNB1_3317], IbpA [CtCNB1_3446], HslU [CtCNB1_4124], and HslJ [CtCNB1_4571]) were identified in the genome of strain CNB-2 (see Table S4 in the supplemental material). Interestingly, the drug and heavy metal resistance genes are associated with MGEs. For example, four gene clusters (CtCNB1_2496 to CtCNB1_2499, CtCNB1_2507 to CtCNB1_2510, CtCNB1_2513 to CtCNB1_2520, and CtCNB1_2521 to CtCNB1_2524) encoding P-type ATPase, the CzcA family efflux pump, and arsenate resistance are flanked by genes encoding Tn3-type transposases, IS116-IS110-IS902-type transposases, and IS4-type transposases, respectively.
Strain CNB-2 is strictly aerobic and requires oxygen molecules as the final electron acceptor. Genes encoding all the aerobic respiratory chain members were tentatively identified in the CNB-2 genome. Four putative terminal oxidase complexes, encoded by CtCNB1_1811 to CtCNB1_1814, CtCNB1_2157 to CtCNB1_2161, CtCNB1_3937 to CtCNB1_3941, and CtCNB1_3431 to CtCNB1_3432, that were highly similar to cbb3-type, bo3-type, aa3-type, and bd-type complexes (20) were found. It was reported previously that cbb3-type and bd-type terminal oxidases functioned at lower oxygen concentrations in endosymbiotic rhizobia (37) and under microaerobic conditions in nitrogen-fixing bacteria (15, 22) and in association with animals (21, 43). Thus, the robust competitiveness of strain CNB-1 in the rhizosphere and wastewater treatment plants may be attributed to the presence of multiple terminal oxidases (25). Genome analysis suggested a putative transhydrogenase (CtCNB1_0257 to CtCNB1_0258) that might regulate energy metabolism in strain CNB-2 at different oxygen concentrations.
Our physiological experiment showed that CNB-2 could respire with nitrate aerobically to produce energy. Accordingly, a gene cluster consisting of five genes (napABCDE) that is responsible for the first step of aerobic denitrification (9) and encodes putative periplasmic nitrate reductases was identified. Thus, strain CNB-2 has multiple terminal oxidases that putatively function at different oxygen concentrations and is able to use nitrate as an electron acceptor.
C. testosteroni CNB-2 is the first member of the genus Comamonas that has been completely sequenced. Its genome size is close to the genome sizes previously reported for Pseudomonas strains, including P. putida GB-1 (6.1 Mb) (accession no. CP000926), P. putida KT2440 (6.2 Mb) (AE015451), and Pseudomonas stutzeri A1501 (4.6 Mb) (CP000304). Comamonas species have been found to be ubiquitous in terrestrial and aquatic environments and to be associated with plants and animals. They have also been found in severely polluted environments. The diverse niches occupied by Comamonas species reflect their effective adaptation to various physicochemical conditions. In this study, the high genetic enrichment and capacity of C. testosteroni CNB-2 revealed its evolutionary adaptations and genetic foundation for living in diverse and complex ecological niches. The first description of a C. testosteroni genome provided in this paper should significantly facilitate further basic and applied investigations of Comamonas species.
This work was supported by grants 30725001 and 30621005 from the National Natural Science Foundation of China.
Published ahead of print on 4 September 2009.
†Supplemental material for this article may be found at http://aem.asm.org/.