“Anaerocellum thermophilum” DSM 6725 is a strictly anaerobic bacterium that grows optimally at 75°C. It uses a variety of polysaccharides, including crystalline cellulose and untreated plant biomass, and has potential utility in biomass conversion. Here we report its complete genome sequence of 2.97 Mb, which is contained within one chromosome and two plasmids (of 8.3 and 3.6 kb). The genome encodes a broad set of cellulolytic enzymes, transporters, and pathways for sugar utilization. Among the genomes of other saccharolytic, anaerobic thermophiles, it is most similar to that of Caldicellulosiruptor saccharolyticus DSM 8903.
Microorganisms that grow at elevated temperatures and are able to utilize a range of carbohydrates have potential utility in the conversion of lignocellulosic biomass to bioenergy. Many hyperthermophilic bacteria and archaea (optimal temperature [Topt], ≥80°C) from marine environments are able to grow on various α- and β-linked glucans, but none of them are able to efficiently hydrolyze crystalline cellulose and plant biomass (1). The most thermophilic cellulolytic species known at present include the strictly anaerobic bacterium “Anaerocellum thermophilum.” The type strain (Z-1320) was isolated from thermal springs of Kamchatka (Russia) almost 2 decades ago (10). It grows optimally at 75°C and utilizes both simple and complex polysaccharides, with lactate, acetate, CO2 and H2 as end products (10). In particular, A. thermophilum DSM 6725 efficiently utilizes the two main components of plant biomass (cellulose and hemicellulose), as well as untreated grasses with low-lignin (napier grass, Bermuda grass) or high-lignin (switchgrass) contents and a hardwood (poplar) (S.-J. Yang, I. Kataeva, S. D. Hamilton-Brehm, N. L. Engle, T. J. Tschaplinski, C. Doeppke, M. Davis, J. Westpheling, and M. W. W. Adams, submitted for publication).
The genome of A. thermophilum DSM 6725 was sequenced at the U.S. Department of Energy Joint Genome Institute (JGI) using an 8-kb library. In addition to Sanger sequencing, 454 pyrosequencing (454 Life Sciences) was carried out to an average depth of coverage of 20×. All general aspects of library construction and sequencing performed at the JGI can be found at http://www.jgi.doe.gov. Draft assemblies were based on 38,121 total reads, and all libraries provided 13× coverage. The Phred/Phrap/Consed software package (http://www.phrap.com) was used for sequence assembly and quality assessment (4-6). After the shotgun stage, reads were assembled with parallel Phrap (High Performance Software LLC). Possible misassemblies were corrected with Dupfinisher (7) or transposon bombing of bridging clones (Epicentre Biotechnologies, Madison, WI). Gaps between contigs were closed by editing in Consed, custom primer walking, or PCR amplification (Roche Applied Science, Indianapolis, IN). A total of 981 additional reactions were necessary to close gaps and to raise the quality of the finished sequence. The completed genome sequences contain 41,706 reads, with an average of 13.4× coverage achieved in the chromosome (and 56× in pATHE01 and 15× in pATHE02) with an error rate of 0.01 in 100,000.
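The stated error rate of 0.01 in 100,000 bases can be put on the Phred quality scale, where Q = -10 log10(P) for a per-base error probability P. The following is a minimal sketch of that conversion. The function names are hypothetical, and the source does not say how the reported figure was derived.

```python
import math


def phred_from_error(p_error: float) -> float:
    """Convert a per-base error probability to a Phred quality score."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(p_error)


def error_from_phred(q: float) -> float:
    """Convert a Phred quality score back to a per-base error probability."""
    return 10.0 ** (-q / 10.0)


# Error rate quoted in the text: 0.01 errors per 100,000 bases.
reported_error = 0.01 / 100_000
reported_q = phred_from_error(reported_error)
```

Under this definition, the reported rate is a per-base error probability of 1e-7, which is a Phred score of about 70.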
The chromosome of 2,919,718 bp has 35.17% GC. Plasmids pATHE01 of 8,291 bp and pATHE02 of 3,653 bp have 38.53% and 42.92% GC, respectively. Two native A. thermophilum DSM 6725 plasmids, pBAL (GenBank accession no. AX710673) and pBAS2 (AX710687), had been sequenced earlier (2). The sequence of pATHE02 perfectly matches that of pBAS2, while that of pATHE01 is similar to that of pBAL, but the pBAL sequence contains 19 gaps and 3 mismatches. Similarly, the gene sequences of 16S rRNA (GenBank accession number L09180) (9) and of the large multidomain glycoside hydrolase CelA (GenBank accession number Z86105) (12) from A. thermophilum Z-1320 were previously reported. The corresponding genes in the genome sequence contain 2 inserts and 12 mismatches (16S rRNA) and 23 mismatches (CelA).
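GC content is the percentage of G and C bases in a sequence, and a figure for several replicons together is a length-weighted mean of the per-replicon values. The sketch below shows both calculations using the lengths and GC percentages quoted above. The function names are hypothetical, and the combined value is computed here for illustration only; it is not reported in the source.

```python
def gc_content(sequence: str) -> float:
    """Return the GC percentage of a nucleotide sequence (A/C/G/T only)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def weighted_gc(replicons: dict[str, tuple[int, float]]) -> float:
    """Length-weighted mean GC percentage over (length_bp, gc_percent) pairs."""
    total_length = sum(length for length, _ in replicons.values())
    weighted_sum = sum(length * gc for length, gc in replicons.values())
    return weighted_sum / total_length


# Lengths (bp) and GC percentages as quoted in the text.
replicons = {
    "chromosome": (2_919_718, 35.17),
    "pATHE01": (8_291, 38.53),
    "pATHE02": (3_653, 42.92),
}
combined_gc = weighted_gc(replicons)
```

Because the plasmids are so small relative to the chromosome, the combined value stays within a few hundredths of a percentage point of the chromosomal 35.17%.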
The genome size of A. thermophilum DSM 6725 is similar to that of the cellulolytic bacterium Caldicellulosiruptor saccharolyticus DSM 8903 (Topt of 70°C, 2.97 Mb, 35% GC; GenBank accession number CP000679), which was also isolated from a continental hot spring (11), but smaller than that of the cellulolytic bacterium Clostridium thermocellum ATCC 27405 (Topt of 60°C, 3.8 Mb, 39% GC; accession no. NC_009012) (JGI) and larger than that of the xylanolytic bacterium Thermotoga maritima MSB8 (Topt of 80°C, 1.86 Mb, 46% GC; accession no. NC_000853) (8).
The chromosome of A. thermophilum DSM 6725 is predicted to contain 2,662 coding sequences, three rRNA operons, and 47 tRNA genes. pATHE01 and pATHE02 are predicted to contain 8 and 4 open reading frames, respectively. The genes in the genome of A. thermophilum DSM 6725 are predicted to be organized into 573 multigene transcripts and 626 single-gene transcripts (3), and a total of 102 transcripts contain genes that are predicted to be involved in the degradation of complex polysaccharides. While most of the genes in A. thermophilum DSM 6725 have their best BLAST hits (E value of <1e-20) in the genome of C. saccharolyticus DSM 8903, a total of 550 genes do not. Of these, 18 are predicted to have functions relating to biomass degradation, suggesting that these genes may contribute to any nutritional differences between the two organisms. The genomes of A. thermophilum DSM 6725 and C. saccharolyticus DSM 8903 contain 25 and 68 putative transposase genes, respectively. This might account for the apparent genome plasticity within the two genomes of these closely related bacteria, which were isolated from similar geothermal freshwater environments.
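The best-hit comparison at an E value of <1e-20 can be sketched as a filter over BLAST tabular output (the standard 12-column format, with the E value in column 11 and the bit score in column 12). This is only an illustration under stated assumptions: the source does not describe the tooling, input format, or tie-breaking used, and the function names and gene identifiers below are hypothetical.

```python
def best_hits(tabular_lines, max_evalue=1e-20):
    """Return {query: (subject, evalue, bitscore)} for the top-scoring hit
    of each query whose E value is below max_evalue."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= max_evalue:
            continue  # hit does not pass the E-value threshold
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


def genes_without_hit(all_genes, hits):
    """Return the genes that have no hit passing the threshold."""
    return sorted(set(all_genes) - set(hits))


# Hypothetical example rows (identifiers are made up for illustration).
rows = [
    "geneA\tsubj1\t90.0\t300\t30\t0\t1\t300\t1\t300\t1e-50\t400",
    "geneA\tsubj2\t80.0\t300\t60\t0\t1\t300\t1\t300\t1e-30\t250",
    "geneB\tsubj3\t40.0\t100\t60\t0\t1\t100\t1\t100\t1e-5\t50",
]
hits = best_hits(rows)
unmatched = genes_without_hit(["geneA", "geneB", "geneC"], hits)
```

In this toy input, `geneA` keeps its higher-scoring hit to `subj1`, while `geneB` (E value above the threshold) and `geneC` (no hits at all) are reported as unmatched.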
The genome sequence and annotation of the Anaerocellum thermophilum DSM 6725 chromosome and the two plasmids pATHE01 and pATHE02 were deposited in GenBank under accession numbers CP001393, CP001394, and CP001395, respectively.
This research was supported by a grant (DE-PS02-06ER64304) from the Bioenergy Science Center (BESC), Oak Ridge National Laboratory, a U.S. Department of Energy (DOE) Bioenergy Research Center supported by the Office of Biological and Environmental Research in the DOE Office of Science. Work at the Joint Genome Institute is performed under the auspices of the U.S. Department of Energy's Office of Science, Biological and Environmental Research Program and by the University of California Lawrence Berkeley National Laboratory under contract DE-AC02-05CH11231, by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344, and by Los Alamos National Laboratory under contract DE-AC02-06NA25396.
Published ahead of print on 3 April 2009.