Approximately two-thirds of biogenic methane is derived from acetate, yet only two genera of methanoarchaea that can utilize acetate as a substrate for methanogenesis, Methanosarcina and Methanosaeta, have been isolated. While various attributes and tools have allowed Methanosarcina spp. to be extensively studied, Methanosaeta spp. have received little attention due to their slow growth and the difficulty of culturing them. To address the recalcitrant nature of this genus, the Methanosaeta thermophila genome sequence has been completed (6), and the complete genome sequence of Methanosaeta concilii is announced here.
A whole-genome shotgun approach was used for sequencing of the M. concilii GP6 genome. The sequence data were acquired on an ABI 3700 capillary sequencer, and the 52,189 attempted shotgun reads provided 11.3× sequence coverage. The genome sequence was assembled using Phred/Phrap software tools (1) and viewed in CONSED (4). Fosmid paired-end sequences were used to identify misassembled regions. The sequence assembly was further refined, and finishing experiments were designed using the Autofinish tool in CONSED (3). In all, 11,291 autofinish and advanced finishing reads were attempted. Sixty-six small insert clones spanning local misassembled regions were identified for transposon mutagenesis experiments, and consensus sequences were generated from 3,983 attempted reads. Twenty-three fosmids spanning gross misassembled or large-gap regions were sequenced with 18,201 combined attempted reads. These sequences were used as backbones in the main genome assembly to resolve misassembled regions. Paired-end sequences and fingerprint data from fosmid clones were used to validate the finished genome assembly at a 1-kbp-resolution scale.
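The reported coverage can be related to read count and genome size through the usual relation coverage = reads × mean read length / genome size. The sketch below inverts that relation to estimate the mean usable read length implied by the reported figures. The function name is illustrative, the genome size is the sum of the two finished replicons, and treating every attempted read as contributing equally is a simplifying assumption that the text does not state.

```python
def implied_mean_read_length(n_reads: int, coverage: float, genome_bp: int) -> float:
    """Mean read length (bp) implied by coverage = n_reads * read_len / genome_bp."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    return coverage * genome_bp / n_reads


# Finished replicon sizes reported for M. concilii GP6
CHROMOSOME_BP = 3_008_626
PLASMID_BP = 18_019
GENOME_BP = CHROMOSOME_BP + PLASMID_BP

# Assumption: all 52,189 attempted shotgun reads count toward the 11.3x figure
read_len = implied_mean_read_length(52_189, 11.3, GENOME_BP)
print(f"Implied mean read length: {read_len:.0f} bp")
```

An estimate of this kind is only a rough plausibility check, since failed reads and quality trimming reduce the number of bases that actually contribute to coverage.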
The finished M. concilii GP6 genome is composed of two replicons, a 3,008,626-bp circular chromosome and an 18,019-bp plasmid. The plasmid-to-chromosome molar ratio is >200:1, suggesting that this plasmid is present at a high copy number. The G+C content of the plasmid is 43.19%, compared to 51.03% G+C in the circular chromosome. The DNA sequence was submitted to the JCVI Annotation Service (http://www.jcvi.org/cms/research/projects/annotation-service/), which utilizes Glimmer, Blast-Extend-Repraze (BER) searches, HMM searches, TMHMM searches, SignalP predictions, and automatic annotations from AutoAnnotate. Manatee, downloaded from SourceForge (manatee.sourceforge.net), was used to manually review the output.
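G+C content figures such as those quoted for the chromosome and plasmid are the percentage of G and C bases in a replicon. A minimal sketch of that calculation follows; the function name and the choice to ignore ambiguous bases (such as N) are assumptions made for illustration, not a description of the annotation pipeline.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Toy sequence for illustration only; it is not taken from the genome
print(round(gc_percent("ATGCGCGTTA"), 2))
```

Applied to each replicon of a downloaded sequence record, the same function would give one value per replicon for comparison with the reported percentages.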
The M. concilii genome is 61% larger than the 1,861,571-bp M. thermophila genome, with 71% more protein coding sequences (2,906 versus 1,696 open reading frames [ORFs]); however, the coding fractions are similar (84.7% for M. concilii and 82% for M. thermophila). Alignment of these genome sequences reveals poor conservation of gene order, consistent with the substantial genetic divergence reflected in 16S rRNA sequence comparisons (5). Protein content comparison suggests that nearly 50% of the predicted proteins encoded in the M. concilii genome are not shared with M. thermophila, indicating that these genomes have evolved by gene acquisition via lateral gene transfer or gene duplication in M. concilii, gene loss in M. thermophila, or perhaps a combination of these. The addition of Methanosaeta to the methanoarchaeal genome compilation offers an unprecedented opportunity to gain insight into these difficult-to-culture microbes and to apply comparative genomic approaches to questions about their nature and biological impact.
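The relative ORF count quoted above is a simple percentage difference. As a small worked check, the sketch below recomputes it from the two reported ORF totals; the helper name is illustrative.

```python
def percent_more(a: float, b: float) -> float:
    """How much larger a is than b, expressed as a percentage of b."""
    if b == 0:
        raise ValueError("b must be nonzero")
    return 100.0 * (a - b) / b


# ORF counts reported in the text
orfs_concilii = 2_906
orfs_thermophila = 1_696

print(f"{percent_more(orfs_concilii, orfs_thermophila):.0f}% more ORFs")
```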
Nucleotide sequence accession numbers.
The complete genome sequence of M. concilii GP6 was deposited in GenBank under accession numbers CP002565 (chromosome) and CP002566 (plasmid).