A complete genome of a new Methanococcus maripaludis strain (X1) was reconstructed from metagenomic data generated from a subsurface thermophilic saline oil reservoir. M. maripaludis is a hydrogenotrophic methanogen widely found in mesophilic anaerobic saline surface environments (1) and has recently drawn the attention of petroleum microbiologists because of its ability to corrode petroleum reservoir surface infrastructure (3). Despite the abundance of M. maripaludis X1 genomic DNA in our metagenomic data pool, we could not detect M. maripaludis X1 16S rRNA genes in laboratory cultures incubated under reservoir conditions after the addition of a variety of nutrients. This is consistent with the recent finding that environmental organisms that are abundant and cosmopolitan are less likely to grow rapidly under rich nutrient conditions (5). To our knowledge, the M. maripaludis X1 genome is the first genome from a noncultured microorganism reconstructed directly from de novo sequencing of a metagenomic data pool.
Formation water was collected from an offshore oil field (approximately 800-m depth, 50°C, with ~2.3% salinity) near Malaysia that had not been subjected to water flooding or other treatments. Total DNA was extracted from 6 liters of fluid using the WaterMaster DNA purification kit (Epicentre, Madison, WI), and sequencing was performed using a combination of GS-FLX and Illumina (Ramaciotti Centre, Sydney, Australia) sequencing technologies.
The GS-FLX reads were initially assembled using Newbler (454 Life Sciences), and then both the GS-FLX and the paired-end Illumina data were assembled using Velvet (6) to produce 7,719 contigs from the complete set of metagenomic reads. Alignment of all contigs to the M. maripaludis S2 reference genome produced a subset of 42 contigs that covered most of the reference. Five of these contigs were >100 kbp in length, and 11 were >50 kbp, with the longest being 153,187 bp. These contigs (with an average 6-fold coverage from GS-FLX reads and 101-fold coverage from Illumina reads) formed the skeleton (total of 1,611,868 bp) used to guide the assembly of the complete M. maripaludis X1 genome. The lengths of the contigs indicated that each was derived from a single strain, as the presence of DNA from multiple strains would have caused Velvet to break them into small pieces at regions of significant difference. In fact, point differences observed at the ends of some contigs, possibly due to slight variations in the genome of the X1 organism or to sequencing errors, caused Velvet to create multiple contigs. The gaps between contigs and the ambiguous patches within contigs (i.e., NNNNNNNNNN fragments; about 7.7% of the total genome) were filled by manually aligning individual reads to the contig ends using in-house programs. The complete genome was annotated using the NCBI Prokaryotic Genomes Automatic Annotation Pipeline. It contained 1,746,697 bp with 32% G+C content, 1,892 predicted genes, and 2 rRNA operons.
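The figures quoted above can be cross-checked with a few lines of Python. The sketch below is illustrative only and does not reproduce the in-house programs, which are not described here. The function names (unresolved_fraction, ambiguous_runs) are hypothetical. The first function assumes that the portion of the genome left to fill can be approximated as the finished genome length minus the skeleton length; that difference comes to roughly 7.7%, which agrees with the value given in the text, although the text does not state how that value was derived. The second function shows one simple way to locate runs of N in a contig sequence.

```python
import re

# Figures quoted in the text above.
SKELETON_BP = 1_611_868  # total length of the 42 skeleton contigs
GENOME_BP = 1_746_697    # length of the finished X1 genome


def unresolved_fraction(skeleton_bp: int, genome_bp: int) -> float:
    """Fraction of the finished genome not accounted for by the skeleton.

    Assumption: the unresolved portion is approximated as
    (genome length - skeleton length) / genome length.
    """
    if genome_bp <= 0 or not 0 <= skeleton_bp <= genome_bp:
        raise ValueError("expected 0 <= skeleton_bp <= genome_bp and genome_bp > 0")
    return (genome_bp - skeleton_bp) / genome_bp


def ambiguous_runs(sequence: str, min_len: int = 1) -> list[tuple[int, int]]:
    """Return 0-based, half-open (start, end) spans of runs of N in a sequence.

    Only runs of at least `min_len` consecutive N (or n) characters are reported.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    pattern = re.compile(r"[Nn]{%d,}" % min_len)
    return [match.span() for match in pattern.finditer(sequence)]


if __name__ == "__main__":
    print(f"Unresolved fraction: {unresolved_fraction(SKELETON_BP, GENOME_BP):.2%}")
    # Toy sequence (not real data): one run of ten N's and one isolated N.
    toy_contig = "ACGT" + "N" * 10 + "ACGTNA"
    print(ambiguous_runs(toy_contig, min_len=10))
```

In practice, the spans returned by a function such as ambiguous_runs would mark the positions where individual reads need to be aligned to resolve the sequence; how that alignment was carried out is outside the scope of this sketch.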
Nucleotide sequence accession number.
The complete genome sequence of M. maripaludis X1 was deposited in GenBank under accession no. CP002913.