Thermococcus sp. strain 4557 (=CGMCC 1.5172) was isolated from the Guaymas Basin deep-sea hydrothermal vent site in the Gulf of California at a depth of 2,000 m. It is a sulfur-reducing, strictly anaerobic, hyperthermophilic archaeon. Phylogenetic analysis based on 16S rRNA gene sequences of validly published species showed that strain 4557 belongs to the order Thermococcales and is most closely related to the fast-growing, cell-fusing archaeon Thermococcus celericrescens TS2T (3), with a sequence similarity of 99.7%. The order Thermococcales encompasses three genera: Thermococcus, Pyrococcus, and Palaeococcus (1). Members of the genus Thermococcus, which are sulfur reducing and ubiquitous in deep-sea hydrothermal vent systems, are considered to play a significant role in the microbial consortia of these systems. To date, complete genome sequences of five Thermococcus species, namely, Thermococcus barophilus, Thermococcus gammatolerans, Thermococcus kodakarensis, Thermococcus onnurineus, and Thermococcus sibiricus, have been determined, and they show variability among strains in genomic content and organization (2, 4–7).
The genome sequence of Thermococcus sp. 4557 was determined at the Shenzhen Huada Genomics Institute (BGI; Shenzhen, China) using Solexa paired-end sequencing technology. Sequencing yielded a total of 215.64 Mb of high-quality reads, providing approximately 100-fold coverage of the genome. All reads were assembled with the SOAPdenovo assembler (http://soap.genomics.org.cn/index.html#intro2), generating 59 contigs. Gaps between contigs were closed by PCR and direct sequencing with primers designed to anneal to the ends of neighboring contigs. Protein-coding sequences were predicted and annotated by the NCBI Prokaryotic Genomes Automatic Annotation Pipeline (PGAAP) (http://www.ncbi.nlm.nih.gov/genomes/static/Pipeline.html).
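The reported coverage can be checked with simple arithmetic: total read bases divided by genome length. The sketch below is illustrative only and is not part of the original analysis. It assumes that "Mb" means 10^6 bases and uses the chromosome size of 2,011,320 bp reported in this announcement; the function name `fold_coverage` is hypothetical.

```python
def fold_coverage(total_read_bases: float, genome_size_bp: int) -> float:
    """Return the average sequencing depth (fold coverage)."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return total_read_bases / genome_size_bp


# Values taken from the announcement; 1 Mb = 1e6 bases is an assumption.
TOTAL_READ_BASES = 215.64e6
GENOME_SIZE_BP = 2_011_320

coverage = fold_coverage(TOTAL_READ_BASES, GENOME_SIZE_BP)
print(f"Estimated coverage: {coverage:.1f}-fold")
```

Under these assumptions the estimate comes out at roughly 107-fold, consistent with the "approximately 100-fold" figure stated in the text.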
The complete genome of Thermococcus sp. 4557 consists of a single circular chromosome of 2,011,320 bp with a G+C content of 56.08%. The chromosome is predicted to contain 2,144 protein-coding sequences, which cover 91.68% of the genome. In addition, one copy of the 16S-23S rRNA genes, two copies of the 5S rRNA gene, a 7S RNA gene for the signal recognition particle, and 45 tRNA genes were detected. The genome also contains five putative transposons and 22 tandem repeat sequences.
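The summary figures above imply a few derived quantities, such as the mean length of a protein-coding sequence and the gene density. The following sketch derives them as a rough illustration. It assumes that the 91.68% figure is the fraction of chromosome bases falling within protein-coding sequences and that overlaps between coding sequences are negligible; the variable names are hypothetical.

```python
# Figures reported in the announcement.
GENOME_SIZE_BP = 2_011_320
NUM_CDS = 2_144
CODING_FRACTION = 0.9168  # assumed: fraction of bases within CDSs

# Total bases covered by protein-coding sequences.
coding_bp = GENOME_SIZE_BP * CODING_FRACTION

# Mean CDS length, ignoring any overlap between CDSs (assumption).
mean_cds_length_bp = coding_bp / NUM_CDS

# Gene density expressed as CDSs per kilobase of chromosome.
cds_per_kb = NUM_CDS / (GENOME_SIZE_BP / 1_000)

print(f"Coding bases: {coding_bp:,.0f} bp")
print(f"Mean CDS length: {mean_cds_length_bp:.0f} bp")
print(f"Gene density: {cds_per_kb:.2f} CDS per kb")
```

Under these assumptions the mean coding sequence is roughly 860 bp long, with about one coding sequence per kilobase.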
Like the five previously sequenced Thermococcus species listed above, strain 4557 carries numerous genes for protein and starch metabolism, elemental sulfur reduction, metal ion transport, and a stress chaperone system. In addition, strain 4557 carries genes for the degradation of aromatic compounds, which may favor its survival and growth at the Guaymas Basin deep-sea hydrothermal vent site, where there is a continuing oil seep. A detailed analysis of the genome of Thermococcus sp. 4557 and comparative genome analysis with other Thermococcus members will improve our understanding of its adaptation to the deep-sea hydrothermal environment.
Nucleotide sequence accession number. The complete genome sequence of strain 4557 is available in GenBank under accession number CP002920.