Thermococcus litoralis strain NS-C was isolated from a shallow submarine hot spring at Lucrino Beach near Naples, Italy (1), and successfully grown in culture (14). Since then, T. litoralis has been the focus of studies on biocatalysis (10), archaeal metabolism (2), DNA replication (4), and protein splicing (15).
T. litoralis was grown at the New England BioLabs fermentation facility as described previously (1). Genomic DNA was purified, and DNA libraries were constructed for paired-end Illumina sequencing-by-synthesis at Cofactor Genomics, Inc. (St. Louis, MO). Sequencing reads were assembled using the Cofactor Genomics short oligonucleotide analysis package (SOAP)-based de novo assembly pipeline and annotated using the NCBI Prokaryotic Genomes Automatic Annotation Pipeline (PGAAP).
The T. litoralis chromosome was assembled into 77 contigs with a total assembled genome length of 2,309,438 bp, similar to other sequenced Thermococcus genomes. The chromosome contains 2,724 predicted open reading frames, 47 tRNA genes, one 16S rRNA gene, one 23S rRNA gene, and two 5S rRNA genes. The T. litoralis genome G+C content is 43%, which is similar to the 38% calculated from the melting point (Tm) determination reported by Belkin and Jannasch in 1985 (1).
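As an illustration of how summary figures such as contig count, total assembled length, and G+C content can be derived from a set of contigs, the following is a minimal sketch and not the pipeline used for this assembly. The function names (`read_fasta`, `assembly_summary`) and the toy sequences are hypothetical, and the sketch assumes G+C content is computed over unambiguous A/C/G/T bases only.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def assembly_summary(handle):
    """Return contig count, total length, and G+C percentage.

    Assumption: ambiguous bases (e.g., N) count toward length but are
    excluded from the G+C denominator.
    """
    n_contigs = total_len = gc = acgt = 0
    for _, seq in read_fasta(handle):
        n_contigs += 1
        total_len += len(seq)
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return {"contigs": n_contigs, "length_bp": total_len, "gc_percent": gc_percent}


# Toy input for illustration only; these are not T. litoralis sequences.
toy = StringIO(">contig1\nATGCGC\nNNAT\n>contig2\nGGCCAATT\n")
summary = assembly_summary(toy)
print(summary)
```

To run the same summary on a real multi-FASTA file of contigs, the `StringIO` object can be swapped for an open file handle.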
The majority of T. litoralis replication proteins had the highest similarity to homologs from Thermococcus sibiricus MM 739, although the DNA polymerase D large subunit was most similar to its homolog in Pyrococcus yayanosii CH1, DNA polymerase B (commercialized as Vent DNA polymerase) to its homolog in Thermococcus species strain AM4, and replication factor C (RFC) small subunit to its homolog in Methanocaldococcus jannaschii. A lesion bypass DNA polymerase gene similar to dpo4 or dinB was not found.
T. litoralis played a pivotal role in our understanding of inteins. Inteins are mobile genetic elements that can be transferred horizontally: the homing endonuclease domain present in many inteins cleaves the intein insertion site in empty host protein genes (the host proteins are termed exteins), after which the intein is copied into the cleaved site by a simple double-strand break repair mechanism (16). This insertion is apparently neutral, because the intein protein can splice itself out of the precursor protein to yield a fully functional mature extein protein. The inteins found in T. litoralis DNA polymerase B (Vent DNA polymerase) were the first found to have Ser or Thr as nucleophiles instead of Cys at both splice junctions (15), and the Tli pol-2 intein was the first found to have different N- and C-terminal nucleophiles (Ser and Thr, respectively). This allowed positioning of the splice site prior to the intein N-terminal Ser or Cys nucleophile, which was later confirmed by N-terminal sequencing of excised inteins. Thirteen inteins are present in the T. litoralis genome, distributed among 8 different genes. These inteins all contain dodecapeptide (DOD) family homing endonuclease domains except for the Tli MCM-1 intein. All are standard, class 1 inteins except for the Tli KlbA intein, which splices using the class 2 mechanism (18). The Tli MCM-2 intein insertion site (VLVLADMGIA/CIDEIDKMSD, CDC21-e) has not been previously noted in InBase, the on-line intein database (16).
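To make the relationship between precursor, intein, and mature extein concrete, the following minimal sketch treats splicing as a string operation. It assumes that the slash in the reported insertion site separates the extein residues immediately upstream of the intein from those immediately downstream. The function name `splice_out` and the toy precursor, including its placeholder intein segment, are hypothetical and are not taken from the T. litoralis MCM sequence.

```python
N_FLANK = "VLVLADMGIA"  # extein residues before the slash in the reported site
C_FLANK = "CIDEIDKMSD"  # extein residues after the slash in the reported site


def splice_out(precursor, n_flank=N_FLANK, c_flank=C_FLANK):
    """Return (intein, mature_extein) for a precursor protein sequence.

    The intein is taken to be everything between the end of n_flank and
    the start of the next c_flank. Returns None if either flank is absent.
    An empty intein string means the site is unoccupied.
    """
    n_pos = precursor.find(n_flank)
    if n_pos == -1:
        return None
    intein_start = n_pos + len(n_flank)
    c_pos = precursor.find(c_flank, intein_start)
    if c_pos == -1:
        return None
    intein = precursor[intein_start:c_pos]
    # Joining the two extein segments models the ligated, mature host protein.
    mature = precursor[:intein_start] + precursor[c_pos:]
    return intein, mature


# Toy precursor: made-up leading/trailing residues and a made-up intein segment.
toy_intein = "SILPEEWHN"
toy_precursor = "MK" + N_FLANK + toy_intein + C_FLANK + "LE"
intein, mature = splice_out(toy_precursor)
print(intein)
print(mature)
```

Real intein identification relies on conserved splicing motifs and alignment to intein-free homologs; exact flank matching is used here only to keep the example short.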
Nucleotide sequence accession numbers.
This whole-genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number AHVB00000000. The version described in this paper is the first version, AHVB01000000.