Thermococcus species are widely distributed in terrestrial and marine hydrothermal areas, as well as in deep subsurface oil reservoirs. Thermococcus sibiricus is a hyperthermophilic anaerobic archaeon isolated from a well of the never flooded oil-bearing Jurassic horizon of a high-temperature oil reservoir. To obtain insight into the genome of an archaeon inhabiting the oil reservoir, we have determined and annotated the complete 1,845,800-base genome of T. sibiricus. A total of 2,061 protein-coding genes have been identified, 387 of which are absent in other members of the order Thermococcales. Physiological features and genomic data reveal numerous hydrolytic enzymes (e.g., cellulolytic enzymes, agarase, laminarinase, and lipases) and metabolic pathways, support the proposal of the indigenous origin of T. sibiricus in the oil reservoir, and explain its survival over geologic time and its proliferation in this habitat. Indeed, in addition to proteinaceous compounds known previously to be present in oil reservoirs at limiting concentrations, its growth was stimulated by cellulose, agarose, and triacylglycerides, as well as by alkanes. Two polysaccharide degradation loci were probably acquired by T. sibiricus from thermophilic bacteria following lateral gene transfer events. The first, a “saccharolytic gene island” absent in the genomes of other members of the order Thermococcales, contains the complete set of genes responsible for the hydrolysis of cellulose and β-linked polysaccharides. The second harbors genes for maltose and trehalose degradation. Considering that agarose and laminarin are components of algae, the encoded enzymes and the substrate spectrum of T. sibiricus indicate the ability to metabolize the buried organic matter from the original oceanic sediment.
Thermococcus sibiricus is a hyperthermophilic anaerobic archaeon isolated from a well of the never flooded oil-bearing Jurassic horizon of the high-temperature Samotlor oil reservoir (Western Siberia) (32). The sampling site had a temperature of 84°C and was located at a depth of 2,350 m. Thermococcus species are widely distributed in terrestrial and marine hydrothermal areas (4), as well as in deep subsurface oil reservoirs (38, 53). Close relatives of T. sibiricus with a 16S rRNA gene sequence similarity of >99% were identified in high-temperature oil wells of Japan (54) and China (36). Together with the genera Pyrococcus and Palaeococcus, Thermococcus spp. form the euryarchaeal order Thermococcales (4). Most of these hyperthermophilic archaea are organoheterotrophs that utilize proteins, starch, and maltose with elemental sulfur (S°) or protons as electron acceptors (4, 47). Two exceptions are Thermococcus strain AM4 (51) and Thermococcus onnurineus (23), which are also capable of lithotrophic CO-dependent hydrogenogenic growth (52).
Genomic sequences of T. onnurineus (23), Thermococcus kodakaraensis (11), Pyrococcus horikoshii (18), Pyrococcus furiosus (44), and Pyrococcus abyssi (7) provided information on the genetic and metabolic machinery of these closely related organisms. However, none of them originated from deep subsurface oil reservoirs. These habitats contain low levels of dissolved organic carbon and trace amounts of free amino acids (53) but harbor high numbers of anaerobic organisms, reaching 1.4 × 10^6 cells ml−1 (38). T. sibiricus was originally reported to grow exclusively on peptides (32). The organism was obtained from a sample of an oil-bearing Jurassic horizon that had never been flooded. The temperature, pH, and salinity characteristics for growth correlated with the natural conditions at the sampling site. Therefore, an indigenous origin was suggested for T. sibiricus, which might have survived over geologic time by metabolizing buried organic matter from the original oceanic sediment (32).
Here we present the genome of T. sibiricus and show that it encodes numerous hydrolytic enzymes and metabolic pathways which may allow the utilization of diverse organic polymers from the original oceanic sediment. Our experiments provide evidence for these new physiological features and support the suggestion of the indigenous origin of T. sibiricus.
T. sibiricus MM 739 (DSM 12597) (32) was obtained from the culture collection of the Laboratory of Hyperthermophilic Microbial Communities, Winogradsky Institute of Microbiology of the Russian Academy of Sciences. For the isolation of DNA, T. sibiricus was grown anaerobically at 78°C in an atmosphere of 80% N2 and 20% CO2 in basal mineral medium supplemented with peptone (5 g liter−1), yeast extract (0.1 g liter−1), and S° (10 g liter−1) as described previously (32). Cells were harvested in the early exponential growth phase, and genomic DNA was isolated according to the Marmur procedure (30).
Potential growth-supporting substrates were identified by the cultivation of T. sibiricus under strictly anaerobic conditions at 78°C in basal medium (32) supplemented with yeast extract (0.06 g liter−1) and S° (10 g liter−1). The following substrates were tested: peptone, starch, maltose, dextran, amorphous cellulose, carboxymethyl cellulose, microcrystalline cellulose, agarose, chitin, xylan, pectin (all at 2 g liter−1), maltose (10 mM), cellobiose (10 mM), phenol (0.5 mM), toluene (0.5 mM), hexadecane, acetone, sodium linoleate, sodium palmitate (all at 1 mM), olive oil (1 ml liter−1), glycerol (1 ml liter−1), and CO (100% in the gas phase). The ability to grow on peptone, maltose, and CO was analyzed in the absence of S° as well. Growth was determined by direct cell counting with a phase-contrast microscope.
The T. sibiricus genome was sequenced on a Roche GS FLX genome sequencer by the standard protocol for a shotgun genome library. The GS FLX run resulted in the generation of about 70 Mb of sequences with an average read length of 230 bp. The GS FLX reads were assembled into four large contigs by GS De Novo Assembler. The 454 contigs were oriented into scaffolds, and the complete genome sequence was obtained upon the generation and sequencing of appropriate PCR fragments. The assembly of the genome at sites with IS elements was verified by PCR amplification and sequencing of these regions.
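The sequencing yield quoted above can be related to the genome size with a short calculation. The sketch below is only a rough estimate: it uses the approximate figures from the text (about 70 Mb of sequence, average read length of 230 bp) together with the final genome length of 1,845,800 bp, and the variable names are illustrative.

```python
# Approximate figures quoted in the text
TOTAL_BASES = 70_000_000   # "about 70 Mb" of GS FLX sequence
MEAN_READ_BP = 230         # average read length
GENOME_BP = 1_845_800      # length of the finished chromosome

# Rough number of reads and average depth of coverage
approx_reads = TOTAL_BASES / MEAN_READ_BP
coverage = TOTAL_BASES / GENOME_BP

print(f"approx. reads: {approx_reads:,.0f}")
print(f"approx. coverage: {coverage:.1f}x")
```

Because the input values are rounded, the result should be read as an order-of-magnitude check (roughly 38-fold coverage), not as a reported statistic.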
The rRNA genes were identified by searching against the Rfam database (12). tRNA genes were located with tRNAscan-SE (27). Protein-coding genes were identified with the GLIMMER gene finder (8). Whole-genome annotation and analysis were performed with the AutoFACT annotation tool (21). Clusters of regularly interspaced repeats (CRISPR) were identified with CRISPR Finder (13); putative transposon-related proteins were found by searching against the IS database (http://www-is.biotoul.fr/is.html). Signal peptides were predicted with SignalP v. 3.0 (http://www.cbs.dtu.dk/services/SignalP/) by using the HMM algorithm.
The annotated genome sequence has been deposited in the GenBank database under accession no. CP001463.
T. sibiricus has a single circular chromosome of 1,845,800 bp with no extrachromosomal elements (see Fig. S1 in the supplemental material). The genome contains a single copy of a 16S-23S rRNA operon and two distantly located 5S rRNA genes. A total of 46 tRNA genes carrying 44 different anticodons coding for all 20 amino acids are scattered over the genome.
By a combination of coding potential prediction and similarity search, 2,061 potential protein-coding genes were identified with an average length of 815 bp, covering 91% of the genome. These values are in good accordance with the general correlation between the microbial genome size and the predicted gene numbers. Through similarity and domain searches of the predicted protein products, the functions of 69% of the genes (1,413 genes) may be predicted with different degrees of confidence and generalization. The functions of the remaining 648 genes (31%) cannot be predicted from the deduced amino acid sequences; 181 of them are unique to T. sibiricus, with no significant similarity to any known sequences.
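The gene statistics given above are mutually consistent, as the following short check illustrates. It assumes, as a simplification, that genes do not overlap, so that the coding fraction is simply the gene count times the mean gene length divided by the genome length; the variable names are illustrative.

```python
# Figures quoted in the text
GENOME_BP = 1_845_800
N_GENES = 2_061
MEAN_GENE_BP = 815
N_ANNOTATED = 1_413   # genes with a predicted function
N_UNKNOWN = 648       # genes without a predicted function

# Coding fraction under the simplifying assumption of non-overlapping genes
coding_fraction = N_GENES * MEAN_GENE_BP / GENOME_BP

# Shares of genes with and without a functional prediction
annotated_share = N_ANNOTATED / N_GENES
unknown_share = N_UNKNOWN / N_GENES

print(f"coding fraction: {coding_fraction:.1%}")
print(f"annotated: {annotated_share:.1%}, unknown: {unknown_share:.1%}")
```

The computed coding fraction rounds to 91%, and the two gene categories sum to the full set of 2,061 genes.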
About 70% of the T. sibiricus proteins showed similarity to those of T. kodakaraensis and T. onnurineus (see Fig. S2 in the supplemental material), and 1,024 of them are also present in all three Pyrococcus species. This shared set of genes might belong to the common ancestor of members of the order Thermococcales. About 387 of the T. sibiricus proteins are not present in other members of the order Thermococcales, and although many of them encode proteins with unknown function, there are several intriguing genes that might be responsible for the specific traits of T. sibiricus, as discussed below.
Pairwise comparison by all-vs-all BLASTP of the genomes of T. sibiricus and other Thermococcus spp. confirmed the high frequency of shuffling-driven genome rearrangements (see Fig. S2 in the supplemental material). In contrast to the well-conserved gene context among Pyrococcus genomes, the genomes of thermococci are more rearranged and the synteny is conserved only for short chromosomal segments.
Analysis of the repeated sequences and a search against the IS database revealed seven copies of putative transposons of three types. We found two almost identical complete copies of an IS605 family transposon, which was named ISTsi1. This 1,856-bp-long IS element contains two genes, orfA and orfB, that encode homologs of the resolvase (COG2452) and transposase (COG0675) of IS605. The almost identical sequences of the two copies of ISTsi1 suggest that this IS element invaded the T. sibiricus genome recently on the evolutionary time scale and may remain functional. ISTsi1-like sequences are also present in the genomes of other Thermococcus species.
Another IS element of the IS605 family, ISTsi2, is present in two almost identical copies. This 1,398-bp-long IS element contains a pseudogene of an IS605 family transposase (orfB, COG0675) carrying three frameshift mutations. In addition, a shortened copy of this IS element (1,359 bp), also carrying a transposase pseudogene, was found. Perhaps the propagation of this already nonfunctional IS element depends on another active transposon. The only archaeal homologue of the “restored” transposase of ISTsi2 is the PF1609 protein of P. furiosus; many others are encoded in the genomes of thermophilic bacteria (Caldicellulosiruptor, Anaerocellum, Geobacillus, etc.), suggesting ancient invasion of the genome of the common ancestor of members of the order Thermococcales by this IS element.
The third IS element, ISTsi3, belongs to the IS200/IS600 family and is present in one full-size copy (1,884 bp) and two shortened copies (1,630 bp and 1,626 bp). The full-size copy contains two genes (COG1943 and COG0675), while the shorter copies harbor only the longer gene, orfB (COG0675). Note that proteins similar to OrfB are encoded in the genomes of other Thermococcus species but OrfA-like proteins are present only in crenarchaeal and bacterial genomes.
In the crenarchaeal genus Sulfolobus, site-specific integration of viruses occurs at tRNA genes by the integrase-mediated mechanism (39). Such regions bordered by a partitioned integrase gene have been found in other members of the order Thermococcales, namely, P. horikoshii and T. kodakaraensis (11). In particular, the T. kodakaraensis genome contains four such regions comprising 18 to 28 kb each and encoding several genes, including genes related to DNA replication (11). The search for similar integrated viruses in the T. sibiricus genome revealed only one such region, bordered by a partitioned integrase gene homologous to the integrase gene of the SSV1 virus of Sulfolobus shibatae. This region is much shorter (2.7 kb) than the virus-related regions present in the T. kodakaraensis genome and contains only a single gene (Tsib_0741), thus probably representing the late stage of elimination of the integrated virus-related element from the genome.
The T. sibiricus genome contains a single CRISPR containing 24 repeat spacer units. The spacer regions are supposed to be derived from extrachromosomal elements like viruses (33), and the spacer transcripts may inactivate mobile-element propagation by a mechanism somewhat similar to eukaryotic RNA interference (2, 26). We found no matches between the spacer sequences and any known archaeal extrachromosomal genetic elements, and no CRISPR spacer matched the sequences of the IS elements present in the T. sibiricus genome. Other representatives of the order Thermococcales generally harbor multiple CRISPR loci (11) carrying more repeat spacer units. This difference may reflect the constant and relatively noncompetitive natural environment of T. sibiricus, which is rarely invaded by viruses and other mobile elements.
The archaeal chromosomal replication initiation site (oriC) was first identified in P. abyssi within the noncoding region located upstream of a gene that encodes a homolog of the eukaryotic Orc1/Cdc6 cell division control protein (35). Members of the order Thermococcales contain a single replication initiation site, while some other archaea, for example, Sulfolobus species, carry multiple chromosomal replication origins, and multiple cdc6 genes were found to be located close to them (31, 45). In three Pyrococcus species, ori regions contain a pair of conserved origin recognition box (ORB) sequences, the Orc1/Cdc6 protein binding sites, separated by an ~250-bp spacer region (45). Likewise, the ori site of T. kodakaraensis is located in an AT-rich region located upstream of a gene for the Orc1/Cdc6 homolog (11). Usually, the ori sites are also linked to a “replication island,” a variety of replication- and recombination-associated genes (e.g., polymerase, rfc, radA, helicase, etc.).
Nucleotide composition disparity analysis (Z-curve analysis) (57) of the T. sibiricus genome showed one major peak at around 1.78 Mb where nucleotide compositional deviation change occurred (Fig. 1). This peak was found for the MK disparity component of the Z curve, as observed for the Sulfolobus acidocaldarius (5) and Pyrobaculum aerophilum (57) chromosomal ori sites. The search for ORB sequences (45, 46) revealed that the noncoding region located between the Tsib_1993 and Tsib_1994 genes (functions unknown) contains two ORB sequences, as well as an additional partial copy of this motif (Fig. 1). As in pyrococci, two full copies of the ORB sequence are oriented in opposite directions and separated by a 250-bp spacer. The position of this region coincides with the location of the MK disparity peak (Fig. 1), further supporting the hypothesis that the ori site is located at this point.
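The MK disparity component mentioned above can be illustrated with a minimal sketch of the standard Z-curve transformation, in which each base moves three cumulative counters: purine versus pyrimidine (RY), amino versus keto (MK), and weak versus strong hydrogen bonding (WS). This is not the software used for the analysis reported here; the function names `z_curve` and `peak_position` and the toy sequence are hypothetical, and a real analysis would typically examine both maxima and minima of a smoothed curve along the whole chromosome.

```python
def z_curve(seq):
    """Return the cumulative Z-curve components (x, y, z) of a DNA string.

    x: purines (A, G) minus pyrimidines (C, T)  -> RY disparity
    y: amino bases (A, C) minus keto bases (G, T) -> MK disparity
    z: weak (A, T) minus strong (G, C)           -> WS disparity
    """
    steps = {
        "A": (1, 1, 1),
        "C": (-1, 1, -1),
        "G": (1, -1, -1),
        "T": (-1, -1, 1),
    }
    x = y = z = 0
    xs, ys, zs = [], [], []
    for base in seq.upper():
        # Ambiguous bases (e.g., N) contribute nothing to any component
        dx, dy, dz = steps.get(base, (0, 0, 0))
        x, y, z = x + dx, y + dy, z + dz
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs


def peak_position(values):
    """Return the 1-based position of the first maximum of a curve."""
    return max(range(len(values)), key=values.__getitem__) + 1


# Toy sequence (hypothetical): an amino-rich half followed by a keto-rich half
toy = "AC" * 10 + "GT" * 10
xs, ys, zs = z_curve(toy)
mk_peak = peak_position(ys)
print(mk_peak, ys[mk_peak - 1])
```

In the toy sequence the MK component rises while amino bases dominate and falls once keto bases dominate, so its extremum marks the point where the compositional bias switches, which is the kind of signal used to locate a candidate replication origin.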
The orc1/cdc6 homologue is present in the T. sibiricus genome (Tsib_1590), but unlike in other members of the order Thermococcales, it is located distantly from the potential oriC site and no ORB-like sequences were found around this gene. The replication island near oriC is most complete in P. abyssi and P. horikoshii, while in T. kodakaraensis it is separated into two regions, where genes for the Orc1/Cdc6 homolog and DNA polymerase are 290 kb away from the genes for the RadB and RF-C subunits (11). The genome of T. sibiricus is more rearranged; only the radA gene (Tsib_1992) remains to be associated with oriC. Two other regions are located distantly from this site—the cdc6 gene (Tsib_1590) linked to DNA polymerase II subunits (Tsib_1588 and Tsib_1589) and the third region comprising the rfc (Tsib_0101 and Tsib_0102) and radB (Tsib_0108) genes. It is possible that the more condensed replication island structure in Pyrococcus species reflects specific adaptation of their DNA replication machinery to extreme thermophily (98 to 103°C) relative to thermococci growing at considerably lower temperatures (80 to 90°C).
T. sibiricus was originally reported to grow on proteinaceous substrates, peptone, yeast extract, beef extract, and soya bean extract, but not on starch, pyruvate, glucose, acetate, methanol, ethanol, lactate, or H2/CO2 (32). S° as an electron acceptor was not obligately required but stimulated growth.
Analysis of the T. sibiricus genome revealed the presence of genes required for protein export systems, and the SignalP algorithm predicted a total of 281 proteins carrying N-terminal signal sequences. Most of them have been annotated as transporters, proteases, glycoside hydrolases, lipases/esterases, and hypothetical proteins. The encoded proteins suggested the ability of T. sibiricus to utilize additional substrates. Therefore, in this work we attempted to adapt the organism to growth on some substrates whose utilization could be expected from the genomic data (see Table S1 in the supplemental material). We performed cultivation experiments and, after sequential transfers, identified maltose, dextran, amorphous cellulose, carboxymethyl cellulose, cellobiose, agarose, hexadecane, acetone, olive oil, and glycerol as new substrates for T. sibiricus. Starch, microcrystalline cellulose, chitin, xylan, pectin, phenol, toluene, long-chain fatty acids, and CO did not support its growth (see Table S1 in the supplemental material).
Growth of T. sibiricus on proteins and peptides as the sole source of energy and carbon is enabled by the functioning of two signal peptide-containing extracellular proteases, a pyrolysin-like serine protease (Tsib_0267) and an archaeal subtilisin-like serine protease (Tsib_1234). The cell can import peptides via an ABC-type dipeptide/oligopeptide transport system, dipeptide ABC transporters, and the OPT family oligopeptide transporter. The imported peptides can be cleaved by more than 20 encoded peptidases, including aminopeptidases, carboxypeptidases, and dipeptidases. The amino acids can then be deaminated by a number of aminotransferases (see Table S2 in the supplemental material) in a glutamate dehydrogenase (Tsib_1110)-coupled manner, followed by oxidation to generate the corresponding coenzyme A (CoA) derivatives, as proposed for P. furiosus (1). The oxidation step involves at least four types of ferredoxin-dependent oxidoreductases with distinct substrate specificities (1): pyruvate:ferredoxin oxidoreductase, 2-ketoisovalerate:ferredoxin oxidoreductase, indolepyruvate:ferredoxin oxidoreductase, 2-ketoglutarate:ferredoxin oxidoreductase, and a second 2-ketoglutarate:ferredoxin oxidoreductase (see Table S2 in the supplemental material). An additional heterodimeric 2-oxoacid:ferredoxin oxidoreductase, the substrate specificity of which cannot be identified by sequence comparison, may be involved in the oxidation of the acyl-CoA derivatives.
The acyl-CoA derivatives in T. sibiricus are converted to the corresponding acids by the two acetyl-CoA synthetases (ACS I and II) and a succinyl-CoA synthetase (see Table S2 in the supplemental material) with concomitant substrate level phosphorylation to generate ATP, as shown in P. furiosus (29) and T. kodakaraensis (49). The pathway involving a two-step transformation of acetyl-CoA to acetate with phosphotransacetylase and acetate kinase, which is common in anaerobic bacteria, presumably does not operate in T. sibiricus because homologs of the currently known genes for phosphotransacetylase and acetate kinase are absent.
As an alternative assimilation pathway for amino acids, 2-keto acids derived from amino acids can be decarboxylated, depending on the redox balance of the cell, to corresponding aldehydes by ferredoxin-independent reactions of the ferredoxin-dependent oxidoreductases (28) and then oxidized to form carboxylic acids by the function of five aldehyde:ferredoxin oxidoreductases (see Table S2 in the supplemental material). Alcohol dehydrogenases (ADHs) might be responsible for the reduction of aldehydes to alcohols in the absence of sufficient amounts of the terminal electron acceptor, S°, in the reaction coupled to the oxidation of NADPH to NADP+, which would dispose of excess reductant (28). In the T. sibiricus genome, there are three genes for short-chain ADHs and one gene for Fe-dependent ADH (see Table S2 in the supplemental material).
The genome encodes numerous enzymes for the metabolism of polymeric carbohydrates and oligosaccharides, which correlates with the growth of T. sibiricus on maltose, dextran, amorphous cellulose, carboxymethyl cellulose, cellobiose, and agarose observed in this work. T. sibiricus cannot grow on chitin, xylan, or pectin, which is consistent with the absence of genes that encode chitinases, xylanases, and pectinases in its genome.
The inability of T. sibiricus to grow on starch correlated with the genomic data. In contrast to several Thermococcus and Pyrococcus species, no genes for extracellular amylolytic enzymes were found in the T. sibiricus genome. The single identified α-amylase (Tsib_1115) lacks a signal peptide and is therefore predicted to be intracellular. T. sibiricus was found to be able to grow on maltose. Extracellular maltose and maltooligosaccharides could be transported into the cell and then hydrolyzed to glucose. The production of glucose could be accomplished by several intracellular enzymes: α-amylase, glycogen-debranching enzyme (Tsib_0365), two α-glucosidases (Tsib_1113 and Tsib_0873), and a 4-α-glucanotransferase (Tsib_0455).
In contrast to starch, T. sibiricus grew quite well on dextran, which is an α-1,6-linked d-glucose polymer branched at α-1,4 linkages. However, no extracellular enzymes for the degradation of dextran could be identified in the genome.
The genome predicts the ability of T. sibiricus to metabolize cellulose, which is in agreement with its observed growth on amorphous cellulose. The T. sibiricus genome encodes a number of β-specific glycoside hydrolases, which form a complete system for cellulose degradation. The Tsib_0320-to-Tsib_0334 gene cluster includes three putative endo-β-1,4-glucanases (extracellular Tsib_0326 and Tsib_0328, as well as intracellular Tsib_0327). Both extracellular enzymes are homologous to endo-β-1,4-glucanases CelA (TM1524) and CelB (TM1525) from Thermotoga maritima, which were found to be active against soluble substrates (25), as well as to P. furiosus endo-β-1,4-glucanase EglA (PF0854), which shows maximum activity on C4-to-C6 cellobiose oligosaccharides (3). Like the T. maritima and P. furiosus enzymes, the two extracellular endo-β-1,4-glucanases from T. sibiricus do not contain a cellulose-binding domain (58), which is required to bind to and efficiently hydrolyze crystalline cellulose. This correlates with the inability of T. sibiricus to grow on crystalline cellulose and explains its growth on amorphous cellulose, carboxymethyl cellulose, and cellobiose. The imported cellooligomers are further degraded to smaller oligosaccharides by intracellular endo-β-1,4-glucanases, as suggested for T. maritima (6). At least three such enzymes are encoded by T. sibiricus (Tsib_0327, Tsib_0137, and Tsib_1215).
Like Thermotoga neapolitana and Thermotoga maritima, T. sibiricus and other archaea do not contain genes for cellobiohydrolase, a key enzyme in classical cellulolytic systems. However, despite the absence of this enzyme, the genome of T. sibiricus encodes an intracellular β-glucan glucohydrolase (Tsib_0580) that displays high homology with the functionally characterized intracellular β-glucan glucohydrolase GghA from Thermotoga neapolitana (56), which has been suggested as a substitute for cellobiohydrolase. The soluble products of the initial hydrolysis of cellulose by endo-β-1,4-glucanases in Thermotoga are then hydrolyzed by intracellular GghA, which preferentially attacks the longer cellooligomers cellotriose to cellohexaose, releasing glucose from the nonreducing end and producing cellobiose (56). Cellobiose can also be hydrolyzed by a β-glucosidase (Tsib_0334) most closely related to the functionally characterized P. furiosus β-glucosidase CelA (PF0073) (55). An additional gene involved in cellulose degradation is Tsib_0320, which encodes a putative intracellular cellobiose phosphorylase. Tsib_0320 is related to the functionally characterized intracellular cellobiose phosphorylases CbpA from Thermotoga neapolitana (56) and CepA from Thermotoga maritima. In Thermotoga sp., cellobiose phosphorylase attacks cellulose-derived cellobiose, producing the activated molecule glucose-1-phosphate and glucose.
The genome encodes an extracellular laminarinase (endo-β-1,3-glucanase, Tsib_0321) highly homologous to functionally characterized laminarinase LamB (PF0076) from P. furiosus and to many bacterial laminarinases. This enzyme acts best on the β-1,3-glucose polymer laminarin, hydrolyzing it into smaller laminari-oligomers; displays some hydrolytic activity with the β-1,3-1,4 glucose polymers lichenan and barley β-glucan; and is not active with the β-1,4 glucose polymer cellulose (15). The source of laminarin in nature is brown algae. Endo-β-1,4-glucanase (cellulase), laminarinase, and β-glucosidase, encoded by the gene cluster Tsib_0320 to Tsib_0334, could perform synergistic activity during the hydrolysis of complex carbohydrates, as demonstrated earlier for barley β-glucan and laminarin (10).
The genome predicts the ability of T. sibiricus to metabolize agarose, which is confirmed by the observed growth on this substrate. Utilization of agarose is apparently enabled due to the function of an extracellular β-agarase (Tsib_0325). Agar, a cell wall constituent of many red algae, exists in nature as a mixture of unsubstituted and substituted agarose polymers and can constitute up to 70% of the algal cell wall. Agarase hydrolyzes agar and agarose, generating oligosaccharides. β-Agarase Tsib_0325 displays high similarity to the putative β-agarases of Rhodopseudomonas palustris TIE-1 and many different bacteria, including Victivallis vadensis and Pseudomonas putida. No homologs of Tsib_0325 or agarases are present in the genomes of any other archaea. A gene attached to the Tsib_0325 gene encodes a β-galactosidase (Tsib_0324) which would detach galactose residues from the oligosaccharides produced by β-agarase.
The lower-molecular-weight oligosaccharides resulting from the hydrolysis of polymers can be imported into the cell via binding protein-dependent ABC-type transport systems. The gene cluster Tsib_0320 to Tsib_0334 includes genes that encode an ABC transport system (Tsib_0329 to Tsib_0333) for β-glucosides. This system is homologous to the cellobiose/β-glucoside ABC transport system of P. furiosus (20).
The above data show that one particular region of the T. sibiricus genome (bp 308742 to 328657, corresponding to genes Tsib_0320 to Tsib_0334) carries a set of genes responsible for the utilization of cellulose, laminarin, agar, and other β-linked polysaccharides. This region contains genes for the extracellular cleavage of these substrates, transport of oligosaccharides into cells, and their subsequent intracellular cleavage to monomers (see Table S2 in the supplemental material). The closest homologs of all enzymes (except β-agarase) were found in Thermotoga and Pyrococcus species; however, this region is absent in T. kodakaraensis and T. onnurineus. In T. sibiricus, this gene island is embedded in a cluster of ribosomal protein genes with a one-to-one correlation to highly conserved corresponding genes in T. kodakaraensis and T. onnurineus (see Fig. S3 in the supplemental material). All of the above observations suggest that the “saccharolytic gene island” was horizontally transferred to the T. sibiricus genome from unknown thermophilic bacteria probably related to Thermotoga sp. The possible archaeal origin of this gene island also cannot be excluded, since extensive lateral gene transfer from archaea was suggested for T. maritima, in whose genome about 24% of the genes were found to bear the most similarity to archaeal genes (37). Homologs of some of the genes in the T. sibiricus saccharolytic gene island are also present in P. furiosus, but they are distributed in the genome and not located in a particular gene island.
Another supposed case of lateral transfer of saccharolytic genes in members of the order Thermococcales is the 16-kb maltose and trehalose degradation locus found in the P. furiosus and Thermococcus litoralis genomes (9). A similar gene island is present in the T. sibiricus genome (bp 340628 to 356664, corresponding to genes Tsib_0356 to Tsib_0369) with a one-to-one correlation to P. furiosus genes PF1737 to PF1751 (see Fig. S3 in the supplemental material). This gene cluster, which encodes a trehalose synthase (Tsib_0361), a glycogen-debranching enzyme (Tsib_0365), and a glycosidase (Tsib_0366), also harbors genes of two ABC transport systems for α-glucosides. The first (Tsib_0358 to Tsib_0360 and Tsib_0363) and the second (Tsib_0367 to Tsib_0369) transporters are related to the CUT1 (carbohydrate uptake) family Mal-I transporter of P. furiosus, which recognizes and transports maltose and trehalose, as well as to the Mal-II transporter of P. furiosus, which is specific for maltooligosaccharides (24). In addition, the second transporter contains a permease component (Tsib_0367) whose halves are highly homologous to the transmembrane ABC transporter permeases PF1748 and PF1749 that are upregulated in maltose- and starch-grown P. furiosus and have been suggested to function in maltose and starch assimilation (24). Tsib_0362 encodes TrmB, a maltose-specific transcriptional repressor of the Mal-I and Mal-II gene clusters in P. furiosus. This maltose and trehalose degradation gene island is present in the genomes of T. sibiricus, T. litoralis, and P. furiosus but is absent in closely related species (T. kodakaraensis, T. onnurineus, P. abyssi, and P. horikoshii), further supporting the hypothesis of its lateral transmission among members of the order Thermococcales.
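The sizes of the two gene islands follow directly from the coordinates given in the text. The sketch below assumes that the coordinates are 1-based and inclusive, as is conventional for GenBank feature locations; the helper name `span_bp` and the dictionary labels are illustrative.

```python
def span_bp(start, end):
    """Length in bp of a region with 1-based, inclusive coordinates."""
    return end - start + 1


# Coordinates quoted in the text
islands = {
    "saccharolytic island (Tsib_0320 to Tsib_0334)": (308_742, 328_657),
    "maltose/trehalose island (Tsib_0356 to Tsib_0369)": (340_628, 356_664),
}

sizes = {name: span_bp(start, end) for name, (start, end) in islands.items()}

# Number of bases lying between the end of the first island
# and the start of the second
gap_bp = 340_628 - 328_657 - 1

for name, size in sizes.items():
    print(f"{name}: {size:,} bp")
print(f"distance between islands: {gap_bp:,} bp")
```

Under this assumption the second island spans about 16 kb, in line with the size of the corresponding locus described for P. furiosus and T. litoralis, and the two islands lie only about 12 kb apart on the chromosome.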
The genome of T. sibiricus contains genes that encode enzymes of the modified Embden-Meyerhof (EM) pathway of glucose metabolism (see Table S2 in the supplemental material) of members of the order Thermococcales (50): ADP-dependent glucokinase, phosphoglucose isomerase (glucose-6-phosphate isomerase), ADP-dependent phosphofructokinase, fructose-1,6-bisphosphate aldolase, triosephosphate isomerase, glyceraldehyde-3-phosphate:ferredoxin oxidoreductase, nonphosphorylating glyceraldehyde-3-phosphate dehydrogenase, phosphoglycerate mutase, enolase, and pyruvate kinase. The pyruvate formed is further converted to acetyl-CoA and CO2 by pyruvate:ferredoxin oxidoreductase.
Generation of different sugars as precursors for anabolic pathways, especially during growth on proteins and for the synthesis of glucose polymers (glycogen) as storage material, proceeds in the gluconeogenic pathway, which shares many enzymes with the EM pathway, and is therefore partly reversible to the EM pathway (50). The T. sibiricus genome encodes the following enzymes to reverse the irreversible reactions of the EM pathway: phosphoenolpyruvate synthase, 3-phosphoglycerate kinase, glyceraldehyde-3-phosphate dehydrogenase, and fructose-1,6-bisphosphatase (see Table S2 in the supplemental material).
Attempts to grow T. sibiricus on medium with phenol and toluene as substrates did not result in growth, suggesting that it is unable to utilize aromatic components of crude oil. Correspondingly, in its genome, the genes that encode the key enzymes for anaerobic degradation of aromatics—benzoyl-CoA reductase, 4-hydroxybenzoyl-CoA reductase, benzylsuccinate synthase, and 6-oxocyclohex-1-ene-1-carbonyl-CoA hydrolase (16, 22, 41)—were absent.
In contrast to aromatic compounds, sequential transfers of T. sibiricus on medium with hexadecane as the substrate revealed reproducible growth, suggesting a potential of the organism to metabolize n-alkanes, the main constituents of petroleum and its refined products. Anaerobic bacteria with the capacity for alkane degradation have been isolated relatively recently, and the enzymes responsible have not been purified and characterized yet (14, 19). One mechanism of activation under anoxic conditions is carboxylation of the third carbon atom in an alkane (16, 34), which has been proposed for the sulfate-reducing bacterium Desulfococcus oleovorans Hxd3, which utilizes long-chain C12 to C20 alkanes. A more widespread activation mechanism is the addition of fumarate at the subterminal carbon atom, which yields a substituted succinate [(1-methylalkyl)succinate], using a glycyl radical as an initiator (34, 42). The gene that encodes the candidate enzyme (1-methylalkyl)succinate synthase (Mas) has been identified recently in alkane-utilizing denitrifying bacterial strain HxN1 (14). The tentative Mas protein is presumably a heterotrimer (MasCDE) that contains a motif (in MasD) characteristic of glycyl radical-bearing sites. A homolog of the large subunit MasD is encoded in the T. sibiricus genome by Tsib_0631, which is annotated as pyruvate:formate lyase. Close homologs of the subunits MasE and MasC are absent. MasG, the activase of MasCDE, is highly homologous to Tsib_0629, which is annotated as pyruvate:formate lyase-activating protein. It is impossible to conclude whether Tsib_0631 and Tsib_0629 encode the n-alkane-activating enzyme Mas and its activator or pyruvate:formate lyase and its activator. In T. kodakaraensis, similar open reading frames were described as pyruvate:formate lyase, and a role for this enzyme in pyruvate metabolism was proposed. Therefore, although T. sibiricus apparently grows on hexadecane, the lack of enzymatic data on anaerobic alkane activation does not allow the mechanism of hexadecane activation to be inferred from the genome.
In alkane-utilizing bacteria, fatty acid chains formed from alkanes as acyl-CoA thioester can be further degraded via conventional β-oxidation with the production of acetyl-CoA and its subsequent mineralization to CO2 (41). However, a β-oxidation pathway is not present in T. sibiricus since homologs of the currently known genes of its key enzymes—fatty acid CoA ligase, acyl-CoA dehydrogenase, and enoyl-CoA hydratase/isomerase—are not encoded in its genome.
The degradation of propionate (e.g., in “Aromatoleum aromaticum” EbN1) follows the methylmalonyl-CoA pathway, which is probably also involved in the degradation of odd-chained fatty acids, yielding propionyl-CoA as an intermediate (41). The key enzymes of the methylmalonyl-CoA pathway are propionyl-CoA carboxylase (PccAB and AccBC) and methylmalonyl-CoA mutase (SbmAB). The genome of T. sibiricus encodes PccAB and AccBC (Tsib_1534 to Tsib_1536), as well as SbmAB (Tsib_0812). Therefore, both key enzymes of the methylmalonyl-CoA pathway are present. However, propionyl-CoA synthase (PrpE) is not encoded, suggesting that T. sibiricus is unable to activate and utilize propionate. Considering the stimulatory effect of hexadecane on growth and the absence of conventional β-oxidation, as well as propionate-activating propionyl-CoA synthase, an unknown pathway for the metabolism of this substrate could be involved in T. sibiricus.
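The presence/absence reasoning in the preceding paragraphs can be summarized as a small lookup. The following is a minimal sketch, not part of the original analysis: the dictionary layout and the function name `missing_enzymes` are illustrative assumptions, and the entries simply restate the homolog calls reported in the text (a value of `None` means that no homolog was found in the genome).

```python
# Homolog calls as reported in the text; None = not encoded in the genome.
# The data structure itself is a hypothetical illustration.
GENOME_CALLS = {
    "propionyl-CoA carboxylase (PccAB, AccBC)": "Tsib_1534 to Tsib_1536",
    "methylmalonyl-CoA mutase (SbmAB)": "Tsib_0812",
    "propionyl-CoA synthase (PrpE)": None,
    "fatty acid CoA ligase": None,
    "acyl-CoA dehydrogenase": None,
    "enoyl-CoA hydratase/isomerase": None,
}

# Key enzymes per pathway, grouped as in the text.
PATHWAYS = {
    "methylmalonyl-CoA pathway": [
        "propionyl-CoA carboxylase (PccAB, AccBC)",
        "methylmalonyl-CoA mutase (SbmAB)",
    ],
    "propionate activation": ["propionyl-CoA synthase (PrpE)"],
    "beta-oxidation": [
        "fatty acid CoA ligase",
        "acyl-CoA dehydrogenase",
        "enoyl-CoA hydratase/isomerase",
    ],
}


def missing_enzymes(pathway):
    """Return the key enzymes of a pathway that have no homolog call."""
    return [e for e in PATHWAYS[pathway] if GENOME_CALLS.get(e) is None]


if __name__ == "__main__":
    for name in PATHWAYS:
        gaps = missing_enzymes(name)
        status = "complete" if not gaps else "missing: " + ", ".join(gaps)
        print(f"{name}: {status}")
```

Under these calls, only the methylmalonyl-CoA pathway has both of its key enzymes, which is why an as-yet-unknown route is invoked for hexadecane.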
Genome analysis predicts the growth of T. sibiricus on acetone, which was confirmed by growth experiments. In “A. aromaticum” EbN1, degradation of acetone and 2-butanone involves the functions of acetone carboxylase (AcxABC) and succinyl-CoA:3-ketoacid-CoA ligase (KctAB) (22). In T. sibiricus, AcxA (Tsib_0819) and AcxB (Tsib_0820) are present. KctA and KctB are highly homologous to the N- and C-terminal halves of Tsib_0955, respectively. Degradation of acetone would result in acetyl-CoA, and degradation of 2-butanone would yield propionyl-CoA, which is further metabolized to pyruvate in the methylmalonyl-CoA pathway identified in T. sibiricus.
The genome of T. sibiricus contains 15 genes that encode lipases/esterases. Four of these putative enzymes contain signal peptides (Tsib_0218, Tsib_0981, Tsib_1042, and Tsib_1454), suggesting that they operate extracellularly. In accordance with the genomic data, T. sibiricus was found to grow on olive oil, whose major constituent is triolein. The lipolytic growth of T. sibiricus appears to be unique in the order Thermococcales and confirms the presence and functionality in this organism of extracellular true lipases that hydrolyze ester bonds in triacylglycerides of long-chain fatty acids. In addition, T. sibiricus was found to grow on glycerol but not on long-chain fatty acids. These findings agree with the absence of a β-oxidation pathway of fatty acid degradation. Therefore, growth on olive oil results from the utilization of glycerol liberated after hydrolysis of triglycerides by an extracellular lipase(s). Glycerol could be transported into the cell by Tsib_1774, which is partially related to the glycerol uptake facilitator GlpF from Archaeoglobus fulgidus, then phosphorylated by glycerol kinase (Tsib_0768), and finally channeled into the modified EM pathway.
T. sibiricus conserves energy by substrate level phosphorylation, as well as by oxidative phosphorylation (anaerobic respiration) linked to proton or sulfur reduction and proton translocation by membrane-bound complex I-related complexes MBH 1, MBH 2, and MBX (see below). Metabolism of sugars in the modified EM pathway and further oxidation of pyruvate result in the formation of two ATP molecules per molecule of glucose oxidized to acetate. Metabolism of proteins yields one molecule of ATP by substrate level phosphorylation per amino acid molecule oxidized.
The main mechanism of energy conservation in T. sibiricus appears to be proton translocation by three NADH:quinone oxidoreductase (complex I)-related systems: two membrane-bound proton-reducing H2-evolving hydrogenase complexes, MBH 1 and MBH 2 (energy-converting hydrogenases, Ech) (47), and one membrane-bound NADP+-reducing complex (MBX) eventually linked to the reduction of S° (48). The electron donor for these complexes is likely reduced ferredoxin, which is formed in ferredoxin-dependent reactions in the pathways of carbohydrate and protein oxidation. The function of H2-evolving hydrogenase complexes is supported by the observed evolution of H2 during the growth of T. sibiricus in the absence of S° as an electron acceptor (e.g., on peptone).
The genome of T. sibiricus contains two similar 14-gene operons that encode MBH 1 and MBH 2 (see Table S2 in the supplemental material) and are related to the MBH operon of P. furiosus, which has the same gene order and encodes a six-subunit proton-translocating NiFe hydrogenase and multisubunit Na+/H+ transporters (17, 48). It should be noted that two MBH operons are present in D. kamchatkensis (43), whereas single MBH operons were found in P. furiosus and T. kodakaraensis. A 13-gene operon, MBX (Tsib_1905-Tsib_1917), is related to the MBX operon of P. furiosus and has the same gene order (48).
In the absence of S°, T. sibiricus presumably conserves energy by the function of MBH 1 and MBH 2 with reduced ferredoxin as an electron donor and protons as an electron acceptor. Thereby, protons are reduced to H2 and a transmembrane proton gradient is established.
The growth of T. sibiricus is inhibited by high H2 concentrations and stimulated by the presence of S° as an electron acceptor. In the presence of S°, T. sibiricus conserves energy by the function of MBX, as proposed for P. furiosus (48). MBX oxidizes reduced ferredoxin, translocates protons, and reduces NADP+. NADPH is reoxidized by NADPH:S° oxidoreductase (Tsib_0263), which transfers electrons to S° to produce H2S.
The proton gradient established by the functions of MBH 1, MBH 2, and MBX is used by ATP synthase for the synthesis of ATP. The genome of T. sibiricus contains a single set of Na+-ATP synthase genes, atpHIKECFABD (Tsib_1798 to Tsib_1790). Assuming that the Ech hydrogenases and the MBX complex translocate protons, the presence of Na+/H+ transporters in the gene clusters of MBH 1, MBH 2, and MBX can be explained by the requirement to convert the proton gradient into a Na+ ion gradient used for ATP synthesis by the Na+-ATP synthase of T. sibiricus. Note that relatively high salinity (optimal growth at 20 g liter−1 NaCl) is required for growth of T. sibiricus (32).
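The locus ranges quoted for the energy-conservation operons can be checked against the stated gene counts with a short script. This is only a sketch under a stated assumption: that each gene occupies one consecutive locus tag and that the atp genes are listed in the same order as the tag range (Tsib_1798 down to Tsib_1790). The text does not state this mapping explicitly, and the helper `locus_tag_range` is a hypothetical name introduced here.

```python
def locus_tag_range(first, last, prefix="Tsib_", width=4):
    """Expand an inclusive locus range (ascending or descending) into tags."""
    step = 1 if last >= first else -1
    return [f"{prefix}{n:0{width}d}" for n in range(first, last + step, step)]


# Split the operon shorthand "atpHIKECFABD" into individual gene names.
ATP_OPERON = "atpHIKECFABD"
atp_genes = ["atp" + letter for letter in ATP_OPERON[3:]]

# ASSUMPTION: one gene per consecutive locus tag, in the listed order.
atp_tags = locus_tag_range(1798, 1790)
atp_map = dict(zip(atp_genes, atp_tags))

# MBX is described in the text as a 13-gene operon, Tsib_1905-Tsib_1917.
mbx_tags = locus_tag_range(1905, 1917)

if __name__ == "__main__":
    print(len(atp_genes), len(atp_tags))  # gene count vs. tag count
    print(len(mbx_tags))                  # tag count for the MBX range
    for gene, tag in atp_map.items():
        print(gene, tag)
```

The number of tags in each range agrees with the number of genes named in the text (nine for the ATP synthase, thirteen for MBX); the per-gene assignment printed at the end remains an assumption until checked against the annotation itself.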
In addition to S°, some archaea can respire with sulfate, thiosulfate, or nitrate. However, genome analysis suggests that these compounds cannot be employed as electron acceptors by T. sibiricus because homologs of the currently known enzymes of the dissimilatory sulfate and thiosulfate reduction pathway (sulfate adenylyltransferase, adenylylsulfate reductase, sulfite reductase, and thiosulfate reductase), as well as the dissimilatory nitrate reductase and nitrite reductase, are not encoded.
The metabolic pathways of substrate utilization and energy generation identified in the genome of T. sibiricus are presented in Fig. 2. The genomic sequence provided a basis for the discovery of many new physiological features of this organism, which was initially described as a specialized utilizer of proteinaceous substrates. In addition to the expected protein degradation and metabolism machinery, the genome encodes numerous extracellular and intracellular enzymes for carbohydrate degradation and sugar transport systems. In agreement with the genes identified, we have found T. sibiricus to be capable of growing on oligomeric and polymeric α-linked (maltose and dextran) and β-linked (cellobiose, cellulose, laminarin, and agarose) carbohydrates. The ability to grow on agarose was previously found in Desulfurococcus fermentans (40) but was never shown for representatives of the order Thermococcales or other hyperthermophilic Euryarchaeota. In the case of T. sibiricus, this capacity is supported by genome analysis, which revealed the presence of a β-agarase gene. The complete set of genes responsible for the hydrolysis of β-linked polysaccharides, their transport into the cell, and intracellular hydrolysis of oligomers to monosaccharides is located on a saccharolytic gene island, which is absent in the genomes of other members of the order Thermococcales and was probably acquired by T. sibiricus by lateral gene transfer. The second laterally acquired gene island contains genes for maltose and trehalose degradation.
Moreover, T. sibiricus probably possesses a new mechanism of n-alkane degradation, since its growth is stimulated by hexadecane and no enzymes of currently known pathways are encoded. Not reported so far for hyperthermophilic archaea is the lipolytic growth of T. sibiricus on triacylglycerides. This growth is apparently enabled by the function of an extracellular true lipase(s) encoded by the genome.
The new physiological features of T. sibiricus are consistent with the proposal of the indigenous origin of the organism in the never flooded oil-bearing Jurassic horizon of the Samotlor oil reservoir, where it was found (32), and they explain its survival over geologic time, as well as its proliferation in this habitat. Indeed, in addition to amino acids present in oil reservoirs at limiting concentrations (53), its growth could be supported by cellulose, agarose, laminarin, and triglycerides from oceanic sediments, as well as alkanes from crude oil. Considering that agarose and laminarin are components of algae, the substrate spectrum of T. sibiricus and the enzymes it encodes indicate the ability to metabolize buried organic matter from the original oceanic sediment.
We thank Alexander Lebedinsky for helpful discussions and critical reading of the manuscript.
This work was supported by the Federal Agency for Science and Innovations of Russia (contract 02.512.11.2201). The work of M.L.M. and E.A.B.-O. on the analysis of the growth characteristics of T. sibiricus was supported by the Program Molecular and Cellular Biology of RAS.
Published ahead of print on 15 May 2009.
†Supplemental material for this article may be found at http://aem.asm.org/.