T. sibiricus has a single circular chromosome of 1,845,800 bp with no extrachromosomal elements (see Fig. S1 in the supplemental material). There is a single copy of a 16S-23S rRNA operon, along with two distantly located 5S rRNA genes. A total of 46 tRNA genes carrying 44 different anticodons coding for all 20 amino acids are scattered over the genome.
By a combination of coding potential prediction and similarity search, 2,061 potential protein-coding genes were identified with an average length of 815 bp, covering 91% of the genome. These values are in good accordance with the general correlation between the microbial genome size and the predicted gene numbers. Through similarity and domain searches of the predicted protein products, the functions of 69% of the genes (1,413 genes) may be predicted with different degrees of confidence and generalization. The functions of the remaining 648 genes (31%) cannot be predicted from the deduced amino acid sequences; 181 of them are unique to T. sibiricus, with no significant similarity to any known sequences.
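The summary figures in this paragraph can be cross-checked with simple arithmetic. The sketch below is illustrative only: the variable names are ours, and the coding fraction is approximated as gene count times average gene length, which assumes that genes do not overlap.

```python
# Values quoted in the text.
genome_bp = 1_845_800
n_genes = 2_061
mean_gene_bp = 815
n_annotated = 1_413

# Approximate coding fraction (assumes non-overlapping genes).
coding_fraction = n_genes * mean_gene_bp / genome_bp

# Genes without a predicted function, and both shares as whole percentages.
n_unknown = n_genes - n_annotated
annotated_pct = round(100 * n_annotated / n_genes)
unknown_pct = round(100 * n_unknown / n_genes)

print(f"coding fraction: {coding_fraction:.2f}")
print(f"annotated: {annotated_pct}%, unknown: {n_unknown} genes ({unknown_pct}%)")
```

Under these assumptions the computed values agree with the 91% coverage and the 69%/31% split reported above.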
About 70% of the T. sibiricus proteins showed similarity to those of T. kodakaraensis and T. onnurineus (see Fig. S2 in the supplemental material), and 1,024 of them are also present in all three Pyrococcus species. This shared set of genes might belong to the common ancestor of members of the order Thermococcales. About 387 of the T. sibiricus proteins are not present in other members of the order Thermococcales, and although many of them encode proteins with unknown function, there are several intriguing genes that might be responsible for the specific traits of T. sibiricus, as discussed below.
Pairwise comparison by all-vs-all BLASTP of the genomes of T. sibiricus and other Thermococcus spp. confirmed the high frequency of shuffling-driven genome rearrangements (see Fig. S2 in the supplemental material). In contrast to the well-conserved gene context among Pyrococcus genomes, the genomes of thermococci are more rearranged and the synteny is conserved only for short chromosomal segments.
Mobile genetic elements.
Analysis of the repeated sequences and a search against the IS database revealed seven copies of putative transposons of three types. We found two almost identical complete copies of an IS605 family transposon, which was named ISTsi1. This 1,856-bp-long IS element contains two genes, orfA and orfB, that encode homologs of the resolvase (COG2452) and transposase (COG0675) of IS605. The almost identical sequences of the two copies of ISTsi1 suggest that this IS element invaded the T. sibiricus genome recently on the evolutionary time scale and may remain functional. ISTsi1-like sequences are also present in the genomes of other Thermococcus species.
Another IS element of the IS605 family, ISTsi2, is present in two almost identical copies. This 1,398-bp-long IS element contains a pseudogene of an IS605 family transposase (orfB, COG0675) carrying three frameshift mutations. In addition, a shortened copy of this IS element (1,359 bp), also carrying a transposase pseudogene, was found. Perhaps the propagation of this already nonfunctional IS element depends on another active transposon. The only archaeal homologue of the “restored” transposase of ISTsi2 is the PF1609 protein of P. furiosus; many others are encoded in the genomes of thermophilic bacteria (Caldicellulosiruptor, Anaerocellum, Geobacillus, etc.), suggesting ancient invasion of the genome of the common ancestor of members of the order Thermococcales by this IS element.
The third IS element, ISTsi3, belongs to the IS200/IS600 family and is present in one full-size copy (1,884 bp) and two shortened copies (1,630 bp and 1,626 bp). The full-size copy contains two genes (COG1943 and COG0675), while the shorter copies harbor only the longer gene, orfB (COG0675). Note that proteins similar to OrfB are encoded in the genomes of other Thermococcus species but OrfA-like proteins are present only in crenarchaeal and bacterial genomes.
In the crenarchaeal genus Sulfolobus, site-specific integration of viruses occurs at tRNA genes by the integrase-mediated mechanism (39). Such regions bordered by a partitioned integrase gene have been found in other members of the order Thermococcales
, and T
). In particular, the T
genome contains four such regions comprising 18 to 28 kb each and encoding several genes, including genes related to DNA replication (11). The search for similar integrated viruses in the T. sibiricus genome revealed only one such region, bordered by a partitioned integrase homologous to the integrase of the SSV1 virus of Sulfolobus shibatae
. This region is much shorter (2.7 kb) than virus-related regions present in the T
genome and contains only a single gene (Tsib_0741), thus probably representing the late stage of elimination of the integrated virus-related element from the genome.
The T. sibiricus genome contains a single CRISPR containing 24 repeat spacer units. The spacer regions are thought to be derived from extrachromosomal elements like viruses (33), and the spacer transcripts may inactivate mobile-element propagation by a mechanism somewhat similar to eukaryotic RNA interference (2). We found no matches between the spacer sequences and any known archaeal extrachromosomal genetic elements, and no CRISPR spacer matched the sequences of the IS elements present in the T. sibiricus genome. Other representatives of the order Thermococcales generally harbor multiple CRISPR loci (11) carrying more repeat spacer units. This difference may reflect the constant and relatively noncompetitive natural environment of T. sibiricus, which is rarely invaded by viruses and other mobile elements.
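The comparison of CRISPR spacers with IS elements described above can be illustrated with a minimal sketch. It is not the method used in this work: it assumes exact substring matching on both strands (a real search would normally tolerate mismatches, for example with BLASTN), it assumes sequences contain only A, C, G, T, or N, and the spacer and target sequences are invented toy data.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T, or N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def spacers_matching(spacers, targets):
    """Return (spacer, target) name pairs with an exact match on either strand."""
    hits = []
    for spacer_name, spacer in spacers.items():
        forward = spacer.upper()
        reverse = reverse_complement(forward)
        for target_name, target in targets.items():
            target_seq = target.upper()
            if forward in target_seq or reverse in target_seq:
                hits.append((spacer_name, target_name))
    return hits


# Hypothetical toy data, not sequences from the T. sibiricus genome.
toy_spacers = {"sp1": "ACGTTGCA", "sp2": "GGGCCCAT"}
toy_targets = {"IS_toy": "TTTATGGGCCCTTT"}
hits = spacers_matching(toy_spacers, toy_targets)
print(hits)
```

Here only the second toy spacer is reported, because its reverse complement occurs in the toy target; an empty list would correspond to the negative result described in the text.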
The archaeal chromosomal replication initiation site (oriC) was first identified in P
within the noncoding region located upstream of a gene that encodes a homolog of the eukaryotic Orc1/Cdc6 cell division control protein (35). Members of the order Thermococcales contain a single replication initiation site, while some other archaea, for example, Sulfolobus species, carry multiple chromosomal replication origins, and multiple cdc6 genes were found to be located close to them (31
). In three Pyrococcus
regions contain a pair of conserved origin recognition box (ORB) sequences, the Orc1/Cdc6 protein binding sites, separated by an ~250-bp spacer region (45
). Likewise, the ori
site of T
is located in an AT-rich region located upstream of a gene for the Orc1/Cdc6 homolog (11
). Usually, the ori sites are also linked to a “replication island,” a set of replication- and recombination-associated genes (e.g., polymerase, rfc, helicase, etc.).
Nucleotide composition disparity analysis (Z-curve analysis) (57) of the T. sibiricus genome showed one major peak at around 1.78 Mb, where the nucleotide compositional deviation changes (Fig. 1). This peak was found for the MK disparity component of the Z curve, as observed for the Sulfolobus acidocaldarius
) and Pyrobaculum aerophilum
) chromosomal ori sites. The search for ORB sequences (45) revealed that the noncoding region located between the Tsib_1993 and Tsib_1994 genes (functions unknown) contains two ORB sequences, as well as an additional partial copy of this motif (Fig. 1). As in pyrococci, the two full copies of the ORB sequence are oriented in opposite directions and separated by a 250-bp spacer. The position of this region coincides with the location of the MK disparity peak (Fig. 1), further supporting the hypothesis that the ori site is located at this point.
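The MK disparity component used above can be sketched in a few lines. In the standard Z-curve definition, the MK (amino/keto) component at position n is the cumulative count of A and C minus the cumulative count of G and T. The code below is a minimal illustration under that definition; the function names and the toy sequence are ours, and reporting the maximum (rather than the minimum) as the peak is an assumption that depends on strand and start-point conventions.

```python
def mk_disparity(seq):
    """Cumulative MK (amino/keto) disparity: +1 for A or C, -1 for G or T."""
    total = 0
    curve = []
    for base in seq.upper():
        if base in "AC":
            total += 1
        elif base in "GT":
            total -= 1
        # Any other symbol (e.g., N) leaves the running value unchanged.
        curve.append(total)
    return curve


def peak_position(curve):
    """1-based position of the first maximum of the curve."""
    return curve.index(max(curve)) + 1


# Toy sequence: an amino-rich stretch followed by a keto-rich stretch.
toy = "AACCACCA" + "GGTTGTGT"
curve = mk_disparity(toy)
print(peak_position(curve), max(curve))
```

On a real chromosome the curve would be computed over the full sequence, and the position of the extremum would be compared with the location of candidate ORB-containing intergenic regions.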
FIG. 1. Z-curve analysis of the T. sibiricus genome (the MK disparity component is presented) showing the major peaks where nucleotide compositional deviations occur. On the genome map, the cdc6 gene is shown. At the bottom, the structure of the predicted oriC (more ...)
homologue is present in the T. sibiricus genome (Tsib_1590), but unlike in other members of the order Thermococcales, it is located far from the potential oriC site, and no ORB-like sequences were found around this gene. The replication island near oriC is most complete in P
, while in T
it is separated into two regions, where genes for the Orc1/Cdc6 homolog and DNA polymerase are 290 kb away from the genes for the RadB and RF-C subunits (11
). The genome of T. sibiricus is more rearranged; only the radA gene (Tsib_1992) remains associated with oriC. Two other regions are located far from this site: one contains the cdc6 gene (Tsib_1590) linked to DNA polymerase II subunits (Tsib_1588 and Tsib_1589), and the other comprises the rfc (Tsib_0101 and Tsib_0102) and radB (Tsib_0108) genes. It is possible that the more condensed replication island structure in Pyrococcus species reflects specific adaptation of their DNA replication machinery to extreme thermophily (98 to 103°C) relative to thermococci growing at considerably lower temperatures (80 to 90°C).
Growth of T. sibiricus on different substrates.
T. sibiricus was originally reported to grow on proteinaceous substrates, peptone, yeast extract, beef extract, and soya bean extract, but not on starch, pyruvate, glucose, acetate, methanol, ethanol, lactate, or H2
). S° as an electron acceptor was not obligately required but stimulated growth.
Analysis of the T. sibiricus genome revealed the presence of genes required for protein export systems, and the SignalP algorithm predicted a total of 281 proteins carrying N-terminal signal sequences. Most of them have been annotated as transporters, proteases, glycoside hydrolases, lipases/esterases, and hypothetical proteins. The encoded proteins suggested the ability of T. sibiricus to utilize additional substrates. Therefore, in this work we attempted to adapt the organism to growth on some substrates whose utilization could be expected from the genomic data (see Table S1 in the supplemental material). We performed cultivation experiments and, after sequential transfers, identified maltose, dextran, amorphous cellulose, carboxymethyl cellulose, cellobiose, agarose, hexadecane, acetone, olive oil, and glycerol as new substrates for T. sibiricus. Starch, microcrystalline cellulose, chitin, xylan, pectin, phenol, toluene, long-chain fatty acids, and CO did not support its growth (see Table S1 in the supplemental material).
Metabolism of proteins.
Growth of T. sibiricus on proteins and peptides as the sole source of energy and carbon is enabled by two signal peptide-containing extracellular proteases, a pyrolysin-like serine protease (Tsib_0267) and an archaeal subtilisin-like serine protease (Tsib_1234). The cell can import peptides via an ABC-type dipeptide/oligopeptide transport system, dipeptide ABC transporters, and an OPT family oligopeptide transporter. The imported peptides can be cleaved by more than 20 encoded peptidases, including aminopeptidases, carboxypeptidases, and dipeptidases. The amino acids can then be deaminated by a number of aminotransferases (see Table S2 in the supplemental material) in a glutamate dehydrogenase (Tsib_1110)-coupled manner, followed by oxidation to generate the corresponding coenzyme A (CoA) derivatives, as proposed for P
). The oxidation step involves at least four types of ferredoxin-dependent oxidoreductases with distinct substrate specificities (1): pyruvate:ferredoxin oxidoreductase, 2-ketoisovalerate:ferredoxin oxidoreductase, indolepyruvate:ferredoxin oxidoreductase, 2-ketoglutarate:ferredoxin oxidoreductase, and a second 2-ketoglutarate:ferredoxin oxidoreductase (see Table S2 in the supplemental material). An additional heterodimeric 2-oxoacid:ferredoxin oxidoreductase, the substrate specificity of which cannot be identified by sequence comparison, may be involved in the oxidation of the acyl-CoA derivatives.
The acyl-CoA derivatives in T. sibiricus are converted to the corresponding acids by the two acetyl-CoA synthetases (ACS I and II) and a succinyl-CoA synthetase (see Table S2 in the supplemental material) with concomitant substrate level phosphorylation to generate ATP, as shown in P
) and T
). The pathway involving a two-step transformation of acetyl-CoA to acetate with phosphotransacetylase and acetate kinase, which is common in anaerobic bacteria, presumably does not operate in T. sibiricus because homologs of the currently known genes for phosphotransacetylase and acetate kinase are absent.
As an alternative assimilation pathway for amino acids, 2-keto acids derived from amino acids can be decarboxylated, depending on the redox balance of the cell, to the corresponding aldehydes by ferredoxin-independent reactions of the ferredoxin-dependent oxidoreductases (28) and then oxidized to carboxylic acids by five aldehyde:ferredoxin oxidoreductases (see Table S2 in the supplemental material). Alcohol dehydrogenases (ADHs) might be responsible for the reduction of aldehydes to alcohols in the absence of sufficient amounts of the terminal electron acceptor, S°, in a reaction coupled to the oxidation of NADPH to NADP+, which would dispose of excess reductant (28). In the T. sibiricus genome, there are three genes for short-chain ADHs and one gene for an Fe-dependent ADH (see Table S2 in the supplemental material).
Metabolism of carbohydrates.
The genome encodes numerous enzymes for the metabolism of polymeric carbohydrates and oligosaccharides, which correlates with the growth of T. sibiricus on maltose, dextran, amorphous cellulose, carboxymethyl cellulose, cellobiose, and agarose observed in this work. T. sibiricus cannot grow on chitin, xylan, or pectin, which is consistent with the absence of genes that encode chitinases, xylanases, and pectinases in its genome.
The inability of T. sibiricus to grow on starch is consistent with the genomic data. In contrast to several Thermococcus and Pyrococcus species, no genes for extracellular amylolytic enzymes were found in the T. sibiricus genome. The single identified α-amylase (Tsib_1115) lacks a signal peptide and is intracellular. T. sibiricus was, however, able to grow on maltose. Extracellular maltose and maltooligosaccharides could be transported into the cell and then hydrolyzed to glucose. The production of glucose could be accomplished by several intracellular enzymes: α-amylase, glycogen-debranching enzyme (Tsib_0365), two α-glucosidases (Tsib_1113 and Tsib_0873), and a 4-α-glucanotransferase (Tsib_0455).
In contrast to its inability to grow on starch, T. sibiricus grew quite well on dextran, which is an α-1,6-linked D-glucose polymer branched at α-1,4 linkages. However, no extracellular enzymes for the degradation of dextran could be identified in the genome.
The genome predicts the ability of T. sibiricus to metabolize cellulose, in agreement with its observed growth on amorphous cellulose. The T. sibiricus genome encodes a number of β-specific glycoside hydrolases, which form a complete system for cellulose degradation. The Tsib_0320-to-Tsib_0334 gene cluster includes three putative endo-β-1,4-glucanases (extracellular Tsib_0326 and Tsib_0328, as well as intracellular Tsib_0327). Both extracellular enzymes are homologous to endo-β-1,4-glucanases CelA (TM1524) and CelB (TM1525) from Thermotoga maritima, which were found to be active against soluble substrates (25), as well as to P
endo-β-1,4-glucanase EglA (PF0854), which shows maximum activity on C4
cellobiose oligosaccharides (3
). Like the T
enzymes, both extracellular endo-β-1,4-glucanases from T
do not contain a cellulose-binding domain (58
) required to bind to and efficiently hydrolyze crystalline cellulose. This correlates with the inability of T. sibiricus to grow on crystalline cellulose and explains its growth on amorphous cellulose, carboxymethyl cellulose, and cellobiose. The imported cellooligomers are further degraded to smaller oligosaccharides by intracellular endo-β-1,4-glucanases, as suggested for T
). At least three such enzymes are encoded by T. sibiricus (Tsib_0327, Tsib_0137, and Tsib_1215).
T. sibiricus and other archaea do not contain genes for cellobiohydrolase, a key enzyme in classical cellulolytic systems. However, the genome of T. sibiricus encodes an intracellular β-glucan glucohydrolase (Tsib_0580) that displays high homology with the functionally characterized intracellular β-glucan glucohydrolase GghA from Thermotoga neapolitana
), which has been suggested as a substitute for cellobiohydrolase. The soluble products of the initial hydrolysis of cellulose by endo-β-1,4-glucanases in Thermotoga
are then hydrolyzed by intracellular GghA, which preferentially attacks the longer cellooligomers cellotriose to cellohexaose, releasing glucose from the nonreducing end and producing cellobiose (56
). Cellobiose can also be hydrolyzed by a β-glucosidase (Tsib_0334) most closely related to functionally characterized P
β-glucosidase CelA (PF0073) (55
). An additional gene product involved in cellulose degradation is Tsib_0320, which encodes a putative intracellular cellobiose phosphorylase. Tsib_0320 is related to functionally characterized intracellular cellobiose phosphorylases CbpA from T
) and CepA from T
. In Thermotoga
sp., cellobiose phosphorylase attacks cellulose-derived cellobiose, producing the activated molecule glucose-1-phosphate and glucose.
The genome encodes an extracellular laminarinase (endo-β-1,3-glucanase, Tsib_0321) highly homologous to functionally characterized laminarinase LamB (PF0076) from P
and to many bacterial laminarinases. This enzyme acts best on the β-1,3-glucose polymer laminarin, hydrolyzing it into smaller laminari-oligomers; displays some hydrolytic activity with the β-1,3-1,4 glucose polymers lichenan and barley β-glucan; and is not active with the β-1,4 glucose polymer cellulose (15
). The source of laminarin in nature is brown algae. Endo-β-1,4-glucanase (cellulase), laminarinase, and β-glucosidase, encoded by the gene cluster Tsib_0320 to Tsib_0334, could perform synergistic activity during the hydrolysis of complex carbohydrates, as demonstrated earlier for barley β-glucan and laminarin (10
The genome predicts the ability of T. sibiricus to metabolize agarose, which is confirmed by the observed growth on this substrate. Utilization of agarose is apparently enabled by an extracellular β-agarase (Tsib_0325). Agar, a cell wall constituent of many red algae, exists in nature as a mixture of unsubstituted and substituted agarose polymers and can constitute up to 70% of the algal cell wall. Agarase hydrolyzes agar and agarose, generating oligosaccharides. β-Agarase Tsib_0325 displays high similarity to the putative β-agarases of Rhodopseudomonas palustris TIE-1 and many different bacteria, including Victivallis vadensis and Pseudomonas putida. No homologs of Tsib_0325 or other agarases are present in the genomes of any other archaea. A gene adjacent to the Tsib_0325 gene encodes a β-galactosidase (Tsib_0324), which would detach galactose residues from the oligosaccharides produced by β-agarase.
The lower-molecular-weight oligosaccharides resulting from the hydrolysis of polymers can be imported into the cell via binding protein-dependent ABC-type transport systems. The gene cluster Tsib_0320 to Tsib_0334 includes genes that encode an ABC transport system (Tsib_0329 to Tsib_0333) for β-glucosides. This system is homologous to the cellobiose/β-glucoside ABC transport system of P
The above data show that one particular region of the T. sibiricus genome (bp 308742 to 328657, corresponding to genes Tsib_0320 to Tsib_0334) carries a set of genes responsible for the utilization of cellulose, laminarin, agar, and other β-linked polysaccharides. This region contains genes for the extracellular cleavage of these substrates, transport of oligosaccharides into cells, and their subsequent intracellular cleavage to monomers (see Table S2 in the supplemental material). The closest homologs of all enzymes (except β-agarase) were found in Thermotoga
species; however, this region is absent in T
. In T
, this gene island is embedded in a cluster of ribosomal protein genes with a one-to-one correlation to highly conserved corresponding genes in T
(see Fig. S3 in the supplemental material). All of the above observations suggest that the “saccharolytic gene island” was horizontally transferred to the T
genome from unknown thermophilic bacteria probably related to Thermotoga
sp. The possible archaeal origin of this gene island also cannot be excluded, since extensive lateral gene transfer from archaea was suggested for T
, in whose genome about 24% of the genes were found to bear the most similarity to archaeal genes (37
). Homologs of some of the genes in the T
saccharolytic gene island are also present in P
, but they are distributed in the genome and not located in a particular gene island.
Another supposed case of lateral transfer of saccharolytic genes in members of the order Thermococcales
is the 16-kb maltose and trehalose degradation locus found in the P
and Thermococcus litoralis
). A similar gene island is present in the T. sibiricus genome (bp 340628 to 356664, corresponding to genes Tsib_0356 to Tsib_0369) with a one-to-one correlation to P
genes PF1737 to PF1751 (see Fig. S3 in the supplemental material). This gene cluster, which encodes a trehalose synthase (Tsib_0361), a glycogen-debranching enzyme (Tsib_0365), and a glycosidase (Tsib_0366), also harbors genes of two ABC transport systems for α-glucosides. The first (Tsib_0358 to Tsib_0360 and Tsib_0363) and the second (Tsib_0367 to Tsib_0369) transporters are related to the CUT1 (carbohydrate uptake) family Mal-I transporter of P
, which recognizes and transports maltose and trehalose, as well as to the Mal-II transporter of P
, which is specific for maltooligosaccharides (24
). In addition, the second transporter contains a permease component (Tsib_0367) whose halves are highly homologous to the transmembrane ABC transporter permeases PF1748 and PF1749 that are upregulated in maltose- and starch-grown P
and have been suggested to function in maltose and starch assimilation (24
). Tsib_0362 encodes TrmB, a maltose-specific transcriptional repressor of the Mal-I and Mal-II gene clusters in P
. This maltose and trehalose degradation gene island is present in the genomes of T
, and P
but is absent in closely related species (T
, and P
), further supporting the hypothesis of its lateral transmission among members of the order Thermococcales
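The sizes of the two gene islands follow directly from the coordinates quoted above. The short sketch below works them out; it assumes that the coordinates are 1-based and inclusive (the text does not state the convention), and the dictionary labels are ours.

```python
# Coordinates quoted in the text; assumed here to be 1-based and inclusive.
islands = {
    "saccharolytic island (Tsib_0320 to Tsib_0334)": (308_742, 328_657),
    "maltose/trehalose island (Tsib_0356 to Tsib_0369)": (340_628, 356_664),
}


def span_length(start, end):
    """Length in bp of an inclusive coordinate range."""
    return end - start + 1


lengths = {name: span_length(*coords) for name, coords in islands.items()}

# Number of bases lying strictly between the two islands.
gap_bp = 340_628 - 328_657 - 1

for name, bp in lengths.items():
    print(f"{name}: {bp} bp")
print(f"separation: {gap_bp} bp")
```

Under these assumptions the saccharolytic island spans about 20 kb and the maltose and trehalose island about 16 kb, in line with the 16-kb size quoted for the locus, and the two islands lie roughly 12 kb apart.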
Growth on hydrocarbons.
Attempts to grow T
on medium with phenol and toluene as substrates did not result in growth, suggesting that it is unable to utilize aromatic components of crude oil. Correspondingly, in its genome, the genes that encode the key enzymes for anaerobic degradation of aromatics—benzoyl-CoA reductase, 4-hydroxybenzoyl-CoA reductase, benzylsuccinate synthase, and 6-oxocyclohex-1-ene-1-carbonyl-CoA hydrolase (16
In contrast to aromatic compounds, sequential transfers of T. sibiricus on medium with hexadecane as the substrate revealed reproducible growth, suggesting a potential of the organism to metabolize n-alkanes, the main constituents of petroleum and its refined products. Anaerobic bacteria with the capacity for alkane degradation have been isolated relatively recently, and the enzymes responsible have not yet been purified and characterized (14). One mechanism of activation under anoxic conditions is carboxylation of the third carbon atom in an alkane (16), which has been proposed for the sulfate-reducing bacterium Desulfococcus oleovorans Hxd3, which utilizes long-chain C12
alkanes. A more widespread activation mechanism is the addition of fumarate at the subterminal carbon atom, which yields a substituted succinate [(1-methylalkyl)succinate], using a glycyl radical as an initiator (34). The gene that encodes the candidate enzyme (1-methylalkyl)succinate synthase (Mas) has been identified recently in alkane-utilizing denitrifying bacterial strain HxN1 (14). The tentative Mas protein is presumably a heterotrimer (MasCDE) that contains a motif (in MasD) characteristic of glycyl radical-bearing sites. The large subunit MasD is encoded by the T. sibiricus genome as Tsib_0631 and annotated as pyruvate:formate lyase. Close homologs of the subunits MasE and MasC are absent. MasG, the activase of MasCDE, is highly homologous to Tsib_0629, which is annotated as pyruvate-formate lyase-activating protein. It is impossible to conclude whether Tsib_0631 and Tsib_0629 encode the n-alkane-activating enzyme Mas and its activator or pyruvate:formate lyase and its activator. In T
, similar open reading frames were described as pyruvate-formate lyase, and a role for this enzyme in pyruvate metabolism was proposed. Therefore, although T. sibiricus apparently grows on hexadecane, the lack of enzymatic data on anaerobic alkane activation means that the mechanism of hexadecane activation cannot be inferred from the genome.
In alkane-utilizing bacteria, fatty acid chains formed from alkanes as acyl-CoA thioester can be further degraded via conventional β-oxidation with the production of acetyl-CoA and its subsequent mineralization to CO2
). However, a β-oxidation pathway is not present in T. sibiricus, since homologs of the currently known genes for its key enzymes (fatty acid CoA ligase, acyl-CoA dehydrogenase, and enoyl-CoA hydratase/isomerase) are not encoded in its genome.
The degradation of propionate (e.g., in “Aromatoleum aromaticum” EbN1) follows the methylmalonyl-CoA pathway, which is probably also involved in the degradation of odd-chained fatty acids, yielding propionyl-CoA as an intermediate (41). The key enzymes of the methylmalonyl-CoA pathway are propionyl-CoA carboxylase (PccAB and AccBC) and methylmalonyl-CoA mutase (SbmAB). The genome of T. sibiricus encodes PccAB and AccBC (Tsib_1534 to Tsib_1536), as well as SbmAB (Tsib_0812). Therefore, both key enzymes of the methylmalonyl-CoA pathway are present. However, propionyl-CoA synthase (PrpE) is not encoded, suggesting that T. sibiricus is unable to activate and utilize propionate. Considering the stimulatory effect of hexadecane on growth and the absence of conventional β-oxidation, as well as of the propionate-activating propionyl-CoA synthase, an unknown pathway for the metabolism of this substrate could be involved in T. sibiricus.
Genome analysis predicts the growth of T. sibiricus on acetone, which was confirmed by growth experiments. In “A. aromaticum” EbN1, degradation of acetone and 2-butanone involves the functions of acetone carboxylase (AcxABC) and succinyl-CoA:3-ketoacid-CoA ligase (KctAB) (22). In T. sibiricus, AcxA (Tsib_0819) and AcxB (Tsib_0820) are present. KctA and KctB are highly homologous to the N- and C-terminal halves of Tsib_0955, respectively. Degradation of acetone would result in acetyl-CoA, and degradation of 2-butanone would yield propionyl-CoA, which is further metabolized to pyruvate in the methylmalonyl-CoA pathway identified in T. sibiricus.
Conservation of energy.
T. sibiricus conserves energy by substrate level phosphorylation, as well as by oxidative phosphorylation (anaerobic respiration) linked to proton or sulfur reduction and proton translocation by membrane-bound complex I-related complexes MBH 1, MBH 2, and MBX (see below). Metabolism of sugars in the modified EM pathway and further oxidation of pyruvate result in the formation of two ATP molecules per molecule of glucose oxidized to acetate. Metabolism of proteins yields one molecule of ATP by substrate level phosphorylation per amino acid molecule oxidized.
The main mechanism of energy conservation in T. sibiricus appears to be proton translocation by three NADH:quinone oxidoreductase (complex I)-related systems: two membrane-bound proton-reducing H2-evolving hydrogenase complexes, MBH 1 and MBH 2 (energy-converting hydrogenases, Ech) (47), and one membrane-bound NADP+-reducing complex (MBX) eventually linked to the reduction of S° (48). The electron donor for these complexes is likely reduced ferredoxin, which is formed in ferredoxin-dependent reactions in the pathways of carbohydrate and protein oxidation. The function of H2-evolving hydrogenase complexes is supported by the observed evolution of H2 during the growth of T. sibiricus in the absence of S° as an electron acceptor (e.g., on peptone).
The genome of T. sibiricus contains two similar 14-gene operons that encode MBH 1 and MBH 2 (see Table S2 in the supplemental material) and are related to the MBH operon of P
, which has the same gene order and encodes a six-subunit proton-translocating NiFe hydrogenase and multisubunit Na+
). It should be noted that two MBH operons are present in D
), whereas single MBH operons were found in P
. A 13-gene operon, MBX (Tsib_1905-Tsib_1917), is related to the MBX operon of P
and has the same gene order (48
In the absence of S°, T. sibiricus presumably conserves energy by the function of MBH 1 and MBH 2 with reduced ferredoxin as an electron donor and protons as an electron acceptor. Thereby, protons are reduced to H2 and a transmembrane proton gradient is established.
The growth of T. sibiricus is inhibited by high H2 concentrations and stimulated by the presence of S° as an electron acceptor. In the presence of S°, T. sibiricus conserves energy by the function of MBX, as proposed for P
). MBX oxidizes reduced ferredoxin, translocates protons, and reduces NADP+. NADPH is reoxidized by NADPH:S° oxidoreductase (Tsib_0263), which transfers electrons to S° to produce H2S.
The proton gradient established by the functions of MBH 1, MBH 2, and MBX is used by ATP synthase for the synthesis of ATP. The genome of T. sibiricus encodes a single Na+-ATP synthase, specified by the genes atpHIKECFABD (Tsib_1798 to Tsib_1790). Assuming that the Ech hydrogenases and MBX complex translocate protons, the presence of Na+ transporters in the gene clusters of MBH 1, MBH 2, and MBX can be explained by the requirement to convert the proton gradient into a Na+ ion gradient used for ATP synthesis by the Na+-ATP synthase of T. sibiricus. Note that relatively high salinity (optimal growth at 20 g liter−1 NaCl) is required for growth of T. sibiricus.
In addition to S°, some archaea can respire with sulfate, thiosulfate, or nitrate. However, genome analysis suggests that these compounds cannot be employed as electron acceptors by T. sibiricus because homologs of the currently known enzymes of the dissimilatory sulfate and thiosulfate reduction pathway (sulfate adenylyltransferase, adenylylsulfate reductase, sulfite reductase, and thiosulfate reductase), as well as the dissimilatory nitrate reductase and nitrite reductase, are not encoded.
The metabolic pathways of substrate utilization and energy generation identified in the genome of T. sibiricus are presented in Fig. 2. The genomic sequence provided a basis for the discovery of many new physiological features of this organism, which was initially described as a specialized utilizer of proteinaceous substrates. In addition to the expected protein degradation and metabolism machinery, the genome encodes numerous extracellular and intracellular enzymes for carbohydrate degradation and sugar transport systems. In agreement with the genes identified, we have found T. sibiricus to be capable of growing on oligomeric and polymeric α-linked (maltose and dextran) and β-linked (cellobiose, cellulose, laminarin, and agarose) carbohydrates. The ability to grow on agarose was previously found in Desulfurococcus fermentans
) but was never shown for representatives of the order Thermococcales or other hyperthermophilic Euryarchaeota. In the case of T. sibiricus, this capacity is supported by genome analysis, which revealed the presence of a β-agarase gene. The complete set of genes responsible for the hydrolysis of β-linked polysaccharides, their transport into the cell, and intracellular hydrolysis of oligomers to monosaccharides is located on a saccharolytic gene island, which is absent in the genomes of other members of the order Thermococcales and was probably acquired by T. sibiricus by lateral gene transfer. The second laterally acquired gene island contains genes for maltose and trehalose degradation.
FIG. 2. Overview of catabolic pathways encoded by the T. sibiricus genome. Substrates utilized are in bold, enzymes and proteins identified on the genome are in blue, and energy-rich intermediate compounds are in red. Panels: A, utilization of carbohydrates; (more ...)
Moreover, T. sibiricus probably possesses a new mechanism of n-alkane degradation, since its growth is stimulated by hexadecane and no enzymes of currently known pathways are encoded. Lipolytic growth on triacylglycerides, as observed here for T. sibiricus, has not been reported so far for hyperthermophilic archaea. This growth is apparently enabled by one or more extracellular true lipases encoded by the genome.
The new physiological features of T. sibiricus are consistent with the proposal of the indigenous origin of the organism in the never-flooded oil-bearing Jurassic horizon of the Samotlor oil reservoir, where it was found (32), and explain its survival over geologic time, as well as its proliferation in this habitat. Indeed, in addition to amino acids present in oil reservoirs at limiting concentrations (53), its growth could be supported by cellulose, agarose, laminarin, and triglycerides from oceanic sediments, as well as alkanes from crude oil. Considering that agarose and laminarin are components of algae, the substrate spectrum of T. sibiricus and the enzymes it encodes indicate the ability to metabolize buried organic matter from the original oceanic sediment.