The genome of S. marinus F1 consists of a circular chromosome of 1.57 Mbp. There is one copy each of 5S, 16S, and 23S ribosomal RNA. About 59% of protein-coding genes begin with an AUG codon, 8% with GUG, and 33% with UUG. The low number of GUG start codons reflects the low GC content of this genome (35.7% GC). The ribosomal protein L12ae gene (Smar_1096) does not have a valid start codon, but this is likely to be an essential gene. Based on alignment with L12ae proteins from other archaea, it appears that the S. marinus gene begins with an ATC start codon. S. marinus has 12 regions of CRISPR repeats containing between 5 and 17 repeats. Twelve CRISPR-associated proteins are found in the vicinity of three of the repeats, between coordinates 323,400 and 345,500 (Smar_0308-Smar_0325), and one other CRISPR-associated protein is found at a different location not close to any repeats (Smar_1195).
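The GC content and start codon percentages quoted above are simple summary statistics. As a minimal sketch of how such figures could be derived from a set of coding sequences, the following Python functions compute both. The function names and the toy sequences are hypothetical and are not taken from the S. marinus annotation pipeline.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def start_codon_usage(cds_list):
    """Return the percentage of coding sequences that begin with each codon.

    Codons are reported in RNA notation (T replaced by U) to match the text.
    """
    counts = Counter(cds[:3].upper().replace("T", "U") for cds in cds_list)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 100.0 * n / total for codon, n in counts.items()}


# Hypothetical toy sequences for illustration only (not S. marinus genes).
toy_cds = ["ATGAAATAA", "ATGCCCTAA", "TTGAAATAA", "GTGAAATGA"]
usage = start_codon_usage(toy_cds)
```

Applied to a full set of predicted genes, `start_codon_usage` would give a breakdown of the kind reported in the text, and `gc_content` applied to the chromosome sequence would give the overall GC fraction.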
The genome statistics for S. marinus and the two other sulfur-reducing crenarchaeotes are presented in Table . While the genome of H. butylicus is larger than that of S. marinus, they both have approximately the same number of genes due to the lower coding percentage of H. butylicus. T. pendens has a larger genome and a greater number of genes than the other two (discussed below). The GC content of S. marinus is much lower than the others, but this is not unusual for a hyperthermophile. It is in the same range as the GC content of the Sulfolobus genomes, while Methanocaldococcus jannaschii and Nanoarchaeum equitans have lower GC contents (31% and 32% respectively). T. pendens has a much higher percentage of genes in paralog clusters than the others, suggesting that gene duplication and divergence have been more prevalent in this genome. S. marinus has a smaller percentage of genes with signal peptides. In all three genomes the predicted exported proteins are primarily ABC transporter substrate-binding proteins and hypothetical proteins. S. marinus has approximately the same number of ABC transporters for uptake of nutrients as H. butylicus, but they both have fewer than T. pendens.
T. pendens has about 270 more protein-coding genes than the other two, but only about 150 more genes with COG hits, suggesting that about 120 of the additional genes in T. pendens are hypothetical proteins. We compared COG categories [9] between the three crenarchaeotes to determine which categories were more prevalent in T. pendens compared to the other two (Table ). T. pendens has a higher number of genes in many categories, suggesting that the additional genes are spread out among a number of cellular processes. The three categories with the greatest number of additional genes in T. pendens are carbohydrate metabolism and transport, cell wall/membrane/envelope biogenesis, and function unknown. The greater number of carbohydrate-associated genes is mainly due to a larger number of transporters. T. pendens has more ABC transporters than the other two and a phosphotransferase (PTS) system transporter, as well as a higher number of transporters assigned to COG2814, arabinose efflux permease, which are transporters of the major facilitator superfamily. In addition, T. pendens has three sugar kinases of COG1070, while the other two have none. Thus T. pendens can probably take up and utilize a greater number of carbohydrates than the other two. The greater number of cell wall-associated genes in T. pendens is mainly due to a greater number of glycosyltransferases (COG0438) and nucleotide sugar metabolic enzymes. This suggests that T. pendens has a greater variety of sugars attached to lipids and/or proteins on the outside of the cell.
Comparison of COG categories among the three sulfur-reducing crenarchaeotes.
The S. marinus genome contains several protein families not found before in crenarchaeotes, and these are discussed below. S. marinus is the first crenarchaeote found to have an arginine decarboxylase belonging to COG1166 (Smar_0204), which includes the speA gene of E. coli. This protein family is also found in one euryarchaeote, Methanosaeta thermophila. Most euryarchaeota have a pyruvoyl-dependent arginine decarboxylase [10]. T. pendens and Cenarchaeum symbiosum also contain this type of enzyme. No arginine decarboxylase has been identified in other crenarchaeote genomes. Phylogenetic analysis of the S. marinus arginine decarboxylase (not shown) does not indicate a clear case of lateral gene transfer, and this enzyme was not identified during the search for laterally transferred genes (see below).
S. marinus contains a probable cell surface protein (Smar_0566) with 4 copies of the pfam03640 repeat, which has not been found in any other crenarchaeal genome. This repeat is present in two methanogens, Candidatus Methanoregula boonei and Candidatus Methanosphaerula palustris. It is also found in a wide variety of bacteria, but its function is unknown.
S. marinus is unique among crenarchaeotes in having a sodium ion-translocating decarboxylase for energy generation (Smar_1503-1504). It also has three putative operons containing subunits of multisubunit cation/proton antiporters, although these are likely to belong to large membrane-bound ion-translocating enzyme complexes rather than acting as cation antiporters (see below). S. marinus is the first crenarchaeote found to have a type I restriction-modification system (Smar_0761-0763).
S. marinus has 5 putative transposable elements. Phylogenetic analysis shows that all of them belong to family IS607 (not shown). The characterized members of this IS family contain two ORFs. In S. marinus one of the elements contains two ORFs while the other four contain only one ORF. In the S. marinus element with two ORFs, the first (Smar_0846) is truncated relative to other members of the family, and is likely to be a pseudogene, while the second (Smar_0847) is intact. The four elements with one ORF share a high degree of similarity to each other (Smar_0083, Smar_0767, Smar_1150, Smar_1546), suggesting that they have been recently duplicated. In addition, there are 14 copies of a repeated sequence of approximately 260 nucleotides, although some of the repeats are truncated at one or both ends. These repeats are likely to be miniature inverted-repeat transposable elements (MITEs) as they are flanked by inverted repeats and have similarity to a region of DNA upstream of the group of four transposase ORFs (Figure ). MITEs have previously been identified in some archaeal genomes [11]. Two ORFs are disrupted by MITEs: one encoding a protein with ABC transporter ATPase and acetyltransferase domains (Smar_0733) and one encoding a PIN domain protein (Smar_0327/0328). The presence of disrupted genes suggests that the MITEs have been active recently, although they do not appear to have had a major impact on the genome content.
Figure 1 Alignment of putative miniature inverted-repeat transposable elements (MITEs) from S. marinus. Start and end coordinates are given for each putative MITE. Below the MITEs are the upstream regions of four related transposases with start and end coordinates.
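The text identifies the repeats as probable MITEs partly because they are flanked by inverted repeats. As a rough illustration of what that criterion means computationally, the sketch below measures the longest perfect terminal inverted repeat of a candidate element. It is not the method used in this study; the function names, the `max_len` parameter, and the example sequence are hypothetical, and real terminal inverted repeats are often imperfect, which this simple check does not allow for.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(element: str, max_len: int = 30) -> int:
    """Length of the longest perfect terminal inverted repeat of an element.

    The 5' end is compared, base by base, with the reverse complement of the
    3' end. Matching stops at the first mismatch, at max_len, or at half the
    element length (so the two arms cannot overlap).
    """
    element = element.upper()
    rc = reverse_complement(element)
    limit = min(max_len, len(element) // 2)
    n = 0
    while n < limit and element[n] == rc[n]:
        n += 1
    return n


# Hypothetical element: a 7-bp arm, a spacer, and the arm's reverse complement.
candidate = "GGCATTC" + "AAAAAAAAAA" + "GAATGCC"
tir_length = terminal_inverted_repeat_length(candidate)
```

A candidate with no matching ends returns 0, so a minimum arm length could be used as a simple filter before inspecting alignments by eye.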
Twenty-one probable laterally transferred genes were identified using the program SIGI-HMM [12]. One gene is by itself (Smar_0375), there are three pairs of genes (Smar_0568-0569, Smar_0846-0847, and Smar_1144-1145), and there is one cluster of 17 genes (Smar_1525-1541) in which 14 of the genes are predicted to be laterally transferred. Twelve of the laterally transferred genes are predicted to have come from other Crenarchaeota, six from Euryarchaeota, and the remaining three have unknown donors. Six of the 17 genes are likely to be pseudogenes, suggesting that they were transferred and are now degrading. From these findings we conclude that lateral transfer has not played a large role in shaping S. marinus gene content, and most if not all gene transfers have come from other archaea.
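As a quick consistency check on the counts given above, the tallies by genomic location and by predicted donor can be summed. The snippet below is only an arithmetic sketch using the numbers stated in the text; the variable names are illustrative.

```python
# Counts by genomic arrangement, as stated in the text.
singletons = 1            # Smar_0375
gene_pairs = 3            # three pairs of adjacent genes
cluster_transferred = 14  # predicted transfers within the 17-gene cluster

total_by_location = singletons + 2 * gene_pairs + cluster_transferred

# Counts by predicted donor, as stated in the text.
donors = {"Crenarchaeota": 12, "Euryarchaeota": 6, "unknown": 3}
total_by_donor = sum(donors.values())

# Fraction of predicted transfers with an identified archaeal donor.
archaeal_fraction = (
    donors["Crenarchaeota"] + donors["Euryarchaeota"]
) / total_by_donor
```

Both totals come to 21, matching the number of genes reported, and 18 of the 21 predicted transfers have an identified archaeal donor.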
The presence of transporters for peptides and carbohydrates suggests that both types of compounds can serve as carbon and energy sources. S. marinus has four ABC transporters for carbohydrates (Smar_0088-0091, Smar_0108-0111, Smar_0299-0302, Smar_1146-1149) and two for peptides (Smar_0270-0274, Smar_0342-0346). It has a carbohydrate secondary transporter of the glycoside-pentoside-hexuronide (GPH) family (Smar_0710), and it is the first crenarchaeote found to have a peptide transporter of the oligopeptide transporter (OPT) family (Smar_1400). There are no ABC transporters for amino acids, but a probable amino acid transporter of the neurotransmitter:sodium symporter (NSS) family is present (Smar_0285). The presence of secondary transporters (GPH, OPT, and NSS), which have low affinity and high capacity, suggests that there are times when S. marinus is exposed to high levels of nutrients, and it can conserve energy by using secondary transporters instead of ATP-dependent transporters.
S. marinus has a glycolysis pathway similar to Aeropyrum pernix, with ATP-dependent glucokinase (Smar_1514) and phosphofructokinase (Smar_0007). Glycogen synthase (Smar_1393) and phosphorylase (Smar_0246) are present, suggesting that the dark granules observed in S. marinus cells are composed of glycogen. Similar to other crenarchaeotes and thermococci, S. marinus has pyruvate:ferredoxin oxidoreductase (Smar_1447-1450) and ADP-forming acetyl-CoA synthase (Smar_0449, Smar_1241-1242) for ATP synthesis from pyruvate. Three other 2-ketoacid:ferredoxin oxidoreductases are present (Smar_0291-292, Smar_0997-1000, Smar_1443-1444) that are probably involved in amino acid degradation.
S. marinus is unique in Crenarchaeota in having a sodium-translocating decarboxylase. Smar_1504 and Smar_1503 encode the beta and gamma subunits (beta and delta in methylmalonyl-CoA decarboxylase). There are two possibilities for the activity of this decarboxylase (Figure ). With Smar_1426 and Smar_1427 these genes could form a methylmalonyl-CoA decarboxylase. Smar_1426 and Smar_1504 are closely related to predicted methylmalonyl-CoA decarboxylase subunits of Pyrococcus species. This enzyme would be involved in catabolism of succinyl-CoA resulting from glutamate degradation via a 2-oxoacid:ferredoxin oxidoreductase (Figure ). However, methylmalonyl-CoA mutase and epimerase were not found in the genome. The other possible function is oxaloacetate decarboxylase with Smar_0341 as the alpha subunit. This would be involved in catabolism of aspartate (Figure ). However, Smar_0341 is also related to pyruvate carboxylase B subunits of euryarchaeotes, and could interact with Smar_0140 to form this enzyme instead of or in addition to a sodium-transporting decarboxylase.
Two possible functions of the sodium ion-translocating decarboxylase of S. marinus in their metabolic contexts.
S. marinus, like the other heterotrophic crenarchaeotes H. butylicus and T. pendens, has lost almost all amino acid biosynthetic enzymes, although it has retained a few pathways for specific physiological reasons. For instance, glutamine is needed for its function as a nitrogen donor. Like the other heterotrophic crenarchaeotes, S. marinus can make pyrimidines but not purines. Enzymes for synthesis of several cofactors are present in S. marinus, in contrast to T. pendens, which lacks many cofactor synthesis pathways. S. marinus can likely synthesize riboflavin, pyridoxine, and coenzyme A, but it probably must acquire heme from the environment.
Electron transport/sulfur reduction
S. marinus requires sulfur for growth and reduces it to sulfide [4], but it lacks homologs of proteins implicated in sulfur reduction in other organisms. It has no genes similar to sulfhydrogenases [13] or the recently discovered NADPH:sulfur oxidoreductase [15] from P. furiosus. It also lacks genes with similarity to the molybdoenzymes polysulfide reductase of Wolinella succinogenes, sulfur reductase of Acidianus ambivalens, sulfur reductase of Aquifex aeolicus, and thiosulfate/sulfur reductase of Salmonella enterica. S. marinus has a gene (Smar_1055) with 56% similarity to sulfide dehydrogenase SudA subunit from P. furiosus, but this gene is shorter than the P. furiosus gene by 120 amino acids and the beta subunit is not present in S. marinus. Thus, this enzyme is unlikely to be present in S. marinus. However, S. marinus has three putative operons similar in composition to the mbh operons of Thermococcales (Table ). These multisubunit complexes are not found in any other sequenced crenarchaeote. The mbh operon from P. furiosus encodes a membrane-bound hydrogenase that oxidizes ferredoxin [21], while the mbx operon has a yet to be defined role in electron transfer. Its proposed function is the transfer of electrons from ferredoxin to NADPH coupled with proton translocation across the cell membrane [15]. A similar complex present only in Pyrococcus abyssi (PAB1395-1401) is adjacent to formate dehydrogenase subunits and has similarity to E. coli hydrogenases 3 and 4. Thus, it is likely to be a formate hydrogen lyase.
Sulfur reduction enzymes and their presence in the three sulfur-reducing heterotrophic crenarchaeotes.
The S. marinus mbh/mbx-related complexes contain a set of proteins similar to components of multisubunit cation/proton antiporters and another set with similarity to NADH:ubiquinone oxidoreductase subunits (Table ). The S. marinus antiporter-related subunits show high similarity to each other and to the corresponding subunits of the P. abyssi putative formate hydrogen lyase. S. marinus does not have an identifiable formate dehydrogenase, so these complexes likely have a different function in S. marinus. The S. marinus and P. abyssi proteins form a distinct cluster separate from mbh and mbx complexes and from the related cation/proton antiporters (Figure ). In contrast, the NADH:ubiquinone oxidoreductase-related subunits in the S. marinus putative operons are not closely related to each other or to the corresponding P. abyssi formate hydrogen lyase proteins. These findings indicate that the antiporter-related subunits form a cassette that has been duplicated in S. marinus and combined with NADH:ubiquinone oxidoreductase-related subunits that are divergent in sequence.
Subunit composition of multisubunit membrane-bound complexes from Pyrococcus species and S. marinus.
Phylogenetic tree of proteins related to antiporter subunit mnhE/mrpE/phaE.
S. marinus produces hydrogen when sulfur is limiting [4]. Two of the multisubunit complexes are potentially involved in hydrogen production. One set of S. marinus proteins (Smar_1060-Smar_1063) clusters strongly with E. coli hydrogenases 3 and 4 in phylogenetic trees, and may form a membrane-bound hydrogenase. Smar_0018 and Smar_0020 have similarity (61% and 39%, respectively) to subunits of the Methanosarcina mazei ech hydrogenase, and hydrogenase accessory proteins are found in their vicinity (Smar_0012-0013, Smar_0015). It is likely that at least one of these clusters is involved in hydrogen production.
The other complexes may be involved in sulfur reduction either directly or indirectly. One of the clusters (Smar_1057-1071) is close on the chromosome to a pyridine nucleotide-disulfide oxidoreductase (Smar_1055). It is possible that this cluster is involved in sulfur respiration, where Smar_1055 acts as a NAD(P)H-dependent polysulfide reductase and the other ORFs are involved in the generation of NAD(P)H through a membrane-based electron transport system that oxidizes reduced ferredoxin and translocates protons across the membrane. The system would allow energy generation from an overall sulfur-dependent oxidation of peptides and amino acids and it would be similar to the mbx-NAD(P)H elemental sulfur oxidoreductase (NSR) system that has been described for P. furiosus.
Comparison of the three sulfur-reducing crenarchaeotes
Spectral clustering was used to create protein clusters from the three anaerobic sulfur-reducing heterotrophs, and the clusters shared by all three or by pairs of the three were derived (Figure ) [see Additional file 1]. The three organisms share 571 core clusters, somewhat more than the conserved crenarchaeal core of 336 determined by Makarova et al. [22]. Among the clusters conserved among the three but not found in all Crenarchaeota are the subunits of ABC transporters for sugars, peptides, and amino acids, which are required for their heterotrophic lifestyle. Also falling into this group are the ferrous iron transporter proteins FeoA and FeoB and the anaerobic form of ribonucleotide reductase, proteins which reflect their anaerobicity. S. marinus and H. butylicus have almost twice as many shared clusters (225) as either one has with T. pendens (119 or 126). This is due to their closer phylogenetic relationship. S. marinus and H. butylicus both belong to the order Desulfurococcales while T. pendens belongs to the order Thermoproteales.
Venn diagram showing genes shared between S. marinus, H. butylicus, and T. pendens.
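Once each genome has been reduced to a set of protein cluster identifiers, the regions of a three-way Venn diagram follow from set operations. The sketch below illustrates this with hypothetical integer cluster IDs. It does not reproduce the spectral clustering step, and it assumes that the pairwise counts exclude clusters shared by all three genomes, which the text does not state explicitly.

```python
def venn_counts(a: set, b: set, c: set) -> dict:
    """Count clusters in each region of a three-set Venn diagram."""
    return {
        "all_three": len(a & b & c),
        "a_b_only": len((a & b) - c),
        "a_c_only": len((a & c) - b),
        "b_c_only": len((b & c) - a),
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
    }


# Hypothetical cluster IDs for illustration; not the clusters from this study.
smar = {1, 2, 3, 4, 5}
hbut = {1, 2, 3, 6}
tpen = {1, 4, 6, 7}

counts = venn_counts(smar, hbut, tpen)
```

In this toy example the first two sets share more clusters with each other than either shares with the third, which is the same qualitative pattern described for S. marinus and H. butylicus relative to T. pendens.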
The major difference in habitat between these three organisms is that S. marinus and H. butylicus were isolated from marine environments [1] while T. pendens was isolated from a terrestrial solfatara [2]. Marine environments have relatively high concentrations of sodium and potassium compared to terrestrial springs, and this influences the complement of transporters encoded by the three genomes. For example, S. marinus and H. butylicus use the Trk type of potassium transporter (COG0168), which is a proton or sodium symporter, while T. pendens uses the more energy-intensive ATP-dependent kdp-type potassium transporter (COG2060, COG2216, COG2156). Also, S. marinus and H. butylicus have a greater number and variety of sodium symporters than T. pendens. They both have sodium-dependent multidrug efflux pumps of the MATE family (COG0534) and amino acid transporters of the neurotransmitter:sodium symporter family (pfam00209), while only S. marinus has a transporter of the sodium:solute symporter family (pfam00474).
Both T. pendens and H. butylicus have formate dehydrogenases while S. marinus lacks this enzyme. Formate can be used as an electron donor with sulfur as electron acceptor to generate energy. S. marinus also lacks the FdhE protein, which is involved in formate dehydrogenase formation, while the other two have it.
There are also differences in the ability to utilize carbohydrates among the three organisms. As discussed above, T. pendens has a greater number of carbohydrate transporters than the other two. According to the CAZy database (http://www.cazy.org), H. butylicus has no glycosyl hydrolases, while S. marinus has ten and T. pendens has fifteen. Also, H. butylicus apparently does not store glycogen as it lacks glycogen synthase and phosphorylase, but the other two have these. H. butylicus also lacks enzymes for utilization of galactose and N-acetylglucosamine. Surprisingly, while S. marinus and T. pendens have probable glucokinases related to the characterized Aeropyrum pernix enzyme, H. butylicus has a protein related to the broad-specificity hexokinase from Sulfolobus tokodaii. This suggests that, while H. butylicus may not be able to break down polysaccharides, it may be able to utilize simple sugars.
There are similarities and differences among the three genomes in the genes involved in biosynthesis. Many of the genes shared by S. marinus and H. butylicus but missing from T. pendens are involved in cofactor metabolism. T. pendens appears to be unable to make riboflavin, coenzyme A, pyridoxine, and possibly other cofactors, and it has transporters for biotin and riboflavin that are not found in the other two. Among the three organisms only H. butylicus has a heme biosynthesis pathway. On the other hand, all three organisms are unable to make most amino acids and purines, although they do have the pyrimidine biosynthetic pathway. S. marinus and H. butylicus have ABC transporters of the basic membrane protein family (pfam02608) that probably transport nucleosides [26], but T. pendens lacks this type of transporter. In fact, T. pendens does not have any identifiable nucleoside or nucleobase transporters, so it likely has undiscovered families to transport these compounds.
There are other differences between these three organisms that do not directly reflect the habitats they live in. H. butylicus surprisingly lacks some enzymes of central metabolism. It has no identifiable fructose-bisphosphate aldolase and no phosphoenolpyruvate synthase or pyruvate phosphate dikinase. Since fructose-bisphosphate aldolase is essential for hexose and pentose synthesis, H. butylicus likely has a novel version of this enzyme. H. butylicus also does not have an asparaginyl-tRNA synthetase. It is the only one of the three to have an Asp-tRNA(Asn)/Glu-tRNA(Gln) amidotransferase, although the A subunit of this enzyme (Hbut_0594) has a frameshift. Since this appears to be an essential enzyme for H. butylicus, the gene may still be functional.