Orthologs and alignments
The number of aligned and filtered OrthoMCL clusters containing at least four entries (i.e., genes from distinct genomes) was 3,853, their length ranging from 22 to 1,650 amino acids (262.542 on average). The concatenated supermatrix thus comprised 1,011,575 columns, including 708,296 variable and 353,936 parsimony-informative characters.
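The column categories reported above can be illustrated with a minimal sketch. It assumes the usual definitions (a variable column has at least two observed states; a parsimony-informative column has at least two states that each occur in at least two sequences) and assumes that gaps and missing data are ignored. The function name, the set of ignored characters, and the toy alignment are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter

# Assumption: gaps, missing data and fully ambiguous residues do not count as states.
IGNORED = set("-?X")

def column_stats(alignment):
    """Return (n_columns, n_variable, n_parsimony_informative) for aligned sequences."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_variable = 0
    n_informative = 0
    for column in zip(*alignment):
        counts = Counter(c for c in column if c not in IGNORED)
        if len(counts) >= 2:
            n_variable += 1
            # Informative: at least two states, each present in at least two sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                n_informative += 1
    return lengths.pop(), n_variable, n_informative

# Toy alignment (invented): columns 2, 4 and 5 vary; only columns 2 and 4 are informative.
toy = ["MKVLA", "MKVIA", "MRVIA", "MR-LG"]
print(column_stats(toy))
```

Applied to a concatenated supermatrix, the same counting would give the three totals quoted in the text, provided the same treatment of gaps was used.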
Phylogenetic inference
The ML phylogeny inferred from the concatenated gene alignments is shown in (species tree) together with ML and MP bootstrap support values. The final highest log likelihood obtained was −12,342,390.79, whereas the single best MP tree (excluding uninformative sites) had a score of 1,409,187. ML and MP topologies were identical. Support was maximum (100%) for all branches under ML, and maximum for all but four branches under MP; only a single branch entirely lacked support under MP. As only one genome per genus was included in the sample, there is no taxonomic subdivision of Halobacteriaceae to compare the tree with. However, outgroup taxonomy was well recovered, the tree showing the monophyly of Methanomicrobiales, Methanosarcinales, and Methanosarcinaceae, each of which was represented by at least two genomes.
Incongruence between gene trees and species tree
After reducing the dataset to the ingroup taxa and to the OrthoMCL clusters present in at least four ingroup genomes, total PBS per OrthoMCL cluster ranged from −219 to 142 (average: 4.941, standard deviation: 19.588, median: 1, MAD: 8.896). These data are plotted against the number of parsimony-informative characters in supplementary Figure S1. Within a total of 2,891 OrthoMCL clusters, 1,506 genes showed overall positive support and 764 showed overall negative support. Trees inferred from the five clusters least congruent with the species tree are depicted in Figure S2. They are uniformly characterized by high bootstrap support for groupings in conflict with the species tree estimate. Total PBS values per cluster vary between the COG categories (which could be assigned to 2,213 clusters; see Figure S1); on average, COGs related to information storage and processing display higher PBS than those associated with metabolism or cellular processes and signaling (Table S1), but individual COG categories may differ from this general trend (Table S1).
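The robust summary statistics quoted for total PBS (median and MAD) can be sketched as follows. The text does not state how the MAD was scaled; the reported value of 8.896 alongside a median of 1 is consistent with a raw median absolute deviation of 6 multiplied by the conventional constant 1.4826, but that is an inference, not something the source confirms. The function name and the PBS values below are invented for illustration.

```python
import statistics

def scaled_mad(values, constant=1.4826):
    """Median absolute deviation, scaled to be comparable to a standard deviation.

    Assumption: the constant 1.4826 (consistency factor for normal data) is used.
    """
    med = statistics.median(values)
    deviations = [abs(v - med) for v in values]
    return constant * statistics.median(deviations)

# Hypothetical per-cluster total PBS values (not data from the study).
toy_pbs = [-7, -5, 1, 1, 4, 7, 13]
print(statistics.median(toy_pbs), round(scaled_mad(toy_pbs), 3))
```

Unlike the standard deviation, this measure is barely affected by a few strongly negative clusters, which is why it is useful for a distribution with a long tail of incongruent genes.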
Core clusters
We used a spectral clustering method to generate gene clusters from the haloarchaeal genomes. There were 887 core clusters, those found in all of the haloarchaeal genomes, and these accounted for 40% to 50% of the genes in each genome (). As expected, the core clusters contain genes involved in basic cellular processes such as transcription, translation, DNA replication, DNA repair, RNA modification, protein modification, and protein secretion (
Table S2). The core clusters also include many genes involved in biosynthesis of essential metabolites – amino acids, purines and pyrimidines, lipids, and cofactors. This is somewhat unexpected as the haloarchaea are heterotrophs, but they appear to be relatively self-sufficient in being able to make most essential metabolites. Biosynthetic pathways in the haloarchaea have recently been reviewed
[42], so we will not go into more detail here. The number of genes in each genome belonging to all clusters ranges from 78% to 88% (), showing that 12% to 22% of the genes in each genome have no hits to genes in the other halophile genomes.
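As a minimal sketch of how core clusters and the share of each genome they cover could be derived from a cluster membership table: the data layout (cluster id mapped to genes per genome), the function names, and the toy values are all hypothetical, and the spectral clustering step itself is not reproduced here.

```python
def core_clusters(membership, genomes):
    """Clusters that contain at least one gene from every genome."""
    return {cid for cid, per_genome in membership.items()
            if all(per_genome.get(g) for g in genomes)}

def core_gene_fraction(membership, genome_sizes):
    """Fraction of each genome's genes that fall into core clusters."""
    core = core_clusters(membership, genome_sizes)
    fractions = {}
    for genome, total_genes in genome_sizes.items():
        in_core = sum(len(membership[cid][genome]) for cid in core)
        fractions[genome] = in_core / total_genes
    return fractions

# Toy data (invented): three genomes A, B, C and three clusters.
membership = {
    "clu1": {"A": ["a1"], "B": ["b1", "b2"], "C": ["c1"]},
    "clu2": {"A": ["a2"], "B": ["b3"]},            # missing from C, so not core
    "clu3": {"A": ["a3"], "B": ["b4"], "C": ["c2"]},
}
genome_sizes = {"A": 4, "B": 5, "C": 4}  # total gene counts per genome
print(sorted(core_clusters(membership, genome_sizes)))
print(core_gene_fraction(membership, genome_sizes))
```

Counting genes in any cluster, not only core clusters, in the same way would give the second percentage range reported above.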
Signature clusters
We also identified signature gene clusters, those that are shared by all haloarchaea but are not found in any other archaea. There are 112 of these clusters (
Table S3), 89 of which contain proteins with completely unknown function. Of the clusters with a predicted function, two are protein kinases related to the
Bacillus subtilis PrkA protein. These two kinase genes are adjacent to each other on the chromosome in all haloarchaeal genomes and are always found with two other genes: one with a domain of unknown function DUF444, and one related to
B. subtilis SpoVR, the function of which is unknown. Proteins of these three families are found together on the chromosome in many other microbial genomes suggesting functional linkage.
Halophiles are known to accumulate gamma-glutamylcysteine [43], and two of the signature clusters may be involved in gamma-glutamylcysteine metabolism. Cluster 491.1× contains proteins related to glutamate-cysteine ligase, and the H. volcanii member of cluster 491.1× was recently shown to have glutamate-cysteine ligase activity [44]. Cluster 1151× includes genes related to glutathione S-transferase, which inactivates toxic compounds by linking them to glutathione. In the halophiles, these may function as gamma-glutamylcysteine S-transferases.
Habitat-specific clusters
Of the ten haloarchaea with sequenced genomes, four were isolated from water and four were isolated from soil or sediment. The ones isolated from water are
H. walsbyi [45],
N. pharaonis [46],
H. marismortui [47], and
H. borinquense [48].
H. mukohataei and
H. turkmenica were isolated from saline soils
[49],
[50], while
H. volcanii and
H. utahensis were isolated from lake sediments
[51],
[52]. The water halophiles and soil/sediment halophiles do not form separate clades in the phylogenetic tree (). We looked for clusters present in all water-isolated halophiles that are not present in soil/sediment halophiles and
vice versa. There were no clusters specific to water halophiles and only three specific to soil/sediment haloarchaea (
Table S4). Proteins belonging to two of the soil/sediment-specific clusters (326.1.0× and 2168×) are often found in the vicinity of nucleotide-sugar metabolic enzymes and glycosyl transferases, suggesting they are involved in cell wall biosynthesis
[53].
Since there were few protein clusters completely specific to the water or soil/sediment halophiles, we looked for clusters present in three out of four organisms from one group and absent from the other group. There were 16 clusters present in three out of four water halophiles, of which 11 contain hypothetical proteins and four have only general functional annotations (
Table S4). The only cluster with a specific annotation is cluster 1816×, formate-tetrahydrofolate ligase.
Of the 26 clusters found in three out of four soil/sediment halophiles and not present in water halophiles, 11 are hypothetical proteins (
Table S4). Several of the clusters are involved in polysaccharide degradation. These include a glycosyl hydrolase of family GH4, an alpha-L-arabinofuranosidase of family GH51, a polysaccharide deacetylase and a trehalose utilization protein. Two additional clusters present in three out of four soil/sediment halophiles encode a monooxygenase and an acyltransferase, which are found adjacent to each other on the chromosome and flanked by two IucA/IucC family proteins. These proteins are likely to be involved in siderophore synthesis.
H. borinquense has three IucA/IucC family proteins but lacks the monooxygenase and acyltransferase, so it is unclear whether it has a complete siderophore biosynthesis pathway.
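The habitat-specific search described in this section (clusters present in all, or in three out of four, genomes of one habitat group and absent from the other group) can be sketched as a filter on a presence table. The function name, the genome labels W1–W4 and S1–S4, and the cluster ids are hypothetical placeholders; the sketch returns the number of group members carrying each cluster so that the strict (four of four) and relaxed (three of four) criteria can be applied separately.

```python
def group_only_counts(presence, group, other):
    """For clusters absent from every genome in `other`, count carriers in `group`.

    `presence` maps a cluster id to the set of genomes that contain it.
    """
    group, other = set(group), set(other)
    return {cid: len(genomes & group)
            for cid, genomes in presence.items()
            if genomes & group and not genomes & other}

water = ["W1", "W2", "W3", "W4"]   # placeholder labels for water isolates
soil = ["S1", "S2", "S3", "S4"]    # placeholder labels for soil/sediment isolates

presence = {
    "x1": {"W1", "W2", "W3", "W4"},        # in all water genomes only
    "x2": {"W1", "W2", "W3"},              # in three of four water genomes only
    "x3": {"W1", "W2", "W3", "S1"},        # also in a soil genome, so excluded
    "x4": {"S1", "S2", "S3", "S4", "W1"},  # also in a water genome, so excluded
    "x5": {"S1", "S2", "S4"},              # in three of four soil genomes only
}

water_counts = group_only_counts(presence, water, soil)
strict = sorted(c for c, n in water_counts.items() if n == len(water))
relaxed = sorted(c for c, n in water_counts.items() if n == len(water) - 1)
print(strict, relaxed, group_only_counts(presence, soil, water))
```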
All-but-one clusters
We also looked for clusters conserved in all but one genome. These probably indicate recent gene losses in each species. The genomes fell into two groups – those that had 20 or fewer such clusters and those that had more than 60 (). The three genomes that had lost more than 60 clusters were H. salinarum, H. walsbyi, and H. utahensis. Many of the clusters lost from H. salinarum are involved in amino acid synthesis, including genes for the synthesis of glutamate, lysine, ornithine, methionine, and branched chain amino acids. H. salinarum does not compensate for these losses with more amino acid transporters or secreted proteases than the other haloarchaea, but it is one of only two haloarchaea to have a putative peptide symporter of the OPT family (TC 2.A.67). Symporters have low affinity but high capacity, suggesting that H. salinarum may prefer to live where there is an ample supply of peptides. Of the clusters not present in H. walsbyi, many are involved in flagellum biosynthesis and chemotaxis. However, H. walsbyi has a set of gas vesicle proteins to enable motility in the absence of flagella. The clusters present in all genomes except H. utahensis include several enzymes involved in cobalamin synthesis and several enzymes of biotin utilization and propionate metabolism. H. utahensis appears to lack the enzymes for the early steps of cobalamin biosynthesis up to the incorporation of cobalt, but all of the halophiles, including H. utahensis, contain the enzymes for the later steps of cobalamin biosynthesis. Biotin and propionate metabolism are discussed further below.
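The all-but-one criterion can be sketched as follows: a cluster is credited to a genome when that genome is the only one lacking it. The function name, genome labels, and cluster ids are hypothetical; the presence table is assumed to map each cluster id to the set of genomes containing it.

```python
def all_but_one(presence, genomes):
    """Map each genome to the clusters found in every other genome but not in it."""
    genomes = set(genomes)
    lost = {g: [] for g in sorted(genomes)}
    for cid, present in presence.items():
        absent = genomes - present
        if len(absent) == 1:           # exactly one genome lacks this cluster
            lost[absent.pop()].append(cid)
    return lost

# Toy data (invented): four genomes and five clusters.
genomes = ["G1", "G2", "G3", "G4"]
presence = {
    "k1": {"G1", "G2", "G3"},         # missing only from G4
    "k2": {"G1", "G2", "G3", "G4"},   # core cluster, ignored
    "k3": {"G2", "G3", "G4"},         # missing only from G1
    "k4": {"G1", "G2"},               # missing from two genomes, ignored
    "k5": {"G1", "G2", "G3"},         # missing only from G4
}
losses = all_but_one(presence, genomes)
print({g: len(c) for g, c in losses.items()})
```

The per-genome counts from such a table are what separate the genomes into the two groups described above.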
Central metabolism
Some haloarchaea are known to use the semi-phosphorylated Entner-Doudoroff (ED) pathway for glucose degradation
[54],
[55], and genes encoding enzymes of this pathway have been identified in several haloarchaea
[42]. With the addition of the new genomes, we find that the semi-phosphorylated Entner-Doudoroff pathway is likely to be present in all sequenced haloarchaea except
N. pharaonis, which does not utilize carbohydrates. Aldolases belonging to two different protein families may be involved. All of the halophiles except
H. salinarum and
N. pharaonis have one or more bacterial-type 2-keto-3-deoxy-6-phosphogluconate (KDPGlc) aldolase (COG0800). Also all except
N. pharaonis have at least one potential aldolase related to the characterized
Sulfolobus aldolase (COG0329), which is active on KDPGlc and unphosphorylated 2-keto-3-deoxygluconate (KDGlc)
[56]. The enzymes of the semi-phosphorylated Entner-Doudoroff pathway are highly conserved in sequence among the haloarchaea, suggesting descent from a common ancestor.
The standard Embden-Meyerhof pathway of glycolysis appears to be incomplete in the halophiles as no 6-phosphofructokinase could be identified. This agrees with previous experimental studies and analysis
[42]. Gluconeogenesis is likely to be present in all of the halophile genomes with the possible exception of
H. utahensis. All except
H. utahensis have phosphoenolpyruvate (PEP) synthase and/or pyruvate, phosphate dikinase (COG0574). In addition,
H. lacusprofundi and
H. turkmenica have ATP-utilizing PEP carboxykinase (COG1866). In
H. utahensis the only enzyme that potentially can generate PEP for gluconeogenesis is pyruvate kinase. All except
H. utahensis have a fructose 1,6-bisphosphatase belonging to the same family as the
E. coli Fbp enzyme (COG0158). All of the halophiles including
H. utahensis have at least one gene belonging to COG0483, which includes inositol phosphatases and some archaeal fructose 1,6-bisphosphatases
[57].
H. utahensis has two genes belonging to this family, but they are weakly related to characterized fructose 1,6-bisphosphatases. These findings suggest that
H. utahensis may lack the gluconeogenesis pathway or have an unusual gluconeogenesis pathway.
Unlike the rest of the archaea, halophiles are thought to use the oxidative pentose phosphate pathway for generation of pentoses
[58]. This pathway also generates NADPH for anabolic pathways. All except
H. utahensis have a probable 6-phosphogluconate dehydrogenase (COG1023), the key enzyme of this pathway. In contrast,
H. utahensis is the only sequenced haloarchaeon to have transaldolase (Huta_0859) and transketolase (Huta_0860 and Huta_0861), the enzymes of the non-oxidative pentose phosphate pathway. For NADPH generation,
H. utahensis possesses genes encoding a NAD/NADP transhydrogenase (Huta_2005–2007). None of the other haloarchaea has the genes for this enzyme. The presence of these enzymes only in
H. utahensis suggests that they may have been acquired through lateral transfer, but phylogenetic analysis was unable to identify the donor (data not shown).
Nutrient transport
Table 2 presents an overview of nutrient transport in the haloarchaea. All of the haloarchaea have at least five symporters for amino acids and at least two ABC transporters for peptides. Since all except H. salinarum can synthesize most or all amino acids, this suggests that amino acids are an important carbon and energy source even in the species that can grow on carbohydrates. All of the haloarchaea also have at least one symporter for nucleosides or nucleobases. Carbohydrate transport is variable. Only half of the halophiles have symporters for sugars, and either none or one ABC transporter for sugars is found in the non-carbohydrate-utilizing organisms. Surprisingly, no transporters for sugars could be identified in the H. utahensis genome, suggesting that it uses uncharacterized families of sugar transporters.
Table 2. Nutrient transport in haloarchaea.
There appears to be a connection between some transporters and universal stress protein A (UspA) family proteins. Most amino acid transporters of the amino acid-polyamine-organocation (APC) family are either fused to a UspA domain or adjacent on the chromosome to a UspA protein, and some are both fused and adjacent to UspA family proteins (e.g. Htur_0566). This appears to be specific for the APC family as other potential amino acid symporters of the neurotransmitter:sodium symporter (NSS) family and the dicarboxylate/amino acid:cation symporter (DAACS) family are not associated with UspA family proteins. Several transporters of the formate-nitrite transporter (FNT) family are also fused or adjacent to UspA domains (Hmuk_1674, HQ1451A, NP6264A, Hlac_2299, and Htur_2705). The FNT family proteins with associated UspA domains are closely related to each other and to a transporter from
H. marismortui that lacks a UspA domain (rrnAC0187). They are likely to be formate transporters as the
H. marismortui and
H. mukohataei proteins are adjacent to enzymes of folate metabolism. Another transporter with adjacent UspA proteins is a putative acetate transporter of the solute:sodium symporter (SSS) family. This transporter is found in seven of the ten halophile genomes (e.g. NP5136A), and in all cases is followed by a UspA domain protein. In six of the seven genomes with this transporter, it is close on the chromosome to two acetyl-CoA synthetase genes (e.g. NP5128A and NP5132A). The transporter has highest similarity to subfamily 7 of the SSS family, which includes acetate, propionate, and phenylacetate transporters (see the Transporter Classification Database at
www.tcdb.org). UspA family proteins are expressed during many stressful conditions, and they are known to bind ATP, but their exact molecular function is unknown
[59]. The UspA domains associated with transporters may play a regulatory role, or may be involved in maintaining transporter function during stressful conditions. A recent report shows that a UspA domain protein is involved in regulation of a transporter
[60].
Secreted proteases
Since the halophiles have numerous transporters for amino acids and peptides, we analyzed the distribution of secreted proteases within their genomes. Only secreted proteases were considered because these are likely to be involved in the utilization of proteins as a nutrient source, while intracellular and integral membrane proteases are involved in a variety of cellular processes. We included proteases that have signal peptides as well as proteins that are likely to be attached to the membrane with the protease domain outside the cell. Signal peptidases (family S26) were excluded from the analysis since they have a specific cellular function. The numbers of secreted proteases in the genomes ranged from 3 to 11. Hierarchical clustering () shows that the halophiles fall into two groups with respect to protease distribution. The main feature separating these groups appears to be the presence or absence of secreted members of protease family S8, which includes subtilisin as well as halolysins from halophilic archaea. The organisms having secreted S8 proteases do not correspond to a habitat-specific or phylogenetic group. The presence of at least three secreted proteases in each genome suggests that all of the halophiles may be capable of degradation of extracellular proteins.
Amino acid utilization
Since all of the halophiles have at least five amino acid symporters and two peptide ABC transporters, we investigated pathways of amino acid utilization to see if all of the halophiles are capable of using many amino acids. A summary is provided in Table 3. Three degradation pathways were found in all ten genomes. All of them had an alanine dehydrogenase similar to the enzyme characterized in
Archaeoglobus fulgidus [61]. This enzyme could potentially be involved in synthesis of alanine as well as its degradation. All had at least one asparaginase from COG0252 or COG1446. Finally all of them had a pyruvoyl-dependent arginine decarboxylase similar to the enzyme characterized in
Methanocaldococcus jannaschii [62] and agmatinase. This combination of enzymes produces putrescine and urea from arginine. Additional enzymes of arginine utilization were present in some genomes. Six of the genomes have arginase, which produces ornithine and urea. None of the genomes had ornithine decarboxylase, but all of the ones that have arginase also have ornithine cyclodeaminase which produces proline.
Table 3. Amino acid degradation pathways in haloarchaea.
Several other amino acid degradation pathways are found in a subset of the genomes. The glycine cleavage system is found in all genomes except those of
H. walsbyi and
H. utahensis. This enzyme complex produces CO2, NH3, methylene-tetrahydrofolate (THF), and NADH. Methylene-THF has a variety of possible uses within the cell. Another pathway found in all but
H. walsbyi and
H. utahensis is isoleucine degradation. A 2-oxoacid dehydrogenase complex involved in isoleucine degradation was recently identified in
H. volcanii [63], and seven of the other halophiles have genes with at least 68% similarity to the
H. volcanii genes, indicating that they probably have the same function. Five of the genomes have tryptophanase, which produces pyruvate from tryptophan. A histidine degradation pathway with formiminoglutamate as an intermediate is also found in five of the genomes. Seven have proline dehydrogenase, but only
H. lacusprofundi has pyrroline-5-carboxylate dehydrogenase to complete proline conversion to glutamate. All of the genomes have threonine dehydratase, but this may be used only for biosynthesis. Five of the genomes in addition have threonine aldolase, which produces glycine and acetaldehyde. Finally, all of the genomes have glutamate dehydrogenase, which may have a biosynthetic role. Four of the genomes have glutamate mutase and methylaspartate ammonia-lyase. These are the first two enzymes of a four-step pathway that produces acetate and pyruvate from glutamate with mesaconate and citramalate as intermediates. However, these two enzymes are likely to be involved in a new pathway for acetate assimilation
[64].
Many of the pathways for amino acid degradation are found in a subset of the genomes. They could have been acquired independently by lateral gene transfer or lost in some species. The genes for amino acid degradation in the halophiles are closely related in sequence, suggesting that the common ancestor of haloarchaea was able to degrade many amino acids and that some organisms have lost these pathways.
Polysaccharide degradation
According to the distribution of glycosyl hydrolase domains and carbohydrate-binding modules, halophilic archaea can be divided into three groups: those that may be capable of degrading plant biomass (H. utahensis, H. turkmenica and to a lesser extent H. marismortui and H. volcanii), those harboring family 18 glycosyl hydrolases with possible chitinase activity (H. salinarum, H. mukohataei and H. borinquense) and organisms that are unlikely to degrade any externally provided polysaccharides (H. lacusprofundi, H. walsbyi and N. pharaonis) ().
H. utahensis and
H. turkmenica have the two largest sets of proteins with glycosyl hydrolase domains among halophilic archaea (43 and 44, respectively). However, their glycosyl hydrolase complements are markedly different. While
H. utahensis has five proteins of GH10 family and two proteins of GH11 family, which probably have xylanase activity,
H. turkmenica has only one GH10 protein and no GH11 members. The abundance of predicted xylanases in the
H. utahensis genome is in agreement with experimental data that showed xylan-degrading activity of this archaeon
[65]. The
H. utahensis genome contains seven GH5 family proteins and one GH9 family protein (as compared to three and zero in
H. turkmenica genome). These proteins may have endo-beta-glucanase activity, thus enabling
H. utahensis to degrade components of the plant cell wall. One of the GH5 proteins in
H. utahensis (Huta_2387) has been shown experimentally to have cellulolytic activity (T. Zhang et al., in press).
H. utahensis also has two GH94 proteins that may have cellobiose or cellodextrin phosphorylase activity. On the other hand, the
H. turkmenica genome encodes four GH32 family proteins predicted to have beta-fructosidase (levanase or invertase) activity that are absent from the
H. utahensis genome, while both genomes have several GH2 family proteins that may have beta-galactosidase activity.
Three of the genomes from haloarchaea isolated from soil or sediment encode enzymes involved in degradation of pectin. The backbone chains of pectin are made up of either homogalacturonan or rhamnogalacturonan with various side chains, and the main chains are linked together by α-1,5-arabinan chains
[66].
H. turkmenica has four family 1 polysaccharide lyases (PL) which likely have pectate lyase activity, three of which are close together on the chromosome (Htur_4783, Htur_4785, Htur_4789). Also in the vicinity of these three genes is a family 2 polysaccharide lyase related to pectate lyases (Htur_4786) and two glycosyl hydrolases, one of which may have polygalacturonase activity (Htur_4790). The other
H. turkmenica PL1 family protein (Htur_4440) is close to one of two pectin methylesterases (Htur_4438) and a rhamnogalacturonan acetylesterase (Htur_4445).
H. turkmenica also has two family 11 polysaccharide lyases (Htur_3890, Htur_3891) that are highly similar to a
B. subtilis rhamnogalacturonan lyase and a GH105 protein similar to the RhiN protein of
Dickeya (formerly
Erwinia)
chrysanthemi, which is involved in degradation of rhamnogalacturonate oligosaccharides
[67].
H. utahensis and
H. volcanii have much lower capacity for pectin degradation:
H. utahensis has two probable pectate lyases from family 1, while
H. volcanii has one pectate lyase and one pectin methylesterase. In addition to enzymes capable of degrading the main chains of pectin,
H. turkmenica,
H. utahensis, and
H. volcanii have GH43, GH51, and GH93 glycosyl hydrolases with similarity to endo- and exo-arabinases that may be capable of degrading the arabinan linking chains of pectin.
Galactose utilization
H. lacusprofundi and
H. mukohataei have been shown to grow on galactose
[49],
[68], but the genome sequences suggest that other haloarchaea can also utilize galactose. Also the genome sequences show that two different pathways for galactose metabolism may exist in haloarchaea: the Leloir pathway in
H. utahensis, and the De Ley-Doudoroff pathway in
H. lacusprofundi,
H. marismortui,
H. volcanii,
H. borinquense,
H. mukohataei, and
H. turkmenica.
H. utahensis is the only haloarchaeon with genes encoding the three enzymes of the Leloir pathway. No other archaeon possesses a gene for hexose 1-phosphate uridylyltransferase (COG1085, Huta_2170), and
H. volcanii is the only other haloarchaeon to have a probable galactokinase (HVO_1487). Six haloarchaea (listed above) have genes with high similarity (65–75%) to
E. coli galactonate dehydratase, one of the enzymes of the De Ley-Doudoroff pathway. Phylogenetic analysis shows that the genes for galactonate dehydratase in the haloarchaea cluster together (data not shown). In the De Ley-Doudoroff pathway, after 2-dehydro-3-deoxygalactonate (KDGal) is formed by galactonate dehydratase, it is phosphorylated by KDGal kinase to form 2-dehydro-3-deoxy-6-phosphogalactonate (KDPGal). KDPGal is then split by KDPGal aldolase to form pyruvate and glyceraldehyde 3-phosphate
[69]. None of the halophiles has a gene related to known KDGal kinases (COG3734). KDPGal aldolases belong to the same family as bacterial-type KDPGlc aldolases of the Entner-Doudoroff pathway (COG0800).
H. lacusprofundi has two proteins related to KDGlc kinase and two proteins related to bacterial-type KDPGlc aldolase. One kinase (Hlac_2870) and aldolase (Hlac_2860) are close on the chromosome to each other and to galactonate dehydratase (Hlac_2866), beta-galactosidase (Hlac_2868), and a probable alpha-galactosidase (Hlac_2869), suggesting that the kinase and aldolase may be involved in the utilization of galactose via the De Ley-Doudoroff pathway. Similarly,
H. volcanii has three COG0800 proteins, one of which (HVO_A0329) is close on the chromosome to a KDGlc kinase-related protein (HVO_A0328), galactonate dehydratase (HVO_A0331) and beta-galactosidase (HVO_A0326). Some of the halophiles that have galactonate dehydratase have only one protein related to KDGlc kinase and one related to KDPGlc aldolase. It is possible that in these organisms the proteins are bifunctional, working with both KDGlc and KDGal, similar to the proteins of the
Sulfolobus solfataricus ED pathway
[70].
Fructose utilization
Fructose can be utilized by some haloarchaea:
H. marismortui,
H. borinquense,
H. utahensis, and
H. turkmenica have been shown to grow on fructose
[48],
[50],
[52],
[71], while
H. mukohataei has been shown to grow on sucrose
[49] and thus will likely also metabolize fructose. The enzyme ketohexokinase was characterized in
Haloarcula vallismortis but the gene was not identified
[72].
H. marismortui,
H. volcanii, and
H. turkmenica each have one transporter of the phosphotransferase system (PTS), and operon evidence suggests that these are fructose transporters that produce fructose 1-phosphate. The PTS transporters from
H. volcanii and
H. turkmenica are close on the chromosome to putative fructose 1-phosphate kinases (COG1105), while in
H. volcanii the PTS proteins are also close to fructose bisphosphate aldolase. In addition to the three haloarchaea with PTS transporters,
H. utahensis and
H. mukohataei also have putative fructose 1-phosphate kinases. In
H. mukohataei the fructose 1-phosphate kinase (Hmuk_2661) is close on the chromosome to fructose bisphosphate aldolase (Hmuk_2663) and another putative sugar kinase (Hmuk_2662) which may be a ketohexokinase. Surprisingly
H. borinquense does not have a member of COG1105, despite its known ability to grow on fructose. The protein sequences of all components of the PTS system transporter and fructose 1-phosphate kinase are strongly conserved among the halophiles.
Xylose utilization
The pathway by which
H. volcanii utilizes xylose has recently been characterized
[73]. The pathway involves formation of xylonate, followed by two dehydratase steps to generate 2-oxoglutarate semialdehyde. This pathway also appears to be present in
H. turkmenica and
H. lacusprofundi, neither of which was previously known to utilize xylose.
H. marismortui is known to produce acid from xylose, and it appears to have 2-dehydro-3-deoxyxylonate dehydratase and xylose dehydrogenase, but it does not have a gene with high similarity to the
H. volcanii xylonate dehydratase.
H. utahensis is known to degrade xylan and to be able to grow on xylose
[52],
[65], and it uses a different pathway for xylose degradation. It has a xylose isomerase (COG2115, Huta_2443) and xylulokinase (TIGR01312, Huta_2446). The resulting D-xylulose 5-phosphate then feeds into the non-oxidative pentose phosphate pathway.
H. utahensis is the only one of the sequenced haloarchaea to have transaldolase and transketolase of the non-oxidative PPP, which allows it to use this pathway of xylose utilization.
H. borinquense is known to utilize xylose
[48], but it does not have identifiable genes for either of the pathways found in the other haloarchaea. Phylogenetic analysis of xylose isomerase using both neighbor joining (Clustal W) and Bayesian (MrBayes) methods shows that
H. utahensis xylose isomerase branches deeply within Firmicutes with high bootstrap support (not shown), but xylulokinase did not associate closely to any group of organisms.
Glucuronate utilization
Both
H. utahensis and
H. turkmenica have putative glucuronate isomerase (COG1904) and mannonate dehydratase (COG1312), suggesting that they may utilize glucuronate by the same pathway as found in
E. coli [74]. This pathway produces KDGlc which feeds into the Entner-Doudoroff pathway.
H. utahensis also has a probable alpha-glucuronidase (Huta_0871) belonging to glycosyl hydrolase family 67, that is adjacent on the chromosome to glucuronate isomerase (Huta_0870) and mannonate dehydratase (Huta_0869).
H. lacusprofundi has a putative mannonate dehydratase but no glucuronate isomerase; it is therefore unclear whether it has the capacity to break down glucuronate. Since
H. lacusprofundi is known to grow on mannose
[68], it is possible that mannonate dehydratase is used in a pathway for mannose degradation.
L-arabinose utilization
None of the haloarchaea have been shown to grow on L-arabinose, but the genomes suggest that it may be utilized by some haloarchaea. H. utahensis, H. volcanii, and H. turkmenica all have putative alpha-L-arabinofuranosidases (COG3534). In H. utahensis the arabinofuranosidase (Huta_1152) is close on the chromosome to L-arabinose isomerase (Huta_1154) and ribulose 5-phosphate 4-epimerase (Huta_1149), suggesting that H. utahensis uses the known bacterial pathway of L-arabinose degradation. A gene similar to ribulokinase was not found in H. utahensis, but there is a gene with similarity to xylulokinases (Huta_1150) close to the arabinose degradation genes. Huta_2446 is likely to be a xylulokinase in H. utahensis (see above), and Huta_1150 may be a ribulokinase, completing the pathway. This pathway produces D-xylulose 5-phosphate which enters the non-oxidative pentose phosphate pathway. H. utahensis is the only haloarchaeon to have the non-oxidative pentose phosphate pathway, which allows it to use this pathway, and it is also the only haloarchaeon to have L-arabinose isomerase.
N-acetylglucosamine utilization
Since the presence of family 18 glycosyl hydrolases in
H. salinarum,
H. mukohataei and
H. borinquense indicates that they may possess chitinase activity and use chitin as a growth substrate, we attempted to identify enzymes for subsequent degradation of chitooligosaccharides, N-acetyl-glucosamine or glucosamine. We found that
H. mukohataei likely possesses a beta-N-acetylhexosaminidase (Hmuk_3174) that has 51% similarity to characterized enzymes from
Streptomyces thermoviolaceus [75] and
Bacillus subtilis [76]. A chitobiose deacetylase has been identified in
Thermococcus kodakaraensis belonging to COG2120, which includes other carbohydrate deacetylases
[77]. Both
H. mukohataei and
H. borinquense have genes belonging to this family, although they are distantly related to the
T. kodakaraensis enzyme. None of the organisms with family 18 glycosyl hydrolases has been tested for growth on N-acetylglucosamine or glucosamine
[48],
[49], so the presence of chitinase activity and chitinolytic pathway in haloarchaea needs further experimental elucidation.
Glycerol metabolism and transport
The haloarchaea encode genes for two different glycerol utilization pathways. All except
N. pharaonis have a glycerol kinase and glycerol 3-phosphate dehydrogenase, and the genes for both enzymes are found close together on the chromosome. Another pathway involving glycerol dehydrogenase and dihydroxyacetone kinase is present only in
H. lacusprofundi.
H. volcanii and
H. walsbyi have a dihydroxyacetone kinase without glycerol dehydrogenase, and this may be used for metabolism of dihydroxyacetone from the environment
[7].
H. salinarum encodes a glycerol dehydrogenase but no dihydroxyacetone kinase.
Only H. mukohataei has an identifiable glycerol transporter within the genome. It encodes a member of the Major Intrinsic Protein (MIP) family adjacent to glycerol kinase, providing strong evidence for a glycerol transport function. All other haloarchaea that have a glycerol kinase have an uncharacterized membrane protein adjacent (e.g. rrnAC0550), and we predict that these genes encode a new family of glycerol transporters. H. borinquense has two glycerol kinases and both have this uncharacterized membrane protein family adjacent to the kinase gene. There are also bacterial homologs of this membrane protein family, and many of them are adjacent to genes involved in glycerol or propanediol metabolism.
Propionate metabolism
H. lacusprofundi is the only halophile that has been shown to grow on propionate, but all of the haloarchaeal genomes, with the exception of H. utahensis, contain genes that may encode the methylmalonate pathway for conversion of propionate to succinyl-CoA. All except H. utahensis have methylmalonyl-CoA epimerase (TIGR03081) and methylmalonyl-CoA mutase (COG1884 and COG2185). Also they have a biotin carboxylase protein (COG4770) and a carboxyltransferase protein (pfam01039), subunits of a biotin-dependent carboxylase. All except H. utahensis also contain a biotin-protein ligase (COG0340) and a BioY family biotin transporter (pfam02632). Propionate or propionyl-CoA may be produced intracellularly from the breakdown of fatty acids, amino acids, or other compounds, or these organisms may be able to use propionate from the environment produced as a result of fermentation.
Glycine betaine metabolism and transport
Glycine betaine is a compatible solute which is likely to be present in high-salt environments. All of the haloarchaeal genomes except that of
H. mukohataei encode members of the betaine/carnitine/choline transporter (BCCT) family, which transport glycine betaine and related compounds. Most have one or two members of this family, but
H. turkmenica has seven. In addition,
H. turkmenica has an ABC transporter for compatible solutes. Four of the haloarchaeal genomes –
H. marismortui,
H. walsbyi,
H. volcanii, and
H. turkmenica – encode one or two genes with high similarity to dimethylglycine oxidase from
Arthrobacter globiformis [78]. In all four genomes these oxidase genes are close on the chromosome to BCCT family transporters. The presence of these enzymes and transporters raises the possibility that some of the halophiles may be able to utilize glycine betaine, dimethylglycine, and/or sarcosine. However, in
H. walsbyi, betaine was not found to enhance growth
[8] and
H. utahensis could not grow on betaine
[52].