Orthologs and alignments
The number of aligned and filtered OrthoMCL clusters containing at least four entries (i.e., genes from distinct genomes) was 3,853, their lengths ranging from 22 to 1,650 amino acids (262.542 on average). The concatenated supermatrix thus comprised 1,011,575 columns, including 708,296 variable and 353,936 parsimony-informative characters.
The ML phylogeny inferred from the concatenated gene alignments is shown in (species tree) together with ML and MP bootstrap support values. The final highest log likelihood obtained was −12,342,390.79, whereas the single best MP tree (excluding uninformative sites) had a score of 1,409,187. ML and MP topologies were identical. Support was maximum (100%) for all branches under ML, and maximum for all but four branches under MP; only a single branch entirely lacked support under MP. As only one genome per genus was included in the sample, there is no taxonomic subdivision of Halobacteriaceae to compare the tree with. However, outgroup taxonomy was well recovered, the tree showing the monophyly of Methanomicrobiales, Methanosarcinales, and Methanosarcinaceae, each of which was represented by at least two genomes.
Maximum likelihood (ML) phylogenetic tree inferred from the 3,853-gene supermatrix.
Incongruence between gene trees and species tree
After reducing the dataset to the ingroup taxa and to the OrthoMCL clusters present in at least four ingroup genomes, total PBS per OrthoMCL cluster ranged from −219 to 142 (average: 4.941, standard deviation: 19.588, median: 1, MAD: 8.896). These data are plotted against the number of parsimony-informative characters in supplementary Figure S1. Within a total of 2,891 OrthoMCL clusters, 1,506 genes showed overall positive support and 764 showed overall negative support. Trees inferred from the five clusters least congruent with the species tree are depicted in Figure S2. They are uniformly characterized by high bootstrap support for groupings in conflict with the species tree estimate. Total PBS values per cluster vary between the COG categories (which could be assigned to 2,213 clusters; see Figure S1); on average, COGs related to information storage and processing display higher PBS than those associated with metabolism or cellular processes and signaling (Table S1), but individual COG categories may differ from this general trend (Table S1).
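The summary statistics quoted for total PBS per cluster can be computed as in the following minimal sketch. One observation motivates it: the reported MAD of 8.896 equals 6 × 1.4826 to three decimals, which suggests (the text does not state this) that the median absolute deviation was scaled by the normal-consistency constant 1.4826. That scaling, the function name `summarize`, and the toy values are assumptions made for illustration; they are not the study's data or code.

```python
import statistics

def summarize(values, mad_scale=1.4826):
    """Return mean, sample SD, median, and scaled median absolute deviation.

    mad_scale=1.4826 is an assumed normal-consistency constant; the source
    text does not say which MAD convention was used.
    """
    med = statistics.median(values)
    raw_mad = statistics.median([abs(v - med) for v in values])
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": med,
        "mad": mad_scale * raw_mad,
    }

# Hypothetical per-cluster total PBS values, for illustration only.
toy_pbs = [-12, -3, 0, 1, 1, 4, 9, 30]
summary = summarize(toy_pbs)

# Consistency check on the reported figure: a raw MAD of 6 scaled by 1.4826.
reported_mad_check = round(6 * 1.4826, 3)
```

Because the median and MAD are insensitive to a few extreme clusters, they describe the bulk of the distribution better than the mean and standard deviation when strongly conflicting clusters are present.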
We used a spectral clustering method to generate gene clusters from the haloarchaeal genomes. There were 887 core clusters, those found in all of the haloarchaeal genomes, and these accounted for 40% to 50% of the genes in each genome. As expected, the core clusters contain genes involved in basic cellular processes such as transcription, translation, DNA replication, DNA repair, RNA modification, protein modification, and protein secretion (Table S2). The core clusters also include many genes involved in biosynthesis of essential metabolites – amino acids, purines and pyrimidines, lipids, and cofactors. This is somewhat unexpected as the haloarchaea are heterotrophs, but they appear to be relatively self-sufficient in being able to make most essential metabolites. Biosynthetic pathways in the haloarchaea have recently been reviewed, so we will not go into more detail here. The fraction of genes in each genome belonging to any cluster ranges from 78% to 88%, showing that 12% to 22% of the genes in each genome have no hits to genes in the other halophile genomes.
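The definition of a core cluster, and the per-genome fraction of genes that core clusters account for, can be sketched as follows. The data layout (cluster id mapped to genome mapped to gene ids), the function names, and the three toy genomes are hypothetical and chosen only for illustration.

```python
def core_clusters(clusters, genomes):
    """Return ids of clusters that have at least one gene in every genome."""
    genomes = set(genomes)
    return {cid for cid, members in clusters.items() if genomes <= set(members)}

def fraction_in(clusters, selected, gene_totals):
    """Per genome, the fraction of its genes that fall in the selected clusters."""
    counts = {g: 0 for g in gene_totals}
    for cid in selected:
        for genome, genes in clusters[cid].items():
            if genome in counts:
                counts[genome] += len(genes)
    return {g: counts[g] / gene_totals[g] for g in gene_totals}

# Hypothetical toy input: three genomes, three clusters.
toy_clusters = {
    "c1": {"A": ["a1"], "B": ["b1"], "C": ["c1", "c2"]},
    "c2": {"A": ["a2"], "B": ["b2"]},
    "c3": {"A": ["a3"], "B": ["b3"], "C": ["c3"]},
}
toy_gene_totals = {"A": 4, "B": 4, "C": 5}

core = core_clusters(toy_clusters, toy_gene_totals)
core_fraction = fraction_in(toy_clusters, core, toy_gene_totals)
all_fraction = fraction_in(toy_clusters, toy_clusters, toy_gene_totals)
```

Calling `fraction_in` with every cluster, as in `all_fraction`, gives the counterpart of the 78% to 88% figure; one minus that value is the share of genes with no hit in the other genomes.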
We also identified signature gene clusters, those that are shared by all haloarchaea but are not found in any other archaea. There are 112 of these clusters (Table S3), 89 of which contain proteins with completely unknown function. Of the clusters with a predicted function, two are protein kinases related to the Bacillus subtilis PrkA protein. These two kinase genes are adjacent to each other on the chromosome in all haloarchaeal genomes and are always found with two other genes: one with a domain of unknown function DUF444, and one related to B. subtilis SpoVR, the function of which is unknown. Proteins of these three families are found together on the chromosome in many other microbial genomes, suggesting functional linkage.
Halophiles are known to accumulate gamma-glutamyl-cysteine, and two of the signature clusters may be involved in gamma-glutamyl-cysteine metabolism. Cluster 491.1× contains proteins related to glutamate-cysteine ligase, and the H. volcanii member of cluster 491.1× was recently shown to have glutamate-cysteine ligase activity. Cluster 1151× includes genes related to glutathione S-transferase, which inactivates toxic compounds by linking them to glutathione. In the halophiles, these may function as gamma-glutamylcysteine S-transferases.
Of the ten haloarchaea with sequenced genomes, four were isolated from water and four were isolated from soil or sediment. The ones isolated from water are H. walsbyi, N. pharaonis, H. marismortui, and H. borinquense. H. mukohataei and H. turkmenica were isolated from saline soils, while H. volcanii and H. utahensis were isolated from lake sediments. The water halophiles and soil/sediment halophiles do not form separate clades in the phylogenetic tree. We looked for clusters present in all water-isolated halophiles that are not present in soil/sediment halophiles and vice versa. There were no clusters specific to water halophiles and only three specific to soil/sediment haloarchaea (Table S4). Proteins belonging to two of the soil/sediment-specific clusters (326.1.0× and 2168×) are often found in the vicinity of nucleotide-sugar metabolic enzymes and glycosyl transferases, suggesting they are involved in cell wall biosynthesis.
Since there were few protein clusters completely specific to the water or soil/sediment halophiles, we looked for clusters present in three out of four organisms from one group and absent from the other group. There were 16 clusters present in three out of four water halophiles, of which 11 contain hypothetical proteins and four have only general functional annotations (Table S4). The only cluster with a specific annotation is cluster 1816×, formate-tetrahydrofolate ligase.
Of the 26 clusters found in three out of four soil/sediment halophiles and not present in water halophiles, 11 are hypothetical proteins (Table S4). Several of the clusters are involved in polysaccharide degradation. These include a glycosyl hydrolase of family GH4, an alpha-L-arabinofuranosidase of family GH51, a polysaccharide deacetylase and a trehalose utilization protein. Two additional clusters present in three out of four soil/sediment halophiles encode a monooxygenase and an acyltransferase, which are found adjacent to each other on the chromosome and flanked by two IucA/IucC family proteins. These proteins are likely to be involved in siderophore synthesis. H. borinquense has three IucA/IucC family proteins but lacks the monooxygenase and acyltransferase; thus it is unclear whether it has a complete siderophore biosynthesis pathway.
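The habitat comparison above amounts to a set filter over cluster presence: keep clusters found in enough genomes of one group and in none of the other. The sketch below is one way to express it; the function name, the `min_present` parameter, and the placeholder genome labels are hypothetical. Note one assumption: `min_present=3` here means "at least three", so it also returns clusters present in all four.

```python
def group_specific(presence, group, other, min_present=None):
    """Clusters present in >= min_present genomes of `group` and none of `other`.

    presence maps cluster id -> set of genomes containing the cluster.
    min_present defaults to the full group size (strict specificity).
    """
    group, other = set(group), set(other)
    if min_present is None:
        min_present = len(group)
    hits = []
    for cid, genomes in presence.items():
        genomes = set(genomes)
        if len(genomes & group) >= min_present and not (genomes & other):
            hits.append(cid)
    return sorted(hits)

# Hypothetical placeholder genomes; "X" stands for a genome in neither group.
water = ["W1", "W2", "W3", "W4"]
soil = ["S1", "S2", "S3", "S4"]
toy_presence = {
    "k1": {"W1", "W2", "W3", "W4"},
    "k2": {"W1", "W2", "W3"},
    "k3": {"W1", "W2", "W3", "S1"},
    "k4": {"S1", "S2", "S3", "S4", "X"},
}

strict_water = group_specific(toy_presence, water, soil)
relaxed_water = group_specific(toy_presence, water, soil, min_present=3)
strict_soil = group_specific(toy_presence, soil, water)
```

Subtracting the strict result from the relaxed one leaves the clusters present in exactly three of the four genomes. Genomes assigned to neither habitat, such as `X`, do not affect the outcome.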
We also looked for clusters conserved in all but one genome. These probably indicate recent gene losses in each species. The genomes fell into two groups – those that had 20 or fewer such clusters and those that had more than 60. The three genomes that had lost more than 60 clusters were H. salinarum, H. walsbyi, and H. utahensis. Many of the clusters lost from H. salinarum are involved in amino acid synthesis, including genes for the synthesis of glutamate, lysine, ornithine, methionine, and branched chain amino acids. H. salinarum does not compensate for these losses with more amino acid transporters or secreted proteases than the other haloarchaea, but it is one of only two haloarchaea to have a putative peptide symporter of the OPT family (TC 2.A.67). Symporters have low affinity but high capacity, suggesting that H. salinarum may prefer to live where there is an ample supply of peptides. Of the clusters not present in H. walsbyi, many are involved in flagellum biosynthesis and chemotaxis. However, H. walsbyi has a set of gas vesicle proteins to enable motility in the absence of flagella. The clusters found in all genomes except H. utahensis include several enzymes involved in cobalamin synthesis and several enzymes of biotin utilization and propionate metabolism. H. utahensis appears to lack the enzymes for the early steps of cobalamin biosynthesis up to the incorporation of cobalt, but all of the halophiles, including H. utahensis, contain the enzymes for the later steps of cobalamin biosynthesis. Biotin and propionate metabolism are discussed further below.
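Counting clusters that are missing from exactly one genome can be sketched as below. The function name and the four placeholder genomes are hypothetical; the sketch shows only the bookkeeping, not the clustering or annotation steps behind it.

```python
def absent_from_one(presence, genomes):
    """Map each genome to the clusters found in every genome except that one.

    presence maps cluster id -> set of genomes containing the cluster.
    """
    genomes = set(genomes)
    lost = {g: [] for g in sorted(genomes)}
    for cid, present in presence.items():
        missing = genomes - set(present)
        if len(missing) == 1:
            lost[missing.pop()].append(cid)
    return lost

# Hypothetical toy input: four genomes, five clusters.
toy_genomes = ["G1", "G2", "G3", "G4"]
toy_presence = {
    "a": {"G1", "G2", "G3"},        # missing only from G4
    "b": {"G1", "G2", "G3", "G4"},  # present everywhere
    "c": {"G2", "G3", "G4"},        # missing only from G1
    "d": {"G1", "G2"},              # missing from two genomes, ignored
    "e": {"G1", "G2", "G3"},        # missing only from G4
}

lost = absent_from_one(toy_presence, toy_genomes)
loss_counts = {g: len(cids) for g, cids in lost.items()}
```

Sorting `loss_counts` by value would reproduce the kind of split described above between genomes with few such clusters and genomes with many.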
Some haloarchaea are known to use the semi-phosphorylated Entner-Doudoroff (ED) pathway for glucose degradation, and genes encoding enzymes of this pathway have been identified in several haloarchaea. With the addition of the new genomes, we find that the semi-phosphorylated Entner-Doudoroff pathway is likely to be present in all sequenced haloarchaea except N. pharaonis, which does not utilize carbohydrates. Aldolases belonging to two different protein families may be involved. All of the halophiles except H. salinarum and N. pharaonis have one or more bacterial-type 2-keto-3-deoxy-6-phosphogluconate (KDPGlc) aldolase (COG0800). Also all except N. pharaonis have at least one potential aldolase related to the characterized Sulfolobus aldolase (COG0329), which is active on KDPGlc and unphosphorylated 2-keto-3-deoxygluconate (KDGlc). The enzymes of the semi-phosphorylated Entner-Doudoroff pathway are highly conserved in sequence among the haloarchaea, suggesting descent from a common ancestor.
The standard Embden-Meyerhof pathway of glycolysis appears to be incomplete in the halophiles as no 6-phosphofructokinase could be identified. This agrees with previous experimental studies and analysis. Gluconeogenesis is likely to be present in all of the halophile genomes with the possible exception of H. utahensis. All except H. utahensis have phosphoenolpyruvate (PEP) synthase and/or pyruvate, phosphate dikinase (COG0574). In addition, H. lacusprofundi and H. turkmenica have ATP-utilizing PEP carboxykinase (COG1866). In H. utahensis the only enzyme that potentially can generate PEP for gluconeogenesis is pyruvate kinase. All except H. utahensis have a fructose 1,6-bisphosphatase belonging to the same family as the E. coli Fbp enzyme (COG0158). All of the halophiles including H. utahensis have at least one gene belonging to COG0483, which includes inositol phosphatases and some archaeal fructose 1,6-bisphosphatases. H. utahensis has two genes belonging to this family, but they are weakly related to characterized fructose 1,6-bisphosphatases. These findings suggest that H. utahensis may lack the gluconeogenesis pathway or have an unusual gluconeogenesis pathway.
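The reasoning in this paragraph follows a simple pattern: each pathway step can be satisfied by any one of several protein families, and a genome is flagged when a step has no candidate. A minimal sketch of that check is given below. The step names and COG identifiers are taken from the text; the decision to leave COG0483 out of the fructose 1,6-bisphosphatase step (because it also contains inositol phosphatases), the function name, and both toy annotation sets are assumptions made for illustration.

```python
# Each step lists alternative protein families; any one of them satisfies the step.
# COG0483 is deliberately omitted as ambiguous (an assumption of this sketch).
GLUCONEOGENESIS_STEPS = {
    "PEP formation": {"COG0574", "COG1866"},
    "fructose 1,6-bisphosphatase": {"COG0158"},
}

def missing_steps(annotations, steps):
    """Return the names of steps for which the genome has no candidate family."""
    annotations = set(annotations)
    return [name for name, families in steps.items() if not (families & annotations)]

# Hypothetical annotation sets, not taken from any real genome.
toy_complete = {"COG0574", "COG0158", "COG0483"}
toy_ambiguous = {"COG0483"}

gaps_complete = missing_steps(toy_complete, GLUCONEOGENESIS_STEPS)
gaps_ambiguous = missing_steps(toy_ambiguous, GLUCONEOGENESIS_STEPS)
```

A genome that reports gaps under this rule is not necessarily lacking the pathway; as noted above, it may use distantly related or uncharacterized enzymes, so the output is a list of steps to examine rather than a verdict.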
Unlike the rest of the archaea, halophiles are thought to use the oxidative pentose phosphate pathway for generation of pentoses. This pathway also generates NADPH for anabolic pathways. All except H. utahensis have a probable 6-phosphogluconate dehydrogenase (COG1023), the key enzyme of this pathway. In contrast, H. utahensis is the only sequenced haloarchaeon to have transaldolase (Huta_0859) and transketolase (Huta_0860 and Huta_0861), the enzymes of the non-oxidative pentose phosphate pathway. For NADPH generation, H. utahensis possesses genes encoding a NAD/NADP transhydrogenase (Huta_2005–2007). None of the other haloarchaea has the genes for this enzyme. The presence of these enzymes only in H. utahensis suggests that they may have been acquired through lateral transfer, but phylogenetic analysis was unable to identify the donor (data not shown).
presents an overview of nutrient transport in the haloarchaea. All of the haloarchaea have at least five symporters for amino acids and at least two ABC transporters for peptides. Since all except H. salinarum can synthesize most or all amino acids, this suggests that amino acids are an important carbon and energy source even in the species that can grow on carbohydrates. All of the haloarchaea also have at least one symporter for nucleosides or nucleobases. Carbohydrate transport is variable. Only half of the halophiles have symporters for sugars, and either none or one ABC transporter for sugars is found in the non-carbohydrate-utilizing organisms. Surprisingly no transporters for sugars could be identified in the H. utahensis genome, suggesting that it uses uncharacterized families of sugar transporters.
Nutrient transport in haloarchaea.
There appears to be a connection between some transporters and universal stress protein A (UspA) family proteins. Most amino acid transporters of the amino acid-polyamine-organocation (APC) family are either fused to a UspA domain or adjacent on the chromosome to a UspA protein, and some are both fused and adjacent to UspA family proteins (e.g. Htur_0566). This appears to be specific for the APC family as other potential amino acid symporters of the neurotransmitter:sodium symporter (NSS) family and the dicarboxylate/amino acid:cation symporter (DAACS) family are not associated with UspA family proteins. Several transporters of the formate-nitrite transporter (FNT) family are also fused or adjacent to UspA domains (Hmuk_1674, HQ1451A, NP6264A, Hlac_2299, and Htur_2705). The FNT family proteins with associated UspA domains are closely related to each other and to a transporter from H. marismortui that lacks a UspA domain (rrnAC0187). They are likely to be formate transporters as the H. marismortui and H. mukohataei proteins are adjacent to enzymes of folate metabolism. Another transporter with adjacent UspA proteins is a putative acetate transporter of the solute:sodium symporter (SSS) family. This transporter is found in seven of the ten halophile genomes (e.g. NP5136A), and in all cases is followed by a UspA domain protein. In six of the seven genomes with this transporter, it is close on the chromosome to two acetyl-CoA synthetase genes (e.g. NP5128A and NP5132A). The transporter has highest similarity to subfamily 7 of the SSS family, which includes acetate, propionate, and phenylacetate transporters (see the Transporter Classification Database at www.tcdb.org). UspA family proteins are expressed during many stressful conditions, and they are known to bind ATP, but their exact molecular function is unknown. The UspA domains associated with transporters may play a regulatory role, or may be involved in maintaining transporter function during stressful conditions. A recent report shows that a UspA domain protein is involved in regulation of a transporter.
Since the halophiles have numerous transporters for amino acids and peptides, we analyzed the distribution of secreted proteases within their genomes. Only secreted proteases were considered because these are likely to be involved in the utilization of proteins as a nutrient source, while intracellular and integral membrane proteases are involved in a variety of cellular processes. We included proteases that have signal peptides as well as proteins that are likely to be attached to the membrane with the protease domain outside the cell. Signal peptidases (family S26) were excluded from the analysis since they have a specific cellular function. The numbers of secreted proteases in the genomes ranged from 3 to 11. Hierarchical clustering shows that the halophiles fall into two groups with respect to protease distribution. The main feature separating these groups appears to be the presence or absence of secreted members of protease family S8, which includes subtilisin as well as halolysins from halophilic archaea. The organisms having secreted S8 proteases do not correspond to a habitat-specific or phylogenetic group. The presence of at least three secreted proteases in each genome suggests that all of the halophiles may be capable of degradation of extracellular proteins.
Secreted protease distribution in haloarchaeal genomes.
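Hierarchical clustering of genomes by their protease-family counts can be illustrated with the small sketch below. The text does not state which distance or linkage was used, so Manhattan distance and average linkage are assumptions, as are the function names, the column labels other than S8, and all toy counts.

```python
from itertools import combinations

def manhattan(p, q):
    """Sum of absolute differences between two count profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))

def average_linkage(profiles, n_groups=2):
    """Agglomerative clustering; returns a list of sets of genome names.

    profiles maps genome name -> tuple of per-family protease counts.
    """
    groups = [{name} for name in profiles]

    def dist(g1, g2):
        pairs = [(a, b) for a in g1 for b in g2]
        return sum(manhattan(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(groups) > n_groups:
        # Merge the closest pair of groups (i < j, so deleting j is safe).
        i, j = min(combinations(range(len(groups)), 2),
                   key=lambda ij: dist(groups[ij[0]], groups[ij[1]]))
        groups[i] |= groups[j]
        del groups[j]
    return groups

# Hypothetical counts per genome for columns (S8, other_family_1, other_family_2).
toy_profiles = {
    "g1": (3, 1, 1),
    "g2": (4, 1, 0),
    "g3": (0, 1, 1),
    "g4": (0, 2, 1),
}

groups = average_linkage(toy_profiles, n_groups=2)
```

In this toy input the two resulting groups differ mainly in the first column, mirroring the observation that presence or absence of secreted S8 proteases is the main separating feature.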
Amino acid utilization
Since all of the halophiles have at least five amino acid symporters and two peptide ABC transporters, we investigated pathways of amino acid utilization to see if all of the halophiles are capable of using many amino acids. A summary is provided in . Three degradation pathways were found in all of the ten genomes. All of them had an alanine dehydrogenase similar to the enzyme characterized in Archaeoglobus fulgidus. This enzyme could potentially be involved in synthesis of alanine as well as its degradation. All had at least one asparaginase from COG0252 or COG1446. Finally, all of them had a pyruvoyl-dependent arginine decarboxylase similar to the enzyme characterized in Methanocaldococcus jannaschii and agmatinase. This combination of enzymes produces putrescine and urea from arginine. Additional enzymes of arginine utilization were present in some genomes. Six of the genomes have arginase, which produces ornithine and urea. None of the genomes had ornithine decarboxylase, but all of the ones that have arginase also have ornithine cyclodeaminase, which produces proline.
Amino acid degradation pathways in haloarchaea.
Several other amino acid degradation pathways are found in a subset of the genomes. The glycine cleavage system is found in all genomes except those of H. walsbyi and H. utahensis. This enzyme complex produces CO2, methylene-tetrahydrofolate (THF), and NADH. Methylene-THF has a variety of possible uses within the cell. Another pathway found in all but H. walsbyi and H. utahensis is isoleucine degradation. A 2-oxoacid dehydrogenase complex involved in isoleucine degradation was recently identified in H. volcanii, and seven of the other halophiles have genes with at least 68% similarity to the H. volcanii genes, indicating that they probably have the same function. Five of the genomes have tryptophanase, which produces pyruvate from tryptophan. A histidine degradation pathway with formiminoglutamate as an intermediate is also found in five of the genomes. Seven have proline dehydrogenase, but only H. lacusprofundi has pyrroline-5-carboxylate dehydrogenase to complete proline conversion to glutamate. All of the genomes have threonine dehydratase, but this may be used only for biosynthesis. Five of the genomes in addition have threonine aldolase, which produces glycine and acetaldehyde. Finally, all of the genomes have glutamate dehydrogenase, which may have a biosynthetic role. Four of the genomes have glutamate mutase and methylaspartate ammonia-lyase. These are the first two enzymes of a four-step pathway that produces acetate and pyruvate from glutamate with mesaconate and citramalate as intermediates. However, these two enzymes are likely to be involved in a new pathway for acetate assimilation.
Many of the pathways for amino acid degradation are found in a subset of the genomes. They could have been acquired independently by lateral gene transfer or lost in some species. The genes for amino acid degradation in the halophiles are closely related in sequence, suggesting that the common ancestor of haloarchaea was able to degrade many amino acids and that some organisms have lost these pathways.
According to the distribution of glycosyl hydrolase domains and carbohydrate-binding modules, halophilic archaea can be divided into three groups: those that may be capable of degrading plant biomass (H. utahensis, H. turkmenica and, to a lesser extent, H. marismortui and H. volcanii), those harboring family 18 glycosyl hydrolases with possible chitinase activity (H. salinarum, H. mukohataei and H. borinquense), and organisms that are unlikely to degrade any externally provided polysaccharides (H. lacusprofundi, H. walsbyi and N. pharaonis).
Glycosyl hydrolase distribution in haloarchaeal genomes.
H. utahensis and H. turkmenica have the two largest sets of proteins with glycosyl hydrolase domains among halophilic archaea (43 and 44, respectively). However, their glycosyl hydrolase complements are markedly different. While H. utahensis has five proteins of GH10 family and two proteins of GH11 family, which probably have xylanase activity, H. turkmenica has only one GH10 protein and no GH11 members. The abundance of predicted xylanases in the H. utahensis genome is in agreement with experimental data that showed xylan-degrading activity of this archaeon. The H. utahensis genome contains seven GH5 family proteins and one GH9 family protein (as compared to three and zero in the H. turkmenica genome). These proteins may have endo-beta-glucanase activity, thus enabling H. utahensis to degrade components of the plant cell wall. One of the GH5 proteins in H. utahensis (Huta_2387) has been shown experimentally to have cellulolytic activity (T. Zhang et al., in press). H. utahensis also has two GH94 proteins that may have cellobiose or cellodextrin phosphorylase activity. On the other hand, the H. turkmenica genome encodes four GH32 family proteins predicted to have beta-fructosidase (levanase or invertase) activity that are absent from the H. utahensis genome, while both genomes have several GH2 family proteins that may have beta-galactosidase activity.
Three of the genomes from haloarchaea isolated from soil or sediment encode enzymes involved in degradation of pectin. The backbone chains of pectin are made up of either homogalacturonan or rhamnogalacturonan with various side chains, and the main chains are linked together by α-1,5-arabinan chains. H. turkmenica has four family 1 polysaccharide lyases (PL) which likely have pectate lyase activity, three of which are close together on the chromosome (Htur_4783, Htur_4785, Htur_4789). Also in the vicinity of these three genes is a family 2 polysaccharide lyase related to pectate lyases (Htur_4786) and two glycosyl hydrolases, one of which may have polygalacturonase activity (Htur_4790). The other H. turkmenica PL1 family protein (Htur_4440) is close to one of two pectin methylesterases (Htur_4438) and a rhamnogalacturonan acetylesterase (Htur_4445). H. turkmenica also has two family 11 polysaccharide lyases (Htur_3890, Htur_3891) that are highly similar to a B. subtilis rhamnogalacturonan lyase and a GH105 protein similar to the RhiN protein of Dickeya, which is involved in degradation of rhamnogalacturonate oligosaccharides. H. utahensis and H. volcanii have much lower capacity for pectin degradation: H. utahensis has two probable pectate lyases from family 1, while H. volcanii has one pectate lyase and one pectin methylesterase. In addition to enzymes capable of degrading the main chains of pectin, H. turkmenica, H. utahensis, and H. volcanii have GH43, GH51, and GH93 glycosyl hydrolases with similarity to endo- and exo-arabinases that may be capable of degrading the arabinan linking chains of pectin.
and H. mukohataei have been shown to grow on galactose, but the genome sequences suggest that other haloarchaea can also utilize galactose. The genome sequences also show that two different pathways for galactose metabolism may exist in haloarchaea: the Leloir pathway in H. utahensis, and the De Ley-Doudoroff pathway in H. lacusprofundi, H. marismortui, H. volcanii, H. borinquense, H. mukohataei, and H. turkmenica. H. utahensis is the only haloarchaeon with genes encoding the three enzymes of the Leloir pathway. No other archaeon possesses a gene for hexose 1-phosphate uridylyltransferase (COG1085, Huta_2170), and H. volcanii is the only other haloarchaeon to have a probable galactokinase (HVO_1487). Six haloarchaea (listed above) have genes with high similarity (65–75%) to E. coli galactonate dehydratase, one of the enzymes of the De Ley-Doudoroff pathway. Phylogenetic analysis shows that the genes for galactonate dehydratase in the haloarchaea cluster together (data not shown). In the De Ley-Doudoroff pathway, after 2-dehydro-3-deoxygalactonate (KDGal) is formed by galactonate dehydratase, it is phosphorylated by KDGal kinase to form 2-dehydro-3-deoxy-6-phosphogalactonate (KDPGal). KDPGal is then split by KDPGal aldolase to form pyruvate and glyceraldehyde 3-phosphate. None of the halophiles has a gene related to known KDGal kinases (COG3734). KDPGal aldolases belong to the same family as bacterial-type KDPGlc aldolases of the Entner-Doudoroff pathway (COG0800). H. lacusprofundi has two proteins related to KDGlc kinase and two proteins related to bacterial-type KDPGlc aldolase. One kinase (Hlac_2870) and aldolase (Hlac_2860) are close on the chromosome to each other and to galactonate dehydratase (Hlac_2866), beta-galactosidase (Hlac_2868), and a probable alpha-galactosidase (Hlac_2869), suggesting that the kinase and aldolase may be involved in the utilization of galactose via the De Ley-Doudoroff pathway. Similarly, H. volcanii has three COG0800 proteins, one of which (HVO_A0329) is close on the chromosome to a KDGlc kinase-related protein (HVO_A0328), galactonate dehydratase (HVO_A0331) and beta-galactosidase (HVO_A0326). Some of the halophiles that have galactonate dehydratase have only one protein related to KDGlc kinase and one related to KDPGlc aldolase. It is possible that in these organisms the proteins are bifunctional, working with both KDGlc and KDGal, similar to the proteins of the Sulfolobus solfataricus ED pathway.
Fructose can be utilized by some haloarchaea: H. marismortui, H. borinquense, H. utahensis, and H. turkmenica have been shown to grow on fructose, while H. mukohataei has been shown to grow on sucrose and thus will likely also metabolize fructose. The enzyme ketohexokinase was characterized in Haloarcula vallismortis but the gene was not identified. H. marismortui, H. volcanii, and H. turkmenica each have one transporter of the phosphotransferase system (PTS), and operon evidence suggests that these are fructose transporters that produce fructose 1-phosphate. The PTS transporters from H. volcanii and H. turkmenica are close on the chromosome to putative fructose 1-phosphate kinases (COG1105), while in H. volcanii the PTS proteins are also close to fructose bisphosphate aldolase. In addition to the three haloarchaea with PTS transporters, H. utahensis and H. mukohataei also have putative fructose 1-phosphate kinases. In H. mukohataei the fructose 1-phosphate kinase (Hmuk_2661) is close on the chromosome to fructose bisphosphate aldolase (Hmuk_2663) and another putative sugar kinase (Hmuk_2662) which may be a ketohexokinase. Surprisingly, H. borinquense does not have a member of COG1105, despite its known ability to grow on fructose. The protein sequences of all components of the PTS system transporter and fructose 1-phosphate kinase are strongly conserved among the halophiles.
The pathway by which H. volcanii utilizes xylose has recently been characterized. The pathway involves formation of xylonate, followed by two dehydratase steps to generate 2-oxoglutarate semialdehyde. This pathway also appears to be present in H. turkmenica and H. lacusprofundi, neither of which was previously known to utilize xylose. H. marismortui is known to produce acid from xylose, and it appears to have 2-dehydro-3-deoxyxylonate dehydratase and xylose dehydrogenase, but it does not have a gene with high similarity to the H. volcanii xylonate dehydratase. H. utahensis is known to degrade xylan and to be able to grow on xylose, and it uses a different pathway for xylose degradation. It has a xylose isomerase (COG2115, Huta_2443) and xylulokinase (TIGR01312, Huta_2446). The resulting D-xylulose 5-phosphate then feeds into the non-oxidative pentose phosphate pathway. H. utahensis is the only one of the sequenced haloarchaea to have transaldolase and transketolase of the non-oxidative PPP, which allows it to use this pathway of xylose utilization. H. borinquense is known to utilize xylose, but it does not have identifiable genes for either of the pathways found in the other haloarchaea. Phylogenetic analysis of xylose isomerase using both neighbor joining (Clustal W) and Bayesian (MrBayes) methods shows that H. utahensis xylose isomerase branches deeply within Firmicutes with high bootstrap support (not shown), but xylulokinase did not associate closely with any group of organisms.
Both H. utahensis and H. turkmenica have putative glucuronate isomerase (COG1904) and mannonate dehydratase (COG1312), suggesting that they may utilize glucuronate by the same pathway as found in E. coli. This pathway produces KDGlc, which feeds into the Entner-Doudoroff pathway. H. utahensis also has a probable alpha-glucuronidase (Huta_0871) belonging to glycosyl hydrolase family 67 that is adjacent on the chromosome to glucuronate isomerase (Huta_0870) and mannonate dehydratase (Huta_0869). H. lacusprofundi has a putative mannonate dehydratase but no glucuronate isomerase; therefore it is unclear whether it has the capacity to break down glucuronate. Since H. lacusprofundi is known to grow on mannose, it is possible that mannonate dehydratase is used in a pathway for mannose degradation.
None of the haloarchaea have been shown to grow on L-arabinose, but the genomes suggest that it may be utilized by some haloarchaea. H. utahensis, H. volcanii, and H. turkmenica all have putative alpha-L-arabinofuranosidases (COG3534). In H. utahensis the arabinofuranosidase (Huta_1152) is close on the chromosome to L-arabinose isomerase (Huta_1154) and ribulose 5-phosphate 4-epimerase (Huta_1149), suggesting that H. utahensis uses the known bacterial pathway of L-arabinose degradation. A gene similar to ribulokinase was not found in H. utahensis, but there is a gene with similarity to xylulokinases (Huta_1150) close to the arabinose degradation genes. Huta_2446 is likely to be a xylulokinase in H. utahensis (see above), and Huta_1150 may be a ribulokinase, completing the pathway. This pathway produces D-xylulose 5-phosphate which enters the non-oxidative pentose phosphate pathway. H. utahensis is the only haloarchaeon to have the non-oxidative pentose phosphate pathway, which allows it to use this pathway, and it is also the only haloarchaeon to have L-arabinose isomerase.
Since the presence of family 18 glycosyl hydrolases in H. salinarum, H. mukohataei and H. borinquense indicates that they may possess chitinase activity and use chitin as a growth substrate, we attempted to identify enzymes for subsequent degradation of chitooligosaccharides, N-acetyl-glucosamine or glucosamine. We found that H. mukohataei likely possesses a beta-N-acetylhexosaminidase (Hmuk_3174) that has 51% similarity to characterized enzymes from Streptomyces thermoviolaceus and Bacillus subtilis. A chitobiose deacetylase has been identified in Thermococcus kodakaraensis belonging to COG2120, which includes other carbohydrate deacetylases. Both H. mukohataei and H. borinquense have genes belonging to this family, although they are distantly related to the T. kodakaraensis enzyme. None of the organisms with family 18 glycosyl hydrolases has been tested for growth on N-acetylglucosamine or glucosamine, so the presence of chitinase activity and a chitinolytic pathway in haloarchaea needs further experimental elucidation.
Glycerol metabolism and transport
The haloarchaea encode genes for two different glycerol utilization pathways. All except N. pharaonis have a glycerol kinase and glycerol 3-phosphate dehydrogenase, and the genes for both enzymes are found close together on the chromosome. Another pathway involving glycerol dehydrogenase and dihydroxyacetone kinase is present only in H. lacusprofundi. H. volcanii and H. walsbyi have a dihydroxyacetone kinase without glycerol dehydrogenase, and this may be used for metabolism of dihydroxyacetone from the environment. H. salinarum encodes a glycerol dehydrogenase but no dihydroxyacetone kinase.
Only H. mukohataei has an identifiable glycerol transporter within the genome. It encodes a member of the Major Intrinsic Protein (MIP) family adjacent to glycerol kinase, providing strong evidence for a glycerol transport function. All other haloarchaea that have a glycerol kinase have an uncharacterized membrane protein adjacent (e.g. rrnAC0550), and we predict that these genes encode a new family of glycerol transporters. H. borinquense has two glycerol kinases and both have this uncharacterized membrane protein family adjacent to the kinase gene. There are also bacterial homologs of this membrane protein family, and many of them are adjacent to genes involved in glycerol or propanediol metabolism.
H. lacusprofundi is the only halophile that has been shown to grow on propionate, but all of the haloarchaeal genomes, with the exception of H. utahensis, contain genes that may encode the methylmalonate pathway for conversion of propionate to succinyl-CoA. All except H. utahensis have methylmalonyl-CoA epimerase (TIGR03081) and methylmalonyl-CoA mutase (COG1884 and COG2185). Also they have a biotin carboxylase protein (COG4770) and a carboxyltransferase protein (pfam01039), subunits of a biotin-dependent carboxylase. All except H. utahensis also contain a biotin-protein ligase (COG0340) and a BioY family biotin transporter (pfam02632). Propionate or propionyl-CoA may be produced intracellularly from the breakdown of fatty acids, amino acids, or other compounds, or these organisms may be able to use propionate from the environment produced as a result of fermentation.
Glycine betaine metabolism and transport
Glycine betaine is a compatible solute which is likely to be present in high-salt environments. All of the haloarchaeal genomes except that of H. mukohataei encode members of the betaine/carnitine/choline transporter (BCCT) family, which transport glycine betaine and related compounds. Most have one or two members of this family, but H. turkmenica has seven. In addition, H. turkmenica has an ABC transporter for compatible solutes. Four of the haloarchaeal genomes – H. marismortui, H. walsbyi, H. volcanii, and H. turkmenica – encode one or two genes with high similarity to dimethylglycine oxidase from Arthrobacter globiformis. In all four genomes these oxidase genes are close on the chromosome to BCCT family transporters. The presence of these enzymes and transporters raises the possibility that some of the halophiles may be able to utilize glycine betaine, dimethylglycine, and/or sarcosine. However, in H. walsbyi, betaine was not found to enhance growth, and H. utahensis could not grow on betaine.