The trace element molybdenum (Mo) is utilized in many life forms, and it is a key component of several enzymes involved in nitrogen, sulfur, and carbon metabolism. With the exception of nitrogenase, Mo is bound in proteins to a pterin, thus forming the molybdenum cofactor (Moco) at the catalytic sites of molybdoenzymes. Although a number of molybdoenzymes are well characterized structurally and functionally, evolutionary analyses of Mo utilization are limited. Here, we carried out comparative genomic and phylogenetic analyses to examine the occurrence and evolution of Mo utilization in bacteria, archaea and eukaryotes at the level of (i) Mo transport and Moco utilization trait, and (ii) Mo-dependent enzymes. Our results revealed that most prokaryotes and all higher eukaryotes utilize Mo whereas many unicellular eukaryotes including parasites and most yeasts lost the ability to use this metal. In addition, eukaryotes have fewer molybdoenzyme families than prokaryotes. Dimethylsulfoxide reductase (DMSOR) and sulfite oxidase (SO) families were the most widespread molybdoenzymes in prokaryotes and eukaryotes, respectively. A distant group of the ModABC transport system was predicted in the hyperthermophilic archaeon Pyrobaculum. ModE-type regulation of Mo uptake occurred in less than 30% of Moco-utilizing organisms. A link between Mo and selenocysteine utilization in prokaryotes was also identified wherein the selenocysteine trait was largely a subset of the Mo trait, presumably due to formate dehydrogenase, a Mo- and selenium-containing protein. Finally, analysis of environmental conditions and organisms that do or do not depend on Mo revealed that host-associated organisms and organisms with low G+C content tend to reduce their Mo utilization. Overall, our data provide new insights into Mo utilization and show the wide occurrence of this trait, yet limited use of the metal in individual organisms, in all three domains of life.
The trace element molybdenum (Mo) occurs in a wide variety of metalloenzymes in both prokaryotes and eukaryotes, where it forms part of active sites of these enzymes.1–3 Except for the iron-Mo cofactor (FeMoco) in nitrogenase, Mo is complexed by pterin molecules, thereby generating the molybdenum cofactor (Moco or molybdopterin) in Mo-dependent enzymes (molybdoenzymes).5–7 Some microorganisms are able to utilize tungsten (W) which is also coordinated by molybdopterin.8 As a result, the term Moco refers to the utilization of both metals.
Moco-containing enzymes catalyze important redox reactions in the global carbon, nitrogen, and sulfur cycles.2 More than 50 Mo-enzymes, mostly of bacterial origin, have been previously identified.2, 3, 9 On the basis of sequence comparison and spectroscopic properties, these Moco-containing enzymes are divided into four families: sulfite oxidase (SO), xanthine oxidase (XO), dimethylsulfoxide reductase (DMSOR), and aldehyde:ferredoxin oxidoreductase (AOR).10, 11 Each family is further divided into different subfamilies based on the use of their specific substrates. For example, the DMSOR family also includes trimethylamine-N-oxide reductase, biotin sulfoxide reductase, nitrate reductase (dissimilatory), formate dehydrogenase and arsenite oxidase. All four of these families can be detected in prokaryotes; however, only two families (SO and XO) containing four subfamilies occur in eukaryotes. The SO family includes nitrate reductase (NR) and SO, whereas the XO family is represented by xanthine dehydrogenase (XDH) and aldehyde oxidase (AO). These enzymes are typical for essentially all Mo-utilizing eukaryotes analyzed thus far. Recently, two additional Moco-binding enzymes were reported: pyridoxal oxidase and nicotinate hydroxylase which were found exclusively in Drosophila melanogaster and Aspergillus nidulans, respectively.7
Functions of molybdoenzymes depend on additional gene products that transport molybdate anions into cells and synthesize and assemble Moco. In bacteria, high-affinity molybdate ABC transporters (ModABC, products of modABC genes) have been described that consist of ModA (molybdate-binding protein), ModB (membrane integral channel protein) and ModC (cytoplasmic ATPase).7, 12, 13 In addition, a new class of the Mo/W transport system (WtpABC) and a highly specific tungstate ABC transporter (TupABC) have been reported.14, 15 Although both transporter systems exhibited a low level of sequence similarity to ModABC transporters, they showed an anion affinity different from that of ModA. TupA specifically binds tungstate, whereas WtpA has a higher affinity for tungstate than ModA and its affinity for molybdate is similar to that of ModA.14, 15 In contrast to bacteria, eukaryotic molybdate transport is poorly understood, but recent studies in Arabidopsis thaliana suggested the occurrence of a high-affinity molybdate transport system, MOT1.16
In Escherichia coli, the modABC operon is regulated by a repressor protein, ModE, which also controls the transcription of genes coding for molybdopterin synthesis (moaABCDE), and molybdoenzymes.17–20 E. coli ModE is composed of an N-terminal DNA-binding domain (ModE_N, COG2005) and a C-terminal molybdate-binding domain.17, 18, 21 The C-terminal domain contains a tandem repeat of the molybdopterin-binding protein (Mop, COG3585; also referred to as Di-Mop domain).18 The ModABC-ModE systems are widespread in prokaryotes, but not ubiquitous.22–25 Variations of ModE-like proteins were also observed in other Moco-utilizing organisms.25, 26 On the other hand, regulation of WtpABC and TupABC transporters is unclear.
In organisms studied thus far with respect to Moco utilization (e.g., bacteria, plants, fungi and mammals), this cofactor is synthesized by a conserved multi-step biosynthetic pathway.7 The first model of Moco biosynthesis was derived from studies in E. coli.6 In this organism, the proteins required for biosynthesis and regulation of the pterin cofactor are encoded by the moa-mog operon.27, 28 The moa and moe operons are responsible for biosynthesis of the mononucleotide form of pterin cofactor, and the mob operon encodes pterin guanine dinucleotide synthase that adds GMP to the Mo-complexed pterin cofactor. Functions of other operons linked to Mo utilization are unclear. In eukaryotes, six gene products that catalyze Moco biosynthesis have been studied in plants (Cnx1–3, Cnx5–7),28 fungi29 and humans.30–32 Although these proteins are homologous to their counterparts in bacteria, not all of the eukaryotic Moco biosynthesis machinery could functionally complement the corresponding bacterial mutant strains. Different nomenclature has been used in humans and plants,30 and here we use the plant nomenclature to refer to the eukaryotic Moco synthetic genes.
In recent years, the complete genomes of many organisms from the three domains of life have become available. It is now possible to examine the occurrence and evolution of numerous biochemical pathways that an organism utilizes, including metal utilization. Several comparative and functional genomic analyses have been carried out for different trace elements.33–38 However, a comprehensive investigation of either Moco biosynthesis systems or Mo-containing enzymes has not been performed.
In this study, we used comparative genomic analyses to better understand Mo utilization in various life forms. Our data show a widespread utilization of Mo in all three domains of life and reveal that evolutionary changes in Mo utilization can be influenced by various factors. Our results also highlight complexity of regulation of the Mo/W uptake systems. Moreover, the relationship between Mo and selenium (Se) utilization in prokaryotes suggests the possibility that Se utilization is dependent on Mo. These studies reveal widespread utilization of Mo in various life forms and its limited use in individual organisms, and are important for understanding the evolution of both Mo utilization trait and molybdoenzymes.
Analysis of prokaryotic genomes revealed a wide distribution of genes encoding the Moco biosynthesis pathway and Mo-containing proteins (a complete list is in Table S1). Almost all organisms were found to either possess both Moco biosynthesis proteins and known molybdoenzymes or lack them, suggesting a very good correspondence between occurrence of the Moco biosynthesis trait and Moco-dependent enzymes. In total, 325 (~72.1%) bacterial organisms were found to utilize Moco. Figure 1 shows the distribution of the Moco biosynthesis trait and Moco-containing protein families in different bacterial phyla on the basis of a highly resolved phylogenetic tree of life.39
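The correspondence described above can be pictured as a comparison of two presence/absence calls per genome. The following is a minimal sketch, not the pipeline used in this study; the function name, category labels and organism entries are hypothetical placeholders.

```python
from collections import Counter


def classify(has_biosynthesis, has_enzymes):
    """Label one genome from two presence/absence calls (hypothetical labels)."""
    if has_biosynthesis and has_enzymes:
        return "utilizes Moco"
    if not has_biosynthesis and not has_enzymes:
        return "lacks Moco"
    # Only one of the two components was detected.
    return "orphan enzymes" if has_enzymes else "orphan pathway"


# Hypothetical profiles: (Moco biosynthesis detected, molybdoenzyme detected).
profiles = {
    "organism_A": (True, True),
    "organism_B": (False, False),
    "organism_C": (False, True),   # e.g. an orphan XO homolog
    "organism_D": (True, True),
}

summary = Counter(classify(*calls) for calls in profiles.values())
# Genomes in which both components agree (both present or both absent).
concordant = summary["utilizes Moco"] + summary["lacks Moco"]
```

A high share of concordant genomes is what the text refers to as a good correspondence; the discordant categories flag cases such as orphan XO homologs for manual inspection.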
Except for the phyla containing few sequenced genomes (<3, for example, Planctomycetes, Aquificae and Acidobacteria), Mo was found to be utilized by almost all bacterial phyla. All sequenced organisms in Chlorobi, Deinococcus-Thermus, Alphaproteobacteria/Rhizobiaceae, Betaproteobacteria/Bordetella, Betaproteobacteria/Burkholderiaceae, Gammaproteobacteria/Pasteurellaceae, Gammaproteobacteria/Vibrionaceae and Gammaproteobacteria/Pseudomonadaceae, as well as the majority of Cyanobacteria (92.3%), Epsilonproteobacteria (91.7%), Deltaproteobacteria (90.5%), Gammaproteobacteria/Enterobacteriales (86.4%) and many other bacterial subdivisions utilize Moco. In contrast, neither Moco biosynthesis trait components nor Moco-containing proteins were detected in Firmicutes/Mollicutes and Chlamydiae. It should be noted that we found orphan XO homologs in five completely sequenced organisms belonging to Deltaproteobacteria, Firmicutes/Clostridia, Spirochaetes and Thermotogae, which lack genes for either Mo/W transporters or known Moco biosynthesis trait components (see Table S1). This observation suggests either that there is an unknown Mo utilization pathway in these organisms (unlikely scenario) or that they use other proteins that functionally replace XO and other molybdoproteins. It is possible that the functions carried out by molybdoproteins are dispensable in these organisms. Nevertheless, the wide distribution of Moco utilization observed in the present study suggests that, in addition to several metal ions utilized by all or most organisms, e.g., iron, zinc and magnesium, Mo also shows widespread occurrence in bacteria.
An even wider Mo utilization was observed in archaea (Fig. 2). About 95% of sequenced archaeal organisms were found to utilize Moco. Thus, it appears that Mo utilization is an ancient and essential trait that is common to essentially all species in this domain of life as well as in bacteria.
In eukaryotes, the only known use of Mo is Moco. Our analysis identified 89 (62.7%) Mo-dependent organisms (Fig. 3, details are shown in Table S1). All animals, land plants, algae, stramenopiles (including diatoms and oomycetes) and certain fungi (all Pezizomycotina and some Basidiomycota) possess Moco biosynthesis genes and known molybdoenzymes. However, the MOT1 molybdate transporter was found in only one-third of Mo-utilizing eukaryotes, namely land plants, green algae, Pezizomycotina and stramenopiles. In contrast, all parasites (14.8%, including Alveolata/Apicomplexa, Entamoebidae, Kinetoplastida, Parabasalidea and Diplomonadida), yeasts (21.1%, including Saccharomycotina and Schizosaccharomycetes) and free-living ciliates (1.4%, Alveolata/Ciliophora) lack Moco biosynthesis proteins, molybdoenzymes and MOT1 transporters. Since Mo utilization is widespread in all three domains of life, it appears that many protozoa, especially parasites, lost the ability to utilize Mo. A unique exception was the detection of an orphan XO in a parasitic flagellated protozoan, Trichomonas vaginalis (Parabasalidea phylum). Considering that its genome sequence is not yet complete, it is possible that the Moco biosynthesis genes lie in unfinished sequences. Alternatively, this organism may rely on the uptake of Moco from the host.
We analyzed the well-characterized Mo ABC transport system (ModABC) and two secondary systems, WtpABC and TupABC (W-specific), in prokaryotes. A summary of the distribution of these Mo/W transporter families is given in Table 1. In bacteria, 294 organisms, which account for 90.5% of Mo-utilizing bacteria, possess the ModABC transporter. The occurrence of the other two systems is more restricted, especially that of WtpABC, which was identified in only ten organisms. The W-specific transporter TupABC was found in 85 (26.2%) Moco-utilizing organisms. The distribution of these transporters in archaea was different. WtpABC was the most common transporter and was found in 23 (63.9%) Mo-utilizing organisms, whereas ModABC and TupABC systems showed lower occurrence (38.9% and 33.3%, respectively). These data are consistent with the hypothesis that WtpABC is an archaeal Mo/W transporter, whereas ModABC and TupABC function predominantly in bacteria.14
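The percentages quoted above follow from the reported counts. A short sketch of the arithmetic is given here, assuming the bacterial denominator is the 325 Moco-utilizing bacteria reported earlier; the archaeal denominator of 36 is an inference from 23 organisms corresponding to 63.9% and is not stated explicitly in the text. The helper name is hypothetical.

```python
def percent(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


MO_BACTERIA = 325  # Moco-utilizing bacteria, as reported in the text
MO_ARCHAEA = 36    # assumption: inferred from 23 organisms = 63.9%

bacterial_share = {
    "ModABC": percent(294, MO_BACTERIA),
    "TupABC": percent(85, MO_BACTERIA),
}
archaeal_wtp_share = percent(23, MO_ARCHAEA)
```

Under these assumptions the computed values agree with the figures given in the paragraph (90.5%, 26.2% and 63.9%).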
Phylogenetic analysis was used to further examine the evolutionary relationships of Mo/W transport systems in different organisms. We used ModA (periplasmic component of the ModABC transport system), WtpA (periplasmic component of the WtpABC transport system) and TupA (periplasmic component of the TupABC transport system) to build a phylogenetic tree (Fig. 4). First, all orthologs of the three different families were used to generate a preliminary tree (see Materials and Methods). Representative sequences were then manually selected to condense the original tree without changing its topology. In addition, the periplasmic components of sulfate and Fe3+ transporters that have a low level of similarity to ModA were used as a reference. The robustness of the phylogenetic tree was evaluated with additional programs, which showed a similar topology (see Materials and Methods and Fig. S1). It should be noted that although WtpA and ModA sequences belong to the same COG (COG0725), they showed different anion affinities on the basis of previous experimental analysis.14 In the phylogenetic tree, they cluster in different branches, suggesting that they may be derived from a common ancestral gene and have since diverged from the parent copy by mutation and selection or drift.
Distant ModA-like proteins were identified in several Pyrobaculum species which are hyperthermophilic archaea. Blast-based pairwise alignment showed less than 25% similarity between these ModA-like proteins and E. coli ModA or Pyrococcus furiosus WtpA. Phylogenetic analysis also suggested they are outgroups of all known ModA proteins (Fig. 4). However, they belong to the same COG (e-value 2e-17) as ModA. We further examined the genomic context of modA-like genes and the conservation of residues involved in molybdate binding in E. coli ModA (1AMF)40 and tungstate binding in Archaeoglobus fulgidus WtpA (2ONS).41 These modA-like genes were always located in an operon containing a complete ABC transport system, including an ABC-type permease and an ATPase component. Both components were distantly homologous to ModB and ModC, respectively (similarity <25% and e-value >0.1 based on BLAST pairwise alignment). In addition, in one organism, Pyrobaculum islandicum, the modA-like gene was located next to modD gene, which is present in some modABC operons in prokaryotes and is involved in molybdate transport (its exact function is unclear).12, 13 Multiple alignment of ModA, WtpA and ModA-like sequences revealed that two or three out of five residues involved in Mo binding in E. coli ModA (Ser36, Ser63 and Tyr194)40 were conserved in these ModA-like sequences (Fig. S2 and S3). The other two residues (Ala149 and Val176), which provide only the backbone hydrogen to form hydrogen bonds with molybdate,40 were not strictly conserved, but other amino acids may similarly provide ligands to the metal ion. These data suggest that the Pyrobaculum ModA-like proteins should be considered as a distant group of the ModA family. The absence of ModA-like proteins in other sequenced organisms suggests a limited distribution of this subfamily.
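The residue comparison described above amounts to checking, column by column in a multiple alignment, whether a query sequence carries the same amino acid as the reference at the ligand-binding positions. The sketch below illustrates that check on toy strings; the sequences and column indices are invented for illustration and are not ModA sequences, and the function name is hypothetical.

```python
def conserved_sites(reference, query, columns):
    """Return alignment columns (0-based) where query matches a non-gap reference residue."""
    return [
        col for col in columns
        if reference[col] != "-" and reference[col] == query[col]
    ]


# Toy aligned sequences of equal length (hypothetical, not real proteins).
reference = "MSAT-YKV"
query = "MSGT-YRA"
# Hypothetical alignment columns of five binding residues in the reference.
binding_columns = [1, 3, 5, 6, 7]

matches = conserved_sites(reference, query, binding_columns)
```

In this toy case three of the five columns match, which mirrors the "two or three out of five" pattern reported for the ModA-like sequences.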
We found that several completely sequenced organisms, including two archaea and 24 bacteria, which contained both Moco biosynthesis pathway and Moco-containing enzymes, did not possess any of the known transporters. Most of these organisms were distantly related, free-living organisms. This observation suggests that additional Mo/W uptake systems may exist. We examined genes in Moco biosynthesis operons in these organisms; however, no good candidates for new Mo/W-specific transport system could be found. It is possible that molybdate is transported by either sulfate transport system or nonspecific anion transporter in these organisms.
MOT1 is the only known Mo transporter in eukaryotes, which was recently identified in A. thaliana.16 In this study, we analyzed the occurrence of this transporter in sequenced eukaryotic genomes. Among 89 Mo-utilizing organisms, only 31 possess MOT1 orthologs, including Fungi/Ascomycota/Pezizomycotina, land plants (Viridiplantae/Streptophyta), green algae (Viridiplantae/Chlorophyta) and stramenopiles. The absence of MOT1 from all animals implied the presence of an unknown Mo transport system in these organisms.
In E. coli, the ModABC repressor, ModE, is positioned immediately upstream and transcribed divergently from the modABC operon (Fig. 5A). However, full-length ModE orthologs were absent from many other organisms such as the Gram-positive Bacteria and Cyanobacteria.25 In addition, various domain fusions were observed for ModE_N or Mop, indicating complexity of ModE regulation.25 Although the roles of these variants are unclear, they have been suggested to be non-functional in ModABC regulation.25 In this study, we analyzed the occurrence of full-length ModE and its variants (including separate ModE_N, Mop/Di-Mop proteins as well as their additional fusion forms) in sequenced prokaryotes. Here, only the full-length ModE orthologs were considered as true regulators of ModABC transporters. The results are shown in Table 2 (a complete distribution is shown in Table S1). Only a small portion of Moco-utilizing organisms (28.9% and 16.7% in bacteria and archaea, respectively) possessed a full-length ModE, suggesting that most prokaryotes may use additional or unspecific repressors for ModABC regulation.
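The rule applied here, that only proteins with an N-terminal ModE_N domain followed by a tandem Mop repeat count as full-length ModE, can be expressed as a simple classification over ordered domain lists. This is a minimal sketch under that assumption; the function names and category labels are hypothetical, and real domain assignments would come from a COG or profile search.

```python
def classify_regulator(domains):
    """Classify a protein from its ordered (N- to C-terminal) domain list."""
    if domains == ["ModE_N", "Mop", "Mop"]:
        return "full-length ModE"
    if domains == ["ModE_N"]:
        return "orphan ModE_N"
    if domains == ["Mop"]:
        return "Mop"
    if domains == ["Mop", "Mop"]:
        return "Di-Mop"
    # Any other architecture that still carries one of the two domains.
    if "ModE_N" in domains or "Mop" in domains:
        return "fusion variant"
    return "unrelated"


def has_true_regulator(proteins):
    """True if any protein in the genome is a full-length ModE."""
    return any(classify_regulator(p) == "full-length ModE" for p in proteins)
```

For example, a genome encoding separate ModE_N and Di-Mop proteins would be reported as lacking a true regulator under this rule, even though the two proteins might act together.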
In bacteria, some ModABC-containing organisms that lack ModE have separate ModE_N and Mop/Di-Mop proteins or orphan ModE_N proteins (Table 2, Fig. 5B and 5C). In addition, five different types of domain fusions were identified for Mop (3 types, Fig. 5D–5F) and ModE_N (2 types, Fig. 5G and 5H), mostly in bacteria. Analysis of genomic locations of both separate domains and fusion proteins revealed that, except for two ModE_N fusion proteins (a ModE_N-COG1910 fusion protein, which was suggested to regulate the transcription of formate dehydrogenase, and an Epsilonproteobacteria-specific unknown 3-ModE_N fusion protein, which might be a transcriptional activator rather than a repressor),25 genes coding for these proteins are close to or even within the modABC operon, suggesting a functional relationship with ModABC transporters (Fig. 5B–5F). Orthologs of these ModE-like variants could be detected in several ModE-containing organisms (see Table S1). Currently, no conclusion can be made regarding the functions of these ModE variants. One hypothesis is that separate ModE_N and Mop/Di-Mop proteins together may have a function similar to that of full-length ModE in regulating ModABC transporters (Fig. 5B). The function of orphan ModE_N is unclear. It has been suggested that ModE_N might be sufficient to mediate DNA binding for ModABC regulation, albeit weakly.21 In addition, the MerR-Mop fusion protein identified in Actinobacteria could be a candidate regulator for ModABC or other Mo-related genes, as this protein contains both a MerR-like transcription factor domain and a Mop domain. However, the fact that almost half of ModABC-containing organisms lack both ModE and its variants suggests that new regulators are present in these organisms for ModABC regulation.
Orthologous ModE or ModE_N sequences were also identified in several prokaryotes that lack ModABC transporters, especially in archaea where seven out of ten ModE_N-containing organisms lacked ModABC transporters. We noticed that in some genomes, ModE or ModE_N genes were located close to or next to genes coding for TupABC or WtpABC transporters, suggesting that the two secondary Mo/W transporter systems may be also regulated, in some organisms, by ModE-like mechanisms (Fig. 6). Further experimental verification is needed to test this possibility.
Figures 1–3 also show the occurrence of different molybdoprotein families, including Moco-containing enzymes and nitrogenase, in the three domains of life. As discussed above, there was a good correspondence between occurrence of Moco biosynthesis/Mo transport components and molybdoenzymes. In bacteria, except for the AOR family (found in 50 organisms), other Moco-containing enzymes were widespread (95.1%, 68.9% and 66.8% for DMSOR, SO and XO, respectively). The family used by most organisms, DMSOR, was largely represented by nitrate reductase (dissimilatory) and formate dehydrogenase. Many organisms possessed two or three Moco-containing protein families and several subfamilies within these families. However, the low occurrence or absence of SO and XO families in some phyla (e.g., SO in Firmicutes/Clostridia, Bacteroidetes and Chlorobi; XO in Chlorobi, Cyanobacteria, Epsilonproteobacteria and several Gammaproteobacteria clades), most of which possess the DMSOR family, suggested that the molybdoenzyme families occur independently of one another. Only 67 organisms were found to possess nitrogenase and most of them (~97%) utilized Moco.
In archaea, members of the DMSOR family were found in all Mo-utilizing organisms. In contrast to bacteria, the AOR family was found in 69.4% of Moco-utilizing organisms, whose occurrence was much higher than that of SO and XO families (47.2% and 30.6% respectively). Nitrogenase was present only in methanogenic archaea, and it was present in all of them.
In contrast to prokaryotes, eukaryotes had only two molybdoenzyme families, SO and XO. All organisms that possessed the Mo utilization trait had the SO family and 95.5% had the XO family. All animals (Metazoa), land plants, stramenopiles and Pezizomycotina had both molybdoenzyme families. Interestingly, neither the Moco utilization trait nor molybdoenzymes were detected in Saccharomycotina yeasts. It was reported that Saccharomyces cerevisiae does not contain molybdoenzymes.7 However, it has been proposed that some other yeasts, such as Candida nitratophila, Pichia anomala and Pichia angusta, utilize Mo-containing assimilatory NR.42–44 In the present study, we could not detect homologs of NR in sequenced yeast genomes, including Candida albicans, Candida glabrata, Candida tropicalis and Pichia guilliermondii. The absence of both the Moco biosynthesis pathway and assimilatory NR strongly suggested the loss of Mo utilization in these yeasts.
On the basis of the findings discussed above, it is possible to infer a general model of Mo utilization in the three domains of life. Considering that the common role of various Moco-binding proteins is to catalyze important redox reactions in the global carbon, nitrogen, and sulfur cycles, it is not surprising that Moco is essential for most organisms. However, some organisms or even complete clades may have evolved alternative mechanisms for such reactions due to the loss of both Moco biosynthesis pathway and Moco-containing enzymes.
Of the four large molybdoenzyme families that include more than 50 subfamilies in prokaryotes, only SO and XO (including NR, SO, XDH and AO subfamilies) span all three domains of life. If a protein family has representatives in all domains of life, it is thought that it was present in the last universal common ancestor.45 Therefore, we speculate that SO and XO families evolved in the common ancestor. The other two molybdoenzyme families, DMSOR and AOR, show a more limited occurrence and are detected only in prokaryotes.
In most phyla of prokaryotes, most organisms retained the Mo utilization trait although some organisms lost it. In order to investigate the contribution of horizontal gene transfer (HGT) to Moco utilization in these organisms, we analyzed the phylogeny of both Moco biosynthesis enzymes and Moco-binding proteins, but could not identify a single HGT event for the complete Mo utilization trait (including both Moco biosynthesis pathway and the corresponding molybdoenzymes) in distantly related organisms (data not shown). This observation is consistent with the idea that HGT is unlikely to have a significant role for acquisition of Moco utilization because genes involved in Moco biosynthesis are located in several operons, some of which are typically scattered throughout the genomes. On the other hand, a complete loss of the Moco utilization trait was observed in two distantly related phyla: Firmicutes/Mollicutes and Chlamydiae. The fact that their sister phyla (such as Bacillales and Clostridia for Mollicutes) commonly utilize Moco suggests that loss of the Moco utilization trait happened independently in the early ancestors of the two clades. All sequenced organisms in the two phyla are host-associated organisms, and it is possible that they exploit the Moco-binding proteins of the host. In several other evolutionarily distant lineages, such as Firmicutes/Lactobacillales and Alphaproteobacteria/Rickettsiales, very few organisms are able to use Moco. Phylogenetic analysis of the Mo utilization trait in these few organisms (as described above) did not support an HGT event from other species. Therefore, we inferred that Moco was used in the ancestors of Firmicutes/Lactobacillales and Alphaproteobacteria/Rickettsiales and was later independently lost. In addition, the loss of molybdoenzymes should accompany the loss of the Moco biosynthesis pathway. However, in Spirochaetes and Thermotogae, which completely lost the Moco biosynthesis pathways, XO homologs were detected. It is unclear whether these orphan XO homologs could use Mo as a cofactor.
Similar trends were observed in eukaryotes. Most phyla (including all animals) inherited the Moco utilization trait from the universal ancestor of all eukaryotes, whereas certain lineages including all parasites appeared to have lost it. An interesting case was observed in fungi. All sequenced Pezizomycotina contained both the Moco biosynthesis trait and the four eukaryotic molybdoprotein subfamilies. In contrast, only a small number of yeasts have been reported to possess Mo-dependent NR, which is the only reported molybdoenzyme in these organisms. S. cerevisiae, Schizosaccharomyces pombe and all other sequenced yeasts lost the ability to use this trace element. Considering the difficulty of acquisition of the whole Mo utilization trait from distant species in eukaryotes, we suggest that the common ancestor of yeasts (including Saccharomycotina and Schizosaccharomycetes) utilized Mo as a cofactor, at least for NR. However, this trait was later lost. The fact that Mo-containing NR is absent from sister species of Mo-utilizing yeasts (e.g., it is present in Candida nitratophila but absent from Candida albicans and Candida glabrata) suggests a recent loss event. NR, which catalyzes the reduction of nitrate to nitrite, is present only in autotrophic organisms such as plants, algae and fungi.2, 3 The absence of Mo-dependent NR from most yeast species suggests either that Mo-dependent reduction of nitrate to nitrite is unnecessary for these organisms or that alternative Mo-independent mechanisms have evolved.
Mo and W are found in the mononuclear form in the active sites of diverse enzymes in all three domains of life.46–48 The active sites of these enzymes include the metal ion coordinated to pyranopterin molecules and to a variable number of other ligands, such as oxygen, sulfur and selenium.49, 50 In addition, these proteins may have other redox cofactors, such as iron–sulfur centers, flavins and hemes, which are involved in intramolecular and intermolecular electron transfer processes.49 Much effort has been made on identifying and characterizing Moco biosynthesis components and Mo-dependent enzymes in various organisms. In contrast, occurrence and evolution of the overall Mo utilization trait remained unclear. In this study, we analyzed phylogenetic profiles and regulation of Mo uptake systems, Moco biosynthesis genes and Mo-containing proteins to better understand evolution and current use of Mo in nature. Our data reveal patterns and properties of Mo utilization among organisms with sequenced genomes and provide new insights into understanding the dynamic evolution of the Mo utilization trait in prokaryotes and eukaryotes.
The widespread distribution of the Mo utilization trait in prokaryotes suggests that this trace element could be used by essentially all prokaryotic phyla. In contrast, the absence of the Mo utilization trait from several evolutionarily distant phyla (e.g., Firmicutes/Mollicutes and Chlamydiae) implied a loss of this trait from these clades. There was a very good correspondence between the occurrence of the Moco biosynthesis pathway and the presence of known Moco-containing protein families. However, a few exceptions wherein some organisms lacked either Moco-containing proteins or Moco biosynthesis components suggest the presence of additional Moco-dependent protein families or alternative Mo utilization pathways in these organisms.
Besides the classic ModABC transport system, a distant ModABC-like group was predicted in Pyrobaculum. Although the ModA-like proteins appeared to be an outgroup of all three known Mo/W transporters, they belong to the same COG as E. coli ModA. The presence of modB-like and modC-like genes (as well as the modD gene) in the same operon implied that they form a distant group of ModABC transporters and are involved in Mo/W uptake. Orthologs of this group could be found only in Pyrobaculum species but not in other sister species in the same archaeal phylum. It is possible that these ModABC-like transporters evolved from an ancestral ModABC system and diverged rapidly in Pyrobaculum. On the other hand, MOT1, which is the only known Mo transporter in eukaryotes, was detected only in one-third of Mo-utilizing organisms, suggesting that most eukaryotes (including all animals) use additional unknown transport system(s) for Mo uptake.
We investigated ModE-related ModABC regulation in prokaryotes. Surprisingly, less than 30% of Mo-utilizing organisms possessed full-length ModE regulators. Over 70% of bacteria and 80% of archaea appeared not to use E. coli-type ModE for ModABC regulation. Orphan Mop or Di-Mop proteins are not specific for ModE-related regulation because they occur also in other proteins with distinct functions (e.g., the Mop domain is present in the C-terminus of ModC, and the Di-Mop domain is present in ModG, which is implicated in intracellular Mo homeostasis). Although some species contain either both ModE_N and Mop/Di-Mop proteins (which suggests a function similar to that of ModE) or orphan ModE_N (which may mediate weak ModABC regulation), almost half of ModABC-containing organisms lacked ModE-type ModABC regulation. This finding suggests the presence of novel or unspecific pathways for regulating molybdate uptake in these organisms. In addition, the occurrence of different fusion proteins containing ModE_N or Mop domains suggests the presence of more complex regulatory networks for Mo uptake, and Moco biosynthesis and utilization. Analysis of the gene neighborhoods of ModE_N/ModE and TupABC/WtpABC transporters implied that the two secondary Mo/W transporters may also be regulated by a ModE-type system in some organisms.
Analysis of Mo-containing proteins provided a straightforward approach to analyze the distribution and evolution of molybdoproteomes in various organisms. AOR was the first enzyme that was structurally characterized as a protein containing a Moco-type cofactor, and it has been proposed to be the primary enzyme responsible for the interconversion of aldehydes and carboxylates in archaea.51 However, it is the rarest known bacterial Moco-containing protein, suggesting that AOR-dependent oxidation of aldehydes is not needed for most bacterial species. The other three molybdoenzyme families are distributed much more widely, especially the DMSOR family, which is found in almost all Mo-utilizing bacteria and all Mo-utilizing archaea. Enzymes of the DMSOR family catalyze a variety of reactions that involve oxygen atom transfer to or from an available electron pair of a substrate or cleavage of a C-H bond.2, 10, 52–55 NR (dissimilatory) and formate dehydrogenase are the two major members of the DMSOR family. The formate dehydrogenase alpha subunit (FdhA) is also a selenocysteine (Sec)-containing protein that may be responsible for maintaining the Sec-decoding trait in prokaryotes.56 We compared the distribution of Mo- and Sec-utilizing organisms in both prokaryotes and eukaryotes, and found that Sec-utilizing organisms were essentially a subset of Moco-dependent organisms in prokaryotes (Fig. 7, Table S1). These data suggest that the Sec trait is dependent on the Mo utilization trait in prokaryotes because of the function of formate dehydrogenase, which is a widespread Mo-enzyme and the main user of Se in prokaryotes. In addition, occurrence of the only non-Moco-containing molybdoprotein, nitrogenase, was limited in both bacteria and archaea. This enzyme is used by several organisms to fix atmospheric nitrogen gas (N2). The fact that it was found in all methanogenic archaea implied that the function of this protein is essential for these organisms.
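The statement that Sec-utilizing organisms are "essentially a subset" of Moco-dependent organisms can be quantified as the fraction of Sec users that also carry the Mo trait. The following is a minimal sketch with invented organism labels; the function name is hypothetical and the sets do not reproduce the data in Table S1.

```python
def subset_fraction(inner, outer):
    """Fraction of organisms in `inner` that are also in `outer` (1.0 = strict subset)."""
    if not inner:
        return 1.0  # an empty set is trivially a subset
    return len(inner & outer) / len(inner)


# Hypothetical organism labels, for illustration only.
mo_users = {"org1", "org2", "org3", "org4"}
sec_users = {"org2", "org3"}

fraction = subset_fraction(sec_users, mo_users)
exceptions = sec_users - mo_users  # Sec users lacking the Mo trait
```

A fraction close to, but below, 1.0 would correspond to the "essentially a subset" wording, with the `exceptions` set listing the organisms that break the pattern.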
We attempted to generate a general evolutionary model of Mo utilization in the three domains of life. The Moco biosynthesis pathway and at least two molybdoenzyme families (SO and XO) were likely present in the last universal common ancestor. The Moco utilization trait is evolutionarily conserved in most prokaryotic and eukaryotic species due to the important redox reactions catalyzed by molybdoenzymes in carbon, nitrogen and sulfur metabolism. In addition, independent losses of the Moco utilization trait (rather than HGT from other species), and perhaps the appearance of alternative Mo-independent pathways, have played a role in the evolution of Mo utilization.
We hypothesized that, since the Moco biosynthesis trait and molybdoenzymes were either both present or both absent in organisms, and these patterns were observed in various bacterial phyla, certain common factors (e.g., habitat) may have affected the acquisition/loss of Mo utilization. To examine this possibility, we analyzed the role of environmental conditions (e.g., habitat, oxygen requirement, optimal temperature and optimal pH) and other factors (e.g., genome size, G+C content) in Mo utilization in sequenced prokaryotes. Previously, a similar strategy was used to analyze the evolution of Se utilization in bacteria.56 Fig. 8 shows the distribution of organisms that possess or lack Moco utilization with respect to several such factors.
We found that the majority of bacteria that do not utilize Moco were host-associated (i.e., parasites or symbionts, Fig. 8A), implying that a host-associated lifestyle often leads to the loss of Mo utilization, perhaps due to limited space and resources or the availability of Mo pathways of the host. This is consistent with the observation in Firmicutes/Mollicutes and Chlamydiae, all of which are host-associated and cannot utilize Moco. This idea is also supported by analysis of Mo utilization in Alphaproteobacteria/Rickettsiales. In this clade, only one out of 19 organisms utilized Mo (Candidatus Pelagibacter ubique, a marine bacterium living in ocean surface water). Notably, it is also the only non-host-associated organism in this clade.
Our data suggested a complete loss of the Mo utilization trait in all host-related organisms in this clade instead of an HGT into Candidatus Pelagibacter ubique. In addition, in many phyla, genomes of Moco-utilizing organisms had a significantly higher G+C content, suggesting that the increase in G+C content correlates with increased Mo utilization (Fig. 8B and 8C). Organisms with low G+C content (i.e., GC <40%) that lack the Moco utilization trait were found in a variety of phyla, indicating that such correlation is significant. The reason why low G+C content organisms in different clades lost the Moco utilization trait is not clear. Other factors, such as oxygen requirement, Gram stain, optimal temperature and pH, did not appear to have a role in Mo utilization. In archaea, only two organisms, Methanosphaera stadtmanae (the only sequenced parasite in archaea) and Nanoarchaeum equitans (an ancient hyperthermophilic and anaerobic obligate symbiont with a small genome that has lost the ability to use most trace elements such as nickel, cobalt, copper and selenium), lacked Mo utilization and both genomes had a very low G+C content (27.6% and 31.6%). These data provided additional support for our observation in bacteria. Thus, a host-associated lifestyle as well as reduced G+C content seem to correlate with the loss of Mo utilization.
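For readers who wish to reproduce the low G+C classification, the sketch below computes the G+C fraction of a nucleotide sequence and applies the 40% cutoff quoted above. It is an illustrative sketch only: the text does not state how G+C content was obtained for each genome, and the function names and the choice to ignore ambiguous bases (e.g., N) are assumptions.

```python
LOW_GC_THRESHOLD = 0.40  # the "GC <40%" cutoff quoted in the text


def gc_content(sequence: str) -> float:
    """Return the G+C fraction (0 to 1) of a nucleotide sequence.

    Only A, C, G and T are counted; ambiguous symbols are ignored
    (an assumption made for this sketch).
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T characters")
    return gc / (gc + at)


def is_low_gc(sequence: str) -> bool:
    """True if the sequence falls below the 40% G+C cutoff."""
    return gc_content(sequence) < LOW_GC_THRESHOLD
```

Under this definition, the two archaeal genomes mentioned above (27.6% and 31.6% G+C) would both be classified as low G+C.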
We also examined the distribution of different molybdoenzyme families with respect to the factors discussed above, and similar trends were found. Moreover, additional features were observed for different molybdoenzymes (Fig. 9). For example, organisms possessing AOR proteins favor an anaerobic environment, whereas organisms containing SO, XO or DMSOR proteins favor aerobic conditions. Organisms containing nitrogenase favor both anaerobic and relatively warm conditions (no psychrophilic organism possessed nitrogenase). These data illustrate that, although dependent on the same processes, such as Mo availability and Moco synthesis, different Mo enzymes are subject to independent and dynamic evolutionary processes.
A similar investigation of molybdoenzymes in eukaryotes provided information on Mo utilization in this domain of life. As in prokaryotes, distribution of eukaryotic Mo-containing proteins essentially matched the Moco utilization trait. However, only SO (including NR and SO) and XO (including XDH and AO) families could be detected, suggesting a much smaller molybdoproteome in eukaryotes than in prokaryotes. The functional roles of these four subfamilies have been investigated in different organisms.2, 3, 7 Besides NR, which is a key enzyme of nitrate assimilation and does not occur in animals, the other three enzymes are present in a variety of clades including unicellular organisms and animals. SO catalyzes the oxidation of sulfite to sulfate (the final step in the degradation of sulfur-containing amino acids).7 XDH is a key enzyme in purine degradation that oxidizes both hypoxanthine to xanthine and xanthine to uric acid, whereas AO catalyzes the oxidation of a variety of aromatic and nonaromatic heterocycles and aldehydes and converts them to the respective carboxylic acids.7 All parasites lost the ability to synthesize Moco, which is consistent with what we found in prokaryotes, suggesting that Mo utilization may have been present in the eukaryotic progenitor and became unnecessary for parasites because of the reduced availability of Mo or dependence on the corresponding metabolic pathways of the host. Both Mo-dependent and Mo-independent organisms were found among fungi. The recent loss of the Moco biosynthesis pathway and Mo-dependent NR in most yeasts, including S. cerevisiae, suggested that Mo-dependent nitrate assimilation may be unnecessary or have been replaced by other pathways in these organisms. It is known that nitrate assimilation is one of two major biological processes by which inorganic nitrogen is converted to ammonia and hence to organic nitrogen.58 Although S. cerevisiae lacks both the Moco biosynthesis trait and NR, it contains a number of genes whose products convert glutamine to glutamate, providing a major source of organic nitrogen.59 In addition, glutathione (GSH) stored in the yeast vacuole can serve as an alternative nitrogen source during nitrogen starvation.60 It is unclear whether the ancestor of yeasts possessed other Mo-binding enzymes. However, alternative Mo-independent pathways for sulfur and carbon metabolism may have evolved in yeasts. Both Mo-dependent and Mo-independent fungi are free-living organisms and in this case we could not identify a common environmental factor related to Mo utilization. Thus, additional unidentified factors may have affected Mo utilization in fungi. A future challenge would be to discover these factors as well as additional features influencing Mo utilization in the three domains of life.
In conclusion, we report a comprehensive comparative genomics analysis of Mo utilization in prokaryotes and eukaryotes by examining the occurrence of proteins involved in Moco biosynthesis, Mo transport and Mo utilization (molybdoenzymes). Our data reveal a complex and dynamic evolutionary process of Mo utilization. Most bacteria and archaea utilize Mo, with the exception of parasites and organisms with low genomic G+C content. A distant group of the ModABC transport system was identified in Pyrobaculum species. Regulation of Mo uptake must be more complex than previously thought, as ModE-type ModABC regulatory systems occurred in only a limited number of Moco-utilizing organisms. In contrast to the wide use of Mo in prokaryotes, the utilization of this element in eukaryotes is more restricted, both with regard to the number of organisms that depend on Mo and the number of molybdoprotein families that occur in them. Again, host-associated conditions appear to lead to the loss of Mo utilization.
Sequenced genomes of archaea, bacteria and eukaryotes were retrieved from the NCBI website (http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi). Only one strain was used for each species (e.g., E. coli K12 was used as a representative of E. coli). A total of 451 bacterial, 38 archaeal and 142 eukaryotic organisms were analyzed (as of Feb. 2007).
We used several well-characterized proteins that are Mo/W transporters or known to be involved in the Moco biosynthesis pathways as our seed sequences to search for homologs in sequence databases. In prokaryotes, products of moa (moaA-moaE), mod (modABC and modE) and moe (moeA and moeB) operons from E. coli, WtpABC from P. furiosus and TupABC from E. acidaminophilum were used to identify a set of primary homologous sequences using TBLASTN with an e-value <1. Iterative TBLASTN searches were then performed within each phylum, using different homologous sequences from the primary set as queries, to identify more distant homologs. In parallel, three cycles of PSI-BLAST with default parameters were used for the identification of distant homologs. Orthologous proteins were defined as bidirectional best hits.61 When necessary, orthologs were also confirmed by genomic location analysis or by building phylogenetic trees for the corresponding protein families. The Moco utilization trait was considered present only when most of these genes were detected. Members of known Moco protein families as well as nitrogenase were identified using a similar approach.
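The bidirectional-best-hit definition of orthologs used above can be illustrated as follows. This is a minimal sketch under stated assumptions, not the pipeline used in this study: it assumes that the search results have already been parsed into dictionaries mapping each query to a list of (subject, score) pairs, and the function names and protein identifiers in the usage note are hypothetical.

```python
def best_hits(hits):
    """Map each query ID to its single highest-scoring subject ID.

    `hits` maps a query ID to a list of (subject ID, score) tuples;
    queries with no hits are skipped. Ties keep the first listed hit.
    """
    return {
        query: max(subjects, key=lambda pair: pair[1])[0]
        for query, subjects in hits.items()
        if subjects
    }


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted(
        (a, b) for a, b in best_ab.items() if best_ba.get(b) == a
    )
```

For instance, if a hypothetical query `modA_ref` has its best hit `x1` in a target genome and `x1` in turn has `modA_ref` as its best hit in the reference genome, the pair is reported as orthologous; a query whose best hit prefers a different reference protein is not.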
In eukaryotes, we used MOT1 (a recently identified Mo-specific transporter in plants), and Cnx1–3 and Cnx5–7 from A. thaliana as seed sequences to detect molybdate transporters and the Moco utilization trait in sequenced genomes. Considering the uncertainty of the Moco biosynthesis pathway in unicellular eukaryotes and the incompleteness of some genome sequences, the presence of the Moco utilization trait was verified in these organisms by the following criteria: at least two orthologs of proteins involved in Moco biosynthesis and at least one known Mo-containing protein detected in the same organism.
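The two criteria above translate directly into a simple decision rule. The sketch below encodes that rule for a single organism; the function name and the representation of detected proteins as sets of labels are assumptions made for illustration.

```python
def has_moco_trait_eukaryote(moco_biosynthesis_orthologs, mo_proteins):
    """Apply the eukaryotic Moco-utilization criteria to one organism.

    The trait is called present when at least two distinct orthologs of
    Moco biosynthesis proteins AND at least one known Mo-containing
    protein are detected in the same organism.
    """
    n_biosynthesis = len(set(moco_biosynthesis_orthologs))
    n_mo_proteins = len(set(mo_proteins))
    return n_biosynthesis >= 2 and n_mo_proteins >= 1
```

An organism with detected orthologs of Cnx1 and Cnx2 plus an SO homolog would satisfy the rule, whereas one with a single biosynthesis ortholog, or with no Mo-containing protein, would not.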
To investigate the distribution of organisms that utilize Mo in different phyla, we adopted a phylogenetic tree developed by Ciccarelli et al.,39 which is based on concatenation of 31 orthologs occurring in 191 species with sequenced genomes. Phylogenetic trees of each Mo/W transporter system were reconstructed by standard approaches. Sequences were aligned with CLUSTALW62 using default parameters. Ambiguous alignments in highly variable (gap-rich) regions were excluded. The resulting multiple alignments were then checked for conservation of functional residues and edited manually. In addition, the MUSCLE63 alignment tool was used to evaluate the CLUSTALW results. Phylogenetic analyses were performed using PHYLIP programs.64 Pairwise distance matrices were calculated by PROTDIST to estimate the expected amino acid replacements per position. Neighbor-joining (NJ) trees were obtained with NEIGHBOR and the most parsimonious trees were determined with PROTPARS. Robustness of these trees was then evaluated by maximum likelihood (ML) analysis with PHYML65 and Bayesian estimation of phylogeny with MrBayes.66
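The exclusion of gap-rich alignment regions can be approximated programmatically. The following sketch removes alignment columns whose gap fraction exceeds a threshold; it is only an illustration, since the criterion actually applied (including any manual editing) is not specified here. The function name and the default 50% threshold are assumptions.

```python
def drop_gap_rich_columns(alignment, max_gap_fraction=0.5):
    """Remove columns whose fraction of '-' characters exceeds the threshold.

    `alignment` is a list of equal-length aligned sequences (strings).
    The 0.5 default is an arbitrary choice for this sketch.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n_rows = len(alignment)
    keep = [
        i for i in range(length)
        if sum(row[i] == "-" for row in alignment) / n_rows <= max_gap_fraction
    ]
    return ["".join(row[i] for i in keep) for row in alignment]
```

With the toy alignment `["MK-LV", "MR-LI", "MKALV"]`, the third column is a gap in two of three sequences and is removed, leaving `["MKLV", "MRLI", "MKLV"]`.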