Folate coenzymes function as one-carbon group carriers in intracellular metabolic pathways. Folate-dependent reactions are compartmentalized within the cell and are catalyzed by two distinct groups of enzymes, cytosolic and mitochondrial. Some folate enzymes are present in both compartments and are likely the products of gene duplications. A well-characterized cytosolic folate enzyme, FDH (10-formyltetrahydrofolate dehydrogenase, ALDH1L1), contains a domain with significant sequence similarity to aldehyde dehydrogenases. This domain enables FDH to catalyze the NADP+-dependent conversion of short-chain aldehydes to the corresponding acids in vitro. The aldehyde dehydrogenase-like reaction is the final step in the overall FDH mechanism, by which a tetrahydrofolate-bound formyl group is oxidized to CO2 in an NADP+-dependent fashion. We have recently cloned and characterized another folate enzyme containing an ALDH domain, a mitochondrial FDH. Here, the biological roles of the two enzymes, a comparison of the respective genes, and some potential evolutionary implications are discussed. The phylogenetic analysis suggests that the vertebrate ALDH1L2 gene arose from a duplication of the ALDH1L1 gene prior to the emergence of osseous fish >500 million years ago.
In the cell, folate coenzymes participate in numerous reactions of one-carbon transfer (Fig. 1, reviewed in [1–3]), including de novo nucleotide biosynthesis, conversions of several amino acids and the incorporation of formate-derived carbon into the folate pool. Another group of folate reactions involves interconversions of different forms of the coenzyme, in which one-carbon groups remain folate-bound but alter their oxidation state. Additional folate reactions, which do not involve the conversion of one-carbon groups, are: (i) the reduction of folic acid to dihydrofolate and then to tetrahydrofolate, and (ii) the addition of glutamic acid residues to folate monoglutamate to form folate polyglutamates. These reactions are required to produce the active form of the coenzyme (via dihydrofolate reductase) and to retain folate within the cell (via folylpolyglutamate synthetase). Finally, a reaction catalyzed by 10-formyltetrahydrofolate dehydrogenase (ALDH1L1, FDH) irreversibly removes carbon groups from the folate pool in the form of CO2. This pathway differs from other folate-dependent reactions in how the one-carbon group is used: instead of entering a biosynthetic pathway, it is diverted toward energy production coupled with the final step of carbon oxidation, from the level of formate to the level of CO2.
Folate metabolism is highly compartmentalized in the cell, with the major pathways localized to either the cytoplasm or mitochondria. Mitochondrial folate metabolism is generally viewed as a supplier of one-carbon groups for cytosolic folate-dependent biosynthetic reactions [5, 6]. In addition, recent studies have indicated a nuclear compartmentation for some folate-dependent reactions as well [2, 3, 7, 8]. The nucleus-related aspects of folate metabolism, however, are less studied than the cytosolic and mitochondrial folate pathways. The cytosolic and mitochondrial compartmentation of folate metabolism occurs at two levels. First, there are two different folate pools, which are not easily interchangeable since folate cannot freely traverse the mitochondrial membrane. Transport of folate into mitochondria is carried out by a specific transporter, which is homologous to several inner mitochondrial wall transporters. Second, there are separate sets of folate enzymes residing in the cytosol or mitochondria that define the specificity of folate pathways in each compartment [2, 3]. Some folate enzymes and corresponding reactions are unique to a single compartment, either cytosolic or mitochondrial. Several enzymes, however, are present in both compartments, and FDH belongs to this group (Table 1).
FDH converts 10-formyl-THF to THF and CO2 in an NADP+-dependent dehydrogenase reaction (Fig. 2 inset). The cytosolic form of this enzyme, ALDH1L1, has been known for a long time and is a well-characterized protein [4, 15]. ALDH1L1 appears to be a natural fusion of three unrelated genes, which gives the protein a complex domain structure. The functional domains that compose the protein are an aldehyde dehydrogenase domain (carboxyl-terminal), a folate-binding/hydrolase domain (amino-terminal), and an acyl carrier protein-like intermediate domain [34–36]. The conversion of 10-formyl-THF to THF and CO2 includes three steps, two catalytic and one transfer step. In the first step, the formyl group is removed from the folate molecule in a hydrolase reaction; in the second step, this group, covalently attached to a 4′-phosphopantetheine moiety of the intermediate domain, is transferred to the carboxyl-terminal domain; in the third step, it undergoes oxidation there through an ALDH-like mechanism. The presence of the ALDH domain enables the enzyme to perform aldehyde dehydrogenase catalysis as well. It is not clear, however, whether this reaction has an independent physiological significance and what the substrate for the enzyme in such a reaction would be in vivo.
We have recently identified and characterized a homolog of ALDH1L1, ALDH1L2, which is the product of a separate gene. ALDH1L2 is a mitochondrial enzyme with high sequence similarity to ALDH1L1. An additional sequence at its amino-terminus is unique to ALDH1L2 and is a functional mitochondrial leader sequence, which is absent in ALDH1L1. In the present study, we have evaluated the presence of both ALDH1L1 and ALDH1L2 genes in the genomes of several species and examined the organization and appearance of these genes during vertebrate and invertebrate evolution.
BLAST (Basic Local Alignment Search Tool) studies were undertaken using web tools from the National Center for Biotechnology Information (NCBI) (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Protein BLAST analyses used ALDH1L1 amino acid sequences. Non-redundant protein sequence databases for several vertebrate and invertebrate genomes were examined using the BLASTP algorithm, including human (Homo sapiens), chimpanzee (Pan troglodytes), orangutan (Pongo abelii), rhesus monkey (Macaca mulatta), cow (Bos taurus), horse (Equus caballus), mouse (Mus musculus), rat (Rattus norvegicus), opossum (Monodelphis domestica), platypus (Ornithorhynchus anatinus), chicken (Gallus gallus), frog (Xenopus tropicalis), zebrafish (Danio rerio), tetraodon fish (Tetraodon nigroviridis), fruit fly (Drosophila melanogaster) and nematode (Caenorhabditis elegans). This procedure produced multiple BLAST ‘hits’ for each of the protein databases, which were individually examined and retained in FASTA format, and a record was kept of the sequences for predicted mRNAs and encoded ALDH1L-like proteins. These records were derived from annotated genomic sequences using the GNOMON gene prediction method and from predicted sequences with high similarity scores. Predicted ALDH1L-like protein sequences were obtained in each case and subjected to analyses of predicted protein and gene structures. BLAT analyses were subsequently undertaken for each of the predicted ALDH1L amino acid sequences using the UC Santa Cruz genome browser (http://genome.ucsc.edu/cgi-bin/hgBlat) with the default settings to obtain the predicted locations for each of the ALDH1L genes, including predicted exon boundary locations and gene sizes. BLAT analyses were similarly undertaken for other human ALDH genes (Table 2).
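The gene sizes and exon counts obtained from BLAT output follow directly from the exon block coordinates of each hit. The following is a minimal Python sketch of that bookkeeping, assuming the exon blocks have already been extracted as 0-based, half-open (start, end) genomic intervals. The function name `summarize_gene` and the coordinates in the usage lines are hypothetical and are not taken from Table 2.

```python
from typing import Dict, List, Tuple

# One exon block as (start, end) in 0-based, half-open genomic coordinates.
# This coordinate convention is an assumption made for the sketch.
Exon = Tuple[int, int]


def summarize_gene(exons: List[Exon]) -> Dict[str, int]:
    """Return exon count, genomic span and summed exon length for one gene."""
    if not exons:
        raise ValueError("at least one exon block is required")
    ordered = sorted(exons)
    for start, end in ordered:
        if start >= end:
            raise ValueError(f"empty or inverted exon block: ({start}, {end})")
    # Adjacent blocks must not overlap once sorted by start position.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start < prev_end:
            raise ValueError("overlapping exon blocks")
    return {
        "exon_count": len(ordered),
        # Gene size: distance from the first exon start to the last exon end.
        "gene_size": ordered[-1][1] - ordered[0][0],
        # Exonic length: total number of bases covered by exon blocks.
        "exonic_length": sum(end - start for start, end in ordered),
    }


# Hypothetical three-exon gene, for illustration only.
example = summarize_gene([(900, 1000), (100, 200), (500, 650)])
print(example)
```

Strand is deliberately ignored here because exon count and gene size do not depend on it; a fuller treatment would carry the strand alongside the coordinates.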
MitoProt web tools were used to predict the N-terminal protein region that can support a mitochondrial targeting sequence and the cleavage site for each of the predicted vertebrate ALDH1L2 sequences (http://ihg2.helmholtz-muenchen.de/ihg/mitoprot.html).
Alignments of vertebrate ALDH-like protein sequences for ALDH1L (e.g., residues 417-902 for human ALDH1L1 and residues 428-923 for human ALDH1L2) and human ALDH1A1, ALDH1A2, ALDH1A3, ALDH1B1, ALDH2, ALDH3A1, ALDH3A2 and ALDH3B1 sequences (see Table 2 for sources) were assembled using BioEdit v.5.0.1 with the default settings. Ambiguously aligned regions, including the amino and carboxyl termini, were excluded prior to phylogenetic analysis, yielding alignments of 396 residues for comparisons of vertebrate ALDH1L and human ALDH sequences with the fruit fly (Drosophila melanogaster) and nematode (Caenorhabditis elegans) ALDH1L1 sequences (Table 2). Evolutionary distances were calculated using the Kimura option in TREECON. Phylogenetic trees were constructed from evolutionary distances using the neighbor-joining method. Tree topology was reexamined by the bootstrap method of resampling (100 bootstraps were applied), and only values that were highly significant (≥90) are shown.
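The distance and resampling steps can be illustrated with a short Python sketch. It assumes the commonly cited Kimura correction for protein distances, d = −ln(1 − p − 0.2p²), where p is the fraction of compared sites that differ; whether TREECON applies exactly this formula is an assumption here. The function names and the choice of characters treated as missing data are hypothetical, and tree building by neighbor joining is not implemented.

```python
import math
import random
from typing import List

# Characters treated as missing data and skipped in comparisons (an assumption).
MISSING = set("-.?X")


def p_distance(a: str, b: str) -> float:
    """Fraction of compared alignment columns at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in MISSING or y in MISSING:
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return differing / compared


def kimura_distance(a: str, b: str) -> float:
    """Kimura-corrected protein distance, d = -ln(1 - p - 0.2 p^2)."""
    p = p_distance(a, b)
    if p == 0.0:
        return 0.0
    arg = 1.0 - p - 0.2 * p * p
    # The correction is undefined once p exceeds roughly 0.854.
    return -math.log(arg) if arg > 0.0 else math.inf


def distance_matrix(seqs: List[str]) -> List[List[float]]:
    """Symmetric matrix of pairwise Kimura distances."""
    n = len(seqs)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = kimura_distance(seqs[i], seqs[j])
    return matrix


def bootstrap_replicate(seqs: List[str], rng: random.Random) -> List[str]:
    """Resample alignment columns with replacement, keeping rows in register."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    columns = [rng.randrange(length) for _ in range(length)]
    return ["".join(s[c] for c in columns) for s in seqs]
```

In use, one would draw 100 replicates with `bootstrap_replicate`, compute a `distance_matrix` for each, infer a tree from every matrix, and report for each branch the number of replicate trees that contain it.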
Table 2 summarizes the predicted locations for vertebrate and invertebrate ALDH1L-like genes based upon BLAT interrogations of several genomes with the UC Santa Cruz genome browser, using the reported sequences for human/mouse and rat [15, 46] and the predicted sequences for other vertebrate genes. Predicted primate ALDH1L1 and ALDH1L2 genes were predominantly transcribed on the negative strand, with the exception of the orangutan (Pongo abelii) ALDH1L1 gene, which was transcribed on the positive strand. The vertebrate ALDH1L1 genes examined contain between 21 and 25 exons, and the ALDH1L2 genes contain between 22 and 25 exons (Table 2). Within the same species, the two genes may or may not have the same number of exons. While this variability could be attributed to incomplete annotation of some genomes at present, in most genomes the first exon encodes the 5′-non-translatable mRNA region in ALDH1L1 genes and the N-terminal mitochondrial targeting sequence in ALDH1L2 genes. The invertebrate genomes examined (fruit fly and nematode) exhibited only a single ALDH1L-like sequence, which lacked the N-terminal mitochondrial targeting sequence in each case. Fewer exons were observed for the invertebrate ALDH1L1 genes examined, with the fruit fly (Drosophila melanogaster) and nematode (Caenorhabditis elegans) genes exhibiting 2 and 7 exons, respectively. It is apparent, however, that the fused nature of ALDH1L1 and ALDH1L2 genes and proteins, previously reported for mammalian enzymes, has been retained for all of the invertebrate and other vertebrate genes and enzymes examined.
A phylogenetic tree (Figure 2) was calculated by the progressive alignment of 24 vertebrate ALDH1L1 and ALDH1L2 amino acid sequences with human ALDH1A1, ALDH1A2, ALDH1A3, ALDH1B1, ALDH2, ALDH3A1, ALDH3A2 and ALDH3B1 sequences and with the fruit fly (Drosophila melanogaster) and nematode (Caenorhabditis elegans) ALDH1L1 sequences (Table 2). The phylogram showed clustering of the ALDH sequences into groups that were consistent with their evolutionary relatedness, as well as groups for vertebrate ALDH1L1 and ALDH1L2 sequences, which were distinct from the human ALDH1-, ALDH2- and ALDH3-like sequences. The ALDH1L1 and ALDH1L2 groups were significantly different from each other (with bootstrap values of 99–100/100), supporting a hypothesis that these are distinct but related family groups. It is apparent from this study of vertebrate ALDH1L genes and proteins that this is an ancient protein for which a proposed common gene ancestor predated the appearance of osseous fish >500 million years ago. In addition, the ALDH1L1 gene, which encodes the cytoplasmic form of this enzyme, may have served as the ancestral gene, given that both fruit fly and nematode genomes exhibited only a single ALDH1L1-like gene. Genetic distances for human, cow, mouse and rat ALDH1L1 and ALDH1L2 sequences calculated from the corresponding zebrafish sequences were 0.846±0.006 and 0.858±0.008, respectively, which suggests that these sequences have diverged at similar rates during vertebrate evolution.
FDH, a multidomain enzyme, is the product of a fusion of three unrelated genes [4, 15, 36]. The two catalytic modules of the enzyme, the amino-terminal hydrolase and the carboxyl-terminal aldehyde dehydrogenase, retain their respective catalytic activities when expressed as individual proteins [46, 48]. The third module, an acyl carrier-like domain, couples the two catalytic domains together, which produces a new catalytic activity, that of 10-formyltetrahydrofolate dehydrogenase. Several molecular mechanisms, including exon shuffling and gene duplication/fusion, must underlie the origin of such new chimeric genes from simpler ancient ones [49–51]. In fact, the complex domain organization seen in the FDH molecule is a common phenomenon in nature [52–54]. The presence of different functionalities within one protein molecule can be beneficial for several reasons. In the case of multifunctional enzymes, the combination of catalytic activities from the same metabolic pathway allows for substrate channeling, a process that protects unstable short-lived intermediates, prevents the loss of a substrate due to diffusion and eliminates side-reactions [53, 55]. The expected effect of substrate channeling would be more efficient catalysis. Such multifunctional enzymes encoded by a single gene are found in several biochemical pathways, including lipid metabolism, de novo purine and pyrimidine biosynthesis [20, 57], and folate metabolism [2, 58]. In some other proteins resulting from gene fusion or exon shuffling, the combination of domains could create a new function, and FDH is an outstanding example of this phenomenon.
The presence of the ALDH1L1 gene evidently provided a selective advantage for higher organisms, since it has been retained throughout their evolution. The importance of the new reaction for the cell, as well as its precise evolutionary advantage, is not completely understood. More ancient organisms (e.g., bacteria, plants and fungi) do not have a corresponding enzyme. The loss of Aldh1l1 in mice, while affecting the distribution of reduced folate pools, is not lethal and does not produce a distinct phenotype when animals are kept on a folate-rich diet. These mice, however, demonstrated decreased reproductive efficiency. We suggest that FDH controls the overall flux of one-carbon groups through the folate pool. In agreement with this hypothesis, it has been shown that the enzyme regulates the major folate-dependent biosynthetic processes, de novo purine pathways and regeneration of methionine from homocysteine [61–63]. The key role of FDH is associated with the fact that it functions as a catabolic enzyme with regard to one-carbon group conversion: the enzyme removes these groups, in the form of CO2, from the folate pool, thus counteracting biosynthetic processes. In this sense, FDH could serve to limit excessive proliferation, which is an unwanted process for most tissues in an adult organism. In support of this possibility, it has been observed that FDH is strongly and ubiquitously down-regulated in cancers.
It has also been suggested that FDH is a crucial component of the methanol detoxification pathway. Methanol toxicity is primarily caused by its metabolite, formic acid, which is responsible for the metabolic acidosis and ocular toxicity observed in methanol-intoxicated humans. Thus, on a more general note, the enzyme should be considered a component of the formate degradation pathway. This pathway converts formate to neutral CO2 through 10-formyltetrahydrofolate as an intermediate. The two steps of this pathway are catalyzed by cytosolic C1-synthase and ALDH1L1, respectively. In the cell, formate is directly produced in pathways involving the degradation of 3-methyl-branched fatty acids and the shortening of 2-hydroxy long chain fatty acids. In addition, methanol is produced during fermentation from the hydrolysis of fruit pectin and thus is present in juices and alcoholic beverages. Interestingly, the artificial sweetener aspartame also generates a small amount of methanol. Importantly, it has been demonstrated that the ALDH1L1 pathway is more prominent for the clearance of lower, physiological doses of formate. Bacteria, yeast and plants possess an enzyme, formate dehydrogenase (EC 1.2.1.2), which directly oxidizes formate to CO2, and this enzyme is not found in higher animals. Thus, it can be speculated that the FDH-catalyzed reaction evolved as a compensatory pathway to clear formate.
The mitochondrial FDH (mtFDH, ALDH1L2) is structurally very similar to the cytosolic enzyme. While ALDH1L1 was evidently the natural product of the gene fusion, mtFDH most likely resulted from a duplication of the ALDH1L1 gene. This point of view is supported by the fact that mtFDH appears later on the evolutionary tree than cytosolic FDH and that the two ALDH1L genes have higher similarity to each other than to the potential ALDH ancestors. For instance, for the ALDH domain, the similarity between the two proteins is about 79%, while the closest member of the ALDH family, retinaldehyde dehydrogenase, is only about 50% similar to either of the FDH isoforms. Of note, gene duplication is not uncommon for folate enzymes, with at least two other examples known, MTHFD and SHMT [2, 3, 70]. Interestingly, another genetic mechanism that can create mitochondrial and cytosolic isoforms is alternative splicing. In this mechanism, the exon encoding a mitochondrial leader sequence is spliced out, which produces a protein localized to the cytosol. The two isoforms of FPGS, mitochondrial and cytosolic, are the result of the alternative splicing of a single gene. Moreover, this mechanism is also possible for the gene encoding mitochondrial SHMT. Such a mechanism, however, has not been seen in the ALDH gene family.
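Percentage figures such as those quoted above are computed from pairwise alignments. The sketch below shows one simple way to obtain percent identity and percent similarity from two aligned sequences in Python. The residue grouping used to define "similar" is an illustrative assumption, not the scoring scheme used in this study, and the function name is hypothetical.

```python
from typing import Dict, Tuple

# Illustrative residue groups; two different residues count as "similar"
# when they fall in the same group. This grouping is an assumption.
SIMILAR_GROUPS = ["AVLIMC", "FWYH", "ST", "NQ", "DE", "KR", "G", "P"]
GROUP_OF: Dict[str, int] = {
    residue: index
    for index, group in enumerate(SIMILAR_GROUPS)
    for residue in group
}
GAP = "-"


def percent_identity_similarity(a: str, b: str) -> Tuple[float, float]:
    """Return (percent identity, percent similarity) over ungapped columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == GAP or y == GAP:
            continue  # columns with a gap in either sequence are skipped
        compared += 1
        if x == y:
            identical += 1
            similar += 1
        elif x in GROUP_OF and GROUP_OF[x] == GROUP_OF.get(y):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy aligned fragments, for illustration only (not real ALDH1L sequences).
identity, similarity = percent_identity_similarity("ACDEK", "ACEDR")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Reported similarity values depend on the substitution scheme and on how gapped columns are handled, so numbers from this sketch would not be expected to reproduce the percentages given in the text.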
In the case of the ALDH1L genes, it appears that an opposing mechanism was responsible for the creation of the mitochondrial enzyme: instead of the loss of a mitochondrial leader sequence, its acquisition took place. This has apparently occurred without significant changes in the ALDH1L1 gene organization. Thus, both genes, ALDH1L1 and ALDH1L2, have a similar number of exons in all species examined, with some species having an identical number of exons in both genes and others showing a difference of only one or two exons (Table 2). The first exon of ALDH1L1 is non-translatable, but in ALDH1L2 it encodes the mitochondrial leader sequence. Evidently, alterations within this exon allowed for the acquisition of a mitochondrial leader sequence.
It is not clear whether the presence of two ALDH1L genes, and thus two FDH isoforms, provides a selective advantage for a species or whether it resulted from a random gene duplication event. C. elegans and insects have only the ALDH1L1 gene. Our recent analysis of the zebrafish genome revealed two FDH-like genes, for which protein products were predicted to reside in mitochondria. In addition, the combination of these proteins corresponded to a full-length FDH. However, these two genes have been deleted from the most recent annotation of the zebrafish genome, and a new annotation of the ALDH1L2 gene encoding full-length mitochondrial FDH has been included in the database. Thus, the duplication of the ALDH1L1 gene took place prior to the appearance of osseous fish. The present study identified only a single FDH gene in birds, ALDH1L2, which may be due to the incomplete annotation of ALDH1L genes and proteins in these species. Alternatively, the two ALDH1L genes may be redundant in bird species and, similar to mitochondrial SHMT in mice, avian ALDH1L2 could produce a cytosolic enzyme through an alternative splicing mechanism. Whether alternative splicing of ALDH1L2 is possible is unclear at present, but a preliminary analysis of this gene did not indicate such a splice variant.
BLAST and BLAT analyses of several vertebrate genome databases were undertaken using amino acid sequences reported for the human ALDH1L1 (cytosolic) and ALDH1L2 (mitochondrial) enzymes. Evidence is presented for ALDH1L1 and ALDH1L2 genes in all vertebrate genomes examined, with the exception of the opossum, platypus and chicken genomes, for which only ALDH1L2 sequences were observed. This may be due to an incomplete annotation of ALDH1L1 sequences or to an alternative mechanism (such as differential splicing of ALDH1L2) for generating a cytosolic form of ALDH1L in these species. Predicted amino acid sequences for vertebrate ALDH1L-like subunits showed a high degree of similarity with the corresponding human enzymes. Phylogenetic analyses supported a hypothesis concerning the molecular evolution of vertebrate ALDH1L-like genes: vertebrate ALDH1L1 and ALDH1L2 genes were generated within a common ancestral vertebrate genome by a duplication of the gene encoding cytosolic ALDH1L1, prior to the appearance of osseous fish, more than 500 million years ago.
This study was supported by the National Institutes of Health grant DK054388 (SAK) and a Ruth L. Kirschstein National Research Service Award for Individual Predoctoral MD/PhD Fellows F30DK083215 (KCS). KCS was supported by USPHS NIH grant R13-AA019612 to present this work at the 15th International Meeting on Enzymology and Molecular Biology of Carbonyl Metabolism in Lexington, KY USA.