MTHFD1 is a logical candidate gene for investigation in relation to disease risk. We therefore undertook the first study of the promoter controlling its expression and report a functional promoter SNP, rs1076991 C>T, that affects gene expression
in vitro. Our results demonstrate that the
MTHFD1 gene is regulated by a TATA-less, Inr-less promoter that directs transcriptional initiation at multiple start sites within a 126bp initiation window in the upstream region. Three major TSSs were identified at positions 68, 72 and 100bp upstream of the translation start site. An alternate upstream exon is not present and TSS usage is similar between different individuals and tissue types investigated (lymphocyte and placenta). This type of TATA-less promoter with multiple start sites resembles that found in other folate-related genes, including thymidylate synthase (
Dong et al., 2000) and reduced folate carrier (
Gong et al., 1999), and is also similar to the promoter controlling rat
Mthfd1 expression (
Howard et al., 2003). A 1.38kb CpG island spans the promoter region, and is associated with the presence of multiple GC boxes, suggesting that Sp1 is an important TF for gene regulation (
Brandeis et al., 1994). It has been proposed that these Sp1 sites may serve to maintain the hypomethylated state of such promoters, indicating that the gene will be constitutively expressed (
Pugh & Tjian, 1990).
In silico prediction and previously reported empirical evidence from large-scale promoter studies suggest the E2F family of TFs will also play an important role in
MTHFD1 gene expression (
Cam et al., 2004;
Weinmann et al., 2002). Therefore, it is likely that
MTHFD1 will be cell cycle-regulated and, although the gene is predicted to be constitutively expressed, its expression would be expected to be up-regulated during S phase, given its role in DNA synthesis, and down-regulated in the G0 and G1 phases.
In silico analysis also indicates that c-Myc-Max binding to a conserved E-box motif regulates
MTHFD1 expression. Cross-species conservation signifies the importance of this promoter feature, since nucleotide sequences that are actively conserved during evolution are likely to be of biological importance. Functionality is also supported by the identification of c-Myc-Max binding to the
MTHFD1 upstream region in another large-scale promoter binding study (
Mao et al., 2003). The c-Myc TF can play a role in recruiting factors necessary for the initiation of transcription to the core promoter (
Hermann et al., 2001;
McEwan et al., 1996) and could be responsible for this process in the absence of a TATA box/Inr in the
MTHFD1 promoter. Co-ordinated binding of these TFs, as well as other predicted ones such as NRF-1, is likely responsible for the high levels of activated transcription measured from the
MTHFD1 promoter, especially within the first 0.47kb of the upstream region, which supports the highest level of transcriptional activity. Activated transcription was not induced by a promoter construct of 0.11kb, demonstrating the absence of essential regulatory elements from this fragment and indicating that the minimal promoter region for activated transcription of this gene lies between 0.11kb and 0.26kb upstream. The drop in activity observed with the 1.94kb construct may be due to an as yet unidentified repressor element. Our analysis was confined to the 2kb region upstream of the translational start site of MTHFD1; thus, we cannot rule out a role for additional regulatory elements further upstream.
Promoter function and normal gene expression can be significantly affected by polymorphisms in important regulatory regions (
Hoogendoorn et al., 2003). SNP rs1076991 C>T is located within the window of initiation and was shown to have a significant impact on promoter function
in vitro. Transcriptional activity of the 0.59kb ‘T’ promoter construct was approximately 1.6-fold lower than that of the 0.59kb wild-type ‘C’ construct. If this effect is translated
in vivo, a decrease in
MTHFD1 gene expression could result in lower levels of the MTHFD1 enzyme available for purine and thymidylate synthesis, a situation that could be detrimental under certain conditions, especially during times of increased demand for
de novo DNA synthesis, such as embryogenesis. However, results obtained from
in vitro reporter gene studies should be interpreted with caution, since gene regulation and expression in the natural genomic environment
in vivo is undoubtedly more complex than that seen in a cell line model. Bioinformatic analysis did not reveal an alteration of any consensus TF binding site that would explain the observed difference in activity between the two genotypes. The loss of a DNA methylation site is also possible, but is unlikely to explain the reduced expression of MTHFD1
in vitro. The most likely explanation for the functional impact of SNP rs1076991 on MTHFD1 gene expression is the loss or gain of TF binding at a non-consensus site, the identification of which would require further investigation.
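To make the reported fold difference concrete, the following minimal sketch shows how a fold change between two reporter constructs might be computed. It assumes a dual-reporter design in which each promoter-driven reading is normalised to a co-transfected control reporter; that design, the function names, and every reading below are illustrative assumptions and are not data or methods taken from this study.

```python
from statistics import mean

def normalised_activity(reporter, control):
    """Divide each promoter-driven reading by its paired control reading."""
    return [r / c for r, c in zip(reporter, control)]

def fold_reduction(reference, variant):
    """Mean reference activity divided by mean variant activity.

    A value above 1 means the variant construct is less active.
    """
    return mean(reference) / mean(variant)

# Hypothetical readings in arbitrary units, invented purely for illustration.
c_allele = normalised_activity([800.0, 820.0, 780.0], [100.0, 100.0, 100.0])
t_allele = normalised_activity([500.0, 510.0, 490.0], [100.0, 100.0, 100.0])

ratio = fold_reduction(c_allele, t_allele)
print(f"C construct is {ratio:.2f}-fold more active than the T construct")
```

With real data, replicate variability would also need to be assessed with an appropriate statistical test before a fold difference is reported as significant.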
Polymorphisms that exert a functional effect, such as SNP rs1076991, are those most likely to be involved in common disease. The link between disruptions to folate metabolism and NTD risk is well established and, more specifically, variation in the MTHFD1 folate enzyme has previously been associated with NTD risk in the Irish population (
Brody et al., 2002;
Parle-McDermott et al., 2006). SNP rs1076991 was therefore investigated as a candidate polymorphism for NTD risk in the Irish population in a large association study using both case/control and family triad-based analysis methods. SNP rs1076991 was not associated with NTD risk or protection in this study, nor did it affect RCF or homocysteine levels analysed in a separate control group. However, SNP-SNP interaction analysis with MTHFD1 SNP rs2236225 (R653Q) revealed a highly significant association with NTD risk in both case (genotype and allele frequencies) and maternal groups (allele frequencies only). These two SNPs are not in LD with each other and, therefore, the identified interaction cannot be attributed to simple co-segregation. It thus seems that SNP rs1076991, while not an independent risk factor for NTDs, in some way contributes to the risk associated with SNP rs2236225, and that homozygosity for these two SNPs confers a greater risk than either one in isolation. SNP-SNP interaction is particularly relevant in the aetiology of common complex diseases, where it is likely that the mechanism of action of one variant may be influenced by the presence or absence of another. Further investigation of SNP-SNP interaction in relation to abruptio placentae and mid-trimester miscarriage risk would be of particular interest, since the rs2236225 AA genotype is a known risk factor (
Parle-McDermott et al., 2005(a);
Parle-McDermott et al., 2005(b)) for both of these conditions, and it is possible that the presence of SNP rs1076991 confers an even greater risk. However, we acknowledge the limitations of our data set, and confirmation of this interaction in a second NTD cohort would be desirable.
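The statement that two SNPs are not in LD can be illustrated with the standard pairwise LD statistics. The sketch below computes D, |D'| and r² for two biallelic loci from the four haplotype counts. It assumes phased (or statistically inferred) haplotypes are available; the function name and the example counts are hypothetical and do not represent the genotype data or the software used in this study.

```python
def ld_statistics(n11, n12, n21, n22):
    """Return (D, |D'|, r^2) for two biallelic loci.

    n11: haplotypes carrying allele 1 at both loci
    n12: allele 1 at locus A, allele 2 at locus B
    n21: allele 2 at locus A, allele 1 at locus B
    n22: allele 2 at both loci
    """
    total = n11 + n12 + n21 + n22
    p1 = (n11 + n12) / total   # frequency of allele 1 at locus A
    q1 = (n11 + n21) / total   # frequency of allele 1 at locus B
    d = n11 / total - p1 * q1  # deviation from independent assortment

    denom = p1 * (1 - p1) * q1 * (1 - q1)
    if denom == 0:
        raise ValueError("LD is undefined when a locus is monomorphic")

    # Largest |D| attainable given the allele frequencies.
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))

    return d, abs(d) / d_max, d * d / denom

# Hypothetical haplotype counts, invented purely for illustration.
independent = ld_statistics(250, 250, 250, 250)  # alleles assort independently
complete = ld_statistics(500, 0, 0, 500)         # alleles always travel together
print("independent r^2:", independent[2])
print("complete r^2:", complete[2])
```

An r² near zero, as in the first hypothetical case, is what "not in LD" means in practice: knowing the allele at one SNP gives no information about the allele at the other, so a joint association cannot be explained by co-segregation alone.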
We are the first to investigate the promoter of MTHFD1 and to identify a novel functional SNP, rs1076991, that may have disease relevance. The results reported here mark a step toward understanding the underlying molecular pathways and disruptions involved in folate-related disease development and progression. This is necessary to achieve the fundamental goal of elucidating the aetiology of these complex diseases and, eventually, optimising individual folate status to prevent or overcome them.