Folate status is an important predictor of colorectal cancer risk. Common genetic variants in genes involved in regulating cellular folate levels might also predict risk, but there are limited data on this issue. We conducted a family-based case-control association study of variants in four genes involved in folate uptake and distribution: FOLR1, FPGS, GGH, and SLC19A1, using 1,750 population-based and 245 clinic-based cases of pathologically-confirmed colorectal cancer and their unaffected relatives participating in the Colon Cancer Family Registries. Standardized questionnaires, administered to all participants, collected information on risk factors and diet. Standard molecular techniques were used to determine microsatellite instability (MSI) status for cases. tagSNPs (n=29) were selected based on coverage as assessed by pairwise r². We found no evidence that tagSNPs in these genes were associated with risk of colorectal cancer. For the SLC19A1-rs1051266 (G80A, Arg27His) missense polymorphism, the A/A genotype was not associated with risk of colorectal cancer using population-based (OR=1.00; 95% CI=0.81–1.23) or clinic-based (OR=0.75; 95% CI=0.44–1.29) families compared to the G/A and G/G genotypes. We found no evidence that the association between any tagSNP and CRC risk was modified by multivitamin use, folic acid use, dietary folate intake, or total folate intake. The odds ratios were similar, irrespective of MSI status, tumor subsite and family history of colorectal cancer. In conclusion, we found no significant evidence that genetic variants in FOLR1, GGH, FPGS and SLC19A1 are associated with the risk of colorectal cancer.
Folate functions as a major carrier of one-carbon groups, needed for methylation reactions and nucleotide synthesis [1,2]. A large body of evidence has shown that low folate intake is associated with increased risk of colorectal adenomas and cancer in populations that are not folate-replete, while more recent evidence suggests that the use of folic acid supplements may not be beneficial in the prevention of colorectal adenomas and prostate cancer [4–11]. These findings raise the important question of whether there are subgroups of people who are differentially susceptible to the effects of folic acid. Genetic factors may offer critical insight into this distinction.
The folate-associated one-carbon metabolic (FOCM) pathway has been extensively studied. Folic acid (pteroylmonoglutamate) and dietary folates (after hydrolysis from polyglutamated to monoglutamated forms by GCPII/FOLH1 (glutamate carboxypeptidase II or prostate-specific membrane antigen)) are taken up in the jejunum of the small intestine. Folylmonoglutamates in the bloodstream, predominantly 5′methyl tetrahydrofolate (5′MTHF), are taken up into cells by FOLR1/FBP (folate receptor-alpha or folate-binding protein) and RFC1/SLC19A1 (reduced folate carrier (protein); solute carrier family 19 (folate transporter) member 1 (gene)) depending on cell type [14,15]. Importantly, RFC1 has a higher affinity for reduced folates (e.g., 5′MTHF) than for folic acid, while FOLR1 has a higher affinity for folic acid than for 5′MTHF. After entering cells, folates are polyglutamated by folylpolyglutamate synthase (FPGS), which facilitates their retention inside the cell, reduces their Km and increases affinity for specific enzymes. Before being released from cells into the bloodstream, folylpolyglutamates must again be hydrolysed to monoglutamates, a reaction facilitated primarily by γ-glutamyl hydrolase (GGH). Genetic variation in these key enzymes may affect intra- and extra-cellular folate levels and thereby modulate risk of colorectal cancer, but has been largely understudied.
In this study, we conducted a comprehensive analysis of the role of common genetic variation in FOLR1, SLC19A1, FPGS, and GGH using a family-based case-control study of CRC conducted by the Colon Cancer Family Registry (Colon CFR). In addition, we evaluated heterogeneity of risk estimates by folic acid supplement use, dietary folate, family history of CRC, tumor subsite and microsatellite instability.
The Colon CFR is an international collaborative consortium initiated in 1997 with the goal of creating a resource for the study of the genetic epidemiology of colorectal cancer. Participants were recruited from six registries based at the University of Hawaii (Honolulu, HI), the Fred Hutchinson Cancer Research Center (FHCRC, Seattle, WA), Mayo Clinic (Rochester, MN), the University of Southern California Consortium (Los Angeles, CA), Cancer Care Ontario (Toronto, Canada), and the University of Melbourne (Victoria, Australia), which recruited families from both Australia and New Zealand. Cases were ascertained through population-based registries and cancer family clinics as described in detail elsewhere. Some registries recruited all incident cases of CRC while others over-sampled cases with a family history of CRC and/or those diagnosed at younger ages.
We used a case-unaffected sibling control design and data from both population-based and clinic-based families. Cases were affected probands and relatives who had been diagnosed with pathologically-confirmed CRC. All cases were interviewed within 5 years of diagnosis (75% of cases were interviewed within 2 years of diagnosis). Controls were full biologic siblings of cases who had not been diagnosed with CRC. We excluded monozygous twin pairs. Matching cases to sibling controls accounts for any potential confounding by unknown admixture across families.
In addition, we genotyped a random set of unrelated population-based controls (n=447) from one of the Colon CFR sites (FHCRC) in order to estimate minor allele frequencies.
We obtained informed consent from all participants. The study was approved by the Institutional Review Board at each of the registries.
A core questionnaire, administered to all participants at the time of recruitment, collected information on personal and family medical histories of polyps, colorectal and other cancers, and other risk factors, including: medication use, reproductive history, physical activity, demographics, alcohol intake, tobacco use, and dietary patterns (including multivitamin and folic acid use). Weekly alcohol intake was calculated as the sum of intakes from beer, wine and liquor. In addition, a detailed food frequency questionnaire (FFQ) was administered to all participants at baseline for three of the Colon CFR sites: USC consortium, Ontario and Hawaii (585 cases and 837 controls completed the questionnaire). The FFQ included questions on both dietary and supplemental intake of folate and other B-vitamins. Folate intake from food and supplements was evaluated per 1000 kcal/day. The FFQ specifically asked about typical food intake 2 years prior to diagnosis for cases or 2 years prior to participation for controls. Because all of the participants who completed an FFQ did so after 1998, dietary folate was calculated using a food composition table that accounted for fortification guidelines of 140 μg of folic acid per 100 g of fortified cereal products. Results did not differ materially when we calculated dietary folate using a food composition table that did not account for food fortification. Blood samples were collected from all participants; tumor blocks and pathology reports were obtained for the majority of cases.
Microsatellite instability (MSI) was evaluated using a panel of 10 markers (BAT25, BAT26, BAT40, MYCL, D5S346, D17S250, ACTC, D18S55, D10S197, and BAT34C4) using standard techniques. Results were required for at least four of the 10 markers to determine MSI status; findings did not vary substantively with the number of typed markers. Tumors were deemed MSI-high (MSI-H) if instability was observed at 30% or more of the typed markers (with at least 4 markers typed), MSI-low (MSI-L) if >0% and <30% of markers were unstable, and microsatellite stable (MSS) if all markers were stable.
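The classification rule above can be written as a short function. This is a minimal sketch, not the registry's actual software: it assumes the MSI-H threshold means 30% or more of the typed markers, and the function name, argument names, and the choice to return `None` for tumors with too few typed markers are illustrative assumptions.

```python
def classify_msi(n_unstable, n_typed, min_typed=4):
    """Classify a tumor as MSI-H, MSI-L, or MSS from marker counts.

    n_unstable: number of typed markers showing instability
    n_typed:    number of markers with a usable result (out of the 10-marker panel)
    Returns None when fewer than `min_typed` markers were typed (status undetermined).
    """
    if not 0 <= n_unstable <= n_typed:
        raise ValueError("n_unstable must be between 0 and n_typed")
    if n_typed < min_typed:
        return None
    # Integer arithmetic avoids floating-point edge cases at exactly 30%.
    if n_unstable * 100 >= 30 * n_typed:
        return "MSI-H"
    if n_unstable > 0:
        return "MSI-L"
    return "MSS"
```

For example, a tumor with 3 unstable markers out of 10 typed falls exactly on the 30% boundary and is classified as MSI-H under this assumption.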
Tumors were classified by location in the colon using International Classification of Diseases for Oncology, third edition (ICD-O-3) codes. Tumors located in the cecum, ascending colon, hepatic flexure, transverse colon, and splenic flexure (ICD-O-3 codes C180, C182, C183, C184, and C185) were classified as proximal colon. Tumors located in the descending colon and sigmoid colon (ICD-O-3 codes C186 and C187) were classified as distal colon. Rectal tumors included those of the rectosigmoid junction and rectum (ICD-O-3 codes C199 and C209).
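The subsite grouping amounts to a lookup on the ICD-O-3 topography code. The sketch below uses only the codes listed in the text; the function name, the tolerance for codes written with a decimal point (for example C18.0), and the `None` return for unlisted codes are assumptions made for illustration.

```python
# ICD-O-3 topography codes grouped as described in the text.
PROXIMAL_COLON = {"C180", "C182", "C183", "C184", "C185"}
DISTAL_COLON = {"C186", "C187"}
RECTUM = {"C199", "C209"}


def classify_subsite(icd_o_3_code):
    """Map an ICD-O-3 topography code to a tumor subsite, or None if not listed."""
    # Normalize "c18.0" or " C180 " to "C180".
    code = icd_o_3_code.strip().upper().replace(".", "")
    if code in PROXIMAL_COLON:
        return "proximal colon"
    if code in DISTAL_COLON:
        return "distal colon"
    if code in RECTUM:
        return "rectum"
    return None
```

Codes that the text does not assign to a subsite are returned as `None` rather than guessed.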
In this analysis, we report results regarding variants in four genes: FOLR1, GGH, FPGS and SLC19A1, although the entire project involved many additional genes. tagSNPs were selected using Haploview Tagger for the CEU population using the following criteria: minor allele frequency (MAF) >5%, pairwise r²>0.95, and distance from closest SNP greater than 60 base pairs on the Illumina GoldenGate 1536 SNP array. The linkage disequilibrium (LD) blocks were determined using HapMap data release #16c.1, June 2005, on NCBI B34 assembly, dbSNP b124. For each gene, we extended the covered 5′- and 3′-UTR regions to include the 5′- and 3′-most SNP within the LD block (approximately 10kb upstream and 5kb downstream). Where an LD block did not extend beyond either the first or last exon, the gene boundaries were defined as 5kb upstream and 10kb downstream of the gene. If the first or last exon or both were included in an LD block that extended up- or down-stream of the exon, respectively, then the boundaries of the gene were extended based on the LD block structure and included at least 5kb upstream and 10kb downstream. In regions of no or low LD, SNPs with MAF>5% at a density of approximately 1 per kb were selected from HapMap or dbSNP. Non-synonymous SNPs and expert-curated SNPs, regardless of MAF, were included. SNPs were excluded from our statistical analysis based on the following criteria: GenTrain Score <0.4; GenCall (GC) Score <0.25; Heterozygote (AB) T Deviation >0.1239; Call Frequency <0.95; Replicate Errors >2; Parent-Parent-Child Errors; Mendelian Errors >2; or discordance with HapMap >3. These are quality metrics that indicate the reliability of the genotypes called.
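The SNP exclusion criteria can be read as a set of threshold checks applied to each SNP's quality metrics. The following is an illustrative sketch only: the dictionary keys are hypothetical names rather than fields of any Illumina export, and because the text gives no threshold for Parent-Parent-Child errors, the sketch assumes that any such error (count above zero) excludes the SNP.

```python
def qc_failures(metrics):
    """Return the exclusion criteria a SNP fails; an empty list means it passes.

    `metrics` is a dict with hypothetical keys chosen for this sketch.
    """
    checks = [
        ("GenTrain score < 0.4", metrics["gentrain"] < 0.4),
        ("GenCall score < 0.25", metrics["gencall"] < 0.25),
        ("AB T deviation > 0.1239", metrics["ab_t_dev"] > 0.1239),
        ("call frequency < 0.95", metrics["call_freq"] < 0.95),
        ("replicate errors > 2", metrics["replicate_errors"] > 2),
        # Assumption: the source states no cutoff, so any P-P-C error fails.
        ("parent-parent-child errors", metrics["ppc_errors"] > 0),
        ("Mendelian errors > 2", metrics["mendelian_errors"] > 2),
        ("HapMap discordance > 3", metrics["hapmap_discordance"] > 3),
    ]
    return [label for label, failed in checks if failed]
```

Returning the list of failed criteria, rather than a single pass/fail flag, makes it easy to tabulate why SNPs were dropped.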
We performed additional genotyping using Sequenom’s iPLEX Gold for tagSNPs that were not successfully genotyped on the Illumina platform and for additional SNPs selected to ensure adequate coverage based on updated HapMap data (v.21). These additional SNPs were selected using Haploview Snagger (r²>0.95, MAF>0.05). Polymerase chain reaction (PCR) and extension primers for these SNPs were designed using the MassARRAY Assay Design 3.0 software (Sequenom, Inc) and are available upon request. PCR amplification and single base extension reactions were performed according to the manufacturer’s instructions. Extension product sizes were determined by mass spectrometry using Sequenom’s Compact MALDI-TOF mass spectrometer. The resulting mass spectra were converted to genotype data using SpectroTYPER-RT software.
Genotype data from 30 CEPH trios (Coriell Cell Repository, Camden, NJ) were used to confirm reliability and reproducibility of the genotyping. Intraplate and interplate replicates at a rate of 5% were included on all plates. As a quality control measure, the frequency of discordant genotypes was estimated: 1 of 398 (0.25%) blinded replicate pairs was discordant; these 2 samples were excluded.
We genotyped 807 SNPs in 33 genes on the Illumina platform; 58 SNPs failed the criteria above and 4 were monomorphic. We selected an additional 43 SNPs in these genes to be genotyped using Sequenom. Of these, 2 SNPs failed the genotyping criteria and one SNP was monomorphic. There were 29 tagSNPs in the four genes included in this analysis. Two SNPs in FOLR1 (rs762622 and rs7938669) were excluded because they were monomorphic and one SNP (rs1893008) was excluded because of a low call rate. In FPGS, we excluded one SNP (rs10987746) because of a low call rate. In GGH, we excluded a total of two SNPs: rs15073 because it was monomorphic and rs1800909 because of a low call rate. Two additional SNPs: FOLR1-rs649060 (MAF=0.000176) and FPGS-rs10760502 (MAF=0.00017) were excluded from the analysis because of small cell counts.
Minor allele frequencies were estimated using genotype data collected on the unrelated population-based controls (n=447). We estimated pairwise LD between SNPs within a gene by the square of the correlation coefficient between markers (r²) using the genetics package in R.
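As a rough illustration of the pairwise LD measure, r² between two biallelic SNPs can be approximated by the squared Pearson correlation of genotype dosages (0, 1, or 2 copies of the minor allele). This is a simplified sketch under that assumption; it is not the estimator used in the study, which relied on the genetics package in R and may differ in how it handles phase and missing data. The function name and the use of `None` for missing genotypes are illustrative choices.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two SNPs coded as allele dosages.

    Individuals with a missing genotype (None) at either SNP are dropped.
    """
    pairs = [(a, b) for a, b in zip(dosages_a, dosages_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two individuals typed at both SNPs")
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)
```

Two SNPs whose dosages always move together (or always in opposite directions) give an r² of 1, which is the situation the tagSNP selection is designed to exploit.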
Multivariable conditional logistic regression with sibship as the matching factor was used to assess the associations between variants and risk of CRC. Each sibship had at least one case and at least one control. Because we were not certain which, if any, of these tagSNPs were the causal variants, we used a robust variance estimator to prevent biased estimates as a result of testing associations in the presence of linkage. Population- and clinic-based data were analyzed separately. We tested the associations between tagSNPs and the risk of CRC using a log-additive model, except where evidence from the literature suggested that an alternative mode of inheritance was more appropriate (i.e., for SLC19A1-rs1051266 we used a recessive model, since homozygous carriers of the variant allele have been shown to have significantly lower red cell folate levels than those who were heterozygous or homozygous for the wild-type allele). Multivariable models were adjusted for age and sex. Race and center were accounted for in the matched analysis. We performed additional analyses adjusting for alcohol consumption, folic acid use, and multivitamin use; inclusion of these and other variables did not appreciably modify the risk estimates (i.e., by more than 10%), so the more parsimonious models are presented. We present p-values obtained from a likelihood ratio test on each corresponding regression coefficient, as well as adjusted p-values using the approach presented by Conneely and Boehnke. We corrected for correlated tests within a gene and determined system-level significance according to previously reported methods. Briefly, we adjusted for the multiple correlated tests from the SNPs within each gene region by modeling the test statistics as asymptotically multivariate normal with a covariance structure estimated from the observed SNP correlation.
Significance across all genes tested was determined using a Bonferroni correction for four gene regions (α = 0.05/4 = 0.0125). This approach provides evidence of association for each individual SNP, preserves the nominal α-level within each gene via reported adjusted p-values, and allows for determination of noteworthiness across all SNPs tested using a Bonferroni-adjusted level of significance.
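To make the sibship-matched analysis concrete, the sketch below fits a conditional logistic model for a single SNP under a log-additive coding (0, 1, or 2 copies of the variant allele). It is a deliberately simplified illustration, not the study's analysis: it assumes exactly one case per sibship, omits the age and sex adjustment, reports a model-based standard error rather than the robust variance estimator, and forms a Wald confidence interval rather than using a likelihood ratio test. All function and variable names are hypothetical.

```python
import math


def fit_conditional_logit(sibships, max_iter=50, tol=1e-10):
    """Newton-Raphson fit of a one-covariate conditional logistic model.

    sibships: list of (case_x, [control_x, ...]), one case per sibship,
              where x is the genotype dosage (0, 1, or 2).
    Returns (beta, se): the log odds ratio per allele and its model-based SE.
    """
    beta = 0.0
    info = 0.0
    for _ in range(max_iter):
        grad = 0.0
        info = 0.0
        for case_x, control_xs in sibships:
            xs = [case_x] + list(control_xs)
            weights = [math.exp(beta * x) for x in xs]
            total = sum(weights)
            mean_x = sum(w * x for w, x in zip(weights, xs)) / total
            mean_x2 = sum(w * x * x for w, x in zip(weights, xs)) / total
            grad += case_x - mean_x          # score contribution of this sibship
            info += mean_x2 - mean_x ** 2    # information contribution
        if info <= 0.0:
            raise ValueError("no sibship is discordant for genotype")
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    # Information is evaluated at the final iterate before the last (tiny) step.
    return beta, 1.0 / math.sqrt(info)


def odds_ratio_ci(beta, se, z=1.959964):
    """Odds ratio with a Wald 95% confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

With one case and one sibling control per sibship and a binary exposure, the estimate reduces to the ratio of discordant pairs: 30 pairs in which only the case carries the allele and 15 in which only the control does give an odds ratio of 2, and sibships concordant for genotype contribute nothing to the estimate.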
We estimated stratum-specific ORs among population-based families to evaluate heterogeneity by: MSI (MSS and MSI-L vs. MSI-H); tumor subsite (right colon vs. left colon vs. rectum); family history of colorectal cancer in a first-degree relative (at least one relative vs. none); multivitamin use (yes vs. no); and dietary and total intake of folate (dietary folate equivalents, dichotomized at the median). We included interaction terms in the regression models and used a likelihood ratio test to assess evidence of heterogeneity. Furthermore, because food fortification guidelines vary by country (Australia and New Zealand did not fortify grain products with folic acid at the time of recruitment), we assessed potential heterogeneity in the estimates of risk by study center. Lastly, we considered whether inclusion of cases recruited more than 2 years after diagnosis resulted in different estimates by comparing OR estimates for each SNP between cases recruited within 2 years of diagnosis and those recruited later. No substantial differences in OR estimates were observed. All statistical analyses were conducted in R (version 2.6.2).
We studied a total of 1,750 population-based and 245 clinic-based discordant sibships. The vast majority of sibships had one case and at least one unaffected sibling control (N=1,919, 96.2%), whereas the remaining sibships had two or more cases. Table 1 shows the distribution of selected characteristics for population-based and clinic-based families. Table 2 lists the tagSNPs investigated in each of the four genes. Except for two rare tagSNPs in FOLR1, rs11235464 and rs2071010, and two in GGH, rs11545078 and rs17194931, all SNPs had a MAF of at least 10%. There was strong linkage disequilibrium (r²>0.80) between selected SNPs in FPGS (rs1544105 and rs4451422; rs4451422 and rs1330684), GGH (rs11545078 and rs17194931; rs13270305 and rs3758149), and SLC19A1 (rs2236484 and rs12482346; rs12482346 and rs7499; rs2297291 and rs1051266).
In log-additive models, we found no evidence that any of the common variants in FOLR1, FPGS, GGH or SLC19A1 were statistically significantly associated with risk of colorectal cancer using either population-based or clinic-based families (Table 2). Using co-dominant models, we observed no statistically significant associations after adjustment for multiple testing (data not shown). For the SLC19A1-rs1051266 missense polymorphism, the A/A genotype showed no association with colorectal cancer compared to the G/A and G/G genotypes using either population-based (OR=1.00; 95% CI=0.81–1.23) or clinic-based (OR=0.75; 95% CI=0.44–1.29) families.
We investigated potential heterogeneity in risk estimates by MSI status (Table 3) and tumor subsite (data not shown). We found no evidence of effect modification. For the SLC19A1-rs1051266 missense polymorphism, we found no association with risk of cancer in the right colon (OR=0.92; 95% CI=0.64–1.32), left colon (OR=0.90; 95% CI=0.60–1.35), or rectum (OR=1.18; 95% CI=0.83–1.67; p-value for interaction=0.71). When analyses were stratified by MSI, the A/A genotype relative to the G/A and G/G genotypes showed a statistically non-significant inverse association with MSS and MSI-L tumors (OR=0.89; 95% CI=0.67–1.18) and a statistically non-significant positive association with MSI-H tumors (OR=1.80; 95% CI=0.76–4.30; p-value for heterogeneity=0.19).
We found no evidence that the association between SNPs and risk of colorectal cancer was modified by family history of colorectal cancer (data not shown), multivitamin use (Table 4), supplementary folic acid use (data not shown), dietary intake of folate (Table 4), or total (food and supplemental) intake of folate (data not shown). After adjustment for multiple testing, no statistically significant SNP-dietary interactions were noted except for FOLR1-rs3016432 and FPGS-rs1330684 by multivitamin use.
In this large family-based case-control study of colorectal cancer, genetic variation in genes involved in cellular folate uptake and distribution (FOLR1, FPGS, GGH and SLC19A1) was not associated with colorectal cancer risk. We observed no evidence that associations, should they exist, were modified by multivitamin use, folic acid use, or dietary/total intake of folate. Furthermore, we found no evidence of heterogeneity in the SNP risk estimates by family history of colorectal cancer, tumor subsite, or MSI status.
The solute carrier family 19 (folate transporter) member 1 (SLC19A1, RFC1) transports folate compounds into cells and plays a role in maintaining intracellular concentrations of folate. Mean expression levels of the RFC1 protein have been shown to be higher in tumor tissues compared with normal colonic mucosa, and higher expression of folate receptors has been associated with resistance to folate antagonist drugs. The SLC19A1-G80A polymorphism has been associated with alterations in folate and homocysteine (Hcy) metabolism in healthy individuals; that study further suggested that the variant SLC19A1-80A allele was associated with higher levels of serum folate. Other studies have reported no association between this polymorphism and plasma folate or homocysteine [32–34]. To date, studies have suggested no association for this polymorphism with cancers of the breast and colon; other studies have suggested a potentially increased risk of bladder (borderline) and esophageal cancers for the A/A versus G/A or G/G genotypes. The homozygous variant genotype has also been reported as not associated with CIMP+ or CIMP− colon cancers.
Folate-binding proteins also transport folates into cells. These exist in three isoforms (FRα, FRβ and FRγ) that are differentially expressed in various tissues. The FRα isoform, known as FOLR1, is the most widely studied and is over-expressed in colon tumors. Expression of FOLR1 has been shown to be an important prognostic marker in some studies [29,40,41]. Ma and colleagues reported that FOLR1 and SLC19A1 gene inactivation in mice increased sensitivity to colon carcinogenesis. FOLR1 may confer a growth advantage to the tumor by modulating folate uptake or generating regulatory signals. This gene is highly polymorphic and a selected set of variants may be associated with homocysteine and folate levels, although further study is needed. To our knowledge, no study has reported on the potential role of common genetic variants in this gene and cancer risk. We found no evidence to support the hypothesis that polymorphisms in this gene are associated with risk of colorectal cancer. FOLR1 has very high affinity for folic acid [17,45], but we found no evidence that the association between any FOLR1 polymorphism and risk differed between individuals taking folate-containing supplements and those who did not.
Folates derived from dietary sources exist mainly as polyglutamated forms. Gamma-glutamyl hydrolase (GGH) removes glutamate residues from folylpolyglutamates, thereby permitting movement into or out of cells. When the expression of GGH increases, more rapid hydrolysis of cellular folylpolyglutamates results in the depletion of intracellular folates. Studies have suggested that selected polymorphisms in GGH (−401C>T, rs3758149, and −124T>G, rs11545076) may increase promoter activity when introduced into both hepatocellular liver carcinoma (HepG2) and breast cancer (MCF-7) cell lines. A recent study suggested that the G allele of the GGH-124T>G polymorphism was associated with a stepwise increase in DNA uracil content, but not plasma total homocysteine levels. We found no association between either polymorphism and risk of colorectal cancer.
FPGS catalyzes an essential polyglutamation step in FOCM, the addition of multiple glutamates to compounds with the basic pteroylglutamate structure such as tetrahydrofolate and many other folate analogues. Polyglutamation of endogenous reduced folates allows for retention and accumulation of these essential cofactors within the cell. Low expression of FPGS in normal-appearing mucosa in the colorectum in individuals with colorectal cancer has been associated with poor survival. In a study that resequenced the FPGS gene in four ethnic populations, five SNPs were shown to alter an amino acid and two of these non-synonymous SNPs, -R424C and -S457F, affected protein expression, in vitro substrate enzyme kinetics, and efficacy of anti-folate therapy. Few studies have been conducted investigating the role of FPGS polymorphisms in cancer risk [49,50]. We found no evidence that any of the selected tagSNPs was associated with colorectal cancer risk. We did not include some known non-synonymous variants in FPGS, and therefore further study may be warranted.
Polymorphisms in genes involved in the provision of methyl groups may be more important for the development of MSI-H colorectal cancers than for those with the MSI-L or MSS phenotype. The majority of sporadic MSI-H colorectal tumors show hypermethylation of the MLH1 gene promoter and CpG island methylator phenotype (CIMP); therefore, because folates play a key role in methylation, genetic variants that influence folate levels may contribute to the risk of MSI-H tumors. This hypothesis has received little attention. One publication by Curtin et al. found little evidence that a selected set of functional variants in folate genes were associated with CIMP+ or CIMP− cancers except for MTHFR-C677T. We found limited evidence that variation in genes involved in the uptake and distribution of folates differentially influences colorectal cancer risk by MSI status.
This study has several strengths and limitations. The case-unaffected sibling design controls for any potential confounding by ethnicity and is more powerful for detecting gene-environment interactions than studies with population-based unrelated controls. However, a limitation of this design is that it may have lower power for detecting main effects. We used a validated semi-quantitative food frequency questionnaire, but such questionnaires are subject to measurement error, which may introduce substantial bias, generally toward the null. Detailed data using a food frequency questionnaire were collected for only a subset of the participants in this study, so we had limited power to detect heterogeneity by dietary and total folate intake. Strengths include the large sample size, comprehensive evaluation of genes, and the availability of systematically collected data on lifestyle and tumor characteristics.
In summary, we found no evidence that 29 common genetic variants in FOLR1, GGH, FPGS and SLC19A1 are associated with risk of colorectal cancer. Nonetheless, given the limited data on these genes in cancer risk, further confirmation by other studies is needed.
This work was supported by the National Cancer Institute, National Institutes of Health under RFA # CA-95-011 and through cooperative agreements with the Australasian Colorectal Cancer Family Registry (U01 CA097735), the USC Familial Colorectal Neoplasia Collaborative Group (U01 CA074799), the Mayo Clinic Cooperative Family Registry for Colon Cancer Studies (U01 CA074800), the Ontario Registry for Studies of Familial Colorectal Cancer (U01 CA074783), the Seattle Colorectal Cancer Family Registry (U01 CA074794), and the University of Hawaii Colorectal Cancer Family Registry (U01 CA074806) as well as NCI T32 CA009142 (JNP), NCI R01 CA112237 (RWH), NCI PO1 CA41108 (MEM), CA23074 (MEM) and CA95060 (MEM). P.T.C. and J.C.F. were supported in part by National Cancer Institute of Canada post-PhD Fellowships (#18735 and #17602).
We thank the following individuals for their support in data collection and management: Margreet Luchtenborg, Maj Earle, Barbara Saltzman, Kathy Kennedy, Darin Taverna, Chris Edlund, Matt Westlake, Paul Mosquin, Darshana Daftary, Michelle Cotterchio, Douglas Snazel, Allyson Templeton, Terry Teitsch, Helen Chen and Maggie Angelakos. We thank all the individuals who participated in the Colon CFR.
Disclosures: Paul Limburg is a consultant for Genomic Health, Inc.