This large-scale multi-center study examined germline DNA markers and their contributions to somatic events, especially susceptibility to DNA methylation in CRC. In three independent samples, three polymorphisms, rs1800734, rs749072, and rs13098279, were associated with MLH1 promoter methylation status, which results in loss of MLH1 protein and microsatellite instability. Although these three markers are not associated with an increase in the risk of CRC overall, they do play a role in colorectal tumorigenesis in the subset of CRCs that display genome-wide microsatellite instability. Among cases in each individual sample population and in an analysis of all three combined, statistically significant associations were observed between each of these three polymorphisms and MLH1 promoter methylation, MLH1 IHC deficiency, and MSI-H tumor status. In multiple logistic regression models, each SNP was associated with tumor MSI-H status; however, once MLH1 IHC deficiency or MLH1 promoter methylation, or both, were included in the model, the SNP association was no longer statistically significant. The observation that the SNP term was not significant in the model with MLH1 IHC and MLH1 promoter methylation indicates that the addition of the SNP does not significantly improve model fit over and above what MLH1 IHC and MLH1 methylation contribute to the model. Hence, MSI status appears to be conditionally independent of the SNP given MLH1 IHC and methylation status; in other words, the effect of the SNP on MSI status is contained in the effects of MLH1 IHC and MLH1 methylation on MSI. These results support the hypothesis that the observed associations between these polymorphisms and MSI-H status occur through MLH1 methylation and subsequent gene silencing. Furthermore, when both IHC and methylation status were included in the model, MLH1 IHC status and MLH1 promoter methylation were both strongly associated with MSI-H status, indicating that these two events, while highly correlated, are not completely dependent on each other even after exclusion of all known germline MMR gene mutation carriers. A similar observation was reported previously, where MLH1 promoter methylation accounted for 80% of MLH1 IHC-deficient MSI-H CRCs after excluding all MLH1 germline mutation carriers. Other mechanisms must, then, be responsible for the remaining 20% of MLH1 IHC-deficient MSI-H CRCs. These may include somatic gene mutations, epimutations, loss of heterozygosity at an MMR gene locus, or possibly silencing of an MMR gene by as yet unidentified microRNAs.
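The conditional-independence argument in this paragraph can be illustrated with a small sketch. It does not reproduce the multiple logistic regression models used in the study; it swaps in a simpler stand-in, stratified 2x2 tables with a Mantel-Haenszel pooled odds ratio, and every count below is hypothetical and chosen only for illustration (none are study data). The function names are likewise illustrative. The point is that a SNP can show a clear marginal association with MSI-H status while showing none once cases are stratified by methylation status.

```python
# Illustrative sketch only: all counts are HYPOTHETICAL, not study data.
# Each table is (a, b, c, d):
#   a = carriers with MSI-H,      b = carriers without MSI-H,
#   c = non-carriers with MSI-H,  d = non-carriers without MSI-H.

def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel odds ratio pooled over several 2x2 strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical strata defined by MLH1 promoter methylation status.
methylated = (90, 10, 45, 5)
unmethylated = (10, 190, 20, 380)
strata = [methylated, unmethylated]

# Collapsing over methylation status gives the marginal SNP-by-MSI table.
marginal = tuple(sum(cells) for cells in zip(*strata))

marginal_or = odds_ratio(*marginal)      # SNP vs MSI-H, ignoring methylation
adjusted_or = mantel_haenszel_or(strata)  # SNP vs MSI-H, given methylation

print(f"marginal OR: {marginal_or:.2f}")
print(f"methylation-adjusted OR: {adjusted_or:.2f}")
```

In these hypothetical tables, carriers are more often methylated than non-carriers, which is what generates the marginal association; within each methylation stratum the SNP carries no further information about MSI status.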
In addition to colon cancer, the MLH1 -93G>A polymorphism (rs1800734) is also associated with other cancers, including ovarian and endometrial cancers and secondary tumors in Hodgkin lymphoma patients. More specifically, the MLH1 -93G>A polymorphism was shown to be associated with MLH1 promoter methylation in endometrial cancers. Hodgkin lymphoma patients who carried the variant -93A allele were at higher risk of developing secondary tumors following methylating chemotherapy. In the colon, this polymorphism has been shown to increase the risk of hyperplastic polyps and adenomas in smokers as well as MSI-H CRCs, alone or in combination with lifestyle factors. Furthermore, the MLH1 -93G>A polymorphism is associated with CIMP-positive CRCs (which include MLH1 promoter methylation) and with the loss of MLH1 gene expression, both of which are consistent with the hypothesis proposed and tested here.
One possible explanation of our previous finding that the MLH1 -93G>A promoter polymorphism was associated with increased risk of MSI-H CRCs is that the association is caused by another functional MLH1 polymorphism in strong linkage disequilibrium (LD) with the MLH1 -93G>A SNP. In this study, we identified two polymorphisms, rs749072 and rs13098279, that are in strong LD with the MLH1 -93G>A SNP. However, neither of these two polymorphisms is located in MLH1: rs749072 is located in intron 26 of LRRFIP2 (leucine-rich repeat in Flightless interaction protein 2), 18 nucleotides from a splice acceptor site (IVS26-18T>C); rs13098279 is an intergenic polymorphism located between LRRFIP2 and GOLGA4 (golgi autoantigen, golgin subfamily a, 4). LRRFIP2 binds Dishevelled and serves as an activator of the Wnt signalling pathway, which is deregulated in ~85% of CRCs. LRRFIP2 splice variants were identified in colon and prostate cancers. The spliced exons contain several potential phosphorylation sites that might influence protein function. The roles of the identified splice variants in tumorigenesis, as well as potential effects of rs749072 on splicing machinery, are still unclear.
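For readers less familiar with how the strength of LD between two SNPs is quantified, the following minimal sketch computes the two standard pairwise measures, D' and r-squared, from haplotype counts at two biallelic loci. The haplotype counts are hypothetical and are not taken from the Ontario, Newfoundland, or Seattle samples; the function name is illustrative, and the sketch assumes phased haplotype counts are already available (in practice they are usually estimated from genotypes).

```python
# Illustrative sketch only: haplotype counts are HYPOTHETICAL, not study data.

def ld_measures(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two biallelic loci from haplotype counts.

    n_AB is the number of haplotypes carrying allele A at locus 1 and
    allele B at locus 2, and so on for the other three combinations.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total   # frequency of allele B at locus 2
    p_AB = n_AB / total           # frequency of the A-B haplotype

    if p_A in (0.0, 1.0) or p_B in (0.0, 1.0):
        raise ValueError("LD is undefined when a locus is monomorphic")

    # D: deviation of the haplotype frequency from independence.
    d = p_AB - p_A * p_B

    # D' scales |D| by its maximum possible value given allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max

    # r^2: squared correlation between alleles at the two loci.
    r_squared = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, d_prime, r_squared

# Hypothetical counts for 200 haplotypes.
d, d_prime, r_squared = ld_measures(70, 10, 5, 115)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r_squared:.3f}")
```

D' close to 1 indicates little historical recombination between the two sites, while r-squared close to 1 indicates that one SNP is a near-perfect proxy for the other, which is the property that makes it hard to tell which variant in an LD block is functional.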
We identified two additional polymorphisms, rs931913 and rs4624519, associated with an overall increased risk of CRC in the Ontario sample. We did not attempt to replicate the findings for rs931913 and rs4624519 in Newfoundland or Seattle.
Our study has several limitations, including the unavailability of some clinical data from our study subjects. Clinical and pathologic characteristics were unavailable for several reasons (e.g., tumor material not available for MSI, IHC, or methylation testing; technical difficulties; or death of the patient before tissue samples could be obtained). However, because the general clinical and pathologic characteristics of CRC in our whole population were similar to those of cases with no missing data, the missing data are unlikely to have introduced substantial bias. One exception was the methylation analysis of Seattle samples, which was mostly completed on MSI-H cases. However, the results obtained from the Seattle samples are very similar to those from Ontario and Newfoundland.
Our study also has numerous strengths. The large sample size gave us high power and precision. In order to observe statistically significant associations of the same order of magnitude that we report here in a genome-wide association study design, we would require between 23,000 and 61,000 cases and controls. A major strength of our study is the use of three independent population-based registries, Ontario, Newfoundland, and Seattle. Replication of our main findings in two additional independent samples provides strong evidence that our findings reflect real associations and are unlikely to have occurred by chance.
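To give a sense of why a genome-wide design demands far larger samples than a candidate-SNP analysis, the sketch below applies the standard normal-approximation sample-size formula for comparing two proportions. It is not the calculation used to obtain the 23,000 to 61,000 figure above, and it makes no attempt to reproduce it: the carrier frequency, odds ratio, power, and significance thresholds are all assumed values chosen for illustration, and the function names are hypothetical.

```python
# Illustrative sketch only: all input values are ASSUMPTIONS, not study parameters.
from math import ceil, sqrt
from statistics import NormalDist

def case_frequency(p_control, odds_ratio):
    """Carrier frequency in cases implied by the control frequency and an OR."""
    odds = odds_ratio * p_control / (1 - p_control)
    return odds / (1 + odds)

def n_per_group(p_control, odds_ratio, alpha, power):
    """Cases (and, equally, controls) needed for a two-sided two-proportion test."""
    p_case = case_frequency(p_control, odds_ratio)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_case) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_case * (1 - p_case))
    ) ** 2
    return ceil(numerator / (p_case - p_control) ** 2)

# Assumed inputs: 35% carrier frequency in controls, OR of 1.2, 80% power.
n_candidate = n_per_group(0.35, 1.2, alpha=0.05, power=0.80)
n_genome_wide = n_per_group(0.35, 1.2, alpha=5e-8, power=0.80)

print("per-group n at alpha = 0.05:", n_candidate)
print("per-group n at alpha = 5e-8:", n_genome_wide)
```

The only difference between the two calls is the significance threshold, so the gap between the two results isolates the cost of correcting for the number of tests performed in a genome-wide scan.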
The important finding of this study is the identification of a genetic basis for DNA methylation susceptibility; it indicates that genetic variants may play an indirect role in increasing the risk of MSI-H colorectal cancer. Perhaps they alter the binding sites of transcription factors and DNA-binding proteins that protect the DNA molecule from methylation. Inability of these protective proteins to bind DNA would expose DNA to methylating machinery. Conversely, these polymorphisms may create binding sites for co-repressors, methylated DNA-binding proteins, or other proteins involved in epigenetic silencing that modify DNA and silence gene expression. Another possible mechanism involves the production of antisense RNA; it was shown recently that increased production of antisense RNA resulted in epigenetic silencing of the p15 tumor suppressor gene. The polymorphisms in this study may increase the production of antisense RNAs that result in epigenetic silencing of the corresponding sense-strand genes.
The fact that polymorphisms in genes other than MLH1 are associated with DNA methylation may indicate that the MLH1 promoter methylation observed in MSI-H colorectal cancers is not localized just to the MLH1 locus, but extends beyond the gene. Indeed, Hitchins et al. observed that, in MSI-H colorectal cancers, methylation is not limited to the MLH1 promoter region, but affects genes in a region as large as 2.4 megabase pairs. We may have identified, in a much smaller region, genetic markers of the predisposition to such epigenetic alterations and, because a mismatch repair gene, MLH1, is involved, microsatellite instability invariably develops. However, we cannot yet exclude the possibility that these markers tag some other unknown variant(s) that are the true cause of DNA susceptibility to methylation.
The major agent used for the medical treatment of patients with advanced CRC, 5-fluorouracil (5-FU), is recognized by the MMR system. 5-FU selectively kills cells with intact MMR, while MMR-deficient cells are resistant. Patients with stage II and III sporadic MSI CRC do not show a survival benefit following 5-FU therapy when compared with MSS CRC patients in retrospective and prospective studies. Indeed, 5-FU-based adjuvant chemotherapy might decrease overall and disease-free survival among MSI CRC patients. Similarly, stage III Lynch Syndrome patients do not show a 5-year survival benefit with 5-FU treatment over untreated patients. CRC is a heterogeneous disease and the three polymorphisms used in this study may serve as predictive markers in at-risk individuals for early identification of MSI and selection of optimal therapies.
In summary, we built on our previous finding, an association of the MLH1 -93G>A polymorphism with MSI-H colorectal cancers. We identified a novel mechanism by which common germline polymorphisms may contribute to complex disease. The three polymorphisms reported in this study serve as germline markers/predisposition alleles for a somatic event that results in gene silencing and, consequently, a specific subtype of colorectal cancer. Additional characterization of the genes and polymorphisms noted here may lead to new insights and new mechanisms by which alleles contribute to cancer incidence and progression.
List of all SNPs genotyped in Ontario, Newfoundland and Seattle samples.
(0.06 MB XLS)
Analyses of all SNPs with CRC, tumor MSI status, MLH1 IHC.
(0.15 MB XLS)
Information on all statistical models used.
(0.15 MB XLS)
Contains supplementary Table S1: Sequences of primers and probes. F - forward primer; R - reverse primer; FAM - wild type allele probe; VIC - variant allele probe; MGBNFQ - minor groove binder non-fluorescent quencher; FM - methylated forward primer; RM - methylated reverse primer; FU - unmethylated forward primer; RU - unmethylated reverse primer; BHQ-1 - black hole quencher-1. *Published previously (23).
(0.03 MB DOC)
D-Prime map of all SNPs genotyped in Ontario samples.
(0.11 MB JPG)
R-squared map of all SNPs genotyped in Ontario samples.
(0.14 MB JPG)