Conceived and designed the experiments: MV MIP. Performed the experiments: MV NV LMM. Analyzed the data: YW. Contributed reagents/materials/analysis tools: NIP. Wrote the paper: MV YW MIP.
The DNA mismatch repair (MMR) enzymes repair errors in DNA that occur during normal DNA metabolism or are induced by certain cancer-contributing exposures. We assessed the association between 10 single-nucleotide polymorphisms (SNPs) in 5 MMR genes and oesophageal cancer risk in South Africans. Prior to genotyping, SNPs were selected from the HapMap database, based on their significantly different genotypic distributions between European ancestry populations and four HapMap populations of African origin. In the Mixed Ancestry group, the MSH3 rs26279 G/G versus A/A or A/G genotype was positively associated with cancer (OR=2.71; 95% CI: 1.34–5.50). Similar associations were observed for PMS1 rs5742938 (GG versus AA or AG: OR=1.73; 95% CI: 1.07–2.79) and MLH3 rs28756991 (AA or GA versus GG: OR=2.07; 95% CI: 1.04–4.12). In Black individuals, however, no association between MMR polymorphisms and cancer risk was observed in individual SNP analysis. The interactions between MMR genes were evaluated using the model-based multifactor-dimensionality reduction approach, which showed a significant genetic interaction between SNPs in MSH2, MSH3 and PMS1 genes in both Black and Mixed Ancestry subjects. The data also imply that the pathogenicity of common polymorphisms in MMR genes is influenced by exposure to tobacco smoke. In conclusion, our findings suggest that common polymorphisms in MMR genes and/or their combined effects might be involved in the aetiology of oesophageal cancer.
According to the GLOBOCAN 2008 database (http://globocan.iarc.fr), oesophageal cancer is the eighth most common cancer worldwide and the sixth most common cause of cancer death in the world, with more than 95 percent of the cases and deaths occurring in developing countries. The highest incidence rates were observed in the Black population in Southern Africa and Eastern Asia, with 16.3 and 14.2 cases per 100,000 population, respectively, in contrast to Central America, Western and Central Africa, where 1.4, 1.2 and 1.1 cases per 100,000 were reported, respectively. The latest report from the South African National Cancer Registry confirms these high incidence rates among Black Ancestry males and females, with 8.0 and 4.5 cases per 100,000, respectively, as well as among Mixed Ancestry males and females, with 10.4 and 4.4 cases per 100,000 population, respectively. The main histological types of oesophageal cancer, squamous cell carcinoma (OSCC) and adenocarcinoma (OAC), are observed in more than 95% of all oesophageal cancer cases, with OSCC being the predominant type in Africa and China.
Numerous alterations in certain key genes are linked with altered risks for developing oesophageal cancer. These genes are mainly involved in DNA maintenance and repair, alcohol, folate and carcinogen metabolism, cell cycle regulation and apoptosis. However, only a few putative genes have been consistently shown to correlate with disease susceptibility, including ALDH2, CYP1A1 and MTHFR (reviewed in Hiyama et al.). This suggests that there are most likely additional background genetic factors and interactions that contribute to oesophageal pathogenesis. Interestingly, of over 100 genetic association studies conducted to date, only the study by Liu et al. has focused on highly polymorphic genes involved in the DNA mismatch repair pathway and their role in oesophageal pathogenesis. Several genome-wide association studies (GWAS) have also been conducted in different populations, and a study in a European population reported associations in the DNA repair gene HEL308. Moreover, MMR genes and their polymorphisms were reported to contribute to the risk of developing lung or head and neck cancer; both types of cancer share a similar aetiology with oesophageal cancer.
Genetic or epigenetic alterations in MMR genes can completely or partially impair MMR efficiency and thus confer an increase in the accumulation of replication errors (RER) in important cancer-regulating genes, eventually leading to carcinogenesis. Loss of mismatch repair activity is manifested in a microsatellite instability (MSI) phenotype. Studies investigating widespread microsatellite alterations in oesophageal cancer have detected low-level MSI (MSI-L; where at least one microsatellite locus is altered) in 16–67% of adenocarcinomas, whereas 2–60% of squamous cell carcinoma tumours were MSI-L positive, with the highest MSI frequencies observed in high-incidence populations, indicating that MMR might be involved in the pathogenesis of oesophageal cancer.
These reports prompted us to investigate common variants in MMR genes and their role in susceptibility to oesophageal squamous cell carcinoma in a high-risk population. We performed a case-control study in two distinct ethnic groups of South Africans, where we examined potential associations between 10 polymorphisms in 5 MMR genes (MLH1, MLH3, PMS1, MSH2 and MSH3) and oesophageal cancer. Moreover, SNP-SNP interactions, as well as SNP-environment interactions were investigated to further examine involvement of MMR system in OC.
Characteristics of the two groups, Black and Mixed Ancestry are provided in Table 1. In Black Africans, cases and controls were similar in terms of age (P=0.109) and family history of cancer (P=0.920). In the Mixed Ancestry group, cases compared with controls were more likely to be males, smoke and drink alcohol (P<0.0001 for all) and there was no significant difference in the age distribution (P=0.472) between cancer cases and cancer-free controls (Table 1). A combination of smoking and drinking habits increased the risk for oesophageal cancer 5.46-fold in Black and 19.06-fold in Mixed Ancestry populations (P<0.0001 for each population; Table 2).
Genotype and minor allele frequencies for SNPs are shown in Table S1. For polymorphism rs26279 (MSH3), the minor G allele occurred with a frequency of 32% in cases vs. 38% in controls in the Mixed Ancestry group (P=0.044). The G allele of polymorphism rs5742938 (PMS1) had a frequency of 48% in cases and 57% in controls (P=0.007) in the Mixed Ancestry group. The minor A allele of rs28756991 (MLH3) polymorphism occurred in 4% Mixed Ancestry controls vs. 9% in Mixed Ancestry cases (P=0.001). No difference was observed among Black cases and controls for analysed polymorphisms. Allelic distributions in Black controls were in good agreement with those in the LWK or YRI HapMap populations. All polymorphisms were found to be in Hardy-Weinberg equilibrium (P>0.05) when examining Black and Mixed Ancestry controls, separately. For ten SNPs under study, more than 99% of samples were successfully genotyped. We used logistic regression analysis to examine potential associations between polymorphisms in MMR genes and oesophageal pathogenesis before and after adjusting for age, gender, place of birth, lifestyle habits and familial history of cancer. Adjusted odds ratios are represented in Table 3. Dominant and recessive models for minor alleles were considered for each SNP.
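The Hardy-Weinberg check mentioned above can be illustrated with a minimal sketch. It is not the authors' procedure: the study used Fisher's exact test, whereas this example swaps in the simpler chi-square goodness-of-fit test with one degree of freedom. The function name and the genotype counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes genotype counts for a biallelic SNP and returns (chi2, p_value)
    with 1 degree of freedom. Assumes a non-empty sample.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical control genotype counts (not taken from the study)
chi2, p_value = hwe_chi_square(n_aa=360, n_ab=480, n_bb=160)
```

Under this sketch, a SNP would be considered consistent with Hardy-Weinberg equilibrium when `p_value` exceeds 0.05, matching the threshold quoted in the text.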
In the Mixed Ancestry group, the MSH3 rs26279 G/G versus A/A or A/G genotype was positively associated with cancer (adjusted OR=2.71; P=5.71×10⁻³). Similar associations were observed for PMS1 rs5742938 (GG versus AA or AG: adjusted OR=1.73; P=0.027) and MLH3 rs28756991 (AA or GA versus GG: adjusted OR=2.07; P=0.038). All three associations remained significant after correcting the P-values for multiple testing using the Benjamini-Hochberg method (rs26279: Pcorrected=0.027; rs5742938: Pcorrected=0.036; and rs28756991: Pcorrected=0.027; see Materials and Methods for details). In Black South Africans, we observed a marginal association (P=0.086) for the MSH3 rs1428030 polymorphism (GG or AG versus AA), with a 1.36-fold increase in cancer risk after adjusting for other confounders. There was also evidence implying a reduced cancer risk, with marginal significance, for the MLH3 rs28756991 ‘A’ allele under the recessive genetic model (adjusted OR: 0.14; P=0.078). However, after correcting for multiple tests, significance for both associations was lost. In addition, single-SNP associations were investigated among squamous cell carcinoma cases only, excluding adenocarcinomas; however, this analysis did not provide additional or more significant results (Table S2).
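The Benjamini-Hochberg correction referred to above can be sketched as follows. This is a generic illustration of the step-up procedure, not a reproduction of the authors' calculation: the function name and the input P-values are hypothetical, and the set of tests that formed the correction family in the study is not listed in full here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values
    # never increase as the raw P-values decrease
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P-values (not taken from the study)
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
```

An association would then be called significant when its adjusted value stays below the chosen false discovery rate, for example 0.05.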
Haplotype analysis was performed to further evaluate the role of MMR genes in cancer aetiology. As shown in Table 4, three SNPs in MSH3 (rs1805355, rs1428030 and rs26279) and two SNPs in PMS1 (rs5742938 and rs13404927) were used to generate haplotypes. The frequency of the A(rs1805355)-G(rs1428030)-G(rs26279) haplotype of MSH3 was found to be significantly higher in Black controls (6.8%) than in Black cancer cases (3.6%). However, the observed inverse association was only marginally significant after correcting for multiple tests (P1000=0.049). In the Mixed Ancestry group, the G(rs5742938)-G(rs13404927) haplotype of PMS1 was associated with a 1.61-fold (95% CI: 1.22–2.13) increase in OC risk, compared to the reference A(rs5742938)-G(rs13404927) haplotype, and remained significant after correcting for multiple tests (P1000=0.011). The observed PMS1 haplotype effect is entirely due to the association of the PMS1 rs5742938 G allele observed in the single-SNP analysis, and no increase in significance is achieved by inclusion of the variant rs13404927. There was no association between the three-marker haplotype of MSH2 (rs17217772, rs3771280 and rs10188090) and OC risk in either of the two ethnic groups (data not shown).
Possible cumulative effects of the SNPs were evaluated with the MB-MDR approach (see Materials and Methods), as it is well known that MMR enzymes function as heterodimers. Two-, three- and four-order interaction models were considered and the results are shown in Table 5. The data revealed the best genetic interaction for SNPs in the MSH2 gene (rs3771280), the MSH3 gene (rs1428030) and the PMS1 gene (rs13404927 and rs5742938), which was strongly associated with increased risk of oesophageal cancer in Black subjects. The frequency of the four-locus genotype CCrs3771280/AGrs1428030/GGrs13404927/GGrs5742938 was significantly higher in cases (18.6%) compared to controls (9.3%). In the Mixed Ancestry group, three significant multigene interactions were predicted. A three-order interaction MSH2 (rs3771280) * PMS1 (rs13404927) * MSH3 (rs26279) and a four-order interaction, which included the rs13320360 polymorphism in the MLH1 gene, the rs10188090 polymorphism in the MSH2 gene and the rs13404927 and rs5742938 polymorphisms in the PMS1 gene, were the most significant and were hence regarded as the best models. The multi-locus genotype CTrs3771280/GGrs13404927/AGrs26279 was strongly associated with reduced risk for cancer (P=0.0028), whereas the genotype TT/AA/GG/GG from the interaction MLH1 (rs13320360) * MSH2 (rs10188090) * PMS1 (rs13404927) * PMS1 (rs5742938) was more than 2-fold more frequent in cancer patients than in healthy individuals (Table 5). All three aforementioned interactions remained significant after the 1000 random permutations test.
To further investigate the role of MMR polymorphisms in relation to environmental factors, individuals were stratified by tobacco smoking habits. The three polymorphisms that showed association with oesophageal cancer risk in the single-SNP analysis (see Table 3) were investigated in the stratified analysis based on smoking. In the Mixed Ancestry group, polymorphisms MSH3 rs26279 and MLH3 rs28756991 remained associated with the disease in smokers (Prs26279=0.004 and Prs28756991=0.011), in contrast to non-smokers, where no significant associations were observed (Table 6). In addition, the three most significant gene-gene interactions were investigated after stratifying both populations for tobacco smoke exposure. The four-locus genotype CCrs3771280/AGrs1428030/GGrs13404927/GGrs5742938, identified in Black subjects, was associated with OC in tobacco smokers (P=0.007), whereas the significance of the association of the TTrs13320360/AArs10188090/GGrs13404927/GGrs5742938 genotype with OC in Mixed Ancestry subjects was lost (P=0.054). The interaction MSH2 (rs3771280) * PMS1 (rs13404927) * MSH3 (rs26279) was only significant in smokers (P=0.004) in the Mixed Ancestry group (Table 6).
To assess functional nature of OC-associated SNPs, that were identified in this study, MSH3 and PMS1 mRNA levels were examined in normal oesophageal biopsies from 47 OSCC patients in correlation with rs26279 and rs5742938 genotypes, respectively. No significant effects of the rs26279 and rs5742938 genotypes on MSH3 and PMS1 expression levels, respectively, were observed (Prs26279=0.340 and Prs5742938=0.954) (Fig. 1). In addition, functional and structural effects of amino acid substitutions Ala1045Thr (rs26279) in MSH3 and Arg797His (rs28756991) in MLH3 were predicted using bioinformatic algorithms SIFT, PolyPhen, and Align-GVGD. Bioinformatic tools predict whether amino acid change will have neutral or damaging impact of the protein, based on multiple alignment information and biophysical characteristics of amino acids. Evolutionary sequence conservations were prepared from 26 MSH3 and 19 MLH3 protein sequences from different species and served as an input for all algorithms. In silico algorithm Align-GVGD predicted neutral effect for variant Arg797His (rs28756991), whereas SIFT and PolyPhen predicted it to have damaging impact on the proteins. All three computational approaches were consistent in predicting neutral functional nature of amino acid change Ala1045Thr (Table S3).
Carcinogenesis is a multistep process involving genetic and environmental risk factors. Common polymorphisms in many genes, including those involved in DNA repair, have been shown to predispose individuals to the disease. Genetic alterations in microsatellite regions, a hallmark of a defective DNA mismatch repair system, have been reported in oesophageal cancers. Despite this, common polymorphisms in MMR genes have rarely been studied in relation to OSCC susceptibility.
To estimate OSCC risk conferred by common polymorphisms in MMR genes, we analysed 10 SNPs within 5 genes of the MMR pathway in high incidence populations in South Africa. In light of the common disease/common variant (CD/CV) hypothesis, SNPs were selected on the basis of their different genotypic distributions between African and non-African populations. Recent studies also indicate that interplay between multiple polymorphisms plays a key role in carcinogenesis; we therefore analysed SNP-SNP as well as SNP-environment interactions in association with OC.
In this study, we identified three common polymorphisms that were associated with OSCC in Mixed Ancestry individuals. Firstly, the GG genotype of polymorphism MSH3 rs26279 was positively associated with the disease. Polymorphism MSH3 rs26279 has been examined before, although with partially conflicting results across studies. Conde et al. observed no association with breast cancer risk at the individual-SNP level in Caucasian females. However, gene-gene interaction analysis in the same study showed that the multi-locus genotype AA/TC in the MSH3 rs26279 * MSH6 rs1042821 interaction was associated with a decreased risk for tumorigenesis, suggesting that rs26279 changes the affinity of the MSH3 protein to heterodimerize with MSH6. Furthermore, Liu et al. found no association between rs26279 and oesophageal adenocarcinoma in the Caucasian population, whereas a study by Hirata et al. reports that the GG or AG genotypes of the MSH3 rs26279 polymorphism might be a risk factor for sporadic prostate cancer. In addition, down-regulation of MSH3 was found to induce an MSI-L phenotype in sporadic colorectal cancer (reviewed by Boland and Goel, 2010). It is possible that a similar mechanism is responsible for the frequently observed MSI-L phenotype in OSCC cases. In our effort to assess the functional nature of the identified SNP, we performed expression analysis of MSH3 in biopsy samples from patients and did not detect any correlation between rs26279 genotypes and expression levels of the MSH3 gene. Moreover, in silico algorithms predicted a neutral effect on the protein’s function and structure.
Secondly, we observed that a homozygous genotype for the G allele in the PMS1 rs5742938 polymorphism was associated with cancer in Mixed Ancestry South Africans. This finding was further confirmed by haplotype analysis in Mixed Ancestry subjects, where the haplotype G(rs5742938)-G(rs13404927) of PMS1 increased the risk for cancer. The intronic change c.-21+639G>A (rs5742938) has not been identified in association with any type of cancer before; however, it was predicted by the UTRScan computational algorithm to be functionally non-significant. According to our biopsy expression analysis, genotypes of rs5742938 do not affect PMS1 mRNA expression levels. In the current literature, the role of MLH1-PMS1 complexes in mismatch repair remains enigmatic.
Lastly, having one or two copies of the A allele in the MLH3 rs28756991 polymorphism was associated with increased risk for developing OSCC. This finding was also supported by the in silico SIFT and PolyPhen algorithms, which predicted that the amino acid change Arg797His (rs28756991) has a potentially damaging impact on the structure and function of the MLH3 protein. This is the first study reporting on the MLH3 rs28756991 polymorphism (Arg797His) and its relation to cancer risk; however, other functional polymorphisms in the MLH3 gene have previously been shown to confer cancer susceptibility. Michiels et al. have shown that the SNP MLH3 rs175080 (Leu844Pro) was associated with an increased risk for lung cancer in European Caucasians. A similar finding was reported by Conde et al., where an interaction between MLH3 rs175080 and MSH4 rs5745325 was associated with increased risk for breast cancer in a Portuguese population. Taken together, these reports support our finding that MLH3 could indeed be involved in the development of various types of sporadic cancers, including OSCC. MLH3 is the third protein that binds to MLH1, a key player in the MMR apparatus; hence, inefficient assembly of the MLH1–MLH3 complex could lead to low-penetrance oncogenic events.
In addition, significance for all three associations in the Mixed-ancestry group persisted after correction for multiple tests and the powers to detect the observed effect sizes were 90.47% (MSH3 rs26279), 84.88% (PMS1 rs5742938) and 86.71% (MLH3 rs28756991).
We failed to confirm similar associations in Black South Africans at the single-SNP level. The reason for the lack of significant associations in these individuals probably lies in different linkage disequilibrium (LD) patterns between the two ethnic groups, rather than different aetiologies of OSCC. African populations are most probably the oldest populations in the world, since Africa is believed to be the continent of origin for modern humans. In older populations, the sizes of LD blocks are generally smaller, due to more recombination events. Hence, we can speculate that the marker alleles identified in the Mixed Ancestry group (a young population arising from admixture of non-Africans with indigenous African populations in the 17th century) are in LD with the disease-causing alleles of the MSH3, PMS1 and MLH3 genes, whereas in Black Ancestry subjects (i.e. representatives of an old population in Africa) the investigated alleles might not be in LD with the disease alleles. However, association studies in other ethnic groups are needed in support of this notion. Furthermore, populations of African origin present an opportunity to identify the true disease alleles, which are responsible for the disease phenotype, by fine-mapping of the genetic regions that are identified as disease-associated in non-African populations.
To further investigate the involvement of the MMR mechanism in OSCC development, we explored possible gene-gene interactions by MB-MDR, a dimension reduction method proposed by Calle et al. In both groups, interaction analysis yielded several statistically significant interactions. In Black individuals, a potential four-order interaction MSH2 (rs3771280) * MSH3 (rs1428030) * PMS1 (rs13404927) * PMS1 (rs5742938) demonstrated strong association with increased risk for malignancy, whereas the interaction MSH2 (rs3771280) * PMS1 (rs13404927) * MSH3 (rs26279) significantly decreased the risk for cancer in Mixed Ancestry subjects (Table 5). Interestingly, results are consistent between the two groups, since the most significantly disease-associated interactions were found between SNPs situated in the MSH2, MSH3 and PMS1 genes. Based on our data and the knowledge that MMR activity is achieved by protein heterodimers, one could argue that SNPs affecting the functionality of the ternary complex between the heterodimers MSH2–MSH3 and MLH1–PMS1 are important in OSCC development. We also believe that there could be many genetic interactions that affect the assembly and functionality of other MMR heterodimers, and these still need to be identified. The MLH3 (rs28756991) * MSH2 (rs17217772) interaction was found to reduce the risk for OSCC in the Mixed Ancestry population (Table 5). Moreover, similar gene-gene interactions of MMR genes have also been reported by Conde et al. in association with breast cancer susceptibility.
Our data also imply that the pathogenicity of common polymorphisms in MMR genes is influenced by environmental exposures, especially tobacco smoking. Similar findings have been reported before in other cancer types that share a common aetiology with oesophageal cancer. Hirao et al. have shown an association between lung cancer and loss of heterozygosity (LOH) at the MLH1 locus, with higher prevalence of LOH in smoking patients. Several studies also suggest that the MLH1 rs1800734 polymorphism and tobacco smoke exposure have a role in tumorigenesis of lung cancer. In our study, polymorphisms rs26279 (MSH3) and rs28756991 (MLH3) appear to be involved in smoking-related cancer, as they were associated with disease only among smoking individuals, in contrast to non-smoking individuals, where no significant association was observed (Table 6). The powers to detect the observed effect sizes in tobacco-smoking cases of the Mixed Ancestry group remained sufficient for MSH3 rs26279 (97.53%) and MLH3 rs28756991 (85.05%), whereas the borderline association of PMS1 rs5742938 with cancer was underpowered at 64.06%. We are aware that only 16 non-smoking Mixed Ancestry cases were present in the study, which considerably reduced the power to detect the observed effect sizes in non-smokers. Therefore, to further support the results obtained from the Mixed Ancestry group, stratification analysis was performed on the Black Ancestry group, where more smoking and non-smoking individuals were enrolled. The four-order interaction identified by MB-MDR was strongly associated with cancer in smoking Black Ancestry individuals, in contrast to non-smoking individuals of the same ethnic group, where no association was observed (Table 6). This trend suggests that defective MMR proteins, whose activity may be compromised by polymorphisms in MMR genes, might be inefficient in repairing increased amounts of smoking-induced DNA adducts and/or signalling for apoptosis in such DNA error events.
Our results support the data obtained by Dodd et al., who reported that genes involved in the metabolism of nitrosamines and in DNA repair processes, including MMR, are dysregulated in nasopharyngeal carcinoma (NPC). These authors proposed an interplay between exogenous exposure to sources of nitrosamines (such as diet, tobacco smoke and others) and the ability to efficiently metabolize nitrosamines or repair DNA damage induced by reactive byproducts of nitrosamine metabolism in the aetiology of NPC. The nitrosamine 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) is a potent carcinogen contained in cigarette smoke and was shown to induce cellular DNA damage. A study by Hou et al. reported that Bcl2 enhances the frequency of NNK-induced mutations by down-regulating MMR efficiency via disruption of the MSH2-MSH6 complex. Nevertheless, the observed gene-environment interactions associated with OSCC still warrant confirmation in a larger independent study.
In addition, we confirmed an additive (in the Black group) to synergistic (in the Mixed Ancestry group) risk effect of combined tobacco smoking and alcohol consumption on carcinogenesis, as previously reported by several other studies.
In conclusion, this study provides evidence that common polymorphisms in MMR genes are indeed involved in the aetiology of OSCC. Cumulative effects of MMR polymorphisms were further shown to strongly contribute to cancer development in both ethnic groups. In fact, our results imply that combined effects of common polymorphisms in MMR genes might alter susceptibility to OSCC by modulating the effect of exposure to first-hand tobacco smoke.
The study population has been described elsewhere. Briefly, a total of 1239 individuals were recruited from the Black and Mixed Ancestry populations of South Africa. Black individuals (n=689) were Xhosa-speaking South Africans (Xhosa-speakers originated from the Bantu-speakers in Southern Africa), mostly born in regions of the Eastern and Western Cape. Subjects resulting from marriages between different ethnic groups, including Western Europeans, the indigenous Khoisan, Bantu-speaking Africans, Indonesians and Malaysians, were considered to be of Mixed Ancestry and were from the Western Cape (n=471). The study consisted of 550 diagnosed and histologically confirmed oesophageal squamous cell carcinoma or adenocarcinoma cases, who were recruited between 2000 and 2010 from Groote Schuur Hospital, Cape Town, Western Cape, South Africa. Cases were from either the Black or the Mixed Ancestry ethnic group. There was no restriction on recruitment criteria for age and gender of cases. Controls (n=610) were healthy individuals without a previous history of cancer and were recruited from the same population groups and geographical area as the cases. Each participating subject was interviewed to collect information on demographic characteristics (age, gender, ethnicity), tobacco smoking, alcohol consumption and family history of cancer. Subjects with current or former smoking habits were classified as smokers. Alcohol consumers were defined as individuals who consumed more than 40 grams of alcohol per day. Family history of cancer was considered positive for individuals with at least one first-degree relative or two second-degree relatives having cancer. DNA was extracted from frozen blood samples using standard protocols. This study was approved by the University of Cape Town/Groote Schuur Hospital Human Ethics Research Committee. Written informed consent was obtained from all participants recruited into the study.
Prior to genotyping, SNPs were selected from the HapMap database (Phase II+III release #28, August 10) based on their significantly different genotypic distributions between HapMap population of European ancestry (CEU; Utah residents with Northern and Western European ancestry from the CEPH collection) and 4 HapMap populations of African origin (ASW: African ancestry in Southwest USA; LWK: Luhya in Webuye, Kenya; MKK: Maasai in Kinyawa, Kenya; and/or YRI: Yoruba in Ibadan, Nigeria). We analysed 978 SNPs from 7 MMR genes and found 27 candidate polymorphisms in 5 MMR genes with significantly different genotypic distributions between African and non-African HapMap-populations. From those SNPs, ten were selected based on minor allele frequency (MAF >0.05) and their possible functional properties (e.g. nonsynonymous SNPs). In MSH2 three SNPs were selected (rs17217772, Asn127Ser, c.380A>G; rs10188090, c.2635-765G>A; and rs3771280, c.1510+118T>C), three in MSH3 (rs26279, Ala1045Thr, c.3133G>A; rs1428030, c.1341-12568A>G; and rs1805355, Pro231Pro, c.693G>A), two in PMS1 (rs5742938, c. −21+639G>A; and rs13404927, c.699+3331G>A), one in MLH1 (rs13320360, c.546-191T>C), and one in MLH3 (rs28756991, Arg797His, c.2390G>A). No polymorphisms were selected in MSH6 and PMS2 genes, since genotypic distributions of polymorphisms were not significantly different between populations of African and non-African origin.
All SNPs were analysed by an allele-specific quantitative PCR assay using a Roche LightCycler® 480II instrument. Two allele-specific primers, each specific for one of the two variants of the analysed SNP, and a common primer for each SNP were designed with WASP software. To ensure better specificity, allele-specific primers contained an additional mismatch at the penultimate position (second to last at the 3′-end). Genotyping for each sample was performed in two parallel 3 µL PCR reactions, one for each of the two alleles. Reactions contained 200 nM of one allele-specific primer, 200 nM of the common primer, 5 ng of genomic DNA, and 1.5 µL KAPA™ SYBR® FAST qPCR Master Mix (2×) (Kapa Biosystems). Amplification conditions were as follows: initial denaturation for 3 min at 95°C, followed by 45 cycles of 5 sec at 95°C, 25 sec at 55–60°C (depending on the SNP), and 5 sec at 72°C; finally, melting curve analysis was performed. To ascertain the reproducibility of the assay, 10–20% of samples were re-genotyped. Complete concordance between experiments was obtained. Primer sequences are available upon request.
Differences in demographic variables, lifestyle habits and genotypic frequencies between cases and control subjects were evaluated using Pearson’s chi-square (χ²) test. Genotype data in control subjects from each ethnic group were checked for Hardy-Weinberg equilibrium using Fisher’s exact test. All genotypic analyses were performed assuming dominant and recessive models for the variant allele (i.e. the minor allele in the control group) of each SNP. Crude odds ratios (ORs) and odds ratios adjusted for potential confounders (AORs), 95% confidence intervals (CIs) and P-values were obtained from logistic regression analysis using the SPSS (version 19) statistical package. For polymorphisms, the common homozygote genotype in the control subjects was set as the reference group. All reported values are two-sided, with P-value <0.05 considered as significant. Unadjusted significant P-values were corrected for multiple tests under the number of hypotheses tested (twenty for the 10 SNPs in each ethnic group), using the Benjamini-Hochberg (BH) method. Power of the study was calculated post hoc using QUANTO (v1.2).

Haplotypes were constructed from our population genotype data (including missing genotypes) using PHASE (v2.1) software. Phasing of case and control haplotypes was performed separately. Samples with ≥90% certainty of phase estimates were considered in the analysis. In order to obtain reliable results, the PHASE algorithm was applied 100 times for each haplotype using the -x option as instructed in the manual. The odds ratios (ORs) and their 95% confidence intervals (CIs) were estimated by χ² test. Significance for overall haplotype distribution between controls and cases was obtained with 1000 random permutations (P1000), where controls and cases were phased together.
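To make the odds ratio and confidence interval terminology concrete, the following minimal sketch computes a crude OR with a 95% CI from a 2×2 table using the standard log-odds (Woolf) approximation. It is an illustration only: the adjusted ORs in this study came from logistic regression in SPSS, and the function name and counts below are hypothetical.

```python
import math

def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Crude odds ratio with a Wald-type confidence interval.

    Uses the log-odds (Woolf) standard error; all four cells must be non-zero.
    Returns (odds_ratio, lower, upper).
    """
    a, b, c, d = case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: carriers vs. non-carriers of a risk genotype
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

In this hypothetical table the interval spans 1, so the crude association would not be called significant at the 5% level even though the point estimate is above 2.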
Gene-gene interactions were explored using the model-based multifactor dimensionality reduction approach (MB-MDR) by applying the ‘mbmdr’ R package to our whole dataset, including missing genotypes. General procedures of the three-step method and ‘mbmdr’ guidelines are fully described elsewhere. Briefly, in the first step of the algorithm, an association test between each multi-locus genotype and the phenotype is performed using logistic regression, where individuals with the multi-locus genotype of interest are compared against the rest of the individuals (the latter are considered as the reference group in the analysis). Genotypes are then assigned to one of three categories accordingly: high-risk, low-risk or no-risk. The second step of the algorithm explores the association of the pooled genotypes in the low-risk and high-risk categories, respectively, with the phenotype, using logistic regression analysis. Again, the rest of the individuals are considered as the reference group. Significance of results is explored through Wald statistics in the third step. In this study, the multi-order interaction with the most significant association between a specific multi-locus genotype and the phenotype was considered the best model and was further adjusted for multiple testing by the 1000 permutations approach (P1000). Stratification analysis for tobacco smoking and alcohol consumption was performed using the SPSS package.
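The first step described above, in which carriers of each multi-locus genotype are compared against everyone else and the genotype is then labelled high-risk, low-risk or no-risk, can be sketched in simplified form. This is not the ‘mbmdr’ implementation: a Pearson chi-square test on a 2×2 table stands in for the logistic regression, the labelling threshold of 0.1 is an assumption, and all names and data are hypothetical.

```python
import math
from collections import Counter

def chi2_p_2x2(a, b, c, d):
    """Pearson chi-square P-value (1 df) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    return math.erfc(math.sqrt(chi2 / 2.0))

def label_genotypes(genotypes, is_case, threshold=0.1):
    """Label each multi-locus genotype 'H' (high), 'L' (low) or '0' (no risk)."""
    cases = Counter(g for g, y in zip(genotypes, is_case) if y)
    controls = Counter(g for g, y in zip(genotypes, is_case) if not y)
    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    labels = {}
    for g in set(genotypes):
        a, c = cases[g], controls[g]            # carriers among cases, controls
        b, d = n_cases - a, n_controls - c      # everyone else (reference group)
        if chi2_p_2x2(a, b, c, d) >= threshold:
            labels[g] = "0"
        else:
            # Odds ratio above 1 means the genotype is enriched in cases
            labels[g] = "H" if a * d > b * c else "L"
    return labels

# Hypothetical two-locus genotypes for 50 cases followed by 50 controls
genotypes = ([("CC", "GG")] * 30 + [("CT", "GG")] * 10 + [("TT", "AG")] * 10
             + [("CC", "GG")] * 10 + [("CT", "GG")] * 10 + [("TT", "AG")] * 30)
is_case = [True] * 50 + [False] * 50
labels = label_genotypes(genotypes, is_case)
```

In the full method, genotypes sharing a label are then pooled and the pooled high-risk and low-risk groups are each tested against the remaining individuals.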
Freshly frozen normal and tumour oesophageal biopsies were obtained from 47 patients (Mixed Ancestry group) with histologically confirmed OSCC. Total RNA was extracted from homogenates of the tissue samples using the RNeasy Mini Kit (Qiagen) according to the manufacturer’s protocol. cDNA was prepared from 1 µg of total RNA using the ImProm-II™ Reverse Transcription System (Promega) and was subsequently used as a template in quantitative PCR (qPCR). qPCR assays were performed with the SYBR® FAST qPCR kit (KapaBiosystems) in 10 µL reactions containing 1 µL of cDNA and gene-specific primers for GAPDH, MSH3 and PMS1, respectively. The following primer pairs were used: GAPDH-fw (5′- GCC TGC TTC ACC ACC TTC) and GAPDH-rv (5′- GGC TCT CCA GAA CAT CAT CC); MSH3-fw (5′- GGC TCC TAT GTT CCT GCA GAA G) and MSH3-rv (5′- CCC TCT TCC TAG TTC ATC CAA GAT); PMS1-fw (5′- CCG TTA AGC ACA CCC AGT CAG) and PMS1-rv (5′- CAC AGG TTC AAT ATT CTC TCC CAC). All amplifications were performed as follows: initial denaturation at 95°C for 3 minutes, followed by 45 cycles of 95°C for 30 seconds, 60°C for 30 seconds and 72°C for 10 seconds. The analysed genes were amplified in triplicate in all 47 samples using the LightCycler 480 II instrument (Roche). MSH3 and PMS1 mRNA levels in each sample were normalized to GAPDH expression in the same sample using the efficiency-corrected comparative Ct model.
PCR efficiencies (E) were determined using LinRegPCR software. Differences in expression levels between groups were evaluated with the nonparametric Kruskal-Wallis test. Reported P-values were two-tailed.
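The normalization to GAPDH can be illustrated with a short Python sketch. The exact expression used in the study is not reproduced in this text, so the formula below is an assumption: one common efficiency-corrected form, in which the relative level of a target gene in a sample is E_ref^Ct_ref / E_target^Ct_target, with E given as the amplification factor per cycle (2.0 corresponds to 100% efficiency). The function name and the Ct values are hypothetical.

```python
from statistics import mean


def normalized_expression(ct_target, ct_reference, e_target=2.0, e_reference=2.0):
    """Efficiency-corrected expression of a target gene relative to a reference gene.

    ct_target:    replicate Ct values of the target gene (e.g. MSH3 or PMS1)
    ct_reference: replicate Ct values of the reference gene (e.g. GAPDH)
    e_target, e_reference: amplification factors per cycle (between 1 and 2)
    """
    mean_ct_target = mean(ct_target)
    mean_ct_reference = mean(ct_reference)
    # With both efficiencies equal to 2, this reduces to 2 ** -(Ct_target - Ct_ref).
    return (e_reference ** mean_ct_reference) / (e_target ** mean_ct_target)


if __name__ == "__main__":
    # Hypothetical triplicate Ct values for one sample.
    msh3_ct = [25.1, 24.9, 25.0]
    gapdh_ct = [20.0, 20.1, 19.9]
    print(normalized_expression(msh3_ct, gapdh_ct, e_target=1.92, e_reference=1.95))
```

The resulting per-sample values for normal and tumour tissue could then be compared between groups with a nonparametric test, as described above.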
The putative effects of nonsynonymous SNPs on protein function were predicted using the SIFT (Sorting Intolerant from Tolerant), PolyPhen (Polymorphism Phenotyping) and Align-GVGD algorithms. From a multiple protein sequence alignment, these bioinformatic tools provide prediction scores indicating the probability that a SNP is tolerated or deleterious. SIFT predicts the functional importance of an amino acid change based on sequence homology and the physical properties of amino acids. Likewise, Align-GVGD combines the biophysical characteristics of amino acids with protein multiple sequence alignments, whereas PolyPhen predicts the possible impact of an amino acid substitution using sequence conservation, phylogenetic and structural information characterizing the substitution. For all algorithms, 26 MSH3 and 19 MLH3 protein sequences were used as input sequences (Table S3). SIFT scores were designated as tolerant (0.201–1.00), borderline (0.101–0.20), potentially intolerant (0.051–0.10), or intolerant (0.00–0.05). PolyPhen scores were classified as probably benign (0.000–0.999), borderline (1.000–1.249), potentially damaging (1.250–1.499), possibly damaging (1.500–1.999), or damaging (≥2.000). For Align-GVGD predictions, variants were classified according to the A-GVGD graded classifiers used with the software (http://agvgd.iarc.fr/agvgd_input.php).
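The SIFT and PolyPhen score categories listed above can be written as a small lookup, sketched here in Python. The function names are hypothetical, and scores that fall in the small gaps between the published ranges (for example, a SIFT score of 0.0505) are assigned to the adjacent higher range, which is an assumption of this sketch.

```python
def classify_sift(score):
    """Map a SIFT score (0 to 1) to the categories used in the text."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT scores are expected to lie between 0 and 1")
    if score <= 0.05:
        return "intolerant"
    if score <= 0.10:
        return "potentially intolerant"
    if score <= 0.20:
        return "borderline"
    return "tolerant"


def classify_polyphen(score):
    """Map a PolyPhen score (0 or greater) to the categories used in the text."""
    if score < 0:
        raise ValueError("PolyPhen scores are expected to be non-negative")
    if score < 1.0:
        return "probably benign"
    if score < 1.25:
        return "borderline"
    if score < 1.5:
        return "potentially damaging"
    if score < 2.0:
        return "possibly damaging"
    return "damaging"


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(classify_sift(0.03), classify_sift(0.15))
    print(classify_polyphen(1.3), classify_polyphen(2.4))
```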
Genotype distributions at each SNP in controls and oesophageal cancer cases in two ethnic groups of the South African population.
Individual SNP effects on OSCC risk in two ethnic groups of the South African population.
Results of the computational analyses.
We would like to thank Antionette Olivier and Zenaria Abbas for assisting with the sample collection and processing, as well as patients and healthy controls for their participation in this study.
Competing Interests: The authors have declared that no competing interests exist.
Funding: This work was supported by the South African Research Chairs Initiative of the Department of Science and Technology and the National Research Foundation, the International Centre for Genetic Engineering and Biotechnology (ICGEB), the South African MRC and the University of Cape Town. MV is a recipient of an ICGEB postdoctoral fellowship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.