The regional genetic and lifestyle heterogeneity among populations from different parts of India has been noted by many investigators. This poses a serious impediment to genetic association studies in Indian populations. We therefore targeted the middle- and low-income semi-urban population, aged 22 to 80 years, from the state of West Bengal in this study. We also ensured that the case and control individuals who participated in the study had similar tobacco habits. The ongoing Million Death Study (MDS) in India finds an increase in age-specific cancer risk due to tobacco habit in the population from West Bengal. Another study also reported an association of oral habit and DNA damage with OSCC and leukoplakia in these populations.
The most promising associated SNP from this study is rs12515548 of MSH3. This SNP was significantly associated in three of the four analysis sets tested in the discovery phase (case-control, cancer-control and cancer-leukoplakia) and remained significant in the replication phase. No association was found with this SNP in GWAS of upper aerodigestive tract cancers, and this is the first report of its association with OSCC. However, several studies have shown association of other SNPs in the MSH3 and MSH6 genes with different cancers. It may be noted that we observed relatively strong P-values in the association tests for the given sample size, that the power of the study is 0.81, and that there was no population stratification. Nevertheless, further replication in the same and other populations is essential. rs12515548 is an intronic SNP located near exon 21 of MSH3, with a change from G to A (http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=12515548). Two functional attributes may be associated with this SNP: (a) functionality prediction using F-SNP revealed that it loses the capacity to bind the GATA family of transcription factors upon the change from G to A (confidence scores of the binding prediction for different GATA transcription factors range from 88.4 to 98.4), and (b) the miRBase analysis showed an increased affinity of hsa-miR-374a-3p for the risk allele (A) of the SNP (score 6.9, evalue 1.0 for allele A; score 60, evalue 5.6 for allele G). Direct experimental validation is needed to understand its exact functional role, if any. The results from the Indian Genome Variation Consortium and admixture mapping of the Indian population identified the caste populations of eastern India as Indo-European populations that show relatedness to the CEU population of HapMap. We thus built an LD map of MSH3 using imputed data from the HapMap CEU population and found an 81 kb LD block containing rs12515548, which includes exon 21 (data not shown). It would be interesting to examine whether such an LD block exists in this population and, if so, whether rs12515548 is linked with any other functional SNP of MSH3. The intronic SNP rs207943 of XRCC5, which also showed significant association with OSCC development, lies within a putative binding site of the transcription factor Skn-1 of C. elegans. Skn-1 is predicted to bind only the non-risk G allele of the SNP (F-SNP prediction score 0.5, binding score 87.1). The human homologs of Skn-1, Nrf1/2/3, are important transcription factors involved in oxidative stress resistance. Nrf2-deficient mice have attenuated expression of many detoxifying and antioxidant enzymes and are highly susceptible to carcinogen-induced toxicity and carcinogenesis. Thus, the relationship between the inability of Skn-1 to bind the risk allele C of this SNP and OSCC progression needs to be investigated further.
The study also probed genetic risk factors associated with the development of leukoplakia and its conversion to OSCC. We found different SNPs to be associated exclusively with the development of leukoplakia in normal individuals and with the progression of leukoplakia to cancer. For example, rs7003908 of PRKDC was reported to be associated with prostate and urinary bladder cancer in north Indian populations and with glioblastoma in the United States. Identification of a specific risk SNP associated with the cancer-leukoplakia comparison would be valuable as a prognostic biomarker for detecting cases in which leukoplakia has the potential to convert to oral cancer. However, replication of the association in another cohort of leukoplakia patients is required to validate these results.
Tobacco exposure is a known environmental factor associated with oral cancer and leukoplakia. We therefore performed association tests without adjusting for it and after stratifying the subjects by their tobacco exposure levels. The observation that a few polymorphic variants of DNA repair and damage response genes exhibited association in different tobacco-exposure groups suggests that DNA damage signals are processed differently by different polymorphic variants of these genes. Similar observations have also been made in previous studies of p53 gene polymorphisms. These SNPs might be useful for developing tobacco-associated predictive markers for oral cancer and leukoplakia. The MDR analysis revealed that age in OSCC and chewing in leukoplakia are the two important covariates that interact synergistically with the most potent risk SNPs of the respective diseases (rs12515548 and rs207943 for OSCC and rs12360870 for leukoplakia). The study revealed synergy between SNPs and redundancy between lifestyle factors, albeit without any additive effect. This phenomenon was also observed with SNPs from DNA repair genes in other cancer types. Thus, it may be suggested that the overall repair capacity contributed by different repair machineries and the independent effects of various lifestyle factors are the ultimate determinants of oral cancer and leukoplakia predisposition in an individual.
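As an illustration of the stratified analysis discussed above, the following minimal Python sketch computes an allelic odds ratio with a 95% confidence interval separately within each tobacco-exposure stratum. It is only a sketch under stated assumptions: the function name `odds_ratio_ci`, the stratum labels and all allele counts are hypothetical placeholders and are not data from this study, and the Woolf (log-based) interval is assumed here as a common choice, not necessarily the method used in the analysis.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: risk-allele count in cases,    b: other-allele count in cases
    c: risk-allele count in controls, d: other-allele count in controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts per tobacco-exposure stratum (NOT study data).
strata = {
    "low exposure": (30, 70, 25, 75),
    "high exposure": (45, 55, 25, 75),
}

# One independent 2x2 test per stratum, with no pooling across strata.
results = {name: odds_ratio_ci(*counts) for name, counts in strata.items()}

for name, (odds_ratio, lower, upper) in results.items():
    print(f"{name}: OR={odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 in one stratum but not in another would be consistent with an exposure-specific association, although a formal test of heterogeneity would be needed to support that reading.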
The present study suggests that MSH3, XRCC5, MRE11A
are the four most important genes that modify the risk of predisposition to oral cancer and leukoplakia in these eastern Indian populations. Polymorphic variants of these genes were found to be significantly associated with breast, pancreatic, colorectal and ovarian cancers. However, to the best of our knowledge, none of the variants identified in this study, except rs7003908, was previously reported to be associated with any other cancer. MSH3, upon phosphorylation by ATM/ATR, initiates DNA mismatch repair with MSH2 and directs downstream MMR events, including strand discrimination, excision, and re-synthesis with MLH1 and PMS1. XRCC5 forms a dimer with XRCC6 and increases the affinity of PRKDC, the catalytic subunit of DNA-PK [DNA-dependent serine/threonine protein kinase]. It plays several crucial roles, such as recognition of DSBs and recruitment of other components to them, and phosphorylation of several transcription factors, including p53. Several other phosphorylation substrates of PRKDC, such as c-Myc, PARP and c-JUN, also have crucial roles in cancer. MRE11A, one of the partners of the MRE11A-RAD50-NBN complex involved in DSB repair, also has roles in telomere integrity and meiosis. The functional implications of either the associated intronic SNPs or their linked functional SNPs in these genes need to be investigated in the future.