Nucleotide excision repair (NER) is a complex pathway integral to repair of exogenous damage to DNA from a variety of sources. Small variations in this pathway may affect DNA repair capacity and, over time, could heighten risk for malignancy. The effect of interactions among these variations across the many genes involved in NER is largely unknown.
This study represents an analysis of common polymorphisms in the complete NER pathway and associated genes with risk for pancreatic cancer. Due to the explosion of high-throughput technology in genetic analysis, large-scale analyses are now possible for genetic epidemiology studies. Increasingly common among these are genome-wide association studies (GWAS), which are agnostic, without the need for choosing candidate genes or pathways. These can be costly, and often are only done on a small subset of the sample in a staged approach, so only one question can be addressed (usually overall adjusted risk using an additive model) in the second stage. An alternative is the candidate pathway approach, which is based on prior suspicion of association and thus follows a classic hypothesis-testing approach. In these studies, tag-SNPs are chosen in every known gene in the pathway in an attempt to capture most common sequence variation in the identified genes, relying on linkage disequilibrium. Variations may have a direct effect on gene function, but more likely are linked to potential causal variants. This approach is limited by our knowledge of the genes involved in pathways and their interactions, and will miss less common variation (such as deleterious mutations).
In order to screen for overall gene effects, we performed gene-level associations using a principal components analysis with each SNP of a gene included in the analysis, adjusted for important covariates.
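As a rough illustration of the gene-level screen described above, the following sketch computes the first principal component of the SNP genotypes within a single gene. The per-subject scores it produces would then enter a covariate-adjusted regression model, a step omitted here. The genotype coding (0, 1, or 2 minor alleles), the toy data, and all function names are assumptions made for illustration and are not the study's actual implementation.

```python
import math

def center_columns(genotypes):
    """Subtract each SNP's mean so every column has mean zero."""
    n = len(genotypes)
    m = len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in genotypes]

def covariance(centered):
    """Sample covariance matrix of the centered SNP columns."""
    n = len(centered)
    m = len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(m)]
        for i in range(m)
    ]

def first_component(cov, iterations=500):
    """Leading eigenvector by power iteration.

    Assumes the uniform starting vector is not orthogonal to the
    leading eigenvector, which holds for the toy data below.
    """
    m = len(cov)
    vec = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    return vec

def component_scores(genotypes):
    """Return (loadings, per-subject scores) for the first component."""
    centered = center_columns(genotypes)
    loadings = first_component(covariance(centered))
    scores = [sum(v * w for v, w in zip(row, loadings)) for row in centered]
    return loadings, scores

# Hypothetical genotypes: rows are subjects, columns are three SNPs in one gene.
toy_genotypes = [
    [0, 0, 1],
    [1, 1, 0],
    [2, 2, 1],
    [0, 1, 0],
    [1, 1, 2],
    [2, 2, 0],
]

loadings, scores = component_scores(toy_genotypes)
```

Summarizing a gene by a few component scores rather than by each SNP separately reduces the number of tests per gene, which is the appeal of this kind of screen when SNPs within a gene are correlated.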
Our study has implicated MMS19L (on 10q24.1), a human homolog of MMS19, a gene first noted in Saccharomyces cerevisiae to be involved in NER and RNA transcription, with separate domains required for each process.(31) MMS19L has not been well characterized in humans, but is believed to play a similar role in human NER, with several regions highly conserved and alternate splicing preserved across species.(33) The protein binds to the GTF2H complex via ERCC2 and ERCC3, though its exact function is unclear.(34) Analysis of MMS19L variants with cancer risk has only been reported in one study of lung cancer, with no alteration of outcome for one non-synonymous SNP.(35)
In addition to the gene-level analyses by principal components analysis, we also performed individual SNP, subgroup, haplotype, and interaction analyses within the pathway. As noted above, three SNPs in MMS19L appeared to associate with altered risk for pancreatic cancer. The association appeared to be strongest among women, ever smokers, former smokers who had quit more than 15 years earlier, and those with lower BMI. However, confidence intervals for these subgroups overlap with others, so these distinctions are considered exploratory.
In order to avoid missing possible associations of SNPs in genes not detected by the principal component approach, individual analyses were performed for all SNPs in the pathway. Because many of these will be associated simply by chance, replication will be required to confirm our findings.
In the pathway interaction analyses undertaken using recursive partitioning (RPART), no significant associations were found, though we cannot rule out interactions. Pathway analysis is limited by many factors, including unknown biological function of variants, the inability to separate chance findings from true differences, and a lack of consensus in the research community on how best to assess interactions. A potential limitation of RPART is that, due to binary splitting, subgroups are created with rapidly diminishing numbers of cases and controls. Thus, it may not detect more complex associations due to a lack of power in the smaller groups. However, an advantage of RPART is that it is agnostic, and does not simply constitute a compilation of positive findings, many of which could be false positives.
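The loss of power under binary splitting can be made concrete with a small calculation. The sketch below assumes, hypothetically, that every split sends the same fixed fraction of subjects down one branch. The starting sample size and the split fraction are illustrative values, not figures from this study, and the function name is made up for the example.

```python
def node_sizes(n_subjects, carrier_fraction, depth):
    """Expected size of the 'carrier' branch after each successive split.

    Assumes every split sends the same fraction of the remaining
    subjects down the carrier branch (a simplification).
    """
    sizes = []
    remaining = float(n_subjects)
    for _ in range(depth):
        remaining *= carrier_fraction
        sizes.append(remaining)
    return sizes

# Hypothetical inputs: 2000 subjects, 30% follow the carrier branch each time.
example_sizes = node_sizes(n_subjects=2000, carrier_fraction=0.3, depth=4)

for level, size in enumerate(example_sizes, start=1):
    print(f"after split {level}: about {size:.0f} subjects in the carrier branch")
```

Under these assumed values the subgroup shrinks geometrically, so an interaction involving three or four variants would have to be detected in a few dozen subjects or fewer.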
Perhaps more important than our findings with MMS19L, there does not appear to be a large effect of NER variation on pancreatic cancer risk overall. The low number of positive associations, many of which are likely due to chance, suggests that this pathway may be less important in the carcinogenesis of pancreatic adenocarcinoma. Replication of our findings, both positive and negative, in other study populations will be key to defining the role of polymorphisms in this pathway in pancreatic cancer risk.
Limitations of this study include genotyping failure of 5% of our samples, which could affect power and results but is unlikely to introduce a systematic bias. As this is a clinic-based case-control study, the choice of controls is problematic, since no control group perfectly matches the patient population. Indeed, patients seen at a referral center are likely younger, healthier, and at an earlier stage than those in the general population, and they must survive long enough to be seen. We attempted to minimize this with recruitment at the time of initial clinic appointment. In addition, using healthy patients seen in the General Internal Medicine Clinic as controls draws from a similar referral population at our institution, and the odds ratios seen for subjects from local and nonlocal locations of primary residence are consistent, at least for the MMS19L SNPs. We also did not correct for multiple comparisons in our analyses, as we view these findings as exploratory and not conclusive. Methods such as the Bonferroni method can be overly conservative in genetic analyses due to linkage disequilibrium.(36) The field has not yet reached a consensus on the correct adjustments needed, if any, aside from future replication, which we believe would represent the most important method of confirming that our findings did not occur by chance.
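To illustrate why a Bonferroni correction can be conservative when SNPs are in linkage disequilibrium, the sketch below compares the per-test threshold under the nominal number of tests with the threshold under a smaller number of effectively independent tests. Both test counts are hypothetical placeholders, not values from this study, and how to estimate the effective number of tests is itself part of the unsettled question of which adjustment is appropriate.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps family-wise error at alpha."""
    return alpha / n_tests

def expected_chance_hits(alpha, n_tests):
    """Expected number of tests passing an uncorrected threshold when no
    true association exists. By linearity of expectation this holds
    whether or not the tests are correlated."""
    return alpha * n_tests

ALPHA = 0.05
N_SNPS = 1000       # hypothetical number of SNPs tested
N_EFFECTIVE = 400   # hypothetical number of effectively independent tests under LD

nominal_threshold = bonferroni_threshold(ALPHA, N_SNPS)
effective_threshold = bonferroni_threshold(ALPHA, N_EFFECTIVE)
chance_hits = expected_chance_hits(ALPHA, N_SNPS)

print(f"Bonferroni threshold, nominal tests:   {nominal_threshold:.2e}")
print(f"Bonferroni threshold, effective tests: {effective_threshold:.2e}")
print(f"Expected uncorrected chance findings:  {chance_hits:.0f}")
```

With correlated SNPs the nominal threshold is stricter than necessary, while leaving results uncorrected means a predictable number of chance findings, which is why replication carries so much weight.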
Further studies to confirm the associations and to identify the functional genetic variants in MMS19L responsible for them are needed before these findings can be included in risk modeling for pancreatic cancer.