Polymorphisms in the double-strand DNA repair gene XRCC2 may play an important role in colorectal cancer (CRC) etiology, specifically in disease subtypes. Associations of XRCC2 variants and CRC were investigated by tumor site and tumor instability status in a four-center collaboration including three U.K. case-control studies (Sheffield, Leeds, Dundee) and a U.S. case-control study of cases from high-risk Utah pedigrees (total: 1,252 cases, 1,422 controls). The 14 variants studied were tagging-SNPs selected from HapMap/NIEHS data, supplemented with SNPs identified from sequencing of 125 cases chosen to represent multiple CRC groups (familial, metastatic disease, and tumor subsite). Monte Carlo significance testing using Genie software provided valid meta analyses of the total resource that includes family-based data. Similar to reports of CRC and other cancer sites, the rs3218536 R188H allele was not associated with increased risk. However, we observed a novel, highly significant association of a common SNP, rs3218499G>C, with increased risk of rectal tumors (OR 2.1, 95%CI 1.3-3.3; pchisq. =0.0006) versus controls, with the largest risk found for female rectal cases (OR 3.1, 95%CI 1.6-6.1; pchisq. =0.0006). The risk for rectal tumors differed significantly from that for proximal and distal colon cancers (pchisq. =0.02). Our investigation supports a role for XRCC2 in CRC tumorigenesis, conferring susceptibility to rectal tumors.
The X-ray repair complementing defective repair in Chinese hamster cells 2 (XRCC2) gene, located on 7q36.1, is an essential part of the homologous recombination repair (HRR) pathway and a functional candidate for involvement in tumor progression (1-3). DNA double-strand breaks (DSBs) trigger a response pathway via activation of ATM and the MRN complex (comprising MRE11A, NBS1, and RAD50) that is thought to initiate the DNA repair process (4). ATM phosphorylates CHEK2 (leading to cell cycle arrest) and the breast cancer proteins BRCA1 and BRCA2, and also activates TP53 (4-6). The DSB can then be repaired by two alternative pathways; homologous recombination repair (HRR) and the more error-prone non-homologous end joining (4, 7). DNA DSBs induce an S-phase co-localization to nuclear foci of BRCA1 and BRCA2 with RAD51, which is central to homologous recombination (8). Other members of the RAD51-related protein family, XRCC2 and XRCC3, are also essential for HRR (9, 10), and are required for correct chromosome segregation and the apoptotic response to DSBs (1, 11). Accurate repair of DSBs arising during DNA replication or from DNA-damaging agents is necessary to maintain genomic stability. Failure of these processes to repair DSBs can lead to mutations, apoptosis, tumor predisposition, and carcinogenesis (2, 4, 12); inherited deficiencies in a number of genes involved in the DSB pathway can confer an increased risk of cancer and may be predictive of later mortality (1, 12, 13).
Although defective mismatch repair (MMR) is known to cause hereditary nonpolyposis colorectal cancer (HNPCC) or Lynch syndrome, much less is known about the DSB pathway in colorectal cancer (CRC) etiology and the role of DSB-related genes has only recently begun to be investigated (14). Common variants within XRCC2, particularly a coding SNP in exon 3 (R188H, dbSNP ID rs3218536), have been identified as potential cancer susceptibility loci in recent studies although association results are mixed. The XRCC2 R188H polymorphism has been proposed to be a genetic modifier for smoking-related pancreatic cancer (15), was associated with an increased risk of pharyngeal cancer (16), and the rs2040639 SNP was reported to contribute to oral cancer risk (17). In contrast, R188H and three SNPs in the 5′ promoter region have been associated with reduced bladder cancer risk (18). A large multiethnic study of epithelial ovarian cancer showed an association between R188H and reduced risk (19), although validation studies could not provide confirmation (20, 21). Studies have implicated XRCC2 R188H in breast cancer (22-25); however, the Breast Cancer Association Consortium (26) and other subsequent studies found no association between R188H and breast cancer risk (27, 28), or evidence of a modest protective association (29, 30). Only a very limited number of studies of XRCC2 specific to colorectal lesions have been conducted to date. In a large nested case-control study of colorectal adenomas, no association with R188H was found (31). In a hospital-based study and a large CRC candidate gene SNP scan, Moreno, et al. and Webb, et al. observed no association of R188H and CRC, respectively (14, 32). Based on their meta analysis of these two studies, Vineis, et al. reported a nominally significant, modest increased risk of CRC with R188H (OR=1.16, 95%CI 1.01-1.34; p=0.034), which they characterize as weakly credible in a comprehensive analysis of associations reported between several variants in DNA repair genes across cancer sites (33). To our knowledge, no study has examined XRCC2 variants, other than R188H, in relation to CRC.
Colorectal tumors fall into two main groups, those exhibiting chromosomal instability (CIN) and those exhibiting microsatellite instability (MIN). The latter group is deficient in DNA MMR; this can be caused by inherited mutations in genes encoding MMR proteins MLH1 and MSH2, as occurs in HNPCC, or by loss of expression of these proteins (usually MLH1) in sporadic tumors. There are marked differences between MMR-proficient and MMR-deficient tumors in their etiology and progression. MMR-proficient tumors show CIN, not MIN, tend to be distally located in the colon, and carry mutations in genes such as K-ras and TP53 (34, 35). MMR-deficient tumors exhibit MIN, not CIN, tend to be proximally located, carry mutations in genes such as TGFBR2, BAX, MSH3 and MSH6, and have a better prognosis (36-38). There are also differences in epidemiological risk factors between CIN and MIN tumor types (39). Defects in DSB repair genes might, therefore, be predicted to confer a CIN phenotype (40) leading to the hypothesis that genetic differences exist between these two etiological pathways. Thus, in addition to evaluating overall risk of CRC, we examined MMR-proficient and MMR-deficient cancers separately, since the genetic risk factors may differ between the two types.
Our investigation is the first to genotype a comprehensive set of tagging-SNPs (tSNPs) in a meta-genetic association study of CRC in a large, combined resource that included three U.K. case-control cohorts (Sheffield, Leeds, and Dundee) and U.S.-Utah cases from high-risk pedigrees and matched controls.
In Sheffield, CRC cases were identified from subjects resident in Sheffield, U.K. and undergoing surgery for a primary colorectal tumor at the Royal Hallamshire or Northern General Hospitals, Sheffield between March 2001 and June 2005. Control subjects, age- and sex-matched to cases, were identified from Sheffield General Practice registers and recruited between October 2001 and December 2005. In Leeds and Dundee, incident CRC cases were identified between 1997 and 2000 from examination of pathology records at the Leeds and Dundee Teaching Hospitals NHS Trust, and age- and sex-matched controls were identified from the records of general practitioners of cases as described previously (41-43). In addition to 1:1 matched controls, an additional 198 Dundee controls with XRCC2 genotypes were available for analysis. In Utah, CRC cases were selected from 252 high-risk cancer pedigrees; one case per pedigree from 161 pedigrees (161 independent CRC cases), and two or more cases from 91 pedigrees (294 related CRC cases). A high-risk pedigree was defined as one containing a statistical excess of individuals with cancer, as assessed using the Utah Population DataBase (UPDB). The UPDB is a genealogical resource that is record-linked to the Utah Cancer Registry (UCR) and Utah vital records; it includes a subset of approximately 2.3 million individuals with extensive pedigree information from which high-risk families are identified. Utah controls which represent a convenience sample not specifically ascertained for this study, were selected to be cancer-free, and were matched by sex- and 5-year birth cohort to the prevalent cases. As age of Utah controls represents their age at ascertainment for prior studies, age at diagnosis for cases and age at selection for controls do not necessarily correspond; however, cases and controls were well-matched for age based on birth cohort (see footnote 2 of Table 1). Study subjects in all centers were of North European descent. 
The total resource included 1,252 cases and 1,422 controls that were genotyped for SNP variants in XRCC2. Proximal colon site was defined as tumors of the cecum through transverse colon. Distal colon was defined as tumors of the splenic flexure, descending, and sigmoid colon. Rectal cancer was defined as tumors of the rectosigmoid junction and rectum.
Usually, small, neutral discovery panels (a set of individuals unselected for disease with dense genotyping or sequence data) are used to select tSNPs to study. Recently it has been demonstrated that diseased discovery panels can be superior to neutral panels for selecting tSNPs that are more powerful to detect rarer genetic variants in common, complex disease (44). We therefore sequenced XRCC2 in a large, disease-based discovery panel, and incorporated these results in addition to using publicly available sequence and map data to determine a more comprehensive set of tSNPs to study. Publicly available SNP data included that derived from sequence data for >90% of all nucleotides across XRCC2 in 24 Caucasian samples available from the NIEHS SNPs Program1, and map data of 60 CEU samples available from HapMap2. We supplemented this with SNPs identified from the sequencing of exons and ~500 bp of the promoter region in 125 Caucasian CRC cases chosen to represent multiple groups (familial, sporadic, and metastatic disease) and tumor site (proximal colon, distal colon, and rectum) in a collection of U.K. and U.S. samples. Using a principal components method (45) and no restriction on MAF, we selected 14 tSNPs accounting for >93% of the intragenic variation in XRCC2. The average pairwise r2 between selected tSNPs and the unselected SNPs they were chosen to represent was 0.88. We identified a total of four supplemental variants from our sequencing of the disease-based panel; two that were not represented in the neutral NIEHS/HapMap data, as well as two rare, novel variants (Table 2). These 4 SNPs, plus the set of 14 tSNPs, were selected from a total of 93 possible XRCC2 SNPs. Of these, 12 tSNPs and the two novel variants were successfully genotyped in the combined 4-study resource of case and control subjects in the U.K. and U.S.
Genotyping was carried out at the Sheffield, U.K. center in 384-well plates using the Applied Biosystems SNPlex™ system which allows multiplex analysis of up to 48 SNPs3. At least 5% of samples were duplicated in the plates to assess the reproducibility of the genotype calls. For each SNP, duplicate concordance, call rate and test for compliance with Hardy-Weinberg equilibrium (HWE) in controls separately for each study site are shown in Supplementary Table S1. We required a duplicate concordance of at least 95%, a call rate of at least 90%, and HWE in controls (p>0.05) for a SNP to pass quality control. Two tSNPs (rs2106776 and rs3218455) failed quality control and two of the 4 supplemental SNPs failed primer design and were dropped from further analysis. However, tSNPs rs3218374 and rs3218536 adequately represented the omitted tSNPs (r2=0.7 and r2=1.0, respectively). The remaining 14 SNPs (12 tSNPs and 2 novel variants) were taken forward to analysis.
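The three quality-control thresholds above can be expressed as a short filter. The following is a minimal sketch only: it uses the standard asymptotic 1-df chi-square test for HWE and assumes unrelated controls, whereas the study's HWE testing was done in Genie. The function names and the genotype counts in the usage lines are hypothetical.

```python
import math


def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Asymptotic 1-df chi-square test of Hardy-Weinberg proportions.

    Takes genotype counts (AA, AB, BB) from unrelated controls.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(stat / 2.0))


def passes_qc(concordance, call_rate, control_counts_by_site):
    """Apply the thresholds stated in the text: duplicate concordance >= 95%,
    call rate >= 90%, and HWE p > 0.05 in the controls of every study site."""
    hwe_ok = all(hwe_chisq_pvalue(*counts) > 0.05
                 for counts in control_counts_by_site)
    return concordance >= 0.95 and call_rate >= 0.90 and hwe_ok


# Hypothetical SNP: genotype counts in controls at two sites
print(passes_qc(0.99, 0.97, [(360, 480, 160), (90, 120, 40)]))
```

A SNP is rejected as soon as any one site's controls depart from HWE, which mirrors the per-site reporting in Supplementary Table S1.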
Tumor samples in the U.K. studies, Sheffield, Leeds and Dundee, were assessed for mismatch repair capacity as measured by IHC of the MLH1 and MSH2 proteins using antibodies raised against MLH1 (G168-15, BD Biosciences) and MSH2 (Ab-2, Oncogene) as previously described (38, 46). MMR deficiency was defined as loss of MLH1 or loss of MSH2; conversely, MMR proficiency was the expression of MLH1 and MSH2. An assessment of MMR capacity was available for 468 cases out of 797 cases in the U.K. data with XRCC2 genotypes.
All analyses were conducted using Genie 2.6.2, a freely available software package4. Genie provides valid genetic association, HWE, and homogeneity testing in cases and controls that include related individuals using Monte Carlo significance testing (47). Specifically, Genie allows for valid meta association testing, where constituent studies can include a mixture of family-based and independent individuals. In such situations, methods such as logistic regression, as implemented in standard statistical software, are invalid because they assume independent observations. The meta association capabilities and validity of Genie are described elsewhere in detail (47) and have been applied previously in candidate gene meta-association studies (48, 49). We performed meta chi-square tests for trend and estimated odds ratios (OR) and empirical 95% confidence intervals (CI) using Cochran-Mantel-Haenszel (CMH) techniques for each SNP. We repeated our CMH analyses also controlling for sex, early or late age at diagnosis, and family history in addition to study center. The results did not differ substantively and are therefore not shown.
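The stratified estimate described above can be illustrated for a single 2x2 comparison (for example, CC vs. GC/GG) using the classical Mantel-Haenszel formulas, with study center as the stratum. This is a minimal sketch that assumes independent individuals and an asymptotic p-value; it is not Genie's Monte Carlo procedure, and the counts are hypothetical.

```python
import math


def mantel_haenszel_or(tables):
    """Pooled odds ratio over strata.

    Each table is (a, b, c, d):
      a = exposed cases,    b = unexposed cases,
      c = exposed controls, d = unexposed controls.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den


def cmh_chisq_pvalue(tables):
    """CMH 1-df chi-square test (no continuity correction), asymptotic p."""
    diff = 0.0
    var = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        cases, controls = a + b, c + d
        exposed, unexposed = a + c, b + d
        diff += a - cases * exposed / n  # observed minus expected
        var += cases * controls * exposed * unexposed / (n * n * (n - 1))
    stat = diff * diff / var
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts for two study centers
strata = [(20, 10, 10, 10), (40, 20, 30, 30)]
print(mantel_haenszel_or(strata), cmh_chisq_pvalue(strata))
```

Pooling within strata rather than collapsing the tables keeps differences in case-control ratio or allele frequency between centers from distorting the combined estimate.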
The primary statistical test employed throughout is a trend test, together with heterozygote and homozygote odds ratios to indicate effect size; however, a dominant model was used for SNPs with insufficient homozygote counts to maintain statistical validity (MAF<0.05). If the ORs indicated a recessive model, then this was also analyzed because a recessive model is not well-represented by a trend test (50). Stratified analyses by sex, age at diagnosis, family history, and tumor site were performed. A cutpoint of 60 years (~25th percentile of the distribution of diagnosis age in the cases) was used to determine early or late onset. As controls in Utah were age-matched by five-year birth cohort to cases, age was stratified by younger or older birth cohort to approximate a cutpoint of age 60. Cochran’s Q test was conducted to assess homogeneity of effect size across studies. Statistical heterogeneity was considered present if p<0.05. All p-values were empirically derived based on 10,000 simulations in the Genie null distribution as described (47, 51). The haplotype-mining hapConstructor module of Genie was used to comprehensively analyze multi-locus XRCC2 haplotypes and combined genotypes (52).
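The idea of an empirically derived p-value can be sketched as follows: the observed statistic is compared with statistics computed on simulated null data sets. The sketch below is an illustration under stated assumptions, not Genie's algorithm. It builds the null distribution by permuting case/control labels, which is valid only for independent individuals, and it uses the common "+1" convention so that the estimate is never exactly zero. The statistic, function names, and data are all hypothetical.

```python
import random


def empirical_p_value(observed_stat, null_stats):
    """Proportion of null statistics at least as extreme as the observed one,
    with the +1 correction in numerator and denominator (an assumption here)."""
    exceed = sum(1 for s in null_stats if s >= observed_stat)
    return (exceed + 1) / (len(null_stats) + 1)


def carrier_rate_difference(is_case, is_carrier):
    """Absolute difference in carrier frequency between cases and controls."""
    cases = [g for c, g in zip(is_case, is_carrier) if c]
    controls = [g for c, g in zip(is_case, is_carrier) if not c]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def permutation_null(is_case, is_carrier, statistic, n_sims=10_000, seed=0):
    """Null distribution from shuffled case/control labels.

    Assumes independent individuals; pedigree data need a method that
    respects relatedness.
    """
    rng = random.Random(seed)
    labels = list(is_case)
    null = []
    for _ in range(n_sims):
        rng.shuffle(labels)
        null.append(statistic(labels, is_carrier))
    return null


# Hypothetical data: 20 cases (15 carriers) and 20 controls (5 carriers)
is_case = [1] * 20 + [0] * 20
is_carrier = [1] * 15 + [0] * 5 + [0] * 15 + [1] * 5
observed = carrier_rate_difference(is_case, is_carrier)
null = permutation_null(is_case, is_carrier, carrier_rate_difference)
print(empirical_p_value(observed, null))
```

With 10,000 simulations, the smallest p-value this estimator can return is 1/10,001, which sets the resolution of any empirical p-value derived this way.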
A description of the four study populations in the U.K. and U.S. centers and the combined resource is shown in Table 1. Cases from Utah had a higher proportion of first-degree relatives with CRC than cases in the U.K. cohorts (panova <0.0001), and a higher proportion of early-onset cases (age 59 and younger; panova=0.002), as would be expected for CRC cases selected from high-risk cancer pedigrees. Utah also had a lower proportion of rectal cancer (panova<0.001). This is also as expected because the CRC high-risk pedigrees were ascertained primarily for excess of colon cancers, and the relative incidence of rectal cancer to colon cancer is higher in the U.K. than in the U.S. (53). In Table 2, we describe the XRCC2 tSNPs selected. All SNPs were in Hardy-Weinberg equilibrium; pairwise LD between the SNPs studied is shown in Supplementary Table S2.
Meta genetic associations of each tSNP with CRC are shown in Table 3. No results exhibited significant statistical heterogeneity across studies. Only one SNP indicated significant association with CRC. Individuals who were homozygous for the rs3218499 C risk allele had a 60% increased risk of CRC compared to the GG genotype (ORmeta 1.6, 95%CI 1.1-2.2). There was no increased risk for individuals who were GC heterozygotes, thus the C allele appeared to have a recessive mode of inheritance (pchisq.=0.009). Risk estimates were somewhat higher in the three U.K. studies for rs3218499 (Supplementary Table S3), although there was no statistically significant evidence for heterogeneity across the four studies (phom=0.90). No significant haplotype associations were found.
We evaluated whether the observed association for rs3218499 G>C differed by tumor site, sex, age at onset, and family history. In Table 4, the meta association results for rs3218499, stratified by each characteristic, are shown. For tumor site, we compared proximal colon, distal colon, and rectal cases to controls. We observed that the increased risk for cancer differed substantially by tumor site. In cases with rectal tumors, the association was highly significant compared to controls (CC vs. GC/GG, recessive: ORmeta 2.1, 95%CI 1.3-3.2; pchisq.=0.0006). Furthermore, the increased risk for rectal cancer was significantly higher than for proximal and distal colon cancer (pchisq.=0.02). Nominally significant results were also observed for other characteristics; however, none of the other subgroups differed significantly in case-case comparisons. When sex and CRC site were considered together, the risk was highest for female rectal cases compared to controls (CC vs. GC/GG: ORmeta 3.1, 95%CI 1.6-6.1; p=0.0006), and was significant in a case-case comparison to female colon cases (pchisq.=0.02, data not shown). However, the risk conferred by the rs3218499 C allele in female rectal cases was not statistically significantly different from the risk in male rectal cases (pchisq.=0.21, data not shown).
We inspected the association between rs3218499 and rectal cancer risk specifically in the four study sites. There was no statistically significant evidence for heterogeneity across the four studies for rs3218499 (phom=0.42) and all risk estimates were in the same direction; however, higher and more significant risk estimates were found in the three U.K. studies (CC vs. GC/GG; OR with 95%CI): U.K.-Sheffield, OR=1.8 (0.8-3.9); U.K.-Leeds, OR=3.7 (1.1-10.4); U.K.-Dundee, OR=2.0 (0.7-5.4); and U.S.-Utah, OR=1.4 (0.7-3.0). In female rectal cancer cases, statistical homogeneity was maintained (phom=0.80) and associations were more consistent across studies: U.K.-Sheffield, OR=2.4 (0.7-8.1); U.K.-Leeds, OR=5.8 (0.9-31.8); U.K.-Dundee, OR=2.6 (0.6-11.4); and U.S.-Utah, OR=3.1 (1.0-9.4), data not shown.
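As a rough check of the homogeneity statement, Cochran's Q can be approximated from the four site-specific rectal cancer estimates quoted above. This is a sketch under explicit assumptions: standard errors are back-calculated from the reported 95% CIs as if they were symmetric Wald intervals on the log scale (the study's intervals are empirical), inverse-variance weights are used, and the p-value is asymptotic. It therefore will not reproduce the reported phom exactly, and it is not Genie's procedure.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(odds_ratio, lower, upper):
    """Log odds ratio and an approximate SE recovered from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    return math.log(odds_ratio), se


def cochran_q(estimates):
    """Cochran's Q and the inverse-variance pooled log OR.

    estimates: list of (log_or, se) pairs, one per study.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, estimates))
    return q, pooled


def chi2_sf_df3(x):
    """Upper tail of chi-square with 3 df (four studies), in closed form."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


# Site-specific estimates for rectal cancer (CC vs. GC/GG) from the text
rectal = {
    "Sheffield": (1.8, 0.8, 3.9),
    "Leeds": (3.7, 1.1, 10.4),
    "Dundee": (2.0, 0.7, 5.4),
    "Utah": (1.4, 0.7, 3.0),
}
q, pooled = cochran_q([log_or_and_se(*v) for v in rectal.values()])
print(q, chi2_sf_df3(q), math.exp(pooled))
```

With only four studies Q has three degrees of freedom, so the test has limited power to detect moderate differences between sites.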
Rectal cancers in this study were less likely to exhibit MIN (3% of rectal tumors assessed for MMR capacity were deficient) than proximal colon (30%) or distal colon (6%) cancers, as observed elsewhere (35, 54), and differences in genetic and epidemiological risk factors between colon and rectal tumor subsites have been suggested to exist (55-59). An exploratory analysis of XRCC2 tSNP associations by MMR-deficient and MMR-proficient tumor status, in a subset of cases in the three U.K. studies, was thus performed; no difference in risk by MMR status was observed for the rs3218499 SNP. However, one rarer XRCC2 intronic SNP, rs3218402 A>G (MAF=0.03), was found to be nominally associated with MMR deficient CRC, in both a case-control (pchisq.=0.01) and case-case comparison for a dominant model (MMR deficient vs. proficient, pchisq.=0.04, data not shown). Carriage of the G allele conferred an increased risk of MMR-deficient CRC in comparison to controls (AG/GG vs. AA: ORmeta 2.9, 95%CI 1.1-7.4; data not shown). No increased risk of CRC was observed for MMR proficient tumors (AG/GG vs. AA: ORmeta 1.2, 95%CI 0.7-2.0; data not shown). Haplotype analyses based on MMR status suggested a haplotype of G-G across rs3218402 and rs3218385 was nominally associated with MMR-deficient tumors when compared to a referent wildtype haplotype of A-T (pchisq.=0.002). As several comparisons were made and only a subset of tumors in the overall resource had MMR status available, these results should be considered preliminary.
Our study represents the first comprehensive genetic characterization of the role of XRCC2 in CRC in a meta analysis of three U.K. case-control studies and a U.S. family-based study. We used both publicly available data and results from sequencing a panel of CRC samples, well-characterized for tumor type or genetically loaded from high-risk pedigrees, to select XRCC2 tSNPs for further study. Valid analyses were made possible via Genie, which is designed to analyze both related and independent individuals. A major strength of our investigation was the sample size of the combined resource, which allowed increased power to examine associations including sub-phenotypes of interest such as CRC subsite.
We observed no association between the putatively functional rs3218536 R188H SNP and CRC. Our most significant finding was a common XRCC2 intron 2 variant (rs3218499 G>C, MAF=0.23) that appeared to be strongly associated with increased risk of rectal tumors (ORmeta=2.1, p=0.0006), and female rectal cancer in particular (ORmeta=3.1, p=0.0006). This is a finding that has not been previously reported to our knowledge, and no known functional studies of this SNP (or a SNP in high LD with rs3218499) exist5. Genome-wide association studies on cancer including CRC have not reported any highly statistically significant findings for common DNA repair gene variants, including polymorphisms in XRCC2 (33); however, these studies have not focused on associations in rectal cancers specifically.
Risk estimates for XRCC2 rs3218499 and rectal cancer were notably stronger in the U.K. cohorts. There was no evidence of statistical heterogeneity across studies; however, the Q test can be insensitive when a small number of studies are included. It is possible that, in rectal tumors, XRCC2 interacts with environmental factors that may negatively impact DNA DSB repair, e.g. cigarette smoking. Environmental exposures, particularly smoking and alcohol consumption, are similar in the three U.K. sites but differ in the U.S.-Utah site, which is composed predominantly of members of the Church of Jesus Christ of Latter-day Saints (LDS or Mormon), many of whom abstain from alcohol and tobacco use; these differences may be responsible for the differential in risks observed across the studies. However, environmental differences could not be assessed directly as the relevant data were not available for this study. Potential heterogeneity in phenotype origin is another plausible explanation for differences observed between the U.K. and U.S.-Utah studies. Utah cases from high-risk pedigrees could be influenced by yet undiscovered high-risk alleles and therefore less influenced by low-risk XRCC2 alleles, although it is pertinent to note that CRC cases in the pedigrees were screened for HNPCC variants and Amsterdam-type criteria, and none were found to be responsible for the clustering. It is of note that female rectal cases were at highest risk (although not statistically significantly higher than male rectal cases; pchisq. =0.21), and that risk estimates were more consistent across study sites for this subset of disease. This observation may argue instead for an etiology that involves interactions with hormone factors. It has been suggested that exogenous estrogens may reduce risk of sporadic CRC (54, 55), although associations with specific types of hormones have been inconsistent and it is unclear whether some tumor types differ in risk (39, 56). 
Hence, the potential role of environmental factors as well as endogenous and exogenous hormones should be assessed in future studies of XRCC2 and CRC.
An exploratory investigation of CRC MMR status and XRCC2 found that SNP rs3218402 and, additionally, a haplotype across this SNP and rs3218385 were nominally associated with MMR-deficient tumors. The latter SNP was identified in our disease-panel sequencing and would not have been examined had the study relied solely on publicly available data, suggesting the potential importance of supplementing tSNP selection with sequence data from disease-based samples. As a number of tests were conducted in a small subset of cases (37%) in the overall study resource, these results may be due to chance or selection bias and should be considered preliminary pending confirmation in other studies. Another limitation of our investigation of MMR status is our assessment of MMR capacity by immunohistochemistry (IHC) of MLH1 and MSH2 proteins. IHC can be a valid tool to identify patients at risk for HNPCC or Lynch syndrome and patients with sporadic microsatellite unstable CRC (57). However, recent studies suggest that adding PMS2 and MSH6 to immunohistochemical detection of MMR proteins when screening CRC tumors gives greater sensitivity (comparable to MSI testing) than IHC detection of MLH1 and MSH2 alone (58). Thus, it is possible that some tumors in this investigation were misclassified with respect to MMR status.
It has been shown that disease-based panels for tSNP selection can improve detection of rarer variants (MAF 0.01-0.05) in subsequent association studies (44). Better characterization of such variants is due to their increased frequency and LD structure that may vary in disease panels relative to neutral resources. Consistent with this, we found there were loci identified in sequencing of our disease panel of 125 individuals that were not evident in the publicly available panels. However, our meta investigation which included a large collection of more than 2,500 subjects was still underpowered to detect associations of very rare variants (MAF <0.005), pointing to the need for continued large collaborations in studies of common, complex disease. It should also be noted that the increased power gained to detect association by including familial cases is accompanied by an over-estimate of the effect size as measured by the odds ratio for the general population (59). Tests of the null hypothesis (effect size or independence) remain valid with the combined populations; the Utah site contains predominantly familial cases and as such, while our significance values are valid, our meta OR estimates may be inflated. In our hypothesis-based investigation, we analyzed multiple SNPs and performed stratified analyses, including tumor site, gender, and MMR-proficient or deficient subgroups. As a number of comparisons were made the possibility of observing a chance finding exists, and p-values that achieve nominal significance should be interpreted with caution. Therefore, it is important that these association findings are replicated in other investigations for confirmation.
In summary, we present evidence that a common variant in XRCC2 is associated with increased risk of CRC, an association that is particularly strong with regard to rectal cancer in women. Preliminary findings suggest XRCC2 may also play a role in MMR-deficient CRC.
The authors are grateful to Study Coordinators, Laboratory Specialists, and Computer Specialist Jathine Wong. The authors declare that there are no competing interests.
The genotyping and data analysis were supported by the National Institutes of Health grants [CA123550 and CA98364 to N.J.C.]; research was supported by the Utah Cancer Registry, which is funded by contract N01-PC-35141 from the National Cancer Institute’s SEER program with additional support from the Utah State Department of Health and the University of Utah; partial support for datasets within the Utah Population Database was provided by the University of Utah Huntsman Cancer Institute; recruitment, data collection and genotyping in Sheffield were supported by Yorkshire Cancer Research grants [to A.C. and Professor Mark Meuth]; data collection in Leeds was supported by Cancer Research U.K. Programme Award [C588/A4994 to D.T.B.]; and data collection in Dundee and Leeds was supported by the U.K. Food Standards Agency award T01022.
5The rs3218499 tSNP represents SNPs rs3218384, rs3218408, rs3218410, rs3218417, rs3218425, rs3218461, and rs3218560; average pairwise r2=0.92.
CONFLICT OF INTEREST STATEMENT
The authors declare that they have no competing financial interests.