In this study, we utilized a haplotype-tagging approach to examine the role of variants in genes involved in DNA repair and genomic maintenance in risk of childhood ALL. We found significant haplotype associations with total ALL for ERCC2, RAD51, APEX1, and BRCA2; and with specific ALL subtypes for NBN, XRCC4, and CDKN2A. In addition, we observed significant gene-environment interactions between XRCC4 variants and exposure to diagnostic X-rays modulating the risk of structural abnormality-positive childhood ALL. Our results provide strong support for a role of the DNA repair and cell cycle control pathways in risk of childhood ALL.
APEX1 encodes the major apurinic/apyrimidinic (AP) endonuclease in human cells. AP sites occur frequently in DNA as a result of spontaneous hydrolysis, damaging agents, or glycosylases that remove specific abnormal bases; therefore, APEX1 plays an important role in base excision repair. Gene expression analyses of APEX1 have shown high levels of expression in many different types of tumors, including osteosarcoma, ovarian, and digestive cancers [31, 32]. The two SNPs involved in the observed APEX1 risk haplotype were borderline significantly associated with total childhood ALL after adjustment for the false discovery rate (pFDR = 0.06 for both) [33]. Of these, rs3120073, whose variant allele was associated with decreased risk of ALL among both ethnicities combined, is located 6.5 kb upstream of APEX1, in intron 6 of OSGEP, a probable endopeptidase. Analysis of HapMap CEU population data shows that this SNP is not in strong linkage disequilibrium (r² > 0.80) with any nearby SNPs (<10 kb), though HapMap genotyping data for this variant were available for less than 50% of the CEU population. The other involved SNP, rs11160711, whose variant allele was associated with an increased risk of total childhood ALL among non-Hispanics, is 10 kb upstream of APEX1. Neither SNP has a known function. Additional studies are warranted to replicate these findings and identify functional variants that may be in strong linkage disequilibrium with these SNPs.
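The linkage disequilibrium threshold used above (r² > 0.80) can be made concrete with a short calculation. The following is a minimal sketch, assuming two biallelic loci and known haplotype and allele frequencies; the function names and the example frequencies are hypothetical and are not taken from the study or from HapMap.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def is_strong_ld(r2: float, threshold: float = 0.80) -> bool:
    """Apply the r^2 > 0.80 cutoff quoted in the text."""
    return r2 > threshold


# Hypothetical frequencies: the two alleles always travel together.
r2_linked = ld_r_squared(p_ab=0.3, p_a=0.3, p_b=0.3)
# Hypothetical frequencies: the two alleles are statistically independent.
r2_independent = ld_r_squared(p_ab=0.09, p_a=0.3, p_b=0.3)
```

A tag SNP with r² close to 1 for a nearby variant is a near-perfect proxy for it, which is why an association at an SNP of unknown function may point to a functional variant in strong linkage disequilibrium.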
The ERCC2 gene product is involved in nucleotide excision repair. Defects in this gene can cause rare genetic syndromes, including the cancer-prone syndrome xeroderma pigmentosum complementation group D, which necessitates protection from ultraviolet light [34]. We found a significant haplotype association for this gene with total childhood ALL risk. Separately, the BRCA2 gene product, a tumor suppressor, interacts with the RAD51 gene product, a recombinase, to effect homologous repair of double-strand breaks (DSBs) [35]. We observed significant haplotype associations for both BRCA2 and RAD51, albeit in different ethnic groups. These results must be confirmed, but they support a role for DSB repair pathways in risk of childhood ALL.
The X-ray repair cross-complementing protein encoded by XRCC4 is involved in nonhomologous end-joining repair of double-strand DNA breaks and the completion of V(D)J recombination events [36]. In addition to the significant haplotype associations with this gene for ALL with any structural changes, we observed a significant interaction between the XRCC4 risk haplotype and exposure to postnatal X-rays on risk of the same disease subtype. These effects appear to be driven largely by intronic SNP rs1193695, which showed nominally significant main effects and interactions with postnatal X-ray exposure for total ALL as well as for ALL subtypes defined by any structural changes and any numerical changes. Our results provide compelling support for a role of the nonhomologous end-joining repair pathway in risk of ALL, particularly ALL with structural abnormalities, both alone and in conjunction with exposure to ionizing radiation.
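To illustrate what a gene-environment interaction on the multiplicative scale means, the sketch below compares the haplotype odds ratio in X-ray-exposed children with that in unexposed children. It is a simplified stratified calculation, not the regression model used in the study, and all counts and function names are hypothetical.

```python
def odds_ratio(carrier_cases: int, noncarrier_cases: int,
               carrier_controls: int, noncarrier_controls: int) -> float:
    """Odds ratio for carrying the risk haplotype from a 2x2 table."""
    return (carrier_cases * noncarrier_controls) / (noncarrier_cases * carrier_controls)


def interaction_ratio(exposed_table: tuple, unexposed_table: tuple) -> float:
    """Ratio of haplotype odds ratios across exposure strata.

    A value of 1 means no multiplicative interaction; a value above 1 means
    the haplotype effect is stronger among the exposed.
    """
    return odds_ratio(*exposed_table) / odds_ratio(*unexposed_table)


# Hypothetical counts, ordered as
# (carrier cases, non-carrier cases, carrier controls, non-carrier controls).
exposed = (30, 20, 15, 30)      # children with postnatal X-ray exposure
unexposed = (40, 60, 50, 90)    # children without such exposure

or_exposed = odds_ratio(*exposed)
or_unexposed = odds_ratio(*unexposed)
ratio = interaction_ratio(exposed, unexposed)
```

In a logistic regression, the same quantity corresponds to the exponentiated coefficient of the haplotype-by-exposure product term, which also allows adjustment for covariates.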
Other genes showed significant subtype-specific haplotype associations as well. Haplotypes of NBN were associated with t(12;21) positive childhood ALL. NBN is part of the MRE11 DSB repair complex, involved in homologous DSB repair. Mutations in this gene can lead to Nijmegen breakage syndrome, a chromosomal instability syndrome that predisposes to cancer, among other diseases [38]. These results for NBN did not extend more broadly to ALL with any structural changes.
In contrast, haplotypes of CDKN2A were associated with both hyperdiploid ALL and ALL with any numerical ploidy changes. CDKN2A is a cell cycle control gene recognized as a tumor suppressor for its role in stabilizing p53. A study of secondary hits from a genome-wide association study found CDKN2A SNP rs3731217, which was not included for genotyping in our study, to be associated with childhood ALL, specifically B-cell precursor disease [39]. Our results support these previous findings for a role of CDKN2A in childhood ALL risk, and suggest that further studies should examine effects by ploidy.
Few previous candidate gene studies have investigated the main effects of DNA repair pathway gene variants in the etiology of childhood ALL, and those that have were typically of smaller sample size (fewer than 200 cases) and focused on analyses of individual SNPs with putative function (i.e., inducing amino acid changes or located in potential regulatory regions). One study reported a significant association of a promoter variant of CDKN2A with childhood pre-B-cell ALL [40], while another reported an association of a functional ERCC1 variant with total childhood ALL [41]. Other studies have found significant main effects of functional SNPs in XRCC1, and a significant haplotype combining three of these functional variants of XRCC1 has also been observed [42]. In our study, we did not observe a significant haplotype association for XRCC1, though we did observe significant effects for CDKN2A. Inconsistencies between our study and previous candidate gene studies may be attributable to a number of factors, including differences in approach (haplotype vs. individual SNP), population and/or allele frequency differences, and sample size differences that affect power to detect effects, as well as chance.
To our knowledge, this study is the first to examine the joint effects of genes in the DNA repair and cell cycle control pathways with ionizing radiation on risk of childhood ALL using a haplotype approach. We focused on interactions involving haplotypes with significant main effects, as the biological basis is unclear for interactions involving significant subgroup variation in exposure–disease associations with no main effects when subgroups are combined [44]. We also limited our investigation of potential interactions to those haplotypes with significant main effects in both Hispanics and non-Hispanics combined, as the sizes of the individual ethnic groups were considered too small to permit adequately powered examinations of interactions. A previous study focusing on single SNPs using a case-only interaction study design reported a suggestive interaction between an APEX1 SNP and postnatal X-ray exposure [45]. Per our a priori analysis plan, we did not examine APEX1 haplotype interactions since this gene showed significant main effects in only non-Hispanics. In this regard, we recognize that our total sample size (343 cases, 406 controls with both genetic and exposure data) may be insufficient to observe modest interaction effects with adequate statistical power. In addition, although the role of postnatal diagnostic X-rays in childhood ALL risk is not entirely clear [16], we previously found exposure to postnatal X-rays to be significantly associated with childhood ALL (OR = 1.85, 95% CI: 1.22–2.79 for 3+ vs. 0–2 X-rays) in a larger sample (n = 711 cases), of which the current study population is a subset [17]. However, these measures are derived from maternal reports of the child’s exposures and exposure periods and are potentially subject to reporting errors. Further studies with improved measures of ionizing radiation exposure, perhaps via review of medical records, as well as data on other potential DNA damaging exposures and larger sample sizes, are needed to confirm the associations observed.
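As a worked check on the previously reported X-ray association (OR = 1.85, 95% CI: 1.22–2.79), the standard error and an approximate Wald test can be recovered from the published interval. This is a minimal sketch, assuming the interval is a symmetric Wald interval on the log-odds scale; the function names are hypothetical, and the original analysis may have used a different method.

```python
import math


def se_from_ci(lower: float, upper: float, z_crit: float = 1.959964) -> float:
    """Standard error of log(OR) implied by a 95% Wald confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z_crit)


def wald_z_and_p(odds_ratio: float, se: float) -> tuple:
    """Wald z statistic and two-sided p-value for H0: OR = 1."""
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p


# Values quoted in the text for 3+ vs. 0-2 postnatal X-rays.
se = se_from_ci(1.22, 2.79)
z, p = wald_z_and_p(1.85, se)

# For a Wald interval, the point estimate is the geometric mean of the limits.
geometric_midpoint = math.sqrt(1.22 * 2.79)
```

Under this assumption the implied standard error of the log odds ratio is roughly 0.21, and the geometric midpoint of the interval lies close to the reported odds ratio, so the quoted numbers are internally consistent.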
One of the strengths of our study is the inclusion of U.S. Hispanics, an understudied population whose childhood leukemia incidence rates are the highest reported in California [46]. We selected SNPs in a manner to maximize capture of genetic variation in Hispanics, and examined Hispanics separately from non-Hispanics where there was significant heterogeneity in between-group effects of individual SNPs. Although this approach may have limited our ability to detect associations in the population as a whole, we believe it was necessary given that genetic susceptibility and/or patterns of linkage disequilibrium may be different in Hispanics versus non-Hispanics due to the Hispanic population’s relatively recent genetic admixture [23]. Results that differ between Hispanics and non-Hispanics may be due to differences in allele frequency and/or haplotype structure, or may reflect underlying differences in exposures that modulate the effects of genes. Regardless, if the results are not spurious, they represent potential risk loci, and we present them in either or both ethnic groups for replication and further follow-up. A final point is that the limited size of certain racial/ethnic sub-populations among non-Hispanics precluded further stratification of this group; therefore, heterogeneity among non-Hispanics might have obscured results.
In gauging these results, consideration must be given to several factors. First, despite this study’s relatively large sample size compared to those of most previous candidate gene studies, the presence of genetic heterogeneity due to the ethnic and racial diversity of the California population may have influenced our ability to detect associations. However, in this study population, we found no evidence of strong confounding due to estimated genetic ancestry, minimizing concerns about the impact of population stratification on the results. In addition, as noted above, our findings for CDKN2A confirm those from analyses of secondary hits from a recent genome-wide association study [39]. However, associations identified in our study for other genes, including APEX1 and others, were not observed in the primary or secondary hits of the large genome-wide studies conducted to date [39]. This may be due to stringent multiple testing adjustment (at the p ≤ 1 × 10⁻⁷ level) to account for the large number of individual variants investigated in the genome-wide studies. In contrast to the agnostic approach to discovery used in genome-wide studies, our study focused on relatively few genes representing key elements of the DNA repair and cell cycle control pathways. Other DNA repair genes not included in our study may also be associated with disease; we are unable to comment on these. We concede that results of our study may be due to chance, and therefore must be replicated. However, the haplotype-tagging approach we adopted maximizes capture of total variation within each candidate gene, and the haplotype analysis increases statistical power to detect associations over analyses of individual variants. Finally, differences in genetic risk factors by cytogenetic subtypes may be obscured in studies that do not stratify analyses by these subtypes.
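The contrast between genome-wide and candidate-gene multiple testing can be sketched numerically. The example below assumes a Bonferroni-style genome-wide threshold and uses the Benjamini-Hochberg procedure as one common false discovery rate adjustment; the source does not state which procedures were actually used, and the test count of 500,000 and the example p-values are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def benjamini_hochberg(pvalues: list) -> list:
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical genome-wide scan: 500,000 variants at a family-wise alpha of 0.05.
gwas_threshold = bonferroni_threshold(0.05, 500_000)

# Hypothetical candidate-gene study: four tests adjusted for the false discovery rate.
candidate_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

With half a million tests, the per-variant threshold falls to about 1 × 10⁻⁷, so a modest effect that survives false discovery rate adjustment in a focused candidate-gene study can easily fail to reach genome-wide significance.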
In summary, our results indicate that elements of the DNA repair and cell cycle pathways are likely to be associated with childhood ALL, and that some of these elements may interact with ionizing radiation exposures to modulate risk. The associations and interactions identified should be considered targets for further analysis in studies with larger sample sizes, high-quality environmental exposure data, and finer coverage of SNPs in the identified associated regions.