Using the published Texas lung cancer GWAS discovery dataset, we first analyzed 1,806 SNPs in 125 DNA repair genes. Of these, 32 SNPs in 17 genes showed an allelic effect on cancer risk at P < 0.01, although none reached genome-wide significance. We then assessed the associations between lung cancer risk and 20 SNPs of XRCC4, the top-hit gene among the 17. Six of the 20 SNPs (rs10040363, rs4591730, rs1017794, rs1011981, rs1478486, and rs9293329) were associated with lung cancer risk at P < 1×10⁻². The most significant was rs10040363 (P for allelic test = 4.89×10⁻⁴), and the imputed functional SNP rs2075685 was also associated with risk (imputed P for allelic test = 1.3×10⁻³). The minor alleles of the six top-hit observed SNPs appeared to be protective against lung cancer, which is consistent with the luciferase reporter assay data showing that the rs2075685 G>T change in the XRCC4 promoter increased XRCC4 expression.
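For readers unfamiliar with the allelic test, the following is a minimal sketch of a 1-degree-of-freedom Pearson chi-square test on a 2×2 table of allele counts. It is not necessarily the exact procedure or software used in the discovery GWAS; the function name `allelic_test` and the allele counts are hypothetical and chosen only for illustration.

```python
import math


def allelic_test(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square test (1 df) on a 2x2 allele-count table.

    Returns the chi-square statistic, its P value, and the allelic odds
    ratio for the minor allele (cases versus controls).
    """
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    # Shortcut formula for the Pearson chi-square of a 2x2 table.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts (NOT taken from the study).
chi2, p_value, odds_ratio = allelic_test(300, 700, 400, 600)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2e}, OR = {odds_ratio:.2f}")
```

An odds ratio below 1 for the minor allele corresponds to the protective direction described above.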
In our replication study of three independent top-hit SNPs and one imputed functional SNP, the rs10040363 G allele was associated with decreased risk of lung cancer at borderline statistical significance, whereas the other three SNPs (rs1478486, rs9293329, and rs2075685) were not significantly associated with risk. It is noteworthy, however, that both the rs1478486 A and rs2075685 T alleles showed a nonsignificant reduction in lung cancer risk. In the combined analysis of the GWAS discovery and replication datasets, the evidence for association strengthened for rs10040363 (Pdominant = 5.0×10⁻⁴ and P for trend = 5×10⁻⁴) and rs1478486 (Pdominant = 6.0×10⁻³ and P for trend = 3.5×10⁻³), and the direction of the risk estimates was consistent between the discovery and replication datasets. In the meta-analysis, however, we found no evidence of an association between overall lung cancer risk and any of the four XRCC4 SNPs. This underscores the importance of replicating, in different study populations, any reported effect of a low-penetrance locus on cancer risk, particularly one arising from a GWA study.
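To illustrate how per-study estimates can be pooled, the sketch below shows an inverse-variance fixed-effect meta-analysis of log odds ratios with Cochran's Q for heterogeneity. The fixed-effect model is an assumption made for illustration, since the model used in the meta-analysis is not stated in this section; the function name `fixed_effect_meta` and the input values are hypothetical and are not results from any of the studies.

```python
import math


def fixed_effect_meta(log_ors, std_errs):
    """Inverse-variance fixed-effect pooling of per-study log odds ratios.

    Returns the pooled log OR, its standard error, a two-sided P value,
    and Cochran's Q statistic for between-study heterogeneity.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    q_stat = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    return pooled, pooled_se, p_value, q_stat


# Hypothetical per-study odds ratios and standard errors (NOT study data).
log_ors = [math.log(0.8), math.log(1.0)]
std_errs = [0.1, 0.1]
pooled, pooled_se, p_value, q_stat = fixed_effect_meta(log_ors, std_errs)
print(f"pooled OR = {math.exp(pooled):.3f}, P = {p_value:.3f}, Q = {q_stat:.2f}")
```

In this toy input, one study shows a protective effect and the other shows none, so the pooled estimate is attenuated toward the null. This is the kind of pattern that can arise when a discovery signal is not reproduced in other populations.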
XRCC4 is a limiting factor in the NHEJ pathway [37], which is required for both normal development and tumor suppression. Mouse embryonic cells with disrupted XRCC4 show reduced proliferation, radiation hypersensitivity, chromosomal instability, and severely impaired V(D)J recombination [38]. A deficiency in XRCC4 not only increases the sensitivity of cells to X-rays but may also give rise to immunodeficiency in animals [39]. Although our experimental data showed that the rs2075685 G>T change in the XRCC4 promoter region increased XRCC4 expression, the association between this functional SNP and risk did not reach statistical significance in our replication dataset, even though the estimated effect was a similarly decreased risk. Because rs2075685 is located in the XRCC4 promoter region and its G>T change increases XRCC4 expression, if rs1478486 (one of the observed top-hit SNPs, in high LD with rs2075685) contributes to the risk of lung cancer, rs2075685 could be the causal SNP linked to rs1478486. It is also possible that rs10040363 in XRCC4, though intronic, is linked to other untyped causal SNPs. It is plausible, however, that disease-associated variants with modest effects may be distributed proportionately between coding and noncoding sequences of the genome [40]. Indeed, several studies have found that 'functional' intronic variants are associated with disease occurrence [41]. For example, an intronic SNP in a RUNX1 binding site of SLC22A4, which affects the transcriptional efficiency of SLC22A4, is strongly associated with rheumatoid arthritis [42]. The results from our replication dataset might represent an association of mild to modest effect, but such a weak association was not supported by the meta-analysis.
So far, at least two small studies have examined whether rs10040363 and rs2075685 are involved in susceptibility to lung cancer [43]. A French study of 151 cases and 172 controls found that the variant alleles of rs10040363 and rs2075685 were associated with decreased risk of lung cancer, the same direction of association as seen in our data. However, a candidate gene study from Taiwan of 164 lung cancer patients and 649 healthy controls found that rs2075685 was not associated with lung cancer risk. This discrepancy could be due to differences in ethnicity, genotyping platform, and study size. Again, such discrepancies underscore the importance of replication to rule out chance findings, particularly from under-powered studies.
The limitations of the present study include the following: 1) our analysis was limited to non-Hispanic whites, the controls were frequency-matched to the cases by age (within 5-year categories), sex, and smoking status, and all subjects in the discovery and replication datasets were ever smokers; 2) the sample size in the replication phase was smaller than that of the original discovery dataset; and 3) a subgroup meta-analysis was not conducted because only genotyping data were available from the other four GWA studies.