In this study, we used an integrative approach to analyze both single variants and haplotypes of genes in the NER pathway, including MDR analysis to account for the complex gene-gene and gene-smoking interactions, and principal components analysis for thorough exploration of correlations among variants that are not linkage-phase dependent. For Latinos, in the MDR analyses, smoking was a strong predictor of lung cancer, as expected, but three SNPs (ERCC2 rs171140, ERCC5 rs17655, and LIG1 rs20581) also increased the case-control prediction accuracy, suggesting that additional effect modification by genetic factors may also be important. Since MDR deals with statistical prediction, whether the results of MDR have any biological significance would need to be confirmed by laboratory studies.
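The core labeling step of MDR described above can be sketched as follows. This is a minimal illustration, not the software or settings used in the study: the counts are invented, the function and variable names are hypothetical, and only training accuracy is computed, whereas a full MDR analysis selects models by cross-validation.

```python
from collections import Counter

# Hypothetical counts, invented for illustration only:
# (smoker, carries_variant, is_case) -> number of subjects
TOY_COUNTS = {
    (1, 0, 1): 30, (1, 0, 0): 20,
    (1, 1, 1): 30, (1, 1, 0): 10,
    (0, 0, 1): 10, (0, 0, 0): 50,
    (0, 1, 1): 30, (0, 1, 0): 20,
}


def expand(counts):
    """Turn the count table into one record per subject."""
    records = []
    for (smoker, variant, case), n in counts.items():
        records.extend(
            {"smoker": smoker, "variant": variant, "case": case}
            for _ in range(n)
        )
    return records


def mdr_training_accuracy(records, factors):
    """Balanced accuracy of the MDR high-risk/low-risk labeling.

    Each combination of factor levels (a cell) is labeled high risk when
    its case:control ratio exceeds the overall case:control ratio.
    """
    n_cases = sum(r["case"] for r in records)
    n_controls = len(records) - n_cases
    cases, controls = Counter(), Counter()
    for r in records:
        cell = tuple(r[f] for f in factors)
        if r["case"]:
            cases[cell] += 1
        else:
            controls[cell] += 1
    # Cross-multiplied form of cases/controls > n_cases/n_controls,
    # which avoids dividing by zero in cells without controls.
    high_risk = {
        cell
        for cell in set(cases) | set(controls)
        if cases[cell] * n_controls > controls[cell] * n_cases
    }
    sensitivity = sum(cases[c] for c in high_risk) / n_cases
    specificity = sum(
        n for c, n in controls.items() if c not in high_risk
    ) / n_controls
    return 0.5 * (sensitivity + specificity)


records = expand(TOY_COUNTS)
acc_smoking = mdr_training_accuracy(records, ["smoker"])
acc_joint = mdr_training_accuracy(records, ["smoker", "variant"])
print(f"smoking only: {acc_smoking:.2f}; smoking + variant: {acc_joint:.2f}")
```

Training accuracy cannot decrease when a factor is added, because each cell is simply split further. This is why a gain in cross-validated prediction accuracy, rather than training accuracy, is the relevant evidence for an added SNP.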
Another strength of this study was the ability to control for ancestry differences among cases and controls within each ethnic group using ancestry informative markers. As previously described, cases in this study were ascertained from a population registry, while controls came from a variety of sources, including random digit dialing, Health Care Financing Administration (Medicare) rolls, and community sources such as churches and senior centers [47]. This may explain why the percentage of Amerindian ancestry was higher among Latino controls than cases; controls were more likely to have Central American heritage, while cases were more likely to be of third or higher generation US ancestry and of Mexican ancestry. Controlling for this difference in ancestry (population stratification) by including genetic ancestry, as determined by an extensive panel of ancestry informative markers, in the logistic models increases confidence that the observed differences between cases and controls for NER pathway genes are not due to ancestral differences. For Latinos, the adjustment for genetic ancestry moved the association toward the null for most SNPs or haplotypes, suggesting the existence of some population stratification, but the confounding of the gene-disease association by population stratification did not appear extensive. For African Americans, the results were almost identical with and without adjustment for genetic ancestry, suggesting that population stratification was minimal. Because population stratification depends on differences in allele frequencies and disease risks among ethnic groups, the minimal impact of population stratification observed in this study cannot be generalized to other studies with different SNPs and different admixed populations.
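How ancestry can confound a gene-disease association, and how adjustment moves the estimate toward the null, can be illustrated with a small sketch. It uses Mantel-Haenszel stratification over two hypothetical ancestry strata as a simpler stand-in for the logistic models with a genetic ancestry covariate used in the study; all counts are invented so that the variant has no effect within either stratum.

```python
# Hypothetical 2x2 tables per ancestry stratum, invented for illustration:
# (carrier cases, non-carrier cases, carrier controls, non-carrier controls)
STRATA = {
    "lower Amerindian ancestry": (10, 40, 40, 160),
    "higher Amerindian ancestry": (90, 60, 30, 20),
}


def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)


def crude_odds_ratio(strata):
    """Odds ratio after collapsing all strata into one table."""
    a, b, c, d = (sum(col) for col in zip(*strata.values()))
    return odds_ratio(a, b, c, d)


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel summary odds ratio across strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata.values())
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata.values())
    return numerator / denominator


crude = crude_odds_ratio(STRATA)
adjusted = mantel_haenszel_odds_ratio(STRATA)
for name, table in STRATA.items():
    print(f"{name}: OR = {odds_ratio(*table):.2f}")
print(f"crude OR = {crude:.2f}, ancestry-adjusted OR = {adjusted:.2f}")
```

In this toy example the crude odds ratio is inflated only because carrier frequency and the proportion of cases both differ between strata, which is the pattern that ancestry adjustment is meant to remove.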
Comparisons of our results for each gene in relation to previously reported literature are discussed in detail below.
In the current study, Asp312Asn (rs1799793) was not significantly associated with lung cancer risk among either Latinos or African Americans. In contrast, the Gln/Gln genotype of Lys751Gln (rs13181) was associated with increased lung cancer risk among Latinos but not among African Americans. The only other study of ERCC2 and lung cancer among African Americans also reported a null association between the Lys751Gln Gln/Gln genotype and lung cancer (OR=1.03; 95% CI: 0.40-2.65) and did not report on other ERCC2 variants. These variants have been assessed in twenty studies of Asians and Caucasians with mixed results [5, 6, 8-11, 13, 17-21, 23, 24, 26, 29, 36-38, 41]. A recent meta-analysis of ERCC2 variants found that the Asp312Asn polymorphism was not associated with risk of lung cancer in 11 populations [68], and that the Lys751Gln Gln/Gln genotype yielded a pooled OR of 1.30 (95% CI: 1.13-1.49) with data from 15 study populations. This association was confined to Caucasians (OR=2.25; 95% CI: 0.97-5.23) and was not apparent in Asian populations (OR=1.02; 95% CI: 0.20-5.27) [68]. However, the null result in Asians could be due to the low frequency of Gln/Gln among Asians (≤ 2% for 3 of the 4 Asian studies included in the meta-analysis) [68]. More recent studies also showed no association of lung cancer risk with the Asp312Asn polymorphism in either Asians [13, 24, 37] or Caucasians [9], while one [36] of five [9, 13, 19, 24, 36] recent studies showed a significantly increased risk of lung cancer associated with the Lys/Gln genotype of Lys751Gln. The functional impact of the ERCC2 polymorphisms is yet to be clarified. A recent study showed that the variants of the Arg156Arg, Asp312Asn, and Lys751Gln polymorphisms were all associated with decreased mRNA expression [69]; however, another study showed that the Asp312Asn and Lys751Gln variants and the double variant (Asp312Asn/Lys751Gln) had no impact on nucleotide excision repair capacity or the basal transcription of ERCC2 [70].
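The point about the low Gln/Gln frequency among Asians can be made concrete by recovering the standard error of each log odds ratio from the confidence intervals quoted above. This is a sketch that assumes the reported intervals are Wald intervals, symmetric on the log scale; the function and dictionary names are illustrative.

```python
from math import log
from statistics import NormalDist

# Two-sided 95% critical value of the standard normal distribution.
Z_95 = NormalDist().inv_cdf(0.975)

# (OR, lower 95% limit, upper 95% limit) as quoted in the text.
REPORTED = {
    "pooled": (1.30, 1.13, 1.49),
    "Caucasian": (2.25, 0.97, 5.23),
    "Asian": (1.02, 0.20, 5.27),
}


def log_or_standard_error(lower, upper):
    """SE of log(OR), assuming limits of exp(log(OR) +/- Z_95 * SE)."""
    return (log(upper) - log(lower)) / (2 * Z_95)


se = {
    name: log_or_standard_error(lower, upper)
    for name, (_, lower, upper) in REPORTED.items()
}
for name, value in se.items():
    print(f"{name}: SE(log OR) = {value:.3f}")
```

Under this assumption, the Asian estimate has the largest standard error of the three, so its interval is compatible with both no association and a substantial one.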
Ethnic differences in associations of lung cancer risk with ERCC2 variants suggest that either those polymorphisms may only be important for certain ethnicities or the presence or absence of associations could result from different linkage patterns between the SNPs genotyped and the causal SNPs. There is high variability in the allele frequencies and the linkage disequilibrium patterns of ERCC2 polymorphisms among Europeans, Africans, and Asians [50]. Thus, it is important to examine the association between ERCC2 haplotypes and the risk of lung cancer, as haplotype analysis may point to the important region(s) of the gene that warrant further examination. Furthermore, the lung cancer risk may not be attributable to individual SNPs, but rather to haplotypes, which may reflect the joint effect of multiple SNPs.
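The effect of differing linkage patterns can be illustrated by computing standard linkage disequilibrium measures between a genotyped SNP and an unobserved causal SNP in two populations. This is a sketch with invented haplotype and allele frequencies, not ERCC2 data; it assumes both loci are polymorphic in each population.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (|D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (each strictly in 0..1)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Hypothetical frequencies: A is the genotyped SNP allele,
# B is the allele at an unobserved causal SNP.
POPULATIONS = {
    "population 1": {"p_ab": 0.28, "p_a": 0.30, "p_b": 0.30},
    "population 2": {"p_ab": 0.04, "p_a": 0.30, "p_b": 0.10},
}

ld = {name: ld_stats(**freqs) for name, freqs in POPULATIONS.items()}
for name, (d_prime, r2) in ld.items():
    print(f"{name}: |D'| = {d_prime:.2f}, r^2 = {r2:.3f}")
```

With these invented numbers the genotyped SNP is a good proxy for the causal SNP in the first population and a poor one in the second, so the same causal variant could produce an association in one group and a null result in the other.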
For Latinos, both the haplotype and principal components analyses of ERCC2 suggested that block 2 and block 3 may be important regions associated with the risk of lung cancer. The strongest association was for block 3, which spans the 5′ upstream region of ERCC2. Given the association observed in Latinos, further examination and sequencing of the 5′ upstream region of ERCC2 may be warranted, since it may contain important regulatory sequences and polymorphisms influencing the expression of ERCC2.
Among Latinos, interaction analyses showed that the association between lung cancer risk and ERCC2 haplotypes was confined to non-smokers. Similar findings have been reported by three other studies in other ethnic groups [9, 11, 38]. A possible explanation is that the extensive damage due to the high dose of carcinogens among heavy smokers overwhelms the DNA repair capacity of ERCC2, and the “protective” advantages of certain genotypes or haplotypes are attenuated or obliterated under such conditions.
Too few studies have examined ERCC5 variants in relation to lung cancer risk for consistent results to have emerged. Among African Americans in this study, those with the His/His genotype of Asp1104His had a statistically significantly higher lung cancer risk. Although similar results were reported by the only other study among African Americans, those results were not statistically significant because of the small number of study subjects (71 cases and 71 controls) [7]. Significantly higher lung cancer risk among His1104 carriers has also been observed among Caucasians, Mexican Americans, and Asian Americans [7] and among Koreans [14]. In contrast, among Latinos, we observed a lower risk of lung cancer for those with the His/His genotype, although this was not statistically significant. A lower risk of lung squamous cell carcinoma for His carriers was also suggested in a study among Japanese subjects [24]. However, a study among Chinese found no association of the His1104 genotype or two ERCC5 haplotype blocks with lung cancer risk [26]. In contrast, a study among Caucasians reported increased lung cancer risk with the rare haplotype (CCCGA) formed by rs732321, rs4150360, rs3759500, rs3818356, and rs4771436 [19]. Since we typed only one SNP for ERCC5, we were not able to perform haplotype analyses.
Among African Americans, our analysis suggested a possible interaction between ERCC5 variants and smoking in relation to lung cancer risk, with ever-smokers carrying the His/His genotype having the highest risk of lung cancer. Two studies reported a similar interaction between Asp1104His and smoking on the development of lung cancer [7, 14].
The functional impact of the Asp1104His polymorphism is currently unknown, although the resulting amino acid substitution may affect the structural integrity of the protein. Future laboratory assessment is necessary to determine the functional impact of this polymorphism.
Among Latinos, none of the five LIG1 SNPs included in this study was significantly associated with lung cancer risk, although the number of A alleles of rs20579 showed a borderline significant trend toward increasing risk (p=0.07). For African Americans, the rs20579 A allele was significantly associated with a decreased lung cancer risk, while the rs439132 G allele was significantly associated with increased risk. A study among Eastern and Central Europeans showed that subjects who were heterozygous for rs20579 had an increased risk of young-onset lung cancer compared to those with the homozygous wildtype genotype [15]. In addition, the same study reported that the variant G allele of rs3730931 was associated with an increased risk of early-onset lung cancer, which was not observed in our study. Neither our study nor the study by Michiels et al. [19] found any association between the rs20581 (Asp802Asp) and rs156641 variants and lung cancer risk.
Among Latinos, neither our haplotype nor our principal components analyses revealed any association between LIG1 variants and lung cancer risk. For African Americans, our haplotype and principal components analyses suggested that variations in rs3730931 and rs20579, or regions linked to those two SNPs, may be associated with lung cancer risk. Similarly, the only other study of lung cancer risk and LIG1 haplotypes reported a statistically significant association [19], though different choices of SNPs and a study population with a different ethnic background make it difficult to compare the results of their haplotype analysis to ours.
Among Latinos, RAD23B Ala249Val variants were not significantly associated with lung cancer risk. We did not assess the Ala249Val polymorphism among African Americans since the minor allele frequency was low (4%). A study among Chinese reported an elevated lung cancer risk associated with having either the Ala/Val or Val/Val genotype [26]. Another study also observed a higher frequency of the Val allele among lung cancer cases compared to controls (0.18 vs. 0.15), although the difference was not statistically significant [19].
Similar to eight previous studies, we did not observe a statistically significant association between XPC Lys939Gln variants and lung cancer risk [3, 12, 15, 16, 19, 24, 26, 30].
A major limitation of this study is the relatively small sample size, which may have limited the statistical power to detect weak SNP-disease associations and increased the probability of spurious significant results. The study may also have lacked sufficient power to detect gene-environment interactions; therefore, the results of the gene-smoking analysis should be viewed as exploratory. In addition, SNP coverage is sparse in the genes examined by this study, so the negative findings do not necessarily preclude their importance in the development of lung cancer. Further studies should incorporate greater coverage of variation in NER pathway genes. Nevertheless, this is one of the few studies examining the association between NER SNPs and lung cancer among Latinos and African Americans.
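The dependence of power on sample size noted above can be sketched with a normal approximation for comparing carrier frequencies between cases and controls. The carrier frequency, odds ratio, and sample sizes below are hypothetical and are not the values of this study; the approximation uses unpooled variances and ignores the negligible opposite tail.

```python
from math import sqrt
from statistics import NormalDist


def case_carrier_frequency(p_controls, odds_ratio):
    """Carrier frequency among cases implied by the control frequency and OR."""
    return odds_ratio * p_controls / (1 - p_controls + odds_ratio * p_controls)


def approx_power(p_controls, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions."""
    p_cases = case_carrier_frequency(p_controls, odds_ratio)
    se = sqrt(
        p_cases * (1 - p_cases) / n_cases
        + p_controls * (1 - p_controls) / n_controls
    )
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p_cases - p_controls) / se - z_crit)


# Hypothetical scenario: 20% carriers among controls, a weak effect (OR = 1.3).
power_by_n = {
    n: approx_power(p_controls=0.20, odds_ratio=1.3, n_cases=n, n_controls=n)
    for n in (100, 500, 2000)
}
for n, power in power_by_n.items():
    print(f"{n} cases and {n} controls: power = {power:.2f}")
```

Under these assumptions, power for a weak main effect rises steeply with sample size; detecting an interaction of similar magnitude generally requires considerably more subjects than detecting a main effect.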
In conclusion, among Latinos, the current study showed that ERCC2 may be associated with risk of lung cancer, especially among non-smokers, and that smoking together with ERCC2, ERCC5, and LIG1 may have a joint influence on the development of lung cancer. For African Americans, we found that ERCC5 and LIG1 were independently associated with lung cancer risk. Thus, our study and others suggest that different elements of the pathway may be important in different ethnic groups, as a result of different linkage patterns, genetic backgrounds, and/or exposure histories. These results need to be confirmed by future large-scale studies among Latinos and African Americans.