The formation of bulky DNA adducts caused by diol epoxide derivatives of polycyclic aromatic hydrocarbons has been associated with tobacco-induced cancers, and inefficient repair of such adducts by the nucleotide excision repair (NER) system has been linked to increased risk of tobacco-induced lung and head and neck (H&N) cancers. The human excision repair cross-complementation group 1 (ERCC1) protein is essential for a functional NER system and genetic variation in ERCC1 may contribute to impaired DNA repair capacity and increased lung and H&N cancer risk.
To comprehensively capture common genetic variation in the ERCC1 gene, Caucasian data from the International HapMap project were used to assess linkage disequilibrium and to choose four tagSNPs (rs1319052, rs3212955, rs3212948, and rs735482) in the ERCC1 gene. These tagSNPs were genotyped in 452 lung cancer cases, 175 H&N cancer cases, and 790 healthy controls. Haplotypes were estimated using the expectation maximization (EM) algorithm, and haplotype association with cancer was investigated using Haplo.stats software, adjusting for known covariates.
The genotype and haplotype frequencies matched previous estimates from Caucasians. There was no significant difference in the prevalence of rs1319052, rs3212955, rs3212948, and rs735482 when comparing lung or H&N cancer cases with controls (p-values > 0.05). Similarly, there was no association between ERCC1 haplotypes and lung or H&N cancer susceptibility in this Caucasian population (p-values > 0.05). No associations were found when stratifying lung cancer cases by histology, sex, smoking status, or smoking intensity.
This study suggests that ERCC1 polymorphisms and haplotypes do not play a role in lung and H&N cancer susceptibility in Caucasians.
The formation of bulky DNA adducts by compounds such as the diol epoxide derivatives of polycyclic aromatic hydrocarbons (PAHs) is a likely cause of tobacco-induced cancers. These DNA adducts are mutagenic and are repaired by the nucleotide excision repair (NER) system. A sub-optimal DNA repair capacity has been shown to be a risk factor for tobacco-induced cancers [2, 3]. Excision repair cross-complementation group 1 protein (ERCC1) forms a heterodimer with XPF that cleaves the DNA 5′ to the lesion and provides a link to the NER machinery through its interaction with the XPA protein as well as through binding to the DNA. XPF–ERCC1 has other DNA repair functions; it is involved in homologous recombination as well as the repair of interstrand crosslinks [8, 9]. Since ERCC1 is essential for a functional NER system, alterations in the expression or activity of this protein may affect NER-related DNA repair capacity.
Polymorphisms in DNA repair genes have been associated with increases in cancer susceptibility. Numerous case-control studies have investigated ERCC1 polymorphisms and their potential association with several types of cancer [10–17]. The 3’UTR C8092A and synonymous exon 4 T19007C polymorphisms have been studied most extensively because there are no non-synonymous SNPs in the coding region. The 3’UTR C8092A polymorphism is thought to be associated with alterations in transcript stability [11, 17], while the T19007C polymorphism has been associated with altered mRNA levels. Low expression of ERCC1 mRNA has been associated with increased lung and H&N cancer susceptibility [15, 18], while high expression has been shown to improve DNA repair capacity. The evidence that ERCC1 mRNA levels affect NER capacity is further supported by studies linking high ERCC1 expression in tumor tissue with resistance to DNA cross-linking agents such as cisplatin [20–24].
There is conflicting evidence for the association of ERCC1 SNPs with lung cancer risk [11, 12, 25, 26], and a meta-analysis of the C8092A and T19007C polymorphisms found no association. There are fewer studies involving ERCC1 polymorphisms and H&N cancer risk, but two studies suggest that the C8092A polymorphism increases risk [28, 29]. Because of the potential of ERCC1 as an important genetic biomarker for cancer risk, other approaches such as haplotyping are necessary to examine the role of ERCC1 genetics in tobacco-related cancer risk. Haplotype analysis allows for the comprehensive analysis of genetic variation across an entire gene, as it captures common genetic variation within a given haplotype block without requiring prior knowledge of SNP functionality. Despite the overall lack of association with known functional SNPs, Ma et al. found a protective effect of the TCCCATT haplotype on lung cancer risk in a large Chinese population. To our knowledge, there are currently no association studies involving ERCC1 haplotypes and H&N cancer risk, and no studies have examined ERCC1 haplotypes and cancer risk in a Caucasian population. In the present study, we investigated four tagSNPs that represent the common genetic variation in the ERCC1 promoter and coding region to test for potential associations with lung and/or H&N cancer risk in Caucasians.
The case-control study was conducted at the H. Lee Moffitt Cancer Center (Tampa, FL) from 2000 to 2003. Caucasian lung and H&N cancer cases (n = 452 and n = 175, respectively) were newly-diagnosed subjects with histologically confirmed lung or H&N cancer with no past history of other tobacco-related cancers. Caucasian controls (n = 790) were selected from community residents attending the Lifetime Cancer Screening facility of the Moffitt Cancer Center. Control subjects were randomly selected from thousands of community residents who underwent prostate-specific antigen testing, skin examinations, endoscopy, or mammography. Spiral computed tomography for lung cancer was not done at the clinic. The Lifetime Cancer Screening facility conducts community outreach and education programs throughout the Tampa Bay area, including lecture series, screening events, health fairs, literacy programs, and community-based partnerships. A list of control IDs was matched against the hospital patient database to identify any subject who might have developed cancer. All control subjects with a new cancer diagnosis were excluded from this study. Ninety-nine percent of the hospital cancer patients and ninety-seven percent of the clinic control patients who were asked to participate in the study consented. All study subjects signed a consent form approved by the institutional review board. A trained interviewer administered a structured questionnaire that obtained lifestyle and smoking history information including levels of education, occupation, year of smoking onset, current smoking status, number of cigarettes smoked per day, and years since quitting (for former smokers). The questionnaire was an abbreviated form of a previously-validated questionnaire used by investigators at the American Health Foundation in a large hospital-based, case-control study involving over 10,000 cases and 30,000 controls.
Individuals were categorized as current, former, or never smokers for the discrete variable ‘current smoking status’ according to the following method of classification: current smokers had smoked at least one or more cigarettes per day for the past year, former smokers had quit smoking one or more years prior to the interview, and never smokers had smoked fewer than 100 cigarettes in their lifetime. The medical charts of the case subjects were reviewed to obtain diagnostic and pathology records.
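The classification rule described above can be sketched as a small function. This is a minimal illustration, not the procedure actually used in the study: the function name, the argument names, and the "unclassified" fallback for subjects that the stated rules do not cover (for example, someone who quit less than one year before the interview) are assumptions made here.

```python
from typing import Optional


def classify_smoking_status(
    lifetime_cigarettes: int,
    cigs_per_day_past_year: float,
    years_since_quitting: Optional[float],
) -> str:
    """Classify a subject as 'never', 'former', or 'current' smoker.

    years_since_quitting is None for subjects who have not quit.
    All argument names are hypothetical, chosen for this sketch.
    """
    # Never smokers: fewer than 100 cigarettes in their lifetime.
    if lifetime_cigarettes < 100:
        return "never"
    # Former smokers: quit one or more years before the interview.
    if years_since_quitting is not None and years_since_quitting >= 1:
        return "former"
    # Current smokers: at least one cigarette per day for the past year.
    if years_since_quitting is None and cigs_per_day_past_year >= 1:
        return "current"
    # The stated rules do not cover this subject (assumed fallback).
    return "unclassified"
```

Returning an explicit fallback rather than forcing every subject into one of the three groups keeps edge cases visible for manual review.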
CEPH genotypes representing individuals with European ancestry were downloaded from the International HapMap Project  and linkage disequilibrium (LD) was determined using Haploview software . LD was estimated between all pairs of SNPs using the D’ statistic. Haplotype block structure was determined using the Solid Spine of LD option of Haploview with the block extended if pairwise D’ between SNPs was greater than 0.80.
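For readers unfamiliar with the D’ statistic, the following sketch computes it for a single pair of biallelic SNPs from haplotype and allele frequencies, using the standard definition (the raw disequilibrium D divided by its maximum attainable magnitude given the allele frequencies). The function name and the example frequencies are hypothetical, and the block-extension check is a simplification: Haploview's Solid Spine of LD procedure involves more than one pairwise comparison.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Return |D'| for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # raw linkage disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when either locus is monomorphic")
    return abs(d) / d_max


# Hypothetical frequencies, for illustration only.
value = d_prime(p_ab=0.25, p_a=0.40, p_b=0.25)
extends_block = value > 0.80  # threshold taken from the text above
```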
According to HapMap data, the coding region and promoter of the ERCC1 gene lie in separate haplotype blocks (Figure 1). HapMap data were used to identify four haplotypes in the coding region of the ERCC1 gene, which can be distinguished by three tagSNPs (rs3212948, rs3212955, rs735482), and two haplotypes in 10 kb of the promoter region, which can be distinguished by one tagSNP (rs1319052). All four tagSNPs had a minor allele frequency (MAF) of >15%. TagSNPs were genotyped using pre-designed Taqman 5’-exonuclease genotyping assays according to the manufacturer’s instructions, and SDS 2.2.2 software was used for automated determination of genotypes (Applied Biosystems, Foster City, CA). Briefly, allele-specific probes for rs1319052, rs3212955, rs3212948, and rs735482 were labeled with the fluorescent dyes VIC and FAM. During extension, the 5’-exonuclease activity of the Taq polymerase separates the fluorophore from the non-fluorescent quencher. A post-amplification allelic discrimination run on the ABI 7900HT was used to determine genotype based on the relative amount of fluorescence of VIC and FAM. PCR reactions were carried out in a total reaction volume of 5 µl in 384-well plates using the ABI 7900HT Sequence Detection System, and the thermal cycling conditions were 50 °C for 2 min, 95 °C for 10 min, and then 40 cycles at 95 °C for 15 s and at 60 °C for 90 s. Individuals involved in genotyping were blind to patient status.
Hardy-Weinberg equilibrium was assessed by χ2 tests. Odds ratios (OR) and 95% confidence intervals (CI) of association between cancer and SNP genotype were estimated using logistic regression. Potential confounding of the association between genotype and cancer risk by known risk factors was explored using Spearman rank correlation analyses and multivariate logistic regression models, including stepwise regression models. Single SNP associations were analyzed using SPSS 15.0 (SPSS Inc., Chicago IL).
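As an illustration of the Hardy-Weinberg check, the sketch below runs a one-degree-of-freedom χ2 goodness-of-fit test on the three genotype counts of a biallelic SNP. It is a minimal example using only the Python standard library, not the SPSS procedure used in the study; the function name and the genotype counts in the usage lines are hypothetical.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square test of Hardy-Weinberg equilibrium for one biallelic SNP.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("test is undefined for a monomorphic SNP")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts that sit exactly at equilibrium (p = 0.6).
chi2, p_value = hwe_chi_square(360, 480, 160)
```

A SNP would be flagged as departing from equilibrium when the p-value falls below the chosen significance level.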
Haplotypes were estimated using the expectation maximization (EM) algorithm and haplotype association with cancer was investigated using Haplo.stats software , adjusting for known covariates (age, gender, and pack-years). In addition to being analyzed in its entirety, the lung cancer data set was stratified by histology, gender, and smoking status to search for associations. Stratified analyses were not performed in the H&N cancer data set due to limited power. Significance of associations was calculated by a 1,000-permutation algorithm. Rare haplotypes (<2%) were excluded from the analysis.
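To make the EM step concrete, the sketch below estimates haplotype frequencies for just two biallelic SNPs from unphased genotypes. It is a deliberately reduced illustration and not the Haplo.stats implementation, which handles more loci and covariate-adjusted association testing. With two SNPs, the only phase-ambiguous subjects are double heterozygotes, so the E-step reduces to one weight. The function name, the genotype coding (0, 1, or 2 copies of the minor allele), and the example data are assumptions.

```python
from collections import Counter

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (allele at SNP 1, allele at SNP 2)


def em_two_snp_haplotypes(genotypes, max_iter=500, tol=1e-12):
    """Estimate two-SNP haplotype frequencies by expectation maximization.

    genotypes: list of (g1, g2) tuples, minor-allele counts per subject.
    Returns a dict mapping each haplotype to its estimated frequency.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}

    # Haplotype counts from subjects whose phase is unambiguous
    # (at least one of the two SNPs is homozygous).
    fixed = dict.fromkeys(HAPLOTYPES, 0.0)
    for (g1, g2), c in counts.items():
        if (g1, g2) == (1, 1):
            continue
        a1, a2 = alleles[g1], alleles[g2]
        fixed[(a1[0], a2[0])] += c
        fixed[(a1[1], a2[1])] += c
    n_double_het = counts[(1, 1)]

    freq = dict.fromkeys(HAPLOTYPES, 0.25)  # uniform starting values
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote carries 00/11
        # rather than 01/10, given the current frequency estimates.
        cis = freq[(0, 0)] * freq[(1, 1)]
        trans = freq[(0, 1)] * freq[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: add expected counts and renormalize.
        new = dict(fixed)
        new[(0, 0)] += n_double_het * w
        new[(1, 1)] += n_double_het * w
        new[(0, 1)] += n_double_het * (1 - w)
        new[(1, 0)] += n_double_het * (1 - w)
        new = {h: c / n_chrom for h, c in new.items()}
        delta = max(abs(new[h] - freq[h]) for h in HAPLOTYPES)
        freq = new
        if delta < tol:
            break
    return freq


# Hypothetical genotypes, for illustration only.
example = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
estimated = em_two_snp_haplotypes(example)
# Mirror the exclusion of rare haplotypes (<2%) described in the text.
common = {h: f for h, f in estimated.items() if f >= 0.02}
```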
The power calculation to identify a genotypic effect in the case control studies was performed with the software PS version 3.0.14  and was determined for a range of genotypic frequencies (0.1, 0.2 and 0.4) and odds ratios (1.6, 1.9, and 2.2) using a case-control design. The power calculations were based on two-sided tests with a 0.05 false positive level. With the given sample sizes, there was sufficient power to detect an increased risk in the lung and H&N cancer studies over nearly the entire range of frequencies and effect sizes. Specifically, to detect an OR ≥1.9 in the H&N study, there was 92% power for a SNP with a MAF of 0.2 and 97% power for a SNP with a MAF of 0.4. The power was even greater in the lung cancer study due to larger sample sizes. The SNPs observed in this study had high MAFs in the controls suggesting there was sufficient power to observe a significant effect if the SNP contributed an increased risk with an odds ratio of at least 1.6.
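The kind of calculation described above can be approximated with a standard two-proportion normal approximation for an unmatched case-control design. This is a sketch under that assumption and does not reproduce the PS software, which may use a different method. The function and parameter names are hypothetical, and the input frequency is treated as the prevalence of the risk genotype among controls.

```python
from math import sqrt
from statistics import NormalDist


def case_control_power(n_cases, n_controls, p_controls, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided test comparing the frequency of a
    risk genotype between cases and controls (normal approximation)."""
    if not 0 < p_controls < 1:
        raise ValueError("p_controls must lie strictly between 0 and 1")
    # Frequency among cases implied by the odds ratio.
    odds_cases = odds_ratio * p_controls / (1 - p_controls)
    p_cases = odds_cases / (1 + odds_cases)

    # Standard errors under the null (pooled) and under the alternative.
    pooled = (n_cases * p_cases + n_controls * p_controls) / (n_cases + n_controls)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    se_alt = sqrt(
        p_cases * (1 - p_cases) / n_cases
        + p_controls * (1 - p_controls) / n_controls
    )

    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    diff = abs(p_cases - p_controls)
    # Probability of rejecting in the direction of the true effect;
    # the negligible opposite tail is ignored.
    return NormalDist().cdf((diff - z_crit * se_null) / se_alt)


# H&N sample sizes from the text: 175 cases and 790 controls.
power_02 = case_control_power(175, 790, p_controls=0.2, odds_ratio=1.9)
power_04 = case_control_power(175, 790, p_controls=0.4, odds_ratio=1.9)
```

Under these assumptions the two calls give values of roughly 0.92 and 0.97, in line with the figures reported above, and substituting the lung cancer sample size (452 cases) gives higher power at the same settings.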
The basic demographic profile of the lung cancer case control subjects is shown in Table 1. Fifty-four percent were men and 46% were women. The mean ages of cases and controls were 64.2 ± 9.8 and 58.4 ± 10.4 y, respectively (p < 0.01). A significantly (p < 0.01) higher percentage of lung cancer cases than controls ever smoked cigarettes (92% vs. 62%). The mean pack-years of smoking for lung cancer cases (56 pack-years) was significantly (p < 0.01) higher than the mean pack-years observed for controls (24 pack-years). A significantly (p < 0.01) lower percentage of lung cancer cases (21%) were college educated as compared to controls (39%), and the mean BMI of 26.9 ± 5.1 in lung cancer cases was similar to that observed in controls (27.2 ± 4.9). Among the lung cancer case group, the most frequent tumor histology was adenocarcinoma (38%), followed by squamous cell carcinoma (24%).
The demographic profile of the H&N cancer cases (Table 2) shows that a significantly higher percentage of men (p<0.01) were found in the cases (74%) versus controls (54%). The mean age of 58.5 ± 11.4 y for H&N cancer cases was similar to that observed for controls. A significantly (p<0.01) higher percentage of H&N cancer cases than controls ever smoked cigarettes (75% versus 62%) and the mean pack-years for H&N cancer cases (48 pack-years) was significantly higher (p < 0.01) than that observed for controls (24 pack-years). A significantly (p<0.01) lower percentage of H&N cancer cases (22%) were college educated and the mean BMI of 26.7 ± 5.5 in H&N cancer cases was similar to that observed in controls.
All SNPs were found to be consistent with Hardy-Weinberg equilibrium. Of 376 samples repeated for quality control purposes, two samples (~0.5%) yielded contradictory results. The MAFs of the tagSNPs in controls were 0.40 (rs1319052), 0.39 (rs3212948), 0.25 (rs3212955), and 0.15 (rs735482), which were similar to those observed in the Caucasian data from HapMap.
No association was found between individual SNPs and lung or H&N cancer risk (Table 3 and Table 4). The genotype frequencies, OR, and 95% CI are shown, and indicate that none of the SNPs approached significance. Odds ratios were calculated by logistic regression adjusting for the covariates age, gender, and pack-years. When comparing the H&N and lung cancer cases, it is apparent that no trend exists for any of the SNPs. The genotypes that appear to be modestly protective in one set of cases are associated with risk in the other set of cases, and vice versa. For example, in the lung cancer cases the OR for the rs1319052 G/A genotype is 0.82 (95% CI = 0.61–1.11); however, in the H&N cases the OR for this genotype is 1.29 (95% CI = 0.88–1.89).
We used three of the tagSNPs to distinguish four haplotypes in the ERCC1 coding region and one tagSNP to distinguish two haplotypes in the promoter region of the ERCC1 gene. In this study, no association was found between any of the ERCC1 haplotypes and lung or H&N cancer risk, adjusting for age, gender, and pack-years (Table 5 and Table 6). Lung cancer cases were stratified by histology, gender, smoking status, and smoking intensity in an attempt to identify any specific associations that might be present, but none were found. In addition, non-small cell lung cancer and small cell lung cancer cases were analyzed separately because the etiology of these diseases may be different regarding ERCC1 status, but no associations were found. Based on the high linkage disequilibrium between all four SNPs in our population, an alternate haplotype combination including all four SNPs was analyzed, and no association with lung or H&N cancer risk was observed (data not shown).
Understanding the genetics of lung cancer etiology will enable individuals with high-risk genotypes to be targeted for screening and intervention. Although smoking cessation is currently the only intervention that has been shown to be effective in reducing risk of lung cancer in humans, several chemoprevention agents have shown promise in murine models [36–38]. While it remains to be seen whether chemoprevention of lung cancer can be achieved in humans, genetic screening can help physicians assess an individual’s risk more accurately and target them for early detection and lung cancer screening .
The results of this study suggest that ERCC1 polymorphisms do not play a role in lung or H&N cancer susceptibility in Caucasians. Previous studies investigating the association of ERCC1 polymorphisms with tobacco-related cancers have shown mixed results. For example, Zienolddiny et al. found a protective effect for the T19007C (C/C) genotype on lung cancer risk in a Norwegian population. However, studies of a large American population, a Danish population, and a Chinese population reported no association with lung cancer risk for this SNP [12, 25, 26]. Sugimura et al. found a significant gene-smoking interaction between the C8092A polymorphism and oral cancer risk in a Japanese population. However, no significant association was observed between the C8092A (C/C) genotype and risk of squamous cell carcinoma of the head and neck in a small Caucasian population, though a modestly significant increase in risk (p = 0.04) was seen in individuals who also had the ERCC2 G23591A (A/A) genotype.
More recently, Ma et al. found a protective effect for the C allele of rs3212948 in a Chinese population, in which the C allele is the minor allele (MAF=22%) . In the current study, this association was not observed for the rs3212948 C allele and lung or H&N cancer risk. Instead, a trend towards increased risk was noted for the C allele in both cancer populations. Since it is the major allele in Caucasians, sample size was not a likely contributing factor to this contradictory finding. Given the broad genomic differences between these races, it is not uncommon to find differences in association between them. In addition, rs3212948 is an intronic SNP and is not expected to be the functional SNP. Instead, the association is expected to be due to high LD with the functional SNP, which was not identified in the Chinese study . It is possible that Caucasians lack the functional SNP, though the protective effect was only seen in heterozygotes, which is a difficult phenomenon to explain . In the current study, there was no effect of the heterozygous (C/G) genotype for rs3212948 on lung or H&N cancer risk in Caucasians. The rs1319052, rs3212955, and rs735482 SNPs also were not associated with lung or H&N cancer risk in our population.
The ERCC1 gene is located in the chromosomal region 19q13.3, and haplotypes of chromosome 19q13.2–3 have been associated with cancer risk in previous studies [25, 40–43]. With respect to tobacco-related cancers, a haplotype of polymorphisms encompassing the ASE-1, RAI, and ERCC1 genes was associated with lung cancer risk in a large Danish cohort study [25, 42]. The association was strongest in middle-aged women who smoked heavily, and the authors suggested that a functional polymorphism in the RAI gene was most likely to be responsible for the observed association. Yin et al. found no association between the silent ERCC1 Asn118Asn SNP and lung cancer, but later discovered two at-risk haplotypes and three protective haplotypes in chromosomal region 19q13.2–3 in a small Chinese population, reflecting the more comprehensive nature of the haplotype approach. Recent studies suggest that ERCC1 haplotypes may also have functional significance. An in vitro study by Zhao et al. showed that a specific ERCC1 haplotype is associated with higher median levels of BPDE-DNA adducts in cultured primary lymphocytes from healthy Caucasians. In a study by Woelfelschneider et al., an ERCC1 haplotype was associated with lower mRNA levels in lymphocytes from Caucasian prostate cancer patients. Taken together, these findings demonstrate the need to further examine ERCC1 haplotypes for association with tobacco-induced cancers.
We found that ERCC1 haplotypes do not modulate lung cancer risk in our Caucasian population. Previously, Ma et al. identified an ERCC1 haplotype that was protective against lung cancer risk in a Chinese population , and another recent study also found an association between ERCC1 haplotypes and lung cancer risk in a Chinese population . In the latter study, haplotypes containing both −433C and 262G were found to be associated with higher risk for lung cancer, and the risk was higher in individuals that smoked more. In luciferase reporter assays, haplotypes containing the 262G allele displayed significantly lower transcriptional activity as well as lower protein binding measured by EMSA . The 262G allele is much more frequent in Caucasians (92.5%) as compared to Chinese (55.6%). If the 262G>T is the functional SNP, then the 262T allele may indeed exert a protective effect against lung cancer in Caucasians as well as Chinese. However, since the 262T allele has a frequency of only 7.5% in Caucasians, compared to 44.4% in Chinese, this could not be adequately assessed in the Caucasian population described in the present study given the sample size. Moreover, variability in other genes in the NER pathway may help explain the observed racial difference. A comprehensive genetic approach that implements haplotype analyses of multiple genes in the NER pathway may help resolve the differences between Caucasian and Chinese populations.
Tobacco carcinogens are known to induce numerous cancers within the H&N. We show for the first time that ERCC1 haplotypes are not associated with H&N cancer susceptibility in Caucasians. Low expression of ERCC1 has been associated with greater risk of squamous cell carcinoma of the head and neck. For this reason, we included a SNP in the promoter region of ERCC1 in our genetic analysis of this locus. Because the promoter of the ERCC1 gene lies in a different haplotype block than the coding region, this SNP was necessary to capture any alleles that contribute to expression variability. Even with this SNP included, no association was found between any of the ERCC1 haplotypes and H&N cancer risk.
This study was adequately powered to detect moderate associations, but sample size was a limiting factor for rare alleles that may be weakly associated with cancer risk. A larger sample size, especially for the H&N cases, would permit detection of such associations. Another potential limitation of the study design is that cases were drawn from H. Lee Moffitt Cancer Center oncology clinics, whereas controls were recruited from the community through an affiliated cancer screening clinic. Additionally, patients were enrolled at a single cancer center, and these findings were not validated at an independent center. A particular strength of this study is the high participation rate for cases (99%) and controls (97%), which limits selection bias. In conclusion, we comprehensively evaluated common genetic variation in the ERCC1 gene and found no association with lung or H&N cancer risk in Caucasians. Mice with homozygous ERCC1 mutations exhibit severe aneuploidy within weeks after birth and die before weaning. Given the lack of non-synonymous SNPs in the ERCC1 coding region, it is possible that the function of this gene is vital and that even minor functional alterations are incompatible with life.
These studies were supported by Public Health Service grants K99-CA 131477 (Gallagher), P01-CA 68384 (Lazarus), R01-DE13158 (Lazarus), K07-CA 104231 (Muscat), from the National Institutes of Health and PA-DOH 4100038714 (Lazarus and Muscat) from the Pennsylvania Department of Health. This project was also funded, in part, under a grant with the Pennsylvania Department of Health using Tobacco Settlement Funds (Spratt). We thank the Functional Genomics Core Facility at the Penn State University College of Medicine for real-time PCR services and Diane McCloskey for editorial assistance.
Conflict of Interest Statement
The authors of this work have no conflicts of interest to disclose.