Carcinogenesis. 2009 April; 30(4): 626–635.
Published online 2009 January 27. doi:  10.1093/carcin/bgp033
PMCID: PMC2664455

Tobacco and estrogen metabolic polymorphisms and risk of non-small cell lung cancer in women


To explore the potential role for estrogen in lung cancer susceptibility, candidate single-nucleotide polymorphisms (SNPs) in tobacco and estrogen metabolism genes were evaluated. Population-based cases (n = 504) included women aged 18–74, diagnosed with NSCLC in metropolitan Detroit between November 2001 and October 2005. Population-based controls (n = 527) were identified through random digit dialing and matched on race and age. Eleven SNPs in 10 different genes were examined in relation to risk: CYP1A1 MspI, CYP1A1 Ile462Val, CYP1B1 Leu432Val, CYP17, CYP19A1, XRCC1 Gln399Arg, COMT Val158Met, NQO1 Pro187Ser, GSTM1, GSTT1 and GSTP1 Ile105Val. Lung cancer risk associated with individual SNPs was seen for GSTP1 [A allele; odds ratio (OR) = 1.85; 95% confidence interval (CI), 1.04–3.27] and XRCC1 (A/A genotype; OR = 1.68; 95% CI, 1.01–2.79) in white women and CYP1B1 (G allele; OR = 11.1; 95% CI, 1.18–104) in black women smokers. White women smokers carrying two risk genotypes at the following loci were at increased risk of lung cancer compared with individuals not carrying risk alleles at these loci: CYP17 and GSTM1, COMT and GSTM1, CYP17 and GSTT1, XRCC1 and GSTP1, CYP1B1 and XRCC1 and COMT and XRCC1. The most parsimonious model of lung cancer risk in white smoking women included age, family history of lung cancer, history of chronic lung disease, pack-years, body mass index, XRCC1 A/A genotype, GSTM1 null and COMT A/G or G/G genotype. These findings support the need for continued study of estrogen in relation to lung cancer risk. Polymorphisms in the tobacco metabolism, estrogen metabolism and DNA repair pathways will be useful in developing more predictive models of individual risk.


Lung cancer is the second most commonly diagnosed cancer in men and women and is the leading cause of cancer-related deaths for both groups in the USA (1). It has been suggested that women may be at increased risk of lung cancer compared with their male counterparts, and several sources of evidence, though inconsistent, suggest that estrogens may play a role in lung cancer. Variation in risk factors and tumor characteristics between men and women has been reported in studies showing that women are more likely to have adenocarcinomas of the lung (1), a higher risk in never smokers (2), higher levels of polycyclic aromatic hydrocarbon–DNA adducts at any given level of smoking (3), higher levels of expression of the gene encoding cytochrome P450 (CYP) 1A1 (3,4), more frequent G:C to T:A transversions in p53 (5) and more frequent epidermal growth factor receptor mutations (6) than men. Evidence supporting an estrogen component to risk based on epidemiologic studies has been inconsistent, with reports of increased risk of adenocarcinomas with ever/never use of estrogen replacement therapy (7,8), as well as reduced risk with ever/never use of oral contraceptives and hormone replacement therapy (HRT) (9,10) or increased risk associated with long duration of HRT use (11). In our own work, we found an inverse relationship between HRT use and non-small cell lung cancer (NSCLC) risk in post-menopausal women (12). This association was strongest in women with tumors expressing estrogen receptors (ERs). In addition to exogenous exposure to hormones, some findings suggest that lung cancer risk may be associated with endogenous hormones. A prospective cohort study of Canadian women found increased risk of lung cancer [hazard ratio = 1.42; 95% confidence interval (CI), 1.06–1.88] among women with five or more births compared with nulliparous women (11). 
Additionally, among parous women, those who had their first birth after age 30 were at reduced risk compared with those who had their first birth before age 23, after adjustment for various other risk factors, including smoking (11). A population-based cohort study of never-smoking Japanese women reported that the interaction between later age at menopause and earlier age at menarche was associated with increased risk of lung cancer (relative risk = 2.76; 95% CI, 1.32–5.77) after adjusting for age, region and passive smoking exposure, although age at menopause and menarche were not individually associated with risk (7). Thus, the roles that endogenous and exogenous hormones play in lung cancer development have not been elucidated through epidemiologic studies.

Laboratory-based evidence supporting a role for estrogens in lung cancer development and progression has also been published. Several studies suggest frequent expression of ER in lung tissue, lung tumors and lung cancer cell lines (13–18). Further epidemiologic evidence suggests that ER may be associated with NSCLC prognosis (19–22). Estrogen bound to ERs affects cell growth, and multiple pathways of ER action in the lung are currently being studied (13,14,18).

Numerous studies have taken a candidate gene approach to understanding susceptibility to lung cancer. The focus of this work has been on single-nucleotide polymorphisms (SNPs) in genes coding for enzymes involved in tobacco carcinogen metabolism (e.g. CYP1A1 and NQO1) and genes involved in DNA repair (e.g. XRCC1) (23,24). A number of the same genes involved in tobacco carcinogen metabolism pathways are also active in the estrogen metabolism pathway, including CYP1A1, CYP1B1 and the glutathione S-transferases (GSTs) (24). In addition, catechol-O-methyltransferase (COMT) enzymes appear to play a role in both estrogen metabolism and possibly nicotine dependence by breaking down catecholamines in the brain (25–27). CYP17 and CYP19A1 are both key enzymes in biosynthesis of estradiol, but do not have an established role in tobacco carcinogen metabolism (28–30). To further explore the role of estrogen in lung cancer development, we evaluated the role of the following SNPs in these tobacco and estrogen metabolism genes: CYP1A1 MspI, CYP1A1 Ile462Val, CYP1B1 Leu432Val, CYP17, CYP19A1, XRCC1 Gln399Arg, COMT Val158Met, NQO1 Pro187Ser, GSTM1, GSTT1 and GSTP1 Ile105Val along with tobacco use and estrogen exposures in determining risk of NSCLC in women in a large population-based study. In addition, we further examine the role of these SNPs as part of a risk prediction model, including both traditional and genetic risk factors.

Materials and methods

Study subjects

Subjects were enrolled through the population-based Metropolitan Detroit Cancer Surveillance System, a participant in the National Cancer Institute's Surveillance, Epidemiology and End Results program. This study has been described in detail elsewhere (12). Women aged 18–74, diagnosed with primary NSCLC in Wayne, Macomb and Oakland counties between 1 November 2001 and 31 October 2005 were eligible to participate. Ascertainment was originally limited to those with adenocarcinoma histology, but was broadened after 1 November 2004 to include all NSCLC histologic types since many histological diagnoses at the time of rapid case ascertainment were not more specific. The majority of cases (71.6%) were adenocarcinoma histology, 8.9% were squamous cell carcinoma, 3.0% large cell carcinoma and 16.5% were NSCLC, unspecified, reflecting this sampling method.

Of the 1092 women identified as eligible for the study, 582 women completed an in-person interview and provided a DNA sample (53.3%), 133 were too ill (12.2%), 273 refused (25.0%) and a working phone number could not be found for another 97 (8.9%). Seven women participated in the interview portion of the study but did not provide a DNA sample (0.6%). Women self-reporting race other than black or white (n = 16) were excluded from these analyses. Participants reporting a previous history of breast cancer (64 cases and 27 controls) were excluded from this analysis due to the associations between reproductive factors, estrogen-related SNPs and breast cancer risk thus ensuring that associations detected were not driven by differences in breast cancer risk factors between cases and controls. In total, interview data and DNA samples were available for 504 women with lung cancer.

Population-based controls were identified through random digit dialing. Exclusion criteria included a previous diagnosis of a primary lung cancer; however, due to the low prevalence of this condition, no controls were excluded for this reason. Control women were frequency matched to cases on race and 5-year age group. Of the households willing to complete the brief eligibility screening questionnaire, 565 (68.5%) participated. Eleven controls reporting race other than black or white and 27 with a previous breast cancer diagnosis were excluded from this analysis, leaving 527 controls with interview data and DNA samples for analysis.

Data and biospecimen collection

All local institutional and review boards approved this study. Informed consent was obtained from each subject prior to study participation in an in-person interview. Data collected included demographic information, self-reported race, smoking history, health history, reproductive history and environmental tobacco smoke (ETS) exposure. Health history included self-report of physician diagnoses of asthma, emphysema, allergies, pneumonia, bronchitis, chronic obstructive pulmonary disease, tuberculosis and cancer. Emphysema, chronic obstructive pulmonary disease and chronic bronchitis were combined to create a broad ‘chronic obstructive lung disease’ variable. Diagnoses reported within 1 year of lung cancer diagnosis (for cases) or interview (for the controls) were excluded. Non-smokers were individuals who reported smoking <100 cigarettes in their lifetime. Ex-smokers were women who reported quitting smoking ≥2 years prior to diagnosis (cases) or interview (controls). Both ex-smokers and current smokers were further categorized based on number of pack-years of smoking, based on the median amount of smoking in the control population. Light smokers were individuals with ≤25 pack-years of smoking, and heavy smokers included women with ≥26 pack-years of use. Dates and dosages for oral contraceptive use and HRT were collected. Estrogen only, estrogen combined with progesterone, progesterone only and unknown HRT formulations were included as HRT use in analyses. Family history of lung cancer was coded as yes or no based on detailed first-degree family history information.
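The smoking categories defined above can be written as a small classifier. This is a minimal sketch, not the study's own code: the function names are hypothetical, pack-years are computed with the conventional definition (packs per day times years smoked), and two boundary choices are assumptions because the text does not settle them (women who quit less than 2 years before diagnosis or interview are treated as current smokers, and values between 25 and 26 pack-years are treated as heavy).

```python
from typing import Optional

LIGHT_MAX_PACK_YEARS = 25  # from the text: light smokers have <=25 pack-years


def pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Conventional pack-years: packs smoked per day times years smoked."""
    return packs_per_day * years_smoked


def smoking_status(lifetime_cigarettes: int,
                   years_since_quit: Optional[float]) -> str:
    """Classify a participant as non-smoker, ex-smoker or current smoker.

    years_since_quit is None for women who had not quit at the
    reference date (diagnosis for cases, interview for controls).
    """
    if lifetime_cigarettes < 100:
        return "non-smoker"
    if years_since_quit is not None and years_since_quit >= 2:
        return "ex-smoker"
    # Assumption: quitting <2 years before the reference date counts as current.
    return "current smoker"


def smoking_intensity(total_pack_years: float) -> str:
    """Light/heavy split for ever smokers, based on the control median."""
    # Assumption: anything above 25 pack-years is heavy, including 25-26.
    return "light" if total_pack_years <= LIGHT_MAX_PACK_YEARS else "heavy"
```

For example, a woman who smoked 1.5 packs a day for 20 years has 30 pack-years and would fall in the heavy group under these assumptions.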


DNA extraction.

DNA was isolated from whole blood (Gentra Autopure system; Qiagen, Valencia, CA), buccal cells (Gentra Puregene kit) or paraffin-embedded tissue (QIAamp DNA Mini Kit, Qiagen) following the manufacturers’ protocols. When multiple biospecimens were obtained from a study participant, DNA extracted from blood was used preferentially, followed by DNA extracted from mouthwash samples and DNA extracted from normal tissue within paraffin blocks. In cases, DNA was obtained from blood (n = 428, 84.9%), buccal cells (n = 72, 14.3%) and normal tissue blocks (n = 4, 0.8%). Controls provided blood (n = 484, 91.8%) or buccal cells (n = 43, 8.2%) for genotyping.


DNA isolated from buccal cells or paraffin-embedded tissue was preamplified using a nested polymerase chain reaction (PCR) strategy. Preamplification (outer amplification) was carried out in a 25 μl reaction containing 2.5 mmol/l MgCl2, 0.5 μmol/l of the gene-specific outer forward and reverse primers, 1.25 U AmpliTaq Gold polymerase and 200 μmol/l of deoxyadenosine triphosphate, deoxycytidine triphosphate, deoxyguanosine triphosphate and deoxythymidine triphosphate. The outer amplification mixture was denatured at 95°C for 10 min and amplification was achieved by 15 cycles of 94°C for 30 s, a primer-specific annealing temperature for 30 s and 72°C for 1 min, followed by a final extension at 72°C for 10 min. The outer amplification was performed on a Mastercycler® Gradient thermocycler (Eppendorf, Westbury, NY). PCR failure was responsible for the following missing genotype data: eight cases, eight controls (CYP1A1 MspI); two cases, four controls (CYP1A1 Ile462Val); five cases, one control (CYP1B1); two cases (CYP17); four cases, one control (CYP19A1); four cases, one control (NQO1); two cases, three controls (GSTM1); one case (GSTT1); four cases, one control (GSTP1); seven cases and four controls (COMT) and two cases (XRCC1). Additionally, we did not attempt to genotype five cases and 24 controls at CYP19A1 due to time and financial constraints at the end of the study period. Overall, genotyping success rates were >98% for all genotypes.

Restriction fragment length polymorphism.

Genotypes for the CYP1A1 MspI (rs4646903) polymorphism were ascertained using restriction fragment length polymorphism protocols as described by Drakoulis et al. (31). Briefly, products of the enzyme digests were run on an 8% polyacrylamide gel and visualized by ethidium bromide staining. Sequencing verified a random sample of 5% of the subjects with 100% concordance. Laboratory techniques and primers are described in greater detail in Wenzlaff et al. (32).

TaqMan assays.

Genotypes for CYP1A1 Ile462Val (rs1048943), CYP17 (rs743572), CYP1B1 Leu432Val (rs1056836), XRCC1 Gln399Arg (rs25487) and COMT Val158Met (rs4680) polymorphisms were ascertained through the creation of custom TaqMan assays (TaqMan, Applied Biosystems, Foster City, CA) (see Table II footnotes for specific primers). The CYP19A1 tetranucleotide repeat polymorphism (TTTA) inner reaction contained 1× True Allele PCR Mix (Applied Biosystems), 333 nM CYP19 IF (5′-NED-TGAATGTGCCTTTTTTGAAATCATA-3′) and CYP19 R primer and either 1 μl of the preamplified outer reaction or 20 ng genomic DNA. The NED-labeled inner PCR products were mixed with ROX-labeled 500HD size standards (Applied Biosystems) and electrophoresed on an AB3100 (Applied Biosystems). CEPH 1347-02 DNA (Applied Biosystems) was used as a standard DNA and amplified and electrophoresed on each 96-well plate. NQO1 Pro187Ser polymorphism (rs1800566) genotyping was performed using a 5′-nuclease assay (TaqMan, Applied Biosystems) for all subjects and has been described previously (33). Likewise, the Ile105Val polymorphism of GSTP1 (rs1695) and null mutations of GSTM1 and GSTT1 were also determined by using the TaqMan 5′-nuclease assay and have been described previously (34). For all markers, 5% of the products were randomly sequenced and 10% of genotypes were carried out in duplicate, with 100% concordance between the genotype results.

Table II.
Genotype distribution in women with lung cancer and controls, by race

Statistical analysis

Genotype frequencies for each polymorphism were calculated for cases and controls, stratifying by race. Hardy–Weinberg equilibrium tests of genotype distribution in controls, by race, were conducted for all genes except GSTM1, GSTT1 and CYP19A1 because of the nature of the mutations. Comparisons of dichotomous risk factors were made between cases and controls using Pearson's chi-square tests; comparisons of means were conducted using Student's t-tests. Heterozygotes were combined with homozygotes to assess risk of lung cancer associated with carrying certain genotypes. Specifically, individuals with at least one C allele at CYP1A1 MspI (TC or CC) were compared with those with the TT genotype. Other combinations were as follows: CYP1A1 Ile462Val individuals carrying at least one G allele (Ile/Val or Val/Val) were compared with those with the AA (Ile/Ile) genotype; at CYP1B1 Leu432Val, those with at least one G allele (Leu/Val or Val/Val) were compared with those with the CC (Leu/Leu) genotype; at CYP17, those with at least one G allele were compared with the AA genotype; at NQO1, those with at least one T allele (Pro/Ser or Ser/Ser) were compared with the CC (Pro/Pro) genotype; at GSTP1, those with at least one A allele (Ile/Ile or Ile/Val) were compared with the GG (Val/Val) genotype; at COMT, those with at least one A allele (Met/Met or Val/Met) were compared with the GG (Val/Val) genotype and at XRCC1, those with the AA genotype (Gln/Gln) were compared with those with any G allele (Gln/Arg or Arg/Arg). For the repeat polymorphism in CYP19A1, TTTA10 was established as the cut point based on the frequency distribution in the control population. 
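The genotype groupings listed above amount to a lookup of which allele, and how many copies of it, places a woman in the comparison group. The sketch below is illustrative only: the dictionary keys, function names and the two-letter genotype strings are hypothetical conventions, and the deletion polymorphisms (GSTM1, GSTT1) and the CYP19A1 repeat are left out because they are not coded by allele letters.

```python
# (allele, minimum copies) defining the group compared against the
# reference genotype for each SNP, transcribed from the text above.
CARRIER_RULES = {
    "CYP1A1_MspI": ("C", 1),       # TC or CC versus TT
    "CYP1A1_Ile462Val": ("G", 1),  # AG or GG versus AA
    "CYP1B1_Leu432Val": ("G", 1),  # CG or GG versus CC
    "CYP17": ("G", 1),             # AG or GG versus AA
    "NQO1_Pro187Ser": ("T", 1),    # CT or TT versus CC
    "GSTP1_Ile105Val": ("A", 1),   # AA or AG versus GG
    "COMT_Val158Met": ("A", 1),    # AA or AG versus GG
    "XRCC1_Gln399Arg": ("A", 2),   # AA versus AG or GG
}


def is_carrier(snp: str, genotype: str) -> bool:
    """True if the genotype (e.g. 'AG' or 'A/G') is in the comparison group."""
    allele, min_copies = CARRIER_RULES[snp]
    return genotype.upper().count(allele) >= min_copies


def count_carried(genotypes: dict) -> int:
    """Number of loci at which a participant is in the comparison group."""
    return sum(is_carrier(snp, g) for snp, g in genotypes.items())
```

A count of this kind (0, 1 or 2 across a pair of loci) is one plausible way to set up two-gene comparisons; the exact coding used in the original analysis is not specified beyond the groupings above.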
The effect of each variant on lung cancer risk was tested in unconditional logistic regression models, adjusting for age at diagnosis (cases)/interview (controls), self-reported race (white/black), pack-years of smoking, current body mass index (BMI), family history of lung cancer and history of a chronic obstructive lung disease. Odds ratios (OR) and 95% CIs for each genotype were calculated from coefficients in the adjusted models. Analyses were repeated after stratification by age (<60/60+), race, smoking exposure history (never/ever and light/heavy use among smokers) and HRT use (ever/never).
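As a worked illustration of how an OR and its 95% CI follow from a logistic regression coefficient, the sketch below assumes Wald-type intervals, OR = exp(β) with limits exp(β ± 1.96 × SE); the text does not state which interval type was used. The function names are hypothetical, and the standard error is backed out of the GSTP1 estimate reported in this paper for white women (OR = 1.85; 95% CI, 1.04–3.27) purely to check the arithmetic.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def odds_ratio_ci(beta: float, se: float, z: float = Z_95):
    """Return (OR, lower, upper) from a log-odds coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower: float, upper: float, z: float = Z_95) -> float:
    """Approximate SE of the log OR implied by a reported Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Round trip on the reported GSTP1 estimate (illustrative only).
se = se_from_ci(1.04, 3.27)
or_, lo, hi = odds_ratio_ci(math.log(1.85), se)
```

The reconstructed limits differ slightly from the published ones because the published values are rounded, so the interval is not exactly symmetric on the log scale.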

Selecting the group that represented the majority of our lung cancer cases (white ever smokers), two gene combinations were analyzed in unconditional logistic regression models to assess risk of lung cancer associated with carrying no risk genotypes, one risk genotype or two risk genotypes. It was determined a priori that we would analyze single SNPs that were found to be statistically significant in the white population. Data are presented for combinations of the GSTs and XRCC1 with the other SNPs analyzed in this study. In addition, using data from this subgroup of white ever-smoking women, we performed a model building procedure using forward selection which included both environmental risk factors (age, current BMI, pack-years, obstructive lung disease and family history) and the candidate genotypes. To assess which environmental risk factors were important and whether the addition of genotype information into this model would improve the fit, we used the likelihood ratio test to calculate the statistical significance of nested models as new terms were added. Differences in the 2 ln (likelihood of full model/likelihood of null model) between nested models (notated as ‘G’) were compared with the χ2 distribution, with the degrees of freedom equal to the difference in the degrees of freedom between the two models. In addition to the main effects of these genes, we considered two- and three-way gene–gene interactions. We also considered the interaction of each gene with environmental risk factors: age (<60/60+ years of age), BMI tertiles (≤24.9/25.0–29.9/30+), personal smoking history (0–25/26+ pack-years), personal history of an obstructive lung disease (yes/no) and family history of lung cancer (yes/no). When evidence of gene–gene or gene–environment interaction was found, we examined cases and controls separately using a log-linear model. Adjusted ORs for variables included in the most parsimonious model are presented. 
The significance level was set at α = 0.05 and all tests were two sided. All statistical analyses were done using SAS version 9.1 (SAS Institute, Cary, NC).
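The nested-model comparison described above can be sketched as follows. This is an illustration under stated assumptions rather than the SAS procedure that was used: `chi2_sf` and `likelihood_ratio_test` are hypothetical helper names, and the chi-square tail probability is computed with the standard library only, for integer degrees of freedom.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2:   # odd df: closed form for df = 1
        sf, k = math.erfc(math.sqrt(half)), 1
    else:        # even df: closed form for df = 2
        sf, k = math.exp(-half), 2
    while k < df:  # recurrence stepping from k to k + 2 degrees of freedom
        sf += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return sf


def likelihood_ratio_test(loglik_reduced: float, loglik_full: float,
                          df_diff: int):
    """Return (G, p) for two nested models fitted to the same data."""
    g = 2.0 * (loglik_full - loglik_reduced)
    return g, chi2_sf(g, df_diff)
```

With one added term (1 degree of freedom), G must exceed about 3.84 to be significant at α = 0.05. For instance, `chi2_sf(5.91, 1)` is roughly 0.015, assuming the term enters the model as a single binary indicator.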


Table I presents the characteristics of case and control women, overall and stratified by race. Among white women, cases and controls were frequency matched on age (at diagnosis or interview), with the mean for both groups of ~60 years of age (P = 0.19), with the majority of the subjects reporting to be post-menopausal (95.1% of cases and 85.5% of controls, P < 0.0001, data not shown). White controls had higher BMIs than case women (P < 0.0001) and had attained higher levels of education (P < 0.0001). White control women were less likely to be current smokers (P < 0.0001). Among those who did smoke (either current or ex-smokers), white control women reported significantly fewer pack-years of smoking (P < 0.0001). Exposure to ETS as a child was similar among all white women (P = 0.11), and among those who were exposed (approximately three quarters of women in both groups), the mean number of years was similar for both groups (P = 0.07). As adults, both household exposure to ETS and work exposure to ETS were more common among the white case women (both P < 0.0001), and the number of years exposed at home and work was also significantly higher (both P < 0.0001). White cases were also more likely than controls to report a personal history of chronic obstructive lung disease (P < 0.0001) and to report a family history of lung cancer in a first-degree relative (P < 0.0001). There were no differences in reported ever/never use of HRT between the white case and control women (P = 0.25).

Table I.
Characteristics of NSCLC cases and controls

The mean age reported among black cases and controls was similar (P = 0.94), with 88.7% of cases and 81.8% of controls reporting to be post-menopausal (P = 0.14, data not shown). Black controls had higher BMIs than case women (P < 0.0001). Educational attainment was similar for both groups (P = 0.48). Black control women were less likely to be current smokers (P < 0.0001), and those that were ever smokers reported significantly fewer pack-years of smoking compared with black case women (P < 0.0001). Exposure to ETS as a child and as an adult at home was similar among all black women (P = 0.18 and P = 0.94, respectively). Work exposure to ETS was more common among the black case women (P = 0.05), but the number of years exposed did not differ significantly (P = 0.47). Black cases were not more likely than controls to report a personal history of chronic obstructive lung disease (P = 0.38), but were more likely to report a family history of lung cancer in a first-degree relative (P < 0.01). There were no differences in reported ever/never use of HRT between the black case and control women (P = 0.24).

Table II reports the genotype frequency by case or control status, stratified by race. The following genotype distributions in controls differed between white and black women: CYP1A1 MspI (P = 0.03), CYP1B1 (P < 0.0001), CYP19A1 (P < 0.0001), GSTM1 (P < 0.0001), GSTP1 (P < 0.005), COMT (P < 0.0001) and XRCC1 (P < 0.0001). We examined Hardy–Weinberg equilibrium in the white controls and black controls separately, and no evidence of deviation from Hardy–Weinberg equilibrium in either group was found. The only significant difference between cases and controls was seen in black women for the CYP17 genotypes (P = 0.01). All other genotype frequencies were similar between cases and controls in unadjusted analyses.
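A Hardy–Weinberg check of the kind mentioned above can be sketched with a 1-degree-of-freedom chi-square goodness-of-fit test on the three genotype counts in controls. The paper does not say whether a chi-square or an exact test was used, so this is an assumption; the function name is hypothetical and the counts in the usage note are invented for illustration, not taken from Table II.

```python
import math


def hwe_chi_square(n_hom_a: int, n_het: int, n_hom_b: int):
    """Return (chi2, p) for deviation from Hardy-Weinberg proportions.

    Expected counts are n*p^2, 2*n*p*q and n*q^2, where p is the
    observed frequency of allele a and q = 1 - p.
    """
    n = n_hom_a + n_het + n_hom_b
    p = (2 * n_hom_a + n_het) / (2 * n)
    q = 1.0 - p
    observed = (n_hom_a, n_het, n_hom_b)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With made-up counts of 25, 50 and 25 the observed genotypes match Hardy–Weinberg proportions exactly, so the statistic is 0 and the P value is 1; counts of 50, 0 and 50 (no heterozygotes) give a statistic of 100 and a vanishingly small P value.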

Table III presents lung cancer risk associated with each genotype for all women combined, then stratified by race and by smoking status and race and HRT use, after adjusting for BMI, age, family history of lung cancer, pack-years of smoking (among smokers) and obstructive lung disease (in the smokers groups only). Among all women, those with GSTM1 null genotypes were more likely to have lung cancer than those with a GSTM1 present genotype (OR = 1.40; 95% CI, 1.02–1.91). Women carrying at least one A allele at GSTP1 were also at greater risk of having lung cancer compared with those with the GG genotype (OR = 1.59; 95% CI, 1.02–2.47); a similar association was seen for white women only (OR = 1.85; 95% CI, 1.04–3.27). Women carrying the A/A genotype at XRCC1 were 1.6-fold more likely to have lung cancer compared with women with at least one G allele (OR = 1.62; 95% CI, 1.01–2.60), and this association was also seen for white women only (OR = 1.68; 95% CI, 1.01–2.79), particularly in ever-smoking white women (OR = 2.11; 95% CI, 1.13–3.94). Among black ever-smoking women, those carrying the C/G or G/G genotype were 11.09-fold more likely to have lung cancer compared with those carrying the C/C genotype at CYP1B1 (OR = 11.09; 95% CI, 1.18–104.25), although it should be noted that these results are based on six individuals with the C/C genotype. There were no other associations between single genotypes and lung cancer risk among the entire group, by race or by HRT use.

Table III.
Risk of NSCLC associated with polymorphisms in candidate genes, by cigarette smoking, hormone use and race

Table IV presents lung cancer risk associated with single genotypes and two gene combinations (the GSTs and XRCC1) for white ever-smoking women, with ORs and 95% CIs adjusted for age, BMI, pack-years of smoking, family history of lung cancer and history of obstructive lung disease. GSTM1 and GSTT1 were included because, although not statistically significant at the 0.05 level in our population, they are among the most commonly analyzed gene–gene combinations in the literature and have both been identified singularly in larger pooled and meta-analyses as having modest associations with lung cancer risk (35,36). When examining two genes and coding risk as 0 risk genotypes, 1 risk genotype or 2 risk genotypes, the combination of carrying GSTM1 null genotype and the CYP17 risk genotype (A/G and G/G) was associated with an increased risk of lung cancer compared with individuals who carried 0 risk genotypes at these loci (OR = 1.83; 95% CI, 1.02–3.30). Increased risk was also seen for individuals carrying one or two risk genotypes at GSTM1 and COMT compared with those with no risk genotypes at these loci (OR = 2.28; 95% CI, 1.07–4.84 and OR = 2.57; 95% CI, 1.18–5.59, respectively). Risk was also increased for individuals carrying either risk genotype for GSTM1 or XRCC1 (OR = 1.73; 95% CI, 1.12–2.65) compared with those with no risk genotypes at these loci. Individuals with both GSTM1 and XRCC1 risk genotypes had non-statistically significant increased risk compared with those with no risk genotypes (OR = 2.12; 95% CI, 0.92–4.89). Increased risk of lung cancer was also seen among individuals who carry both the GSTT1 and CYP17 risk genotypes compared with those who do not (OR = 2.06; 95% CI, 1.02–4.19). Increased risk was also seen for those carrying either the GSTT1 or XRCC1 risk genotypes (OR = 1.76; 95% CI, 1.11–2.77) and for individuals carrying GSTT1 and GSTP1 risk genotypes compared with those with no risk alleles at these loci (OR = 2.56; 95% CI, 1.02–6.42). 
Individuals with GSTP1 and XRCC1 risk genotypes were >3-fold more likely to have lung cancer compared with those with no risk genotypes at these loci (OR = 3.57; 95% CI, 1.38–9.21). Individuals with both risk genotypes for XRCC1 and CYP1B1 were also >3-fold more likely to have lung cancer compared with those with neither risk genotype (OR = 3.39; 95% CI, 1.35–8.53), and those with XRCC1 and COMT risk genotypes were >2-fold more likely to have lung cancer compared with those without risk genotypes at these loci (OR = 2.54; 95% CI, 1.10–5.83).

Table IV.
Risk of lung cancer and GST combinations in white ever-smoking women

Estimates for each variable in the best fitting unconditional regression model are presented in Table V for white ever-smoking women. The most parsimonious model consisted of traditional risk factors for lung cancer, including age at diagnosis (P = 0.34), family history of lung cancer in a first-degree relative (P = 0.0011), personal history of a chronic obstructive lung disease (P = 0.008) and BMI (with a higher BMI showing a protective effect) (P = 0.0003). Among the traditional risk factors, pack-years of cigarette smoking was the most significant predictor of lung cancer risk (P < 0.0001). Three of the SNPs examined in this study were also selected in the most parsimonious model: GSTM1 (P = 0.11), COMT A/G or G/G genotype (P = 0.59) and XRCC1 A/A (P = 0.02), which was the strongest genetic predictor of risk examined in this study. The addition of XRCC1 provided a better fit than the risk factor model alone (G = 5.91), and the addition of GSTM1 further improved the fit (G = 5.47). Only the addition of COMT significantly improved the fit of the model once XRCC1 and GSTM1 were included with the traditional risk factors (G = 4.11), resulting in the most parsimonious model. It should be noted that after the inclusion of the traditional risk factors associated with lung cancer, adding any combination of two SNPs was found to result in a significantly improved model fit, even when the main effects of each single SNP by itself were not significant. No statistically significant interactions were noted between the SNPs, although there was a significant interaction between GSTM1 (null/present) and BMI (underweight or normal/overweight/obese) (P = 0.007). After stratification by case or control status, this association was seen only among control women (P = 0.02), not among case women (P = 0.26), and may represent a chance finding. No other interactions between the traditional risk factors for lung cancer risk and the genetic risk factors were identified.

Table V.
Estimates of main effects for environmental and genetic risk factors associated with lung cancer risk, from the most parsimonious model, white ever-smoking women


The associations between the SNPs involved in tobacco carcinogen metabolism and lung cancer have been examined in numerous studies with variable results, as we have recently reviewed (37). There is evidence from studies of human lung tissue suggesting that the pathways involved in carcinogen metabolism and sex steroid synthesis interact, with GSTT1, CYP1B1 and NQO1 correlated with plasma estradiol levels or ER expression (24). NQO1 expression was also correlated with hormonal factors in human lung tumor tissue. Further work by this group suggests that ER-alpha regulates CYP1A1 protein expression when exposed to cigarette smoke extract in normal human bronchial epithelial cells and increases basal expression of CYP1B1 at both the messenger RNA and protein levels (38). The modest ORs and 95% CIs estimated for the association between lung cancer and polymorphisms in GSTM1, GSTT1, GSTP1, NQO1, CYP1A1 and XRCC1, both individually and combined, in our study population of women with NSCLC are similar to the ranges reported in the literature. The genes involved in sex steroid synthesis (CYP19A1, COMT, CYP17 and CYP1B1) have been less well studied in relation to lung cancer risk and are described in more detail in the following paragraphs. Our goal was to examine SNPs in a set of genes involved in one or both of these pathways to investigate whether they were related to development of NSCLC in women. In addition, the SNPs examined here were also considered in the risk model incorporating both traditional and genetic risk factors for lung cancer.

Genes involved in sex steroid synthesis

The CYP19A1 gene codes for the aromatase enzyme, which converts testosterone to estradiol, and has a common tetranucleotide simple tandem repeat polymorphism (TTTA) in intron 4 (39). Longer repeats (which various studies define as between 7 and 10 repeats) have been associated with higher circulating estrogen levels in older men (40) and women (41). Studies of hormone-related cancers report conflicting findings with regard to this polymorphism and cancer risk, with inconsistent results for breast (42,43), prostate (44,45) and endometrial cancer (46). It has been suggested that the associations identified may be due to linkage disequilibrium across this locus. We did not identify any association between carrying TTTA10+ repeats and lung cancer risk, either alone or in combination with other SNPs.

COMT is involved in methylation of catechol estrogens and reduced activity of this enzyme may increase the accumulation of catecholestrogens, leading to oxidative DNA damage (47,48). In the COMT gene, a G→A transition at codon 158 results in the substitution of methionine for valine (47) and carriage of the Val/Met or Met/Met genotypes is thought to result in lower enzyme activity (48). An analysis of 105 SNPs in 31 genes was the first to report risk associated with COMT (Val158Met) and NSCLC in 365 Norwegian lung cancer cases and 413 smokers without lung cancer (49). Zienolddiny et al. (49) found that individuals who carried the Val/Met or Met/Met genotypes were at 1.69-fold increased risk of NSCLC compared with those with the Val/Val genotype, after adjusting for age, sex and pack-years of smoking (95% CI, 1.16–2.47). Our findings in the total study population (OR = 1.10) and in the white ever-smoking women (OR = 1.22) were similar in magnitude but did not reach statistical significance after adjustment for BMI, age, family history of lung cancer, pack-years of smoking and obstructive lung disease. When combined with carrying the null mutation in GSTM1, white women smokers with either one or both risk genotypes at GSTM1 or COMT were at statistically significantly increased risk of lung cancer compared with individuals who carried GSTM1 non-null and COMT GG genotypes (OR = 2.57; 95% CI, 1.18–5.59) after adjustment for BMI, age, family history of lung cancer, pack-years of smoking and obstructive lung disease. In combination with the XRCC1 risk genotype, carrying the COMT A allele also resulted in 2.5-fold increased risk. These findings suggest that a combination of accumulated catechol estrogens and associated oxidative damage with impaired detoxification and/or DNA repair may lead to increased lung cancer risk.

CYP17 is another potential candidate gene in hormone-associated cancers based on its role in steroid hormone biosynthesis. A single polymorphism (T→C, A1→A2) in the 5′ untranslated region had been associated with higher levels of estradiol among carriers of the A2 allele (50), but more recent work suggests that circulating estrogens and androgens do not significantly differ by CYP17 genotype at this locus (51). This polymorphism has not been shown to be associated with increased risk of prostate or endometrial cancers, either alone or in combination with other steroid hormone-related genes (44,52). An analysis of major single-gene effects of CYP17 variants found no associations with breast or prostate cancer, although the combination of two SNPs in CYP17 was mildly associated with high-grade prostate cancers (53). This SNP has not been previously studied in people with lung cancer. In our population of white women smokers, the CYP17 A1/A2 or A2/A2 genotypes were associated with increased lung cancer risk only when combined with the null mutations of GSTM1 or GSTT1, relative to women with no risk genotypes at these loci. These findings provide additional support for the idea that increased estrogen biosynthesis in combination with reduced detoxification of metabolites might contribute to lung cancer risk in women.

CYP1B1, in addition to activating procarcinogens to reactive metabolites (i.e. in the metabolism of tobacco smoke), is also thought to mediate the hydroxylation of 17β-estradiol. The Val allele at Leu432Val replaces an aliphatic amino acid with a smaller aliphatic amino acid, which enables higher catalytic efficiency for the 4-hydroxylation of estradiol; CYP1B1 may therefore play a dual role in carcinogenesis depending on the substrate (54). We recently reported increased risk of lung cancer in non-smokers associated with this single SNP and also in combination with other phase II enzymes (32). We did replicate these findings in our current population, but only for black women smokers; we also found a statistically significant increased risk of lung cancer among white smoking women who carried both risk genotypes for CYP1B1 and XRCC1 (OR = 3.39; 95% CI, 1.35–8.53). We are unable to disentangle the metabolic pathway (estrogen or tobacco metabolism) through which these genotypes are affecting lung cancer risk, but the findings suggest that the inability to repair DNA damage caused by 4-hydroxyestradiol may contribute to lung cancer risk.

Risk model of NSCLC in white ever-smoking women

We also presented a systematic model building procedure that incorporated traditional non-modifiable (i.e. age and family history of lung cancer), potentially modifiable (i.e. personal history of chronic obstructive lung disease, pack-years of smoking and BMI) and genetic risk factors. Given the differences in the allele frequencies reported here between black and white women and the potentially different pathways by which never smokers and smokers develop lung cancer, our model building was limited to ever-smoking white women. This represented the largest subgroup in our study population. In addition, recent work by Etzel et al. (55) presents a risk prediction model in African-Americans and includes some of the participants from our study in the validation of their model. For these reasons, the model presented was limited to white ever-smoking women. The traditional risk factors, namely age, family history of lung cancer, personal history of lung disease, BMI and pack-years of smoking were more strongly associated with lung cancer risk than any of the SNPs tested; however, genotypes at XRCC1, GSTM1 and COMT all contributed to the most parsimonious risk model. The model presented here is limited by both sample size and the number of SNPs examined, and should not be considered a predictive risk model. Instead, our findings highlight the importance of examining the associations identified between SNPs in genes in candidate pathways versus single genes and suggest that future lung cancer modeling efforts will benefit from the inclusion of genotyping results.
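As a rough illustration of how a logistic model of this kind combines binary risk factors, the sketch below multiplies per-factor odds ratios on the odds scale (equivalently, adds coefficients on the log-odds scale), assuming a main-effects model with no interaction terms. The function name `combined_odds_ratio`, the factor names and the odds ratios in the usage lines are placeholders chosen to make the arithmetic easy to follow; they are not estimates from this study.

```python
import math


def combined_odds_ratio(odds_ratios, carried):
    """Joint OR for one individual under a main-effects logistic model.

    odds_ratios: dict mapping factor name -> per-factor OR (exp of its coefficient)
    carried: set of factor names present in the individual
    """
    # Coefficients add on the log-odds scale, so ORs multiply.
    log_odds = sum(math.log(odds_ratios[name]) for name in carried)
    return math.exp(log_odds)


# Placeholder values for illustration only, NOT study estimates.
example_ors = {"factor_a": 2.0, "factor_b": 1.5, "factor_c": 1.2}
joint = combined_odds_ratio(example_ors, {"factor_a", "factor_b"})
baseline = combined_odds_ratio(example_ors, set())
```

With these placeholder inputs, an individual carrying `factor_a` and `factor_b` has a joint OR equal to the product of the two per-factor ORs, and an individual carrying none has an OR of 1 relative to the reference group.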

The need for an individual risk prediction model for lung cancer has been well described (56). Past work in this area has focused on biomarkers found in sputum samples from individuals suspected to be at increased risk due to personal smoking histories and airflow obstruction (57) or asbestos exposures (58). A more recently proposed model included smoking pack-year histories, past diagnoses of pneumonia or another cancer, family history of early/later onset lung cancer and occupational exposure to asbestos. It was estimated that approximately two-thirds of lung cancer cases in a 5-year period could be predicted through use of this model (59). The most detailed model to date provided separate risk estimates for never, former and current smokers and considered 10 different exposures, family history of smoking-related and other cancers and information about smoking initiation, cessation and pack-years (60). The discriminatory abilities of all of these models are similar and modest, but the models have the advantage of using questionnaire and/or clinical data that are relatively easy to obtain compared with genetic information. The addition of information from pathway-based candidate genes represents the next step in the development of better risk prediction models for lung cancer.
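Discriminatory ability of a risk model is often summarized by the area under the receiver operating characteristic curve (the concordance statistic), that is, the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The sketch below computes this quantity directly from its definition. The function name `auc` and the toy scores are invented for illustration and do not come from any of the cited models.

```python
def auc(case_scores, control_scores):
    """Concordance statistic: P(case score > control score), ties counted as 1/2."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Toy predicted risks, invented for illustration only.
toy_cases = [0.30, 0.22, 0.15, 0.40]
toy_controls = [0.10, 0.22, 0.05, 0.18]
toy_auc = auc(toy_cases, toy_controls)
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases from controls, which is the scale on which adding genotype information to questionnaire-based models would be judged.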

There are limitations to the findings presented here. First, the majority of the subjects in our study had lung adenocarcinomas, so our findings may not be representative of all NSCLCs. In addition, while our population included a substantial number of black women, we were still underpowered to examine gene–environment interactions in these women. Since many of the SNPs studied varied in frequency significantly by self-reported race, we chose to present our data stratified by race or limited to white women smokers (our largest subgroup) only. Thus, our ability to adequately assess risk in other subgroups, such as non-smokers, was limited. The ORs and CIs for single SNP and gene–gene combinations presented here were not adjusted for multiple comparisons and need to be validated in other populations. Lastly, only a limited number of candidate SNPs were genotyped and included in this analysis, which is not fully representative of the complexity in the tobacco or estrogen metabolism pathways.
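To give a sense of what adjustment for multiple comparisons would involve, the following sketch applies a Bonferroni correction. It assumes, purely for illustration, that the number of tests equals the 11 SNPs genotyped; the actual number of comparisons (including gene–gene combinations and stratified analyses) is larger, so the true threshold would be stricter. The function name `bonferroni` is illustrative.

```python
from statistics import NormalDist


def bonferroni(alpha, n_tests):
    """Per-test significance level and matching two-sided critical z value."""
    per_test_alpha = alpha / n_tests
    z_crit = NormalDist().inv_cdf(1 - per_test_alpha / 2)
    return per_test_alpha, z_crit


# Assumption: one test per genotyped SNP (11 tests), family-wise alpha of 0.05.
alpha_adj, z_adj = bonferroni(0.05, 11)
```

Under this assumption, a single-SNP association would need a two-sided P-value below `alpha_adj` (about 0.0045) to remain significant, which is why replication in other populations is needed.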

The strengths of this study include the detailed, in-person data collected from each subject, including information on past medical and reproductive histories, over-the-counter and prescribed medication use, occupational exposures, BMI, family history of lung cancer and smoking history. These data are from a large, population-based sample of women and are probably representative of the women in their birth cohorts. The ability to integrate risk factors identified via questionnaires, clinical exam and through the use of data derived from genotyping is an additional strength of this study. Data from this study can be analyzed with other comparable data to better elucidate the gene–gene and gene–environment interactions which influence individual lung cancer risk and can provide further validation of existing models in under-studied populations (i.e. women, blacks and non-smokers). The continued study of the role of estrogen and estrogen metabolism as a contributor to lung cancer risk is also warranted.


Funding

National Institutes of Health (R01-CA87895, N01-PC35145, P30CA22453).


Acknowledgements

The authors thank Lynda Forbes, Yvonne Bush, Kelly Montgomery, Pat Campagna, and the staff of the Metropolitan Detroit Cancer Surveillance System for data collection and management.

Conflict of Interest Statement: None declared.



Abbreviations

BMI, body mass index
CI, confidence interval
CYP, cytochrome P450
ER, estrogen receptor
ETS, environmental tobacco smoke
GST, glutathione S-transferase
HRT, hormone replacement therapy
NSCLC, non-small cell lung cancer
OR, odds ratio
PCR, polymerase chain reaction
SNP, single-nucleotide polymorphism


References

1. Ries LAG, et al., editors. SEER Cancer Statistics Review, 1975–2005. Bethesda, MD: National Cancer Institute; 2008.
2. Zang EA, et al. Differences in lung cancer risk between men and women: examination of the evidence. J. Natl Cancer Inst. 1996;88:183–192.
3. Ryberg D, et al. Different susceptibility to smoking-induced DNA damage among male and female lung cancer patients. Cancer Res. 1994;54:5801–5803.
4. Mollerup S, et al. Sex differences in lung CYP1A1 expression and DNA adduct levels among lung cancer patients. Cancer Res. 1999;59:3317–3320.
5. Kure EH, et al. p53 mutations in lung tumours: relationship to gender and lung DNA adduct levels. Carcinogenesis. 1996;17:2201–2205.
6. Tam IY, et al. Distinct epidermal growth factor receptor and KRAS mutation patterns in non-small cell lung cancer patients with different tobacco exposure and clinicopathologic features. Clin. Cancer Res. 2006;12:1647–1653.
7. Liu Y, et al. Reproductive factors, hormone use and the risk of lung cancer among middle-aged never-smoking Japanese women: a large-scale population-based cohort study. Int. J. Cancer. 2005;117:662–666.
8. Taioli E, et al. Re: endocrine factors and adenocarcinoma of the lung in women. J. Natl Cancer Inst. 1994;86:869–870.
9. Kreuzer M, et al. Hormonal factors and risk of lung cancer among women? Int. J. Epidemiol. 2003;32:263–271.
10. Schabath MB, et al. Hormone replacement therapy and lung cancer risk: a case-control analysis. Clin. Cancer Res. 2004;10:113–123.
11. Kabat GC, et al. Reproductive and hormonal factors and risk of lung cancer in women: a prospective cohort study. Int. J. Cancer. 2007;120:2214–2220.
12. Schwartz AG, et al. Reproductive factors, hormone use, estrogen receptor expression and risk of non small-cell lung cancer in women. J. Clin. Oncol. 2007;25:5785–5792.
13. Beattie CW, et al. Steroid receptors in human lung cancer. Cancer Res. 1985;45:4206–4214.
14. Cagle PT, et al. Estrogen and progesterone receptors in bronchogenic carcinoma. Cancer Res. 1990;50:6632–6635.
15. Chaudhuri PK, et al. Steroid receptors in human lung cancer cytosols. Cancer Lett. 1982;16:327–332.
16. Kaiser U, et al. Steroid-hormone receptors in cell lines and tumor biopsies of human lung cancer. Int. J. Cancer. 1996;67:357–364.
17. Mollerup S, et al. Expression of estrogen receptors alpha and beta in human lung tissue and cell lines. Lung Cancer. 2002;37:153–159.
18. Stabile LP, et al. Human non-small cell lung tumors and cells derived from normal lung express both estrogen receptor alpha and beta and show biological responses to estrogen. Cancer Res. 2002;62:2141–2150.
19. Kawai H, et al. Estrogen receptor alpha and beta are prognostic factors in non-small cell lung cancer. Clin. Cancer Res. 2005;11:5084–5089.
20. Schwartz AG, et al. Nuclear estrogen receptor beta in lung cancer: expression and survival differences by sex. Clin. Cancer Res. 2005;11:7280–7287.
21. Skov BG, et al. Oestrogen receptor beta over expression in males with non-small cell lung cancer is associated with better survival. Lung Cancer. 2008;59:88–94.
22. Wu CT, et al. The significance of estrogen receptor beta in 301 surgically treated non-small cell lung cancers. J. Thorac. Cardiovasc. Surg. 2005;130:979–986.
23. Hung RJ, et al. Genetic polymorphisms in the base excision repair pathway and cancer risk: a HuGE review. Am. J. Epidemiol. 2005;162:925–942.
24. Spivack SD, et al. Phase I and II carcinogen metabolism gene expression in human lung tissue and tumors. Clin. Cancer Res. 2003;9:6002–6011.
25. Ball P, et al. Catecholoestrogens (2-and 4-hydroxyoestrogens): chemistry, biogenesis, metabolism, occurrence and physiological significance. Acta Endocrinol. Suppl. (Copenh.) 1980;232:1–127.
26. Colilla S, et al. Association of catechol-O-methyltransferase with smoking cessation in two independent studies of women. Pharmacogenet. Genomics. 2005;15:393–398.
27. Martucci CP, et al. P450 enzymes of estrogen metabolism. Pharmacol. Ther. 1993;57:237–257.
28. Feigelson HS, et al. Cytochrome P450c17alpha gene (CYP17) polymorphism is associated with serum estrogen and progesterone concentrations. Cancer Res. 1998;58:585–587.
29. Picado-Leonard J, et al. Cloning and sequence of the human gene for P450c17 (steroid 17 alpha-hydroxylase/17,20 lyase): similarity with the gene for P450c21. DNA. 1987;6:439–448.
30. Means GD, et al. Structural analysis of the gene encoding human aromatase cytochrome P-450, the enzyme responsible for estrogen biosynthesis. J. Biol. Chem. 1989;264:19385–19391.
31. Drakoulis N, et al. Polymorphisms in the human CYP1A1 gene as susceptibility factors for lung cancer: exon-7 mutation (4889 A to G), and a T to C mutation in the 3′-flanking region. Clin. Investig. 1994;72:240–248.
32. Wenzlaff AS, et al. CYP1A1 and CYP1B1 polymorphisms and risk of lung cancer among never smokers: a population-based study. Carcinogenesis. 2005;26:2207–2212.
33. Bock CH, et al. NQO1 T allele associated with decreased risk of later age at diagnosis lung cancer among never smokers: results from a population-based study. Carcinogenesis. 2005;26:381–386.
34. Wenzlaff AS, et al. GSTM1, GSTT1 and GSTP1 polymorphisms, environmental tobacco smoke exposure and risk of lung cancer among never smokers: a population-based study. Carcinogenesis. 2005;26:395–401.
35. Benhamou S, et al. Meta- and pooled analyses of the effects of glutathione S-transferase M1 polymorphisms and smoking on lung cancer risk. Carcinogenesis. 2002;23:1343–1350.
36. Raimondi S, et al. Meta- and pooled analysis of GSTT1 and lung cancer: a HuGE-GSEC review. Am. J. Epidemiol. 2006;164:1027–1042.
37. Schwartz AG, et al. The molecular epidemiology of lung cancer. Carcinogenesis. 2007;28:507–518.
38. Han W, et al. Estrogen receptor alpha increases basal and cigarette smoke extract-induced expression of CYP1A1 and CYP1B1, but not GSTP1, in normal human bronchial epithelial cells. Mol. Carcinog. 2005;44:202–211.
39. Polymeropoulos MH, et al. Tetranucleotide repeat polymorphism at the human aromatase cytochrome P-450 gene (CYP19). Nucleic Acids Res. 1991;19:195.
40. Gennari L, et al. A polymorphic CYP19 TTTA repeat influences aromatase activity and estrogen levels in elderly men: effects on bone metabolism. J. Clin. Endocrinol. Metab. 2004;89:2803–2810.
41. Dick IM, et al. Association of an aromatase TTTA repeat polymorphism with circulating estrogen, bone structure, and biochemistry in older women. Am. J. Physiol. Endocrinol. Metab. 2005;288:E989–E995.
42. Baxter SW, et al. Polymorphic variation in CYP19 and the risk of breast cancer. Carcinogenesis. 2001;22:347–349.
43. Healey CS, et al. Polymorphisms in the human aromatase cytochrome P450 gene (CYP19) and breast cancer risk. Carcinogenesis. 2000;21:189–193.
44. Cussenot O, et al. Combination of polymorphisms from genes related to estrogen metabolism and risk of prostate cancers: the hidden face of estrogens. J. Clin. Oncol. 2007;25:3596–3602.
45. Li L, et al. No association between a tetranucleotide repeat polymorphism of CYP19 and prostate cancer. Cancer Epidemiol. Biomarkers Prev. 2004;13:2280–2281.
46. Paynter RA, et al. CYP19 (aromatase) haplotypes and endometrial cancer risk. Int. J. Cancer. 2005;116:267–274.
47. Lachman HM, et al. Human catechol-O-methyltransferase pharmacogenetics: description of a functional polymorphism and its potential application to neuropsychiatric disorders. Pharmacogenetics. 1996;6:243–250.
48. Palmatier MA, et al. Global variation in the frequencies of functionally different catechol-O-methyltransferase alleles. Biol. Psychiatry. 1999;46:557–567.
49. Zienolddiny S, et al. A comprehensive analysis of phase I and phase II metabolism gene polymorphisms and risk of non-small cell lung cancer in smokers. Carcinogenesis. 2008;29:1164–1169.
50. Carey AH, et al. Polycystic ovaries and premature male pattern baldness are associated with one allele of the steroid metabolism gene CYP17. Hum. Mol. Genet. 1994;3:1873–1876.
51. Olson SH, et al. Variants in estrogen biosynthesis genes, sex steroid hormone levels, and endometrial cancer: a HuGE review. Am. J. Epidemiol. 2007;165:235–245.
52. Gaudet MM, et al. Genetic variation in CYP17 and endometrial cancer risk. Hum. Genet. 2008;123:155–162.
53. Setiawan VW, et al. CYP17 genetic variation and risk of breast and prostate cancer from the National Cancer Institute Breast and Prostate Cancer Cohort Consortium (BPC3). Cancer Epidemiol. Biomarkers Prev. 2007;16:2237–2246.
54. Hanna IH, et al. Cytochrome P450 1B1 (CYP1B1) pharmacogenetics: association of polymorphisms with functional differences in estrogen hydroxylation activity. Cancer Res. 2000;60:3440–3444.
55. Etzel CJ, et al. Development and validation of a lung cancer risk prediction model for African-Americans. Cancer Prev. Res. 2008;1:255–265.
56. Cassidy A, et al. Lung cancer risk prediction: a tool for early detection. Int. J. Cancer. 2007;120:1–6.
57. Prindiville SA, et al. Sputum cytological atypia as a predictor of incident lung cancer in a cohort of heavy smokers with airflow obstruction. Cancer Epidemiol. Biomarkers Prev. 2003;12:987–993.
58. Bach PB, et al. Variations in lung cancer risk among smokers. J. Natl Cancer Inst. 2003;95:470–478.
59. Cassidy A, et al. The LLP risk model: an individual risk prediction model for lung cancer. Br. J. Cancer. 2008;98:270–276.
60. Spitz MR, et al. A risk model for prediction of lung cancer. J. Natl Cancer Inst. 2007;99:715–726.

Articles from Carcinogenesis are provided here courtesy of Oxford University Press