A previously published case-control study nested in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial found a significant relationship of serum levels of total NNAL (4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol and its glucuronides) to prospective lung cancer risk. The present paper examines this relationship in the context of single-nucleotide polymorphisms (SNPs) in genes important in the metabolism of tobacco smoke carcinogens. DNA was extracted from the subjects' lymphocytes and analyzed for SNPs in 11 locations on four genes related to tobacco carcinogen metabolism. Logistic regressions on case-control status were used to estimate main effects of SNPs and biomarkers and their interactions adjusting for potential confounders. Of the 11 SNPs, only one, in CYP1B1, significantly interacted with total NNAL affecting risk for lung cancer. At low NNAL levels, the variant appeared protective. However, for those with the minor variant, the risk for lung cancer increased with increasing NNAL five times as rapidly compared to those without it, so that at high NNAL levels, this SNP's protection disappears. Analyzing only adenocarcinomas, the effect of the variant was even stronger, with the risk of cancer increasing six times as fast. A common polymorphism of CYP1B1 may play a role in the risk of NNK, a powerful lung carcinogen, in the development of lung cancer in smokers.
Among the multiple carcinogens in cigarette smoke, tobacco-specific nitrosamines such as 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) and polycyclic aromatic hydrocarbons (PAH) are widely regarded as important causes of lung cancer, which kills an average of 3000 people per day worldwide [1-3]. NNK and PAH require metabolic activation to exert their carcinogenic effects through the formation of DNA adducts, which can cause mutations in critical growth control genes, leading ultimately to genomic instability and lung cancer. There are competing detoxification reactions which lead to harmless excretion of NNK and PAH metabolites. Multiple cytochrome P450 enzymes and Phase II enzymes are involved in the metabolic activation and detoxification of NNK and PAH [5-7]. Single nucleotide polymorphisms (SNPs) in these enzymes could affect the balance of metabolic activation and detoxification in a given smoker, thus altering lung cancer risk upon exposure to NNK and PAH in cigarette smoke.
Previously, we reported the first investigation of the relationship between lung cancer and biomarkers of NNK and PAH exposure, using a nested case-control design embedded in the National Cancer Institute-sponsored Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial. We found that serum levels of 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol and its glucuronides (total NNAL), an established biomarker of NNK exposure, were significantly related to lung cancer risk in smokers. In the same study, we also examined the relationship to lung cancer risk of r-1,t-2,3,c-4-tetrahydroxy-1,2,3,4-tetrahydrophenanthrene (PheT), a metabolite of the PAH phenanthrene [9,10], but found no significant effect.
In the study reported here, we have examined the joint effects of SNPs in several enzymes involved in carcinogen metabolism and the biomarkers total NNAL and PheT as risk factors for lung cancer in the PLCO study. We report an unexpectedly strong effect of a CYP1B1 polymorphism interacting with total NNAL to affect lung cancer risk.
The PLCO is an NCI-funded multi-center, randomized, prospective trial of screening for cancers of the prostate, lung, colorectum and ovaries that began in 1993 and is projected to end in 2011 . The screening in the trial includes 77,468 men and women, of whom approximately 25,000 are current or former smokers. In addition to annually screening participants and carefully abstracting cancers from medical records, the PLCO has prospectively collected extensive information from study participants, including smoking history, family history of cancer, and demographic information collected at randomization; and it maintains a bio-repository of blood samples drawn over six annual screening visits starting in 1993. The PLCO trial made available its prospectively collected blood samples from the first screening visit and its extensive baseline and clinical data, thus providing for the direct calculation of lung cancer risks in the groups with different baseline levels of biomarkers. For this study, we selected those who were current smokers at the time of the blood draw. In addition, in the PLCO screening cohort nearly all cases of lung cancer had been screened at least once and so the variability in diagnostic lead times and the potential confounding that variability can produce in unscreened or partially screened cohorts was substantially reduced. At the time our study was initiated, over 800 lung cancer cases had been diagnosed in the screening arm of the PLCO. We randomly selected cases and controls from subjects who reported currently smoking at least 10 cigarettes per day on the baseline questionnaire filled out at the time of randomization.
The PLCO was approved by the institutional review boards of each participating institution, and all subjects signed consents permitting the research represented here.
We used a nested case-control approach wherein the source cohort consisted of PLCO participants who at randomization filled out a baseline questionnaire indicating they were free of cancer and currently smoking at least 10 cigarettes per day, and who contributed adequate blood samples to the biorepository. Cases were those smokers subsequently diagnosed with lung cancer and controls were those smokers with no diagnosis of lung cancer before the cut-off date (August 17, 2007). From this cohort, we randomly selected 100 incident lung cancer cases and 100 controls and obtained their demographic and other baseline data from the PLCO database, as well as serum samples adequate to measure total cotinine, total NNAL, and PheT. The intent was to determine whether or not biomarker levels in lung cancer subjects differ from those in non-lung cancer subjects. We hypothesized that higher levels of tobacco carcinogens and their specific metabolites among long-term current smokers predispose them to higher risks of developing lung cancer. We did not match on any characteristics, choosing to control for age, sex, family history, and smoking exposure by post-adjustment; this avoided over-matching and allowed us to examine the risks associated with these factors in comparison to those for the biomarkers . Since all subjects were current smokers at baseline, no adjustment for time since quitting was necessary.
The original protocol called for four non-synonymous SNPs on three genes (CYP1B1, GSTP1, and EPHX1) to be analyzed (Table 1, asterisked SNPs). These SNPs were selected because they a) could be related to the metabolism of tobacco smoke carcinogens, b) had allele frequencies >45% reported in the literature, which afforded at least 80% power to detect an odds ratio (OR) of 1.5, and c) were on a custom BOAC SNP chip panel for the Affymetrix/Gene Chip Targeted Genotyping Platform developed at the University of Minnesota by two of the authors. After the study was approved by the PLCO, these four SNPs were augmented with seven additional common non-synonymous SNPs (Table 1, non-asterisked SNPs) based on the literature but not appearing on the BOAC chip; four of those SNPs are on CYP1B1 and EPHX1, and two are on another gene considered potentially important in carcinogen metabolism (UGT1A6); the seventh is an intergenic SNP. We also genotyped coding-nonsynonymous SNPs in three other genes involved in tobacco smoke carcinogen metabolism (CYP1A1, CYP2A13, CYP2A6); however, genotyping of those SNPs yielded only homozygous major calls, and they were therefore not included in the study.
For the first screening visit in the PLCO study, participants were asked to provide blood samples adequate for 10 ml of serum, 4 ml of plasma, 2 ml of red blood cells, and 2 ml of “buffy coat” (lymphocytes). These samples were stored at the central biorepository facility in Frederick, Maryland.
The methods for assaying total NNAL and PheT in blood samples have been previously published. Total cotinine (free cotinine plus cotinine N-glucuronide) concentration in serum was quantified by gas chromatography-mass spectrometry. The method was similar to that used previously to analyze urinary cotinine.
For some of the subjects, DNA was already isolated and provided by the PLCO. For the remainder, lymphocytes were requested and DNA was extracted using the Qiagen FlexiGene DNA Kit (250) from buffy coat preparations provided by the PLCO.
A directed, custom SNP chip design was developed at the University of Minnesota, and contains functionally relevant polymorphisms playing a role in normal and abnormal cellular functions, inflammation and immunity, and drug responses. The design, quality controls, and platform have been described. While the full SNP panel consists of 3,404 SNPs in ~1,000 genes, in this study we focused initially only on 4 SNPs related to carcinogen metabolism. Further analysis of the total SNP pool will be reported elsewhere. Genotyping was performed using the Affymetrix® Gene-Chip® Scanner 3000 Targeted Genotyping System (GCS 3000 TG System), which utilizes molecular inversion probes to simultaneously identify many SNPs.
In order to add coverage of relevant metabolic genes and SNPs, we selected an additional 7 SNPs from genes involved in NNK and phenanthrene metabolism (SNPs with frequencies in the controls too low to allow analysis are excluded). The genotyping was performed at the Genotyping Facility, part of the BioMedical Genomics Center, at the University of Minnesota, using the Sequenom platform. Among all assays, 14% of the samples were repeated, with an average repeatability of 99.8% concordance in SNP calls.
Lymphoblastoid cell lines obtained from the Coriell Cell Repositories were established by Epstein-Barr virus transformation of peripheral blood mononuclear cells using phytohemagglutinin as a mitogen. Six cell lines were selected based on their CYP1B1N453S genotypes in order to investigate the variant's impact, if any, on the metabolism of NNK and NNAL. Two cell lines of each genotype (for a total of six cell lines) were analyzed. Because information on the kinetics and the involvement of CYP1B1 in NNK/NNAL metabolism was lacking, two different NNK concentrations (low = 0.092 μM and high = 100.1 μM) and two different incubation times (2 hrs and 6 hrs) were selected. Cells were incubated in a sodium bicarbonate assay buffer (pH 7.4). The NNK metabolic activity was determined by radioflow HPLC. The HPLC column used was a Phenomenex Gemini C18 column (5 μm, 250 × 4.60 mm) eluted with a gradient from 100% A [20 mM sodium phosphate (pH 7) containing 1 mM sodium EDTA] to 70% A over 30 min, and then to 50% A in 10 min; B was 100% acetonitrile. The eluant flow rate was 0.5 ml/min, and the scintillant was Monoflow (National Scientific, Rockwood, TN). The [5-3H]NNK was purchased from Moravek, Brea, CA (specific activity of 21.7 Ci/mmol). The standard metabolites were detected by UV absorbance at 254 nm. Cell metabolism was stopped with 300 μL of 100% acetonitrile, and samples were reconstituted with deionized water before injection onto the HPLC column.
The sample size of 100 cases and 100 controls was determined for the original case-control study of biomarkers in order to provide 80% power to detect an OR of 1.5 for a 1 standard-deviation difference in serum biomarker level, with a 2-sided 5% type I error rate. This sample size yields the same power and type I error rate for an OR of 1.5 associated with a SNP that occurs in at least 45% of the population; for lower population frequencies, the power is smaller than 80%.
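The sample-size statement for the biomarker can be checked roughly with a standard closed-form approximation for logistic regression on a single standardized continuous covariate. The sketch below is an illustration only: the formula (a Hsieh-type approximation) and the function name `total_n_for_or_per_sd` are assumptions introduced here, not the method the original power calculation necessarily used.

```python
from math import ceil, log
from statistics import NormalDist


def total_n_for_or_per_sd(odds_ratio, case_fraction=0.5, alpha=0.05, power=0.80):
    """Approximate total sample size needed to detect a given odds ratio
    per 1-SD change in a continuous covariate (Hsieh-type approximation).

    Assumption: this closed form stands in for whatever software the
    original power calculation used.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    beta = log(odds_ratio)                         # log odds ratio per 1 SD
    n = (z_alpha + z_beta) ** 2 / (case_fraction * (1 - case_fraction) * beta ** 2)
    return ceil(n)


# Equal numbers of cases and controls, OR = 1.5 per SD, 2-sided 5% error, 80% power
n_total = total_n_for_or_per_sd(1.5)
print(n_total)
```

Under these assumptions the approximation gives a total of a little under 200 subjects, in line with the 100 cases and 100 controls selected. It does not address the SNP (binary exposure) part of the power statement.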
Standard descriptive statistics, such as means, standard deviations, maxima, minima, frequencies, and cumulative frequencies, were computed on all continuous and discrete variables. In addition, ORs and t-tests and the associated p-values were computed to compare the distributions between cases and controls. The logistic regression from the original analysis, which used a hypothesized causal diagram to select potential confounders to adjust the biomarker effects and also to adjust for associated covariates to improve power, was modified for this study to include each SNP and its interaction with each biomarker. The covariates in the original regression included the categorical variables sex and family history of lung cancer, and the continuous variables of age at randomization, cotinine, total NNAL, PheT, and years of cigarette smoking. Untransformed biomarker measurements were used in the regression, as the distributions were reasonably symmetric. We augmented this regression by including each SNP and its interaction with total NNAL. Separate regressions with each SNP and its interaction with PheT were performed, but none are presented here as none of them were statistically significant. Odds ratios with 95% Wald confidence limits were estimated and intervals excluding 1 were considered statistically significant. Joint significance of the main effect and interaction parameters was tested using the likelihood ratio test. Graphic displays of SNP/NNAL interaction effects were generated both from smoothed averages of case/control status indicators across values of log total NNAL by genotype subgroups and from the estimated coefficients in the regression. Further exploratory regressions were done using the same dataset on histological subtypes of the lung cancers. All computations were done in SAS v. 9.1 (SAS Institute, Inc., Cary, NC, USA) for Windows XP OS (Microsoft, Inc., Redmond, WA, USA).
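One way to write the augmented model for a single SNP is given below. The notation is introduced here for illustration and is not taken from the original analysis: G is assumed to be an indicator for carrying at least one minor allele, and z collects the remaining covariates (sex, family history, age, cotinine, PheT, years of smoking).

```latex
\operatorname{logit}\,\Pr(\text{case})
  = \beta_0
  + \beta_N\,\mathrm{NNAL}
  + \beta_G\,G
  + \beta_{NG}\,(\mathrm{NNAL}\times G)
  + \boldsymbol{\gamma}^{\top}\mathbf{z}
```

In this form the log-odds slope for total NNAL is \beta_N among non-carriers and \beta_N + \beta_{NG} among carriers, and the odds ratio comparing carriers with non-carriers at a total NNAL level x is \exp(\beta_G + \beta_{NG}\,x). The reported main-effect and interaction odds ratios would correspond to \exp(\beta_G) and \exp(\beta_{NG}), and the joint likelihood ratio test to the hypothesis \beta_G = \beta_{NG} = 0.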
Table 2 gives the frequency of responses by case/control status and the associated ORs and p-values for each categorical variable, including age, sex, race/ethnicity, education, marital status, occupation, family history of lung cancer, the usual number of cigarettes smoked per day and the frequency of the genotypes for each of the eleven SNPs; Table 3 gives the mean and standard deviation by case/control status for continuous variables, including duration of smoking and measured serum levels of total cotinine and total NNAL and PheT, and the associated p-value for the difference between cases and controls. Only age (p = 0.0039), years of smoking (p < 0.0001), and serum level of total NNAL (p = 0.0084) were statistically significantly associated with lung cancer risk. Total cotinine and PheT differed in the direction expected between cases and controls, although not to a statistically significant degree (Table 3). None of the selected SNPs showed significant association with case/control status.
When adjusted for PheT, total cotinine and potential confounders, the interaction of CYP1B1N453S and total NNAL in determining lung cancer risk was statistically significant in the logistic regression (OR = 1.020, 95% confidence interval = 1.002, 1.038) (Table 4a); thus the SNP modifies the previously observed effect of NNAL on risk of lung cancer. At lower levels of NNAL the SNP appeared to be protective. The protective effect diminishes as the total NNAL level approaches the mean, and the effect disappears altogether by the time the total NNAL level reaches its upper range. The test of the total effect of CYP1B1N453S on risk, including the main effect and the interaction effect, was marginally significant (p = 0.0527).
In prior analyses without the SNP data, we estimated that each standard deviation (40 fmol/ml) increase in total NNAL is associated with an approximate 57% increase in lung cancer risk (95% CI: 8%, 128%). The regression that includes the CYP1B1N453S interaction indicates that subjects with at least one minor variant allele exhibit a different effect of NNAL on risk from that among those with both major alleles. For those with the minor allele, each standard deviation increase in NNAL increases lung cancer risk by 170%, about three times the originally estimated effect. For those without a minor allele, the estimated increase in risk associated with a standard deviation increase in NNAL is 22%, about an eighth of the increase in the minor allele group. Because the frequency of the minor allele is about 1/3, the overall rate of increase averages to the 57% shown in the previous paper.
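The per-standard-deviation figures above can be related to the per-unit interaction odds ratio of 1.020 and to the statement that risk rises about five times as rapidly in carriers. The short calculation below is a sketch under one stated assumption: the quoted 170% and 22% increases are read as odds ratios of 2.70 and 1.22 per 40 fmol/ml. The variable names are illustrative.

```python
from math import exp, log

SD_NNAL = 40.0  # fmol/ml, one standard deviation of total NNAL (from the text)

# Per-SD increases quoted in the text, read here as odds ratios (assumption)
or_per_sd_carrier = 2.70     # +170%, at least one minor allele
or_per_sd_noncarrier = 1.22  # +22%, two major alleles

# Log-odds slopes per 1 fmol/ml of total NNAL
slope_carrier = log(or_per_sd_carrier) / SD_NNAL
slope_noncarrier = log(or_per_sd_noncarrier) / SD_NNAL

# How much faster the log-odds rise in carriers
slope_ratio = slope_carrier / slope_noncarrier

# Interaction odds ratio per 1 fmol/ml implied by the two slopes
interaction_or_per_unit = exp(slope_carrier - slope_noncarrier)

print(round(slope_ratio, 1))              # 5.0
print(round(interaction_or_per_unit, 3))  # 1.02
```

On the log-odds scale the carrier slope is about five times the non-carrier slope, and the implied per-unit interaction odds ratio agrees with the reported OR of 1.020.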
Figure 1 presents two ways, one non-parametric and one parametric, of visualizing the interaction effect of total NNAL and genotype on the risk of lung cancer. The graph in Figure 1A estimates the trend for the unadjusted relationship between log total NNAL concentration and case/control status for subjects with at least one minor allele at CYP1B1N453S (black points and line) and for those without (red points and line). These trends are estimated non-parametrically by a weighted moving average between the cases, plotted on the vertical axis at y = 1, and the controls, plotted at y = 0. This graph is consistent with the logistic regression results and clearly shows a lower risk at the lower levels of NNAL and approximately the same risk at average and higher levels. Notably, the CYP1B1N453S minor allele group shows a cluster of cases at high NNAL levels (black dots). Figure 1B plots the risk of cancer, as estimated by the logistic regression, against the log total NNAL concentration among subjects with at least one minor allele at CYP1B1N453S (black points and line) and for those without (red points and line). As can be seen, these graphs agree qualitatively, indicating that the findings are unlikely to be the result of model misspecification.
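The non-parametric trend in Figure 1A can be illustrated with a simple kernel-weighted moving average of the 0/1 case indicators. The sketch below is not the procedure used for the figure: the triangular weights, the bandwidth, the function name `smoothed_case_fraction`, and the toy data are all assumptions made for illustration.

```python
def smoothed_case_fraction(x_values, is_case, grid, bandwidth):
    """Triangular-kernel weighted moving average of 0/1 case indicators.

    Returns, for each grid point, the weighted fraction of cases among
    observations within `bandwidth` of that point (None if there are none).
    """
    smoothed = []
    for g in grid:
        # Weight falls linearly from 1 at the grid point to 0 at the bandwidth
        weights = [max(0.0, 1.0 - abs(x - g) / bandwidth) for x in x_values]
        total = sum(weights)
        if total == 0:
            smoothed.append(None)
        else:
            smoothed.append(sum(w * y for w, y in zip(weights, is_case)) / total)
    return smoothed


# Made-up toy data (NOT study data): log total NNAL and case (1) / control (0)
log_nnal = [3.0, 3.2, 3.5, 3.9, 4.1, 4.4, 4.6, 4.9]
status = [0, 0, 0, 1, 0, 1, 1, 1]

curve = smoothed_case_fraction(log_nnal, status, grid=[3.0, 4.0, 5.0], bandwidth=1.0)
print(curve)
```

Running the same function separately on each genotype subgroup would give two curves comparable in spirit to the black and red lines of Figure 1A.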
To further investigate the effect of CYP1B1N453S, the cancers were grouped into adenocarcinomas (N = 59) and non-adenocarcinomas (N = 41) and analyzed separately. For the adenocarcinoma group, the main effect of CYP1B1N453S and its interaction effect with total NNAL (Table 4b) were jointly statistically significant (p = 0.0107), and both effects were of higher magnitude than the effects including all lung cancers (OR = 0.415, 95% confidence interval = 0.157, 1.098; and OR = 1.031, 95% confidence interval = 1.006, 1.056, respectively). For non-adenocarcinomas, the main and interaction effects were not statistically significant.
In the logistic regressions for the other 10 SNPs (not shown), neither the main nor the interaction effects were significant.
In order to understand the relation of the CYP1B1N453S SNP and total NNAL levels themselves, we calculated means, standard deviations, and ranges for total NNAL levels by case/control and CYP1B1N453S status (Table 5). Note that for subjects homozygous for the major allele, the total serum NNAL level for cases is only 2.5 fmol/ml higher than for controls, but cases carrying at least one minor allele had a total serum NNAL level 42.5 fmol/ml higher than the corresponding controls. This result is highly statistically significant, and consistent with the findings of the logistic regression. This reflects the greater impact of NNAL on lung cancer risk among those with the minor allele. Notably, when this analysis is narrowed to the adenocarcinoma cases, the impact of carrying a minor allele becomes even stronger, and the total serum NNAL level is 52.4 fmol/ml higher for cases than for controls.
CYP1B1 has never been shown to have an enzymatic activity in the metabolism of NNAL and until now its involvement in this pathway had never been tested. Coriell lymphoblastoid cell lines have CYP1B1 activity (http://hapmap.ncbi.nlm.nih.gov/) and the naturally occurring variants of CYP1B1N453S in selected cell lines provided a testable in vitro model for potential differences in NNK metabolism. Six Coriell lines, two of each of the three different genotypes in CYP1B1N453S, were tested for differences in the metabolism of NNK. While metabolism of NNK to NNAL was observed in all six lines, no significant differences were detected in the conversion, and no other metabolites were observed (information available on request from the corresponding authors). This suggests the interaction of CYP1B1N453S variants with levels of NNAL is not through a direct involvement of this CYP activity on NNK/NNAL metabolism.
Most lung cancers derive from cigarette tobacco smoke, which accounts for as much as 90% of all lung cancer cases in the US [18,19]. NNK is a powerful lung carcinogen associated with tobacco smoke, and total serum NNAL is a biomarker of its exposure that has been shown to be significantly associated with lung cancer risk. In the present study, we found that CYP1B1N453S has an interaction effect on the relation between total NNAL and lung cancer risk in addition to a main effect on risk. It is notable that the SNP would not have been identified had we looked first for a main effect alone. The main effect is statistically significant only in the presence of the interaction term. Further strengthening the result is the fact that the effect increases when the analysis is limited to adenocarcinomas, the histological subtype of lung cancer caused by NNK in laboratory animals, and the most common type of lung cancer in the U.S. This would not likely have been observed if the initial observation were a chance occurrence.
CYP1B1 has not previously been implicated in NNK or NNAL metabolism. We considered the possibility that CYP1B1 has a direct involvement in the metabolism of NNK, but found no evidence for this. However, CYP1B1 could have an influence on the NNAL pathway by affecting transcription of CYP1A1, whose role in the metabolic activation of total NNAL has been previously described, even though its catalytic efficiency is not very great [5,20].
Transcription of both P450 family members CYP1A1 and CYP1B1 is induced upon activation of the aryl hydrocarbon receptor (AhR) pathway. AhR is a cytosolic transcription factor that is normally inactive and bound to several co-chaperones. Following exposure to endogenous and exogenous chemicals, AhR acts as a ligand-activated receptor and transcription factor, activating the transcription of xenobiotic-metabolizing enzymes such as CYP1A1 and CYP1B1 as well as other genes [22-24]. There could be a signaling loop mechanism in which CYP1B1 can also act as a ligand and mediate the AhR signaling pathway, either in an activating or a suppressing fashion.
If the SNP implicated in this study, CYP1B1N453S, has a functional significance on the protein levels of CYP1B1 such that it down-regulates or abrogates them, then this would be expected to enhance AhR activation. Significantly, one study did show that inhibition of CYP1B1 is linked to enhanced AhR activation. Consequently, enhanced AhR activation leads to an enhanced transcription of CYP1A1. In fact, a recent study showed that CYP1B1N453S has a functional impact on the protein such that the protein displays lower intracellular levels and is degraded more rapidly than all other CYP1B1 variants tested in the study. It is not clear what structural alterations are responsible for the increased rate of CYP1B1 degradation caused by the codon change Asn453Ser. This residue is located in the large meander region between the K- and L-helix and is probably highly accessible to proteases. This so-called meander region is situated in the COOH terminal half of CYP1B1, important in the heme-binding and proper folding of the molecule. Moreover, it is interesting to note that the regions in which the putative non-synonymous SNPs reside in CYP1B1 are not highly conserved in mammals, with the exception of the SNP at codon 453. Two different groups reported a 2-fold reduction in the cellular level of the protein containing this polymorphism, and a significantly reduced enzyme half-life [27,28]. It is therefore well established that this variation has a functional consequence on the protein cellular levels, its folding and stability. Due to CYP1B1's involvement in the metabolism of carcinogens, and the SNP's residence in a conserved region of the gene, it is not surprising that this SNP is emerging as an important player in carcinogenesis. Therefore, the connection of CYP1B1N453S to CYP1A1 and its consequent indirect involvement in NNK/NNAL metabolism is a possible explanation for our findings (Figure 2).
AhR-mediated induction of CYP1 enzymes can lead to many cancer-related processes including genotoxicity, mutation and tumor initiation.
This indirect impact of CYP1B1 on NNK metabolism through CYP1A1 could involve other pathways that we are not aware of due to the complexity of tobacco smoke carcinogenesis. A relationship between CYP1 inducibility and cancer has been previously shown. A group of researchers demonstrated an association between CYP1 inducibility and bronchogenic carcinoma. Furthermore, in the context of hepatoma cells or in vitro studies, CYP1A1 is a primary determinant of the metabolism of benzo[a]pyrene, a PAH likely involved in tobacco-induced lung cancer. Thus, CYP1A1's link to lung cancer has been proposed in many previous studies, although the possible relationship of our observations to CYP1A1 inducibility remains speculative.
As presented in this paper, we found an even stronger effect of CYP1B1N453S in a smaller adenocarcinoma group. A study by Chang et al. found expression of AhR and CYP1B1 to be associated regardless of smoking status and AhR overexpression to up-regulate the expression of CYP1B1 in the early stage of lung adenocarcinoma. This finding may strengthen the results of our study.
Therefore, the effect of CYP1B1N453S we observed might be predicted: lower levels of CYP1B1 protein result in increased activation of AhR, which in turn increases CYP1A1 activity (Figure 2). Based on our analysis of HapMap variants, we do not believe CYP1B1 to be directly involved in the metabolism of NNAL, although further functional studies on CYP1B1's involvement in NNAL and NNK metabolism are needed.
Phenanthrene and other PAHs are substrates for CYP1B1 and CYP1A1 [32,34]. We did not observe an association between PheT levels and lung cancer, nor was there any interaction with CYP1B1 polymorphisms. This somewhat unexpected result may be due to the relatively small size of our study, and to the fact that phenanthrene, in contrast to NNK, is not tobacco-specific. Thus, substantial amounts of serum PheT are due to phenanthrene exposure from diet or general environment.
The study is limited by its small size, which required a focus on just a few SNPs, rather than on a broad array of polymorphisms. We chose a subset of the eleven most likely candidates for study, and found evidence that one of those SNPs may segregate the population by the risk conferred by NNAL exposure as well as by the underlying risk itself.
The evidence of a strong interaction between total serum NNAL and the CYP1B1N453S SNP from this study was unexpected and, as yet, is not fully explained. If confirmed by appropriate additional molecular and epidemiologic studies, this outcome constitutes an important step in understanding how exposure to cigarette smoke leads to inter-individual variation in risk of lung cancer.
This work was supported by the U.S. National Cancer Institute, NIH, Department of Health and Human Services (contract number N01-CN-25513) and by the NIH (grant DA-13333).