There is an important heritable component to lung cancer, and understanding how genetic variation alters smoking-induced gene expression could provide genetic biomarkers for diagnosis and reveal genetic susceptibility alleles. Numerous studies of human airway, mouse lung, or in vitro cell culture have reported on gene expression signatures related to smoking. Spira et al. identified gene expression profiles in cytologically normal large-airway epithelial cells that can serve as a diagnostic biomarker for lung cancer. The present work identifies molecular and genetic features of the NRF2-regulated pathway that are central to this airway gene expression response. Using pathway analysis tools, we identified differences in NRF2-mediated transcription profiles from bronchial airway epithelial cells obtained from nonsmokers, cigarette smokers with suspicion of lung cancer, and those with a subsequent diagnosis of lung cancer. We also revealed a potential role for MAFG (an NRF2-binding partner) in modulating smoking-induced gene expression.
NRF2 is activated by oxidative stress and translocates to the nucleus, where it heterodimerizes with small MAF proteins to form a transactivation complex that binds to specific DNA regions termed antioxidant response elements (ARE) and up-regulates antioxidant and phase II detoxification enzymes. We examined the expression of NRF2 and its interacting partners (e.g., MAFG, NRF1, NRF3, and BACH1) and made the novel observation that MAFG expression was strongly correlated with expression of downstream NRF2 target genes. At present, 42 NRF2 target genes have been discovered in various human tissues, and we found that 22 of them correlated with the MAFG expression level in human airway epithelial cells. In addition, NRF1, a negative and competitive regulatory factor, was anti-correlated with downstream antioxidant gene expression. The possibility that MAFG expression might limit downstream antioxidant gene expression was explored further. Silencing MAFG with siRNA in A549 cells attenuated the expression of known ARE genes, and this was consistent with published experiments in MafG knockout mice. A similar pattern for MAFG expression relative to other antioxidant genes was found when we carried out a retrospective analysis of expression data from a related, previously published, larger-scale study. We pursued a possible genetic cause for reduced MAFG expression among the SC group by re-sequencing MAFG in the subjects in this study and uncovered more than 30 novel SNPs in the 16.5-kb MAFG region. A SNP at chr17:77482956 in the MAFG promoter region was associated with lower MAFG mRNA levels, while another in the 3′ UTR displayed a marginal association with expression and lung cancer status. Thus, some of the variability in gene expression among groups may be due to genetic variation, but it is also likely that other regulatory mechanisms, as well as the timing and duration of cigarette smoking in these patients, have affected MAFG levels.
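The correlation screen described above, in which a regulator's expression is compared with each candidate target gene across samples, can be sketched as follows. This is a minimal illustration that assumes a Pearson correlation and an arbitrary cutoff of |r| ≥ 0.8; the study's actual statistic and threshold are not stated here. The function names, the `TARGET_*` gene labels, and all numeric values are hypothetical and synthetic, not data from this work.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlated_targets(regulator, targets, min_abs_r=0.8):
    """Return {gene: r} for targets whose |r| with the regulator meets the cutoff.

    The 0.8 default is an assumed, illustrative threshold.
    """
    hits = {}
    for gene, values in targets.items():
        r = pearson_r(regulator, values)
        if abs(r) >= min_abs_r:
            hits[gene] = r
    return hits


# Synthetic expression values (one entry per sample), for illustration only.
regulator_expr = [1.0, 2.0, 3.0, 4.0, 5.0]
target_expr = {
    "TARGET_A": [2.1, 3.9, 6.2, 8.0, 9.8],  # rises with the regulator
    "TARGET_B": [5.0, 4.1, 2.9, 2.2, 1.0],  # falls as the regulator rises
    "TARGET_C": [3.0, 1.0, 4.0, 1.0, 3.0],  # no linear trend
}

hits = correlated_targets(regulator_expr, target_expr)
for gene, r in sorted(hits.items()):
    print(f"{gene}: r = {r:.3f}")
```

In this toy input, a positively correlated gene plays the role of a target that tracks the regulator, and a negatively correlated gene mirrors the anti-correlation reported for NRF1. A real analysis would also need significance testing and correction for the number of genes screened.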
To explore how genetic variation in other NRF2 pathway genes might contribute to susceptibility to smoking-induced lung cancer, we tested the association of many genotypes with expression and/or group phenotype. We observed possible cis-acting effects for putative regulatory SNPs on genes affected by cigarette smoking, and these could impact susceptibility of the airway to smoking-related diseases through various mechanisms, including metabolism of carcinogens. For example, members of the aldo-keto reductase (AKR) superfamily, AKR1C1, AKR1C2, and AKR1C3, catalyze the conversion of aldehydes and ketones to their corresponding alcohols by utilizing NADH and/or NADPH as cofactors. Polymorphisms of AKR1C3 have been implicated in susceptibility to various types of cancer, including lung cancer. Microsomal epoxide hydrolase 1 (EPHX1) plays an important role in both the activation and detoxification of tobacco-derived carcinogens. Polymorphisms at exons 3 and 4 of the EPHX1 gene have been associated with variation in EPHX1 activity, and a low-activity genotype of the EPHX1 gene was associated with decreased risk of lung cancer among whites. While the associations we found in this hypothesis-generating study were modest and need to be confirmed by follow-up in larger studies, we suggest that this approach may prove useful for identifying functional SNPs that contribute to a phenotype via an impact on gene expression.
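One common way to test whether a putative cis-acting SNP is associated with expression of a nearby gene is an additive model, in which expression is regressed on the number of copies of the minor allele (0, 1, or 2). The sketch below assumes that model; it is not necessarily the test used in this study. The function name and all values are hypothetical and synthetic.

```python
def additive_fit(dosages, expression):
    """Least-squares fit of expression = intercept + slope * dosage.

    dosages: minor-allele counts per sample (0, 1, or 2).
    Returns (slope, intercept, r_squared).
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need matching inputs with at least 3 samples")
    mean_d = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((d - mean_d) ** 2 for d in dosages)
    if sxx == 0:
        raise ValueError("all samples share one genotype; slope is undefined")
    sxy = sum((d - mean_d) * (e - mean_e) for d, e in zip(dosages, expression))
    syy = sum((e - mean_e) ** 2 for e in expression)
    slope = sxy / sxx
    intercept = mean_e - slope * mean_d
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Synthetic example: six samples, two per genotype class, for illustration only.
dosages = [0, 0, 1, 1, 2, 2]
expression = [10.0, 9.6, 8.1, 7.9, 6.2, 5.8]

slope, intercept, r_squared = additive_fit(dosages, expression)
print(f"slope per allele = {slope:.2f}, R^2 = {r_squared:.3f}")
```

A negative slope here corresponds to lower expression with each additional copy of the minor allele. The sketch reports effect size and fit only; a real analysis would add a significance test, adjust for covariates such as smoking history, and correct for the number of SNP-gene pairs tested.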
Disease-association studies, both candidate gene-based and genome-wide association studies (GWAS), have identified genetic variants that associate with both monogenic and complex diseases such as lung cancer. Recently, several genomic regions that may affect nicotine metabolism or dependency in lung cancer patients were identified by GWAS with very high statistical significance. Presumably, individuals with these genetic traits may use more tobacco and receive higher doses of the carcinogenic compounds in cigarette smoke. However, while the variants identified in the lung cancer GWAS, and in many other GWAS, point to potentially important loci, the functional relationship between the SNP and a molecular genetic mechanism to explain the biological phenotype is not apparent. Thus, understanding the molecular genetic basis of human phenotypic variation remains a major challenge for genetics.
The approach we have used integrates information about gene expression in target tissue, variation in transcription factor binding sites, and genotype frequency among cancer status groups in order to identify biologically plausible functional polymorphisms. It could be generally useful for identifying SNPs that contribute to disease risk through their impact on gene expression. While the present study is limited by statistical power, applying this method to larger studies of exposed-tissue samples from clinically characterized patients may reveal useful expression-based and/or genetic biomarkers and provide a basis for prevention efforts.