We have previously defined the impact of tobacco smoking on nasal epithelium gene expression using Affymetrix Exon 1.0 ST arrays. In this paper, we compared the performance of the Affymetrix GeneChip Human Gene 1.0 ST array with the Human Exon 1.0 ST array for detecting nasal smoking-related gene expression changes. RNA collected from the nasal epithelium of five current smokers and five never smokers was hybridized to both arrays. Although intersample correlation was higher within the Gene array than within the Exon array, the majority of the genes most changed by smoking were tightly correlated between platforms. Although neither array dataset was powered to detect differentially expressed genes (DEGs) at a false discovery rate (FDR) <0.05, we identified more DEGs than expected by chance using the Gene ST array. These findings suggest that while both platforms show a high degree of correlation for detecting smoking-induced differential gene expression changes, the Gene ST array may be a more cost-effective platform in a clinical setting for gene-level genomewide expression profiling and an effective tool for exploring the host response to cigarette smoking and other inhaled toxins.
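An FDR threshold such as the one above is commonly applied by adjusting per-gene p-values. The abstract does not state which procedure was used, so the following is only a minimal sketch assuming the Benjamini-Hochberg method; the function names `benjamini_hochberg` and `count_degs` are hypothetical and the p-values in the usage note are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used for FDR control; the source abstract does not
    name a specific procedure.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_degs(pvalues, fdr=0.05):
    """Count genes whose adjusted p-value falls below the FDR threshold."""
    return sum(q < fdr for q in benjamini_hochberg(pvalues))
```

For example, `count_degs([0.01, 0.04, 0.03, 0.005])` adjusts four illustrative p-values and counts how many pass FDR < 0.05. With only five samples per group, few genes would be expected to pass such a threshold, which is consistent with the abstract's remark about statistical power.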
The fluid-filled lung exists in relative hypoxia in utero (∼25 mm Hg), but at birth fills with ambient air where the partial pressure of oxygen is ∼150 mm Hg. The impact of this change was studied in mouse lung with microarrays to analyze gene expression one day before, and 2, 6, 12 and 24 hours after birth into room air or 10% O2. The expression levels of >150 genes, representing transcriptional regulation, structure, apoptosis and antioxidants, were altered 2 hours after birth in room air but blunted or absent with birth in 10% O2. Kruppel-like factor 4 (Klf4), a regulator of cell growth arrest and differentiation, was the most significantly altered lung gene at birth. Its protein product was expressed in fibroblasts and airway epithelial cells. Klf4 mRNA was induced in lung fibroblasts exposed to hyperoxia, and constitutive expression of Klf4 mRNA in Klf4-null fibroblasts induced mRNAs for p21cip1/Waf1, smooth muscle actin, type 1 collagen, fibronectin and tenascin C. In Klf4 perinatal null lung, p21cip1/Waf1 mRNA expression was deficient prior to birth and associated with ongoing cell proliferation after birth; connective tissue gene expression was deficient around birth and smooth muscle actin protein expression was absent from myofibroblasts at tips of developing alveoli; p53, p21cip1/Waf1 and caspase-3 protein expression were widespread at birth, suggesting excess apoptosis compared to normal lung. We propose that the changing oxygen environment at birth acts as a physiologic signal to induce lung Klf4 mRNA expression, which then regulates proliferation and apoptosis in fibroblasts and airway epithelial cells, and connective tissue gene expression and myofibroblast differentiation at the tips of developing alveoli.
Delivery of the transcription factors Oct4, Klf4, Sox2 and c-Myc via integrating viral vectors has been widely employed to generate induced pluripotent stem cell (iPSC) lines from both normal and disease-specific somatic tissues, providing an invaluable resource for medical research and drug development. Residual reprogramming transgene expression from integrated viruses nevertheless alters the biological properties of iPSCs and has been associated with a reduced developmental competence both in vivo and in vitro. We performed transcriptional profiling of mouse iPSC lines before and after excision of a polycistronic lentiviral reprogramming vector to systematically define the overall impact of persistent transgene expression on the molecular features of iPSCs. We demonstrate that residual expression of the Yamanaka factors prevents iPSCs from acquiring the transcriptional program exhibited by embryonic stem cells (ESCs) and that the expression profiles of iPSCs generated with and without c-Myc are indistinguishable. After vector excision, we find 36% of iPSC clones show normal methylation of the Gtl2 region, an imprinted locus that marks ESC-equivalent iPSC lines. Furthermore, we show that the reprogramming factor Klf4 binds to the promoter region of Gtl2. Regardless of Gtl2 methylation status, we find similar endodermal and hepatocyte differentiation potential comparing syngeneic Gtl2ON vs Gtl2OFF iPSC clones. Our findings provide new insights into the reprogramming process and emphasize the importance of generating iPSCs free of any residual transgene expression.
Lung carcinogenesis is a complex, stepwise process that involves the acquisition of genetic mutations and epigenetic changes that alter cellular processes, such as proliferation, differentiation, invasion, and metastasis. Here, we review some of the latest concepts in the pathogenesis of lung cancer and highlight the roles of inflammation, the “field of cancerization,” and lung cancer stem cells in the initiation of the disease. Furthermore, we review how high throughput genomics, transcriptomics, epigenomics, and proteomics are advancing the study of lung carcinogenesis. Finally, we reflect on the potential of current in vitro and in vivo models of lung carcinogenesis to advance the field and on the areas of investigation where major breakthroughs will lead to the identification of novel chemoprevention strategies and therapies for lung cancer.
Field of cancerization; inflammation; stem cells; genomics; epigenomics; proteomics
The “field of injury” hypothesis proposes that exposure to an inhaled insult such as cigarette smoke elicits a common molecular response throughout the respiratory tract. This response can therefore be quantified in any airway tissue, including readily accessible epithelial cells in the bronchus, nose, and mouth. High-throughput technologies, such as whole-genome gene expression microarrays, can be employed to catalog the physiological consequences of such exposures in the airway epithelium. Pulmonary diseases such as chronic obstructive pulmonary disease, lung cancer, and asthma are also thought to be associated with a field of injury, and in patients with these diseases, airway epithelial cells can be a useful surrogate for diseased tissue that is often difficult to obtain. Global measurement of mRNA and microRNA expression in these cells can provide useful information about the molecular pathogenesis of such diseases and may be useful for diagnosis and for predicting prognosis and response to therapy. In this review, our aim is to summarize the history and state of the art of such “transcriptomic” studies in the human airway epithelium, especially in smoking and smoking-related lung diseases, and to highlight future directions for this field.
epithelium; lung neoplasms; chronic obstructive pulmonary disease; asthma; tobacco
The acute phase response is an evolutionarily conserved reaction in which physiological stress triggers the liver to remodel the blood proteome. Although thought to be involved in immune defense, the net biological effect of the acute phase response remains unknown. As the acute phase response is stimulated by diverse cytokines that activate either NF-κB or STAT3, we hypothesized that it could be eliminated by hepatocyte-specific interruption of both transcription factors. Here, we report that the elimination in mice of both NF-κB p65 (RelA) and STAT3, but neither alone, abrogated all acute phase responses measured. The failure to respond was consistent across multiple different infectious, inflammatory, and noxious stimuli, including pneumococcal pneumonia. When the effects of infection were analyzed in detail, pneumococcal pneumonia was found to alter the expression of over a thousand transcripts in the liver. This outcome was inhibited by the combined loss of RelA and STAT3. Moreover, this interruption of the acute phase response increased mortality and exacerbated bacterial dissemination during pneumonia, possibly as a result of acute humoral enhancement of macrophage opsonophagocytosis, which was impaired in the mutant mice. Thus, we conclude that RelA and STAT3 are essential for stress-induced transcriptional remodeling in the liver and the subsequent activation of the acute phase response, whose functional role includes compartmentalization of local infection.
The homeodomain transcription factor Nkx2-1 is essential for normal lung development and homeostasis. In lung tumors, it is considered a lineage survival oncogene and prognostic factor depending on its expression levels. The target genes directly bound by Nkx2-1, which could be the primary effectors of its functions in the different cellular contexts where it is expressed, are mostly unknown. In embryonic day 11.5 (E11.5) mouse lung, epithelial cells expressing Nkx2-1 are predominantly expanding, and in E19.5 prenatal lungs, Nkx2-1-expressing cells are predominantly differentiating in preparation for birth. To evaluate Nkx2-1 regulated networks in these two cell contexts, we analyzed genome-wide binding of Nkx2-1 to DNA regulatory regions by chromatin immunoprecipitation followed by tiling array analysis, and intersected these data with expression data sets. We further determined expression patterns of Nkx2-1 developmental target genes in human lung tumors and correlated their expression levels with that of endogenous NKX2-1. In these studies we uncovered differential Nkx2-1 regulated networks in early and late lung development, and a direct function of Nkx2-1 in regulation of the cell cycle by controlling the expression of proliferation-related genes. New targets, validated in Nkx2-1 shRNA transduced cell lines, include E2f3, Cyclin B1, Cyclin B2, and c-Met. Expression levels of Nkx2-1 direct target genes identified in mouse development significantly correlate or anti-correlate with the levels of endogenous NKX2-1 in a dosage-dependent manner in multiple human lung tumor expression data sets, supporting alternative roles for Nkx2-1 as a transcriptional activator or repressor, and direct regulator of cell cycle progression in development and tumors.
Identifying similarities between patterns of differential gene expression provides an opportunity to identify similarities between the experimental and biological conditions that give rise to these gene expression alterations. The growing volume of gene expression data in open data repositories such as the NCBI Gene Expression Omnibus (GEO) presents an opportunity to identify these gene expression similarities on a large scale across a diverse collection of datasets. We have developed a fast, pattern-based computational approach, named openSESAME (Search of Expression Signatures Across Many Experiments), that identifies datasets enriched in samples that display coordinate differential expression of a query signature. Importantly, openSESAME performs this search without prior knowledge of the phenotypic or experimental groups in the datasets being searched. This allows openSESAME to identify perturbations of gene expression that are due to phenotypic attributes that may not have been described in the sample annotation included in the repository.
To demonstrate the utility of openSESAME, we used gene expression signatures of two biological perturbations to query a set of 75,164 human expression profiles that were generated using Affymetrix microarrays and deposited in GEO. The first query, using a signature of estradiol treatment, identified experiments in which estrogen signaling was perturbed and also identified differences in estrogen signaling between estrogen receptor-positive and -negative breast cancers. The second query, which used a signature of silencing of the transcription factor p63 (a key regulator of epidermal differentiation), identified datasets related to stratified squamous epithelia or epidermal diseases such as melanoma.
openSESAME is a tool for leveraging the growing body of publicly available microarray data to discover relationships between different biological states based on common patterns of differential gene expression. These relationships may serve to generate hypotheses about the causes and consequences of specific patterns of observed differential gene expression. To encourage others to explore the utility of this approach, we have made a website for performing openSESAME queries freely available at http://opensesame.bu.edu.
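The idea of scoring samples against a query signature without using group labels can be illustrated with a small sketch. This is not the openSESAME implementation, whose details are not given in the abstract; it assumes a simple scheme in which each gene is z-scored across the samples of one dataset and a sample's score is the mean z-score of the signature's up-regulated genes minus that of its down-regulated genes. The names `zscores` and `signature_scores` and the gene labels in the usage note are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize one gene's values across the samples of a dataset."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A gene with no variation carries no information about samples.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def signature_scores(expression, up_genes, down_genes):
    """Score every sample for coordinate expression of a query signature.

    expression: dict mapping gene -> list of values, one per sample.
    Returns one score per sample; no phenotype labels are used.
    Assumption: this scoring rule is illustrative, not the published method.
    """
    n_samples = len(next(iter(expression.values())))
    z = {gene: zscores(vals) for gene, vals in expression.items()}
    # Ignore signature genes that were not measured in this dataset.
    ups = [g for g in up_genes if g in z]
    downs = [g for g in down_genes if g in z]
    scores = []
    for s in range(n_samples):
        up = mean(z[g][s] for g in ups) if ups else 0.0
        down = mean(z[g][s] for g in downs) if downs else 0.0
        scores.append(up - down)
    return scores
```

For instance, `signature_scores({"A": [1, 1, 5, 5], "B": [5, 5, 1, 1]}, ["A"], ["B"])` separates the last two samples from the first two. A dataset in which many samples receive extreme scores would then be a candidate for containing the perturbation of interest, even if the sample annotation never mentions it.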
Smoking is the most important known risk factor for the development of lung cancer. Tobacco exposure results in chronic inflammation, tissue injury and repair. A recent hypothesis argues for a stem/progenitor cell involved in airway epithelial repair that may be a tumor-initiating cell in lung cancer, and which may be associated with recurrence and metastasis. We used immunostaining, quantitative real-time PCR, Western blots and lung cancer tissue microarrays to identify subpopulations of airway epithelial stem/progenitor cells under steady state conditions, normal repair, aberrant repair with premalignant lesions and lung cancer and their correlation with injury and prognosis. We identified a population of keratin 14 (K14)-expressing progenitor epithelial cells that was involved in repair after injury. Dysregulated repair resulted in persistence of K14+ cells in the airway epithelium in premalignant lesions. The presence of K14+ cells in non-small cell lung cancer (NSCLC) samples predicted poorer outcomes. This was especially true in smokers where the presence of K14+ cells in NSCLC was predictive of metastasis. The presence of K14+ progenitor airway epithelial cells in NSCLC predicted a poor prognosis and this predictive value was strongest in smokers, where it also correlated with metastasis. This suggests that reparative K14+ progenitor cells may be tumor-initiating cells in this subgroup of smokers with NSCLC.
Lung carcinogenesis; dysregulated repair; injury
Using valproic acid as an example, the authors demonstrate that drug response signatures derived from genome-wide expression data can identify individuals likely to respond to a drug, and propose that this method could select optimal populations for clinical trials of new therapies.
Drug response signatures that accurately reflect the cellular response to a drug can be generated from Connectivity Map and publicly available gene expression data. Predictions from the drug response signature for valproic acid correlate with sensitivity to valproic acid in breast cancer cell lines and patient tumors grown in three-dimensional culture and mouse xenografts. The MATCH algorithm provides an efficient approach for using genome-wide gene expression data to identify a target population for a drug prior to clinical trials. MATCH can predict drug sensitivity in tumors without knowledge of mechanism of action.
Unlike traditional chemotherapy, targeted cancer therapies are expected to work in only a subset of people with a particular cancer. However, biomarkers of response are not always known before clinical trial initiation. We present MATCH (Merging genomic and pharmacologic Analyses for Therapy CHoice), an algorithm for using genome-wide gene expression data to identify and validate a genomic biomarker of sensitivity (see Figure 1). Our proof-of-principle example is valproic acid (VPA), but we also show that an estrogen blocking drug currently used for breast cancer and a B-RAF inhibitor in trials for melanoma give predictions that correspond to their clinical uses.
We use genome-wide gene expression data from treated and untreated samples from the Connectivity Map to generate a VPA response signature. We validate that the VPA signature can identify treated and untreated cells in an independent data set of normal cells and in independent samples from the Connectivity Map. The AUC for the ROC curve is 0.86. We then apply the VPA signature to publicly available data sets from a panel of cancer cell lines and from primary tumor and normal tissue samples. These data suggest that there is a subset of women with breast cancer who will be sensitive to VPA. Finally, we validate that our predictions correlate with sensitivity to VPA in breast cancer cell lines grown in two-dimensional culture, primary breast tumor samples grown in three-dimensional culture, and in vivo mouse breast cancer xenografts. Together, these studies show that MATCH can identify cancer patients most likely to respond to a specific drug treatment.
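The AUC reported above summarizes how well signature scores separate treated from untreated samples. As a minimal sketch, assuming each sample has already been assigned a single signature score, the AUC equals the fraction of treated/untreated pairs in which the treated sample scores higher (ties count as half). The function name `roc_auc` and the scores in the usage note are hypothetical and are not data from the study.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    positive_scores: signature scores of samples known to be treated.
    negative_scores: signature scores of samples known to be untreated.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0   # treated sample ranked above untreated
            elif p == n:
                wins += 0.5   # ties contribute half a win
    return wins / (len(positive_scores) * len(negative_scores))
```

Calling `roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.1])` on these illustrative scores gives 8/9, since eight of the nine treated/untreated pairs are ordered correctly. A value of 0.5 would indicate no separation and 1.0 perfect separation.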
Identifying the best drug for each cancer patient requires an efficient individualized strategy. We present MATCH (Merging genomic and pharmacologic Analyses for Therapy CHoice), an approach using public genomic resources and drug testing of fresh tumor samples to link drugs to patients. Valproic acid (VPA) is highlighted as a proof-of-principle. In order to predict specific tumor types with high probability of drug sensitivity, we create drug response signatures using publicly available gene expression data and assess sensitivity in a data set of >40 cancer types. Next, we evaluate drug sensitivity in matched tumor and normal tissue and exclude cancer types that are no more sensitive than normal tissue. From these analyses, breast tumors are predicted to be sensitive to VPA. A meta-analysis across breast cancer data sets shows that aggressive subtypes are most likely to be sensitive to VPA, but all subtypes have sensitive tumors. MATCH predictions correlate significantly with growth inhibition in cancer cell lines and three-dimensional cultures of fresh tumor samples. MATCH accurately predicts reduction in tumor growth rate following VPA treatment in patient tumor xenografts. MATCH uses genomic analysis with in vitro testing of patient tumors to select optimal drug regimens before clinical trial initiation.
biomarkers; cancer; pharmacogenomics
The directed differentiation of iPS and ES cells into definitive endoderm (DE) would allow the derivation of otherwise inaccessible progenitors for endodermal tissues. However, a global comparison of the relative equivalency of DE derived from iPS and ES populations has not been performed. Recent reports of molecular differences between iPS and ES cells have raised uncertainty as to whether iPS cells could generate autologous endodermal lineages in vitro. Here, we show that both mouse iPS and parental ES cells exhibited highly similar in vitro capacity to undergo directed differentiation into DE progenitors. With few exceptions, both cell types displayed similar surges in gene expression of specific master transcriptional regulators and global transcriptomes that define the developmental milestones of DE differentiation. Microarray analysis showed considerable overlap between the genetic programs of DE derived from ES/iPS cells in vitro and authentic DE from mouse embryos in vivo. Intriguingly, iPS cells exhibited aberrant silencing of imprinted genes known to participate in endoderm differentiation, yet retained a robust ability to differentiate into DE. Our results show that, despite some molecular differences, iPS cells can be efficiently differentiated into DE precursors, reinforcing their potential for development of cell-based therapies for diseased endoderm-derived tissues.
Although cigarette smoking is the major cause of chronic obstructive pulmonary disease (COPD), only a subset of smokers develops this disease. There is significant clinical, radiographic, and pathologic heterogeneity within smokers who develop COPD that likely reflects multiple molecular mechanisms of disease. It is possible that variations in the individual response to cigarette smoking form the basis for the distinct clinical and molecular phenotypes and variable natural history associated with COPD. Using the biologic premise of a molecular field of airway injury created by cigarette smoking, this response to tobacco exposure can be measured by molecular profiling of the airway epithelium. Noninvasive study of this field effect by profiling airway gene expression in patients with COPD holds important implications for our understanding of disease heterogeneity, early disease detection, and identification of novel disease-modifying therapies.
airway gene expression; chronic obstructive pulmonary disease; bioinformatics
Chronic obstructive pulmonary disease (COPD) fulfills criteria for a complex genetic disease in which environmental factors interact with multiple polymorphic genes to influence susceptibility. Finding the genes that influence susceptibility can be approached in hypothesis testing or unbiased study designs. In candidate gene association studies, genetic variation in, and/or levels of expression of, genes known or suspected to be involved in the pathogenesis of COPD are compared in affected and unaffected individuals. Although this approach is useful, it is limited by our present knowledge of disease pathophysiology. Genomewide studies of gene expression and of genetic variation are now possible and are not constrained by our limited knowledge. Although both of these unbiased approaches are in their infancy, they have already provided exciting new avenues for future investigation and potentially new approaches to risk prediction and therapy.
chronic obstructive pulmonary disease; genetics; genomics
Prior microarray studies of smokers at high risk for lung cancer have demonstrated that heterogeneity in bronchial airway epithelial cell gene expression response to smoking can serve as an early diagnostic biomarker for lung cancer. As a first step in applying functional genomic analysis to population studies, we have examined the relationship between gene expression variation and genetic variation in a central molecular pathway (NRF2-mediated antioxidant response) associated with smoking exposure and lung cancer. We assessed global gene expression in histologically normal airway epithelial cells obtained at bronchoscopy from smokers who developed lung cancer (SC, n = 20), smokers without lung cancer (SNC, n = 24), and never smokers (NS, n = 8). Functional enrichment analysis showed that expression of NRF2-mediated, antioxidant response element (ARE)-regulated genes was significantly lower in SC than in SNC. Importantly, we found that the expression of MAFG (a binding partner of NRF2) was correlated with the expression of ARE genes, suggesting MAFG levels may limit target gene induction. We bioinformatically identified single nucleotide polymorphisms (SNPs) in putative ARE genes and, to test the impact of genetic variation, genotyped these putative regulatory SNPs and other tag SNPs in selected NRF2 pathway genes. Sequencing the MAFG locus, we identified 30 novel SNPs, two of which were associated with either gene expression or lung cancer status among smokers. This work demonstrates an analysis approach that integrates bioinformatics pathway and transcription factor binding site analysis with genotype, gene expression and disease status to identify SNPs that may be associated with individual differences in gene expression and/or cancer status in smokers. These polymorphisms might ultimately contribute to lung cancer risk via their effect on the airway gene expression response to tobacco-smoke exposure.
Lung cancer is the leading cause of cancer death in the US and the world. The high mortality rate results, in part, from the lack of effective tools for early detection and the inability to identify subsets of patients who would benefit from adjuvant chemotherapy or targeted therapies. The development of high-throughput genome-wide technologies for measuring gene expression, such as microarrays, has the potential to impact the mortality rate of lung cancer patients by improving diagnosis, prognosis, and treatment. This review will highlight recent studies using high-throughput gene expression technologies that have led to clinically relevant insights into lung cancer. The hope is that diagnostic and prognostic biomarkers that have been developed as part of this work will soon be ready for widespread clinical application and will have a dramatic impact on the evaluation of patients with suspected lung cancer, leading to effective personalized treatment regimens.
While the role cigarette smoke plays in chronic obstructive pulmonary disease (COPD) is undisputed, the molecular mechanisms by which inhaled smoke contributes to disease pathogenesis remain unclear. One of the major barriers to effective approaches to diagnose and manage COPD is the remarkable heterogeneity displayed by patients with the disease. Whole-genome gene-expression studies of airway and lung tissue from patients with COPD provide an opportunity to gain insights into disease pathogenesis, allowing for both a molecular understanding of the pathogenic processes that contribute to this heterogeneity, and the ability to target therapies to these processes. This review focuses on synthesizing and integrating the limited number of high-throughput gene expression studies that have been conducted on lung tissue and airway samples from smokers with COPD. Comparing several lung tissue studies using computational approaches, we find that the results suggest fundamental similarities and identify common biological processes underlying COPD, despite each study having identified largely nonoverlapping lists of differentially expressed genes. Given these similarities, we argue that additional lung tissue and airway gene-expression studies are warranted, and present a roadmap for how such studies could lead to clinically relevant tools that would impact COPD management.
gene expression; microarray analysis; biomarkers; emphysema
The concept of field cancerization was first introduced over six decades ago in the setting of oral cancer. Later, field cancerization involving histologic and molecular changes of neoplasms and adjacent tissue began to be characterized in smokers with or without lung cancer. Investigators also described a diffuse, non-neoplastic field of molecular injury throughout the respiratory tract that is attributable to cigarette smoking and susceptibility to smoking-induced lung disease. The potential molecular origins of field cancerization and the field of injury following cigarette smoke exposure in lung and airway epithelia are critical to understanding the impact of the field of injury on clinical diagnostics and therapeutics for smoking-induced lung disease.
field of injury; field cancerization; lung cancer; tobacco smoke; molecular diagnosis and prognosis
Although prior studies have demonstrated a smoking-induced field of molecular injury throughout the lung and airway, the impact of smoking on the airway epithelial proteome and its relationship to smoking-related changes in the airway transcriptome are unclear.
Airway epithelial cells were obtained from never (n = 5) and current (n = 5) smokers by brushing the mainstem bronchus. Proteins were separated by one-dimensional polyacrylamide gel electrophoresis (1D-PAGE). After in-gel digestion, tryptic peptides were processed via liquid chromatography/tandem mass spectrometry (LC-MS/MS) and proteins identified. RNA from the same samples was hybridized to HG-U133A microarrays. Protein detection was compared to RNA expression in the current study and a previously published airway dataset. The functional properties of many of the 197 proteins detected in a majority of never smokers were similar to those observed in the never smoker airway transcriptome. LC-MS/MS identified 23 proteins that differed between never and current smokers. Western blotting confirmed the smoking-related changes of PLUNC, P4HB1, and uteroglobin protein levels. Many of the proteins differentially detected between never and current smokers were also altered at the level of gene expression in this cohort and the prior airway transcriptome study. There was a strong association between protein detection and expression of its corresponding transcript within the same sample, with 86% of the proteins detected by LC-MS/MS having a detectable corresponding probeset by microarray in the same sample. Forty-one proteins identified by LC-MS/MS lacked detectable expression of a corresponding transcript and were detected in ≤5% of airway samples from a previously published dataset.
1D-PAGE coupled with LC-MS/MS effectively profiled the airway epithelium proteome and identified proteins expressed at different levels as a result of cigarette smoke exposure. While there was a strong correlation between protein and transcript detection within the same sample, we also identified proteins whose corresponding transcripts were not detected by microarray. This noninvasive approach to proteomic profiling of airway epithelium may provide additional insights into the field of injury induced by tobacco exposure.
To identify genes expressed during initiation of lung organogenesis, we generated transcriptional profiles of the prospective lung region of the mouse foregut (mid-foregut) microdissected from embryos at three developmental stages between embryonic day 8.5 (E8.5) and E9.5. This period spans from lung specification of foregut cells to the emergence of the primary lung buds. We identified a number of known and novel genes that are temporally regulated as the lung bud forms. Genes that regulate transcription, including DNA binding factors, co-factors, and chromatin remodeling genes, are the main functional groups that change during lung bud formation. Members of key developmental transcription and growth factor families, not previously described to participate in lung organogenesis, are expressed in the mid-foregut during lung bud induction. These studies also show early expression in the mid-foregut of genes that participate in later stages of lung development. This characterization of the mid-foregut transcriptome provides new insights into molecular events leading to lung organogenesis.
Lung; development; organogenesis; foregut; endoderm; embryo; mouse; microarray; RNA amplification; gene expression; real time PCR; laser capture microdissection; transcription factors; chromatin remodeling; Fox; Notch
Cigarette smoking is a leading cause of preventable death and a significant cause of lung cancer and chronic obstructive pulmonary disease. Prior studies have demonstrated that smoking creates a field of molecular injury throughout the airway epithelium exposed to cigarette smoke. We have previously characterized gene expression in the bronchial epithelium of never smokers and identified the gene expression changes that occur in the mainstem bronchus in response to smoking. In this study, we explored relationships in whole-genome gene expression between extrathoracic (buccal and nasal) and intrathoracic (bronchial) epithelium in healthy current and never smokers.
Using genes that have been previously defined as being expressed in the bronchial airway of never smokers (the "normal airway transcriptome"), we found that bronchial and nasal epithelium from non-smokers were most similar in gene expression when compared to other epithelial and nonepithelial tissues, with several antioxidant, detoxification, and structural genes being highly expressed in both the bronchus and nose. Principal component analysis of previously defined smoking-induced genes from the bronchus suggested that smoking had a similar effect on gene expression in nasal epithelium. Gene set enrichment analysis demonstrated that this set of genes was also highly enriched among the genes most altered by smoking in both nasal and buccal epithelial samples. The expression of several detoxification genes was commonly altered by smoking in all three respiratory epithelial tissues, suggesting a common airway-wide response to tobacco exposure.
Our findings support a relationship between gene expression in extra- and intrathoracic airway epithelial cells and extend the concept of a smoking-induced field of injury to epithelial cells that line the mouth and nose. This relationship could potentially be utilized to develop a non-invasive biomarker for tobacco exposure as well as a non-invasive screening or diagnostic tool providing information about individual susceptibility to smoking-induced lung diseases.
Oligonucleotide microarray analysis revealed 175 genes that are differentially expressed in large airway epithelial cells of people who currently smoke compared with those who never smoked, with 28 classified as irreversible, 6 as slowly reversible, and 139 as rapidly reversible.
Tobacco use remains the leading preventable cause of death in the US. The risk of dying from smoking-related diseases remains elevated for former smokers years after quitting. The identification of irreversible effects of tobacco smoke on airway gene expression may provide insights into the causes of this elevated risk.
Using oligonucleotide microarrays, we measured gene expression in large airway epithelial cells obtained via bronchoscopy from never, current, and former smokers (n = 104). Linear models identified 175 genes differentially expressed between current and never smokers, and classified these as irreversible (n = 28), slowly reversible (n = 6), or rapidly reversible (n = 139) based on their expression in former smokers. A greater percentage of irreversible and slowly reversible genes were down-regulated by smoking, suggesting possible mechanisms for persistent changes, such as allelic loss at 16q13. Similarities with airway epithelium gene expression changes caused by other environmental exposures suggest that common mechanisms are involved in the response to tobacco smoke. Finally, using irreversible genes, we built a biomarker of ever exposure to tobacco smoke capable of classifying an independent set of former and current smokers with 81% and 100% accuracy, respectively.
We have categorized smoking-related changes in airway gene expression by their degree of reversibility upon smoking cessation. Our findings provide insights into the mechanisms leading to reversible and persistent effects of tobacco smoke that may explain former smokers' increased risk for developing tobacco-induced lung disease and provide novel targets for chemoprophylaxis. Airway gene expression may also serve as a sensitive biomarker to identify individuals with past exposure to tobacco smoke.