Purpose: The EGFR tyrosine kinase inhibitors (TKIs) demonstrate efficacy in NSCLC patients whose tumors harbor activating EGFR mutations. However, patients who initially respond to EGFR TKI treatment invariably develop resistance to the drugs. Known mechanisms account for approximately 70% of native and acquired EGFR TKI resistance. In the current study we investigated a novel mechanism of NSCLC resistance to erlotinib. Experimental Design: The mechanisms of acquired erlotinib resistance were evaluated by microarray analysis in thirteen NSCLC cell lines and in vivo in mice. Correlations between plasma neutrophil gelatinase associated lipocalin (NGAL) levels, erlotinib response and the EGFR mutational status were assessed in advanced stage NSCLC patients treated with erlotinib. Results: In 5 of 13 NSCLC cell lines NGAL was significantly upregulated. NGAL knockdown in erlotinib-resistant cells increased erlotinib sensitivity in vitro and in vivo. NGAL overexpression in erlotinib-sensitive cells augmented apoptosis resistance. This was mediated by NGAL-dependent modulation of the pro-apoptotic protein Bim levels. Evaluation of the plasma NGAL levels in NSCLC patients that received erlotinib revealed that patients with lower baseline NGAL demonstrated a better erlotinib response. Compared to patients with wild type EGFR, patients with activating EGFR mutations had lower plasma NGAL at baseline and weeks 4 and 8. Conclusions: Our studies uncover a novel mechanism of NGAL-mediated modulation of Bim levels in NSCLC that might contribute to TKI resistance in lung cancer patients. These findings provide the rationale for the further investigations of the utility of NGAL as a potential therapeutic target or diagnostic biomarker.
Lung cancer; effectors of apoptosis; survival factors; EGFR; erlotinib resistance
The “field of cancerization” refers to histologically normal-appearing tissue adjacent to neoplastic tissue that displays molecular abnormalities, some of which are the same as those of the tumor. Improving our understanding of these molecular events is likely to increase our understanding of carcinogenesis. Here, Kadara et al. attempt to characterize the molecular events associated temporally and spatially within the field of cancerization of early stage non-small cell lung cancer (NSCLC) patients following definitive surgery. They followed patients with bronchoscopies annually after tumor resection and extracted RNA from the serial brushings from different endobronchial sites. They then performed microarray analysis to identify gene expression differences over time and in different sites in the airway. Candidate genes were found that may have biological relevance to the field of cancerization. For example, expression of phosphorylated AKT and ERK1/2 was found to increase in the airway epithelium with time. Despite a number of limitations in the study design, this investigation demonstrates the utility of identifying molecular changes in histologically normal airway epithelium in lung cancer. In addition to increasing our understanding of lung cancer biology, studying the field of cancerization has the potential to identify biomarkers from samples obtained in a minimally invasive manner.
Lung cancer is the leading cause of cancer death worldwide in part due to our inability to identify which smokers are at highest risk and the lack of effective tools to detect the disease at its earliest and potentially curable stage. Recent results from the National Lung Screening Trial have shown that annual screening of high-risk smokers with low-dose helical computed tomography of the chest can reduce lung cancer mortality. However, molecular biomarkers are needed to identify which current and former smokers would benefit most from annual computed tomography scan screening in order to reduce the costs and morbidity associated with this procedure. Additionally, there is an urgent clinical need to develop biomarkers that can distinguish benign from malignant lesions found on computed tomography of the chest given its very high false positive rate. This review highlights recent genetic, transcriptomic and epigenomic biomarkers that are emerging as tools for the early detection of lung cancer both in the diagnostic and screening setting.
Biomarker; Diagnostics; Early detection; Epigenetics; Genetics; Lung cancer; Screening; Transcriptomics
Cigarette smoke creates a molecular field of injury in epithelial cells that line the respiratory tract. We hypothesized that transcriptome sequencing (RNA-Seq) will enhance our understanding of the field of molecular injury in response to tobacco smoke exposure and lung cancer pathogenesis by identifying gene expression differences not interrogated or accurately measured by microarrays. We sequenced the high-molecular-weight fraction of total RNA (>200 nt) from pooled bronchial airway epithelial cell brushings (n = 3 patients per pool) obtained during bronchoscopy from healthy never smoker (NS) and current smoker (S) volunteers and smokers with (C) and without (NC) lung cancer undergoing lung nodule resection surgery. RNA-Seq libraries were prepared using 2 distinct approaches, one capable of capturing non-polyadenylated RNA (the prototype NuGEN Ovation RNA-Seq protocol) and the other designed to measure only polyadenylated RNA (the standard Illumina mRNA-Seq protocol) followed by sequencing generating approximately 29 million 36 nt reads per pool and approximately 22 million 75 nt paired-end reads per pool, respectively. The NuGEN protocol captured additional transcripts not detected by the Illumina protocol at the expense of reduced coverage of polyadenylated transcripts, while longer read lengths and a paired-end sequencing strategy significantly improved the number of reads that could be aligned to the genome. The aligned reads derived from the two complementary protocols were used to define the compendium of genes expressed in the airway epithelium (n = 20,573 genes). Pathways related to the metabolism of xenobiotics by cytochrome P450, retinol metabolism, and oxidoreductase activity were enriched among genes differentially expressed in smokers, whereas chemokine signaling pathways, cytokine–cytokine receptor interactions, and cell adhesion molecules were enriched among genes differentially expressed in smokers with lung cancer. 
There was a significant correlation between the RNA-Seq gene expression data and Affymetrix microarray data generated from the same samples (P < 0.001); however, the RNA-Seq data detected additional smoking- and cancer-related transcripts whose expression was either not interrogated by or not found to be significantly altered when using microarrays, including smoking-related changes in the inflammatory genes S100A8 and S100A9 and cancer-related changes in MUC5AC and secretoglobin (SCGB3A1). Quantitative real-time PCR confirmed differential expression of select genes and non-coding RNAs within individual samples. These results demonstrate that transcriptome sequencing has the potential to provide new insights into the biology of the airway field of injury associated with smoking and lung cancer. The measurement of both coding and non-coding transcripts by RNA-Seq has the potential to help elucidate mechanisms of response to tobacco smoke and to identify additional biomarkers of lung cancer risk and novel targets for chemoprevention.
Although only a subset of smokers develop lung cancer, we cannot determine which smokers are at highest risk for cancer development, nor do we know the signaling pathways altered early in the process of tumorigenesis in these individuals. On the basis of the concept that cigarette smoke creates a molecular field of injury throughout the respiratory tract, this study explores oncogenic pathway deregulation in cytologically normal proximal airway epithelial cells of smokers at risk for lung cancer. We observed a significant increase in a genomic signature of phosphatidylinositol 3-kinase (PI3K) pathway activation in the cytologically normal bronchial airway of smokers with lung cancer and smokers with dysplastic lesions, suggesting that PI3K is activated in the proximal airway before tumorigenesis. Further, PI3K activity is decreased in the airway of high-risk smokers who had significant regression of dysplasia after treatment with the chemopreventive agent myo-inositol, and myo-inositol inhibits the PI3K pathway in vitro. These results suggest that deregulation of the PI3K pathway in the bronchial airway epithelium of smokers is an early, measurable, and reversible event in the development of lung cancer and that genomic profiling of these relatively accessible airway cells may enable personalized approaches to chemoprevention and therapy. Our work further suggests that additional lung cancer chemoprevention trials either targeting the PI3K pathway or measuring airway PI3K activation as an intermediate endpoint are warranted.
Although there have been numerous observations of vitamin D deficiency and its links to chronic diseases, no studies have reported on how vitamin D status and vitamin D3 supplementation affect broad gene expression in humans. The objective of this study was to determine the effect of vitamin D status and subsequent vitamin D supplementation on broad gene expression in healthy adults. (Trial registration: ClinicalTrials.gov NCT01696409).
Methods and Findings
A randomized, double-blind, single center pilot trial was conducted to compare the effects of vitamin D supplementation with either 400 IUs (n = 3) or 2000 IUs (n = 5) vitamin D3 daily for 2 months on broad gene expression in the white blood cells collected from 8 healthy adults in the winter. Microarrays of the 16 buffy coats from eight subjects passed the quality control filters and were normalized with the RMA method. Vitamin D3 supplementation that improved serum 25-hydroxyvitamin D concentrations was associated with at least a 1.5 fold alteration in the expression of 291 genes. There was a significant difference in the expression of 66 genes between subjects at baseline with vitamin D deficiency (25(OH)D<20 ng/ml) and subjects with a 25(OH)D>20 ng/ml. After vitamin D3 supplementation gene expression of these 66 genes was similar for both groups. Seventeen vitamin D-regulated genes with new candidate vitamin D response elements, including TRIM27, CD83, COPB2, YRNA and CETN3, which have been shown to be important for transcriptional regulation, immune function, response to stress and DNA repair, were identified.
Our data suggest that any improvement in vitamin D status will significantly affect expression of genes that have a wide variety of biologic functions in more than 160 pathways linked to cancer, autoimmune disorders and cardiovascular disease, which have been associated with vitamin D deficiency. This study reveals for the first time molecular fingerprints that help explain the nonskeletal health benefits of vitamin D.
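The "at least a 1.5 fold alteration" criterion above can be illustrated with a minimal sketch. It assumes, as is typical of RMA output, that expression values are on a log2 scale, so a 1.5-fold change corresponds to an absolute log2 difference of log2(1.5). The function name, gene labels, and values are hypothetical and are not taken from the study.

```python
import math

def genes_changed(baseline_log2, followup_log2, min_fold=1.5):
    """Return {gene: fold change} for genes altered by at least `min_fold`.

    Both inputs map gene name -> log2 expression (assumed RMA-style values).
    A fold change below 1 indicates a decrease after supplementation.
    """
    threshold = math.log2(min_fold)
    changed = {}
    for gene, before in baseline_log2.items():
        delta = followup_log2[gene] - before  # log2 fold change
        if abs(delta) >= threshold:
            changed[gene] = 2 ** delta
    return changed

# Hypothetical values for illustration only.
baseline = {"GENE_A": 8.0, "GENE_B": 7.0, "GENE_C": 9.0}
followup = {"GENE_A": 9.0, "GENE_B": 7.2, "GENE_C": 8.0}
changed = genes_changed(baseline, followup)
```

Working on the log2 scale keeps increases and decreases symmetric: a doubling and a halving are both one unit away from zero.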
We have previously defined the impact of tobacco smoking on nasal epithelium gene expression using Affymetrix Exon 1.0 ST arrays. In this paper, we compared the performance of the Affymetrix GeneChip Human Gene 1.0 ST array with the Human Exon 1.0 ST array for detecting nasal smoking-related gene expression changes. RNA collected from the nasal epithelium of five current smokers and five never smokers was hybridized to both arrays. While the intersample correlation within each array platform was relatively higher in the Gene array than that in the Exon array, the majority of the genes most changed by smoking were tightly correlated between platforms. Although neither array dataset was powered to detect differentially expressed genes (DEGs) at a false discovery rate (FDR) <0.05, we identified more DEGs than expected by chance using the Gene ST array. These findings suggest that while both platforms show a high degree of correlation for detecting smoking-induced differential gene expression changes, the Gene ST array may be a more cost-effective platform in a clinical setting for gene-level genomewide expression profiling and an effective tool for exploring the host response to cigarette smoking and other inhaled toxins.
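The abstract does not state how the false discovery rate was controlled. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg procedure and converts raw p-values into adjusted values that can be compared against a threshold such as 0.05. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is
    # the minimum of p * m / rank over all ranks at or above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [i for i, q in enumerate(adj_p) if q < 0.05]
```

With only a handful of samples per group, few genes typically survive this adjustment, which is consistent with the abstract's remark that neither dataset was powered for FDR <0.05.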
The fluid-filled lung exists in relative hypoxia in utero (∼25 mm Hg), but at birth fills with ambient air where the partial pressure of oxygen is ∼150 mm Hg. The impact of this change was studied in mouse lung with microarrays to analyze gene expression one day before, and 2, 6, 12 and 24 hours after birth into room air or 10% O2. The expression levels of >150 genes, representing transcriptional regulation, structure, apoptosis and antioxidants were altered 2 hrs after birth in room air but blunted or absent with birth in 10% O2. Kruppel-like factor 4 (Klf4), a regulator of cell growth arrest and differentiation, was the most significantly altered lung gene at birth. Its protein product was expressed in fibroblasts and airway epithelial cells. Klf4 mRNA was induced in lung fibroblasts exposed to hyperoxia and constitutive expression of Klf4 mRNA in Klf4-null fibroblasts induced mRNAs for p21cip1/Waf1, smooth muscle actin, type 1 collagen, fibronectin and tenascin C. In Klf4 perinatal null lung, p21cip1/Waf1 mRNA expression was deficient prior to birth and associated with ongoing cell proliferation after birth; connective tissue gene expression was deficient around birth and smooth muscle actin protein expression was absent from myofibroblasts at tips of developing alveoli; p53, p21cip1/Waf1 and caspase-3 protein expression were widespread at birth suggesting excess apoptosis compared to normal lung. We propose that the changing oxygen environment at birth acts as a physiologic signal to induce lung Klf4 mRNA expression, which then regulates proliferation and apoptosis in fibroblasts and airway epithelial cells, and connective tissue gene expression and myofibroblast differentiation at the tips of developing alveoli.
Delivery of the transcription factors Oct4, Klf4, Sox2 and c-Myc via integrating viral vectors has been widely employed to generate induced pluripotent stem cell (iPSC) lines from both normal and disease-specific somatic tissues, providing an invaluable resource for medical research and drug development. Residual reprogramming transgene expression from integrated viruses nevertheless alters the biological properties of iPSCs and has been associated with a reduced developmental competence both in vivo and in vitro. We performed transcriptional profiling of mouse iPSC lines before and after excision of a polycistronic lentiviral reprogramming vector to systematically define the overall impact of persistent transgene expression on the molecular features of iPSCs. We demonstrate that residual expression of the Yamanaka factors prevents iPSCs from acquiring the transcriptional program exhibited by embryonic stem cells (ESCs) and that the expression profiles of iPSCs generated with and without c-Myc are indistinguishable. After vector excision, we find 36% of iPSC clones show normal methylation of the Gtl2 region, an imprinted locus that marks ESC-equivalent iPSC lines. Furthermore, we show that the reprogramming factor Klf4 binds to the promoter region of Gtl2. Regardless of Gtl2 methylation status, we find similar endodermal and hepatocyte differentiation potential comparing syngeneic Gtl2ON vs Gtl2OFF iPSC clones. Our findings provide new insights into the reprogramming process and emphasize the importance of generating iPSCs free of any residual transgene expression.
Lung carcinogenesis is a complex, stepwise process that involves the acquisition of genetic mutations and epigenetic changes that alter cellular processes, such as proliferation, differentiation, invasion, and metastasis. Here, we review some of the latest concepts in the pathogenesis of lung cancer and highlight the roles of inflammation, the “field of cancerization,” and lung cancer stem cells in the initiation of the disease. Furthermore, we review how high throughput genomics, transcriptomics, epigenomics, and proteomics are advancing the study of lung carcinogenesis. Finally, we reflect on the potential of current in vitro and in vivo models of lung carcinogenesis to advance the field and on the areas of investigation where major breakthroughs will lead to the identification of novel chemoprevention strategies and therapies for lung cancer.
Field of cancerization; inflammation; stem cells; genomics; epigenomics; proteomics
The “field of injury” hypothesis proposes that exposure to an inhaled insult such as cigarette smoke elicits a common molecular response throughout the respiratory tract. This response can therefore be quantified in any airway tissue, including readily accessible epithelial cells in the bronchus, nose, and mouth. High-throughput technologies, such as whole-genome gene expression microarrays, can be employed to catalog the physiological consequences of such exposures in the airway epithelium. Pulmonary diseases such as chronic obstructive pulmonary disease, lung cancer, and asthma are also thought to be associated with a field of injury, and in patients with these diseases, airway epithelial cells can be a useful surrogate for diseased tissue that is often difficult to obtain. Global measurement of mRNA and microRNA expression in these cells can provide useful information about the molecular pathogenesis of such diseases and may be useful for diagnosis and for predicting prognosis and response to therapy. In this review, our aim is to summarize the history and state of the art of such “transcriptomic” studies in the human airway epithelium, especially in smoking and smoking-related lung diseases, and to highlight future directions for this field.
epithelium; lung neoplasms; chronic obstructive pulmonary disease; asthma; tobacco
The acute phase response is an evolutionarily conserved reaction in which physiological stress triggers the liver to remodel the blood proteome. Although thought to be involved in immune defense, the net biological effect of the acute phase response remains unknown. As the acute phase response is stimulated by diverse cytokines that activate either NF-κB or STAT3, we hypothesized that it could be eliminated by hepatocyte-specific interruption of both transcription factors. Here, we report that the elimination in mice of both NF-κB p65 (RelA) and STAT3, but neither alone, abrogated all acute phase responses measured. The failure to respond was consistent across multiple different infectious, inflammatory, and noxious stimuli, including pneumococcal pneumonia. When the effects of infection were analyzed in detail, pneumococcal pneumonia was found to alter the expression of over a thousand transcripts in the liver. This outcome was inhibited by the combined loss of RelA and STAT3. Moreover, this interruption of the acute phase response increased mortality and exacerbated bacterial dissemination during pneumonia, possibly as a result of acute humoral enhancement of macrophage opsonophagocytosis, which was impaired in the mutant mice. Thus, we conclude that RelA and STAT3 are essential for stress-induced transcriptional remodeling in the liver and the subsequent activation of the acute phase response, whose functional role includes compartmentalization of local infection.
The homeodomain transcription factor Nkx2-1 is essential for normal lung development and homeostasis. In lung tumors, it is considered a lineage survival oncogene and prognostic factor depending on its expression levels. The target genes directly bound by Nkx2-1, that could be the primary effectors of its functions in the different cellular contexts where it is expressed, are mostly unknown. In embryonic day 11.5 (E11.5) mouse lung, epithelial cells expressing Nkx2-1 are predominantly expanding, and in E19.5 prenatal lungs, Nkx2-1-expressing cells are predominantly differentiating in preparation for birth. To evaluate Nkx2-1 regulated networks in these two cell contexts, we analyzed genome-wide binding of Nkx2-1 to DNA regulatory regions by chromatin immunoprecipitation followed by tiling array analysis, and intersected these data to expression data sets. We further determined expression patterns of Nkx2-1 developmental target genes in human lung tumors and correlated their expression levels to that of endogenous NKX2-1. In these studies we uncovered differential Nkx2-1 regulated networks in early and late lung development, and a direct function of Nkx2-1 in regulation of the cell cycle by controlling the expression of proliferation-related genes. New targets, validated in Nkx2-1 shRNA transduced cell lines, include E2f3, Cyclin B1, Cyclin B2, and c-Met. Expression levels of Nkx2-1 direct target genes identified in mouse development significantly correlate or anti-correlate to the levels of endogenous NKX2-1 in a dosage-dependent manner in multiple human lung tumor expression data sets, supporting alternative roles for Nkx2-1 as a transcriptional activator or repressor, and direct regulator of cell cycle progression in development and tumors.
Identifying similarities between patterns of differential gene expression provides an opportunity to identify similarities between the experimental and biological conditions that give rise to these gene expression alterations. The growing volume of gene expression data in open data repositories such as the NCBI Gene Expression Omnibus (GEO) presents an opportunity to identify these gene expression similarities on a large scale across a diverse collection of datasets. We have developed a fast, pattern-based computational approach, named openSESAME (Search of Expression Signatures Across Many Experiments), that identifies datasets enriched in samples that display coordinate differential expression of a query signature. Importantly, openSESAME performs this search without prior knowledge of the phenotypic or experimental groups in the datasets being searched. This allows openSESAME to identify perturbations of gene expression that are due to phenotypic attributes that may not have been described in the sample annotation included in the repository.
To demonstrate the utility of openSESAME, we used gene expression signatures of two biological perturbations to query a set of 75,164 human expression profiles that were generated using Affymetrix microarrays and deposited in GEO. The first query, using a signature of estradiol treatment, identified experiments in which estrogen signaling was perturbed and also identified differences in estrogen signaling between estrogen receptor-positive and -negative breast cancers. The second query, which used a signature of silencing of the transcription factor p63 (a key regulator of epidermal differentiation), identified datasets related to stratified squamous epithelia or epidermal diseases such as melanoma.
openSESAME is a tool for leveraging the growing body of publicly available microarray data to discover relationships between different biological states based on common patterns of differential gene expression. These relationships may serve to generate hypotheses about the causes and consequences of specific patterns of observed differential gene expression. To encourage others to explore the utility of this approach, we have made a website for performing openSESAME queries freely available at http://opensesame.bu.edu.
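The idea of scoring individual samples for coordinate differential expression of a query signature, without reference to sample annotations, can be sketched as follows. This is not the openSESAME algorithm, whose details are not given here; it is a simplified illustration that assumes each gene is z-scored across the samples of one dataset and that a sample's score is the mean z-score of the signature's "up" genes minus that of its "down" genes. All names and values are hypothetical.

```python
from statistics import mean, pstdev

def zscore_rows(expr):
    """expr maps gene -> list of values across samples; return gene -> z-scores."""
    out = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        out[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return out

def signature_scores(expr, up_genes, down_genes):
    """Score every sample in one dataset; no group labels are needed."""
    z = zscore_rows(expr)
    n_samples = len(next(iter(expr.values())))
    up = [g for g in up_genes if g in z]      # ignore genes absent from the dataset
    down = [g for g in down_genes if g in z]
    scores = []
    for i in range(n_samples):
        up_mean = mean(z[g][i] for g in up) if up else 0.0
        down_mean = mean(z[g][i] for g in down) if down else 0.0
        scores.append(up_mean - down_mean)
    return scores

# Hypothetical dataset: three genes measured in four samples.
expr = {"UP1": [1, 1, 5, 5], "UP2": [2, 2, 6, 6], "DN1": [7, 7, 3, 3]}
scores = signature_scores(expr, up_genes=["UP1", "UP2"], down_genes=["DN1"])
```

A dataset in which many samples have strongly positive or strongly negative scores would be a candidate for containing the perturbation that the signature describes, even if its annotation never mentions it.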
Smoking is the most important known risk factor for the development of lung cancer. Tobacco exposure results in chronic inflammation, tissue injury and repair. A recent hypothesis argues for a stem/progenitor cell involved in airway epithelial repair that may be a tumor-initiating cell in lung cancer, and which may be associated with recurrence and metastasis. We used immunostaining, quantitative real-time PCR, Western blots and lung cancer tissue microarrays to identify subpopulations of airway epithelial stem/progenitor cells under steady state conditions, normal repair, aberrant repair with premalignant lesions and lung cancer and their correlation with injury and prognosis. We identified a population of keratin 14 (K14)-expressing progenitor epithelial cells that was involved in repair after injury. Dysregulated repair resulted in persistence of K14+ cells in the airway epithelium in premalignant lesions. The presence of K14+ cells in non-small cell lung cancer (NSCLC) samples predicted poorer outcomes. This was especially true in smokers where the presence of K14+ cells in NSCLC was predictive of metastasis. The presence of K14+ progenitor airway epithelial cells in NSCLC predicted a poor prognosis and this predictive value was strongest in smokers, where it also correlated with metastasis. This suggests that reparative K14+ progenitor cells may be tumor-initiating cells in this subgroup of smokers with NSCLC.
Lung carcinogenesis; dysregulated repair; injury
Using valproic acid as an example, the authors demonstrate that drug response signatures derived from genome-wide expression data can identify individuals likely to respond to a drug, and propose that this method could select optimal populations for clinical trials of new therapies.
Drug response signatures that accurately reflect the cellular response to a drug can be generated from Connectivity Map and publically available gene expression data. Predictions from the drug response signature for valproic acid correlate with sensitivity to valproic acid in breast cancer cell lines and patient tumors grown in three-dimensional culture and mouse xenografts. The MATCH algorithm provides an efficient approach for using genome-wide gene expression data to identify a target population for a drug prior to clinical trials. MATCH can predict drug sensitivity in tumors without knowledge of mechanism of action.
Unlike traditional chemotherapy, targeted cancer therapies are expected to work in only a subset of people with a particular cancer. However, biomarkers of response are not always known before clinical trial initiation. We present MATCH (Merging genomic and pharmacologic Analyses for Therapy CHoice), an algorithm for using genome-wide gene expression data to identify and validate a genomic biomarker of sensitivity (see Figure 1). Our proof-of-principle example is valproic acid (VPA), but we also show that an estrogen blocking drug currently used for breast cancer and a B-RAF inhibitor in trials for melanoma give predictions that correspond to their clinical uses.
We use genome-wide gene expression data from treated and untreated samples from the Connectivity Map to generate a VPA response signature. We validate that the VPA signature can identify treated and untreated cells in an independent data set of normal cells and in independent samples from the Connectivity Map. The AUC for the ROC curve is 0.86. We then apply the VPA signature to publically available data sets from a panel of cancer cell lines and from primary tumor and normal tissue samples. These data suggest that there is a subset of women with breast cancer who will be sensitive to VPA. Finally, we validate that our predictions correlate with sensitivity to VPA in breast cancer cell lines grown in two-dimensional culture, primary breast tumor samples grown in three-dimensional culture, and in vivo mouse breast cancer xenografts. Together, these studies show that MATCH can identify cancer patients most likely to respond to a specific drug treatment.
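The area under the ROC curve reported above summarizes how well a signature score separates treated from untreated samples. A minimal sketch of how such a value can be computed is given below, using the rank-based (Mann-Whitney) formulation: the AUC equals the fraction of treated/untreated pairs in which the treated sample receives the higher score, with ties counted as one half. The function name, scores, and labels are hypothetical and do not reproduce the study's data.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive sample outscores a negative one.

    labels: 1 for treated (positive), 0 for untreated (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))

# Hypothetical signature scores for three treated and three untreated samples.
sig_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
treated = [1, 1, 1, 0, 0, 0]
auc = roc_auc(sig_scores, treated)
```

In this toy example eight of the nine treated/untreated pairs are ordered correctly, so the AUC is 8/9. A value of 0.5 would indicate no discrimination and 1.0 perfect separation.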
Identifying the best drug for each cancer patient requires an efficient individualized strategy. We present MATCH (Merging genomic and pharmacologic Analyses for Therapy CHoice), an approach using public genomic resources and drug testing of fresh tumor samples to link drugs to patients. Valproic acid (VPA) is highlighted as a proof-of-principle. In order to predict specific tumor types with high probability of drug sensitivity, we create drug response signatures using publically available gene expression data and assess sensitivity in a data set of >40 cancer types. Next, we evaluate drug sensitivity in matched tumor and normal tissue and exclude cancer types that are no more sensitive than normal tissue. From these analyses, breast tumors are predicted to be sensitive to VPA. A meta-analysis across breast cancer data sets shows that aggressive subtypes are most likely to be sensitive to VPA, but all subtypes have sensitive tumors. MATCH predictions correlate significantly with growth inhibition in cancer cell lines and three-dimensional cultures of fresh tumor samples. MATCH accurately predicts reduction in tumor growth rate following VPA treatment in patient tumor xenografts. MATCH uses genomic analysis with in vitro testing of patient tumors to select optimal drug regimens before clinical trial initiation.
biomarkers; cancer; pharmacogenomics
The directed differentiation of iPS and ES cells into definitive endoderm (DE) would allow the derivation of otherwise inaccessible progenitors for endodermal tissues. However, a global comparison of the relative equivalency of DE derived from iPS and ES populations has not been performed. Recent reports of molecular differences between iPS and ES cells have raised uncertainty as to whether iPS cells could generate autologous endodermal lineages in vitro. Here, we show that both mouse iPS and parental ES cells exhibited highly similar in vitro capacity to undergo directed differentiation into DE progenitors. With few exceptions, both cell types displayed similar surges in gene expression of specific master transcriptional regulators and global transcriptomes that define the developmental milestones of DE differentiation. Microarray analysis showed considerable overlap between the genetic programs of DE derived from ES/iPS cells in vitro and authentic DE from mouse embryos in vivo. Intriguingly, iPS cells exhibited aberrant silencing of imprinted genes known to participate in endoderm differentiation, yet retained a robust ability to differentiate into DE. Our results show that, despite some molecular differences, iPS cells can be efficiently differentiated into DE precursors, reinforcing their potential for development of cell-based therapies for diseased endoderm-derived tissues.
Although cigarette smoking is the major cause of chronic obstructive pulmonary disease (COPD), only a subset of smokers develops this disease. There is significant clinical, radiographic, and pathologic heterogeneity within smokers who develop COPD that likely reflects multiple molecular mechanisms of disease. It is possible that variations in the individual response to cigarette smoking form the basis for the distinct clinical and molecular phenotypes and variable natural history associated with COPD. Using the biologic premise of a molecular field of airway injury created by cigarette smoking, this response to tobacco exposure can be measured by molecular profiling of the airway epithelium. Noninvasive study of this field effect by profiling airway gene expression in patients with COPD holds important implications for our understanding of disease heterogeneity, early disease detection, and identification of novel disease-modifying therapies.
airway gene expression; chronic obstructive pulmonary disease; bioinformatics
Chronic obstructive pulmonary disease (COPD) fulfills criteria for a complex genetic disease in which environmental factors interact with multiple polymorphic genes to influence susceptibility. Finding the genes that influence susceptibility can be approached in hypothesis testing or unbiased study designs. In candidate gene association studies, genetic variation in, and/or levels of, expression of genes known or suspected to be involved in the pathogenesis of COPD are compared in affected and unaffected individuals. Although this approach is useful, it is limited by our present knowledge of disease pathophysiology. Genomewide studies of gene expression and of genetic variation are now possible and are not constrained by our limited knowledge. Although both of these unbiased approaches are in their infancy, they have already provided exciting new avenues for future investigation and potentially new approaches to risk prediction and therapy.
chronic obstructive pulmonary disease; genetics; genomics
Prior microarray studies of smokers at high risk for lung cancer have demonstrated that heterogeneity in bronchial airway epithelial cell gene expression response to smoking can serve as an early diagnostic biomarker for lung cancer. As a first step in applying functional genomic analysis to population studies, we have examined the relationship between gene expression variation and genetic variation in a central molecular pathway (NRF2-mediated antioxidant response) associated with smoking exposure and lung cancer. We assessed global gene expression in histologically normal airway epithelial cells obtained at bronchoscopy from smokers who developed lung cancer (SC, n = 20), smokers without lung cancer (SNC, n = 24), and never smokers (NS, n = 8). Functional enrichment analysis showed that expression of the NRF2-mediated, antioxidant response element (ARE)-regulated genes was significantly lower in SC than in SNC. Importantly, we found that the expression of MAFG (a binding partner of NRF2) was correlated with the expression of ARE genes, suggesting MAFG levels may limit target gene induction. Bioinformatically, we identified single nucleotide polymorphisms (SNPs) in putative ARE genes and, to test the impact of genetic variation, we genotyped these putative regulatory SNPs and other tag SNPs in selected NRF2 pathway genes. By sequencing the MAFG locus, we identified 30 novel SNPs, two of which were associated with either gene expression or lung cancer status among smokers. This work demonstrates an analysis approach that integrates bioinformatics pathway and transcription factor binding site analysis with genotype, gene expression and disease status to identify SNPs that may be associated with individual differences in gene expression and/or cancer status in smokers. These polymorphisms might ultimately contribute to lung cancer risk via their effect on the airway gene expression response to tobacco-smoke exposure.
Lung cancer is the leading cause of cancer death in the US and the world. The high mortality rate results, in part, from the lack of effective tools for early detection and the inability to identify subsets of patients who would benefit from adjuvant chemotherapy or targeted therapies. The development of high-throughput genome-wide technologies for measuring gene expression, such as microarrays, has the potential to impact the mortality rate of lung cancer patients by improving diagnosis, prognosis, and treatment. This review will highlight recent studies using high-throughput gene expression technologies that have led to clinically relevant insights into lung cancer. The hope is that diagnostic and prognostic biomarkers that have been developed as part of this work will soon be ready for widespread clinical application and will have a dramatic impact on the evaluation of patients with suspected lung cancer, leading to effective personalized treatment regimens.