Lung cancer is the leading cause of cancer death worldwide. Tobacco use is the major pathogenic factor, but not all lung cancers are attributable to smoking. Specifically, lung cancer in never-smokers has been suggested to represent a distinct disease entity compared to lung cancer arising in smokers due to differences in etiology, natural history and response to specific treatment regimens. However, the genetic aberrations that differ between lung carcinomas of smokers and never-smokers remain largely unclear.
Unsupervised gene expression analysis of 39 primary lung adenocarcinomas was performed using Illumina HT-12 microarrays. Results from unsupervised analysis were validated in six external adenocarcinoma data sets (n=687), and six data sets comprising normal airway epithelial or normal lung tissue specimens (n=467). Supervised gene expression analysis between smokers and never-smokers was performed in seven adenocarcinoma data sets, and results were validated in the six normal data sets.
Initial unsupervised analysis of 39 adenocarcinomas identified two subgroups of which one harbored all never-smokers. A generated gene expression signature could subsequently identify never-smokers with 79-100% sensitivity in external adenocarcinoma data sets and with 76-88% sensitivity in the normal materials. A notable fraction of current/former smokers were grouped with never-smokers. Intriguingly, supervised analysis of never-smokers versus smokers in seven adenocarcinoma data sets generated similar results. Overlap in classification between the two approaches was high, indicating that both approaches identify a common set of samples from current/former smokers as potential never-smokers. The gene signature from unsupervised analysis included several genes implicated in lung tumorigenesis, immune-response associated pathways, genes previously associated with smoking, as well as marker genes for alveolar type II pneumocytes, while the best classifier from supervised analysis comprised genes strongly associated with proliferation, but also genes previously associated with smoking.
Based on gene expression profiling, we demonstrate that never-smokers can be identified with high sensitivity in both tumor material and normal airway epithelial specimens. Our results indicate that tumors arising in never-smokers, together with a subset of tumors from smokers, represent a distinct entity of lung adenocarcinomas. Taken together, these analyses provide further insight into the transcriptional patterns occurring in lung adenocarcinoma stratified by smoking history.
Lung cancer; Smoking; Gene expression analysis; Adenocarcinoma; EGFR; Never-smokers; Immune response
Lung cancer is the most common cause of cancer death in developed countries. Adenocarcinoma is becoming the most common form of lung cancer. Cigarette smoking is the main risk factor for lung cancer. Long-term cigarette smoking may be characterized by genetic alteration and diffuse injury of the airway surface, termed field cancerization, while cancer in non-smokers is usually clonally derived. Detecting specific gene expression changes in the non-cancerous lung of smokers with adenocarcinoma may provide a tool for predicting which smokers will develop this malignancy.
We describe gene expression in non-cancerous lung tissue from 21 smokers with lung adenocarcinoma and compare it to gene expression in non-cancerous lung tissue from 10 non-smokers with primary lung adenocarcinoma.
Total RNA was isolated from peripheral non-cancerous lung tissue. The cDNA was hybridized to the U133A GeneChip array. Hierarchical clustering analysis was performed on genes obtained from smokers and non-smokers; the genes remaining after subtraction were exported to the Ingenuity Pathway Analysis software for further analysis.
Gene subtraction disclosed 36 genes with high scores. They were subsequently mapped and sorted based on location, cellular components, and biochemical activity. Functional analysis disclosed 20 genes that are involved in cancer processes (P = 7.05E-5 to 2.92E-2).
The detected genes may serve as predictors for smokers at high risk of developing lung cancer. In addition, since these genes originate from non-cancerous lung, which makes up the major area of the lungs, an induced sputum sample may represent it.
Smoking is responsible for 90% of lung cancer cases. There is currently no clinically available gene test for early detection of lung cancer in smokers, or an effective patient selection strategy for adjuvant chemotherapy in lung cancer treatment. In this study, concurrent coexpression with multiple signaling pathways was modeled among a set of genes associated with smoking and lung cancer survival. This approach identified and validated a 7-gene signature for lung cancer diagnosis and prognosis in smokers using patient transcriptional profiles (n=847). The smoking-associated gene coexpression networks in lung adenocarcinoma tumors (n=442) were highly significant in terms of biological relevance (network precision = 0.91, FDR<0.01) when evaluated with numerous databases containing multi-level molecular associations. The gene coexpression network in smoking lung adenocarcinoma patients was confirmed in qRT-PCR assays of the identified biomarkers and involved signaling pathway genes in human lung adenocarcinoma cells (H23) treated with 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK). Furthermore, the western blotting results of p53, phospho-p53, Rb and EGFR in NNK-treated H23 and transformed normal human lung epithelial cells (BEAS-2B) support their functional involvement in smoking-induced lung cancer carcinogenesis and progression.
smoking; lung cancer diagnosis and prognosis; gene signature; signaling pathway; coexpression networks
An embryonic stem cell profile correlates with poorly differentiated breast, bladder and glioma cancers. In this manuscript, we assess the correlation between the embryonic stem cell profile and clinical variables in lung cancer.
Microarray gene expression analysis was done using Affymetrix Human Genome U133A on 443 samples of human lung adenocarcinoma and 130 samples of squamous cell carcinoma. To identify gene-set enrichment patterns we used the Genomica software.
Our analysis showed that an increased expression of the embryonic stem cell gene set and decreased expression of Polycomb target gene set identified poorly-differentiated lung adenocarcinoma. In addition, this gene expression signature was associated with markers of poor prognosis and worse overall survival in lung adenocarcinoma. However, there was no correlation between this embryonic stem cell gene signature and any histological or clinical variable assessed in lung squamous cell carcinoma.
This work suggests that not all poorly-differentiated non-small cell lung cancers exhibit a gene expression profile similar to ESC, and that other characteristics may play a more important role in the determination of differentiation and survival in squamous cell carcinoma of the lung.
Embryonic genes; stem cell; Affymetrix; lung; cancer
Lung cancer is the most common cause of cancer-related deaths. Tobacco smoke exposure is the strongest aetiological factor associated with lung cancer. In this study, using serial analysis of gene expression (SAGE), we comprehensively examined the effect of active smoking by comparing the transcriptomes of clinical specimens obtained from current, former and never smokers, and identified genes showing both reversible and irreversible expression changes upon smoking cessation.
Twenty-four SAGE profiles of the bronchial epithelium of eight current, twelve former and four never smokers were generated and analyzed. In total, 3,111,471 SAGE tags representing over 110 thousand potentially unique transcripts were generated, comprising the largest human SAGE study to date. We identified 1,733 constitutively expressed genes in current, former and never smoker transcriptomes. We have also identified both reversible and irreversible gene expression changes upon cessation of smoking; reversible changes were frequently associated with either xenobiotic metabolism, nucleotide metabolism or mucus secretion. Increased expression of TFF3, CABYR, and ENTPD8 was found to be reversible upon smoking cessation. Expression of GSK3B, which regulates COX2 expression, was irreversibly decreased. MUC5AC expression was only partially reversed. Validation of select genes was performed using quantitative RT-PCR on a secondary cohort of nine current smokers, seven former smokers and six never smokers.
Expression levels of some of the genes related to tobacco smoking return to levels similar to never smokers upon cessation of smoking, while expression of others appears to be permanently altered despite prolonged smoking cessation. These irreversible changes may account for the persistent lung cancer risk despite smoking cessation.
KRAS mutations are found in ~ 25% of lung adenocarcinomas in Western countries and, as a group, have been strongly associated with cigarette smoking. These mutations are predictive of poor prognosis in resected disease as well as resistance to treatment with erlotinib or gefitinib.
We determined the frequency and type of KRAS codon 12 and 13 mutations and characterized their association with cigarette smoking history in patients with lung adenocarcinomas.
KRAS mutational analysis was performed on 482 lung adenocarcinomas, 81 (17%) of which were obtained from patients who had never smoked cigarettes. KRAS mutations were found in 15% (12/81; 95% CI 8%-24%) of tumors from never smokers. Similarly, 22% (69/316; 95% CI 17%-27%) of tumors from former smokers, and 25% (21/85; 95% CI 16%-35%) of tumors from current smokers had KRAS mutations. The frequency of KRAS mutation was not associated with age, gender, or smoking history. The number of pack years of cigarette smoking did not predict an increased likelihood of KRAS mutations. Never smokers were significantly more likely than former or current smokers to have a transition mutation (G→A) rather than the transversion mutations known to be smoking related (G→T or G→C; p<0.0001).
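The confidence intervals quoted above can be approximated from the raw counts. The following is a minimal sketch using the Wilson score interval; the abstract does not state which interval method the authors used, so the choice of Wilson here is an assumption and the bounds may differ by about a percentage point from the published ones.

```python
import math
from typing import Tuple

def wilson_interval(successes: int, n: int, z: float = 1.959964) -> Tuple[float, float]:
    """Two-sided Wilson score interval for a binomial proportion (95% by default)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Counts taken from the abstract: tumors with a KRAS mutation / tumors tested.
groups = {"never": (12, 81), "former": (69, 316), "current": (21, 85)}
for name, (k, n) in groups.items():
    lo, hi = wilson_interval(k, n)
    print(f"{name:>7}: {k}/{n} = {k / n:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```

The overlapping intervals across the three groups are consistent with the statement that mutation frequency was not associated with smoking history.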
Based upon our data, KRAS mutations are not rare among never smokers with lung adenocarcinoma and such patients have a distinct KRAS mutation profile. The etiologic and biological heterogeneity of KRAS mutant lung adenocarcinomas is worthy of further study.
Maternal smoking doubles the risk of delivering a low birth weight infant. The purpose of this study was to analyze differential gene expression in umbilical cord tissue as a function of maternal smoking, with an emphasis on growth-related genes. We recruited 15 pregnant smokers and 15 women who never smoked during pregnancy to participate. RNA was isolated from umbilical cord tissue collected and snap frozen at the time of delivery. Microarray analysis was performed using the Affymetrix GeneChip Scanner 3000. Six hundred seventy-eight probes corresponding to 545 genes were differentially expressed (i.e., had an intensity ratio > +/−1.3 and a corrected significance value p < 0.005) in tissue obtained from smokers versus nonsmokers. Genes important for fetal growth, angiogenesis, or development of connective tissue matrix were up regulated among smokers. The most highly up-regulated gene was CSH1, a somatomammotropin gene. Two other somatomammotropin genes (CSH2 and CSH-L1) were also up regulated. The most highly down-regulated gene was APOBEC3A; other down-regulated genes included those that may be important in immune and barrier protection. Validation of the three somatomammotropin genes showed a high correlation between qPCR and microarray expression. We conclude that maternal smoking may be associated with altered gene expression in the offspring.
prenatal tobacco exposure; pregnancy; microarray; genetics
Affordable early screening in subjects with high risk of lung cancer has great potential to improve survival from this deadly disease. We measured gene expression from lung tissue and peripheral whole blood (PWB) from adenocarcinoma cases and controls to identify dysregulated lung cancer genes that could be tested in blood to improve identification of at-risk patients in the future. Genome-wide mRNA expression analysis was conducted in 153 subjects (73 adenocarcinoma cases, 80 controls) from the Environment And Genetics in Lung cancer Etiology (EAGLE) study using PWB and paired snap-frozen tumor and non-involved lung tissue samples. Analyses were conducted using unpaired t-tests, linear mixed effects and ANOVA models. The area under the receiver operating characteristic curve (AUC) was computed to assess the predictive accuracy of the identified biomarkers. We identified 50 dysregulated genes in stage I adenocarcinoma versus control PWB samples (False Discovery Rate ≤0.1, fold change ≥1.5 or ≤0.66). Among them, eight (TGFBR3, RUNX3, TRGC2, TRGV9, TARP, ACP1, VCAN, and TSTA3) differentiated paired tumor versus non-involved lung tissue samples in stage I cases, suggesting a similar pattern of lung cancer-related changes in PWB and lung tissue. These results were confirmed in two independent gene expression analyses in a blood-based case-control study (n=212) and a tumor-non tumor paired tissue study (n=54). The eight genes discriminated patients with lung cancer from healthy controls with high accuracy (AUC=0.81, 95% CI=0.74–0.87). Our finding suggests the use of gene expression from PWB for the identification of early detection markers of lung cancer in the future.
microarray gene expression; peripheral blood; lung cancer; stage I
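To make the AUC figure reported above concrete, here is a minimal sketch of how an AUC can be computed from classifier scores as the probability that a randomly chosen case scores higher than a randomly chosen control. The scores are hypothetical values invented for illustration; they are not data from the EAGLE study, and the function name `auc` is our own.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs in which the case scores higher.

    Ties count one half. This equals the normalized Mann-Whitney U statistic.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical classifier scores, invented for illustration only.
cases = [0.9, 0.8, 0.6, 0.55]
controls = [0.7, 0.5, 0.4, 0.3]
print(auc(cases, controls))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the reported 0.81 should be read.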
Lung cancer is mainly caused by smoking, but the quantitative relations between smoking and histologic subtypes of lung cancer remain inconclusive. Using one of the largest lung cancer datasets ever assembled, we explored the impact of smoking on risks of the major cell types of lung cancer. This pooled analysis included 13,169 cases and 16,010 controls from Europe and Canada. Studies with population controls comprised 66.5% of the subjects. Adenocarcinoma (AdCa) was the most prevalent subtype in never smokers and in women. Squamous cell carcinoma (SqCC) predominated in male smokers. Age-adjusted odds ratios (ORs) were estimated with logistic regression. ORs were elevated for all metrics of exposure to cigarette smoke and were higher for SqCC and small cell lung cancer (SCLC) than for AdCa. Current male smokers with an average daily dose of >30 cigarettes had ORs of 103.5 (95% CI 74.8-143.2) for SqCC, 111.3 (95% CI 69.8-177.5) for SCLC, and 21.9 (95% CI 16.6-29.0) for AdCa. In women, the corresponding ORs were 62.7 (95% CI 31.5-124.6), 108.6 (95% CI 50.7-232.8), and 16.8 (95% CI 9.2-30.6), respectively. Whereas ORs started to decline soon after quitting, they did not fully return to the baseline risk of never smokers even 35 years after cessation. The major result that smoking exerted a steeper risk gradient on SqCC and SCLC than on AdCa is in line with previous population data and biological understanding of lung cancer development.
cigarette smoking; lung cancer; relative risk characterization; tobacco smoke; stem cells
Smoking is a prominent risk factor for lung cancer. However, it is not an established prognostic factor for lung cancer in clinics. To date, no gene test is available for diagnostic screening of lung cancer risk or prognostication of clinical outcome in smokers. This study sought to identify a smoking associated gene signature in order to provide a more precise diagnosis and prognosis of lung cancer in smokers.
Methods and materials
An implication network based methodology was used to identify biomarkers by modeling crosstalk with major lung cancer signaling pathways. Specifically, the methodology contains the following steps: 1) identifying genes significantly associated with lung cancer survival; 2) selecting candidate genes which are differentially expressed in smokers versus non-smokers from the survival genes identified in Step 1; 3) from these candidate genes, constructing gene coexpression networks based on prediction logic for the smoker group and the non-smoker group, respectively; 4) identifying smoking-mediated differential components, i.e., the unique gene coexpression patterns specific to each group; and 5) from the differential components, identifying genes directly co-expressed with major lung cancer signaling hallmarks.
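Steps 3 and 4 above can be illustrated with a toy sketch. Note the substitution: the study builds implication networks based on prediction logic, whereas this example uses a plain Pearson correlation threshold as a stand-in, because the abstract does not specify the implication rules. Gene names, expression values, and the 0.8 threshold are all hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def coexpression_edges(expr, threshold=0.8):
    """Gene pairs whose |Pearson r| meets the threshold within one patient group."""
    return {
        (g1, g2)
        for g1, g2 in combinations(sorted(expr), 2)
        if abs(pearson(expr[g1], expr[g2])) >= threshold
    }

# Hypothetical expression values: one list per gene, one value per patient.
smokers = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.0],
    "GENE_C": [5.0, 1.0, 4.0, 2.0],
}
non_smokers = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [4.0, 1.0, 3.5, 2.0],
    "GENE_C": [5.0, 1.0, 4.0, 2.0],
}

# Step 3: one network per group.
smoker_net = coexpression_edges(smokers)
non_smoker_net = coexpression_edges(non_smokers)

# Step 4: differential components are edges unique to one group's network.
smoker_only = smoker_net - non_smoker_net
non_smoker_only = non_smoker_net - smoker_net
print(smoker_only, non_smoker_only)
```

Representing each network as a set of gene pairs makes the group-specific components a simple set difference; step 5 would then filter these edges for those touching known signaling pathway genes.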
A smoking-associated 6-gene signature was identified for prognosis of lung cancer from a training cohort (n=256). The 6-gene signature could separate lung cancer patients into two risk groups with distinct post-operative survival (log-rank P < 0.04, Kaplan-Meier analyses) in three independent cohorts (n=427). The expression-defined prognostic prediction is strongly related to smoking association and smoking cessation (P < 0.02; Pearson’s Chi-squared tests). The 6-gene signature is an accurate prognostic factor (hazard ratio = 1.89, 95% CI: [1.04, 3.43]) compared to common clinical covariates in multivariate Cox analysis. The 6-gene signature also provides an accurate diagnosis of lung cancer with an overall accuracy of 73% in a cohort of smokers (n=164). The coexpression patterns derived from the implication networks were validated with interactions reported in the literature retrieved with STRING8, Ingenuity Pathway Analysis, and Pathway Studio.
The pathway-based approach identified a smoking-associated 6-gene signature that predicts lung cancer risk and survival. This gene signature has potential clinical implications in the diagnosis and prognosis of lung cancer in smokers.
implication networks based on prediction logic; gene coexpression networks based on formal logic; smoking; gene signature; lung cancer diagnosis and prognosis; signaling pathways
Use of tobacco is responsible for approximately 30% of all cancer-related deaths in the United States including cancers of the upper aerodigestive tract. In the current study, 40 current and 40 age- and gender-matched never smokers underwent buccal biopsies to evaluate the effects of smoking on the transcriptome. Microarray analyses were carried out using Affymetrix HGU 133 Plus2 arrays. Smoking altered the expression of numerous genes: 32 genes showed increased expression and 9 genes showed reduced expression in the oral mucosa of smokers vs. never smokers. Increases were found in genes involved in xenobiotic metabolism, oxidant stress, eicosanoid synthesis, nicotine signaling and cell adhesion. Increased numbers of Langerhans cells were found in the oral mucosa of smokers. Interestingly, smoking caused greater induction of aldo-keto reductases, enzymes linked to polycyclic aromatic hydrocarbon induced genotoxicity, in the oral mucosa of women than men. Striking similarities in expression changes were found in oral compared to the bronchial mucosa. The observed changes in gene expression were compared to known chemical signatures using the Connectivity Map database, and suggested that geldanamycin, an Hsp90 inhibitor, might be an anti-mimetic of tobacco smoke. Consistent with this prediction, geldanamycin caused dose-dependent suppression of tobacco smoke extract-mediated induction of CYP1A1 and CYP1B1 in vitro. Collectively, these results provide new insights into the carcinogenic effects of tobacco smoke, support the potential use of oral epithelium as a surrogate tissue in future lung cancer chemoprevention trials and illustrate the potential of computational biology to identify chemopreventive agents.
tobacco; smoking; microarray; aryl hydrocarbon receptor; heat shock protein 90
The aetiologic role of tobacco smoking was elucidated in a case-control study comprising 579 cases of male lung cancer registered during 1972-1977 in northern Sweden. The population aetiologic fraction attributable to smoking was about 80% in this series. Pipe smoking was as common as cigarette smoking and gave similar relative risk. The pipe smoking cases, however, had significantly higher mean age and mean smoking years at the time of diagnosis than the cigarette smoking cases. An obvious dose-response relation was found for both cigarette and pipe smoking. In ex-smokers, the relative risk gradually decreased from five years after cessation of smoking. This decrease was, however, much less pronounced in ex-pipe smokers than in ex-cigarette smokers. High relative risks were obtained for small cell and squamous cell carcinomas. For adenocarcinomas the relative risk was considerably lower but still significantly increased. Two types of controls were used, i.e. deceased and living. Comparison with living controls gave generally higher risk estimates than comparison with deceased controls.
Lung cancer is strictly associated with tobacco smoking. Tumours developed in non-smoking subjects account for less than 10% of all lung cancers and show peculiar histopathological features, being prevalently adenocarcinomas. A number of genetic data suggest that their biological behaviour may be different from that of lung tumours caused by smoking; however, the number of cases investigated to date is too low to draw definitive conclusions. We have examined the status of p53 and K-ras genes and the presence of loss of heterozygosity (LOH) at the FHIT locus in a series of 35 lung adenocarcinomas that developed in subjects who had never smoked. Results were compared with those obtained in a series of 35 lung adenocarcinomas from heavy-smoking subjects. In the group of non-smoking subjects p53 mutations and LOH at the FHIT locus were present in seven (20%) cases, and the two alterations were constantly associated (P < 0.0001), whereas they were not related in the series of carcinomas caused by smoking. In tumours developed in heavy-smoking subjects, the frequency of LOH at the FHIT locus was significantly higher (P = 0.006) than in tumours from non-smoking subjects. The frequency of p53 mutations in adenocarcinomas caused by smoking was not different from that seen in non-smoking subjects. However, in the group of smoking subjects we observed mostly G:C→T:A transversions, whereas frameshift mutations and G:C→A:T transitions were more frequently found in tumours from non-smoking subjects. No point mutations of the K-ras gene at codon 12 were seen in subjects who had never smoked, whereas they were present (mostly G:C→T:A transversions) in 34% of tumours caused by smoking (P = 0.002). Our data suggest that lung adenocarcinomas developed in subjects who had never smoked represent a distinct biological entity involving a co-alteration of the p53 gene and the FHIT locus in 20% of cases.
OBJECTIVE: To estimate the risk of lung cancer in lifelong non-smokers exposed to environmental tobacco smoke. DESIGN: Analysis of 37 published epidemiological studies of the risk of lung cancer (4626 cases) in non-smokers who did and did not live with a smoker. The risk estimate was compared with that from linear extrapolation of the risk in smokers using seven studies of biochemical markers of tobacco smoke intake. MAIN OUTCOME MEASURE: Relative risk of lung cancer in lifelong non-smokers according to whether the spouse currently smoked or had never smoked. RESULTS: The excess risk of lung cancer was 24% (95% confidence interval 13% to 36%) in non-smokers who lived with a smoker (P < 0.001). Adjustment for the effects of bias (positive and negative) and dietary confounding had little overall effect; the adjusted excess risk was 26% (7% to 47%). The dose-response relation of the risk of lung cancer with both the number of cigarettes smoked by the spouse and the duration of exposure was significant. The excess risk derived by linear extrapolation from that in smokers was 19%, similar to the direct estimate of 26%. CONCLUSION: The epidemiological and biochemical evidence on exposure to environmental tobacco smoke, with the supporting evidence of tobacco specific carcinogens in the blood and urine of non-smokers exposed to environmental tobacco smoke, provides compelling confirmation that breathing other people's tobacco smoke is a cause of lung cancer.
Smoking causes lung cancer and chronic obstructive pulmonary disease (COPD), which impose severe health problems on humans. Both diseases are related to each other and can be induced by chronic inflammation in the lung. To identify the molecular mechanism for lung cancer formation, a CCSP-rtTA/(Teto)7Stat3C bitransgenic model was generated recently. In this model, persistent activation of the Stat3 signaling pathway induced pulmonary inflammation and adenocarcinoma formation in the lung. A group of Stat3 downstream genes that can be used as biomarkers for lung cancer diagnosis and prognosis was identified by Affymetrix GeneChip microarray analysis. To determine which human lung cancers are related to the Stat3 pathway, multiple Stat3 downstream genes were screened in human lung cancers (adenocarcinomas and squamous cell carcinomas) and lung tissue with COPD. In both cancer and COPD, the Stat3 gene was up-regulated. A panel of Stat3-up-regulated downstream genes in mice was up-regulated in human adenocarcinomas, but not in human squamous cell carcinomas. This panel of genes was also modestly up-regulated in lung tissue with COPD from patients with a history of smoking and not up-regulated in those without a history of smoking. Several Stat3-down-regulated downstream genes also showed differential expression patterns in carcinoma and COPD. These studies support the concept that Stat3 is a potent oncogenic molecule that plays a role in the formation of lung adenocarcinomas in both mice and humans. The carcinogenesis of adenocarcinoma and squamous cell carcinoma is mediated by different molecular mechanisms and pathways in vivo. Stat3 and its downstream genes can serve as biomarkers for lung adenocarcinoma and COPD diagnosis and prognosis in mice and humans.
Stat3; Adenocarcinoma; Squamous Cell Carcinoma; COPD; Biomarkers; Real-Time qRT-PCR
This study was designed to determine the relationship of cigarette smoking to the frequency and qualitative differences among KRAS mutations in lung adenocarcinomas from Korean patients.
Materials and Methods
Detailed smoking histories were obtained from 200 consecutively enrolled patients with lung adenocarcinoma according to a standard protocol. EGFR (exons 18 to 21) and KRAS (codons 12/13) mutations were determined via direct-sequencing.
The incidence of KRAS mutations was 8% (16 of 200) in patients with lung adenocarcinoma. KRAS mutations were found in 5.8% (7 of 120) of tumors from never-smokers, 15% (6 of 40) from former-smokers, and 7.5% (3 of 40) from current-smokers. The frequency of KRAS mutations did not differ significantly according to smoking history (p=0.435). Never-smokers were significantly more likely than former or current smokers to have a transition mutation (G→A or C→T) rather than a transversion mutation (G→T or G→C) that is known to be smoking-related (p=0.011). In a Cox regression model, the adjusted hazard ratios for the risk of progression with epidermal growth factor receptor-tyrosine kinase inhibitors (EGFR-TKIs) were 0.24 (95% CI, 0.14-0.42; p<0.001) for the EGFR mutation and 1.27 (95% CI, 0.58-2.79; p=0.537) for the KRAS mutation.
Cigarette smoking did not influence the frequency of KRAS mutations in lung adenocarcinomas in Korean patients, but influenced qualitative differences in the KRAS mutations.
EGFR; KRAS; pulmonary adenocarcinoma; cigarette smoking; EGFR-tyrosine kinase inhibitors
Although several prognostic signatures have been developed in lung cancer, their application in clinical practice has been limited because they have not been validated in multiple independent data sets. Moreover, the lack of common genes between the signatures makes it difficult to know what biological process may be reflected or measured by each signature. By applying a classical data exploration approach to gene expression data from patients with lung adenocarcinoma (n = 186), we uncovered two distinct subgroups of lung adenocarcinoma and identified a prognostic 193-gene expression signature associated with the two subgroups. The signature was validated in 4 independent lung adenocarcinoma cohorts, including 556 patients. In multivariate analysis, the signature was an independent predictor of overall survival (hazard ratio, 2.4; 95% confidence interval, 1.2 to 4.8; p = 0.01). An integrated analysis of the signature revealed that E2F1 plays key roles in regulating genes in the signature. Subset analysis demonstrated that the gene signature could identify high-risk patients with early-stage (stage I) disease and patients who would benefit from adjuvant chemotherapy. Thus, our study provides evidence for the molecular basis of two clinically relevant, distinct subtypes of lung adenocarcinoma.
Degradation and chemical modification of RNA in formalin-fixed paraffin-embedded (FFPE) samples hamper their use in expression profiling studies. This study aimed to show that useful information can be obtained by Exon-array profiling archival FFPE tumour samples.
Nineteen cervical squamous cell carcinoma (SCC) and 9 adenocarcinoma (AC) FFPE samples (10–16-year-old) were profiled using Affymetrix Exon arrays. The gene signature derived was tested on a fresh-frozen non-small cell lung cancer (NSCLC) series. Exploration of biological networks involved gene set enrichment analysis (GSEA). Differential gene expression was confirmed using Quantigene, a multiplex bead-based alternative to qRT–PCR.
In all, 1062 genes were higher in SCC vs AC, and 155 genes higher in AC. The 1217-gene signature correctly separated 58 NSCLC into SCC and AC. A gene network centered on hepatic nuclear factor and GATA6 was identified in AC, suggesting a role in glandular cell differentiation of the cervix. Quantigene analysis of the top 26 differentially expressed genes correctly partitioned cervix samples as SCC or AC.
FFPE samples can be profiled using Exon arrays to derive gene expression signatures that are sufficiently robust to be applied to independent data sets, identify novel biology and design assays for independent platform validation.
cervix cancer; exon array; expression profiling; FFPE; histology
MicroRNA plays an important role in human diseases and cancer. We seek to investigate the expression status, clinical relevance, and functional role of microRNA in non-small cell lung cancer.
We performed miRNA expression profiling in matched lung adenocarcinoma and uninvolved lung using 56 pairs of fresh-frozen (FF) and 47 pairs of formalin-fixed, paraffin-embedded (FFPE) samples from never smokers. The most differentially expressed miRNA genes were evaluated by Cox analysis and log-rank test. The best candidate, miR-708, was further examined for differential expression in two independent cohorts. The functional significance of miR-708 expression in lung cancer was examined by identifying its candidate mRNA target and by manipulating its expression levels in cultured cells.
Among the 20 miRNAs most differentially expressed between tested tumor and normal samples, high expression of miR-708 in the tumors was most strongly associated with an increased risk of death after adjustment for all clinically significant factors including age, sex, and tumor stage (FF cohort: HR, 1.90; 95% CI, 1.08-3.35; P=.025 and FFPE cohort: HR, 1.93; 95% CI, 1.02-3.63; P=.042). The transcript for the TMEM88 gene has a miR-708 binding site in its 3′ UTR and was significantly reduced in tumors with high miR-708 expression. Forced miR-708 expression reduced TMEM88 transcript levels and increased the rate of cell proliferation, invasion, and migration in culture.
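The reported hazard ratios, confidence intervals, and P values can be checked against each other. The sketch below assumes a Wald-type interval that is symmetric on the log scale, which is the usual form for Cox model output but is not stated in the abstract; the function name is illustrative.

```python
import math


def wald_p_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Recover the z statistic and two-sided P value from an HR and its 95% CI.

    Assumes the interval is exp(log(HR) +/- z_crit * SE), so the standard
    error of log(HR) is the log-width of the interval divided by 2 * z_crit.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided P value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z_ff, p_ff = wald_p_from_hr_ci(1.90, 1.08, 3.35)      # fresh-frozen cohort
z_ffpe, p_ffpe = wald_p_from_hr_ci(1.93, 1.02, 3.63)  # FFPE cohort
print(f"FF:   z = {z_ff:.2f}, P = {p_ff:.3f}")
print(f"FFPE: z = {z_ffpe:.2f}, P = {p_ffpe:.3f}")
```

Both recomputed P values land close to the reported .025 and .042; small discrepancies are expected because the published HR and interval bounds are rounded to two decimals.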
MicroRNA-708 acts as an oncogene in lung cancer, contributing to tumor growth and disease progression by directly downregulating TMEM88, a negative regulator of the Wnt signaling pathway.
NSCLC; adenocarcinoma; miR-708; never smoker; survival; TMEM88; Wnt signaling
Objective To assess the risk of lung cancer in smokers of medium tar filter cigarettes compared with smokers of low tar and very low tar filter cigarettes.
Design Analysis of the association between the tar rating of the brand of cigarette smoked in 1982 and mortality from lung cancer over the next six years. Multivariate proportional hazards analyses used to assess hazard ratios, with adjustment for age at enrolment, race, educational level, marital status, blue collar employment, occupational exposure to asbestos, intake of vegetables, citrus fruits, and vitamins, and, in analyses of current and former smokers, for age when they started to smoke and number of cigarettes smoked per day.
Setting Cancer prevention study II (CPS-II).
Participants 364 239 men and 576 535 women, aged ≥ 30 years, who had either never smoked, were former smokers, or were currently smoking a specific brand of cigarette when they were enrolled in the cancer prevention study.
Main outcome measure Death from primary cancer of the lung among participants who had never smoked, former smokers, smokers of very low tar (≤ 7 mg tar/cigarette) filter, low tar (8-14 mg) filter, high tar (≥ 22 mg) non-filter brands and medium tar conventional filter brands (15-21 mg).
Results Irrespective of the tar level of their current brand, all current smokers had a far greater risk of lung cancer than people who had stopped smoking or had never smoked. Compared with smokers of medium tar (15-21 mg) filter cigarettes, risk was higher among men and women who smoked high tar (≥ 22 mg) non-filter brands (hazard ratio 1.44, 95% confidence interval 1.20 to 1.73, and 1.64, 1.26 to 2.15, respectively). There was no difference in risk among men who smoked brands rated as very low tar (1.17, 0.95 to 1.45) or low tar (1.02, 0.90 to 1.16) compared with those who smoked medium tar brands. The same was seen for women (0.98, 0.80 to 1.21, and 0.95, 0.82 to 1.11, respectively).
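The reading of "no difference in risk" above follows from whether each 95% confidence interval contains 1, the hazard ratio of the medium tar reference group. A minimal sketch of that check, using only the figures reported in the Results; the variable and function names are illustrative.

```python
# (sex, tar category, hazard ratio, CI lower, CI upper), relative to
# smokers of medium tar (15-21 mg) filter cigarettes.
RESULTS = [
    ("men", "high tar non-filter", 1.44, 1.20, 1.73),
    ("women", "high tar non-filter", 1.64, 1.26, 2.15),
    ("men", "very low tar", 1.17, 0.95, 1.45),
    ("men", "low tar", 1.02, 0.90, 1.16),
    ("women", "very low tar", 0.98, 0.80, 1.21),
    ("women", "low tar", 0.95, 0.82, 1.11),
]


def ci_excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below 1."""
    return lower > 1.0 or upper < 1.0


differs_from_medium_tar = [
    (sex, category)
    for sex, category, _hr, lower, upper in RESULTS
    if ci_excludes_one(lower, upper)
]

for sex, category, hr, lower, upper in RESULTS:
    flag = "differs" if ci_excludes_one(lower, upper) else "no clear difference"
    print(f"{sex:5s} {category:20s} HR {hr:.2f} ({lower:.2f} to {upper:.2f}): {flag}")
```

Only the two high tar non-filter comparisons have intervals that exclude 1, which matches the conclusion that low and very low tar brands carry a risk similar to medium tar brands.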
Conclusion The increase in lung cancer risk is similar in people who smoke medium tar cigarettes (15-21 mg), low tar cigarettes (8-14 mg), or very low tar cigarettes (≤ 7 mg). Men and women who smoke non-filtered cigarettes with tar ratings ≥ 22 mg have an even higher risk of lung cancer.
Cigarette smoking is a leading cause of preventable death and a significant cause of lung cancer and chronic obstructive pulmonary disease. Prior studies have demonstrated that smoking creates a field of molecular injury throughout the airway epithelium exposed to cigarette smoke. We have previously characterized gene expression in the bronchial epithelium of never smokers and identified the gene expression changes that occur in the mainstem bronchus in response to smoking. In this study, we explored relationships in whole-genome gene expression between extrathoracic (buccal and nasal) and intrathoracic (bronchial) epithelium in healthy current and never smokers.
Using genes that have been previously defined as being expressed in the bronchial airway of never smokers (the "normal airway transcriptome"), we found that bronchial and nasal epithelium from non-smokers were most similar in gene expression when compared to other epithelial and nonepithelial tissues, with several antioxidant, detoxification, and structural genes being highly expressed in both the bronchus and nose. Principal component analysis of previously defined smoking-induced genes from the bronchus suggested that smoking had a similar effect on gene expression in nasal epithelium. Gene set enrichment analysis demonstrated that this set of genes was also highly enriched among the genes most altered by smoking in both nasal and buccal epithelial samples. The expression of several detoxification genes was commonly altered by smoking in all three respiratory epithelial tissues, suggesting a common airway-wide response to tobacco exposure.
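The enrichment claim above rests on a running-sum statistic: walk down a list of genes ranked by how strongly smoking alters them, step up at each gene that belongs to the set of interest, step down otherwise, and take the largest deviation from zero. The sketch below uses the unweighted Kolmogorov-Smirnov-style form, which is a simplification of the weighted statistic used by the GSEA software; it omits the permutation test, and the gene names are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    `ranked_genes` is ordered from most up-regulated to most down-regulated.
    Returns the signed running-sum value with the largest absolute deviation.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        # Hits and misses are scaled so that the walk ends at zero.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking of six genes by smoking-induced change in one tissue.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})     # set concentrated at the top
es_bottom = enrichment_score(ranked, {"g5", "g6"})  # set concentrated at the bottom
es_mixed = enrichment_score(ranked, {"g1", "g6"})   # set split across both ends
```

A score near +1 or -1 means the set clusters at one end of the ranking; in practice significance is judged by comparing the score with scores from permuted data.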
Our findings support a relationship between gene expression in extra- and intrathoracic airway epithelial cells and extend the concept of a smoking-induced field of injury to epithelial cells that line the mouth and nose. This relationship could potentially be utilized to develop a non-invasive biomarker for tobacco exposure as well as a non-invasive screening or diagnostic tool providing information about individual susceptibility to smoking-induced lung diseases.
Several different gene expression signatures have been proposed to predict response to therapy and clinical outcome in lung adenocarcinoma. Herein, we investigate if elements of published gene sets can be reproduced in a small dataset, and how gene expression profiles based on limited sample size relate to clinical parameters including histopathological grade and EGFR protein expression.
Affymetrix Human Genome U133A platform was used to obtain gene expression profiles of 28 pathologically and clinically annotated adenocarcinomas of the lung. EGFR status was determined by fluorescent in situ hybridization and immunohistochemistry.
Using unsupervised clustering algorithms, the predominant gene expression signatures correlated with the histopathological grade but not with EGFR protein expression as detected by immunohistochemistry. In a supervised analysis, the signature of high grade tumors but not of EGFR overexpressing cases showed significant enrichment of gene sets reflecting MAPK activation and other potential signaling cascades downstream of EGFR. Out of four different previously published gene sets that had been linked to prognosis, three showed enrichment in the gene expression signature associated with favorable prognosis.
In this dataset, histopathological tumor grades but not EGFR status were associated with dominant gene expression signatures and gene set enrichment reflecting oncogenic pathway activation, suggesting that high immunohistochemistry EGFR scores may not necessarily be linked to downstream effects that cause major changes in gene expression patterns. Published gene sets showed association with patient survival; however, the small sample size of this study limited the options for a comprehensive validation of previously reported prognostic gene expression signatures.
Cigarette smoke creates a molecular field of injury in epithelial cells that line the respiratory tract. We hypothesized that transcriptome sequencing (RNA-Seq) will enhance our understanding of the field of molecular injury in response to tobacco smoke exposure and lung cancer pathogenesis by identifying gene expression differences not interrogated or accurately measured by microarrays. We sequenced the high-molecular-weight fraction of total RNA (>200 nt) from pooled bronchial airway epithelial cell brushings (n = 3 patients per pool) obtained during bronchoscopy from healthy never smoker (NS) and current smoker (S) volunteers and smokers with (C) and without (NC) lung cancer undergoing lung nodule resection surgery. RNA-Seq libraries were prepared using 2 distinct approaches, one capable of capturing non-polyadenylated RNA (the prototype NuGEN Ovation RNA-Seq protocol) and the other designed to measure only polyadenylated RNA (the standard Illumina mRNA-Seq protocol) followed by sequencing generating approximately 29 million 36 nt reads per pool and approximately 22 million 75 nt paired-end reads per pool, respectively. The NuGEN protocol captured additional transcripts not detected by the Illumina protocol at the expense of reduced coverage of polyadenylated transcripts, while longer read lengths and a paired-end sequencing strategy significantly improved the number of reads that could be aligned to the genome. The aligned reads derived from the two complementary protocols were used to define the compendium of genes expressed in the airway epithelium (n = 20,573 genes). Pathways related to the metabolism of xenobiotics by cytochrome P450, retinol metabolism, and oxidoreductase activity were enriched among genes differentially expressed in smokers, whereas chemokine signaling pathways, cytokine–cytokine receptor interactions, and cell adhesion molecules were enriched among genes differentially expressed in smokers with lung cancer. 
There was a significant correlation between the RNA-Seq gene expression data and Affymetrix microarray data generated from the same samples (P < 0.001); however, the RNA-Seq data detected additional smoking- and cancer-related transcripts whose expression was either not interrogated by microarrays or not found to be significantly altered when using them, including smoking-related changes in the inflammatory genes S100A8 and S100A9 and cancer-related changes in MUC5AC and secretoglobin (SCGB3A1). Quantitative real-time PCR confirmed differential expression of select genes and non-coding RNAs within individual samples. These results demonstrate that transcriptome sequencing has the potential to provide new insights into the biology of the airway field of injury associated with smoking and lung cancer. The measurement of both coding and non-coding transcripts by RNA-Seq has the potential to help elucidate mechanisms of response to tobacco smoke and to identify additional biomarkers of lung cancer risk and novel targets for chemoprevention.
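Cross-platform agreement of this kind is often summarized with a rank correlation, because read counts and array intensities are on different scales. The abstract does not state which correlation measure was used; the sketch below assumes Spearman's coefficient, and the expression values are toy numbers, not data from the study.

```python
import math
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy per-gene values for one sample pool (hypothetical numbers).
rnaseq_counts = [5.0, 120.0, 33.0, 0.5, 800.0]
array_log2_intensity = [6.1, 9.8, 8.0, 5.9, 12.4]

rho = spearman(rnaseq_counts, array_log2_intensity)
```

Here the two platforms order the five genes identically, so the rank correlation is 1 even though the raw values differ by orders of magnitude.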
Adenocarcinomas of the ampulla of Vater are classified as biliary cancers, though the exact epithelium of origin for these cancers is not known. We sought to molecularly classify ampullary adenocarcinomas in comparison to known adenocarcinomas of the pancreas, bile duct, and duodenum by gene expression analysis.
We analyzed 32 fresh-frozen resected, untreated periampullary adenocarcinomas (8 pancreatic, 2 extrahepatic biliary, 8 duodenal, and 14 ampullary) using the Affymetrix U133 Plus 2.0 genome array. Unsupervised and supervised hierarchical clustering identified two subtypes of ampullary carcinomas that were molecularly and histologically characterized.
Hierarchical clustering of periampullary carcinomas segregated ampullary carcinomas into two subgroups, which were distinctly different from pancreatic carcinomas. Non-pancreatic periampullary adenocarcinomas were segregated into two subgroups with differing prognoses: 5-year RFS (77% vs. 0%, P = 0.007) and 5-year OS (100% vs. 35%, P = 0.005). Unsupervised clustering analysis of the 14 ampullary samples also identified two subgroups: a good-prognosis intestinal-like subgroup and a poor-prognosis biliary-like subgroup with 5-year OS of 70% vs. 28% (P = 0.09). Expression of CK7+/CK20- but not CDX-2 correlated with these two subgroups. Activation of the AKT and MAPK pathways was increased in the poor-prognosis biliary-like subgroup. In an independent 80-patient ampullary validation dataset, only histological subtype (intestinal vs. pancreaticobiliary) was significantly associated with OS in both univariate (P = 0.006) and multivariate analysis (P = 0.04).
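Five-year survival figures such as those above are typically read off a Kaplan-Meier curve, which accounts for patients whose follow-up ends before an event. The abstract does not name the estimator, so the sketch below assumes Kaplan-Meier; the follow-up times and event flags are toy values.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) at each event time.

    `events[i]` is 1 if the patient died at `times[i]`, 0 if censored there.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t (step function lookup)."""
    current = 1.0
    for time, survival in curve:
        if time > t:
            break
        current = survival
    return current


# Toy subgroup: follow-up in years, 1 = death, 0 = censored (hypothetical).
years = [1, 2, 3, 4, 6, 7]
died = [1, 0, 1, 1, 0, 0]

curve = kaplan_meier(years, died)
five_year_os = survival_at(curve, 5)
```

In this toy subgroup the estimate drops at years 1, 3, and 4, while the censored patient at year 2 only reduces the number at risk; comparing two such curves would normally use a log-rank test.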
Gene expression analysis discriminated pancreatic adenocarcinomas from other periampullary carcinomas and identified two prognostically relevant subgroups of ampullary adenocarcinomas. Histological subtype was an independent prognostic factor in ampullary adenocarcinomas.