This review focuses on recent research using genomics to examine lung carcinogenesis, histologic differentiation, and progression.
Lung cancer is the leading cause of cancer death in both men and women in the United States, despite its incidence being lower than that of prostate cancer in men and breast cancer in women. With 166,000 deaths expected in 2008, lung cancer deaths exceed those from prostate, breast, and colon cancer combined (1). Prostate, breast, and colorectal carcinoma have all demonstrated significant improvements in 5-year survival over time, with current rates of 99, 88, and 64%, respectively. In contrast, the 5-year survival rate for lung cancer has remained relatively stable at 15%.
There are several potential explanations for the disparity between lung cancer survival and that of other common tumors, including late detection and histologic heterogeneity. Currently, over 75% of new lung cancer diagnoses are made in patients presenting with distant or regional metastatic disease (2). Histologically, prostate, breast, and colorectal carcinomas are uniformly adenocarcinoma and treatment is primarily determined by clinical stage, at times modified by results of molecular assays (3, 4). In contrast, only 30% of lung carcinoma is adenocarcinoma. Yet for the most part, until recently, non–small cell lung cancers (squamous, large cell, adenocarcinoma) were treated similarly, regardless of the biological heterogeneity associated with histology. It is likely that poor historical lung cancer response rates are in part attributable to a relatively homogeneous approach to a heterogeneous disease. Efforts are underway to identify clinically relevant biological properties of tumors that will facilitate individualized lung cancer treatment.
This research takes advantage of technical advances that allow rapid high-throughput assays to interrogate the genome (mRNA, microRNA, copy number, mutation analyses), proteome, and epigenome. In this review, we will focus on transcriptional profiling studies that have advanced our understanding of malignant transformation of lung epithelial cells and of lung cancer differentiation and progression. These advances have direct relevance to the goals of early detection, accurate prognostic assessment, and targeted therapy.
Fifteen percent of lifetime smokers develop lung cancer, but 10% of lung cancers occur in never-smokers (5). In nonsmokers, exposure to secondhand smoke or to other lung carcinogens such as radon, asbestos, arsenic, or air pollution may be contributory. In both smokers and nonsmokers, genetic polymorphisms in genes associated with carcinogen metabolism, DNA damage repair, and cell cycle control may influence lung cancer susceptibility and modify the injury response associated with exposure to the dozens of carcinogens contained in tobacco smoke (6, 7). The heritable component of lung cancer risk due to these genetic polymorphisms is most notable in early-onset cases occurring with less total smoke exposure. However, because only a limited number of genes have been studied, the overall attributable risk remains small even in this group of patients (7, 8). Recently, three genome-wide association studies identified a locus on chromosome 15q associated with lung cancer (9–11). The region surrounding the common variant rs1051730 contains three genes encoding subunits of the nicotinic acetylcholine receptor (nAChR): CHRNA3, CHRNA5, and CHRNB4. The nAChR holoreceptor is a pentamer surrounding a gated channel responsive to ligands such as acetylcholine, nicotine, or its highly carcinogenic derivative 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (12). It is unclear whether the gene variant mediates susceptibility predominantly through influencing smoking behavior and nicotine dependence or through direct effects on lung tumor growth and suppression of apoptosis (13).
Because of tobacco smoke exposure, the bronchial epithelium of smokers is subject to field cancerization with alterations of airway cell DNA. Such alterations bring about oncogene activation, tumor suppressor gene silencing, and widespread loss of heterozygosity, all of which drive distinct gene expression signatures. While it is unclear which molecular alterations are required for neoplasia, in some instances, field carcinogenesis leads to malignant transformation of lung cancer progenitor cells. This malignant transformation may also require contributions from other systemic cell types such as bone marrow–derived stem cells (14, 15).
Microarray expression profiling has provided important information about cigarette smoke–associated lung carcinogenesis. Recently, we and others used gene expression profiling to identify gene signatures associated with cigarette smoking and lung adenocarcinoma in smokers and nonsmokers (16–20). Together, these results suggest that distinct cell transformation pathways and stromal tissue responses are involved in adenocarcinoma arising in smokers and nonsmokers (21). Observational studies using microarray analyses to profile bronchial epithelial cells obtained from bronchoscopic brushings of healthy smokers and nonsmokers identified overexpression of antioxidant genes and genes associated with xenobiotic metabolism by cells obtained from smokers (22, 23). However, these changes appeared to be potentially reversible when evaluated in a comparison of current versus former smokers. This transient genomic response to injury has been confirmed by in vitro studies using human bronchial epithelial cells exposed to cigarette smoke condensate and by in vivo rodent models of cigarette smoke exposure (24).
In addition, a field defect in smokers closely associated with lung cancer was reported by Spira and colleagues, who acquired gene expression signatures from nonmalignant bronchial cell brushings in a large number of smokers with and without lung cancer and observed that signatures from uninvolved cells in patients with cancer were distinct from those obtained from healthy smokers (25). This suggests that an epithelial “field defect” typically associated with cigarette smoke exposure may be specific for cancer. If validated prospectively, molecular testing of epithelial cells acquired throughout the aerodigestive tract may provide useful information for risk assessment.
Current paradigms suggest that lung carcinomas arise from pluripotent stem and progenitor cells capable of differentiation into one or several histologic cell types. These paradigms suggest that lung tumor cell ontology is determined by the consequences of gene transcriptional activation and/or repression events that recapitulate embryonic lung development (26, 27). The hypothesis that lung cancer arises from aberrant expression of genes involved in lung development is supported by gene expression studies demonstrating similarities between signatures obtained from human lung tumors and signatures characteristic of normal lung development (Figure 1). In an analysis of 32 non–small cell lung carcinoma (NSCLC) specimens and 7 normal specimens, unsupervised hierarchical analysis segregated tumors on the basis of histologic type and differentiation (28). Supervised clustering analysis of tumors identified numerous genes with known important function in embryonic lung development. Comparison of human lung tumor histology classifiers with genes temporally activated during mouse lung development reveals that genes expressed by large cell carcinoma (LCC) are similarly expressed during the early pseudoglandular and canalicular stages of lung development, while those expressed by adenocarcinoma mirror those expressed during the later terminal sac and alveolar stages. In addition to highlighting the expression of proliferation-associated genes by LCC and of differentiation-associated genes by adenocarcinoma, these results suggest a recapitulation of developmentally regulated pathways in lung tumors.
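The unsupervised hierarchical analysis described above can be illustrated with a minimal sketch. It assumes average-linkage agglomerative clustering with a correlation-based distance, which is a common choice for expression data but is not necessarily the method used in the cited study. The sample names and expression values are hypothetical toy data.

```python
import math

def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two expression profiles."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)

def average_linkage_clusters(profiles, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[name] for name in profiles]

    def cluster_distance(c1, c2):
        # Average pairwise distance between members of the two clusters.
        total = sum(pearson_distance(profiles[i], profiles[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical log-expression values for four unnamed genes per tumor.
toy_profiles = {
    "AD_1": [8.1, 7.9, 2.0, 2.2],
    "AD_2": [7.8, 8.3, 2.4, 1.9],
    "LCC_1": [2.1, 2.5, 8.0, 7.7],
    "LCC_2": [2.3, 1.8, 7.6, 8.2],
}
clusters = average_linkage_clusters(toy_profiles, n_clusters=2)
print(clusters)
```

In this toy example the tumors separate by their made-up histology labels without the labels being used, which is the sense in which such an analysis is unsupervised.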
Similar observations were reported by Bonner and colleagues, who compared gene expression profiles of human lung tumors with those obtained from a murine model of lung adenocarcinoma (29). Liu and colleagues (30) found that genes associated with early lung development were more often expressed by small cell lung carcinoma (SCLC) and by tumors associated with poor prognosis. In addition, Glinsky and colleagues reported that a gene signature of “stemness” derived from BMI-1–regulated genes in normal stem cells is associated with metastasis and survival in several tumor types, including NSCLC (31). Taken together, these observations suggest that poor differentiation is linked to molecular parameters of early development representing lung stem and progenitor cell programs, and that gene signatures of these phenotypes are important for lung cancer differentiation, progression, and clinical outcome.
One of the first studies using gene expression profiling to examine lung cancer histologic diversity compared carcinoids and small cell carcinoma tumors (32). Despite sharing neuroendocrine differentiation, the molecular profiles suggested different cells of origin for these tumor types. This molecular distinction was supported by a subsequent study from Bhattacharjee and colleagues (33) showing two distinct clusters of small cell carcinoma and carcinoid tumors, with small cell carcinoma characterized by frequent expression of proliferation markers such as MCM2, PCNA, MCM6, and thymidylate synthase. This was later confirmed by Virtanen and coworkers (34), who identified a carcinoid subgroup distinct from high-grade neuroendocrine carcinomas. Interestingly, large cell neuroendocrine carcinoma, a high-grade neuroendocrine carcinoma, was not readily distinguishable from small cell carcinoma by gene expression profiling. This observation suggested that despite being classified as an NSCLC subtype, large cell neuroendocrine carcinoma bears more similarity to small cell carcinoma than to other NSCLC subtypes, and illustrates how gene expression profiling can explain similar clinical features observed in histologically distinct tumors.
Researchers have examined gene expression profiles of squamous cell carcinomas (SCC) of the lung to address three major areas of concern: (1) to account for the histologic heterogeneity of NSCLC and to differentiate squamous cell carcinoma from other subtypes, (2) to identify prognostic factors specific to squamous cell carcinoma, and (3) to aid in distinguishing primary pulmonary SCC from metastatic SCC of the head and neck (HNSCC).
Several studies have identified gene expression profiles that distinguish pulmonary adenocarcinoma from pulmonary SCC (35–38). Notably, a shared set of genes encoding cytokeratins and cell adhesion proteins consistently distinguishes squamous cell carcinoma from adenocarcinoma. These classifiers include cytokeratin 14 and components of desmosomes, hemidesmosomes, and gap junctions, such as the integrin ITGB4, desmocollin, and desmoplakin.
A consistent finding of unsupervised clustering analysis in pulmonary SCC is the generation of two clusters not attributable to histology, differentiation, or stage (39, 40). Rather, these clusters were enriched in prognosis-associated genes encoding proteins with function in the cell cycle and proliferation. Larson and colleagues (41) compared gene expression profiles of node-negative SCC tumors (N0) with signatures from two groups of N1 tumors: those with ipsilateral hilar or peribronchial nodal involvement via direct nodal invasion, and those with regional lymph node metastases via lymphatics. N1 tumors with direct nodal invasion were similar to N0 tumors and distinct from N1 tumors with lymphatic spread, both in molecular features and in prognosis.
Although immunohistochemistry panels are effective in determining a site of primary origin for adenocarcinoma, equally effective panels do not exist for squamous cell carcinoma. Because primary SCC of the lung must be distinguished from metastatic HNSCC, this poses a diagnostic dilemma in patients with a history of HNSCC who subsequently develop a pulmonary nodule. Toward this end, Talbot and colleagues (42) developed a 500-gene classifier that distinguished pulmonary SCC from SCC of the tongue. In addition, Vachani and colleagues (43) used a training set of 18 HNSCC and 10 pulmonary squamous cell carcinomas to develop a 10-gene profile to distinguish these two tumor types. This classifier was validated on 122 subjects from previously published independent datasets, with 96% accuracy. Using real-time PCR, they correctly classified 12 independent samples as either of head and neck or pulmonary origin. These studies did not use microdissection; thus, it is possible that the signal from the host tissue may confound interpretation of gene signatures of tumor origin. However, if validated by others using microdissected specimens, these genomic classifiers hold promise for addressing important clinical staging issues in squamous cell carcinoma.
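The train-then-validate workflow behind such small gene classifiers can be sketched as follows. This is an illustrative nearest-centroid classifier, assumed here for simplicity; it is not the algorithm reported in the cited studies. The three-gene profiles and all expression values are hypothetical.

```python
def centroid(samples):
    """Mean expression per gene across a list of equal-length profiles."""
    n = len(samples)
    return [sum(values) / n for values in zip(*samples)]

def squared_distance(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_centroids(training_set):
    """Compute one centroid per tumor class from labeled training profiles."""
    return {label: centroid(samples) for label, samples in training_set.items()}

def classify(profile, centroids):
    """Assign the class whose centroid is closest to the profile."""
    return min(centroids,
               key=lambda label: squared_distance(profile, centroids[label]))

# Hypothetical training data: three made-up genes per tumor.
training_set = {
    "HNSCC": [[6.0, 2.0, 5.5], [6.4, 1.8, 5.9], [5.8, 2.3, 5.2]],
    "lung_SCC": [[2.1, 6.2, 3.0], [1.9, 5.8, 3.4], [2.5, 6.5, 2.8]],
}
centroids = train_centroids(training_set)

# Hypothetical independent validation samples with known origin.
validation = [
    ([6.1, 2.2, 5.0], "HNSCC"),
    ([2.0, 6.0, 3.1], "lung_SCC"),
    ([5.5, 2.9, 4.8], "HNSCC"),
]
correct = sum(classify(profile, centroids) == label
              for profile, label in validation)
accuracy = correct / len(validation)
print(f"validation accuracy: {accuracy:.2f}")
```

The key design point is that accuracy is measured only on samples that played no part in computing the centroids, mirroring the use of independent datasets for validation.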
Somatic DNA alterations including mutations, amplifications, deletions, and translocations are common in tumors and are required for the activation of oncogenes and the inactivation of tumor suppressor genes that drive carcinogenesis. Large-scale efforts to systematically characterize these alterations in lung cancer and other tumors (44, 45) have provided insights into DNA structural alterations in non–small cell lung cancer. Adenocarcinoma, the most frequent histologic subtype of NSCLC, is characterized by multiple somatic DNA alterations. Recently, Weir and colleagues comprehensively examined DNA copy number alterations in 371 lung adenocarcinoma specimens (46) and found that the top focal regions of amplification and deletion included 14q13.3, 12q15, 8q24.21, 7p11.2, and 8q21.13. These results confirmed amplifications and deletions reported in other smaller studies, and also identified novel alterations such as amplification of the transcription factor TTF-1 on chromosome 14q13.3. TTF-1 encodes thyroid transcription factor 1, a member of the Nk-2 homeobox family that binds to and activates the promoter of thyroid- and lung-specific genes. Interestingly, other groups using similar approaches also identified TTF-1 amplification as a frequent lung adenocarcinoma alteration, thus providing independent validation of this finding (47, 48). Together, these observations link lung tumor differentiation states and histologic grade with epithelial cell developmental pathways (28, 49), and suggest that these DNA amplification events are important in mediating lung cancer initiation, differentiation, and progression.
Lung adenocarcinoma, the most frequent histologic type of NSCLC, is heterogeneous. Histologically, subclassification of adenocarcinoma is based upon World Health Organization (WHO) criteria, determined predominantly by cell morphology and growth pattern, along a spectrum including noninvasive bronchioloalveolar carcinoma (BAC), adenocarcinoma with mixed subtypes (AC-Mixed), and pure invasive adenocarcinoma (IAC) (50). Lung cancer metastasis represents the final step of a complex sequence composed of invasion (loss of cell–cell adhesion, increased cell motility, and basement membrane degradation), vascular intravasation and extravasation, establishment of a metastatic niche, and angiogenesis. Recent research has focused on characterizing the molecular mechanisms and clinical implications of adenocarcinoma invasion, the initial step of metastasis.
Paralleling malignancies in other organs, such as breast and cervix, where tumors are defined as noninvasive (in situ carcinoma), microinvasive (microscopic invasion), or as invasive carcinomas, the extent of the invasive component seen in lung adenocarcinoma is associated with clinical outcomes. Notably, this is most relevant in the lung for adenocarcinoma, because the noninvasive component is generally minimal in squamous cell carcinoma and often absent in large cell carcinoma. The clinical importance of lung adenocarcinoma invasion is supported by several recent studies (51–56) indicating that the risk of death is significantly lower for noninvasive BAC tumors, and for tumors with less than 0.6 cm of fibrosis or linear invasion, than for pure invasive tumors. These results suggest that the prognosis of BAC is favorable, and suggest a similarly favorable prognosis for the subset of AC-Mixed subtype tumors with limited invasion (<0.6 cm).
We (57) and others (58–60) have used microarray gene expression profiling of lung adenocarcinoma tumors to identify signatures associated with histologic subtype and invasion. The results of unsupervised analyses are striking in that in each instance, tumors segregate into three major clades composed predominantly of BAC, AC-Mixed subtype, and pure invasive tumors, providing biological plausibility for the notion that these adenocarcinoma subtypes are distinct tumors. Taken together with the clinical data indicating distinct prognoses for each subtype, these studies have motivated consideration of revising the WHO lung adenocarcinoma classification scheme to reinforce the designation of purely noninvasive tumors and to create a designation for minimally invasive tumors.
To identify the molecular pathways important in mediating the acquisition of invasion by lung adenocarcinoma, we performed supervised analysis of mRNA microarray data to identify genes differentially expressed in noninvasive BAC and in AC-mixed type tumors. Among the genes differentially expressed in the progression from BAC to invasive tumors was the type II transforming growth factor β receptor (TGFβRII), which was less highly expressed by AC-Mixed and solid invasive tumors compared with BAC. This finding, which suggested that TGFβRII repression was required for lung adenocarcinoma invasion, is supported by genetic models combining targeted deletion of TGFβRII with other oncogenic events such as Apc mutation in colon tumors and Kras mutations in pancreatic and oropharyngeal carcinomas (61–63). The phenotypes of these TGFβRII-deficient cancer models clearly demonstrate the importance of TGF-β signaling in tumor invasion, yet the downstream signaling mechanisms are undefined.
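A supervised comparison of this kind ranks genes by how strongly their expression differs between two predefined tumor groups. The sketch below assumes Welch's t-statistic as the ranking measure, which is one common choice and not necessarily the statistic used in our analysis. TGFBR2 is the gene symbol for TGFβRII; GENE_X and all expression values are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t-statistic for two groups with possibly unequal variances."""
    standard_error = math.sqrt(variance(group_a) / len(group_a)
                               + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error

def rank_genes(expression, samples_a, samples_b):
    """Rank genes by absolute t-statistic, largest difference first."""
    scores = {}
    for gene, values in expression.items():
        a = [values[s] for s in samples_a]
        b = [values[s] for s in samples_b]
        scores[gene] = welch_t(a, b)
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical log-expression values for three BAC and three AC-Mixed tumors.
expression = {
    "TGFBR2": {"BAC_1": 9.1, "BAC_2": 8.8, "BAC_3": 9.4,
               "MIX_1": 6.2, "MIX_2": 6.6, "MIX_3": 5.9},
    "GENE_X": {"BAC_1": 7.0, "BAC_2": 7.4, "BAC_3": 6.8,
               "MIX_1": 7.1, "MIX_2": 6.9, "MIX_3": 7.3},
}
bac = ["BAC_1", "BAC_2", "BAC_3"]
mixed = ["MIX_1", "MIX_2", "MIX_3"]

ranked = rank_genes(expression, bac, mixed)
for gene, t in ranked:
    print(f"{gene}: t = {t:.2f}")
```

A positive statistic here means higher expression in BAC than in AC-Mixed tumors. With thousands of genes tested at once, a real analysis would also need to correct for multiple comparisons, which this sketch omits.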
We used a tumor cell invasion system and microarray analysis to identify and characterize downstream mediators of TGF-β signaling important for lung adenocarcinoma invasion (57). Among potential mediators identified was the chemokine CCL5 (RANTES [regulated on activation, normal T cell expressed, and presumably secreted]), which is up-regulated by invasive tumors and TGFβRII knockdown cells. RANTES is involved in immunoregulatory and inflammatory processes and is transcribed and secreted not only by T cells, other inflammatory cells, and stromal cells, but also by tumor cells and normal bronchial epithelium. RANTES is a ligand for chemokine receptors CCR1, CCR3, CCR4, and CCR5, which are expressed on epithelial cells, macrophages, lymphocytes, dendritic cells, and stromal cells (64–67). Inhibition of RANTES signaling was found to be associated with abrogation of tumor invasion, suggesting that RANTES is required for invasion in TGFβRII repressed lung adenocarcinoma cells (Figure 2). The clinical significance of this pathway is further supported by the finding that tumor expression of both RANTES and CCR5 by a large panel of lung adenocarcinoma is associated with patient survival. These studies illustrate how information gained from global expression profiling of tumors can be used to identify key pathways and genes mediating tumor growth, invasion, and metastasis.
Proof of concept that tumor mRNA profiling provides clinically significant information regarding patient outcome after resection has been established. The promise of this approach to segregate tumors and to direct patient treatment based on these signatures is being prospectively tested in two breast cancer clinical trials. For NSCLC, several predictors have been developed, which for the most part are based on methodologically sound approaches that include independent validation. As indicated in Table 1, the results of these studies are heterogeneous both in terms of the number of genes in the predictors and in the specific genes included in each signature. This heterogeneity is expected, given differences in study design, assay platform, tumor histology, and patient selection. Subsequently, Shedden and colleagues published the results of a large, multicenter, blinded evaluation of several genomic signatures of prognosis in 442 adenocarcinomas (68). The addition of clinical covariates enhanced the performance of novel signatures, which performed slightly better than previously published signatures (69, 70). None of the signatures performed significantly better than the others. This landmark study is a model for the meticulous handling of challenges inherent in translational cancer genomic studies, and its vast repository of clinically and pathologically annotated data is now available for testing of new signatures and hypotheses. It remains unclear which genomic predictor of prognosis is best or whether specific genes or entire signatures are most important in predicting outcome. Independent prospective evaluation of the predictive accuracy of these signatures, prospective clinical trials, and application to small biopsy specimens (71) will be required to extend this important area of research.
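Comparing prognostic signatures requires a common measure of how well each one orders patients by outcome. The sketch below assumes Harrell's concordance index, a widely used summary for survival predictors; it is offered as an illustration and is not a description of the evaluation protocol in the cited study. Follow-up times, event indicators, and both sets of risk scores are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable patient pairs ordered correctly.

    A value of 1.0 means higher risk scores always go with earlier events;
    0.5 is what random scores would give on average.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's last follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    return concordant / comparable

# Hypothetical follow-up in months; event 1 = death observed, 0 = censored.
times = [12, 30, 45, 60, 20]
events = [1, 1, 0, 0, 1]

# Hypothetical risk scores from two competing signatures.
signature_a = [0.9, 0.6, 0.3, 0.1, 0.8]
signature_b = [0.2, 0.7, 0.5, 0.1, 0.9]

c_a = concordance_index(times, events, signature_a)
c_b = concordance_index(times, events, signature_b)
print(f"signature A: C = {c_a:.2f}")
print(f"signature B: C = {c_b:.2f}")
```

Because censored patients contribute only as the later member of a pair, the index uses incomplete follow-up without treating censoring as an event. Small differences in C between signatures on a single cohort should be interpreted cautiously.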
Gene expression profiling studies of lung tumors not only provide important information regarding prognosis and survival, but also help to identify potential therapeutic targets by offering key insights into how genetic alterations affect lung tumorigenesis and metastasis. As medicine becomes increasingly reliant on genomic information, these studies hold much promise for the diagnosis and treatment of lung cancer.
Supported by the NIH (1RO1CA120174 to C.A.P.), the American Cancer Society (RSG-CNE-108857), Joan's Legacy Foundation, and by the Flight Attendants Medical Research Institute.
Conflict of Interest Statement: A.C.B. is a co-inventor involved in a patent application “Gene expression profiles correlating with histology and prognosis” measuring gene expression of a set of genes with positive correlation with histologic subtype and prognosis. To date, no financial benefit has been derived from this application. R.L.T. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. C.A.P. received $125,000 in 2006–2007 from Philip Research, NA as research grants and is co-inventor for a pending patent filed by Columbia University for expression profiles correlating with histology and prognosis.