Lung cancer is the leading cause of cancer-related death in men and women in the United States accounting for approximately 28% of total cancer deaths in 2010 despite comprising only ~15% of new cancer cases1. Decades of research have contributed to our understanding that lung cancer is a multi-step process involving genetic and epigenetic alterations where resulting DNA damage transforms normal lung epithelial cells into lung cancer2,3. It is not known whether all lung epithelial cells or only a subset of these cells (such as pulmonary epithelial stem cells or their immediate progenitors) are susceptible to full malignant transformation. Additionally, while the tumor initiating cell may have only a handful of mutations, as the tumor expands cells may acquire additional mutations4. Smoking damages the entire respiratory epithelium and thus “field cancerization” or “field defects” (molecular changes) are observed in histologically normal lung epithelium, as well as a variety of histologic preneoplastic/premalignant lesions, which also harbor molecular abnormalities common to the adjacent tumor5. The culmination of these changes leads to lung cancers exhibiting all the “hallmarks of cancer” (including self-sufficiency of growth signals, insensitivity to growth-inhibitory (anti-growth) signals, evasion of programmed cell death (apoptosis), limitless replicative potential, sustained angiogenesis, and tissue invasion and metastasis)6,7. Lung cancer is a heterogeneous disease clinically, biologically, histologically and molecularly. Understanding the molecular causes of this heterogeneity is the focus of current research and these could reflect changes occurring in different classes of epithelial cells or different molecular changes occurring in the same target lung epithelial cells. 
Identifying the genes and pathways involved, determining how they relate to the biologic behavior of lung cancer and their utility as diagnostic and therapeutic targets are important basic and translational research issues. Thus, current information on the key molecular steps in lung cancer pathogenesis and their timing in preneoplasia, primary cancer, and metastatic disease and the clinical implications is the subject of this review.
The two main types of lung cancer, non-small cell lung cancer (NSCLC) (representing 80–85% of cases) and small cell lung cancer (SCLC) (representing 15–20%) are identified based on histological, clinical and neuroendocrine characteristics. NSCLC and SCLC also differ molecularly with many genetic alterations exhibiting subtype specificity. NSCLC can be further histologically subdivided into adenocarcinoma, squamous carcinoma, large cell carcinoma (including large cell neuroendocrine lung cancers), bronchoalveolar lung cancer, and mixed histologic types (e.g. adenosquamous carcinoma). Common molecular differences between these major NSCLC subtypes and between NSCLC and SCLC are outlined in Table 1. These differences, as well as advances in both conventional and targeted therapy, signify the importance of stratifying NSCLC tumors by subtype for prognostic and predictive purposes and molecular studies8.
Approximately 85% of lung cancers are caused by carcinogens present in tobacco smoke, while worldwide, 15–25% of lung cancer cases occur in lifetime “never smokers” (fewer than 100 cigarettes in a lifetime). These etiologic differences are associated with distinct differences in tumor acquired molecular changes and are discussed later in this review9,10. While the general public associates lung cancer with smoking, the sheer number of lung cancer cases overall means that lung cancer occurring in lifetime never smokers is also a major public health problem. Likewise, over 50% of newly diagnosed lung cancers in the USA occur in “former smokers” who changed their lifestyle but in whom the damage caused by past smoking still led to the development of lung cancer. Thus, it will be important to identify the non-smoking related etiologies of lung cancer arising in “never smokers” as well as methods to identify which former smokers are most likely to develop clinically evident lung cancer.
There has been intense study of inherited predisposition to lung cancer including study of polymorphisms associated with lung cancer risk (reviewed11,12) and familial linkage studies. In 2008, three independent genome-wide association studies (GWASs) found that single nucleotide polymorphism (SNP) variations at 15q24-q25.1 were associated with an increased risk of both nicotine dependence and developing lung cancer13–15. This locus includes genes encoding nicotinic acetylcholine receptor (nAChR) subunits (CHRNA5, CHRNA3, and CHRNB4). More recently, two meta-analyses have provided further evidence that variation at 15q25.1, 5p15.33, and 6p21.33 influences lung cancer risk16,17. It has not yet been elucidated whether these nAChR polymorphisms act mechanistically through nicotine addiction, carcinogenic derivatives of nicotine exposure, or the effect of nicotine acting on nAChRs known to be expressed in lung epithelial cells18–26. In addition, a genome-wide linkage study of pedigrees containing multiple generations of lung cancer from the Genetic Epidemiology of Lung Cancer Consortium (GELCC) mapped a familial susceptibility locus to 6q23-2527,28. A member of the regulator of G-protein signaling (RGS) family, RGS17, was identified as a potential causal gene within this locus, where common variants were associated with familial, but not sporadic, lung cancer29; however, it is likely that more than one genetic locus in the 6q region influences susceptibility.
Never smoking lung cancers represent a distinct epidemiological, clinical and molecular disease from smoking lung cancers. If considered independently, never smoking lung cancers comprise the seventh most common cause of cancer death30. Never smoking lung cancer occurs more frequently in women and East Asians, has a peak incidence at a younger age, targets the distal airways, is usually adenocarcinoma, and frequently has acquired EGFR mutations, making it very responsive to EGFR targeted therapies9,31–36. Table 2 outlines the molecular differences between smoking and never smoking lung cancers.
Human papilloma virus (HPV), an established human carcinogen (for both uterine cervical and head and neck cancer), has been proposed to play a role in lung cancer pathogenesis; however, published data remain controversial. The presence of HPV oncoproteins E6 and E7 leads to inactivation of tumor suppressors p53 and Rb, respectively37,38. A meta-analysis of 53 publications comprising 4,508 cases found a mean incidence of HPV positive lung cancer of 25%, detected in all subtypes of lung cancer39. Geographically, European and American studies had a lower incidence of 15–17% while Asian lung cancer cases reported a mean incidence of 38%. In an effort to overcome sample and detection limitations of earlier studies, a recent case-control study of ~400 lung cancer patients of European descent, representing the largest study to date, found no evidence of an association of HPV and lung cancer40. While HPV will likely be primarily found in lung cancer arising in Asian populations, the detection of oncogenic variants of HPV in some tumors and the wealth of knowledge of the role of HPV oncoproteins suggest that a subset of lung cancer will have HPV infection as a major etiologic feature. It will be important to characterize other molecular alterations in these lung cancers, and how they respond to various therapies, given the differences in response of head and neck cancer associated with HPV to EGFR targeted therapy.
Characterization of the molecular changes in lung cancer and associated preneoplastic cells is becoming increasingly well-defined, aided immeasurably by the continued advancement of both clinical and genomic tools. Improved detection and sampling of clinical specimens using, for instance, fluorescent bronchoscopy, endobronchial ultrasound and laser capture microdissection enable precise analysis of abnormal epithelial cells. Introduction of high-resolution and high-throughput genomic tools (described in more detail later in this review) has facilitated the identification and characterization of key molecular changes – often involving oncogenes and tumor suppressor genes (TSGs) – and importantly, the associated “tumor cell acquired vulnerabilities” that accompany these oncogenotype changes (Figure 1). The key new concept that applies to many cancers, including lung cancer, is that with the genetic and epigenetic changes that occur during carcinogenesis the cancer both becomes dependent on (“addicted” to) the continued presence/function of these changes and must also make other cellular adaptations, including mutations, to minimize the “oncogene stress” induced by these changes. While mutated oncogenic proteins themselves are therapeutic targets (see discussion of mutant EGFR below), the other cellular adaptations which are present in tumor but not normal cells also become cancer specific therapeutic targets. The cancer needs both the oncogenic changes and the cellular adaptations that allow it to tolerate them – that is, the oncogenic changes are “synthetically lethal” with the adaptation changes. Thus, both of these are potential therapeutic targets that can be discovered by genome wide functional approaches such as siRNA library screening (see below).
Together, these advances promote our understanding of the development and progression of lung cancer, which is of fundamental importance for improving the prevention, early detection, and treatment of this disease. Ultimately these findings need to be translated to the clinic by using molecular alterations as: biomarkers for early detection and risk assessment; targets for prevention; signatures for personalizing prognosis and therapy selection for each patient; and as therapeutic targets to selectively kill or inhibit the growth of lung cancer.
Chronic exposure to tobacco smoke carcinogens propels genetic and epigenetic damage which can result in lung epithelial cells steadily acquiring growth and/or survival advantages. Malignant transformation is characterized by genetic instability which can exist at the chromosomal level (with large-scale loss or gain of genomic material, translocations, and microsatellite instability), at the nucleotide level (with single or several nucleotide base changes), or in the transcriptome (with altered gene expression). Abnormalities are typically targeted to proto-oncogenes, TSGs, DNA repair genes and other genes that can promote outgrowth of affected cells. Activation of telomerase (the telomere-lengthening enzyme required for cell immortality) and disruption or escape from apoptotic pathways are other common events in cancer cells. Over the past 5–10 years there has been a revolution in technologies that can be applied to determining all of the genetic and epigenetic changes in lung cancer as well as other cancers. These include genome-wide mRNA expression profiles, genome-wide DNA copy number variation changes, genome-wide DNA methylation changes, miRNA changes and mass spectrometry proteomics analyses. The recent application of “next generation” (“NexGen”) sequencing technologies has led to the first genome-wide mutational analyses of lung cancers compared to normal germline DNA41–43. These have demonstrated a huge number of mutations occurring in lung cancers arising in smokers, many changes that do not alter the coding sequences, and many changes that are unique to the particular tumor (see below in “Genomics” section). Within the next several years there will be similar data on perhaps 1,000 lung cancers which will provide an unprecedented amount of information.
The key issues will be to determine which of these mutations are “actionable” (that is, provide a guide for targeting therapy), which are “passenger” and which are “driver” mutations, how frequent the mutations are, how the mutations are related to other molecular changes (e.g. in the epigenome and miRNAs), and which mutations provide information to identify important subgroups (“molecular portraits”) of lung cancer that provide prognostic (survival information independent of therapy) and/or predictive (survival information dependent on the administration of specific therapies) utility. Of course this will require large scale multidisciplinary and international collaboration to bring together lung cancer specimens that are both clinically and molecularly annotated. Examples of this are the USA NCI “The Cancer Genome Atlas” program (TCGA), the NCI Lung Cancer Mutation Consortium (LCMC), as well as international lung cancer sequencing consortia. A key component of this is to be able to perform mutation testing of clinically available materials (such as formalin fixed paraffin embedded [FFPE] specimens) in a timely fashion using clinical laboratory practices (CLIA certified laboratory methods). Recently, the NCI’s LCMC performed such a study on >800 lung adenocarcinoma tumor specimens examining mutations in established lung cancer driver genes (EGFR, KRAS, BRAF, HER2, AKT1, NRAS, PIK3CA, MEK1, EML4-ALK, MET amplification). Mutations in at least one of these genes were found in ~60% of tumor specimens and >90% of these were mutually exclusive, that is, only one of the mutations was found in a particular tumor44. Table 1 describes the current state of our knowledge of the common genetic alterations found in lung cancer. A key element will be to make this information accessible and understandable to patients and physicians not expert in cancer genomics.
An example of how patients and their physicians can interface with this data is the “My Cancer Genome” website established by the Vanderbilt Cancer Center (http://www.vicc.org/mycancergenome/).
As in many solid tumors, genomic instability is a hallmark of lung cancer3. Mapping high-level amplifications and deletions in copy-number throughout the cancer genome has led to the identification of many oncogenes and TSGs45–62. Many genetic alterations have been associated with lung cancer, with the more frequently observed changes including aneuploidy, specific allelic loss at 3p, 4q, 9p, and 17p and gain at 1q, 3q, 5p, and 17q63–65. Additionally, genetic alterations in several genes have been implicated in lung cancer development, including activation of MYC, RAS, EGFR, NKX2-1, ERBB2, SOX2, BCL2, FGFR2, and CRKL as well as inactivation of RB1, CDKN2A, STK11 and FHIT3,63,65–80.
Identification of the genetic alterations that occur in tumors has long been an important approach to understanding tumorigenesis. Early techniques to analyze the cancer genome involved cytogenetic karyotyping, loss of heterozygosity (LOH) and microsatellite analyses, followed later by comparative genomic hybridization (CGH) using metaphase spreads or fluorescence in situ hybridization (FISH). These techniques identified multiple numeric and structural chromosomal alterations in the cancer genome; however, the shift of CGH into a microarray-based format improved upon previous techniques by providing high-resolution detection of copy-number gain and loss56,79,81–92. Because the low resolution of earlier cytogenetic and CGH techniques made it difficult to identify focal aberrations and the causal genes critical for tumorigenesis, aberrant loci/genes in lung carcinogenesis continue to be defined with these higher-resolution methods75–80.
Oncogene activation probably occurs in all lung cancers (typically by gene amplification, over-expression, point mutation, or DNA rearrangements) and can result in persistent upregulation of mitogenic growth signals which induce cell growth as well as “oncogene addiction” whereby the cell becomes dependent upon this aberrant oncogenic signaling for survival (Table 1)48,50–52,56,58,60,62,74,93,94. In lung cancer, commonly activated oncogenes include EGFR, ERBB2, MYC, KRAS, MET, CCND1, CDK4, EML4-ALK fusion, and BCL2. These “driver” oncogenes or oncogene “addictions” represent acquired conditional (on the oncogene) vulnerabilities in lung cancer cells, and are significant therapeutic targets because they offer specificity of killing tumor but not normal cells. Oncogenic signaling pathways commonly found in lung cancer and potential targeted therapies are summarized in Figures 2–5 and Table 3 (also see article in this issue by Gettinger et al.).
The ErbB family of tyrosine kinase receptors includes four members – EGFR, ErbB-2 (HER2), ErbB-3, and ErbB-4 – with ability to form homo- and heterodimers and bind different ligands leading to receptor activation (Figure 2)95. EGFR exhibits over-expression or aberrant activation in 50–90% of NSCLCs; therefore, much effort has been focused on the development of targeted inhibitors for this molecule96. Initial research used monoclonal antibodies that target the extracellular domain but this was supplanted by the development of small molecules that inhibit intracellular EGFR tyrosine kinase activity: EGFR tyrosine kinase inhibitors (TKIs). In 2004, a significant advancement was made in the treatment of NSCLC following the observation that somatic mutations in the kinase domain of EGFR strongly correlated with sensitivity to EGFR TKIs50,51. Exquisite sensitivity and marked tumor response have since been shown with EGFR TKIs (such as erlotinib and gefitinib) and antibodies (such as cetuximab) in EGFR mutant tumors50–52,97,98 – an example of oncogene addiction in lung cancer where tumors initiated through EGFR mutation-activation of EGF signaling rely on continued EGF signaling for survival. Mutant EGFRs (either by exon 19 deletion or exon 21 L858R mutation) show an increased amount and duration of EGFR activation compared with wildtype receptors50, and have preferential activation of the PI3K/AKT and STAT3/STAT5 pathways rather than the RAS/RAF/MEK/MAPK pathway98. EGFR mutations are particularly prevalent in certain patient subgroups: adenocarcinoma histology, women, never smokers, and East Asian ethnicity52,99–103. Resistance to TKI therapy has been associated with EGFR exon 20 insertions or a secondary T790M mutation, KRAS mutation, or amplification of the MET proto-oncogene104–109, where MET activates the PI3K pathway through phosphorylation of ERBB3, independent of EGFR and ERBB2109.
Importantly, inhibition of MET signaling was found to restore sensitivity to TKIs109. In lung adenocarcinomas, activated mutant EGFR has been shown to increase levels of IL-6, leading to activation of STAT3110. IL-6 also plays an important role by activation of JAK family tyrosine kinases111, which in turn activate multiple pathways through signaling molecules such as STAT3, MAPK, and PI3K112.
Activation of the RAS/RAF/MEK/MAPK pathway occurs frequently in lung cancer (Figure 3), most commonly via activating mutations in KRAS which occur in ~20% of lung cancers, particularly adenocarcinomas113,114. In lung cancer, 90% of RAS mutations are located in KRAS (80% in codon 12, and the remainder in codons 13 and 61) with HRAS and NRAS mutations only occasionally documented115. Mutation results in constitutive activation of downstream signaling pathways, such as PI3K and MAPK, rendering KRAS mutant tumors independent of EGFR signaling and therefore resistant to EGFR TKIs as well as chemotherapy97,106,116. KRAS mutations are mutually exclusive with EGFR and ERBB2 mutations and are primarily observed in lung adenocarcinomas of smokers97,117. The prevalence and importance of KRAS in lung tumorigenesis make it an attractive therapeutic target. Two unsuccessful approaches were farnesyltransferase inhibitors, to inhibit posttranslational processing and membrane localization of RAS proteins, and antisense oligonucleotides against RAS113. More recently, efforts have been centered on downstream effectors of RAS signaling: RAF kinase and mitogen-activated protein kinase (MAPK) kinase (MEK)113,118. BRAF is the direct effector of RAS; while commonly mutated in melanoma (~70%), BRAF mutations are rare in lung cancer (~3%), occur predominantly in adenocarcinoma, and are mutually exclusive with EGFR and KRAS mutations119–122. Strategies to inhibit RAF kinase include degradation of RAF1 mRNA through antisense oligodeoxyribonucleotides, and inhibition of kinase activity with multikinase inhibitors such as sorafenib. Several MEK inhibitors have commenced Phase II testing in lung cancer patients and are listed in Table 3. Attempts to directly inhibit or perturb mutant KRAS continue with the advent of whole-genome approaches. Synthetic lethal siRNA screens have identified small interfering RNAs (siRNAs) that specifically kill human lung cancer cells with KRAS mutations in vitro123–125.
Additionally, combination of anti-KRAS strategies (such as depletion with short-hairpin RNAs (shRNAs)) with other targeted drugs has shown potential therapeutic utility126–128.
One of the major downstream effectors of the RAS/RAF/MEK/MAPK pathway is the MYC proto-oncogene (Figure 3). In normal conditions this transcription factor functions to keep tight control of cellular proliferation; however, aberrant expression through amplification or over-expression is commonly found in lung cancer129,130. MYC proto-oncogene members (MYC, MYCN and MYCL) are targets of RAS signaling and key regulators of numerous downstream pathways such as cell proliferation131 where enforced Myc expression drives cell cycle in an autonomous fashion. It can also sensitize cells to apoptosis through activation of the mitochondrial apoptosis pathway – thus, Myc driven tumorigenesis often requires co-expression of anti-apoptotic BCL2 proteins132. Activation of MYC members often occurs through gene amplification. MYC is most frequently activated in NSCLC133, while the other two members, MYCN and MYCL along with MYC, are usually activated in SCLC64,134.
In 2007, a novel fusion gene with transforming ability was reported in a small subset of NSCLC patients135. Formed by the inversion of two closely located genes on chromosome 2p, fusion of PTK echinoderm microtubule-associated protein like-4 (EML4) with anaplastic lymphoma kinase (ALK), a transmembrane tyrosine kinase, yields the EML4-ALK fusion protein. The fusion results in constitutive oligomerization leading to persistent mitogenic signaling and malignant transformation. A recent meta-analysis of 13 studies encompassing 2,835 tumors reported that the EML4-ALK fusion protein is present in 4% of NSCLCs136. EML4-ALK fusions are found exclusive of EGFR and KRAS mutations, and occur predominantly in adenocarcinomas and never or light smokers. Tumors with EML4-ALK fusions exhibit dramatic clinical responses to ALK targeted therapy137–141 and the ALK inhibitor crizotinib (PF-02341066) has now entered a Phase III clinical trial.
Phosphoinositide 3-kinases (PI3Ks) are lipid kinases that regulate cellular processes such as proliferation, survival, adhesion and motility142. The PI3K/AKT/mTOR pathway is a downstream signaling pathway of several receptor tyrosine kinases, such as EGFR, and can also be activated via binding of PI3K to activated RAS143. In lung tumorigenesis, activation of the PI3K/AKT/mTOR pathway occurs early in pathogenesis, generally through mutations in PI3K or PTEN as well as EGFR or KRAS, amplification of PIK3CA, PTEN loss, or activation of AKT144 and results in cell survival through inhibition of apoptosis (Figure 4). The pathway has two negative regulators: the tumor suppressor gene PTEN and the TSC1/TSC2 complex, which act upstream and downstream of AKT, respectively. The serine/threonine kinase mTOR, a downstream effector of AKT, is an important intracellular signaling enzyme in the regulation of cell growth, motility, and survival in tumor cells145. Targeted therapies to the PI3K/AKT/mTOR pathway (such as LY294002 and rapamycin) have shown significant efficacy in both NSCLC and SCLC cells with activated AKT signaling146–148.
Genome-wide screens for DNA copy number changes in primary NSCLCs have led to the identification of recurrent, histologic subtype-specific focal amplification at 14q13.3 (adenocarcinoma) and 3q26.33 (squamous cell carcinoma)74,75,80,93,149. Functional analysis identified NKX2-1 (also termed TITF1) and SOX2 as the respective targets of these amplifications. NKX2-1 encodes a lineage-specific transcription factor essential for branching morphogenesis in lung development and the formation of type II pneumocytes – the cells lining lung alveoli150,151. Initial studies reported on the oncogenic role of NKX2-1 in lung adenocarcinoma74,93,149,152; however, recent in vivo data suggest it also has a tumor suppressive role153. SOX2 amplification was identified specifically in squamous cell carcinomas, and SOX2 is required for normal esophageal squamous development75,80. Amplification of tissue-specific transcription factors in cancer has been previously observed in prostate cancer (AR)154, melanoma (MITF)155, and breast cancer (ESR1)156. These findings have led to the development of a “lineage-dependency” concept in tumors157 where the survival and progression of a tumor is dependent upon continued signaling through specific lineage pathways (i.e. abnormal expression of pathways involved in normal cell development) rather than continued signaling through the pathway of oncogenic transformation as seen with oncogene addiction94.
Loss of TSG function is an important step in lung carcinogenesis and usually results from inactivation of both alleles with LOH inactivating one allele through chromosomal deletion or translocation, and point mutation, epigenetic or transcriptional silencing inactivating the second allele158,159. Commonly inactivated TSGs in lung cancer include TP53, RB1, STK11, CDKN2A, FHIT, RASSF1A and PTEN.
TP53 (17p13) encodes a phosphoprotein which prevents accumulation of genetic damage in daughter cells. In response to cellular stress, p53 induces the expression of downstream genes such as cyclin-dependent kinase (CDK) inhibitors which regulate cell cycle checkpoint signals, causing the cell to undergo G1 arrest and allowing DNA repair or apoptosis159 (Figure 5). p53 inactivating mutations are the most common alterations in lung cancer where 17p13 frequently demonstrates hemizygous deletion and mutational inactivation in the remaining allele160–162. Some point mutations in TP53 confer a gain-of-function phenotype leading to increased aggressiveness of lung cancer163. Due to the prevalence of p53 inactivating mutations in human cancers large scale efforts have been focused on therapeutic strategies to restore normal p53 function. These include re-introduction of wildtype p53 using gene therapy, pharmacological rescue of mutant p53 with small molecule agents and peptides, blocking of MDM2 expression, inhibiting MDM2 ubiquitin ligase activity, and targeting the p53-MDM2 interaction with small molecule inhibitors. In vivo restoration of p53 expression in a subpopulation of tumor cells has been achieved with p53 gene therapy of lung cancer patients164.
The CDKN2A-RB1 pathway controls G1 to S phase cell cycle progression (Figure 5). Hypophosphorylated retinoblastoma (RB) protein, encoded by RB1, halts the G1/S phase transition by binding to the transcription factor E2F1; RB1 was the first tumor suppressor gene identified in lung cancer165,166. Absent or mutant RB protein is found in approximately 90% of SCLCs compared to only 10–15% of NSCLCs, while abnormalities in p16 (encoded by CDKN2A), an upstream regulator of RB phosphorylation, are predominantly found in NSCLCs167.
Loss of one copy of chromosome 3p is one of the most frequent and early events in human cancer, found in 96% of lung tumors and 78% of lung preneoplastic lesions168. Mapping of this loss identified several genes with functional tumor suppressing capacity including FHIT (3p14.2), RASSF1A, TUSC2 (also called FUS1), and semaphorin family members SEMA3B and SEMA3F (all at 3p21.3), and RARβ (3p24). In addition to LOH or allele loss, some of these 3p genes (FHIT, RASSF1A, SEMA3B and RARβ) often exhibit decreased expression in lung cancer cells by means of epigenetic mechanisms such as promoter hypermethylation169–173. Furthermore, FHIT, RASSF1A, TUSC2, and SEMA3B reduce growth when re-introduced into lung cancer cells. FHIT, located in the most common fragile site in the human genome (FRA3B), has been shown to induce apoptosis in lung cancer174. RASSF1A can induce apoptosis, as well as stabilize microtubules, and affect cell cycle regulation175. The tumor suppressing effect of TUSC2 is thought to occur via inhibition of protein tyrosine kinases such as EGFR, PDGFR, c-Abl, c-Kit, and AKT176 as well as inhibition of MDM2-mediated degradation of p53177. The candidate TSG SEMA3B encodes a secreted protein which can decrease cell proliferation and induce apoptosis when re-expressed in lung, breast and ovarian cancer cells169,170,178,179, in part by inhibiting the AKT pathway180. Another family member, SEMA3F, may inhibit vascularization and tumorigenesis by acting on VEGF and ERK1/2 activation181,182. RARβ exerts its tumor suppressing function by binding retinoic acid, thereby limiting cell growth and differentiation.
The serine/threonine kinase STK11 (also called LKB1) functions as a TSG by regulating cell polarity, motility, differentiation, metastasis and cell metabolism183. Germline inactivating mutations of STK11 cause Peutz-Jeghers syndrome184, but somatic inactivation through point mutation and frequent deletion on 19p13 occurs in ~30% of lung cancers – ranking it the third most commonly mutated gene in lung adenocarcinoma after p53 and RAS119,185,186. STK11 mutations often correlate with KRAS activation and result in the promotion of cell growth187. Its tumor suppressing effect is thought to function, in part, through inhibition of the mTOR pathway via AMP-activated protein kinase188 (Figure 3). STK11 inactivation appears to be particularly prevalent in NSCLC while rare in SCLCs, and inactivating mutations are more common in tumors from males and smokers, and poorly differentiated adenocarcinomas78,185–187,189. Mutation in both KRAS and STK11 appears to confer increased sensitivity to MEK inhibition in NSCLC cell lines compared to either mutation alone190.
The cancer stem cell (CSC) model hypothesizes there is a population of rare, stem-like tumor cells capable of self-renewing and undergoing asymmetric division thereby giving rise to differentiated progeny that comprise the bulk of the tumor191–193. While the first evidence for CSCs (also termed tumor initiating cells) was reported in acute myeloid leukemia194, support for their existence in solid tumors, including lung cancer, is becoming increasingly common137,139,195–199. Several cell surface biomarkers have been reported for the detection and isolation of putative lung CSCs (Table 4). Interestingly, it is becoming apparent that in addition to significant variability of the utility of CSC biomarkers between different solid tumor types, no single biomarker can reliably detect CSCs in tumors from the same tissue – possibly reflecting tumor heterogeneity. Regulation of CSCs in lung cancer is likely mediated by the Hedgehog (Hh), Wnt and Notch stem cell signaling pathways200 (Figure 6). Important in normal lung development, specifically progenitor cell development and pulmonary organogenesis, these pathways are now also being studied with regard to their role in tumor development. Increased signaling of the Hh pathway results in activation of the transcription regulating GLI oncogenes (GLI1, GLI2, and GLI3)201–203 and persistent activation is found in both SCLC and NSCLC204,205. The Wnt pathway has critical roles in organogenesis, cancer initiation and progression, and maintenance of stem cell pluripotency. In NSCLC, studies have found dysregulation of Wnt pathway members such as Wnt1, Wnt2 and Wnt7a, as well as upregulation of Wnt pathway agonists (Dvl proteins, LEF1, and Ruvbl1) and underexpression or silencing of antagonists (WIF-1, sFRP1, CTNNBIP1, and WISP2)206–212. Notch signaling is important in cell fate determination but can also promote and maintain survival in many human cancers213–216.
These signaling pathways are thought to be involved in the regulation of stem/progenitor cell self-renewal and maintenance and while normally a tightly regulated process; genes that comprise these pathways are often mutated in human cancers217–219, leading to abnormal activation of downstream effectors.
CSCs are thought to have higher resistance to cytotoxic therapies and radiotherapy than the bulk tumor cells. Thus, while conventional treatment strategies may initially “de-bulk” the primary tumor through elimination of differentiated tumor cells, the small population of CSCs eventually regenerates the tumor, giving rise to recurrence. In lung cancer, evidence of this increased resistance has been shown in primary tumors199 and lung cancer mouse xenografts137. Approaches to specifically treating the CSC population include selective targeting using CSC detection molecules, sensitization of CSCs to conventional therapies and differentiation therapies, and inhibition of signaling pathways important to CSCs, such as the Hh, Wnt and Notch signaling pathways and telomerase, an important enzyme in normal stem cell function that is activated in most lung cancers (see below). Progress towards the latter approach has been shown in lung cancer cells204,220. Inhibition of the Hh pathway has been demonstrated with cyclopamine, a naturally occurring inhibitor of SMO, which has led to the development of synthetic oral inhibitors that show clinical activity in basal cell carcinoma221. Inhibition of the Notch signaling pathway shows potential with γ-secretase inhibitors. Several inhibitors have shown efficacy in NSCLC222,223 and a Phase II trial using a γ-secretase inhibitor as second line therapy has commenced. Lastly, analysis of CSC biomarkers as diagnostic and prognostic biomarkers has recently shown clinical utility196,224–226.
Angiogenesis is one of the hallmarks of cancer, essential for a microscopic tumor to expand into a macroscopic, clinically relevant tumor. Thus, angiogenic growth factors are required early in pathogenesis. A number of angiogenic proteins have been characterized including vascular endothelial growth factor (VEGF), platelet-derived growth factor (PDGF), fibroblast growth factor (FGF), interleukin-8, and angiopoietins 1 and 2. VEGF is an important inducer of angiogenesis and is known to stimulate proliferation and migration, inhibit apoptosis, promote survival and regulate endothelial cell permeability227. VEGF signaling is stimulated by tumor hypoxia, growth factors and cytokines, and oncogenic activation228. VEGF is highly expressed in both NSCLC and SCLC229 and its expression is associated with poor prognosis in NSCLC230–232; therefore, inhibition of VEGF signaling in tumor cells is an important therapeutic strategy.
The two main approaches to anti-VEGF therapy are blocking VEGF from binding to its extracellular receptors using VEGF-specific antibodies and recombinant fusion proteins, and using small molecule TKIs that bind to the intracellular region of VEGFR233. The humanized monoclonal antibody bevacizumab blocks the binding of VEGF-A to its receptors VEGFR1 and VEGFR2 and is now approved for use in some solid cancers, including lung234. Interestingly, VEGF expression does not always correlate with response to bevacizumab235. One possible reason could be single nucleotide polymorphisms (SNPs) in VEGF. Numerous SNPs have been reported in VEGF with some being associated with lower plasma levels of VEGF236, better outcome in NSCLC237, or recently, response to bevacizumab238.
The tumor microenvironment describes the complex and dynamic milieu of stromal cells, endothelial cells, innate cells and lymphoblasts that surround tumor cells. Cells that comprise the tumor microenvironment interact both with each other and with tumor cells, and as a consequence, they can affect tumor growth, invasion and metastasis239. This supports the “seed and soil” hypothesis proposed by Stephen Paget in 1889240, who observed that the patterns of organ metastasis were a result of favorable conditions between metastatic tumor cells (the “seed”) and the organ microenvironment (the “soil”). Modulation of critical tumor microenvironment biomarkers could improve current treatment of lung cancers. For example, hypoxia is associated with an increased risk of metastasis and increased resistance to radiotherapy and possibly chemotherapy. Inhibition of HIF1α, a master transcription factor activated in response to hypoxia, or VEGFR, a target of HIF1α, can increase sensitivity to radiotherapy241,242.
Many of the molecular changes discussed above promote the metastatic capability of a tumor cell, enabling it to detach from the primary tumor, invade tissue, enter the circulation, and lastly colonize and grow in a secondary site. Recently, the cell-biological program epithelial to mesenchymal transition (EMT), involved in embryogenesis and normal development in the differentiation of multiple tissues and organs, has become a focus of research into tumor progression and metastasis due, in part, to evidence of EMT in many in vitro cancer cell models243. EMT describes the loss of epithelial cell polarity and conversion to a motile, mesenchymal phenotype typically characterized by loss of E-cadherin expression244. Conversion of epithelial cells to a mesenchymal state promotes motility and invasiveness, allowing the tumor cells to detach from the primary tumor and relocate to a secondary site. The cells will then undergo a mesenchymal to epithelial transition (MET) to revert to an epithelial state to enable proliferative growth245. While initial reports demonstrated the role of EMT in invasion and metastasis, EMT has since been associated with early events in carcinogenesis246, the acquisition of stem cell-like properties246–248, and resistance to cell death, senescence and conventional chemotherapies245. In lung cancer, mesenchymal markers and EMT inducers (e.g. Vimentin, Twist and Snail) have been shown to be strong prognostic markers249–251. EMT has also been linked to resistance to EGFR TKIs252,253, and COX-2 and LKB1 have been implicated in promoting EMT in lung cancer254–256. The miR-200 family of miRNAs is an important negative regulator of EMT257–260 and is discussed later in this review.
Activation of telomerase, the telomere-lengthening enzyme, in premalignant cells prevents loss of telomere ends beyond critical points and is essential for cell immortality. Although silenced in normal cells, telomerase is activated in >80% of NSCLCs and almost uniformly in SCLCs (Table 1)261–263.
The prevalence of activated telomerase in cancer cells has made it an attractive target for therapeutic inhibition. Inhibition of telomerase in such cells leads to telomere shortening and ultimately either cellular senescence or apoptosis264,265. Approaches to telomerase inhibition include using antisense oligonucleotides that bind to human telomerase RNA265 (such as Imetelstat, which has started Phase II trials266) and immunotherapy whereby a patient’s own immune system is stimulated with a vaccine to recognize tumor cells containing a major histocompatibility complex presenting hTERT peptide on the cell surface267,268.
Epigenetic events can lead to changes in gene expression without any changes in DNA sequence and therefore, importantly, are potentially reversible269. Aberrant promoter hypermethylation is an epigenetic change that occurs early in lung tumorigenesis, resulting in silencing of gene transcription, and is therefore a common mechanism for inactivation of TSGs in lung cancer (Table 1)270. Affected genes include those involved in tissue invasion, DNA repair, detoxification of tobacco carcinogens, and differentiation. The prevalence of promoter methylation has been reported to differ between smokers and never-smokers. Promoter methylation of p16, MGMT, RASSF1, MTHFR, and FHIT was significantly higher among smokers than never-smokers, whereas methylation of RASSF2, TNFRSF10C, BHLHB5, and BOLL was more common in never-smokers271–275. Recent advances in whole-genome microarray profiling have allowed researchers to globally study DNA methylation patterns in lung cancer (the lung cancer epigenome or methylome) and indicate that the role of methylation in lung tumorigenesis may have been underestimated276–285. Initial genome-wide studies analyzed the effect on gene expression following treatment of lung cancer cell lines with demethylating agents (such as 5-azacytidine); however, development of methylation-specific microarrays enables epigenomic analysis of tumor specimens276–281.
Aberrant methylation occurs early in lung cancer pathogenesis and can be detected in circulating DNA; thus, many studies have investigated the utility of methylation status in lung cancer for risk assessment, early detection, disease progression and prognosis (reviewed in286,287). Table 5 summarizes published candidate early detection, prognostic and predictive methylation biomarkers; hypermethylation of p16, APC, FHIT, RASSF1A, DAPK and CDH1 has been repeatedly reported as a potential prognostic marker288–302.
DNA is methylated by DNA methyltransferases (DNMTs), which are responsible for both de novo methylation and maintenance of pre-existing methylation in a cell303. Histone modification is another mechanism for epigenetic control of gene transcription, where histone deacetylation condenses chromatin, resulting in transcriptionally inactive DNA. Pharmacologic restoration of expression of epigenetically silenced genes using inhibitors of DNMTs or histone deacetylases (HDACs) is an exciting targeted therapeutic approach and shows promise in lung cancer304,305 (Table 3).
MicroRNAs (miRNAs) are a class of non-protein encoding small RNAs capable of regulating gene expression by either direct cleavage of a targeted mRNA or inhibiting translation by interacting with the 3’ untranslated region (UTR) of a target mRNA. miRNAs commonly have multiple target genes; therefore, a single miRNA can often affect multiple cellular processes. Furthermore, an mRNA may be targeted by more than one miRNA, resulting in a complex network of molecular pathways to elucidate. Aberrant expression of miRNAs has been found to play an important role in the pathogenesis of cancer, with miRNAs acting as either oncogenes or TSGs306–316. Microarray-based analyses of miRNA expression have identified many lung cancer-associated miRNAs313,314,317–328, and a review of experimentally validated miRNAs has been published previously329. One of the most widely-studied lung cancer-associated miRNAs is the let-7 miRNA family. Functioning as a tumor suppressor, it has been shown to regulate N-RAS, K-RAS, MYC and HMGA2330–332 via binding to the let-7 binding sites in their respective 3’ UTRs330,333. It is frequently under-expressed in lung tumors, particularly NSCLC, compared to normal lung, and decreased expression has also been associated with poor prognosis313,318. Induction of let-7 miRNA expression has been found to inhibit in vitro growth313,331,334,335 and reduce tumor development in a murine model of lung cancer335,336. Other miRNAs that exhibit tumor suppressing effects in lung cancer include miR-29a/b/c, miR-34a/b/c, miR-16, and miR-126318–321,337,338, and recently, miR-128b was reported to be a direct regulator of EGFR with frequent LOH occurring in NSCLC cell lines322. Oncogenic miRNAs found to be over-expressed in lung cancer include the miR-17-92 cluster of seven miRNAs (that target PTEN, E2F1-3 and BIM), miR-21 (suggested to be positively regulated by the EGFR signaling pathway, specifically EGFR mutations), miR-93, miR-98, miR-197, miR-221/222, and miR-155314,323,327,328.
Additionally, hsa-miR-146b, miR-155 and miR-21 have been reported to be strong predictors of poor prognosis in lung cancer318,326,339,340. Recent evidence shows a strong link between miRNAs and invasion and metastasis, with several miRNAs found to control key regulators of EMT, a process central to cancer metastasis258–260,341. These include miR-10b (through inhibition of HOXD10), miR-126, and the miRNA-200 family (which inhibit EMT inducers ZEB1 and ZEB2)257–259,320,341.
There is currently a strong research focus on miRNAs as potential diagnostic and prognostic biomarkers, and therapeutic targets. Normal levels of aberrantly expressed miRNAs can be restored in vitro and in vivo using miRNA mimics (for under-expressed miRNAs) or miRNA inhibitors, termed antisense oligonucleotides or antagomirs (for over-expressed miRNAs)342–346. miRNA profiles for histologic347,348 and prognostic318,326,337,338,340 classification of lung tumors and detection of miRNAs in peripheral blood and sputum349–351 illustrate the potential of miRNAs as diagnostic and early detection biomarkers in lung cancer. Additionally, concurrent inhibition or over-expression of miRNAs with conventional therapies has resulted in an increased response to EGFR TKIs and radiotherapy327,352. These studies illustrate the immense potential of miRNAs in therapeutics development; however, limitations in pharmacokinetics, delivery and toxicity need to be addressed353,354.
Genetic and epigenetic mechanisms underlying lung cancer development and progression continue to emerge, spearheaded by the development of technologies allowing genome-wide analysis of DNA copy-number, mutations, gene expression, SNPs and methylation.
Profiling the lung cancer transcriptome has imparted biologically- and clinically-relevant information such as novel dysregulated genes and pathways and gene signatures that can predict patient prognosis, response to treatment, and histology (reviewed in355–357). In an effort to overcome limitations of sample size and heterogeneity in previous studies, a multi-site, blinded validation study of 442 lung adenocarcinomas comprehensively examined whether the mRNA profile of primary tumors robustly predicts patient outcome either alone or in combination with clinicopathological factors358. This study developed several models (or signatures) which for the most part predicted outcome better than current clinical methods. A recent critical review of published prognostic signatures in lung cancer, however, found little evidence of any published signature being ready for clinical application due, for the most part, to problems with study design and analysis359. Expression of the 48 nuclear receptors (and later their co-regulators) has been studied in lung cancer and found to provide as good or better prognostic information than other mRNA expression signatures360. Since the nuclear receptors are also targets for therapeutic manipulation (via hormone agonists and antagonists), the pattern of nuclear receptor expression in individual lung cancers may also provide insight for targeted therapy. Despite the complexities of mRNA profiling, the success of prognostic signatures in breast cancer, as seen with Oncotype DX361, impels further research efforts.
High resolution mapping of copy number alterations in the lung cancer genome has been able to identify single genes as targets of genomic gain or loss through improved definition of known aberrant regions or by identification of focal alterations undetectable with earlier technology74–76,79,80,83,84,86. A large-scale analysis of 371 primary lung adenocarcinomas identified 57 significant recurrent copy-number alterations, of which 31 were focal events and many were new lung cancer loci74; for example, amplification at 14q13.3 was reported as the most common event targeting the transcription factor NKX2-1, discussed earlier. Similar studies in NSCLC and squamous cell carcinoma cohorts have identified other novel ‘drivers’ of lung carcinogenesis75,76,79,80.
Large-scale sequencing and SNP analyses have also led to the identification of novel somatic mutations in the lung cancer genome13–15,119. In a screen of 188 lung adenocarcinomas Ding et al119 identified somatic mutations in putative oncogenes (ERBB4, KDR, FGFR4, EPHA3) and TSGs (NF1, RB1, ATM, and APC). A major breakthrough has come with the development of “next generation” (also termed second-generation) DNA sequencing technologies which enable sequencing of expressed genes (‘transcriptomes’), known exons (‘exomes’) and complete genomes of tumors362. Data analysis can detect point mutations, insertions/deletions, copy number alterations, translocations and non-human sequences. Comparison of a primary lung NSCLC of adenocarcinoma histology with adjacent normal tissue identified many somatic mutations at an estimated rate of ~18 per megabase, including >50,000 single nucleotide variants41. Sequencing of a SCLC cell line revealed over 22,000 somatic substitutions42 while another study which sequenced a SCLC cell line and a neuroendocrine lung cancer cell line found a higher rate of somatic and germline rearrangements in the SCLC cell line43. Sequencing of the coding exons of ~1,500 genes across 441 tumors, including 134 lung, found lung adenocarcinomas and squamous cell carcinomas displayed high protein-altering mutation rates363, perhaps indicative of the inherent heterogeneity found in lung tumors compared with tumors from other tissues. One hurdle in second-generation sequencing is storage and analysis of the immense amount of data that is produced and separating biologically meaningful data from noise. However, the potential insight we will have into cancer genomes and its applicability to diagnostic sampling brings us even closer to the goal of ‘personalized medicine’.
“Synthetic lethal” screens using RNAi (siRNAs and shRNA libraries) technology have allowed unbiased, genome-wide approaches to identification of genes whose perturbation can selectively kill lung cancer cells (Figure 1). The ability to identify “synthetic lethality” associated with oncogenic changes in tumor cells has particular utility in identifying new therapeutic targets or molecules to treat traditionally hard to target tumors, such as those with oncogenic KRAS. siRNA and shRNA screens have identified genes whose perturbation can selectively sensitize NSCLC cell lines to sub-lethal doses of chemotherapeutic agents364, sensitize KRAS mutant cells to targeted drugs126–128, suppress tumorigenicity in cells with specific gene dysregulation such as oncogenic KRAS123–125,365, or aberrant EGFR366,367, or identify novel genes critical for tumorigenic processes such as metastasis368.
Although the challenges in gathering reliable and clinically- and pathologically-annotated data are not trivial, high throughput technologies and publicly stored genome-wide databases related to lung cancer are resources with the potential to drive a global collaborative effort in identifying new targets for lung cancer diagnostics and therapeutics. Now and in the near future, all lung cancer investigators will have access to all of the genome-wide studies performed on lung cancers with the attached clinical annotation. This will allow independent confirmation of the role of the different molecular changes in prognosis, prediction, and targeting of therapy. With these tools researchers have an enhanced ability to identify patient subsets with augmented sensitivity to conventional or targeted therapeutics, distinguish driver from passenger mutations, and better focus the design of novel targeted therapies.
While genome-wide approaches have the capacity to identify novel genes or interactions in relation to lung cancer, the functional relevance of these findings needs to be elucidated using preclinical model systems, namely in vitro models (such as tumor cell lines or immortalized human bronchial epithelial cells) and in vivo xenograft and transgenic mouse models of lung carcinogenesis. Experimental disease models play a crucial role in developing our understanding of lung carcinogenesis. Lung cancer cell lines and xenografts provide one set of important models. However, due to the genetic complexity of lung cancers, they will usually have hundreds if not thousands of genetic/epigenetic changes. By contrast, two much simpler and equally valuable models, particularly to study the progression of lung carcinogenesis, are immortalized human bronchial epithelial cells (HBECs) and genetically engineered mouse models (GEMMs). These systems provide methods to reduce the inherent complexity and heterogeneity of the lung cancer genome and allow characterization of single or sequential genetic alterations in relation to the development, maintenance, and progression of lung cancer.
HBECs are derived from primary human airway epithelial cells and immortalized with either viral oncoproteins (such as SV40 early region) and hTERT369 or overexpression of Cdk4 and hTERT260,370. Stepwise transformation of these cells can be studied by the introduction of defined genetic manipulations commonly found in lung cancer371,372.
GEMMs allow the study of lung cancer pathogenesis with defined changes in the setting of the whole organism. They were critical in developing our understanding of oncogene dependence94, as observed in conditional KrasD12-induced lung adenocarcinomas, where switching off the driving oncogene was sufficient to induce tumor regression even in the presence of other non-driving oncogenic alterations373. Ensuing research has characterized several conditional lung tumor inducing combinations of oncogenic activations in mice (summarized in Table 6) which have been used to test new targeted therapies, improve effectiveness of conventional chemotherapies, identify biomarkers and imaging strategies for early detection, and study disease relapse and metastasis374.
This review has outlined some of the significant molecular alterations known to be involved in the initiation and/or progression of lung cancer. Continued development of targeted therapies for the treatment of lung cancer is dependent upon increased understanding of the molecules and pathways involved. Cancer genome analyses are identifying hundreds to thousands of candidate targets, but these all require molecular and clinical validation. Furthermore, it is becoming increasingly apparent that targeting a single molecule will not be enough due to the non-linearity of pathways involved in carcinogenesis. Rather, targeting multiple molecules at once to combat the interconnected and complex signaling pathways will improve efficacy. Recent next-generation sequencing efforts are revealing that the lung cancer genome is mutated at a high rate, likely contributing to the known heterogeneity of these tumors and explaining the failure to identify conventional or targeted therapies that are universally effective in lung cancer. Systematic understanding of the molecular basis of lung cancer through comprehensive characterization of aberrations in the cancer genome and their functionality will provide the means to evaluate their use in diagnosis, prognosis and therapy. Integration of clinical and biological factors will ultimately lead to improved detection, diagnosis, treatment, and prognosis of lung cancer by achieving “personalized medicine”, the selection of the best treatment for each patient based on tumor associated biomarkers.
This research was supported by:
National Cancer Institute Lung Cancer Specialized Program of Research Excellence (SPORE) (P50CA70907), Department of Defense VITAL (W81XWH0410142) and PROSPECT (W81XWH0710306), NASA NSCOR (NNJ05HD36G), NASA (NNJ05HD36G) and by the Office of Science (BER) U.S. Department of Energy, Grant Number DE-AI02-05ER64068. JEL is supported by an NH&MRC Biomedical Fellowship (494511).
We thank the many current and past members of the Minna lab for their contributions to lung cancer translational research, and especially our long-term collaborator Dr. Adi Gazdar. We also apologize to other investigators for omission of any references.