Cancer Res. Author manuscript; available in PMC 2017 April 17.
Published in final edited form as:
PMCID: PMC5393446
NIHMSID: NIHMS856264

Genomic Landscape Established by Allelic Imbalance in the Cancerization Field of a Normal Appearing Airway

Abstract

Visually normal cells adjacent to, and extending from, tumors of the lung may carry molecular alterations characteristic of the tumor itself, an effect referred to as airway field of cancerization. This airway field has been postulated as a model for early events in lung cancer pathogenesis. Yet the genomic landscape of somatically acquired molecular alterations in airway epithelia of lung cancer patients has remained unknown. To begin to fill this void, we sought to comprehensively characterize the genomic architecture of chromosomal alterations inducing allelic imbalance (AI) in the airway field of the most common type of lung tumors, non–small cell lung cancer (NSCLC). To do so, we conducted a genome-wide survey of multiple spatially distributed normal-appearing airways, multiregion tumor specimens, and uninvolved normal tissues or blood from 45 patients with early-stage NSCLC. We detected alterations in airway epithelia from 22 patients, with an increased frequency in NSCLCs of squamous histology. Our data also indicated a spatial gradient of AI in samples at closer proximity to the NSCLC. Chromosome 9 displayed the highest levels of AI and comprised recurrent independent events. Furthermore, the airway field AI included oncogenic gains and tumor suppressor losses in known NSCLC drivers. Our results demonstrate that genome-wide AI is common in the airway field of cancerization, providing insights into early events in the pathogenesis of NSCLC that may comprise targets for early treatment and chemoprevention.

Introduction

Lung cancer is the leading cause of cancer-related deaths in both men and women (1). Non-small cell lung cancer (NSCLC) comprises the majority (~85%) of all lung tumors, with lung adenocarcinoma (LUAD) and squamous cell carcinoma (LUSC) the most frequently diagnosed histologic subtypes (2). NSCLC exhibits relatively poor prognosis with an average 5-year survival rate of 15% (2). Importantly, even early-stage NSCLC exhibits relatively poor clinical outcome compared with other similar stage malignancies with 5-year survival rates of stage I NSCLC reaching only approximately 50% (3). Data from the National Lung Screening Trial suggest that screening is expected to increase detection rates and augment the number of diagnosed early-stage lung cancers (4), warranting the need for better early treatment strategies for this growing subpopulation. Understanding molecular events that drive NSCLC development will permit identification of early targets for prevention and treatment.

Studies in field carcinogenesis of the lung airway have revealed molecular alterations in normal-appearing cells adjacent to lung tumors that are characteristic of the tumor itself (5). This airway “field of cancerization” in the lung has been linked to smoking-associated damage and is thought to be highly pertinent to lung oncogenesis (5). Previously described airway field changes include loss-of-heterozygosity (LOH) at chromosomal regions 3p and 9p (6), promoter methylation of CDKN2A (7), and mutations in the EGFR (8) and KRAS (9) oncogenes, as well as gene expression profiles that are common between tumors and adjacent normal airway cells (10). A better understanding of these field changes may provide important biologic insights into lung tumorigenesis.

LOH and other forms of acquired chromosomal alterations that induce allelic imbalance (AI) have an established role in oncogenesis (11). Yet the genomic landscape of somatically acquired alterations in the airway field of cancerization remains largely unexplored. In this study, we interrogated a rich collection of airways from NSCLC patients, analyzing whole genome copy number alterations (CNA) using SNP arrays and applying novel computational tools that allow us to detect alterations that are present at low cellular fractions. We report the landscape of AI (CNAs and copy-neutral LOH, cn-LOH) in the airway field of cancerization, offering new insights into the spatiotemporal evolution of lung cancer from the premalignant field.

Patients and Methods

NSCLC field cancerization cohort

NSCLC tumors and airway field cancerization samples were obtained from early-stage (I–IIIA) NSCLC (31 LUADs, 14 LUSCs) patients who were evaluated at The University of Texas MD Anderson Cancer Center (MD Anderson; Houston, TX). The median follow-up times to survival and recurrence were 22 months (range 1–41) and 21 months (range 1–42), respectively. Tumor stage was classified as described in previous work (10). Individuals who had smoked at least 100 cigarettes in their lifetime but quit smoking more than 12 months before NSCLC diagnosis were considered former smokers (12). The study was approved by the Institutional Review Board and all participants provided written informed consent. None of the 45 NSCLC patients received neoadjuvant therapy prior to surgery, and their clinicopathologic information is summarized in Table 1. Paired NSCLC and uninvolved normal lung tissues were obtained snap-frozen. We also obtained multiregion core-needle biopsies (CNB) from 20 of the 45 NSCLC tumors (Supplementary Table S1).

Table 1
Clinicopathologic data of the 45 early-stage NSCLC patients

Airway epithelial cell collection

Each patient set comprised samples from the primary tumor and normal-appearing airways paired with blood cells and/or uninvolved normal lung tissue (all samples, n = 435; Fig. 1). The type (including numbers) of airway samples obtained from each case is summarized in Supplementary Table S1. White blood cells were available for 36 of the 45 NSCLC cases and, along with normal lung tissue, were used for somatic contrasts. Brushings from nasal epithelia and ipsilateral large (mainstem bronchi) airways (L1) were obtained from 27 and 22 NSCLC patients, respectively, during endoscopic bronchoscopy prior to resective surgery as described previously (13). Nasal brushings were obtained using sterile Cytosoft Cytology Brushes (Medical Packaging Corporation), whereas brushings from large airways were obtained endoscopically using ConMed disposable bronchial cytology brushes (ConMed Corporation). Small airway epithelia (S) adjacent to NSCLCs were obtained from the resected specimens as described previously (10). Briefly, small airway epithelia were collected by brushing, using Cytosoft brushes, 1–5 sequential bronchiolar structures at varying distances from the tumors (S1, the small airway relatively closest to the NSCLC, through S5, the small airway relatively farthest from the tumor). Confirmation of epithelial cell collection by pan-cytokeratin immunohistochemical analysis as well as cytopathologic control was performed as described previously (10). All airway samples included in this study were determined to be normal by pathology review. Airway brushings were placed in PBS, pelleted, and immediately stored at −80°C until further processing.

Figure 1
Genome- and airway-wide analysis of the field of cancerization in early-stage NSCLC patients. To comprehensively characterize the genomic landscape of allelic imbalance in the airway field of cancerization, we collected and surveyed 435 samples from 45 ...

Genome-wide high-density array profiling

Genomic DNA was isolated from all samples (n = 435) using the QIAamp DNA Kit from Qiagen according to the manufacturer's instructions. Double-stranded genomic DNA was quantified using the RNase P Assay (Life Technologies) according to the manufacturer's instructions. DNA quality was assessed by running samples on 1% agarose gels to confirm absence of DNA degradation and RNA contamination. High-quality DNA was then surveyed for genome-wide AI using the whole genome Human OmniExpressExome (v.1.2) BeadChip Array Platform (Illumina), which queries approximately 981,000 SNPs (741K tag SNPs, 240K exonic). All raw data and sample annotations were submitted to the Gene Expression Omnibus (GEO) under series GSE80519 (samples GSM2129208-GSM2129642).

Identification of genome-wide AI

To survey normal-appearing airways with expected low fractions of aberrant DNA, a sensitive haplotype-based statistical algorithm, hapLOH (14), was used to infer genome-wide AI. This method is based on identifying subtle B-allele frequency (BAF) shifts among heterozygous markers that are congruent with one of the parental haplotypes and thus consistent with an AI event such as deletion, duplication, or cn-LOH (Supplementary Fig. S1). The following parameters were used to run hapLOH: mean event size, 20 Mb; event prevalence, 0.001; and max iterations, 100. The BAF values (sample-specific) and germline haplotype estimates (patient-specific) were input to hapLOH. The number of aberrant event states was set to 1 for nontumor samples and to 2 for tumor/CNB samples. For each patient, a germline sample (blood or uninvolved normal lung tissue) was designated, after which fastPHASE (15) was applied to each patient for statistical reconstruction of the haplotypes. The hidden Markov model of hapLOH was used to compute the probability that a set of adjacent markers spans a region of AI. An AI event was defined as a continuous set of markers with posterior probabilities exceeding the threshold of 0.5, although 81% of identified events exceeded 0.9 (Supplementary Fig. S2). Events had an average size of 10,480 markers (median = 5,346; max = 44,650; min = 58). Because all but two events comprised more than 100 markers, the arbitrarily selected minimum of 20 markers had virtually no impact on our findings. To complement findings from hapLOH, BAFsegmentation (16) was also applied using the following parameters: noninformative mBAF threshold, 0.85; triplet filtering cutoff, 0.8; AI calling mBAF threshold, 0.56; and minimum segment size, 4.
Events called by either method were removed if any of the following were true for a called event: it comprised fewer than 20 markers; it showed greater than 50% reciprocal overlap with copy number variants in the Database of Genomic Variants (for putative gains); or it showed a 10% overlap with an event called in the paired germline (blood or normal lung) sample. Calls from BAFsegmentation with fewer than 50 markers were removed unless they showed marginal statistical support from hapLOH (P < 0.05).
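The event definition and two of the filters described above can be illustrated with a minimal Python sketch. This is not the hapLOH or BAFsegmentation code; the function names, the half-open interval convention, and the toy posterior values are assumptions made purely for illustration. The sketch groups consecutive markers whose posterior probability exceeds 0.5 into events, drops events with fewer than 20 markers, and computes a reciprocal overlap fraction of the kind that could be compared against a 50% cutoff.

```python
from typing import List, Tuple


def call_ai_events(posteriors: List[float],
                   threshold: float = 0.5,
                   min_markers: int = 20) -> List[Tuple[int, int]]:
    """Return (start, end) marker-index pairs (end exclusive) for runs of
    consecutive markers whose posterior exceeds `threshold`, keeping only
    runs that contain at least `min_markers` markers."""
    events = []
    run_start = None
    for i, p in enumerate(posteriors):
        if p > threshold:
            if run_start is None:
                run_start = i  # a new run begins at this marker
        elif run_start is not None:
            if i - run_start >= min_markers:
                events.append((run_start, i))
            run_start = None
    # Close a run that extends to the last marker.
    if run_start is not None and len(posteriors) - run_start >= min_markers:
        events.append((run_start, len(posteriors)))
    return events


def reciprocal_overlap(a: Tuple[int, int], b: Tuple[int, int]) -> float:
    """Smaller of the two overlap fractions for half-open intervals a and b."""
    overlap = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    return min(overlap / (a[1] - a[0]), overlap / (b[1] - b[0]))


# Toy posteriors (hypothetical values): one 30-marker run and one 8-marker run.
toy = [0.1] * 10 + [0.9] * 30 + [0.2] * 5 + [0.95] * 8
events = call_ai_events(toy)
```

In this toy input only the 30-marker run survives the minimum-marker filter; the 8-marker run is discarded even though its posteriors are high.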

Bedtools was used to identify genes impacted by AI events and gene coordinates were downloaded from the UCSC table browser RefSeq release 68 and genome build hg19 (GRCh37), the same build used for the array design.
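The gene annotation step amounts to an interval intersection between AI events and gene coordinates. The study used Bedtools for this; the pure-Python sketch below is only a stand-in that shows the same logic. The function name, the half-open coordinate convention, and the toy gene names and coordinates are hypothetical and do not correspond to real RefSeq entries.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end) with half-open coordinates; an assumed convention.
Interval = Tuple[str, int, int]


def genes_hit_by_event(event: Interval, genes: Dict[str, Interval]) -> List[str]:
    """Return the sorted names of genes that overlap the event by at least
    one base on the same chromosome."""
    chrom, start, end = event
    hits = []
    for name, (g_chrom, g_start, g_end) in genes.items():
        # Two half-open intervals overlap when each starts before the other ends.
        if g_chrom == chrom and g_start < end and start < g_end:
            hits.append(name)
    return sorted(hits)


# Toy coordinates (hypothetical, for illustration only).
toy_genes = {
    "GENE_A": ("chr9", 1_000, 2_000),
    "GENE_B": ("chr9", 5_000, 6_000),
    "GENE_C": ("chr3", 1_500, 2_500),
}
hits = genes_hit_by_event(("chr9", 1_500, 5_500), toy_genes)
```

A linear scan is adequate for a few hundred events; for genome-scale inputs a sorted or indexed intersection, which is what Bedtools provides, would be the more practical choice.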

Statistical analysis

Fisher exact test was applied to assess associations between smoking and presence of AI in the airway. Logistic regression was used to assess associations with histology, adjusting for smoking status. A zero-inflated Poisson distribution (R; www.r-project.org) was used to examine the interplay of smoking, histology, and AI, treating smoking (pack-years) and AI (counts of events) as continuous variables.
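For readers unfamiliar with the zero-inflated Poisson distribution, the sketch below writes out its probability mass function in Python. It is illustrative only: the analysis itself was run in R, the regression on smoking and histology is not reproduced here, and the parameter names `lam` (Poisson mean) and `pi` (probability of a structural zero) and their example values are assumptions. The distribution suits event counts like these because many airway samples carry no AI events at all.

```python
import math


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Probability of observing k events under a zero-inflated Poisson model.

    With probability `pi` the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean `lam`.
    """
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


# Example with hypothetical parameters: half of samples are structural zeros.
p_zero = zip_pmf(0, lam=3.0, pi=0.5)
total = sum(zip_pmf(k, lam=3.0, pi=0.5) for k in range(60))
```

Setting `pi` to 0 recovers the ordinary Poisson distribution, which gives a quick way to check an implementation.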

See Supplementary Information for a full description of Methods.

Results

Early-stage NSCLC airway field of cancerization cohort

We sought to understand the extent and role of chromosomal alterations in the airway field by analysis of genome- and airway-wide AI in NSCLC. To do so, we collected a rich set of tumor and normal-appearing airway samples from 45 early-stage NSCLC patients (31 LUADs and 14 LUSCs; Table 1). We performed AI analysis on a total of 435 samples including 1–5 small normal-appearing airways adjacent to the tumor as well as NSCLCs from all 45 patients (Fig. 1; Supplementary Table S1). Our sample set also included centrally (in the lung) located normal-appearing large airways and nasal epithelia (obtained from 22 and 27 cases, respectively) for a better assessment of airway-wide patterns of AI in the airway field (Supplementary Table S1). In addition, we analyzed multiregion tumor core-needle biopsies (CNB) from 20 patients for a comprehensive assessment of the primary tumor as well as white blood cells (in 36 cases) and uninvolved normal lung tissue (in all 45) as comparators to aid identification of somatic AI events in the field (Supplementary Table S1).

Identification of genome- and airway-wide AI in the field of cancerization in NSCLC

We applied a haplotype-based computational method, hapLOH (14), to infer genome-wide AI in the normal-appearing airway field of cancerization. In total, we detected 255 somatic airway AI events (247 autosomal, 8 events in the X chromosome; Supplementary Fig. S3). These somatic events were found in 22 of 45 patients (Fig. 2), distributed across 30 small airways adjacent to tumors and three relatively more-distant large airways (Fig. 2; Supplementary Table S2). Of note, we did not find AI events in any of the 27 nasal samples, which are the most distant airway samples from the tumor. Our findings suggest that somatic events are distributed along a spatial gradient in the normal-appearing airway field, particularly in LUAD patients, in which samples in closer proximity to the primary tumor are more likely to exhibit somatic chromosomal alterations (Fig. 2; Supplementary Table S2).

Figure 2
Spectrum of allelic imbalance in the normal-appearing airway field of cancerization in early-stage NSCLC. We identified 255 allelic imbalance events (247 autosomal, 8 events in chromosome X; Supplementary Methods) in the normal-appearing airway field ...

We sought to assess the relationship between AI (as presence and burden) in the airway with clinical and pathologic features. Overall, lifetime smokers had a higher AI burden (252 events in 19 of 37 smoker patients) relative to nonsmokers (3 events in 3 of 8 nonsmokers; Fig. 2). We found that events in chromosome 9 were the most frequent AI alterations in the airway field and were detected in smokers only (Fig. 2). When we compared AI events by tumor stage, we found no positive correlation between AI burden and progression of pathologic stage, which may be in part due to the relatively small number of stage III samples (n = 6) in our early-stage cohort. Notably, histology significantly predicted somatic alterations, with 79% (11/14) of LUSC patients exhibiting AI, compared with 35% (11/31) for LUAD (P = 0.011). When we excluded nonsmokers from this analysis, the difference between histologic subtypes was still statistically significant, with the presence of AI in LUSC (11/14) patients significantly exceeding that in LUAD smokers (9/23; P = 0.02 for a χ2 test and P = 0.03 using logistic regression and adjusting for pack-years as a continuous covariate). Although our cohort is relatively young, we attempted to probe the association between presence of AI and tumor recurrence. While this association exhibited an OR of 2.5, it did not reach statistical significance (P = 0.2).
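The histology comparison above reduces to a 2 × 2 table (LUSC: 11 with AI, 3 without; LUAD: 11 with AI, 20 without). As a hedged illustration, the sketch below computes a two-sided Fisher exact P value for such a table using only the Python standard library. The function name is hypothetical, and the sketch makes no claim to reproduce the test or adjustment behind the reported P values, which may have come from a different model.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Rows: LUSC, LUAD. Columns: AI present, AI absent (counts from the text).
p_histology = fisher_exact_two_sided(11, 3, 11, 20)
```

The same function can be applied to the smoker versus nonsmoker counts (19 of 37 versus 3 of 8) by passing `fisher_exact_two_sided(19, 18, 3, 5)`.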

CNAs in the airway field of cancerization in NSCLC

We then interrogated the spectrum of different types of alterations (deletions, cn-LOH, or duplications) detected in the airway field. To do so, we jointly analyzed BAF and log R ratio values within a region of detected AI (Supplementary Fig. S4). Still, at cell fractions below 5%, determination of the specific alteration becomes exceedingly difficult. Of our detected AI events in the airway field, 39 were gains, 33 were losses, 4 were cn-LOH, and 179 were deemed undeterminable. Among the somatic events in the normal-appearing airways for which we could confidently assign an alteration type, 88% matched the classification of the event found in a corresponding tumor or CNB sample from the same patient. For purposes of overall summaries of the alterations, we assigned alteration types to the undeterminable events based on the classification of suitably matched events in a paired tumor sample (see Supplementary Methods and Supplementary Fig. S4), resulting in 46 gains, 64 losses, 27 cn-LOH, and 118 undeterminable events (Fig. 2 and Supplementary Fig. S3).
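The logic of combining BAF and log R ratio (LRR) to assign an alteration type can be sketched as a simple decision rule. This is not the procedure from the Supplementary Methods; the function name and both thresholds (`lrr_tol`, `min_shift`) are hypothetical values chosen only to show why events at low cell fractions end up undeterminable: both the LRR deviation and the BAF shift shrink toward zero as the fraction of aberrant cells falls.

```python
def classify_ai_event(mean_lrr: float, baf_shift: float,
                      lrr_tol: float = 0.05, min_shift: float = 0.05) -> str:
    """Assign an alteration type from the mean LRR and the absolute BAF
    shift (deviation from 0.5) across an AI region.

    Both thresholds are illustrative assumptions, not published values.
    """
    if mean_lrr > lrr_tol:
        return "gain"        # more total signal than the diploid baseline
    if mean_lrr < -lrr_tol:
        return "loss"        # less total signal than the diploid baseline
    if baf_shift >= min_shift:
        return "cn-LOH"      # clear imbalance with unchanged total copy number
    return "undeterminable"  # imbalance too subtle to type


# The four reported categories sum to the 255 events in the airway field.
reported = {"gain": 39, "loss": 33, "cn-LOH": 4, "undeterminable": 179}
total_events = sum(reported.values())
```

A rule of this kind makes the tally easy to audit: the category counts before (39, 33, 4, 179) and after tumor-based assignment (46, 64, 27, 118) both sum to 255.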

Thirteen airway AI events matched an alteration in the paired tumor by physical position but differed in one of two ways. For 4 of these 13, the event designations (e.g., deletion, duplication) differed between airway and tumor. For the other 9 events, the specific haplotype in relative excess differed between the airway and tumor, that is, the maternal copy of the chromosome was observed to be in relative excess in one sample and the paternal copy in excess in the other sample (Supplementary Fig. S5). These observations imply regions of genomic instability and recurrent independent AI (independent mutations in the airway and tumor).

In fact, these 13 observations (listed in bold in Supplementary Table S3) understate the actual rates of “recurrent” (within-patient, across-sample) mutation. An expected half of recurrent AI events will induce imbalance in the same direction and thus go undetected by the above analysis of haplotype consistency. In our data, 6 of the 9 aforementioned examples of opposite-direction AI comprised an entire chromosomal arm. Interestingly, all 6 were found on chromosome 9q. There were 15 total 9q AI mutations in the field, leading to an estimated 9q recurrent mutation rate of 12/15 (0.80, ± 0.15; see Supplementary Methods).
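The 9q estimate follows from simple arithmetic: if independent events are equally likely to favor either haplotype, only about half of them will be visible as opposite-direction imbalance, so the observed count is doubled. The short sketch below restates that calculation; the function name is hypothetical, and the ± 0.15 uncertainty comes from the Supplementary Methods and is not reproduced here.

```python
def estimated_recurrence_rate(opposite_direction: int, total_events: int) -> float:
    """Estimate the fraction of events that arose independently, assuming
    half of all recurrent events go undetected because they happen to
    favor the same haplotype in both samples."""
    estimated_recurrent = min(2 * opposite_direction, total_events)
    return estimated_recurrent / total_events


# 6 opposite-direction whole-arm 9q events among 15 total 9q events.
rate_9q = estimated_recurrence_rate(6, 15)  # 12 / 15
```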

To understand the effects of these AI field events on the pathobiology of NSCLC, we annotated the identified field alterations as bona fide drivers in cancer (17) and with genes previously reported to be aberrant in NSCLCs (e.g., lineage-restricted oncogenes). Overall, there were more CNAs in the airway field in LUSCs, relative to LUADs, in driver genes (105 relative to 73; Fig. 3). Losses or cn-LOH in 9q, spanning KLF4 (9q31), PTCH1 (9q22), GNAQ (9q21), TSC1 (9q34), ABL1 (9q34), and NOTCH1 (9q34), were the most frequent CNAs in the airway field of both LUADs and LUSCs (Fig. 3). The airway field of both LUAD and LUSC also displayed focal or arm losses in 19p13 comprising STK11, KEAP1, and SMARCA4 tumor suppressors. We also noted different airway field CNAs between LUADs and LUSCs. In particular, we observed gains in 3q26 that include PIK3CA and the squamous lineage-specific transcription factor SOX2 in the airway field of 3 LUSCs, whereas a gain in the adenocarcinoma-restricted lineage oncogene NKX2-1 (14q13) was observed in the field of one LUAD patient (Fig. 3). In addition, we detected copy number gain of the MYC oncogene (8q24) in the airway field of 2 LUADs, whereas focal or chromosomal arm losses in regions comprising the tumor suppressors VHL (3p25), RB1 (13q14), TP53 (17p13), MTUS1 (8p22), and SMARCB1 (22q11) were restricted to the airway field of LUSCs.

Figure 3
Allelic imbalance of driver and lineage-specific genes in the airway field of cancerization in early-stage NSCLC. Allelic imbalance events affecting driver and lineage-restricted genes were identified and are summarized. Patients with aberration of the ...

Discussion

In this study, we sought to characterize the heretofore unknown landscape of somatic genome-wide chromosomal alterations in the airway field of cancerization in NSCLC. To comprehensively interrogate genome-wide AI in the airway field, we compiled and studied a rich set of matched NSCLCs and spatially distributed normal-appearing airway epithelia. We applied a novel algorithm, hapLOH (14), and performed genome-wide assessment of AI in matched NSCLC tissues, germline samples (normal lung parenchyma or blood cells), and multiple normal-appearing airway epithelia. We also investigated “intrafield heterogeneity” by genome-wide survey of multiple spatially distributed airway field samples found in both the local/adjacent airway field (airways adjacent to tumors) and relatively more distant fields (large airways and nasal epithelia). We found that almost half (22/45) of the NSCLC patients harbored AI events in the normal-appearing airway field, the majority of which matched alterations in the paired NSCLC. We observed that the airway field of LUSCs comprised significantly more AI than the field of smoker LUADs. AI was more frequently found in adjacent (to the tumor), relative to more distant, airways and was absent in the nasal epithelia, suggestive of a spatial gradient of AI across the field. Importantly, we found consistent somatic variation in driver oncogenes and tumor suppressor genes between the field and the paired NSCLCs.

This genomic profile of the airway field of cancerization offers a window to interrogate early or critical events in lung cancer pathogenesis. For example, the airway field profiles comprised losses in chromosomal regions harboring known tumor suppressors such as 3p25 (VHL), 8p22 (MTUS1), 9q (TSC1), 19p (STK11, KEAP1, SMARCA4), 13q14 (RB1), and 17p13 (TP53) as well as gains in 3q26 (PIK3CA and SOX2), 8q24 (MYC), and 14q13 (NKX2-1). It is noteworthy that, for the most part, these airway field AI events were reported to be present in premalignant lesions. LOH in 9q (TSC1) and 17p13 (TP53) as well as reduced protein expression of STK11 were reported in atypical adenomatous hyperplasias, precursor lesions in the histopathologic sequence of LUAD development (18). Also, AI events in 8p22 (MTUS1), 13q14 (RB1), and 17p13 (TP53) were detected in squamous preinvasive lesions (6). Moreover, our finding of loss of 3p25 (VHL) in the airway field of LUSCs but not of LUADs is consistent with earlier reports demonstrating loss of chromosome 3p in normal epithelium adjacent to LUSCs (6). It is important to mention that our analysis revealed that the airway field in one LUAD harbored gain of NKX2-1, a lineage-specific oncogene that is specific to adenocarcinoma histology (19). We also observed gain of SOX2, a lineage-specific oncogene for lung tumors of squamous histology (20), in 3 LUSCs. This is of particular interest, as SOX2 has been found to be amplified in preinvasive squamous lesions (21) in the sequence of LUSC development. On the basis of the above, it is plausible to speculate that these events may be among the earliest aberrations in field carcinogenesis of the lung and, if so, represent targets for chemoprevention.

Our study revealed differences in somatic AI between the normal-appearing airway fields of LUSCs and LUADs. Genome-wide analysis of AI in multiple airway samples per case that were sampled along the respiratory tract pointed to a genomic spatial gradient in the airway field of LUADs and not in LUSCs. It cannot be neglected that this spatial gradient may be largely due to known differences (5) in anatomic locations, from which LUADs (relatively peripheral in the lung) and LUSCs (more centrally located) develop. In addition, our analysis revealed that normal-appearing airways of LUSC patients exhibited more AI events compared with normal airways of smoker LUAD patients, suggesting the significant association of squamous histology with AI burden in the airway field of NSCLCs. It is worthwhile to mention that previous work demonstrated that LUSC tumors harbor more frequent copy number alterations compared with ever-smoker LUADs (20, 22). In addition, our previous analysis of the transcriptomic architecture of the airway field of cancerization adjacent to NSCLCs revealed that the local airway field surrounding early-stage LUSCs comprised substantially more tumor-associated expression changes compared with the field adjacent to smoker LUADs (10). It is reasonable to surmise that the increased levels of AI in the airways of LUSC patients are a reflection of the overall increased genomic anomalies and instability in the LUSC tumors themselves.

Our analysis identified AI in the large proximal airways (mainstem bronchi) of three NSCLC patients, suggesting that airway epithelia that are still in situ in the lung following surgical removal of the tumor may continue to carry genomic alterations in definitively treated patients. It is intriguing to suggest that airway field aberrations in the lung may be implicated in relapse of early-stage NSCLC patients and thus warrant analysis following surgery to derive prognostic biomarkers. Another observation was that there were events, predominantly in 9q, that comprised independent and subclonal alterations. Although we studied genome-wide AI in a large set of samples (n = 435) with the primary objective of investigating the airway field to model early somatic events in the pathogenesis of NSCLC, the role of the airway field alterations in development of recurrence could not be statistically addressed at present because of our cohort's size and relatively short follow-up time, and it warrants further assessment in future adequately powered studies. However, it is not unlikely that the identified somatically acquired AI events are involved in lung carcinogenesis, as they were shared between the airway field and NSCLCs. Nonetheless, our study represents the first in-depth attempt to characterize the landscape and architecture of genome-wide somatic alterations (e.g., AI) in the adjacent (to tumor) and more distant fields of cancerization, and current efforts are underway to expand this model to study airway AI attributable to recurrence and other clinicopathologic features such as smoking.

Earlier work pointed to common gene expression profiles between normal large airways and nasal epithelia of phenotypically healthy smokers (23), suggesting that intrathoracic gene expression changes may extend to extrathoracic (e.g., nasal) cavities. In our analysis of somatic DNA alterations in the airway field of cancerization, we did not find AI events in any of the 27 nasal brushings. It cannot be neglected that our relatively small cohort (n = 45) and the availability of nasal brushings in only a subset of the cases (9/14 LUSCs and 18/31 LUADs) may impact conclusions on the frequency of DNA alterations such as detectable AI in the nasal epithelia of NSCLC patients. It is important to note that the previously reported airway field profiling studies in smokers with cancer centered on the bronchial compartment, demonstrating that gene expression changes in bronchial epithelia can improve diagnostic performance of bronchoscopy (24, 25). It is also noteworthy that our study, based on its goal of identifying potential genomic drivers in the airway field, did not assess gene expression changes but rather performed a genome-wide survey of DNA alterations (e.g., AI). It is likely that the nasal epithelial field harbors different types of alterations (e.g., epigenetic) that warrant future studies.

In conclusion, we characterized the architecture of genome-wide AI in the airway field of cancerization of early-stage NSCLC. We found that AI is common in the airway field of NSCLC and exhibits intrafield heterogeneity within patients. We also shed light on driver gene copy number alterations that are present in the normal-appearing airway field of cancerization and, thus, may embody early events in the pathogenesis of NSCLC.

Acknowledgments

Grant Support: This work was supported in part by Molecular Genetics of Cancer training grant T32 CA009299 (Y. Jakubek), Department of Defense (DoD) grant W81XWH-10-1-1007 (I.I. Wistuba and H. Kadara), Lung Cancer SPORE grant P50CA70907 from the NCI (I.I. Wistuba), Cancer Prevention and Research Institute of Texas (CPRIT) award RP150079 (P. Scheet and H. Kadara), NIH grant R01HG005859 (P. Scheet), and by the Institutional Cancer Center Support Grant CA16672.

Footnotes

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

Disclosure of Potential Conflicts of Interest: No potential conflicts of interest were disclosed.

Authors' Contributions: Conception and design: P. Scheet, H. Kadara

Development of methodology: L. Xu, Z. Weber, J. Fujimoto, S.G. Swisher, P. Scheet

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): W. Lang, W. Lu, Z. Weber, G. Davies, C. Behrens, N. Kalhor, C. Moran, J. Fujimoto, R. Mehran, R. El-Zein, S.G. Swisher, E.A. Ehli, P. Scheet, H. Kadara

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Y. Jakubek, S. Vattathil, L. Huang, S.-Y. Yoo, L. Shen, J. Huang, J. Wang, J. Fowler, I.I. Wistuba, P. Scheet, H. Kadara

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M. Garcia, L. Xu, C.-W. Chow, J. Fujimoto, P. Scheet

Writing, review, and/or revision of the manuscript: Y. Jakubek, W. Lang, G. Davies, J. Fujimoto, R. El-Zein, S.G. Swisher, J. Fowler, A.E. Spira, E.A. Ehli, I.I. Wistuba, P. Scheet, H. Kadara

Study supervision: P. Scheet, H. Kadara

References

1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2015. CA Cancer J Clin. 2015;65:5–29.
2. Herbst RS, Heymach JV, Lippman SM. Lung cancer. N Engl J Med. 2008;359:1367–80.
3. Novello S, Asamura H, Bazan J, Carbone D, Goldstraw P, Grunenwald D, et al. Early stage lung cancer: progress in the last 40 years. J Thorac Oncol. 2014;9:1434–42.
4. Aberle DR, Adams AM, Berg CD, Black WC, Clapp JD, Fagerstrom RM, et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med. 2011;365:395–409.
5. Kadara H, Wistuba II. Field cancerization in non-small cell lung cancer: implications in disease pathogenesis. Proc Am Thorac Soc. 2012;9:38–42.
6. Wistuba II, Behrens C, Milchgrub S, Bryant D, Hung J, Minna JD, et al. Sequential molecular abnormalities are involved in the multistage development of squamous cell lung carcinoma. Oncogene. 1999;18:643–50.
7. Belinsky SA, Nikula KJ, Palmisano WA, Michels R, Saccomanno G, Gabrielson E, et al. Aberrant methylation of p16(INK4a) is an early event in lung cancer and a potential biomarker for early diagnosis. Proc Natl Acad Sci U S A. 1998;95:11891–6.
8. Tang X, Shigematsu H, Bekele BN, Roth JA, Minna JD, Hong WK, et al. EGFR tyrosine kinase domain mutations are detected in histologically normal respiratory epithelium in lung cancer patients. Cancer Res. 2005;65:7568–72.
9. Nelson MA, Wymer J, Clements N Jr. Detection of K-ras gene mutations in non-neoplastic lung tissue and lung cancers. Cancer Lett. 1996;103:115–21.
10. Kadara H, Fujimoto J, Yoo SY, Maki Y, Gower AC, Kabbout M, et al. Transcriptomic architecture of the adjacent airway field cancerization in non-small cell lung cancer. J Natl Cancer Inst. 2014;106:dju004.
11. Biesecker LG, Spinner NB. A genomic view of mosaicism and human disease. Nat Rev Genet. 2013;14:307–20.
12. Spitz MR, Hong WK, Amos CI, Wu X, Schabath MB, Dong Q, et al. A risk model for prediction of lung cancer. J Natl Cancer Inst. 2007;99:715–26.
13. Kadara H, Shen L, Fujimoto J, Saintigny P, Chow CW, Lang W, et al. Characterizing the molecular spatial and temporal field of injury in early-stage smoker non-small cell lung cancer patients after definitive surgery by expression profiling. Cancer Prev Res. 2013;6:8–17.
14. Vattathil S, Scheet P. Haplotype-based profiling of subtle allelic imbalance with SNP arrays. Genome Res. 2013;23:152–8.
15. Scheet P, Stephens M. A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am J Hum Genet. 2006;78:629–44.
16. Staaf J, Lindgren D, Vallon-Christersson J, Isaksson A, Goransson H, Juliusson G, et al. Segmentation-based detection of allelic imbalance and loss-of-heterozygosity in cancer cells using whole genome SNP arrays. Genome Biol. 2008;9:R136.
17. Vogelstein B, Papadopoulos N, Velculescu VE, Zhou S, Diaz LA Jr, Kinzler KW. Cancer genome landscapes. Science. 2013;339:1546–58.
18. Wistuba II, Gazdar AF. Lung cancer preneoplasia. Annu Rev Pathol. 2006;1:331–48.
19. Weir BA, Woo MS, Getz G, Perner S, Ding L, Beroukhim R, et al. Characterizing the cancer genome in lung adenocarcinoma. Nature. 2007;450:893–8.
20. Bass AJ, Watanabe H, Mermel CH, Yu S, Perner S, Verhaak RG, et al. SOX2 is an amplified lineage-survival oncogene in lung and esophageal squamous cell carcinomas. Nat Genet. 2009;41:1238–42.
21. McCaughan F, Pole JC, Bankier AT, Konfortov BA, Carroll B, Falzon M, et al. Progressive 3q amplification consistently targets SOX2 in preinvasive squamous lung cancer. Am J Respir Crit Care Med. 2010;182:83–91.
22. Staaf J, Isaksson S, Karlsson A, Jonsson M, Johansson L, Jonsson P, et al. Landscape of somatic allelic imbalances and copy number alterations in human lung carcinoma. Int J Cancer. 2013;132:2020–31.
23. Gesthalter YB, Vick J, Steiling K, Spira A. Translating the transcriptome into tools for the early detection and prevention of lung cancer. Thorax. 2015;70:476–81.
24. Silvestri GA, Vachani A, Whitney D, Elashoff M, Porta Smith K, Ferguson JS, et al. A bronchial genomic classifier for the diagnostic evaluation of lung cancer. N Engl J Med. 2015;373:243–51.
25. Spira A, Beane JE, Shah V, Steiling K, Liu G, Schembri F, et al. Airway epithelial gene expression in the diagnostic evaluation of smokers with suspect lung cancer. Nat Med. 2007;13:361–6.