Given the enormous effort devoted to smoking cessation and prevention initiatives, NS (and former smokers) will constitute a larger proportion of the lung cancer population over the next few decades. It is well established that lung tumors in smokers and NS are distinct disease entities. At the DNA level, the molecular differences discovered to date are gene-specific and cannot account for all of the clinical differences between smokers and NS. In this study we sought to elucidate global genomic differences between lung adenocarcinomas from NS and smokers. Using a genome-wide comparison approach, we discovered that NS lung tumors have a greater proportion of their genomes altered than those of smokers, identified regional genomic disparities between the tumor genomes of the two groups, and validated our findings in two independent external cohorts from the Memorial Sloan Kettering Cancer Centre (MSKCC) and the Tumor Sequencing Project (TSP).
EGFR and KRAS mutations in the BCCA and MSKCC tumors segregated with NS and smokers, respectively, consistent with the reported literature. This confirmed the accuracy of our smoking status classifications and showed that these tumors were appropriate for a smoker versus NS genome comparison. Genes and regions known to be frequently disrupted in lung adenocarcinoma were not preferentially disrupted in smokers or NS, and the recurrent alterations we observed in both groups were highly concordant with recent reports. For example, the most frequently altered region detected in our 69 tumors was gain of 5p15.32-15.33 (51% of tumors), which harbors the hallmark cancer gene TERT. Gain of 5p was also the most common genomic alteration observed by Weir et al. in a collection of over 350 lung adenocarcinoma tumors. Having established that regions commonly altered in lung adenocarcinoma were not associated with smoking status, we proceeded to determine whether distinct genomic features exist that may underlie the disparate clinical phenotypes observed in smoker and NS lung cancer patients.
Intriguingly, our comparative study revealed that NS lung tumors have a greater fraction of the genome encompassed by genomic alterations. Although our smoker and NS groups were not balanced for ethnicity or mutation status (which is not surprising given the known clinical and molecular features associated with smoking status), our multivariate analysis suggested that smoking history was the clinical variable most strongly associated with this difference. However, since an earlier study that sampled a small fraction of the genome had suggested a greater degree of alteration in smoker tumors, we assessed the repeatability of our results in two additional independent cohorts (MSKCC and TSP, from distant geographical sites and with likely different demographics). All three independently performed genomic analyses gave corroborating results. Even though our observation held true in independent datasets, we are mindful that the contributions of mutational and smoking status cannot be distinguished in our study. It remains possible that PGA is associated with mutation, as PGA did not differ significantly between NS and smokers with no mutations. Our study is not powered to test the effects of smoking or mutation type on PGA after adjusting for all other confounding factors, because smoking and race are correlated with EGFR mutation. A clean comparison would require a large number of patients in each smoking/mutation/race combination: at least 300 subjects per combination to achieve 80% power to detect a 10% difference in PGA at a significance level of 0.05.
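To make the reasoning behind such a sample size requirement concrete, the following is a minimal sketch of a standard two-sample, normal-approximation power calculation. It is an illustration only, not necessarily the method used to derive the figure quoted above: the function name `n_per_group` and the standard deviation passed to it are hypothetical, since the variability of PGA within each subgroup is not stated here.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Subjects needed per group to detect a mean difference `delta`
    between two groups with a two-sided test, assuming a common standard
    deviation `sd` (normal approximation).

    Both `delta` and `sd` must be on the same scale (e.g. PGA as a
    fraction between 0 and 1). The `sd` value is an assumption supplied
    by the caller; it is not reported in the text.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Hypothetical inputs: a 0.10 difference in PGA and an assumed SD of 0.25.
required = n_per_group(delta=0.10, sd=0.25)
```

Because the required number scales with the square of `sd / delta`, halving the detectable difference roughly quadruples the number of subjects needed in every smoking/mutation/race stratum, which is why a fully adjusted comparison quickly becomes impractical.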
Amidst the genomic instability observed in the lung adenocarcinoma tumors, we identified frequent genomic alterations whose recurrent nature signifies their selection in tumor genomes. After cataloguing numerous differentially altered regions in our dataset, we interrogated two independent cohorts to validate our findings and to reveal the most robust and pronounced regional differences between smokers and NS. We identified six MCRs of copy number gain on chromosomes 5q, 7p and 16p. Additional, less prominent MCRs might have been identified had we used less stringent concordance criteria between the three datasets; the regions we report are the most robust, as they are present in multiple independent cohorts. Again, we performed a multivariate analysis to confirm that smoking status (and subsequently EGFR mutation) was the factor most strongly associated with the regions of difference identified. Broet et al. recently reported regions differentially altered in East-Asian and Western European lung adenocarcinoma tumors, none of which overlapped with the smoking-related regions we identified, indicating that our regions are not ethnicity-specific. The two MCRs we identified on 5q were strongly correlated with one another, as were the three MCRs on 7p, suggesting they may each be the result of a single copy number event. The positive correlation we observed between 5q and 16p could signify that concurrent gain of these regions is non-random and may be biologically relevant.
A recent study profiled 60 NS lung adenocarcinomas using array comparative genomic hybridization (aCGH), albeit without a comparison against tumors from smokers, and reported several MCRs of copy number gain and loss. We cross-referenced our differentially altered regions with the regions identified by Job et al. to determine whether any of their regions might be NS-specific. Most of the regions reported by Job et al. were commonly disrupted in both NS and smokers in our BCCA tumors; however, Job et al. also reported regions of copy number gain on chromosomes 5q, 7p and 16p, which we identified as NS-specific. Both groups observed gains of chromosome 1q21 in NS; in our cohort, however, 1q21.1 gain was up to 30% more frequent in smokers than in NS, suggesting it may be a smoker-specific alteration.
Early profiling studies led to the discovery that gains of chromosome 16p are more common in NS than in smokers. This remains one of the few consistently replicated NS-specific genetic alterations discovered to date, which implies that this region is important in NS tumor biology. We and others also observed an association between gain of 16p13.3-13.2 and Asian ethnicity; however, this could reflect the fact that a large fraction of NS lung cancer patients are of Asian descent. The earliest NS lung tumor profiling studies did not identify frequent gains of 5q in NS; however, we found two robust regions of gain at 5q33.3 and 5q34. Other recent studies have also identified frequent gains on 5q in NS lung adenocarcinomas, corroborating our results.
Our analysis revealed three distinct MCRs of gain on 7p; however, none encompassed the lung cancer oncogene EGFR, located on chromosome 7p11.2. The closest region (7p12.3) was situated 5.4 Mbp telomeric of the EGFR locus. The presence of these MCRs could imply that additional oncogenes are responsible for 7p gains, as previously suggested. Investigators from the MSKCC discovered that DUSP4, a gene located on chromosome 8p12, was downregulated and associated with EGFR. We analyzed DUSP4 in the BCCA dataset and confirmed this association (Fisher's exact test, p < 0.05). Interestingly, we also mapped an MCR of loss on 8p in the BCCA and MSKCC datasets that encompassed DUSP4, and found that this region was more frequently lost in NS tumors. Although 8p was not one of our most robust differentially altered regions, it appears that genomic loss of DUSP4 is associated with EGFR mutation and NS.
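For readers unfamiliar with the test used for the DUSP4 association, the sketch below shows how a two-sided Fisher's exact test can be computed for a 2x2 table (for instance, DUSP4 loss versus EGFR mutation status). It is an illustrative implementation using only the Python standard library; the function name `fisher_exact_two_sided` and the counts in the usage line are hypothetical and are not data from this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be, e.g., tumors with / without loss of a region, and
    columns tumors with / without a given mutation. The p-value is the
    sum of hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability of observing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so ties are not lost to floating-point rounding.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )


# Hypothetical counts, for illustration only (not data from this study).
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

An exact test of this kind is a common choice when some cells of the table contain few tumors, where chi-square approximations become unreliable.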
While many known regions of copy number alteration in lung adenocarcinoma were present in both our smoker and NS cohorts, our results, along with the well-established differences in mutational profiles and clinical features, suggest that lung tumors of smokers and NS develop through different molecular mechanisms. This may be similar to what has been observed in ovarian cancer, where Type I serous ovarian cancers are typically chromosomally stable and harbor mutations in the Ras signaling pathway, while high-grade serous ovarian cancers (Type II) are RAS wild-type and exhibit widespread copy number aberrations. Intriguingly, Sidransky and co-workers recently discovered that NS lung adenocarcinoma genomes have a greater number of mitochondrial DNA alterations than those of smokers. This finding is consistent with our discovery and provides additional evidence that lung cancers in smokers and NS are driven by different molecular alterations. We postulate that NS lung tumors acquire specific genetic alterations early in tumorigenesis that compromise genome integrity. For example, NS could be inherently predisposed to genomic instability, or they could be exposed to non-tobacco-related carcinogens that drive genomic instability. Elucidating the precise mechanism driving this instability phenotype could potentially lead to targeted therapy for NS patients, or help identify NS at risk of developing lung cancer.
It is well known that the mutation profile of NS lung adenocarcinoma is distinct from that of smokers. The recent discovery of increased mtDNA mutations and mtDNA content in NS relative to smokers further supports the concept that the distinction between smoker and NS tumors extends beyond EGFR. Our findings provide a third and novel line of evidence for genetic differences between smoker and NS lung tumors, namely, that the extent of segmental genomic alterations is greater in NS tumors. Collectively, our findings provide evidence that these lung tumors are globally and genetically different, which implies they are likely driven by distinct molecular mechanisms. Although the biological mechanism underlying our observations in NS remains unknown, elucidating it is crucial to the early detection and possibly the treatment of these patients, as no risk factors or molecular features other than family history are known that can be used to assess lung cancer risk in NS. Our work provides a rationale for stratifying patients by smoking status in future studies, which will in turn facilitate discoveries about the nature of lung cancer in both smokers and NS. Such findings may lead to the development of clinical tools that improve the prognosis of both smoker and NS patients.