The molecular epidemiology of most EGFR and KRAS mutations in lung cancer remains unclear.
We genotyped 3026 lung adenocarcinomas for the major EGFR (exon 19 deletions and L858R) and KRAS (G12, G13) mutations and examined correlations with demographic, clinical and smoking history data.
EGFR mutations were found in 43% of never smokers (NS) and in 11% of smokers. KRAS mutations occurred in 34% of smokers and in 6% of NS. In patients with smoking histories up to 10 pack-years, EGFR predominated over KRAS. Among former smokers with lung cancer, multivariate analysis showed that, independent of pack-years, increasing smoking-free years raise the likelihood of EGFR mutation. NS were more likely than smokers to have a KRAS G>A transition mutation (mostly G12D) (58% vs. 20%, p=0.0001). KRAS G12C, the most common G>T transversion mutation in smokers, was more frequent in women (p=0.007), and these women were younger than men with the same mutation (median 65 vs. 69, p=0.0008) and had smoked less.
The distinct types of KRAS mutations in smokers vs. NS suggest that most KRAS-mutant lung cancers in NS are not due to secondhand smoke exposure. The higher frequency of KRAS G12C in women, their younger age, and lesser smoking history together support a heightened susceptibility to tobacco carcinogens.
EGFR and KRAS mutations are present in almost 50% of lung adenocarcinomas in Caucasian patients. More than 90% of EGFR mutations are small in-frame deletions in exon 19 or the L858R missense mutation in exon 21.1 These mutations are associated with responsiveness to tyrosine kinase inhibitor (TKI) therapy.2–4 EGFR mutations are more frequently found in women, Asians, and never smokers.5,6 There is an inverse relationship between the duration and intensity of cigarette smoking and the frequency of EGFR mutations, suggesting that smoking history has predictive value for the presence of EGFR mutations.7,8
Although KRAS mutations were identified in lung cancer more than two decades ago,9,10 the clinical importance of KRAS mutation status became apparent only relatively recently, as lung adenocarcinomas harboring KRAS mutations were found to lack response to EGFR TKI therapy.11,12 KRAS-mutated lung cancers are prognostically unfavorable when compared with EGFR-mutated tumors.13–16 In more than 95% of cases, KRAS missense mutations are found in codons 12 and 13.17 Unlike EGFR mutations, KRAS mutations show no sex predilection, are more frequent in white populations than in Asians, and occur mostly in former or current cigarette smokers.18,19 KRAS mutations known to be smoking-associated (G12C, G12V) are transversion mutations (G>T and G>C), whereas KRAS transition mutations (G>A) are more common in lung adenocarcinomas from patients without any smoking history.20,21
Even though the distinctive distribution of EGFR and KRAS mutations in relation to ethnicity, sex and smoking history suggests that patient characteristics have a significant predictive value for the presence of these mutations, the etiology of most mutations arising in never smokers remains unknown. In the current study, we hypothesized that correlations between demographic, epidemiologic, and clinical data and types of EGFR and KRAS mutations could provide better insight into the specific etiology and/or biology of these mutations. Therefore, we took advantage of our large clinical dataset and performed an in-depth retrospective analysis of more than 3000 consecutive lung adenocarcinoma cases subjected to routine testing for EGFR and KRAS mutations over a 5-year period.
From September 2004 to December 2009, 3026 lung adenocarcinomas (including 2 adenosquamous and one large cell carcinoma with adenocarcinoma component) were consecutively received and clinically tested for presence of EGFR exon 19 deletion and exon 21 L858R mutation. In January 2006, testing for KRAS mutations (codons 12 and 13) was introduced for all cases negative for EGFR mutation, and 2529 cases were received after that time. Cases with more than one tumor were included if: all the tumors were either mutation negative, harbored the same mutation, or if one tumor harbored EGFR or KRAS mutation and the other(s) was(were) mutation negative. Twenty three patients with more than one tumor harboring different KRAS or EGFR mutations were excluded from the study; some of these have been reported separately.22 Clinical samples submitted for molecular testing included surgically resected tumor samples, biopsies and cytology specimens. Clinical data were collected with the approval of Institutional Review Board (IRB) of Memorial Sloan-Kettering Cancer Center. Stage designated as IIIB/IV included stages IIIB, IV and multifocal bronchioloalveolar carcinoma (BAC). Smoking status was defined as never smokers (<100 lifetime cigarettes), former smokers (quit >1 year prior to diagnosis), or current smokers (still smoking, or quit <1 year prior to diagnosis). Pack-years of smoking was defined as (average number of cigarettes per day/20) × years smoking.
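The smoking definitions above translate directly into a small calculation. The following is a minimal sketch, not the code used in the study; the function names are hypothetical, and the handling of a patient who quit exactly one year before diagnosis (a case the definitions leave open) is an assumption made for illustration.

```python
from typing import Optional


def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """Pack-years = (average cigarettes per day / 20) x years smoking."""
    return (cigarettes_per_day / 20.0) * years_smoked


def smoking_status(lifetime_cigarettes: int,
                   years_since_quit: Optional[float] = None) -> str:
    """Classify smoking status using the definitions given in the text.

    years_since_quit is None for a patient who is still smoking.
    Assumption: quitting exactly 1 year before diagnosis counts as "former";
    the text only defines >1 year (former) and <1 year (current).
    """
    if lifetime_cigarettes < 100:
        return "never"
    if years_since_quit is None or years_since_quit < 1:
        return "current"
    return "former"


if __name__ == "__main__":
    # Illustrative inputs only, not patient data.
    print(pack_years(30, 20))        # 1.5 packs per day for 20 years
    print(smoking_status(7300, 12))  # quit 12 years before diagnosis
```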
DNA was extracted using a kit (DNeasy, Qiagen) from frozen tumor tissue or formalin-fixed paraffin-embedded tumor tissue. If necessary, manual microdissection of paraffin sections was done to ensure at least 50% tumor content. EGFR mutations were detected by sensitive PCR-specific assays as previously described.23 KRAS mutations were detected by PCR-sequencing of exon 2 as described.24 For limited-volume tumor samples, or in the presence of an exuberant inflammatory response or extensive fibrosis, PCR was performed with the addition of a locked nucleic acid (LNA) oligonucleotide to favor amplification of the mutated allele, if present.25
Cases were divided into three groups based on mutation status: EGFR-mutated, KRAS-mutated or wild type for EGFR/KRAS. The associations were tested between the mutation groups and the demographic or clinical characteristics, and the smoking status using Fisher exact test or unpaired t-test. A p value <0.01 was considered significant. The Bonferroni method was used to control for family-wise error rate. Univariate and multivariate logistic regression analyses were used to test the association of smoking-free years and pack-years of smoking with EGFR and KRAS mutational status.
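For readers unfamiliar with the tests named above, the following is a minimal sketch of a two-sided Fisher exact test on a 2×2 table, together with a Bonferroni-adjusted threshold. It is not the study's analysis code (the analyses were run in R); the function names, the example table, and the number of comparisons are hypothetical and chosen only for illustration.

```python
from math import exp, lgamma


def _log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_denom = _log_comb(row1 + row2, col1)

    def log_p(x: int) -> float:
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - log_denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    log_obs = log_p(a)
    total = 0.0
    for x in range(lo, hi + 1):
        lp = log_p(x)
        if lp <= log_obs + 1e-7:  # small tolerance for floating-point ties
            total += exp(lp)
    return min(1.0, total)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical table: rows = two patient groups, columns = mutated / not mutated.
    p = fisher_exact_two_sided(6, 2, 1, 4)
    # Hypothetical family of 5 comparisons at a family-wise alpha of 0.05.
    print(p, p < bonferroni_threshold(0.05, 5))
```

Working with log-probabilities avoids overflow when the same function is applied to tables with thousands of cases.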
A nomogram was generated for the likelihood of EGFR mutation among Caucasian smokers based on the following logistic regression model: EGFR ~ β₀ + β₁·smoke-free-years + β₂·pack-years + β₃·gender + β₄·age + β₅·age². The quadratic term allows a U-shaped pattern in the association of age with the mutational outcome. All analyses were performed using the R packages Design and Hmisc. An independent data set was used for validation26; specifically, we used 375 adenocarcinoma patients who were Caucasian smokers from the Boston cohort included in the study by Girard and colleagues26 as the validation cohort.
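To make the structure of this model concrete, the sketch below evaluates a logistic regression of the same form for a single patient. The coefficient values are placeholders invented for illustration; they are not the fitted coefficients of the nomogram, which are not given in the text. The coding of gender as 1 for women and 0 for men is likewise an assumption.

```python
from math import exp
from typing import Sequence


def egfr_probability(smoke_free_years: float,
                     pack_years: float,
                     is_female: bool,
                     age: float,
                     coef: Sequence[float]) -> float:
    """Predicted probability from a logistic model with a quadratic age term.

    coef = (b0, b1, b2, b3, b4, b5) in the order intercept, smoke-free years,
    pack-years, gender, age, age squared.
    """
    b0, b1, b2, b3, b4, b5 = coef
    gender = 1.0 if is_female else 0.0  # assumed coding, not stated in the text
    linear = (b0 + b1 * smoke_free_years + b2 * pack_years
              + b3 * gender + b4 * age + b5 * age ** 2)
    return 1.0 / (1.0 + exp(-linear))


# Placeholder coefficients for illustration only (NOT the published model).
HYPOTHETICAL_COEF = (-1.0, 0.03, -0.02, 0.4, -0.06, 0.0005)

if __name__ == "__main__":
    print(egfr_probability(20, 15, True, 65, HYPOTHETICAL_COEF))
```

Because age enters both linearly and squared, the predicted log-odds can fall and then rise with age (or the reverse), which is the U-shaped pattern the quadratic term is meant to allow.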
Table 1 summarizes characteristics of lung cancer patients with EGFR and KRAS mutations. Our lung adenocarcinoma patient population was predominantly female (1898/3026, 62.7%) and this was consistent in each year from 2006 to 2009 (1624/2620, 62%), reflecting the routine reflex EGFR/KRAS testing that was initiated in 2006.27 Only 13% of the cases (406/3026) were submitted for testing before 2006 and these showed a slightly higher proportion of women (274/406; 67.5%), presumably reflecting some referral bias. Of 3026 cases tested clinically for the two major EGFR mutations, 593 (20%) were mutated, including 347 exon 19 deletions (59%) and 246 L858R mutations (41%). Patients with EGFR L858R tended to be older than those with exon 19 deletions (median age 68 vs. 64; p=8.1×10⁻⁵), reflected by an exon 19 del to L858R ratio of 3.5 under age 50 (p=0.002) and of 1.0 in patients 70 and over (p=0.004) (Figure 1). Men with EGFR mutations were more likely than women to present at late stage (i.e. IIIB/IV) of disease (118/170, 69% vs. 235/423, 56%; p=0.002), whereas women predominated at stage I (31% vs. 19%, p=0.004) (Supplementary Figure 1A). Tumors with EGFR L858R presented more often at stage I than tumors with exon 19 del (83/246, 34% vs. 82/347, 24%; p=0.009) (Supplementary Figure 1B). Testing of 2529 cases for KRAS mutations (codons 12 and 13) detected 670 (26%) mutations, including G12C (39%), G12V (21%), G12D (17%), G12A (11%), and other G12 and G13 mutations (12%). Although none of the EGFR-mutated tumors in the present clinical data set were tested for concomitant KRAS mutations, our more recent experience using multiplex genotyping by MALDI-TOF mass spectrometry (Sequenom) further confirms their mutually exclusive occurrence pattern.28 No significant differences in age or stage at presentation were noted between different subtypes of KRAS mutations (Supplementary Figure 2).
The positive and negative associations of KRAS and EGFR mutations, respectively, with smoking are well known but had not previously been analyzed in detail in a single large dataset. Figure 2 illustrates the frequency of EGFR and KRAS mutations in relation to smoking history and smoking pack-years. EGFR mutations were found in 352 of 828 (43%) of never smokers and in 241 of 2198 (11%) former and current smokers. There was no significant difference in frequency of EGFR exon 19 del vs. EGFR L858R relative to smoking pack-years (data not shown). KRAS mutations were found in 627/1860 (34%) of former and current smokers and in 43/669 (6%) of never smokers, the latter proportion being notably lower than in a smaller study from our center but within the confidence interval of the previously reported higher percentage21. While any smoking history significantly decreased the likelihood of EGFR mutations, no difference was noted among smokers with less than 10 pack-years smoking history. Furthermore, in smokers of more than 10 pack-years, EGFR mutations were 5-fold less likely to be found than in never smokers (p=0.0001). In contrast, the proportion of KRAS-mutated lung cancers was significantly higher in smokers with any smoking history than in never smokers; among smokers, we found 15 pack-years as a cut-point above which the likelihood of a lung cancer harboring KRAS mutations was 6-fold higher than in never smokers (p=0.0001). Notably, even in patients with up to 10 pack-years of smoking, tumors with EGFR mutations were still more common than those with KRAS mutations.
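The headline proportions in this paragraph can be recomputed from the reported counts. The sketch below does so and also derives a crude (unadjusted) odds ratio for each gene. The odds ratios are an illustration added here and are not statistics reported in the study; they differ from the fold-changes quoted above, which compare specific pack-year strata with never smokers.

```python
def proportion(mutated: int, total: int) -> float:
    """Fraction of tested cases carrying the mutation."""
    return mutated / total


def odds_ratio(mut_a: int, total_a: int, mut_b: int, total_b: int) -> float:
    """Crude odds ratio of mutation in group A relative to group B."""
    odds_a = mut_a / (total_a - mut_a)
    odds_b = mut_b / (total_b - mut_b)
    return odds_a / odds_b


# (mutated, tested) counts as reported in the text.
EGFR = {"never": (352, 828), "smokers": (241, 2198)}
KRAS = {"never": (43, 669), "smokers": (627, 1860)}

if __name__ == "__main__":
    for gene, counts in (("EGFR", EGFR), ("KRAS", KRAS)):
        for group, (mutated, total) in counts.items():
            print(gene, group, f"{100 * proportion(mutated, total):.1f}%")
    print("EGFR, never vs. smokers:", odds_ratio(*EGFR["never"], *EGFR["smokers"]))
    print("KRAS, smokers vs. never:", odds_ratio(*KRAS["smokers"], *KRAS["never"]))
```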
The effect of smoking and smoking-free period on the likelihood of EGFR mutation has been previously reported in Asian patients with lung adenocarcinoma 29,30, but the impact of these two smoking variables on the proportions of lung adenocarcinomas with either EGFR or KRAS mutations has not been previously investigated in a predominantly white patient population. Because smoking-free years and pack-years of smoking are partly dependent variables, we performed a multivariate logistic regression analysis to examine the effect of these two parameters in current and former Caucasian smokers. Interestingly, this showed that, among patients with lung cancer, smoking-free years change the likelihood of EGFR mutation but not that of KRAS mutation (Supplementary Table 1).
Given the variety of possible nucleotide substitutions leading to missense mutations of KRAS G12 and G13, we examined their association with smoking in this large dataset. Among never smokers, the most common KRAS mutation was G12D (56%), and G12C was the most frequent mutation among former and current smokers (41%) (Figure 3A). Never smokers were significantly more likely than former and current smokers to have G>A transition mutations (as in G12D) (58% vs. 19% vs. 21%; p=0.0001), whereas G>T transversion mutations (as in G12C), a typical change associated with tobacco carcinogens, were the most common nucleotide change in former and current smokers (67% and 71%, respectively) (Figure 3B). Compared to other KRAS mutation types, G12C was more frequent in women (p=0.007) (Figure 3C), who were also younger than men with the same mutation (median age 65 vs. 69; p=0.0008). Intriguingly, women with G>T transversions had smoked less (average 34 pack-years vs. 40 pack-years, p=0.001) (Supplementary Table 2) and were younger than men with the same nucleotide change (median age 64 vs. 67, p=0.006). As discussed below, this pattern of findings suggests an increased susceptibility to tobacco carcinogenesis in women.
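The transition/transversion distinction used here follows from base chemistry: a transition swaps a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T), and any other substitution is a transversion. The sketch below is an illustration of that rule only; the function names are hypothetical, and the nucleotide change listed for G12A (G>C) comes from the standard codon table (KRAS codon 12 is GGT) rather than from the text.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref: str, alt: str) -> str:
    """Return "transition" or "transversion" for a single-base substitution."""
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid single-base substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Nucleotide change underlying common KRAS codon 12 substitutions.
# G12C, G12V and G12D are classified in the text; G12A is an added assumption.
KRAS_G12_CHANGES = {
    "G12C": ("G", "T"),
    "G12V": ("G", "T"),
    "G12D": ("G", "A"),
    "G12A": ("G", "C"),
}


def classify_kras(mutation: str) -> str:
    """Classify a KRAS codon 12 mutation by its nucleotide change."""
    ref, alt = KRAS_G12_CHANGES[mutation]
    return substitution_class(ref, alt)


if __name__ == "__main__":
    for name in KRAS_G12_CHANGES:
        print(name, classify_kras(name))
```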
There is continuing interest in using clinical variables to prioritize EGFR mutation testing. Certain patient subsets, such as Asians and never smokers, are routinely tested, while other subsets, such as male Caucasian smokers, are considered of lower priority for testing. However, it is also becoming clear that these patient characteristics should not be used individually to exclude patients from testing, as shown in a recent analysis of a subset of the present data31. Given the significant associations of EGFR mutation with sex (p=0.01), pack-years of smoking (p<0.0001) and smoking-free years (p=0.002), we used these variables along with age to generate a nomogram to predict the EGFR status specifically in Caucasian smokers (current and former). We excluded Asians and never smokers from the nomogram dataset because it is generally agreed that patients in these groups should be tested regardless. The area under the receiver operating characteristic (ROC) curve was 0.70 (Figure 4). To validate the performance of our nomogram in an independent dataset, we used the Caucasian smokers from the Boston cohorts used in the study by Girard and colleagues.26 In this independent set of patients, our nomogram generated an area under the ROC curve for predicting EGFR status of 0.71 (Supplementary Figure 3). In the MSKCC training dataset (n=2078), 16 patients had a predicted probability of EGFR mutation of 1% or less, none of whom were EGFR-mutated, and 421 had a predicted probability of 5% or lower, of whom 14 (3%) had EGFR mutations. In the Boston dataset (n=375) used for validation, ten patients had a probability below 1%, one of whom was EGFR-mutated, and 145 had a probability below 5%, including ten (7%) EGFR-mutated cases. As discussed below, we view these results as indicating that, even in the context of a rigorously developed nomogram, clinical variables cannot be used to robustly identify patients with a negligible chance of harboring an EGFR-mutated lung cancer.
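The arithmetic behind the low-probability groups can be laid out explicitly. The sketch below uses only the counts reported in this paragraph; the class and field names are hypothetical. For each group it computes the share of the cohort that falls below the threshold and the mutation rate actually observed in that group.

```python
from dataclasses import dataclass


@dataclass
class LowRiskGroup:
    cohort_size: int      # patients in the whole cohort
    n_in_group: int       # patients with predicted probability below the threshold
    n_egfr_mutated: int   # of those, patients whose tumor was EGFR-mutated

    @property
    def share_of_cohort(self) -> float:
        return self.n_in_group / self.cohort_size

    @property
    def observed_mutation_rate(self) -> float:
        return self.n_egfr_mutated / self.n_in_group


# Counts as reported in the text, keyed by (cohort, probability threshold).
GROUPS = {
    ("MSKCC", "1%"): LowRiskGroup(2078, 16, 0),
    ("MSKCC", "5%"): LowRiskGroup(2078, 421, 14),
    ("Boston", "1%"): LowRiskGroup(375, 10, 1),
    ("Boston", "5%"): LowRiskGroup(375, 145, 10),
}

if __name__ == "__main__":
    for (cohort, threshold), group in GROUPS.items():
        print(cohort, threshold,
              f"share of cohort {100 * group.share_of_cohort:.1f}%,",
              f"observed EGFR rate {100 * group.observed_mutation_rate:.1f}%")
```

In both cohorts the group below the 1% threshold is a small fraction of patients, and in the validation cohort the observed mutation rate in that group exceeds the predicted ceiling.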
To accurately and reliably determine the frequency of the major mutations in EGFR and KRAS in lung adenocarcinoma in relation to patient characteristics and different levels of smoking, a sufficiently large number of case subjects is necessary to provide statistical power for more detailed analyses. Here, we performed a retrospective analysis of our large clinical database of lung adenocarcinomas with established EGFR/KRAS mutation status. We (1) found distinct differences in the sex, age and stage distribution of the two most common types of EGFR mutations; (2) determined the likelihood of EGFR and KRAS mutations by intensity and duration of smoking; (3) evaluated the effects of the smoking-free period on the proportions of EGFR and KRAS mutations in lung cancers arising in former smokers; (4) designed a nomogram to predict the presence of EGFR mutation in Caucasian smokers; (5) noted a distinct distribution of types of KRAS mutations in smokers vs. never smokers; and (6) observed significant sex and age differences in the frequency of G12C, the most common smoking-related KRAS mutation.
EGFR exon 19 del was relatively more common than L858R mutation in younger patients. Notably, of eight patients below age 40 with EGFR-mutated lung adenocarcinoma, seven had EGFR exon 19 del. On the other hand, L858R occurred in a relatively older age distribution and the patients more often presented with stage I disease. These findings may suggest a potentially more aggressive natural history of adenocarcinomas with EGFR exon 19 del and a relatively less aggressive one for L858R-mutated tumors. Differences between tumors with EGFR exon 19 del and those with L858R have been reported in patients treated with TKI or chemotherapy. EGFR exon 19 deletions have been associated with better response to TKI and with a longer time-to-progression (TTP) and overall survival (OS) in advanced adenocarcinoma patients.32–35 However, the better clinical outcome of patients with EGFR exon 19 del compared to patients harboring EGFR L858R mutations remains controversial; two prospective randomized phase III trials did not confirm these observations.36,37 A distinct age and stage distribution as well as differences in response to molecular targeted therapy may suggest subtle differences in biology and/or etiology for EGFR exon 19 del and L858R mutation.
Although EGFR mutations are typically seen in the absence of smoking history, a significant minority (11%) of former and current smokers harbor EGFR-mutated tumors, arguing against excluding smokers from EGFR testing. Moreover, among smokers with less than ten pack-years, EGFR mutations were more common than KRAS mutations. In a study of 265 lung adenocarcinomas, some of which are included in the current dataset, Pham et al. found significantly fewer EGFR mutations in people who smoked for more than 15 pack-years or stopped smoking less than 25 years ago compared with individuals who never smoked.7 Our extended dataset allowed for more accurate risk stratification by pack-year categories and showed that any smoking history at or above one pack-year significantly decreased the likelihood of EGFR-mutated tumors, with no notable difference up to ten pack-years. Although our patient population was primarily Caucasian, the results appear generalizable, as a similar relationship of EGFR mutations to pack-years and smoke-free years has also been reported in Asian patients with lung cancer.29,30,38
As expected, most of the KRAS mutations were found among current and former smokers, and consistent with other studies,39 we identified 6% of never smokers with KRAS-mutated tumors. In our earlier study that included 102 KRAS-mutated tumors,21 we failed to demonstrate predictive value of pack-years for the presence of KRAS mutations, likely because of the small number of cases. Here, we have shown that any smoking history significantly increases the likelihood of a KRAS mutation being found in the lung cancer. Smoking-free years provided additional value in predicting the likelihood of EGFR mutations but not that of KRAS mutations, independent of pack-years of smoking. These multivariate results suggest a model in which KRAS mutations occur at the time of smoking and lead to cancer eventually, explaining the lack of impact of smoke-free years. This is also supported by the observation that former and current smokers have similar proportions of KRAS-mutated lung cancers (Figure 2A). Overall, this further supports the notion that permanent DNA damage by tobacco carcinogens acquired at the time of smoking is the major source of most KRAS-mutated lung adenocarcinomas. Thus, the likelihood that a patient with lung cancer has a KRAS mutation is determined by pack-years of smoking and does not decrease significantly over time upon smoking cessation; in contrast, because overall lung cancer incidence decreases with increasing smoke-free years, the relative proportion of non-smoking associated cancers (represented by EGFR-mutated tumors) increases. Importantly, these data should not be misinterpreted as supporting a “protective” effect of smoking on the risk of EGFR-mutated lung adenocarcinoma.
Based on the need for efficient medical resource utilization and concerns regarding health care costs and possible treatment delays due to testing, there is continuing controversy regarding routine EGFR mutation testing in certain patient subsets perceived as having a low chance of EGFR mutation in their lung cancer, such as male Caucasian smokers. Using the readily available clinical parameters of age, sex, pack-years, and smoking-free years, we developed a nomogram to predict the likelihood of EGFR mutation in Caucasian current or former smokers with lung adenocarcinoma. We should note that a similar, recently published nomogram26 differs in two important ways from the one we have developed. First, it includes never smokers, a group in which the value of EGFR testing is no longer in question. Second, it includes the histologic subtype of adenocarcinoma, which usually can only be properly analyzed in resection specimens, whereas decisions regarding EGFR testing often have to be made in advanced stage patients in whom the available small biopsies are sometimes suboptimal for histologic subtyping. The discriminative accuracy of our nomogram (area under the ROC curve) was 0.70 in the source dataset and 0.71 in an independent validation dataset. Based on clinical considerations (for instance, the fact that testing for ALK fusions, present in only 3–5% of lung adenocarcinomas, is now indicated to select patients for crizotinib),40 we deemed that only a probability of harboring an EGFR mutation of less than 1% was clinically negligible and therefore actionable in terms of bypassing EGFR testing. However, only a very small proportion of patients fall in this category, 0.8% in the source dataset and 2.7% in the validation dataset, and the latter included one incorrect prediction (10% error rate).
Overall, the area under the ROC curve of 0.70 to 0.71 for nomogram prediction, along with the very low proportion of predictions below 1%, suggests that clinical variables cannot be used to robustly identify patients with a negligible chance of harboring an EGFR-mutated lung cancer. Nonetheless, our nomogram may still be helpful in situations where mutation analysis for EGFR is simply not possible and the clinical parameters and smoking history are used to direct the treatment decision.
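As background for interpreting the nomogram's performance, the area under the ROC curve equals the probability that a randomly chosen mutated case receives a higher predicted probability than a randomly chosen wild-type case, with ties counted as one half. The sketch below computes it by direct pairwise comparison. It is an illustration of the metric, not the study's validation code, and the scores and labels in the example are invented.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve by pairwise comparison.

    labels: 1 = mutated (positive), 0 = wild type (negative).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Invented predicted probabilities and outcomes, for illustration only.
    predicted = [0.02, 0.10, 0.08, 0.30]
    observed = [0, 0, 1, 1]
    print(roc_auc(predicted, observed))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation, so values near 0.70 indicate that many mutated and wild-type cases receive overlapping predicted probabilities.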
In a previous smaller study, we showed that never smokers were significantly more likely than former or current smokers to have a KRAS transition mutation (G>A) rather than the transversion mutations known to be smoking-related (G>T or G>C).21 The much larger number of cases in the present series allowed us to robustly confirm these earlier findings as well as to detect sex and age differences in the frequency of the most common smoking-related G>T transversion mutation, KRAS G12C. These findings support the notion that most KRAS-mutant lung adenocarcinomas in never smokers are not likely to be caused by environmental (secondhand) tobacco smoke, a potentially important observation in assessing the level of risk posed by such exposure.
Sex differences in sensitivity to tobacco smoke have been well documented.41 Zang and Wynder have reported that the odds ratios for major lung cancer types are consistently higher in women than in men at every level of exposure to cigarette smoke and that these differences cannot be explained by differences in baseline exposure, smoking history, or body size, but are likely due to a higher susceptibility to tobacco carcinogens in women.42 Computed tomographic screening data suggest that female smokers are almost twice as likely as male smokers to have a lung cancer detected in spite of lesser smoking histories.43 Consistent with our findings in KRAS, studies of the mutational spectrum of TP53 in relation to smoking and sex showed that cancers arising in female smokers had significantly more tobacco-related mutations (G>T transversions) than in male smokers.44,45 Therefore, taken together, the relatively higher percentage of the female patients with tumors containing KRAS G12C (due to G>T transversion), their younger age at diagnosis, and the fewer pack-years of smoking in women with this KRAS mutation, compared to men with the same KRAS mutation, provide yet another type of data supporting the hypothesis that women are more susceptible to tobacco carcinogens.
The apparent increased susceptibility of women to tobacco carcinogenesis may reflect constitutive differences in genes encoding tobacco carcinogen-metabolizing enzymes. For example, the cytochrome P450 phase I enzyme CYP1A1 shows higher expression in the normal lung tissue of female smokers than male smokers.46 The most common polymorphism found in phase II detoxification enzymes is the GSTM1-null genotype, which is present in 40%–50% of the general population due to homozygosity for a deletion polymorphism, and the impact of this GSTM1 genotype may be enhanced in female smokers.47
In summary, several observations emerge from this large analysis of the molecular epidemiology of EGFR and KRAS mutations in lung adenocarcinoma. Pack-years of smoking have a significant predictive value for the presence of EGFR and KRAS mutations and smoking-free years have additional predictive value for presence of EGFR mutations but not that of KRAS mutations. However, even in the context of a rigorously developed nomogram incorporating these clinical variables, it remains difficult to reliably identify a significant subset of smokers who would have an EGFR mutation likelihood of <1%, and therefore our data do not support excluding any subset of patients with lung adenocarcinoma from EGFR testing. Our results suggest a different etiology of KRAS mutations in smokers vs. never smokers and firmly support earlier observations of increased susceptibility to tobacco carcinogenesis in women. More broadly, our observations strengthen the notion that careful consideration of histologic subtypes (focusing on adenocarcinoma instead of mixing all lung cancer types) and molecular subtypes defined by distinct, non-overlapping driver mutations (EGFR, KRAS) can help to clarify epidemiologic associations that may otherwise remain elusive.48,49 This approach, which recognizes the possible etiologic diversity represented by different histologic and molecular subtypes, has recently been termed molecular pathological epidemiology.50
To clarify the molecular epidemiology of EGFR and KRAS mutations in lung adenocarcinoma, we examined tumor genotyping data in 3026 patients in relation to demographic, clinical and smoking history data. In addition to the expected reciprocal associations of EGFR and KRAS mutations with smoking history, this showed that 11% of smokers had EGFR-mutated tumors and 6% of never smokers had KRAS-mutated tumors. Pack-years of smoking were predictive for EGFR and KRAS mutations but even in the context of a nomogram, it is difficult to identify a significant subset of smokers with an EGFR mutation likelihood of <1%, and therefore our data do not support excluding any patient subset from EGFR testing. The distinct types of KRAS mutations in smokers vs. never smokers suggest that most KRAS-mutant lung cancers in never smokers are not due to secondhand smoke exposure. The higher frequency of KRAS G12C in women, their younger age, and lesser smoking history support a heightened susceptibility to tobacco carcinogens.
Financial support: NIH P01 CA129243 (to M.L., M.G.K.)
The authors wish to thank Justyna Sadowska, Jacklyn Casanova, and Lin Dong for excellent technical support. We also thank Dr Cameron Brennan for helpful discussions.