J Natl Cancer Inst. Dec 21, 2011; 103(24): 1859–1870.
Published online Dec 8, 2011. doi:  10.1093/jnci/djr420
PMCID: PMC3243673

Prognostic and Predictive Value of a Malignancy-Risk Gene Signature in Early-Stage Non–Small Cell Lung Cancer

Abstract

Background

The malignancy-risk gene signature is composed of numerous proliferative genes and has been applied to predict breast cancer risk. We hypothesized that the malignancy-risk gene signature has prognostic and predictive value for early-stage non–small cell lung cancer (NSCLC) patients.

Methods

The ability of the malignancy-risk gene signature to predict overall survival (OS) of early-stage NSCLC patients was tested using a large NSCLC microarray dataset from the Director’s Challenge Consortium (n = 442) and two independent NSCLC microarray datasets (n = 117 and 133, for the GSE13213 and GSE14814 datasets, respectively). An overall malignancy-risk score was generated by principal component analysis to determine the prognostic and predictive value of the signature. An interaction model was used to investigate a statistically significant interaction between adjuvant chemotherapy (ACT) and the gene signature. All statistical tests were two-sided.

Results

The malignancy-risk gene signature was statistically significantly associated with OS (P < .001) of NSCLC patients. Validation with the two independent datasets demonstrated that the malignancy-risk score had prognostic and predictive values: Of patients who did not receive ACT, those with a low malignancy-risk score had increased OS compared with those with a high malignancy-risk score (P = .007 and .01 for the GSE13213 and GSE14814 datasets, respectively), indicating a prognostic value; and in the GSE14814 dataset, patients receiving ACT survived longer in the high malignancy-risk score group (P = .03), and a statistically significant interaction between ACT and the signature was observed (P = .02).

Conclusions

The malignancy-risk gene signature was associated with OS and was a prognostic and predictive indicator. The malignancy-risk gene signature could be useful to improve prediction of OS and to identify those NSCLC patients who will benefit from ACT.

CONTEXT AND CAVEATS

Prior knowledge

Although adjuvant chemotherapy has become the standard treatment for non–small cell lung cancer (NSCLC), a proportion of patients do not benefit from the therapy. Molecular profiling of a patient’s tumor may be one tool to help clinicians determine which patients will benefit from adjuvant chemotherapy.

Study design

A malignancy-risk gene signature including 94 genes was previously derived from a comparison of normal and malignant breast tissues. Microarray data from the 442 NSCLC patients in the Director’s Challenge Consortium were analyzed to determine the prognostic and predictive value of the malignancy-risk gene signature after calculating an overall malignancy-risk score. The results were validated using microarray data from two additional independent microarray datasets of 117 and 133 patients, respectively.

Contribution

The malignancy-risk gene signature was associated with overall survival. The malignancy-risk score was also able to identify NSCLC patients who may benefit from adjuvant chemotherapy.

Implications

The malignancy-risk gene signature could potentially be used to predict overall survival of NSCLC patients and to identify patients who would benefit from adjuvant chemotherapy.

Limitations

Validation of the malignancy-risk gene signature in a large independent dataset is needed. Developing methods to measure the expression of the malignancy-risk gene signature in formalin-fixed and paraffin-embedded tissues would broaden the possible applications.

From the Editors

Lung cancer is one of the most common causes of cancer-related death worldwide, accounting for more than 1 million deaths each year. Non–small cell lung cancer (NSCLC) accounts for 80%–90% of all lung cancers (1). The primary treatment for early-stage NSCLC is surgery. However, 30%–50% of early-stage patients relapse after resection and die of metastatic recurrence (2). Five-year survival probabilities for early stage I and II NSCLC range from 40% to 70% (3). Several international clinical trials have demonstrated that adjuvant chemotherapy (ACT) improves the survival of patients with early-stage disease, reporting a 4%–15% survival advantage at 5 years (4–8). As a result, ACT has become the standard treatment for patients with resected stage II–III NSCLC (9). Clearly, a 4%–15% survival advantage at 5 years suggests that a proportion of patients, but not all patients, benefit from ACT. Given the morbidity associated with ACT, it is imperative to develop new prognostic tools to identify those patients with a high probability of relapse. Such advances would improve patient selection in early-stage NSCLC to optimize the potential benefits of ACT and minimize unnecessary treatment and treatment-associated morbidity.

Recent advances in molecular profiling have provided some insights into the importance of mRNA expression in cancer development (10–15). As such, numerous gene signatures have been developed to classify lung cancer patients with different clinical outcomes (14–23). There are some gene signatures derived from breast cancer that have prognostic value for lung cancer (24,25) or are associated with lung metastasis (26). This motivated us to investigate the clinical association between the malignancy-risk gene signature (27) and NSCLC.

We previously defined a malignancy-risk gene signature (27) that is rich in genes involved in cell proliferation and was associated with cancer risk in normal breast tissue, as well as a prognostic factor for breast cancer. In this study, we hypothesized that the malignancy-risk gene signature has prognostic and predictive value for early-stage NSCLC.

Methods

Development of the Malignancy-Risk Gene Signature

The malignancy-risk gene signature was derived from a comparison of breast normal tissues with breast cancers and is capable of discerning molecularly abnormal breast tissues that appear histologically normal (27). The signature includes 120 genes with 140 probe sets on the Affymetrix 133 Plus2 chip (Affymetrix, Inc, Santa Clara, CA), but its complexity is reduced to 94 genes with 102 probe sets in the Affymetrix 133 A chip (Supplementary Table 1, available online; updated in November 2010). This signature is predominantly composed of genes involved in proliferation (56 of the 94 malignancy-risk genes, 59.6%), consistent with the near universal loss of cell cycle control in the earliest stages of tumor development.

Microarray Datasets

Data for the primary analysis were from the Director’s Challenge Consortium for the Molecular Classification of Lung Adenocarcinoma (2). This is a large retrospective multisite microarray study for lung adenocarcinomas (https://array.nci.nih.gov/caarray/project/jacob-00182). A total of 442 samples were used for statistical analysis. Overall survival [censored at 5 years (2)] was the primary outcome variable with a median follow-up of 3.92 years (255 samples were from patients who were alive and 187 samples were from those who had died). Clinical predictors included TNM stage, T stage, N stage, pathological grade, smoking history, ACT, adjuvant radiotherapy, and sex (Table 1).

Table 1
Descriptive statistics of clinical predictors and association of the malignancy-risk gene signature with overall survival (OS) within subgroups in the Director’s Challenge Consortium dataset (n = 442)*

Two independent NSCLC microarray datasets and one breast cancer dataset were included to validate the malignancy-risk gene signature: GSE13213 (28) (http://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GSE13213), GSE14814 (29) (http://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GSE14814), and GSE10780 (27) (http://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GSE10780), respectively. The GSE13213 dataset had 117 lung adenocarcinoma samples with overall survival (OS) information available (68 samples were from patients who were alive and 49 samples were from those who had died). These 117 patients did not receive ACT, facilitating the evaluation of the prognostic value of the malignancy-risk gene signature. Because the dataset was generated from an Agilent cDNA array, we used gene symbols to identify the malignancy-risk genes for this dataset (116 probe sets for 87 genes). The GSE14814 dataset (Affymetrix 133 A chip) was extracted from the JBR.10 trial, a randomized controlled trial with two cohorts: patients who received ACT (n = 71) vs observation alone (n = 62). Because the study was a randomized trial and data were collected prospectively, this dataset provides a unique opportunity to evaluate both prognostic and predictive features for the malignancy-risk gene signature. For the GSE10780 (27) dataset, composed of 143 normal breast and 42 tumor samples, we evaluated whether gene patterns were consistent between breast and lung cancer (eg, increase in the expression of genes in both cancer types).

Statistical Analysis

All statistical analyses were performed with R software version 2.13.1 (http://www.R-project.org), and documentation was prepared with Sweave, combining R code and LaTeX (Supplementary Materials, available online).

Data Normalization and Derivation of the Malignancy-Risk Score.

Gene expression values were calculated using the robust multiarray average algorithm (30) for the Director’s Challenge Consortium, GSE14814, and GSE10780 datasets (Affymetrix gene chips), whereas the GSE13213 dataset was normalized by the Loess method (31).

An overall malignancy-risk score was generated by principal component analysis to reflect the combined expression of the malignancy-risk genes. Specifically, we used the first principal component (a weighted average expression among the malignancy-risk genes), as it accounts for the largest variability in the data, to represent the overall expression level for the signature. That is, malignancy-risk score = $\sum_i w_i x_i$, a weighted average expression among the malignancy-risk genes, where $x_i$ represents gene $i$ expression level, $w_i$ is the corresponding weight (loading coefficient) with $\sum_i w_i^2 = 1$, and the $w_i$ values maximize the variance of $\sum_i w_i x_i$. This approach has been used to derive the malignancy-risk gene signature in our previously reported breast cancer study (27).
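The score definition above can be illustrated with a minimal sketch. It assumes power iteration on the sample covariance matrix as the way to obtain the first principal component, and it centers each gene before weighting; the study itself used R, and the exact routine is not stated, so the function names and the toy data here are hypothetical.

```python
import math
import random


def first_principal_component(rows, iterations=500, seed=0):
    """Return (gene means, unit-norm loading weights) for the first PC.

    rows: list of samples, each a list of gene expression values.
    Power iteration is an illustrative choice, not the paper's stated method.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    rng = random.Random(seed)
    w = [rng.random() + 0.1 for _ in range(p)]
    for _ in range(iterations):
        v = [sum(cov[a][b] * w[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in v))
        w = [x / norm for x in v]  # keeps the constraint sum(w_i^2) = 1
    if sum(w) < 0:  # the sign of a principal component is arbitrary
        w = [-x for x in w]
    return means, w


def malignancy_risk_score(sample, means, w):
    """Weighted sum of centered expression values: sum_i w_i * x_i."""
    return sum(wi * (xi - mi) for wi, xi, mi in zip(w, sample, means))


# Hypothetical toy data: three perfectly co-varying "genes".
toy = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0], [4.0, 8.0, 12.0]]
toy_means, toy_w = first_principal_component(toy)
toy_scores = [malignancy_risk_score(s, toy_means, toy_w) for s in toy]
```

In the toy data every gene rises together, so the loadings are proportional to (1, 2, 3) and the scores increase from the first sample to the last.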

Association with OS and Other Clinical Predictors.

The influence of the malignancy-risk gene signature was tested to see if the OS of two malignancy-risk groups (high and low) formed by a median split of the malignancy-risk score were statistically significantly different. The two-sided log-rank test was used to calculate P values. Evaluation of the median split malignancy-risk score as an independent factor predicting lung cancer prognosis was done by including several clinical predictors in the multivariable Cox proportional hazards model: TNM stage (IA, IB, II, and III), grade (well, moderately, or poorly differentiated), ACT (yes or no), adjuvant radiotherapy (yes or no), sex (female or male), and smoking history (yes or no). The proportional hazards assumption was verified by the scaled Schoenfeld residual (32). Multivariable analysis was also used to evaluate interactions between the high and low malignancy-risk groups and a clinical predictor after adjusting other clinical predictors. Spearman correlation (r) analysis was used to test any increasing trend of the continuous malignancy-risk score with stage, grade, and smoking history. A two-sided log-rank test was used to determine if the malignancy-risk gene signature could predict OS within different malignancy-risk groups by clinical predictors (eg, TNM stage IA, IB, and II–III) or risk groups jointly defined by all clinical predictors (eg, TNM stage with smoking history). Our study did not investigate the potential influence of race/ethnicity because the majority of patients were of European ancestry (n = 294) with only 12 African Americans and seven Asian/Pacific Islanders. Race/ethnicity data were not reported for the remaining 129 patients.
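As a rough illustration of the median split followed by a two-sided log-rank test, the sketch below implements the standard two-group log-rank statistic in plain Python. The function names and the toy follow-up times are hypothetical, and the handling of scores exactly equal to the median (assigned to the low group) is an assumption, since the text does not specify it.

```python
import math
import statistics


def median_split(scores):
    """Label each score 'high' or 'low' relative to the median (ties -> low)."""
    cutoff = statistics.median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, two-sided P).

    events are 1 for death and 0 for censoring.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk = [(ti, e, g) for ti, e, g in data if ti >= t]
        n = len(at_risk)
        n_a = sum(1 for _, _, g in at_risk if g == 0)
        d = sum(1 for ti, e, _ in at_risk if ti == t and e)
        d_a = sum(1 for ti, e, g in at_risk if ti == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance of deaths in group A
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical toy data: group A dies early, group B dies late.
chi2, p_value = logrank_test([1, 2, 3, 4], [1, 1, 1, 1],
                             [5, 6, 7, 8], [1, 1, 1, 1])
groups = median_split([0.1, 0.2, 0.3, 0.4])
```

The statistic is symmetric in the two groups, so swapping the arguments gives the same P value.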

Univariate Analysis.

Cox proportional hazard models were used to examine the association of each malignancy-risk gene with OS. The scaled Schoenfeld residual was used to verify the proportional hazards assumption (32). Fisher exact test was used to determine the overall statistical significance of the malignancy-risk genes (102 probe sets) by comparison with non–malignancy-risk genes (22 181 probe sets). The two-sided P value was calculated by univariate analysis and was adjusted by the false discovery rate for multiple testing (33).
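The multiple-testing adjustment can be sketched as follows, assuming the Benjamini–Hochberg step-up procedure; the text says only that P values were adjusted by the false discovery rate (33), so this choice, the function name, and the example P values are assumptions for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P values for four probe sets.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q <= 0.05 for q in adjusted]
```

A probe set is then called statistically significant at a chosen false discovery rate level when its adjusted value falls at or below that level.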

Evaluation of Prognostic and Predictive Features.

According to the guideline by Clark (34), we tested the prognostic value of the malignancy-risk gene signature on the patients without ACT for each of the three lung cancer datasets to see if those with either high or low malignancy-risk scores (high or low malignancy-risk group) had statistically significantly different OS as measured by the two-sided log-rank test. For the predictive value, treatment effect (compared with an observation cohort who did not receive ACT) was evaluated to determine any association with OS within each malignancy-risk group in the GSE14814 dataset. In addition, an interaction model was used to investigate a statistically significant interaction between ACT and the malignancy-risk gene signature, which could suggest differential treatment effects among those in the high or low malignancy-risk groups.
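One common way to write such an interaction model, assuming a Cox proportional hazards form (the symbols $A$, $M$, and the $\beta$ coefficients are notation introduced here for illustration, not taken from the paper), is:

```latex
h(t \mid A, M) = h_0(t)\,\exp\!\left(\beta_A A + \beta_M M + \beta_{AM}\, A M\right)
```

Here $A = 1$ for patients who received ACT and $0$ for observation, and $M = 1$ for the high malignancy-risk group and $0$ for the low group. Under this form, the hazard ratio for ACT vs observation is $\exp(\beta_A)$ in the low malignancy-risk group and $\exp(\beta_A + \beta_{AM})$ in the high malignancy-risk group, so a test of $\beta_{AM} = 0$ asks whether the treatment effect differs between the two groups.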

Because the microarray platforms were different among the three NSCLC datasets, gene level data were used for evaluation (a gene expression level was defined as an average of the expression level for a set of probe sets for the same gene; any probe set with a missing value was excluded). As a result, 87 malignancy-risk genes were identified in all datasets to evaluate the predictive value of the signature. Eighty-two patients were excluded in the Director’s Challenge Consortium data for the evaluation of the prognostic and predictive values because they were included in the GSE14814 dataset (29). Before analysis, data were standardized by centering the mean and scaled by the SD for each gene in each dataset. Principal component analysis was first implemented on the Director’s Challenge Consortium data to obtain the malignancy-risk score, which was constructed on the basis of the loading coefficients from the first principal component. The same loading coefficients were also used to compute the malignancy-risk score for the GSE13213 and GSE14814 datasets. The median of the malignancy-risk score in the Director’s Challenge Consortium dataset was used as the cutoff to designate low and high malignancy-risk groups in each of the three datasets to test the prognostic and predictive values.
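The cross-dataset procedure described above (standardize each gene within each dataset, reuse the training loadings, and reuse the training median as the cutoff) might look like the sketch below. The loadings, the two-gene toy matrices, and the function names are hypothetical placeholders rather than values from the study.

```python
import statistics


def standardize_columns(rows):
    """Center each gene (column) at its mean and scale by its SD."""
    p = len(rows[0])
    means = [statistics.mean([r[j] for r in rows]) for j in range(p)]
    sds = [statistics.stdev([r[j] for r in rows]) for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def score_samples(rows, loadings):
    """Standardize within the dataset, then apply fixed loading coefficients."""
    return [sum(w * x for w, x in zip(loadings, r))
            for r in standardize_columns(rows)]


def assign_groups(scores, cutoff):
    """Label samples relative to a cutoff taken from the training dataset."""
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical stand-ins for the training and validation expression matrices.
training = [[1.0, 2.0], [2.0, 3.0], [3.0, 5.0], [4.0, 6.0]]
validation = [[10.0, 1.0], [12.0, 2.0], [14.0, 4.0]]
loadings = [0.6, 0.8]  # hypothetical first-PC loadings from the training data

training_scores = score_samples(training, loadings)
cutoff = statistics.median(training_scores)
validation_groups = assign_groups(score_samples(validation, loadings), cutoff)
```

Standardizing within each dataset is what lets a single cutoff be carried across platforms with different measurement scales.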

Results

Data from the Director’s Challenge Consortium were used in the primary analysis of 1) the association between the malignancy-risk gene signature and OS, grade, TNM stage, and other clinical predictors and 2) the malignancy-risk gene signature within different risk groups by clinical predictors and the interaction between the two. A univariate analysis was also done. The other two lung datasets were used to test the prognostic and predictive value of the malignancy-risk gene signature.

Principal Component Analysis

The malignancy-risk gene signature was analyzed using principal component analysis to evaluate the percent of variability and loading coefficients by the first principal component (ie, the malignancy-risk score) for each of the four datasets. Results showed 43.1%–53% variability explained by the first principal component in three lung datasets and 72.1% variability in the breast cancer dataset (Supplementary Figure 1, available online), suggesting that the first principal component represents the malignancy-risk gene signature well. Pearson correlation of the loading coefficients was 0.92–0.97 among the three early-stage NSCLC datasets and 0.79–0.87 between the breast cancer dataset and the three early-stage NSCLC datasets, indicating transferability of the signature between breast cancer and lung cancer (Supplementary Figure 1, available online).

Relationship between the Malignancy-Risk Gene Signature and OS and Other Clinical Predictors

Division of lung cancer patients from Director’s Challenge Consortium dataset into high vs low malignancy-risk groups showed that patients in the high malignancy-risk group had statistically significantly shorter OS compared with those in the low malignancy-risk group (log-rank P < .001 and hazard ratio [HR] of death = 2.02, 95% confidence interval [CI] = 1.5 to 2.72) (Figure 1). The 5-year survival rate estimate for the high malignancy-risk group (5-year survival rate = 45.2%, 95% CI = 38.9% to 52.5%) was less than that for the low malignancy-risk group (5-year survival rate = 64.6%, 95% CI = 58.1% to 71.8%), and their 95% confidence intervals did not overlap (Figure 1). In multivariable analysis, the median-split malignancy-risk score was a statistically significant prognostic predictor (P < .001) after adjusting for clinical predictors, including TNM stage, grade, smoking history, sex, and adjuvant treatments (HR = 2.14, 95% CI = 1.42 to 3.22 for high vs low malignancy-risk groups). The assumption of proportional hazards was not rejected. In relation to histological grade, an increasing trend from well to poorly differentiated tumors was observed for the malignancy-risk score (r = 0.52, P < .001) (Figure 2, A). A similar association between the malignancy-risk score and TNM stage (r = 0.24, P < .001) (Figure 2, B), pathological T stage (r = 0.28, P < .001), pathological N stage (r = 0.13, P = .01), and smoking history (r = 0.27, P < .001) was observed (Supplementary Figure 2, available online).
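For readers unfamiliar with how a 5-year survival rate estimate is obtained from censored follow-up, the sketch below uses the Kaplan–Meier product-limit estimator. That the reported rates are Kaplan–Meier estimates is an assumption based on the Kaplan–Meier curves mentioned in the figure captions, and the toy follow-up data and function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    events are 1 for death and 0 for censoring.
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def survival_at(curve, horizon):
    """Survival probability at the horizon (step function, right-continuous)."""
    prob = 1.0
    for t, s in curve:
        if t <= horizon:
            prob = s
    return prob


# Hypothetical follow-up in years; 0 marks a censored patient.
toy_times = [1, 2, 3, 4, 6]
toy_events = [1, 0, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
five_year = survival_at(toy_curve, 5)
```

Censored patients contribute to the at-risk count until they leave follow-up, which is why the estimate differs from a simple proportion of survivors.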

Figure 1
Association of the malignancy-risk gene signature with overall survival. A malignancy-risk score was generated for each patient from the Director’s Challenge Consortium (n = 442) by principal component analysis to reflect the combined expression ...
Figure 2
Association of the malignancy-risk gene signature with histological grade and TNM staging system. The malignancy-risk score was calculated for patients from the Director’s Challenge Consortium for whom data on A) histological grade or B) TNM stage ...

Evaluation of the Signature Within Different Risk Groups by Clinical Predictors and a Measurement of the Potential Interaction

Several clinical predictors were statistically significantly associated with OS by log-rank test: TNM stage (P < .001; Figure 3, A), pathological T stage (P < .001), pathological N stage (P < .001), ACT (P = .01), and adjuvant radiotherapy (P < .001) (Supplementary Figure 3, available online). For each clinical predictor, a statistically significant association of the malignancy-risk gene signature with OS was found in one or more risk groups: TNM stage IB and III (P = .004 and .003, respectively); pathological T stage T2 (P < .001); pathological N stage N0, N1, and N2 (P = .005, .03, and .004, respectively); moderately differentiated histological grade (P < .001); patients who did not receive ACT (P < .001); patients who did not receive adjuvant radiotherapy (P < .001); male and female patients (P = .02 and <.001, respectively); and former smokers (P < .001) (Table 1). For example, TNM stage was associated with poor survival in patients with late-stage disease (P < .001) (Figure 3, A). For each TNM stage subgroup, patients with low malignancy-risk scores had increased OS compared with those with a high malignancy-risk score in stage IB and III (stage IB: log-rank P = .004, HR = 2.29, 95% CI = 1.27 to 4.13; stage III: log-rank P = .003, HR = 2.57, 95% CI = 1.36 to 4.86) (Figure 3, B and C).

Figure 3
Analysis of the association between the malignancy-risk gene signature and overall survival by TNM stage. A) Kaplan–Meier curves of overall survival for patients from the Director’s Challenge Consortium for whom data on TNM stage was available ...

In addition, multivariable analysis using all clinical predictors (without the signature) yielded two statistically significant predictors of OS: TNM stage and smoking history. Because the malignancy-risk gene signature had shown a statistically significant association with OS in stage IB and III, smoking history was examined in the two subgroups to evaluate the usefulness of the malignancy-risk score within each subgroup (2 stages × 3 smoking statuses). Subgroup analysis showed that for the stage IB patients with past smoking history, the malignancy-risk gene signature was able to differentiate the two risk groups, with increased OS observed in the group with a low malignancy-risk score (log-rank P < .001; and HR = 3.39, 95% CI = 1.57 to 7.29) (Figure 4). The 5-year survival rate estimate for the high malignancy-risk group (5-year survival rate = 49.3%, 95% CI = 36.8% to 66%) was less than that for the low malignancy-risk group (5-year survival rate = 79%, 95% CI = 67.5% to 92.5%), and their 95% confidence intervals did not overlap (Figure 4). We also investigated whether interactions between a clinical predictor and the median-split malignancy-risk score existed. A statistically significant interaction between the malignancy-risk gene signature and TNM stage was observed after adjusting for other clinical predictors (stage IB HR = 6.23, 95% CI = 1.19 to 32.53, PInteraction = .03 and stage III HR = 6.94, 95% CI = 1.27 to 38.07, PInteraction = .03) (data not shown).

Figure 4
Analysis of the association between the malignancy-risk gene signature and overall survival in stage IB patients with smoking history. Data from the Director’s Challenge Consortium (n = 100) was analyzed. A statistically significant difference ...

Univariate Analysis

Univariate analysis by Cox proportional hazards modeling showed that 75.5% of the malignancy-risk probe sets (77 probe sets for 70 genes) were statistically significantly associated with OS at P < .01 in the Director’s Challenge Consortium dataset. In contrast, only 10.7% of the non–malignancy-risk probe sets were statistically significant. The difference between these two (75.5% vs 10.7%) was statistically significant (P < .001 by Fisher exact test), indicating a strong association between the malignancy-risk gene signature and OS (Supplementary Figure 4, available online). After adjusting for multiple testing at the 1% false discovery rate level, there were 67 unique statistically significant malignancy-risk genes (74 probe sets), of which 48 genes (71.6%) are involved in cell proliferation (Supplementary Table 2, available online). All 48 proliferative genes were correlated with shorter OS when the genes were overexpressed. Moreover, these genes were consistent with those identified in our previous study (27) in which the malignancy-risk gene signature was identified in breast tumors (Supplementary Table 2, available online).

Prognostic and Predictive Value of the Malignancy-Risk Gene Signature for NSCLC

The malignancy-risk gene signature was prognostic for OS in the patients who did not receive ACT or radiation therapy with poorer survival in the high malignancy-risk group in the three lung cancer datasets (Director’s Challenge Consortium dataset: log-rank P = .004; HR of death = 2.10, 95% CI = 1.26 to 3.51; GSE13213 dataset: log-rank P = .007; HR of death = 2.17, 95% CI = 1.22 to 3.86; GSE14814 dataset: log-rank P = .01; HR of death = 2.57, 95% CI = 1.17 to 5.64) (Figure 5, A–C). For the predictive value evaluated in the GSE14814 dataset, the ACT cohort experienced longer survival than the observation cohort in the high malignancy-risk group (log-rank P = .03; HR of survival = 0.48, 95% CI = 0.24 to 0.96) (Figure 6, A). Patients in the high malignancy-risk group had a higher 5-year survival rate estimate for patients who received ACT (5-year survival rate = 72.7%, 95% CI = 59% to 89.6%) compared with those who received observation only (5-year survival rate = 39.2%, 95% CI = 25.4% to 60.4%) (Figure 6, A). In the low malignancy-risk group, patients who received ACT had a lower survival probability in the first 2 years than those who received observation; however, there was no statistically significant difference between the two groups (Figure 6, B). Moreover, the interaction between ACT and the malignancy-risk gene signature was statistically significant (HR = 0.29, 95% CI = 0.10 to 0.85, PInteraction = .02).

Figure 5
Prognostic value of the malignancy-risk gene signature. A malignancy-risk score was generated using the loading coefficients of the first principal component from the Director’s Challenge Consortium dataset for each patient. High and low malignancy-risk ...
Figure 6
Predictive value of the malignancy-risk gene signature. A malignancy-risk score was generated using the loading coefficients of the first principal component from the Director’s Challenge Consortium dataset for each patient. High and low malignancy-risk ...

Evaluation of the predictive value in the Director’s Challenge Consortium dataset showed that patients who received ACT had poorer OS in both the high and low malignancy-risk groups compared with patients who did not receive ACT. Overall, patients receiving ACT had poorer OS than those who did not receive ACT (log-rank P = .01; HR of death = 1.59, 95% CI = 1.12 to 2.27) (Supplementary Figure 5, available online). Because this was a retrospective study, it is likely that the patients receiving ACT had high-risk clinical characteristics such that ACT was recommended. As expected, the patients who received ACT had shorter survival than those who did not receive adjuvant treatment in the low malignancy-risk group (log-rank P = .002; HR of death = 2.36, 95% CI = 1.34 to 4.15) (Supplementary Figure 5, available online). However, this result should not be interpreted as indicating that ACT harmed patients; rather, it indicates that poorer survival may be associated with high-risk clinical characteristics. On the other hand, the high malignancy-risk group also showed poorer survival in the ACT cohort, but the HR was relatively small compared with that of the low malignancy-risk group (log-rank P = .52; HR of death = 1.16, 95% CI = 0.73 to 1.86) (Supplementary Figure 5, available online). This observation indicates that there may be some clinical advantage to adjuvant treatment in the high malignancy-risk group, but the benefit could not overcome the detrimental contribution of high-risk clinical characteristics.

Discussion

In this study, we have demonstrated that the malignancy-risk gene signature is a prognostic and predictive indicator in early-stage NSCLC. In our previous study (27), the signature was shown to be capable of discriminating molecularly abnormal breast tissues that appear histologically normal from molecularly normal breast tissues. Both observations suggest that expression of genes in the malignancy-risk gene signature may contribute to carcinogenesis in lung and breast cancer (35–37). The malignancy-risk gene signature was derived from a comparison of normal breast tissue with invasive ductal carcinomas (27). A majority of the genes in the malignancy-risk signature are core regulators of the mammalian cell cycle and are essential for DNA replication and repair (38). The application of the malignancy-risk gene signature to both breast and lung cancers may not be surprising because sustained proliferative signaling has been considered one of the earliest and most fundamental hallmarks of cancer cells for the past decade (39).

Several gene signatures have been developed to predict outcome in NSCLC (14–23). Generally, these gene signatures are not composed of genes involved in proliferation, and few malignancy-risk genes overlapped with these signatures. In fact, a common biology underlying these previously defined gene signatures has not been described. Nonetheless, our study showed that the malignancy-risk gene signature, a proliferative gene signature, is associated with both cancer risk and progression. One might predict that a gene signature derived from the Director’s Challenge Consortium dataset of lung cancers could have better prognostic and predictive value than the malignancy-risk gene signature because there may be substantial differences between lung and breast cancer, and the gene signature derived from the breast tissue may not be optimal for lung cancer. Surprisingly, a gene signature derived on the basis of high correlation with OS in the Director’s Challenge Consortium dataset was prognostic but not predictive (data not shown). Furthermore, the majority of genes in the malignancy-risk signature were absent in this signature, as has been reported for other gene signatures (20,40) derived from this database. Why these strongly prognostic and predictive genes do not appear in these analyses is unclear. What is clear is that different approaches may lead to different gene signatures.

There are a few gene signatures developed in breast cancer and tested in lung cancer, although they do not completely overlap with the malignancy-risk gene signature and are either a metastasis signature (26) or a prognostic signature (24,25). In contrast, the malignancy-risk gene signature features both prognostic and predictive factors in NSCLC and shares some unique clinical features in both lung and breast cancer. The expression of the majority of malignancy-risk genes was increased in breast cancer and also was associated with poorer survival in lung cancer. In addition, a strong correlation of the loading coefficients was reported between the two tumor types. To the best of our knowledge, our study is the first to show such a high consistency of the gene signature in both tumor types. The malignancy-risk gene signature showed clinical association with cancer relapse/progression and prognosis in breast cancer (41). Similarly, the gene signature demonstrated a statistically significant association with OS and other clinical predictors in NSCLC (TNM stage and histological grade). Collectively, these findings suggest transferability of the malignancy-risk gene signature between breast and lung cancer, one unique feature not seen in other gene signatures derived for various tumor types (24–26).

From the predictive aspect, the malignancy-risk gene signature has demonstrated the potential to identify early-stage NSCLC patients likely to benefit from ACT. A 15-gene signature described by Zhu et al. (29) was the first predictive signature for ACT in resected NSCLC, derived from the randomized phase III JBR.10 trial (8). However, the malignancy-risk gene signature also showed a statistically significant predictive value comparable with that on the reverse transcription polymerase chain reaction basis reported by Zhu et al. (29) with no overlap between the genes in both signatures. This observation suggests that the relationship between a survival benefit and ACT could also be affected by expression of the genes included in the malignancy-risk gene signature. Specifically, the survival benefit from ACT relative to the observation cohort was considerably greater in the high malignancy-risk group. In contrast, the survival benefit of ACT vs the observation cohort was not statistically significant in the low malignancy-risk group; however, the observation cohort seemed to have the advantage in OS for the first 2 years compared with those receiving ACT. In addition, evaluation of the predictive value in the Director’s Challenge Consortium dataset indirectly supported the utility of the signature, although it was a retrospective study. Together, these results suggest that the malignancy-risk gene signature is a strong predictive factor for a differential OS benefit from ACT. Although recent multinational clinical trials (4–9) have established that ACT is associated with improvement of OS in patients with early-stage NSCLC, the malignancy-risk gene signature may provide an additional tool to help identify a subset of patients at high risk of death who may benefit from ACT.

Similar to other prognostic signatures (15–21,23,24), the malignancy-risk gene signature was able to predict OS in NSCLC patients. Patients with a high malignancy-risk score tended to have shorter OS compared with those who had a low malignancy-risk score. In addition, subgroup analysis showed the malignancy-risk signature’s value beyond the conventional clinical predictors, with a statistically significant association of the gene signature with OS in one or more risk groups for each clinical predictor. In particular, the malignancy-risk gene signature was able to consistently distinguish between the two risk groups (low and high malignancy-risk groups, corresponding to good and poor OS, respectively) in the subgroups of stage IB patients and stage IB patients who had a history of smoking. Because the benefit of ACT remains unclear in stage IB NSCLC (4–9), the signature may have potential clinical application for stage IB patients, such as recommendation of ACT only for stage IB patients with a high malignancy-risk score. The utility of the signature for treatment management in stage IB still needs to be further evaluated in a prospective ACT clinical trial.
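
The high and low malignancy-risk groups discussed above rest on a single score per patient derived by principal component analysis. The sketch below illustrates that idea under stated assumptions: the score is taken as the projection onto the first principal component, the cut point is the median, and the expression values are made-up toy numbers. The function names are hypothetical, and power iteration stands in for a library PCA routine; none of this is presented as the exact procedure used in the study.

```python
import math
import statistics


def first_pc_loadings(rows, iters=500):
    """Gene means and unit-length first-PC loadings for a samples x genes table."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (genes x genes).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration: repeated multiplication converges to the top eigenvector.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # The sign of a principal component is arbitrary; fix it so that a higher
    # score means higher overall expression (an assumption of this sketch).
    if sum(v) < 0:
        v = [-x for x in v]
    return means, v


def risk_scores(rows, means, loadings):
    """Project each centered sample onto the first principal component."""
    return [sum((r[j] - means[j]) * loadings[j] for j in range(len(loadings)))
            for r in rows]


def split_by_median(scores):
    """Label samples 'high' or 'low' relative to the median score (assumed cut)."""
    cut = statistics.median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Toy expression table: 6 hypothetical patients x 3 hypothetical genes.
expr = [
    [1.0, 1.2, 0.9],
    [2.0, 2.1, 1.8],
    [3.0, 2.9, 3.2],
    [4.0, 4.2, 3.9],
    [5.0, 5.1, 4.8],
    [6.0, 5.8, 6.1],
]
means, loadings = first_pc_loadings(expr)
scores = risk_scores(expr, means, loadings)
groups = split_by_median(scores)
print(groups)
```

In practice the loadings would be estimated once in a training dataset and then applied unchanged to independent datasets, so that the validation scores are not tuned to the validation data.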

Our study has some limitations. First, we have shown the prognostic and predictive values of the malignancy-risk gene signature using three publicly available NSCLC microarray datasets. However, to be considered as a personalized medicine strategy for clinical decision making, validation of the malignancy-risk gene signature in an independent dataset, larger than or at least comparable with the Director’s Challenge Consortium dataset, is needed. Successful validation will advance the malignancy-risk gene signature to the next level of analytical and clinical validity. We plan to evaluate the malignancy-risk gene signature and complete a large-scale validation using microarray data from Total Cancer Care (42,43) collected at the Moffitt Cancer Center. Second, the microarray datasets in our study used fresh frozen tissues to extract RNA to measure gene expression. Although fresh frozen tissues are commonly used in research communities for microarray experiments, formalin-fixed and paraffin-embedded tissues are often collected in community-based hospitals. If the malignancy-risk gene signature could be validated in formalin-fixed and paraffin-embedded tissues, the signature would have great clinical utility for broad application in personalizing treatment care. A recent study has demonstrated the feasibility of using formalin-fixed and paraffin-embedded tissues for gene signature development in NSCLC (22).

In summary, the malignancy-risk gene signature could be useful to improve prediction of OS in NSCLC patients and is a potential tool to more accurately identify patients who will benefit from adjuvant therapy after surgical resection. Future prospective studies are needed to determine if the malignancy-risk score can be used clinically to benefit early-stage NSCLC patients.

Funding

This work was supported in part by the National Institutes of Health (P50-CA119997 to D.-T.C., W.D.C., and E.B.H.; P30CA076292 to D.-T.C., D.C., and W.J.F.; and R01CA112215 to T.J.Y.); US Army Medical Research and Materiel Command, National Functional Genomics Center project (W81XWH-08-2-0101 to W.J.F., W.D.C., and D.C.); and Taiwan National Science Council (NSC-99-2118-M-005-004 to Y.-L.H.).

Footnotes

The authors take full responsibility for the design of the study, the analysis and interpretation of the data, the writing of the article, and the decision to submit the article for publication. The content does not necessarily represent the view of the sponsors. We are also grateful to the reviewers for their constructive comments, which led to improvements in this article.

References

1. Wahbah M, Boroumand N, Castro C, et al. Changing trends in the distribution of the histologic types of lung cancer: a review of 4,439 cases. Ann Diagn Pathol. 2007;11(2):89–96.
2. Shedden K, Taylor JM, Enkemann SA, et al. Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study. Nat Med. 2008;14(8):822–827.
3. Booth CM, Shepherd FA, Peng Y, et al. Adoption of adjuvant chemotherapy for non-small-cell lung cancer: a population-based outcomes study. J Clin Oncol. 2010;28(21):3472–3478.
4. Pignon JP, Tribodet H, Scagliotti GV, et al. Lung adjuvant cisplatin evaluation: a pooled analysis by the LACE Collaborative Group. J Clin Oncol. 2008;26(21):3552–3559.
5. Arriagada R, Bergman B, Dunant A, et al. Cisplatin-based adjuvant chemotherapy in patients with completely resected non-small-cell lung cancer. N Engl J Med. 2004;350(4):351–360.
6. Douillard JY, Rosell R, De Lena M, et al. Adjuvant vinorelbine plus cisplatin versus observation in patients with completely resected stage IB-IIIA non-small-cell lung cancer (Adjuvant Navelbine International Trialist Association [ANITA]): a randomised controlled trial. Lancet Oncol. 2006;7(9):719–727.
7. Strauss GM, Herndon JE, II, Maddaus MA, et al. Adjuvant paclitaxel plus carboplatin compared with observation in stage IB non-small-cell lung cancer: CALGB 9633 with the Cancer and Leukemia Group B, Radiation Therapy Oncology Group, and North Central Cancer Treatment Group Study Groups. J Clin Oncol. 2008;26(31):5043–5051.
8. Winton T, Livingston R, Johnson D, et al. Vinorelbine plus cisplatin vs. observation in resected non-small-cell lung cancer. N Engl J Med. 2005;352(25):2589–2597.
9. Pisters KM, Evans WK, Azzoli CG, et al. Cancer Care Ontario and American Society of Clinical Oncology adjuvant chemotherapy and adjuvant radiation therapy for stages I-IIIA resectable non small-cell lung cancer guideline. J Clin Oncol. 2007;25(34):5506–5518.
10. Wigle DA, Jurisica I, Radulovich N, et al. Molecular profiling of non-small cell lung cancer and correlation with disease-free survival. Cancer Res. 2002;62(11):3005–3008.
11. Larsen JE, Pavey SJ, Passmore LH, et al. Gene expression signature predicts recurrence in lung adenocarcinoma. Clin Cancer Res. 2007;13(10):2946–2954.
12. Raponi M, Zhang Y, Yu J, et al. Gene expression signatures for predicting prognosis of squamous cell and adenocarcinomas of the lung. Cancer Res. 2006;66(15):7466–7472.
13. Kratz JR, Jablons DM. Genomic prognostic models in early-stage lung cancer. Clin Lung Cancer. 2009;10(3):151–157.
14. Boutros PC, Lau SK, Pintilie M, et al. Prognostic gene signatures for non-small-cell lung cancer. Proc Natl Acad Sci U S A. 2009;106(8):2824–2828.
15. Roepman P, Jassem J, Smit EF, et al. An immune response enriched 72-gene prognostic profile for early-stage non-small-cell lung cancer. Clin Cancer Res. 2009;15(1):284–290.
16. Chen HY, Yu SL, Chen CH, et al. A five-gene signature and clinical outcome in non-small-cell lung cancer. N Engl J Med. 2007;356(1):11–20.
17. Skrzypski M, Jassem E, Taron M, et al. Three-gene expression signature predicts survival in early-stage squamous cell carcinoma of the lung. Clin Cancer Res. 2008;14(15):4794–4799.
18. Sun Z, Wigle DA, Yang P. Non-overlapping and non-cell-type-specific gene expression signatures predict lung cancer survival. J Clin Oncol. 2008;26(6):877–883.
19. Baty F, Facompre M, Kaiser S, et al. Gene profiling of clinical routine biopsies and prediction of survival in non-small cell lung cancer. Am J Respir Crit Care Med. 2010;181(2):181–188.
20. Wan YW, Sabbagh E, Raese R, et al. Hybrid models identified a 12-gene signature for lung cancer prognosis and chemoresponse prediction. PLoS One. 2010;5(8):e12222.
21. Kadara H, Lacroix L, Behrens C, et al. Identification of gene signatures and molecular markers for human lung cancer prognosis using an in vitro lung carcinogenesis system. Cancer Prev Res (Phila). 2009;2(8):702–711.
22. Xie Y, Xiao G, Coombes K, et al. Robust gene expression signature from formalin-fixed paraffin-embedded samples predicts prognosis of non-small-cell lung cancer patients. Clin Cancer Res. 2011;17(17):5705–5714.
23. Raz DJ, Ray MR, Kim JY, et al. A multigene assay is prognostic of survival in patients with early-stage lung adenocarcinoma. Clin Cancer Res. 2008;14(17):5565–5570.
24. Liu R, Wang X, Chen GY, et al. The prognostic role of a gene signature from tumorigenic breast-cancer cells. N Engl J Med. 2007;356(3):217–226.
25. Wan YW, Qian Y, Rathnagiriswaran S, et al. A breast cancer prognostic signature predicts clinical outcomes in multiple tumor types. Oncol Rep. 2010;24(2):489–494.
26. Minn AJ, Gupta GP, Siegel PM, et al. Genes that mediate breast cancer metastasis to lung. Nature. 2005;436(7050):518–524.
27. Chen DT, Nasir A, Culhane A, et al. Proliferative genes dominate malignancy-risk gene signature in histologically-normal breast tissue. Breast Cancer Res Treat. 2010;119(2):335–346.
28. Tomida S, Takeuchi T, Shimada Y, et al. Relapse-related molecular signature in lung adenocarcinomas identifies patients with dismal prognosis. J Clin Oncol. 2009;27(17):2793–2799.
29. Zhu CQ, Ding K, Strumpf D, et al. Prognostic and predictive gene signature for adjuvant chemotherapy in resected non-small-cell lung cancer. J Clin Oncol. 2010;28(29):4417–4424.
30. Irizarry RA, Bolstad BM, Collin F, et al. Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res. 2003;31(4):e15.
31. Yang YH, Dudoit S, Luu P, et al. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res. 2002;30(4):e15.
32. Grambsch P, Therneau T. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika. 1994;81(3):515–526.
33. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Series B Methodol. 1995;57(1):289–300.
34. Clark GM. Prognostic factors versus predictive factors: examples from a clinical trial of erlotinib. Mol Oncol. 2008;1(4):406–412.
35. Rosenwald A, Wright G, Wiestner A, et al. The proliferation gene expression signature is a quantitative integrator of oncogenic events that predicts survival in mantle cell lymphoma. Cancer Cell. 2003;3(2):185–197.
36. Whitfield ML, George LK, Grant GD, et al. Common markers of proliferation. Nat Rev Cancer. 2006;6(2):99–106.
37. Chung CH, Bernard PS, Perou CM. Molecular portraits and the family tree of cancer. Nat Genet. 2002;32(suppl):533–540.
38. Bild AH, Yao G, Chang JT, et al. Oncogenic pathway signatures in human cancers as a guide to targeted therapies. Nature. 2006;439(7074):353–357.
39. Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011;144(5):646–674.
40. Guo NL, Wan YW, Bose S, et al. A novel network model identified a 13-gene lung cancer prognostic signature. Int J Comput Biol Drug Des. 2011;4(1):19–39.
41. Chen DT, Nasir A, Venkataramu C, et al. Evaluation of malignancy-risk gene signature in breast cancer patients. Breast Cancer Res Treat. 2010;120(1):25–34.
42. Yeatman TJ, Mule J, Dalton WS, et al. On the eve of personalized medicine in oncology. Cancer Res. 2008;68(18):7250–7252.
43. Koomen JM, Haura EB, Bepler G, et al. Proteomic contributions to personalized cancer care. Mol Cell Proteomics. 2008;7(10):1780–1794.

Articles from JNCI Journal of the National Cancer Institute are provided here courtesy of Oxford University Press