Early diagnosis of lung cancer followed by surgery presently is the most effective treatment for non-small-cell lung cancer (NSCLC). An accurate, minimally invasive test that could detect early disease would permit timely intervention and potentially reduce mortality. Recent studies have shown that the peripheral blood can carry information related to the presence of disease, including prognostic information and information on therapeutic response. We have analyzed gene expression in peripheral blood mononuclear cell (PBMC) samples including 137 patients with NSCLC tumors and 91 patient controls with non-malignant lung conditions, including histologically diagnosed benign nodules. Subjects were primarily smokers and former smokers. We have identified a 29-gene signature that separates these two patient classes with 86% accuracy (91% sensitivity, 80% specificity). Accuracy in an independent validation set, including samples from a new location, was 78% (sensitivity of 76% and specificity of 82%). An analysis of this NSCLC-gene signature in 18 NSCLCs taken pre-surgery, with matched samples from 2-5 months post-surgery, showed that in 78% of cases, the signature was reduced post-surgery and disappeared entirely in 33%. Our results demonstrate the feasibility of using peripheral blood gene expression signatures to identify early-stage NSCLC in at-risk populations.
Lung cancer is the second most-prevalent cancer occurring in both men and women in the United States, accounting for 162,000 deaths in 2008 (1), more than any other cancer. High-risk populations include smokers and former smokers, as well as individuals exposed to second-hand smoke, asbestos, and radon. Presently, there is no easily applied screening protocol for lung cancer similar to those used for breast, prostate, and colon cancers. Screening high-risk patients with low-dose spiral CT (LDCT) (2-5) identifies small, non-calcified pulmonary nodules in approximately 30-70% of high-risk individuals, but only a small proportion (0.4 to 2.7%) of detected nodules ultimately are diagnosed as lung cancers (6-8). Even using the best clinical algorithms, 20-55% of patients selected to undergo surgical lung biopsy for indeterminate lung nodules are found to have benign disease (4), and those that do not undergo immediate biopsy or surgery require sequential imaging studies resulting in continued radiation exposure.
Accordingly, efforts are in progress to develop complementary non-invasive diagnostics using techniques such as detection of methylated tumor DNA in sputum (9), serum proteomics (10-12), detection of auto-antibodies (13, 14), and gene expression profiling in sputum (15) and airway epithelial brushings (16). Although each of these approaches has its own merits, none has yet passed the exploratory stage. Biomarkers that could be identified from a simple blood test, a routine event associated with regular clinical office visits, would be ideal.
Given previous studies that have analyzed gene expression from peripheral blood mononuclear cells for cancer diagnosis or prognosis (17-21), the goals of this study were to determine whether we could identify a gene expression signature in PBMCs that would accurately distinguish patients with early-stage lung cancer from non-cancer controls with similar risk factors (i.e. matched for age, gender, race, and smoking history) and whether such a signature had value in predicting whether lung nodules detected by diagnostic X-ray or CT scans were malignant or benign.
Study participants (Supplementary Tables 1A-1B) for the initial training sets were recruited from the University of Pennsylvania Medical Center (Penn) during the period 2003 through 2007: 91 subjects with a history of tobacco use without lung cancer, including 41 subjects who had one non-calcified lung nodule diagnosed as benign after biopsy, and 137 patients with newly diagnosed, histopathologically confirmed, non-small-cell lung cancer. All participants had blood collected in conjunction with a clinical visit or just prior to surgery. None of the case subjects had received any cancer therapy prior to blood collection. Subjects with any prior history of cancer, except non-melanoma skin cancer, were excluded. Obstructive lung disease was defined as an FEV1/FVC < 70%. We recruited a total of 298 cases and controls from Penn. We excluded 10 NSCLC patients who were diagnosed with a second cancer, and arrays for 6 samples were removed as technical outliers (see Methods). The Penn samples were specifically recruited for this study. PBMC were purified at Penn and RNA was extracted at Wistar. The study was approved by the Penn Institutional Review Board. We also received 90 RNA samples processed at the New York University Medical Center (NYUMC); 27 had acceptable RNA quality based on gel electrophoresis and Bioanalyzer analysis, and only these 27 were further processed for array analysis. Samples from NYUMC were all collected under IRB approval and are listed in Supplementary Table 1C.
Blood samples from Penn were drawn in two “CPT” tubes (BD). PBMC were isolated within 90 minutes of blood draw, washed in PBS, transferred into RNAlater (Ambion) and then stored at 4 °C overnight before transfer to −80 °C. A subset of patient PBMCs was stained with anti-CD3, CD4, CD8, CD14, CD16, CD19, or CD56 antibodies or isotype controls (BD Biosciences) and analyzed by flow cytometry using FlowJo software. Samples collected at NYUMC were processed within 2 hours of collection; PBMC were transferred to Trizol (Invitrogen) and stored at −80 °C. Extracted RNA was transferred to the Wistar Institute for further processing.
RNA purification of the Penn samples was carried out at Wistar using TriReagent (Molecular Research), as recommended, and RNA quality was checked using the Bioanalyzer. Only samples with 28S/18S ratios >0.75 were used for further studies. A constant amount (400 ng) of total RNA was amplified, as recommended by Illumina. The NYU samples required DNase treatment before hybridization. Samples were processed as mixed batches of cases and controls and hybridized to the Illumina WG-6v2 human whole genome bead arrays (http://www.illumina.com/pages.ilmn?ID=197).
All arrays were processed in the Wistar Institute Genomics Facility. Arrays were checked for outliers by computing the gene-wise, between-array, median correlation for all the arrays and comparing it with the correlation for each array. An array was declared an outlier if the difference between its median correlation with other arrays and the overall between-array median correlation was greater than 8 median absolute deviations. Non-outlier arrays were quantile normalized and background was subtracted from expression values. Non-informative probes were removed if their intensity was low relative to background in the majority of samples or if the maximum ratio between any 2 samples was not at least 1.2. (See Supplementary Methods for details).
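The outlier rule described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, Pearson correlation is assumed, the "overall" median is taken here as the median of the per-array medians, and only arrays that correlate worse than the overall median are flagged.

```python
from statistics import median

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def flag_outlier_arrays(arrays, n_mads=8.0):
    """Return indices of arrays whose median correlation with the other
    arrays falls more than n_mads MADs below the overall median.

    arrays: list of per-array expression vectors in the same gene order.
    Assumption: the overall median is the median of per-array medians.
    """
    k = len(arrays)
    per_array = [
        median(pearson(arrays[i], arrays[j]) for j in range(k) if j != i)
        for i in range(k)
    ]
    overall = median(per_array)
    # Median absolute deviation of the per-array medians.
    mad = median(abs(m - overall) for m in per_array)
    return [i for i, m in enumerate(per_array) if overall - m > n_mads * mad]
```

Using the median and MAD rather than the mean and standard deviation keeps the threshold itself from being dragged toward the outliers it is meant to detect.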
Classification was performed using a Support Vector Machine with recursive feature elimination (SVM-RFE)(22) using random, tenfold, cross-validation repeated 10 times. Classification scores for each tested sample were recorded at each reduction step, down to a single gene. Average accuracy for each reduction step was calculated and all the genes at the points of maximal accuracy formed the initial discriminator, which then underwent additional reduction to form the final discriminator (see Supplementary Methods for details). Pathway analysis was carried out using Ingenuity Pathways Analysis software (http://www.ingenuity.com/). Significance of the changes in the SVM score before and after surgery was determined with a one-sided t-test.
Each of the genes in the signature from SVM analysis of the microarray data identified in the training set is assigned a coefficient that defines its importance in the classifier. In validating or testing the accuracy of the signature on new samples that are not identified by class association, the analysis is carried out essentially as follows: The signature is applied as an equation of the form:
$$X = aA + bB + cC + \dots + zZ$$
where A, B, C, etc. are the microarray expression levels of each of the signature genes, and a,b,c, etc. are the coefficients by which each expression level is multiplied to give a value for X (the classification score). The expression levels of the 29 genes [A, B, C...Z] determined by microarray for a new patient are each multiplied by the appropriate coefficient (a, b, c...z) to determine a classification score, “X.” If the threshold value of X is set to be zero, then patients with positive scores will be declared to have malignant disease and those with negative scores will be called non-malignant. The higher the positive score, the greater is the confidence of malignancy, and the more negative the score, the greater is the confidence of no malignancy (Supplementary Figure 2).
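As a minimal sketch of the scoring rule just described, the score is a weighted sum of expression levels that is compared with a threshold. The gene names and coefficient values in the usage lines are hypothetical placeholders, not the published signature, and treating a score of exactly zero as non-malignant is an assumption the text does not settle.

```python
def classification_score(expression, coefficients):
    """Compute X as the sum of coefficient * expression over signature genes.

    expression:   dict mapping gene name -> microarray expression level
    coefficients: dict mapping gene name -> classifier coefficient
    """
    missing = set(coefficients) - set(expression)
    if missing:
        raise KeyError(f"no expression value for: {sorted(missing)}")
    return sum(coefficients[g] * expression[g] for g in coefficients)

def call_sample(score, threshold=0.0):
    """Positive scores are called malignant, negative non-malignant.
    Assumption: a score equal to the threshold is called non-malignant."""
    return "malignant" if score > threshold else "non-malignant"

# Hypothetical two-gene example (the real signature has 29 genes).
coefficients = {"GENE_A": 0.5, "GENE_B": -1.0}
expression = {"GENE_A": 2.0, "GENE_B": 0.25}
score = classification_score(expression, coefficients)
print(score, call_sample(score))
```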
Clinical and demographic variables for 137 non-small-cell lung cancer (NSCLC) cases and 91 controls with non-malignant lung disease, including those with pathologically diagnosed benign nodules, collected at Penn are summarized in Table 1 and detailed in Supplementary Tables 1A and 1B. The case and control groups were similar in terms of age, race, gender, and smoking history. Fifty-five percent of the cancer patients were Stage 1, 13% were Stage 2, and 32% were Stages 3 and 4. Eighty-four percent of the control group and 93% of the NSCLC group were current or previous smokers. Samples used for independent validation included an additional 12 cases and 15 controls collected at the NYUMC and 26 additional cases and 2 controls collected at Penn (Supplementary Table 1C). These samples were not included in the studies to develop a general classifier.
Flow cytometry was performed on peripheral blood mononuclear cells (PBMC) from 35 cases and 14 controls collected at Penn. As shown in Supplementary Table 2, there were no significant differences in the percentages of T-cells, CD4 cells, B-cells, monocytes, or NK cells. The tumor group had a slightly lower percentage of CD8 cells (18.9%) than the controls (24.5%), which did reach significance (p=0.03).
We compared gene expression profiles in PBMC samples from the 137 NSCLC cases to 91 controls with non-malignant lung disease. We applied a support vector machine with recursive feature elimination (SVM-RFE) and tenfold cross-validation (22) to the data to find the minimal number of genes that could most accurately distinguish the case and control groups by their PBMC gene expression (see Supplementary Methods and Supplementary Figure 1). We identified a 29-gene signature that distinguished the cases from controls with an overall classification accuracy of 86%, a sensitivity of 91%, and a specificity of 80%. The distribution of SVM scores, which measure how well a particular sample is classified, is shown in Figure 1A for each NSCLC patient and in Figure 1B for each control. The numerical classification score of each sample, together with its clinical annotation, is listed in Supplementary Table 3. The 29 genes used for classification are listed in Table 2 ordered by their SVM score, which is a measure of each gene's contribution to the classifier.
Although an SVM score of 0 achieved the greatest degree of accuracy in separating case and control classes, additional clinical utility can be derived from this data by taking advantage of the value of the assigned SVM predictive score in the class assignments. For example, individuals with an SVM score of less than −0.65 are classified as controls with 100% specificity. Similarly, an SVM threshold of +0.65 or above would eliminate 12 of 17 false positives and could identify a lung cancer case with 95% sensitivity. The scores have confidence levels which are proportionate to the score itself as shown in Supplementary Figure 2. The ROC curve (Figure 1C) demonstrates the full spectrum of performance characteristics for various cutoffs of the SVM scores. The overall area under the curve (AUC) achieved by the classifier was 0.92.
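The trade-off between cutoffs can be illustrated with a short sketch that computes sensitivity and specificity at a chosen SVM-score threshold, along with the area under the ROC curve. The function names and the scores in the usage lines are hypothetical; the AUC is computed here through its rank interpretation (the probability that a randomly chosen case scores higher than a randomly chosen control), which may differ from the method used in the study.

```python
def confusion_at_threshold(scores, labels, threshold=0.0):
    """Count TP, FP, TN, FN when scores above the threshold are called cancer.
    labels: 1 for cancer cases, 0 for controls."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_cancer = score > threshold
        if label and predicted_cancer:
            tp += 1
        elif label:
            fn += 1
        elif predicted_cancer:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def sensitivity_specificity(scores, labels, threshold=0.0):
    tp, fp, tn, fn = confusion_at_threshold(scores, labels, threshold)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(scores, labels):
    """AUC as the fraction of case/control pairs ranked correctly (ties count half)."""
    cases = [s for s, y in zip(scores, labels) if y]
    controls = [s for s, y in zip(scores, labels) if not y]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases for n in controls
    )
    return wins / (len(cases) * len(controls))

# Hypothetical scores: four cases followed by three controls.
scores = [1.2, 0.7, 0.1, -0.3, 0.4, -0.8, -1.1]
labels = [1, 1, 1, 1, 0, 0, 0]
print(sensitivity_specificity(scores, labels, threshold=0.0))
print(roc_auc(scores, labels))
```

Raising the threshold trades sensitivity for specificity, which is the behavior the ROC curve summarizes across all cutoffs.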
To address the issue of data over-fitting and to test the generality of the classification model, we also performed the analysis using only 80% of the samples for training and set aside 20% of the samples for validation. We repeated that process for 5, non-overlapping, 20% set-asides. Similar average accuracies were found over the 5 training sets (81.8%) and the 5 validation sets (81.1%) (Supplementary Table 4) demonstrating the ability of the algorithm to classify new samples with the predicted accuracy. The overall accuracy is slightly reduced when using the smaller training sets (81% vs. 86%). The average accuracy of the analysis with randomly permuted sample labels was 58% across 10 permutation runs.
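The set-aside scheme and the label-permutation control described above might be organized as in the sketch below. The helper names and seeds are hypothetical, and the split shown is a simple unstratified shuffle; the source does not say whether cases and controls were balanced within each set-aside.

```python
import random

def set_aside_splits(n_samples, n_splits=5, seed=0):
    """Yield (training, held_out) index lists for non-overlapping set-asides.
    With n_splits=5, each held-out set is about 20% of the samples."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_splits] for i in range(n_splits)]
    for held_out in folds:
        held_set = set(held_out)
        training = [i for i in indices if i not in held_set]
        yield training, held_out

def permuted_labels(labels, seed=0):
    """Return a shuffled copy of the class labels. Retraining on shuffled
    labels estimates the accuracy expected when no real signal is present."""
    shuffled = list(labels)
    random.Random(seed).shuffle(shuffled)
    return shuffled

# 137 cases and 91 controls, as in the training set.
labels = [1] * 137 + [0] * 91
for training, held_out in set_aside_splits(len(labels)):
    print(len(training), len(held_out))
```

Each sample is held out exactly once, so the five validation accuracies are computed on disjoint sets of samples that the corresponding classifier never saw during training.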
We also determined the accuracy of the NSCLC classifier on histological subtypes and clinical tumor stages (Supplementary Table 6). The sensitivity for adenocarcinoma (AC) samples was 86%, while the squamous cell carcinomas (LSCC) were classified significantly better with 98% sensitivity (p=0.04, chi-squared test). We also determined whether classification sensitivity varied with increasing pathological stages. As shown in Supplementary Table 6, we find a significant increase in sensitivity from Stage 1A (83%) to Stages 3 and 4 (100%) (p=0.005, chi-squared test), suggesting the PBMC cancer signature becomes more pronounced with disease burden.
The accuracy of the NSCLC classifier varied slightly based on the smoking status of the participants (although there are a limited number of non-smokers in the study population). The overall accuracy was 79%, 87%, and 88% for current, former, and never smokers, respectively (non-significant difference, p=0.28 by Fisher exact test). The accuracy data based on smoking status and case/control status are shown in Supplementary Table 7.
The NSCLC signature was generated with controls from two different at-risk populations. About half (50) were “high risk” based on underlying lung disease and smoking history, while an additional 41 had been further diagnosed by CT or chest X-ray with lung nodules and were to undergo surgical evaluation. When we calculated classification accuracy for the two control populations separately, the NSCLC classifier had a specificity of 89% if only the “high risk” controls without lung nodules are considered, whereas the specificity was 71% for the controls with confirmed benign nodules. Although the difference in specificity appears to be large for these 2 control groups, it does not quite reach statistical significance (p=0.051, Fisher exact test), limited in part by sample numbers. However, we further explored this difference in accuracy by analyzing patients with confirmed benign nodules separately. We were able to obtain a 24-gene nodule classifier by cross-validation (Supplementary Table 5) using only the 41 benign nodule samples as the control group and data from a randomly selected group of 54 NSCLC case samples. This classifier had a somewhat better apparent specificity of 80% as determined by SVM, but the difference in accuracy between the NSCLC and nodule classifiers did not reach significance (p=0.44, Fisher exact test). Because of its higher accuracy and potentially broader applicability, the following analyses were carried out with the 29-gene NSCLC classifier.
Although we had used cross-validation to establish our NSCLC classifier, to further validate the utility of the classifier for analyzing new samples we assessed the classification accuracy using samples not included in the 29-gene selection process. The validation set included 38 NSCLC samples and 17 controls. Twenty-seven of the validation samples (Supplementary Table 1C) were collected at the NYU Lung Cancer Biomarker Center, an Early Detection Research Network (EDRN) Clinical and Epidemiologic Validation Center. The dataset included 12 Stage 1 NSCLC patients (5 of whom were never smokers) and 15 smoker and ex-smoker controls. Six of the controls were diagnosed by serial CT scans as having non-malignant Ground Glass Opacities (GGO) (23). No GGO patient samples were included in our original training set. The RNA for these samples was prepared at NYU. An additional 26 patient samples and 2 control samples were collected at Penn and had not been analyzed previously. The NSCLC classification algorithm was applied to these samples with no knowledge of whether a sample was a case or control (see Methods). The classification for the validation set is shown in Figure 2 and in more detail in Supplementary Table 8. The overall accuracy for the validation set was 78%, with 76% sensitivity and 82% specificity. This small decrease in accuracy and sensitivity (although with an increase in specificity) was not unexpected since the NYU samples were not specifically collected for these studies and, as a result, the sample collection and RNA purification were not standardized for these samples.
Eighteen of the NSCLC patients in the validation set shown in Figure 2 also had post-resection blood samples that were collected 2-5 months after surgery (Supplementary Table 9). To assess how the removal of the tumor affected the NSCLC SVM score we had determined for the pre-surgery samples, we also determined the scores for the post-resection samples from each pair (Figure 3). Of the 14 patients who were classified as cancer in the validation set (i.e. had positive SVM scores), 13 (93%) showed a decrease in their SVM scores in the post-resection samples. Five of these post-surgery samples (4, 5, 6, 10, and 13) had clearly negative SVM scores and would be classified as non-cancer samples in the analysis. Of the 4 misclassified pre-surgery patients, 1 showed a highly decreased score and 3 showed increases in their scores. Although the time intervals between the first and second samples ranged between 2 and 5 months (Supplementary Table 9), there was no obvious relationship between the change in the scores and the time to post-resection sample collection. In the large majority of the patients (14 out of 18), tumor removal was associated with a decrease in the cancer signature score.
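A pre- versus post-surgery comparison of this kind can be tested with a one-sided t-test on the score differences. The sketch below assumes a paired design, which the Methods do not state explicitly, and computes only the test statistic and degrees of freedom; converting these to a p-value requires the t distribution from a statistics package. The function name and the scores in the usage lines are hypothetical.

```python
from statistics import mean, stdev

def paired_t_statistic(pre_scores, post_scores):
    """Return (t, degrees_of_freedom) for paired pre/post SVM scores.

    Differences are pre minus post, so a large positive t supports the
    one-sided hypothesis that scores decrease after surgery.
    Requires at least two pairs with non-identical differences.
    """
    if len(pre_scores) != len(post_scores):
        raise ValueError("pre and post scores must be paired")
    diffs = [pre - post for pre, post in zip(pre_scores, post_scores)]
    n = len(diffs)
    standard_error = stdev(diffs) / n ** 0.5
    return mean(diffs) / standard_error, n - 1

# Hypothetical scores for three patients, not data from the study.
t_stat, dof = paired_t_statistic([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
print(t_stat, dof)
```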
Although 29 genes were sufficient to distinguish cancer and control classes, many more genes were significantly differentially expressed, providing some indication of the nature of the changes we are detecting. We used Ingenuity Core Analysis to determine the functions significantly and preferentially represented after correction for multiple testing in the top 1,000 significant genes from the NSCLC vs. NHC and NSCLC vs. benign nodule comparisons (from a total of 2386 and 3276 differentially expressed genes respectively, p<0.05 by t-test). We did both analyses to further assess the similarities and differences between the genes identified in the 2 comparisons. Details are in Supplementary Methods. A list of statistically significantly enriched pathways is shown in Figure 4. As expected, pathways associated with specific immune functions are well represented and highly significant, including pathways for CD28 and T-cell receptor signaling, calcium-induced T-cell apoptosis, and macrophage and monocyte phagocytosis. The top 5 pathways by p value in the NSCLC vs. NHC comparison are also found to be significant for the NSCLC vs. benign nodule comparison and rank among the top 6 pathways for that analysis. There were, in addition, 3 significantly enriched pathways that were unique to the latter comparison: SAPK/JNK Signaling, p38 MAPK Signaling, and Lymphotoxin β Receptor Signaling.
In addition to identifying significant canonical pathways, we looked at genes associated with functional categories. We focused on those functional categories associated with the innate and humoral immune response, in particular, those functions associated with inflammation and infection. The overlap of genes associated with these 2 processes is significant. Under the functional categories of cell-mediated and humoral immunity, we found that 13/13 (p=9.2E-06) differentially expressed anti-pathogen response genes and 8/9 genes (p=5.04E-04) associated with the generation of reactive oxygen species, an end product of Toll-like receptor (TLR) activation, are downregulated in the NSCLCs as compared to controls with benign nodules. In parallel we found that 7/7 antibacterial response genes are downregulated in the NSCLCs compared to all NHC (p=4.15E-02). Five genes are common to the 2 comparisons, including Toll-like receptor 5 (TLR5), the surface receptor for bacterial flagellin. TLRs 1, 7, and 8 are down in NSCLCs compared to either control class. We also find that genes associated with activation of the NF-κB pathway, through which the TLR signals are transmitted (24), are down, while pathway inhibitory genes like IκB are up in NSCLC PBMC. Recently an important role for Toll-like receptor functions in respiratory diseases has emerged, in particular for COPD, a condition affecting the majority of both our case and control subjects (24-26), suggesting that innate response pathways are suppressed in our cancer samples despite the presence of the activating condition of COPD.
We previously suggested that chemokines and cytokines released by malignant cells could impose a tumor-specific signature on normal immune cells of patients with non-hematopoietic cancers (27). Gene expression profiles from PBMC that identify blood signatures associated with a variety of cancers, including metastatic melanoma (18), breast (20), renal (17, 21), and bladder cancers (19) have now been reported. However, most of these studies have focused on later-stage cancers or response to therapy and used healthy control groups for comparison. We now have identified gene expression signatures in PBMC that can distinguish patients with early-stage NSCLC from appropriate at-risk controls with non-malignant lung diseases common to both patient and control classes.
The observed classification is not likely to be influenced by circulating tumor cells since 1) our classifiers do not contain genes characteristic of lung tumors such as SFTPB (28) or lung-specific keratins (29); and 2) any tumor cells would be diluted to an extraordinary degree by the PBMC without efforts to enrich for such cells. This classifier appears not to be smoking dependent. Lung cancer in individuals who have never smoked has been shown to have several important differences from tobacco-associated lung tumors, and some molecular changes have been suggested to be unique to non-smokers (30, 31). There were 14 NSCLC patients in our study who had no prior history of smoking. Despite these reported differences, 11 of the 14 “never” smokers in our dataset were correctly classified as cancer by our NSCLC panel.
The mechanism(s) for the effect we have detected remains to be determined. Interactions between the tumor and immune cells could be direct or mediated by cytokines or other tumor-released factors. The effects are enhanced with tumor progression, as evidenced by the increased accuracy of our gene panel in classifying late-stage NSCLC. Our ability to build a classifier from peripheral immune cells is consistent with recent findings from both mouse models and studies of immune suppression by tumors in humans. For example, Redente et al (32) showed, in a mouse lung-cancer model, that soluble factors produced in lung pre-malignant lesions influenced expression of specific macrophage activation markers in bone marrow macrophages and that the effect on gene expression was enhanced with tumor progression. The ability of tumors to induce myeloid-derived suppressor cells in lymph nodes, spleens, and peripheral blood in mouse models is now well established (33-35). The observation that tumor-resection results in disappearance of these myeloid-derived suppressor cells (36) supports our observations that the PBMC tumor signature diminished after tumor removal in the majority of the patients we examined. Similar tumor-induced suppressor cells in the PBMC fraction of blood also have been identified in human cancer patients (37, 38). Evidence from recent studies, comparing gene expression in PBMC and tumor-infiltrating lymphocytes from patients with either liver cirrhosis alone or in conjunction with liver cancer, suggests that the tumor presence can be communicated to the peripheral immune system and that the signal can be detected in the PBMC gene expression patterns (39). These observations support our finding that the NSCLC signature detected in PBMC diminishes in a majority of post-surgery patients.
The 5 pathways most significantly represented among the top 1,000 differentially expressed genes between cases and controls were significant for both the comparison of NSCLC and all controls and for the comparison of NSCLC and nodule controls. There is significant, but not complete overlap in the genes associated with these 5 pathways for the 2 comparisons. For 3 of the pathways (1, 2 and 5) <50% of the genes are common to both comparisons. Clearly there are significant similarities as well as some differences in the 2 comparisons we have carried out to identify our NSCLC general classifier. Recent studies have suggested that while diagnostic genes detected in various pathways may vary, the pathways themselves are better classifiers (40, 41).
We also identified some interesting differences between cases and controls in relation to immune response functional categories. The reduction in TLR expression in NSCLC was somewhat surprising, as a high proportion of our patients and controls have COPD, which would normally be expected to activate TLR pathways (25). TLR function has been studied primarily in response to pathogens, but a more expansive role in immune regulation has been emerging for recognition of self-antigens associated with auto-immunity (42-45). In addition, endogenous ligands for Toll-like receptors have been identified, including MUC1, a tumor-expressed antigen that has been shown to be a negative regulator of TLR signaling (46), and heat shock proteins (47-51).
Our study follows the paradigm for biomarker development described by Pepe et al. (52) and adopted by the NCI Early Detection Research Network (EDRN). This paradigm first outlines the use of cross-sectional studies of patients with cancer versus appropriately chosen controls without disease to document initial estimates of sensitivity and specificity. Biomarkers meeting appropriate thresholds are then to be tested in external populations and finally in prospective studies. Following this model, our first analysis showed that a 29-gene panel could differentiate between a lung cancer population and an appropriate at-risk control population. Additional validation studies were then carried out on an external, independent dataset. Plans for prospective studies are in progress.
Although the NSCLC signature could be developed as a screening tool for high-risk patients, the initial clinical use of our biomarkers is more likely to provide additional data to a clinician trying to evaluate a pulmonary nodule diagnosed by CT scan or chest X-ray. Based on prevalence data from a large CT screening study (3), the 29-gene NSCLC classifier has a PPV of 0.06 and an NPV of 1.0 (Supplementary Table 10). This is comparable to the PPV and NPV values calculated using the same prevalence values for the 80-gene classifier derived from lung epithelial cells obtained from bronchial brushing recently described by Spira et al (16).
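Predictive values follow from sensitivity, specificity, and disease prevalence through Bayes' rule, which explains how a classifier with good sensitivity can still have a low PPV in a screening population. In the sketch below, the 1.3% prevalence is an assumed value chosen for illustration only; it is not the figure used in Supplementary Table 10, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population with the
    given disease prevalence, using expected fractions of the population."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Training-set sensitivity and specificity with an ASSUMED 1.3% prevalence.
ppv, npv = predictive_values(sensitivity=0.91, specificity=0.80, prevalence=0.013)
print(round(ppv, 2), round(npv, 2))
```

At low prevalence the false positives from the large disease-free group outnumber the true positives, so the PPV stays small even when the NPV is close to 1.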
Since a higher SVM score increases the likelihood that a sample comes from a cancer patient, the specific SVM value may be useful for clinical decision making in patients with suspected lung cancer or a non-calcified nodule and thus could help determine which patients require immediate interventions such as biopsy or surgical resection. This could potentially decrease the number of patients with benign lung nodules who would otherwise undergo biopsy or surgery (i.e. false positives).
Our results represent an encouraging first step, but several tasks remain to be addressed. Additional external validation sets are required to establish a standard collection protocol and to confirm the gene signatures and their accuracy. A larger prospective cohort study in patients with lung nodules is needed to more fully determine the role of smoking or other potentially confounding effects or diseases and to evaluate the overall clinical feasibility and utility of this approach. In addition, the observed reduction of the NSCLC cancer signature in the post-surgery samples suggests the possibility that post-surgery gene expression profiles might contain information predictive of recurrence. Ongoing follow-up studies are being conducted to determine the applicability of our approach to recurrence and response to therapy.
In summary, we have found gene expression signatures in PBMC that can distinguish individuals with early-stage NSCLC from individuals with non-malignant lung disease. The changes in PBMC gene expression with tumor removal suggest some specific functional effects of the tumor on the immune system that can be detected in the gene expression profiles. Although we have only examined NSCLC in this study, other types of lung cancer also may be detectible by gene expression in the peripheral immune cells.
*Gene expression data are available in the Gene Expression Omnibus (GEO). The index code is GE12355.
We thank WenHwai Horng, Linda Alila, and Shere Billouin for technical assistance and support from the Genomics and Bioinformatics Cores. This project was supported by PA DOH Tobacco Settlement grants SAP 4100020718 and 4100038714, the PA DOH Commonwealth Universal Research Enhancement Program, EDRN Set-Aside funds, and Wistar Cancer Center Support Grant P30 CA010815. A.V. was supported by NCI K07 CA111952. There were no conflicts of interest.