Lung cancer is the leading cause of cancer death in the US and the world. The high mortality rate results, in part, from the lack of effective tools for early detection and the inability to identify subsets of patients who would benefit from adjuvant chemotherapy or targeted therapies. The development of high-throughput genome-wide technologies for measuring gene expression, such as microarrays, has the potential to impact the mortality rate of lung cancer patients by improving diagnosis, prognosis, and treatment. This review will highlight recent studies using high-throughput gene expression technologies that have led to clinically relevant insights into lung cancer. The hope is that diagnostic and prognostic biomarkers that have been developed as part of this work will soon be ready for widespread clinical application and will have a dramatic impact on the evaluation of patients with suspected lung cancer, leading to effective personalized treatment regimens.
Lung cancer is the leading cause of cancer death in the US and the world. The high mortality rate (80–85% within 5 years) results from the lack of effective screening tools and tools for early-stage diagnosis (1–3), the inability to identify subsets of patients who would benefit from adjuvant chemotherapy or adjuvant targeted therapies, and the slow development of new drug therapies. The development of high-throughput genome-wide technologies for measuring gene expression, such as microarrays, has the potential to impact the mortality rate of lung cancer patients by improving diagnosis, prognosis, and treatment.
The use of high-throughput technologies in breast cancer illustrates the potential impact that similar approaches may have in thoracic oncology. DNA microarrays have been used to identify gene expression signatures composed of multiple genes that indicate which estrogen receptor-positive and axillary node-negative patients may benefit from additional chemotherapy. There are currently three commercially available gene expression-based prognostic tests for breast cancer: Oncotype DX, a 21-gene assay (4) (Genomic Health, Redwood City, CA), MammaPrint, a 70-gene assay (5) (Agendia BV, Amsterdam, the Netherlands), and H/I, a 2-gene ratio assay (6) (AvariaDx, Carlsbad, California). Currently, there are two ongoing prospective randomized trials: Trial Assigning Individual Options for Treatment (TAILORx), to evaluate Oncotype DX, and Microarray in Node-negative Disease may Avoid ChemoTherapy (MINDACT), to evaluate MammaPrint versus a prognostic clinical algorithm (7). If these tests prove efficacious, they will be the first of many prognostic and diagnostic tests based on high-throughput gene expression measurements. The advantage of multi-gene biomarkers is that they are able to achieve higher accuracy than would be possible from a single gene measure (Figure 1).
This review will highlight recent studies using high-throughput gene expression technologies that have led to clinically relevant insights into lung cancer. The studies present molecular markers for the diagnosis of lung cancer, the prognosis of early-stage lung cancer, and sensitivity and response to chemotherapeutic agents (Table 1). Due to the clinical focus of this review, mechanistic insights into lung cancer biology and pathogenesis using high-throughput gene expression technologies (examples include (8–10)) as well as the technical, computational and analytic challenges inherent in processing and analyzing high-throughput data will not be discussed. Also, the application of the high-throughput technologies to study SNPs, DNA methylation, alternative splicing, and protein expression in lung cancer is discussed elsewhere (11–14).
The success of the Human Genome Project coupled with a variety of technological advances such as rapid oligonucleotide synthesis and microarray chip fabrication has enabled the development of high-throughput gene expression technologies. Depending on the experimental design, RNA samples are obtained from cell cultures or surgical tissues. Prior to RNA isolation and processing, techniques such as laser capture microdissection (15) can be used to obtain a homogeneous population of cells from tissue specimens.
Microarrays are currently among the most commonly used technologies for quantitatively measuring the expression of genes or miRNAs in a high-throughput manner. Microarrays are orderly arrays of spots composed of oligonucleotides complementary to genes/miRNAs that are immobilized onto a solid support such as a glass slide (16;17). Microarrays take advantage of Watson-Crick base pairing, and therefore, only complementary nucleic acids will hybridize and produce a signal that can be used as a measure of expression. The production and use of microarrays require several steps, including the synthesis of probes, array fabrication, target hybridization, fluorescence scanning, and image processing to produce a numerical readout of expression. Complementary DNA (cDNA) microarrays, developed at Stanford University, use DNA clones (selected from sequence databases) between 500 and 5000 base pairs in length as probes. Oligonucleotide microarrays, as their name implies, use as probes short oligonucleotides that have been derived from gene or miRNA sequences. In addition to microarrays, other high-throughput sequence-based technologies to measure gene expression, such as serial analysis of gene expression (SAGE) (18), have been used in the past, and technological advances in sequencing are leading to new massively parallel sequencing technologies (19–21) that will likely be used extensively in future research.
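The base-pairing principle behind probe design can be illustrated with a short sketch. It is a toy model only: the sequences are hypothetical, and the function `hybridizes` assumes perfect complementarity, whereas real hybridization tolerates some mismatches and depends on temperature and salt conditions.

```python
# Toy illustration of Watson-Crick pairing between an array probe and a target.
# Assumption: a probe "hybridizes" only if it is perfectly complementary to a
# stretch of the target; real hybridization is more permissive than this.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def hybridizes(probe: str, target: str) -> bool:
    """True if the probe is perfectly complementary to some part of the target."""
    return reverse_complement(probe) in target.upper()


# Hypothetical sequences, chosen only for illustration.
target = "ATGGCCATTGTAATGGGCCGC"
matching_probe = reverse_complement("GCCATTGTA")  # designed against the target
unrelated_probe = "AAAAAAAAA"

print(hybridizes(matching_probe, target))
print(hybridizes(unrelated_probe, target))
```

Only the probe designed against the target sequence reports a match, which is the property that lets each spot on the array serve as a readout for one gene or miRNA.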
The risk for developing lung cancer increases with cumulative exposure to cigarette smoke. The incidence of lung cancer, however, even in a high-risk population of smokers is only ~15% over a lifetime (22). Currently, there are no effective diagnostic biomarkers to identify which current and former smokers are at the greatest risk for developing lung cancer. As a result of this failure to detect high-risk smokers and the low frequency of early-stage detection, the five-year survival rates for lung cancer (~15%) have not changed appreciably over the past 4–5 decades. Previous screening trials with frequent chest x-rays and sputum cytology have not demonstrated an effect on lung cancer mortality (reviewed by Jett and Midthun (23)). Spiral computerized tomography (CT) scan screening can detect lung tumors at an earlier stage than routine chest x-rays. However, while spiral CT can be highly sensitive, it is also non-specific, and many newly detected small lesions have proven on resection to be non-malignant scar tissue or old granulomas rather than early lung cancers (2). While final results from large-scale randomized trials using CT scans are still pending, recent work has suggested that this approach does not improve lung cancer mortality (24).
Developing biomarkers that are highly sensitive, specific, and identify smokers at high risk for developing lung cancer or individuals with early stage cancer represents a key approach to improving lung cancer mortality. In order to explore the mechanisms by which individuals respond to the carcinogenic effects of smoking, several groups have used DNA microarrays and SAGE to define the genome-wide impact of smoking and smoking cessation on cytologically normal bronchial airway epithelial cells (25–31) or peripheral blood lymphocytes (32;33) of never, former, and current smokers.
The results of the above studies suggest that it might be possible to detect in which smokers the carcinogenic effects of cigarette smoke have resulted in lung cancer. A recent study by Spira et al. used DNA microarrays to profile the gene expression patterns of cytologically normal large airway epithelial cells in current and former smokers undergoing bronchoscopy for the clinical suspicion of lung cancer (34). An 80-probeset lung cancer-specific biomarker that could distinguish between smokers with and without lung cancer was developed based on a training set of samples (n=77). The biomarker was both sensitive and specific when tested on an independent test set (n=52) and on an additional prospectively collected set of samples (n=35). This biomarker was also shown to provide information about the likelihood of lung cancer that is independent of clinical risk factors for lung cancer among patients with non-diagnostic bronchoscopies (35). By increasing the diagnostic sensitivity of bronchoscopy, biomarkers such as the one described above have the potential to expedite more invasive testing and definitive therapy for smokers with lung cancer, and reduce invasive diagnostic procedures for individuals without lung cancer. In addition, if future studies demonstrate that smoking-induced cancer-specific alterations in gene expression precede the development of lung cancer, biomarkers may be useful for identifying high-risk lung cancer patients.
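For readers less familiar with how such a biomarker is scored on a test set, the following minimal sketch computes sensitivity and specificity from true and predicted labels. The labels are invented for illustration and do not reproduce any result from the studies cited above.

```python
def sensitivity_specificity(truth, predicted):
    """Compute (sensitivity, specificity) for binary labels (1 = cancer, 0 = no cancer)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cancers called cancer
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cancers missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-cancers called negative
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical test-set labels (not data from any cited study).
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Sensitivity is the fraction of true cancers the biomarker detects, and specificity is the fraction of cancer-free subjects it correctly clears; a clinically useful diagnostic needs both to be high on samples that were not used to build it.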
Differences in treatment between NSCLC and SCLC make the distinction between these two types of lung cancer important. Within NSCLC, there are potential differences in terms of prognosis and response to newer targeted therapies (36). Accurate molecular classification, therefore, has the potential to identify different molecular subtypes of NSCLC currently not recognized by pathologists that would benefit from subtype-specific therapies. In addition, molecular classification of tumors may augment surgical-pathological staging at surgery, allowing the most appropriate treatment for a given stage of tumor to be used.
One of the initial applications of high-throughput gene expression technology in the area of lung cancer was to explore whether or not differences in gene expression could be identified between the different histological subtypes of lung tumors. Two studies in November 2001 began to explore this question using microarray technology and diverse sets of lung tumor samples. The broad goals were to identify gene expression profiles associated with the histological subtypes of lung tumors, identify subclasses of AD where there is frequent disagreement among pathologists, associate gene expression profiles with tumor features such as surgical-pathological stage as well as survival after resection, and identify metastases of non-lung origin.
Garber et al. (37) profiled the gene expression of 67 lung tumors with 5 years of clinical follow-up from 56 patients as well as 5 normal lung samples and 1 fetal lung sample using 24,000-element cDNA microarrays. Hierarchical clustering of samples according to the expression of the most variable genes revealed patterns of gene expression that corresponded to the major morphological classes of lung tumors: AD (n=41), SCC (n=16), LCC (n=5), and SCLC (n=5). The AD tumors were the most heterogeneous and formed 3 distinct clusters. There were differences in survival between the 3 groups, and this was in part associated with tumor grade and lymph node metastases.
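The general idea of clustering samples on their most variable genes can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Garber et al.: the distance (one minus Pearson correlation), the average linkage, the gene names, and the expression values are all hypothetical choices made for the example.

```python
from statistics import mean, pvariance


def most_variable_genes(expr, k):
    """expr maps gene name -> list of expression values, one per sample."""
    return sorted(expr, key=lambda g: pvariance(expr[g]), reverse=True)[:k]


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def cluster_samples(expr, n_samples, k_genes, n_clusters):
    """Average-linkage agglomerative clustering of samples (assumed settings)."""
    genes = most_variable_genes(expr, k_genes)
    profiles = [[expr[g][s] for g in genes] for s in range(n_samples)]
    dist = [[1 - pearson(profiles[i], profiles[j]) for j in range(n_samples)]
            for i in range(n_samples)]
    clusters = [[s] for s in range(n_samples)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                d = mean(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical expression matrix: 4 genes measured in 4 samples.
expr = {
    "gene_a": [9.0, 8.0, 1.0, 2.0],
    "gene_b": [1.0, 2.0, 9.0, 8.0],
    "gene_c": [8.0, 9.0, 2.0, 1.0],
    "gene_d": [5.0, 5.1, 5.0, 4.9],  # nearly constant, so filtered out
}
print(cluster_samples(expr, n_samples=4, k_genes=3, n_clusters=2))
```

Filtering to the most variable genes before clustering removes genes such as the hypothetical `gene_d` that carry little information about differences between samples, so the resulting groups reflect the dominant expression patterns.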
In a larger study, Bhattacharjee et al. (38) used Affymetrix U95 microarrays containing 12,600 transcripts to profile gene expression levels of 17 normal lung samples and 186 lung tumors that included 127 ADs, 21 SCC, 20 carcinoids, 6 SCLC, and 12 AD tumors suspected to be of non-lung origin. Using a similar methodology to Garber et al., hierarchical clustering segregated samples based on histological subtype and identified molecular markers associated with each subtype. Both studies found, for example, that keratin genes were highly expressed by SCC and genes associated with neuroendocrine differentiation were highly expressed in SCLC. Bhattacharjee et al. examined just the ADs using hierarchical and probabilistic model-based clustering and identified 6 distinct groups. A supervised approach was subsequently used to identify genes strongly associated with each of the 6 clusters. One cluster contained normal lung tissue, another cluster contained tumors suspected to be colon, breast, or liver metastases, and the remaining 4 clusters segregated the ADs based on markers of cell division, proliferation, neuroendocrine origin, and type II alveolar pneumocytes. The clusters were also associated with extent of tumor differentiation, presence of BAC, and patient outcome even when limited to stage I tumors.
Both the Garber et al. and Bhattacharjee et al. studies demonstrated that gene expression patterns could distinguish between the histological subtypes of lung cancer and found that ADs had the greatest heterogeneity. In addition, both studies demonstrated an association between the AD clusters and prognosis. The studies lacked independent test sets to confirm the molecular classifications; however, a study by Hayes et al. demonstrated that the tumor subtypes of AD were reproducible across the two datasets plus an additional dataset (39). The AD specimens also contained a mixture of subtypes that included BACs with known favorable prognoses, making it difficult to distinguish between genes related to prognosis or subtype, and the Bhattacharjee study lacked clinical data to confirm metastases from extrapulmonary tumors. Despite these shortcomings, the studies served as a foundation for future lung cancer gene expression studies.
Several smaller studies followed exploring similar questions using different analysis techniques, sample sets, and technologies such as SAGE (40–43). Other studies performed real-time PCR and immunohistochemistry to validate gene expression differences between lung tumor subtypes (44), distinguish between primary and metastatic SCC of the lung (45), and explore differences between lung tumors and lung cancer cell lines (46). In addition, other studies have identified molecular markers for pulmonary neuroendocrine tumors using DNA microarrays and linked a subset of these markers to prognosis (47;48). Together, these studies identified molecular markers for known histological subtypes of lung cancer and suggested refinements to the pathological classification of tumors. Molecular classification of lung tumors may eventually improve prognosis if newly identified subtypes respond differently to current treatment regimens or if they suggest new subtype-specific drug targets.
In addition to molecular classification of tumors, high-throughput gene expression technologies have been used to characterize tumor stage. A study by Ramaswamy et al. (49) identified a gene expression signature of metastasis that could distinguish between metastatic and primary ADs from multiple tumor types. Stage I and II lung ADs from the Bhattacharjee et al. dataset separated into two groups with significant differences in survival according to the expression of the metastatic gene signature. When the signature was applied to other tumor datasets, tumors expressing the metastatic gene signature consistently had a poor outcome, suggesting that metastatic potential may be encoded in the primary tumor.
Several studies using the primary lung tumor to predict lymph node metastases were subsequently published. Kikuchi et al. (50) and Inamura et al. (51) identified genes associated with lymph node metastasis among primary lung ADs, and Hoang et al. (52) identified genes associated with non-metastatic tumors, those with micrometastases, and those with overt metastasis. Xi et al. (53) used the Bhattacharjee et al. (see above) and the Beer et al. (54) (see Prognosis section below) datasets to examine whether gene expression in primary AD tumors was indicative of lymph node metastases. A 318-gene signature was able to accurately classify node positive patients in the training (Beer et al.) and test (Bhattacharjee et al.) sets, but frequently misclassified node negative patients. The classification as node negative or positive in the node negative patients was associated with survival. These studies suggest that the survival differences observed among stage I ADs in the Garber et al. and Bhattacharjee et al. datasets might be related to the presence of micrometastases or metastatic potential. The use of gene expression for “molecular staging” may enhance the sensitivity of clinical and pathologic methods for staging tumors, improving treatment decisions and ultimately outcomes for lung cancer patients.
miRNAs are short sequences of RNA about 22 nucleotides long that regulate gene expression by hybridizing to complementary sequences of target mRNA. The binding of miRNAs to mRNAs can result in degradation of the mRNA or repression of mRNA translation into proteins. Recently, expression profiling of miRNAs has contributed to our knowledge of how these short sequences are involved in cancer biology. Yanaihara et al. (55) focused on exploring miRNA expression in normal and cancerous lung tissue. DNA microarrays capable of measuring 352 miRNAs were used to identify 43 miRNAs that were differentially expressed between 104 pairs of normal and lung tumor tissue and 6 miRNAs differentially expressed between AD and SCC.
Thirty to 35% of stage I NSCLC patients relapse following tumor resection (56;57). Clinical trials have indicated a potential survival advantage for early-stage lung cancer patients who receive adjuvant chemotherapy (58). However, it would be useful to identify the subset of these patients who are at low risk for relapse to spare them the side effects of unnecessary treatment. Gene expression profiles have the potential to augment current prognostic indicators such as clinicopathological stage, K-ras and p53 mutations, poor differentiation, and high tumor proliferative index.
The Garber et al. and Bhattacharjee et al. studies found correlations between molecular subgroups of lung AD and prognosis. These findings set the stage for the publication of several studies that used supervised approaches to identify genes associated with prognosis among early-stage ADs. The supervised approaches first stratify patients by known outcome, identify genes associated with these outcomes in a set of training samples, and use these genes and an algorithm to predict the outcome of additional test set samples. In 2002, Beer et al. (54) used DNA microarrays to measure gene expression levels in 67 stage I ADs, 19 stage III ADs, and 10 non-neoplastic lung tissues. Stage I and III tumors were divided into training and testing sets, and 50 genes associated with survival were identified across the training set using univariate Cox proportional-hazard regression modeling. Expression levels of these genes were combined using a prediction algorithm to calculate a risk index, which was then used to stratify patients into low- and high-risk groups. There was a significant difference in survival between the predicted low- and high-risk groups, both among the test set samples as a whole and within the subgroup of stage I test samples. Interestingly, stratifying patients by prognostic markers such as K-ras and p53 mutation status did not identify subgroups with a significant difference in survival. After refining the predictor, it was validated across 84 lung AD samples from Bhattacharjee et al., and patients assigned to the low- and high-risk groups by gene expression varied significantly in survival. Since the publication of this study, several other studies have emerged with gene expression prognostic profiles for early-stage NSCLC (59–65).
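The train-then-predict logic of a risk index can be sketched in a few lines. The sketch assumes a linear index (a coefficient-weighted sum of expression values) with a cutoff set at the training-set median; the text above does not specify the exact formula used by Beer et al., and the gene names, coefficients, and expression values are hypothetical.

```python
from statistics import median


def risk_index(expression, coefficients):
    """Weighted sum of expression values; weights stand in for Cox coefficients."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def stratify(patients, coefficients, cutoff):
    """Label each patient 'high' or 'low' risk relative to a fixed cutoff."""
    return {
        pid: "high" if risk_index(expr, coefficients) > cutoff else "low"
        for pid, expr in patients.items()
    }


# Hypothetical coefficients: positive means higher expression, higher hazard.
coefficients = {"gene_a": 0.8, "gene_b": -0.5}

# Hypothetical training samples, used only to fix the cutoff.
training = {
    "T1": {"gene_a": 2.0, "gene_b": 1.0},
    "T2": {"gene_a": 1.0, "gene_b": 3.0},
    "T3": {"gene_a": 3.0, "gene_b": 1.0},
    "T4": {"gene_a": 0.5, "gene_b": 2.0},
}
cutoff = median(risk_index(e, coefficients) for e in training.values())

# Hypothetical test samples, classified with the cutoff learned above.
test_set = {
    "P1": {"gene_a": 2.5, "gene_b": 0.5},
    "P2": {"gene_a": 0.5, "gene_b": 2.5},
}
groups = stratify(test_set, coefficients, cutoff)
print(cutoff, groups)
```

The key design point is that both the gene weights and the cutoff are fixed on the training samples before any test sample is scored, so the survival comparison in the test set is an honest estimate of performance.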
One such study by Potti et al. (66) analyzed 89 NSCLC patients using DNA microarrays to develop a metagene prediction model capable of predicting disease recurrence. The model had a higher accuracy than models containing clinical data alone (age, sex, tumor diameter, stage of disease, histological subtype, and smoking history) or both clinical and gene expression data. The model was 72% accurate across ACOSOG Z0030 trial samples (n=25), 79% accurate across CALGB 9761 trial samples (n=84), and 80% accurate across an independent set of stage I SCC (n=15). As proposed by Potti et al., a randomized Phase III trial, CALGB 30506, is about to begin to evaluate the metagene predictor to direct adjuvant therapy in high-risk stage IA NSCLC patients. While the prediction model was validated on an independent sample set, it remains unclear if the signature is entirely related to differences in prognosis or recognized subtypes of AD (patients with BAC were not identified). In addition, the variables explored in the clinical risk model did not include potentially important prognostic indicators such as tumor grade, histological subtype of AD (67), and the mutational status of cancer-related genes (K-ras, p53). Finally, it is not clear if there were differences in the use of adjuvant chemotherapy treatment among the patients that could affect survival. The trial will hopefully answer several of these questions that were not addressed in the study.
A study by Lu et al. (68), published shortly after the Potti et al. study, performed a meta-analysis of 7 different datasets (10;38;54;69), including a previously unpublished dataset of their own, to identify a gene expression signature that predicts survival in patients with stage I NSCLC. Genes were identified that were common to the microarray platforms used in all of the studies, the datasets were adjusted for systematic bias, and 197 samples with stage I NSCLC from 5 of the 7 datasets were used to identify a gene expression signature of 64 genes predictive of survival. The signature had higher classification power compared to stage, was predictive of survival among ADs and SCCs, and was able to accurately predict survival in the 2 datasets not used to develop the signature. A subset of the 64 genes was also validated using quantitative RT-PCR and immunohistochemistry. This study demonstrates the feasibility of combining different Affymetrix DNA microarrays to increase sample size and predictive power and identify a robust gene expression signature predictive of survival.
Chen et al. (70) recently reported a 5-gene signature capable of predicting survival among patients with NSCLC. Sixteen genes were found to be associated with survival across training and test sets using DNA microarrays measuring 672 previously identified genes (71) associated with invasive activity in invasive NSCLC cell lines. A subset of the sixteen genes (n=5) was correlated with survival using quantitative RT-PCR, and this subset was used to create a decision tree that stratified patients into low- and high-risk groups for recurrence. The predictor was tested on an independent set of 60 patients and on the Beer et al. (54) dataset. The shortcomings of this study include a heterogeneous group of samples that included stage I, II, and III NSCLC samples and different subtypes of NSCLC. In addition, Chen et al. chose to focus on a set of invasive genes derived from NSCLC cell lines characteristic of the lung tumor, but not the adjacent stromal tissue. The samples used were not microdissected and had both tumor and stromal tissue, and therefore, the analysis may be missing more robust predictive genes.
Given the publication of numerous studies that have identified prognostic gene expression signatures for NSCLC, one important question concerns the comparability of these studies, as they have used different microarray platforms, analysis techniques, and samples. The Lu et al. study discussed above as well as other published studies (72–74) have demonstrated the feasibility of combining different datasets to increase the power and robustness of the prognostic signature. In addition to these studies, additional work has been done to determine the feasibility of conducting larger studies involving the participation of multiple laboratories. Recently, a large retrospective, multi-site, blinded study by Shedden et al. collected 442 lung ADs with relevant clinical, pathological, and outcome data at 4 institutions from 6 lung cancer treatment sites to characterize the performance of several prognostic models (75). The feasibility of the study was established previously by comparing gene expression data produced on the same microarray platform using a standardized protocol by the 4 participating institutions (76). Eight prognostic classifiers and classifiers based on the work of Potti et al. (66) and Chen et al. (70) were developed and evaluated on designated training and blinded test subsets of the data and produced variable results. The inclusion of clinical covariates improved the performance of most classifiers, more complex classifiers (classifiers that included more genes) had better performance, classifiers trained across samples of all stages performed better across stage I samples, and a small subset of the classifiers performed well across both test sets (from 2 different institutions).
The study illustrates many important points concerning the development of gene expression-based prognostic predictors for early stage lung cancer. While the prognostic classifiers contain different gene sets, there was some concordance between the predictions made by each of the classifiers. This suggests that the power of gene expression to predict prognosis is not restricted to the differential expression of a few genes and that each of the classifiers is measuring aspects of prognosis-related lung AD biology. Similar results have been seen in the setting of breast cancer, where various prognostic classifiers (containing different genes) show high rates of concordance in their outcome predictions of individual samples (77). It is interesting to note that for some lung tumor samples there was complete agreement or disagreement between the classifiers and clinical outcome, while for other samples there was considerable heterogeneity. There are several possible explanations for these discrepancies. Lung ADs have significant histological variation and mixed subtypes, and therefore, it is possible that for some samples, the tissue in the sample may not accurately represent the tumor or the biological process on which a particular classifier depends. In addition, heterogeneity in tissue composition and sample processing or inaccuracies in clinical information may contribute to the variability in the predictions made by the classifiers for a particular sample. There are also potential problems with using overall survival as an endpoint to evaluate prognostic gene expression signatures in subjects with “high risk” tumors that are completely resected or in subjects with “low risk” tumors that develop secondary conditions shortly after diagnosis. 
The study addresses problems that have plagued past studies, such as small numbers of samples and inconsistent and variable clinical data and sample collection, and illustrates many of the remaining challenges associated with developing a prognostic gene expression signature for clinical application.
In addition, the MicroArray Quality Control (MAQC) project led by the FDA evaluated microarray technology for its use in clinical and regulatory settings by examining repeatability of data generated within a particular site, across multiple sites, and between seven different microarray platforms (78). The study observed reproducibility of gene expression measurements between different sites and platforms. The reproducibility of gene expression measurements between sites and across platforms demonstrated by these studies is a critical milestone in the development of gene expression biomarkers that can be routinely used in the clinic.
Prior to the work of Johnson et al. (79) associating let-7 miRNA and RAS expression in lung cancer, Takamizawa et al. (80) demonstrated that reduced expression of let-7 miRNA in lung cancer was associated with shortened postoperative survival. One hundred forty-three lung tissue specimens, predominantly ADs, from stage I, II, and III lung cancers were collected from patients undergoing resection. Let-7 expression was used to dichotomize patients into two groups that had significantly different survival (p = 0.0003), whether all samples or just the ADs were analyzed. Patients with lower let-7 expression had a significantly worse prognosis, independent of disease stage.
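Dichotomizing patients by the expression of a single marker and comparing survival between the two groups can be sketched with a plain Kaplan-Meier estimator. The example assumes a median cutoff, which the text above does not specify, and the expression values and follow-up times are invented; it does not reproduce the data of Takamizawa et al.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one death.

    events[i] is True if the patient died at times[i], False if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = sum(1 for tt, _ in data if tt == t)        # leaving risk set at t
        deaths = sum(1 for tt, e in data if tt == t and e)  # deaths at t
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= n_at_t
        i += n_at_t
    return curve


# Hypothetical patients: (marker expression, follow-up in months, died?).
patients = [
    (1.2, 10, True), (0.8, 14, True), (1.5, 20, False), (0.9, 8, True),
    (3.1, 30, False), (2.8, 24, True), (3.5, 36, False), (2.6, 40, False),
]

cut = median(p[0] for p in patients)  # assumed cutoff: median expression
low = [p for p in patients if p[0] <= cut]
high = [p for p in patients if p[0] > cut]

low_curve = kaplan_meier([p[1] for p in low], [p[2] for p in low])
high_curve = kaplan_meier([p[1] for p in high], [p[2] for p in high])
print("low expression:", low_curve)
print("high expression:", high_curve)
```

In practice the two curves would be compared with a log-rank test to obtain a p value; that step is omitted here to keep the sketch short.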
Yanaihara et al. (55) used microarrays to quantify miRNA expression in lung tumors and found that two miRNAs, mir-155 and let-7a-2, were significantly associated with survival in lung ADs by Kaplan-Meier survival analysis. In a multivariate Cox proportional hazard analysis that included all clinicopathological and molecular factors, increased expression of mir-155 was significantly associated with worse prognosis. Real-time RT-PCR across an independent validation set of 32 ADs confirmed a significant relationship between mir-155 expression and survival. A subsequent study by Yu et al. (81) used real-time PCR to measure the expression of 157 miRNAs in 112 NSCLC patients to identify a 5-miRNA signature (let-7a, mir-221, mir-137, mir-372, mir-182) capable of predicting overall and relapse-free survival. Cox proportional hazard regression and risk-score analysis were used to identify the 5-miRNA signature across a training set of samples (n=56). The signature was used to predict the risk (high or low) on a test set of samples (n=56) and an independent cohort of NSCLC samples (n=62). There was a statistically significant difference in overall and relapse-free survival between low- and high-risk groups, and the signature was a reasonable predictor of survival among subsets of the samples with the same cell type or stage. Yu et al. were also able to show that modulating the levels of 4 out of the 5 miRNAs altered lung cancer cell invasiveness in vitro. The results indicate that miRNA expression profiles can be used as prognostic markers for lung cancer. Future studies profiling both gene and miRNA expression across a large cohort of early-stage ADs are needed to determine if an expression signature composed of miRNAs, mRNAs, or both has the greatest diagnostic and prognostic potential in lung cancer.
Integration of diverse sources of clinical, biological, expression, and sequence information is the promise of personalized medicine and may make it possible to individually tailor treatment regimens for lung cancer. For example, biomarkers may identify chemotherapeutic-specific lung cancer subtypes with the potential to improve prognosis through use of individualized treatments. Work in this direction is already starting to yield promising results.
Staunton et al. (82) used DNA microarrays to measure gene expression in the NCI-60 panel (a collection of 60 human cancer cell lines (83;84)). By combining the untreated gene expression profile of each cell line together with information about each cell line’s chemosensitivity profile, they were able to predict drug sensitivity in an independent test set of cell lines. A subsequent study by Potti et al. (85) repeated and built upon Staunton’s work. They showed that the drug sensitivity predictors derived from the NCI-60 data were capable of accurately predicting patient response to various chemotherapeutic agents, and were further able to predict that lung cancer patients sensitive to docetaxel were likely to be resistant to etoposide, both of which are front-line chemotherapy options. The work by Potti et al. also connected patterns of chemotherapy sensitivity with deregulation of known oncogenic pathways. For example, a relationship between docetaxel resistance and deregulation of the PI3-kinase pathway was observed. Using a panel of 17 NSCLC cell lines, a significant association was found between docetaxel resistance and sensitivity to a PI3-kinase inhibitor (LY-294002), suggesting its use as a second-line therapy.
Following the above work, Hsu et al. (86) developed predictors of cisplatin (a first-line agent) and pemetrexed (a second-line agent) sensitivity using the NCI-60 data and data from Gyorffy et al. (87). They found that sensitivity to docetaxel, abraxane, and pemetrexed was significantly inversely correlated with sensitivity to cisplatin (p<0.01), suggesting their use in cisplatin-resistant patients. Another study by Gemma et al. (88) coupled gene expression data generated using 10 human lung cancer cell lines and drug sensitivity data across 8 anti-cancer drugs used in lung cancer chemotherapy (docetaxel, paclitaxel, gemcitabine, vinorelbine, 5-FU, SN38, CDDP, and CBDCA) to demonstrate that sensitivity to gemcitabine was uncorrelated with sensitivity to the other agents, suggesting that combination therapy regimens that include gemcitabine might be interesting to pursue clinically.
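An inverse relationship between sensitivities to two drugs can be quantified with a correlation coefficient computed across samples. The sketch below uses Pearson correlation as an assumed statistic (the text above does not state which measure the studies used), and the predicted sensitivity scores are invented for illustration.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


# Hypothetical predicted sensitivity scores (0 = resistant, 1 = sensitive)
# for the same five samples under two drugs.
drug_1 = [0.9, 0.7, 0.5, 0.3, 0.1]
drug_2 = [0.2, 0.3, 0.5, 0.8, 0.9]

r = pearson(drug_1, drug_2)
print(f"correlation between sensitivities: {r:.2f}")
```

A strongly negative coefficient means that samples predicted to resist one drug tend to be predicted sensitive to the other, which is the pattern that motivates considering the second drug for patients resistant to the first. A coefficient near zero, as reported for gemcitabine against the other agents, instead indicates that sensitivity to one drug carries little information about the other.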
Many of the studies described earlier in this review profiled gene expression in primary human tumors to identify gene expression predictors of clinical and pathological variables. An exciting aspect of the studies described above is that they use gene expression information from cell lines and demonstrate that this information can lead to clinically relevant predictors of drug sensitivity in lung cancer patients. These results, while tantalizing, are preliminary and need to be validated in larger longitudinal cohorts of lung cancer patients being treated with various chemotherapeutic regimens and followed for measures of disease outcome.
The studies described in this review demonstrate the potential for gene expression signatures to impact lung cancer management; however, numerous obstacles remain to the routine application of these profiles in the clinic. Further work on computational approaches for merging datasets across platforms is needed to effectively leverage the collective data being generated. In addition, large longitudinal studies measuring gene expression as well as routine clinical, biochemical, and pathologic measures are needed to demonstrate that gene expression is a better predictor of outcome than more routine measures. This could be accomplished by leveraging existing large-scale prospective clinical trials or epidemiologic studies and collecting biological samples for gene expression studies from those subjects. Additionally, integrating high-throughput gene expression measurements with other forms of molecular data (SNPs, methylation, proteomics) may give a more complete picture and result in the identification of the most robust diagnostic, prognostic, and predictive markers. However, the ultimate barrier to adoption of these markers in the clinic is the need for more of them to be validated in prospective multicenter studies to demonstrate their reproducibility and accuracy across multiple sites and operators. Physicians and other health care providers will need to be trained in the proper handling and storage of biological specimens for gene expression studies given RNA’s inherent instability. While the FDA has begun to address some of the regulatory issues surrounding multivariate gene expression assays, additional guidance is needed from physicians, third-party payers, and regulatory bodies if these tests are to be translated into clinical benefit for lung cancer patients.
Support: This work was supported by NIH/NCI R01CA124640 (AS, MEL, and JB) and the National Institute of Environmental Health Sciences (NIEHS)/NIH U01 ES016035.
Disclosure: M.E.L. and A.S. have equity in Allegro Dx Inc.