Lung cancer accounts for 28% of all cancer deaths, the highest percentage of any cancer. Non-small cell lung cancer (NSCLC) accounts for ~85–90% of lung cancers, of which adenocarcinoma and squamous cell carcinoma (SCC) are the most common subtypes. Although upwards of 70% of NSCLC patients have advanced disease that is rarely curable at diagnosis, recent advances for subsets of lung adenocarcinomas that harbor EGFR mutations or EML4-ALK gene fusions encourage the development of targeted therapies that may alter this dire situation. These genetic alterations occur primarily in adenocarcinomas of patients who never smoked, and are uncommon in SCC, which is predominantly associated with smoking. While FGFR1 and DDR2 have recently emerged as potential therapeutic targets for some SCC patients, inhibitors have yet to reach clinical trials. Recent high-throughput sequencing studies of NSCLC, which focused primarily on DNA, have shown that few genes are mutated at a sufficiently high frequency to be useful for targeted therapy; however, these studies do predict DNA alterations that frequently cluster in a limited number of important molecular pathways, suggesting that targeting these pathways may be a viable therapeutic strategy. Deep transcriptome (RNA-seq) profiling of NSCLC to identify genes whose expression is commonly deregulated across tumors has not yet been reported, although such reports are to be expected given the large RNA-seq datasets being generated by The Cancer Genome Atlas (TCGA) and other consortia.
Cancer cells within an individual tumor exist in distinct phenotypic states that often exhibit important functional differences. A subpopulation of cells with self-renewing and tumor-initiating capabilities, commonly referred to as cancer stem-like cells (CSCs), has been identified in a variety of tumor types including NSCLC. Mounting evidence suggests that CSCs are resistant to anticancer therapies and underlie metastasis, and hence are the primary cancer cell type responsible for relapse and progression of malignant tumors. The immediate implication is that by targeting CSCs it should be possible to eradicate the drug-resistant and metastatic subpopulation of a cancer. However, recent studies have demonstrated that the CSC phenotype is plastic and can be reconstituted by other, non-CSC tumor cells; thus, not just CSCs but all tumor subpopulations that are “potential CSCs” must be targeted. Transcriptome sequencing of CSC and non-CSC subpopulations in NSCLC would provide insights into the molecular basis of their phenotypic similarities and differences and facilitate the identification of novel therapeutic targets. Such analysis will be an important and necessary complement to the bulk tumor transcriptome profiling being performed by TCGA and others.
The observations that non-CSCs can reconstitute CSCs, and vice versa, suggest that the phenotypic differences between these subpopulations are due to epigenetic rather than genetic differences. Therefore, exome and genome sequencing experiments aimed at identifying somatic mutations are not expected to reveal differences between sorted CSC and non-CSC subpopulations. On the other hand, transcriptome profiling, which is a readout of the epigenome (i.e., the histone marks and DNA methylation that regulate expression), should be an excellent method for profiling CSCs and non-CSCs to reveal mechanistic differences. A key advantage of RNA-seq over microarrays is the ability to analyze differences in isoform expression. In cancer cells, alternative mRNA isoforms can produce protein isoforms with dominant-negative activity. The pathogenic role of cancer-specific isoforms has been extensively demonstrated across all aspects of cellular physiology, including cellular adhesion and metastasis (CD44 and RON), cell growth and tumorigenesis (PKM2, MDM2, FGFR2, CRK, NUMB), cell cycle (PYK), angiogenesis (VEGF), apoptosis (GSK3B, CD95, Bcl-X, caspase-2, caspase-9), metabolism (PK), and drug resistance (AR and MRP-1). These examples underscore the advantage of isoform-level transcriptome information over whole-gene expression for gaining insights into the molecular mechanisms underlying the phenotypic differences between CSCs and non-CSCs.
Here we report the application of genomics technologies to an SCC xenograft that was sorted into CSC and non-CSC subpopulations based on the CD133 marker. CD133 (PROM1; Prominin-1) is a 5-transmembrane glycoprotein that is considered to be a marker for the subpopulation of CSCs in both subtypes of NSCLC. In NSCLC, the CD133+ subpopulation has been shown to have higher tumorigenic potential in SCID mice, to express higher levels of stemness genes, and to be more resistant to conventional chemotherapy than the CD133− subpopulation. Importantly, so that the SCC xenograft would be more representative of the primary tumor, it was directly engrafted as minced primary tumor into NSG mice and was never grown in vitro. Whole-genome DNA analysis revealed that the chromosomes of the CD133+ and CD133− subpopulations were highly deranged in a very similar manner; however, as expected, the tumor did not harbor clinically actionable mutations. Analysis of the mRNA splice isoform expression profiles of the CD133+ and CD133− subpopulations identified SCC as a potential new indication for numerous drugs currently in development and suggested several additional promising targets. Finally, analysis of publicly available RNA-seq transcriptome data from 221 SCCs in The Cancer Genome Atlas (TCGA) supports the generality of our transcriptome findings for this disease. Altogether, our study demonstrates the capability of transcriptome sequencing of sorted cancer cell subpopulations to inform clinical development in ways that are not possible with DNA sequencing.