Our primary result was the detection of four gene expression subtypes in HNSCC – basal, mesenchymal, atypical, and classical. We also showed that these subtypes have biological and clinical relevance, and they therefore provide a useful and informative means of classifying HNSCC tumors that complements existing methods based on histology and tumor site. Analysis of publicly available expression datasets revealed that these subtypes are reproducible in HNSCC and are remarkably similar to those found in LUSC. Although gene expression patterns for the secretory LUSC subtype are similar to those seen in the mesenchymal subtype of HNSCC, we favor an alternate nomenclature: data confirming the glandular origin of HNSCC are less compelling than those for the lung, and evidence of a mesenchymal signature is abundant. While it would be possible to use the existing data to produce a gene predictor for HPV status, we did not attempt to do this because results of this nature were presented by Martinez et al. Regions of recurrent DNA copy number gain and loss were detected, some of which contain known oncogenes and tumor suppressors. The copy number values in certain aberrant regions were associated with tumor subtype, which suggests that copy number events may contribute to the development of expression subtypes. All of the expression subtypes were detected in HNSCC cell lines, a finding that provides the basis for future studies.
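As an illustration of how an association between regional copy number and expression subtype might be assessed, the following minimal sketch runs a permutation test on the between-group sum of squares. The choice of test is an assumption made for illustration, since this section does not state which test was used; the function names and the toy values are hypothetical and are not data from this study.

```python
import random

def between_group_ss(values, labels):
    """Between-group sum of squares of `values` grouped by `labels`."""
    grand_mean = sum(values) / len(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )

def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """Estimate a p-value by shuffling subtype labels across tumors."""
    rng = random.Random(seed)
    observed = between_group_ss(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if between_group_ss(values, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical log2 copy number ratios for one region in 12 tumors
# (three per subtype); illustrative values only.
values = [0.05, -0.10, 0.00, 0.10, -0.05, 0.02,
          -0.08, 0.04, 0.01, 0.90, 1.00, 1.10]
labels = ["BA"] * 3 + ["MS"] * 3 + ["AT"] * 3 + ["CL"] * 3
p_value = permutation_p_value(values, labels)
```

A small p-value here would indicate only that the region's copy number differs across subtypes in the toy data; in practice, testing many regions would also call for a multiple-testing correction.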
We now briefly discuss the definitions of the expression subtypes. Basal and classical were chosen because the expression patterns in these subtypes showed strong similarities to the basal and classical subtypes of LUSC. Wilkerson et al. compared the expression patterns in the LUSC subtypes to time course data from developing human bronchial epithelial cells, and they found that the basal subtype had similar expression patterns to those seen at early time points when basal cells are most common. Similarly, as shown in Table S2, we observed that the basal subtype of HNSCC is most similar to the day 3 time point in the time course data from the ALI model. The classical subtype exhibits canonical genomic alterations associated with squamous cell carcinoma – e.g. deletion of 3p and 9p, amplification of 3q, and focal amplification of both EGFR
. Mesenchymal was selected based on pathway analysis indicative of an epithelial-to-mesenchymal transition. Finally, atypical was chosen because of the lack of either EGFR amplification or deletion of 9p.
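The comparison of a subtype's expression pattern with time course profiles can be sketched as follows. This is a minimal illustration that assumes similarity is measured by Pearson correlation between a subtype centroid and each time point over a shared gene set; the similarity measure, the function names, and all values are hypothetical rather than taken from this study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def most_similar_profile(centroid, profiles):
    """Return the name of the reference profile best correlated with the centroid."""
    return max(profiles, key=lambda name: pearson(centroid, profiles[name]))

# Hypothetical toy values over five shared genes (not data from the study).
basal_centroid = [2.0, 1.5, -0.5, -1.0, 0.2]
time_course = {
    "day_3": [1.8, 1.2, -0.4, -1.1, 0.0],
    "day_10": [0.1, 0.3, 0.2, -0.2, 0.1],
    "day_28": [-1.5, -1.0, 0.8, 1.2, -0.1],
}
best = most_similar_profile(basal_centroid, time_course)
```

With these toy values the centroid is most strongly correlated with the earliest profile, which mirrors the kind of comparison summarized in Table S2.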
The differences in the expression patterns found in the subtypes are clinically relevant. TP63
produces six distinct proteins, and ΔNp63 is the most abundant isoform in HNSCC 
. Yang et al. 
showed that ΔNp63 promotes cell proliferation. Chatterjee et al. 
noted that exposure to cisplatin led to decreased levels of ΔNp63, so this treatment may be particularly effective for patients with BA tumors. Barbieri et al. 
showed that loss of TP63
in HNSCC cell lines led to the acquisition of a mesenchymal phenotype, which is compelling in light of the low expression levels of TP63
seen in MS. Martin and Cano 
indicated that elevated expression of TWIST1
in HNSCC cell lines could increase the likelihood of invasiveness and migration. Because MS tumors exhibited an EMT phenotype and increased expression of both TWIST1
, these subjects may be more likely to develop distant metastases. The fact that EGFR
is overexpressed in the vast majority of HNSCC tumors 
makes EGFR inhibitors an attractive treatment option for this disease. However, these therapies are less likely to be effective in AT tumors because EGFR
expression was lower than in the other expression subtypes. SOX2
were highly expressed in AT and CL, and both of these genes are putative cancer stem cell markers because of their contributions to self-renewal and a pluripotent phenotype 
. The protein product of PIK3CA
is p110α, which phosphorylates AKT. Activated AKT contributes to the survival of tumor cells, and thus oncogenic transformation 
. West et al. 
showed that exposing normal lung epithelial cells to nicotine facilitates activation of AKT by making it dependent on PI3K alone. This observation, combined with the high levels of smoking seen in CL, suggests that PI3 kinase inhibitors provide an attractive treatment option for CL tumors.
There were several limitations to this study. First, we did not have GE, CN, and clinical data for all study subjects, which limited our ability to jointly analyze these variables. In addition, although the subtype labels were objectively defined by a clustering algorithm and the gene expression patterns were independently validated, the clinical associations were not. Copy number arrays were generated for all samples with sufficient quality and quantity of DNA. Unfortunately, over 20% of the arrays failed to meet standardized quality metrics. Also, it was not clear which isoform(s) of TP63 were assayed by our gene expression arrays, and the role that TP63 plays in the basal subtype cannot be fully appreciated without knowledge of these isoforms. Because the HPV+ samples were removed when conducting our secondary survival analysis, these results should be viewed as exploratory and must be independently validated. Finally, HPV status was not available for all patients.
In conclusion, we confirmed four molecular classes of HNSCC (basal, mesenchymal, atypical, and classical), consistent with signatures established for squamous carcinoma of the lung. Using an integrated genomic analysis and validation methodology, we documented subtypes identified by canonical tumor suppressor genes and oncogenes, including deregulation of the KEAP1/NFE2L2 oxidative stress pathway, differential utilization of the lineage markers SOX2 and TP63, and preference for the oncogenes PIK3CA and EGFR. For potential clinical use, the signatures are complementary to classification by HPV infection status as well as the putative high-risk marker CCND1 copy number gain. A molecular etiology for the subtypes is suggested by statistically significant chromosomal gains and losses and differential cell-of-origin expression patterns. Model systems representative of each of the four subtypes were also presented.