The principal novel hypothesis tested in this study is that lung SCC expression subtypes exist, are reproducible and clinically relevant, and exhibit patterns that correlate with unique cell types in the normal lung. These subtypes (primitive, basal, secretory, and classical) were identified in an unbiased and objective manner and are supported by cross-cohort validation using five training cohorts and by independent validation using a sixth cohort, which together total 438 patients. The expression subtypes were also found in a wide variety of patient populations from the United States, Asia, and Europe, and in cohorts ranging in size from 36 to 127 patients. All cohorts showed approximately the same subtype proportions; overall, these were primitive, 16%; classical, 37%; secretory, 26%; and basal, 21%. These subtypes were associated with tumor differentiation and patient sex. Survival outcomes are significantly different among the subtypes, and subtype is an independent predictor of survival. Possible limitations of our analysis include sample quality artifacts or patient behavior, such as smoking immediately prior to surgery; however, all six cohorts showed the same results, so any such limitation would have to occur in six large, independently collected cohorts.
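As a quick arithmetic check on the overall proportions reported above, the following minimal sketch confirms that the four percentages sum to 100 and converts them into approximate patient counts out of 438. The counts are back-calculated from rounded percentages and are therefore illustrative assumptions, not figures reported in this study; the names `SUBTYPE_PERCENT`, `TOTAL_PATIENTS`, and `implied_counts` are hypothetical.

```python
# Overall subtype proportions (percent) and total patients, as stated in the text.
SUBTYPE_PERCENT = {"primitive": 16, "classical": 37, "secretory": 26, "basal": 21}
TOTAL_PATIENTS = 438


def implied_counts(percent_by_subtype, total):
    """Approximate patients per subtype from rounded percentages.

    Illustrative only: the true per-subtype counts are not given in this
    section, so these values are estimates derived from rounded figures.
    """
    return {name: round(total * pct / 100) for name, pct in percent_by_subtype.items()}


percent_sum = sum(SUBTYPE_PERCENT.values())
counts = implied_counts(SUBTYPE_PERCENT, TOTAL_PATIENTS)

print("Sum of percentages:", percent_sum)
for name, n in counts.items():
    print(f"{name}: about {n} patients")
```

Because the percentages are rounded, the implied counts may differ from the actual per-subtype counts by a few patients.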
The SCC expression subtypes are biologically distinct and show similarities to distinct normal lung cell populations. These biological characteristics serve as the basis for the SCC nomenclature. The basal subtype exhibits many characteristics of lung basal cells, such as cell adhesion and epidermal development functional themes, the basal cell markers S100A2 and KRT5, overexpression of genes whose products are localized in the basement membrane, similarity to basal cells in the HBEC-ALIC model, and similarity to surface epithelia in the HMLCC model. The secretory subtype has many features of lung secretory cells, such as surfactant and mucin overexpression, similarity to secretory cells in the HBEC-ALIC model, and similarity to submucosal glands in the HMLCC model. The primitive subtype has a cellular proliferation functional theme, the worst survival outcome, an overabundance of female patients, the most nonsmokers, and an overabundance of poorly differentiated tumors. This subtype is similar to early embryonic mouse lungs, where primitive, less differentiated cells may be predominant, which would be consistent with the poorly differentiated nature of these tumors. The primitive subtype is also similar to late-stage HBEC-ALIC, which could be explained by lung “transient expression,” in which differentiation markers are expressed during early lung formation and again in the developed lung (48). Alternatively, a late-emerging and late-active cell type in HBEC-ALIC may be most similar to the embryonic mouse lung. The classical subtype exhibits features representative of typical lung SCC, including the highest prevalence at 37%, an overabundance of males, the greatest patient smoking behavior, overexpression of TP63, and putative amplification of the TP63-containing locus 3q27-28.
The distinct similarities between SCC subtypes and cell populations could be explained by the SCC subtypes having different ancestor cells. These different ancestor cells could be cell types of distinct lineages or cellular differentiation stages, as has been proposed in breast cancer (49). This scenario provides a reason why the SCC subtypes have dramatically different mRNA expression. The subtypes could arise by genetic mutation from different ancestors that have different mRNA expression, and this ancestral mRNA expression could persist in progeny tumor cells. This putative ancestral cell information could be used to develop subtype-specific pharmacologic interventions that exploit differences in the ancestral cell types. A caveat to our interpretation is that the similarity between SCC subtypes and cell populations could be coincidental, and that expression similarities could reflect similar biology rather than similar origin. The lung has multiple proposed cellular development pathways, and future studies that describe the molecular profiles of the lung cell types or lung cancer stem cells would further clarify the putative ancestral cells of the SCC subtypes (50).
The SCC subtypes may have applications in patient care and in cancer research. For instance, patients with the primitive subtype could be treated more aggressively because of this subtype’s poor survival expectation or could be given a more accurate prognosis than by using traditional prognostic factors alone. Basic cancer research could be conducted using the subtype model system partners described in this study. The SCC subtypes could be useful for therapy benefit studies and possibly serve as a foundation for clinical trial selection.
In conclusion, we identified four robust expression subtypes of lung SCC using a multi-cohort discovery and validation strategy. The subtypes are clinically and phenotypically different, which suggests that they may warrant different therapies.