|Home | About | Journals | Submit | Contact Us | Français|
Accurate staging and outcome prediction is a major problem in clinical management of oral cancer patients, hampering high precision treatment and adjuvant therapy planning. Here, we have built and validated multivariable models that integrate gene signatures with clinical and pathological variables to improve staging and survival prediction of patients with oral squamous cell carcinoma (OSCC). Gene expression profiles from 249 human papillomavirus (HPV)-negative OSCCs were explored to identify a 22-gene lymph node metastasis signature (LNMsig) and a 40-gene overall survival signature (OSsig). To facilitate future clinical implementation and increase performance, these signatures were transferred to quantitative polymerase chain reaction (qPCR) assays and validated in an independent cohort of 125 HPV-negative tumors. When applied in the clinically relevant subgroup of early-stage (cT1-2N0) OSCC, the LNMsig could prevent overtreatment in two-third of the patients. Additionally, the integration of RT-qPCR gene signatures with clinical and pathological variables provided accurate prognostic models for oral cancer, strongly outperforming TNM. Finally, the OSsig gene signature identified a subpopulation of patients, currently considered at low-risk for disease-related survival, who showed an unexpected poor prognosis. These well-validated models will assist in personalizing primary treatment with respect to neck dissection and adjuvant therapies.
Head and neck squamous cell carcinoma (HNSCC) is the 7th most common tumor in the world . HNSCC originates in the mucosal linings of the oral cavity, oropharynx, hypopharynx and larynx. The majority of patients (30-40%) present with oral squamous cell carcinoma (OSCC) . Classical risk factors for HNSCC are tobacco use and alcohol consumption. Additionally, human papillomavirus (HPV) infection became manifest as a cause during the last decade. The HPV-attributable fraction is highest in oropharyngeal squamous cell carcinoma (OPSCC), and varies from 20-90% depending on the geographical region . Also oral cancers may arise from HPV infection, but the attributable fraction is lower, ranging from 0-6% . OPSCCs caused by HPV infection are different at the molecular level  and have a highly favorable prognosis . This different clinical behavior led to treatment de-intensifying trials to personalize treatment and a staging adaptation in the 8th edition of the TNM Classification of Malignant Tumors of the Union for International Cancer Control (UICC) .
The 5-years overall survival for OSCC is 60%, but ranges from 10 to 80% depending on the extent of the tumor at diagnosis , as defined by the TNM stage. TNM staging is based on prognosis and employed for treatment planning in patients with OSCC , but is group-based and meets limitations for personalizing treatment of the individual patient.
OSCC is mainly treated by surgery with or without postoperative radiotherapy or chemoradiotherapy, and besides TNM stage, additional important prognostic features are derived from histopathological examination of the surgical specimen. For example, tumor-positive surgical margins (R+) and lymph node metastasis (LNM) with extracapsular spread (ECS) are classical treatment-decisive prognostic factors and indicators for postoperative chemoradiotherapy. Of note, histopathological examination of the specimen is only available for postoperative therapy decisions, and not for pre-treatment prediction of prognosis and treatment planning. Particularly for patients with a clinically N0 neck an important choice has to be made between elective treatment of the neck, with associated morbidity, or active surveillance with the risk of occult lymph node metastases that will become manifest during follow-up. Molecular profiling of tumor specimen may provide additional, objective information to improve current prognostication, and can even be performed on pretreatment biopsies to stage the neck.
Several prognostic models based on molecular profiles have been evaluated for HNSCC in general, or for OSCC specifically [10–13]. These models predicted survival of the studied populations, and added independent information to other established prognostic factors. However, none of these models has been introduced in clinical practice. Reasons are (1) insufficient clinical validation of the models, (2) the complexity and lack of reproducibility of the different profiling platforms , (3) heterogeneous study populations regarding HPV status and tumor subsite, (4) the high costs of transcriptomic profiling, and (5) the lack of compatibility with formalin-fixed paraffin-embedded (FFPE) tissue specimen. Translation of expression profiles to quantitative real-time polymerase chain reaction (qPCR) platforms using selected gene panels may overcome most of these disadvantages.
Another argument holds true for expression profiles associated with the clinically N0 neck. Previously, an expression profile has been identified and appropriately validated in a multicenter trial [15–17]. The signature remained accurate with negative predictive values (NPV) of 88% to 90% in the clinically relevant subgroup. However, the sentinel node biopsy is a competing diagnostic modality in this patient group with an even higher NPV of 95% . Notwithstanding, sentinel node biopsy has not been introduced widely, has a poor performance for floor of mouth tumors, and has the obvious disadvantage that it remains a surgical procedure with radioactive tracers, whereas for gene expression analysis only a biopsy is required. Particularly, switching to RT-qPCR analysis of a thoroughly selected gene panel may further enhance the predictive power of the gene signature because of the large dynamic range of RT-qPCR.
We therefore aimed to identify and test gene expression signatures to address these important challenges in head and neck oncology: prediction of lymph node metastasis (LNM) and overall survival (OS). First, signatures of informative genes were selected from gene expression data by regression methods. Next, a limited number of genes were selected for platform transition to RT-qPCR assays, and the prognostic power was validated using an independent cohort of surgically-treated HPV-negative OSCC patients. The molecular data were further combined with clinical and pathology data to provide the most accurate models for clinical practice to predict nodal metastatic disease and prognosis.
Microarray data from two cohorts, 150 OSCC patients from The Netherlands (Array Cohort 1, AC1) and 99 OSCC patients from Italy (Array Cohort 2, AC2), were used to identify genes related to LNM and OS (Table (Table1).1). LNM was present in 60% of AC1 patients and 49.5% of AC2 patients. In AC1, the median overall follow-up time was 7.2 years (95% CI = 6.7 – 8.1). In AC2, the median overall follow-up time was 3.5 years (95% CI = 3.3 – 4.3).
The gene selection strategy is summarized in Figure Figure11 and described in detail in the Supplementary Materials. In short, the previously published LNM gene profile [15, 17] was evaluated to predict N-stage in AC1 and AC2. Using the global test with pathological N-stage as outcome, these genes had a p-value of 9.3E-06 and 9.9E-03 in AC1 and AC2, respectively. Combined univariable analysis identified 221 significant genes (FDR<0.1, Supplementary Table 1). From these genes, 22 genes were selected for RT-qPCR validation based on their ranking in univariable and multivariable analysis.
For survival, a similar gene pre-selection strategy was hampered by the lack of thoroughly validated prognostic gene signatures in the public domain. We therefore included other techniques to reduce the dimensions of the data, but also explored all genes to ensure that important prognostic genes were not missed. We only used AC1 for gene selections, as AC2 did not pass the global test due to the shorter follow-up time (global test p-values AC1: 7.8E-3 and AC2: 0.73). Univariable analysis of all genes identified 226 (out of 37,662) significant genes in AC1 (FDR<0.1, Supplementary Table 1). Next, 20 genes were selected by univariable and multivariable analyses for survival, and 20 additional genes were selected after ranking the genes on their predictive value for recurrent disease to account for disease-specific death (see Figure Figure1).1). Two genes overlapped between the 40 survival genes and the 22 LNM genes (Supplementary Figure 1), rendering an overall signature of 60 target genes for technical and independent RT-qPCR validation (Supplementary Table 2).
First, the 60 target genes were technically validated in a subset of 20 cases from AC2 to evaluate the platform transition. For these 20 cases, correlation coefficients were calculated between microarray and corresponding RT-qPCR data (Supplementary Table 3). In total, 52 of 60 genes validated well, as they showed a good correlation between microarray and RT-qPCR data (mean r=0.64, SD=0.26). The remaining 8 genes correlated poorly, showing a correlation coefficient >1 SD below the mean. Cox regression nonetheless indicated that two of these eight genes did correlate with survival, i.e. EIF5 (P=0.011) and ATP6V0A1 (P=0.057), and these were therefore kept in the panel. The remaining 6 genes were replaced by the second best genes from the initial microarray analyses (Supplementary Table 1), and subsequently analyzed.
The RT-qPCR validation cohort consisted of 125 OSCC cases that were independent from both microarray cohorts. In this validation cohort, nodal metastasis was detected in 51.2% of patients, and the median overall follow-up time was 5.1 years (95% CI = 4.4 – 6.3) (Table (Table1).1). The selected genes were run on customized microfluidic RT-qPCR cards, and the results were tested by univariable analyses and corrected for multiple testing. From the LNMsig 15 of 22 genes had an FDR<0.1 (Supplementary Table 4). From the OSsig 10 of 40 genes had an FDR<0.1 for OS, seven of which also significantly associated with disease-free survival (DFS). Thus, after correction for multiple testing, in total 25 of 60 genes selected from microarray datasets could be validated with RT-qPCR assays in an independent patient cohort.
The performance of the LNM predictive signature is summarized in Table Table2;2; see Supplementary Table 4 for the estimates per gene. When all clinical stages of disease are considered, the AUC of this model was 0.69 (Table (Table2),2), with an NPV of 66% (Table (Table2).2). Next, we performed a subgroup analysis on the clinically relevant subset of tumors with clinical stages I and II (n=54), because these tumors qualify for transoral resection without treatment of the neck. In this subgroup, the AUC (0.66, Table Table2)2) and the sensitivity of the LNMsig (67%, Table Table2)2) were comparable with the performance statistics in all stages. The NPV, however, increased from 66% to 84% (Table (Table2).2). There were no clinical variables that correlated to LNM (data not shown) and data from histopathology is not available before surgery planning. Moreover, the fraction of occult lymph node metastasis was comparable in cT1 and cT2 tumors (i.e. 25% and 29% respectively). Previously, Van Hooff et al.  proposed a clinical decision model that recommends an elective neck dissection when the gene expression signature prediction indicates N+ or active surveillance when the prediction is N0, and estimated the benefit. Following this decision model, the LNMsig shows a similar benefit and could have prevented overtreatment in over 66% of the pN0 cases (72% or 24% overtreatment without or with the clinical decision model, respectively; see Figure Figure22).
The 40 survival genes significantly discriminated between high and low risk cases (OS: iAUC=0.63, P=1.6E-3 (global test), Table Table33 and Figure Figure3A-left;3A-left; DFS: iAUC=0.65, P=6.8E-3 (global test), Table Table33 and Figure Figure3A-right;3A-right; see Supplementary Table 4 for Ridge estimates per gene). In a clinical setting the genes should add prognostic information to established parameters. Hence, the gene signature was analyzed in context of clinical and histopathological data.
Several clinical factors were associated with OS, and none with DFS. A model was trained with the most important clinical factors for this dataset and pathological TNM-stage (pTNM). The clinical factors selected and included in the model were: age at diagnosis and smoking (i.e. packyears, PY), see Supplementary Table 5 for univariable p-values. The model with these two clinical factors and pTNM accurately predicted overall survival (iAUC=0.66, Table Table3),3), but not DFS (iAUC=0.53, Table Table3).3). Adding OSsig to this model improved the accuracy (OS: iAUC=0.68, OSSig: P=0.03 (global test), Table Table33 and Supplementary Figure 2A). For DFS, a model based on the two clinical variables + pTNM and the OSsig gave an iAUC of 0.60. Note that this is lower than a model based on the OSsig only (iAUC 0.65).
Besides pTNM, other histopathological variables are important to decide on adjuvant treatment. In the Dutch guidelines, decisive criteria for adjuvant postoperative therapy are extracapsular spread (ECS), tumor-positive margins (R+) and multiple metastatic lymph nodes (>1 LNM). We created a composite variable (pCompVar) that was scored positive if ECS or R+ or >1 LNM was present. This composite variable was combined with clinical factors (i.e. age, PY) in a prognostic model (OS: iAUC=0.73, DFS: iAUC=0.62; see Table Table3).3). The OSsig improved the accuracy of the model (OS: iAUC=0.74, OSsig: P=0.02 (global test), Table Table33 and Supplementary Figure 2B; DFS: iAUC=0.68, OSsig: P=0.01 (global test), Table Table3).3). DFS was most accurately predicted by a model that combined the OSsig and pCompVar, not including pTNM (iAUC=0.70; OSsig: P=5.6E-3 (global test)).
A subgroup analysis was performed with patients without criteria for postoperative radiotherapy, i.e. cases that were pCompVar-negative (n=79, Figure Figure3B-left).3B-left). For these cases a multi-type prognostic model was built that included clinical factors (age and smoking) and the OSsig. The iAUC increased from 0.70 to 0.73 by adding the prognostic genes (Table (Table33 and Figure Figure3B-right).3B-right). Predictive models for DFS were less accurate in this subgroup, although a predictive model with genes only showed some predictive power (iAUC=0.65, OSsig: P=0.27) (Supplementary Figure 3, Supplementary Table 3).
These findings show that the prognostic value of the OSsig adds to established clinical and pathological prognostic variables.
For additional external validation, we used RNAseq data of HPV-negative OSCC tumors from the TCGA Network publication  (n=160, Table Table1).1). The 22-gene LNMsig was significantly associated to pathological N-stage (P= 7.6E-06, global test). Moreover, the LNMsig could accurately classify the tumors with an AUC of 0.73 (95% CI = 0.67 to 0.78). The performance of the 40-gene OSsig was also significant (iAUC=0.59, P=0.02 (global test), Supplementary Figure 4). The OSsig was less informative since the average follow-up time for the 89 non-deceased cases was very short (2.2 years, SD = 2.35, Supplementary Figure 5A), and the baseline hazard was relatively high when compared to the RT-qPCR validation cohort (Supplementary Figure 5B).
We identified prognostic gene expression signatures that are predictive of LNM and OS in OSCC by rigorous gene selection and validation. First, we selected 60 genes using microarray data, and these genes were validated in an independent cohort of OSCC patients by the use of RT-qPCR assays. Finally, we built 2 multivariable genomic models: a lymph node metastasis model (LNMsig) and overall survival model (OSsig) and confirmed the additive value of the gene signatures to existing and established variables.
The LNMsig with 22 genes predicted nodal metastatic disease with an NPV of 84% in clinical stages I and II. These diagnostic performance statistics are comparable to previous results using a 732-probe microarray signature . However, the RT-qPCR approach facilitates clinical implementation considerably, because a comparable performance was achieved with less genes and a more user-friendly platform. A high NPV is necessary to identify patients who can be spared an elective neck dissection. Recent reports showed that the sentinel node biopsy (SNB), which is an alternative staging technique, is more accurate with an NPV of 95%  at comparable prevalence rates of LNM. The SNB, however, is an invasive surgical procedure with associated risks and costs, and with lower sensitivity in floor of mouth tumors [20–22]. Moreover, it has not been introduced widely. It has been suggested that a combination of an expression signature and SNB may be more accurate for staging of the clinically N0 neck .
The OSsig could be used to personalize treatment. By itself, the OSsig predicted overall survival with an iAUC of 0.63, which is already promising compared to the iAUC of 0.51 of standard pTNM. For prediction of DFS, the OSsig was even more valuable, particularly when combined with histopathology, as clinical variables were not informative for DFS. These data confirm the predictive value of the OSsig, but also indicate that integrating clinical, molecular and histopathological variables delivers most accurate predictive models.
The design of this study enabled the identification of robust associations in three ways. First, we used different gene expression platforms to cancel out platform-specific findings. Second, we studied homogeneous patient cohorts: only HPV-negative, surgically treated OSCCs were included. Finally, we considered patients from 3 European countries, thereby excluding the discovery of population-specific gene signatures.
Our findings may be limited by two factors. First, intra-tumor heterogeneity may cause differences in gene expression profiles within a tumor; although previous findings suggest that expression profiles seem stable in HNSCC . Second, all cohorts investigated were retrospective. It should be mentioned, however, that retrospective data obtained in The Netherlands are generally accurate, because treatment and follow-up of HNSCC patients has been centralized to a few clinical centers and clinical management adheres to standardized national guidelines.
Our findings suggest at least two implications. First, the prognostic model may be used for treatment escalation in patients with tumors that do not fulfill the current criteria for postoperative radiotherapy, i.e. margin involvement, >1 metastatic lymph node or ECS. Second, a model that integrates clinical variables and the OSsig accurately predicts prognosis without the addition of histopathology. This model may specifically be important to predict survival in patients who are treated with primary radiotherapy or chemoradiotherapy, since histopathology is not available for these patients. These are important directions for future work. Since frozen material is not always available in these cases, future research should also include applications for FFPE tissue. Ultimately, prospective clinical trials will be required to determine whether the integrated risk models could guide clinical decision making and improve treatment results with respect to outcome and morbidity.
Four independent cohorts of human papillomavirus (HPV)-negative OSCC patients were included (1) a cohort of 2 merged tumor gene expression profiles (array cohort 1, AC1) from the University Medical Center Utrecht (UMCU) and VU University Medical Center Amsterdam (VUmc); (2) a cohort of tumor gene expression profiles (array cohort 2, AC2) from the University Hospital Parma Medical Center (UHPMC); (3) an independent cohort of frozen tumor samples from VUmc, UHPMC and University Hospital Düsseldorf (UHD) for RT-qPCR gene expression profiling (qPCR cohort); and (4) an RNAseq dataset of OSCC tumors from The Cancer Genome Atlas (TCGA) Network . Use of tissue from surgical specimen adhered to nation- and institution-specific procedures and guidelines. Informed consent was obtained of enrolled patients, when required. This study followed the Guidelines for the REporting of tumor MARKer Studies (REMARK)  (Supplementary Table 6).
HPV status was either determined with p16 immunostaining followed by HPV DNA PCR on p16-positive samples (AC1) and/or with HPV16 E6*I RT-PCR in the AC1 and qPCR cohorts. Both assays have been validated and described before . In AC2, the HPV status was not available. In the other cohorts on the other hand, 1 out of 151 (AC1) and 1 out of 126 (qPCR cohort) tumors were HPV-positive. Hence, the contribution of HPV positive tumors in AC2 was assumed to be low and no samples were excluded.
Similarly preprocessed VUmc (GSE84846) and UMCU (GSE30788) microarray datasets were combined, and comparability of the expression data of both centers was ensured. Data from AC2 (GSE84846) were not combined to the other datasets, because of a different reference design: Universal Human Reference RNA (cat. 740000, Agilent Technologies, Santa Clara, CA, USA) in AC1 and a pool of cell line RNA in AC2 (CAL 27, ATCC CRL-2095, American Type Culture Collection, Manassas, VA, USA). All preprocessing steps of the microarray data were performed using the limma package  in R (Supplementary Materials).
RNA was purified from fresh frozen tumor tissue and synthesis of cDNA was performed from 1 μg of total RNA using the High-Capacity RNA-to-cDNA Kit (cat. 4387406, Applied Biosystems; Foster City, CA). qPCR was performed using Taqman Low-Density Array (TLDA) Cards (cat. 4346800, Applied Biosystems) (Supplementary Table 2). qPCR Ct values were determined with predefined thresholds that were equal per gene for all patients. Relative gene expression was determined by the ΔΔCt method  using GUSB Ct-values for normalization. GUSB was selected as the most stable housekeeping gene (see Supplementary Table 7) out of four candidate genes (GAPDH, GUSB, RPLP0, and RPL4).
Per dataset, the predictive power for LNM and survival was assessed with the global test [29, 30]. Datasets with significant predictive power (p <0.05) were used for gene selection. Genes were selected from the microarray data by using a combination (detailed later) of lasso logistic regression or lasso Cox regression and univariable FDR-based association analysis. The latter was included to enhance reproducibility of individual markers assayed by qPCR. The gene selection procedure is displayed in Figure Figure11 and further detailed in the Supplementary Materials. For the LNM genes, the p-values per gene of AC1 and AC2 were combined by Fisher's combined probability test, whereas for the prognostic genes only p-values of AC1 were considered, because the AC2 data did not pass the global test. For technical validation, the correlation between microarray and RT-qPCR data of 20 cases was determined by Pearson's correlation coefficient. For the RT-qPCR data, the univariable association of delta Ct values of the selected genes with either LNM or OS was determined by logistic or Cox regression, respectively. For prediction on independent samples, clinical variables were selected using stepwise regression, followed by adding the selected genes in a logistic (Cox) ridge regression to render multi-type prediction models. Model performance was assessed by bootstrapping. The prediction models for outcome consisted of (1) prognostic genes, (2) significant clinical factors and pathological TNM-stage (pTNM), (3) significant clinical factors and a composite pathological variable (positive if ECS or R+ surgical margins or >1 LNM was present), and the combinations (4) 1+2 and (5) 1+3. The predictive performance was assessed by area-under-the-ROC-curve (AUC) and integrated AUC (iAUC) over 5-year follow-up time for LNM and OS, respectively, complemented for LNM by the negative predictive value (NPV). Additive value of the gene signature was assessed with the global test. All statistical tests performed were two-sided. Univariable p-values were corrected for multiple testing using the Benjamini-Hochberg FDR procedure .
We thank V. Okpanyi and L. Colter for sample preparation and assembly of clinical data; F. Rustenburg, P. Eijk, S. Smeets and B. Ylstra for processing of the VUmc microarrays; Agendia for making the UMCU data available; M. Doeleman and D. Heideman for the E6*I RT-PCR HPV testing; and A. Visser and C. Oudejans for their assistance with the qPCR microfluidic cards.
Author contributionsSM: conducting experiments, acquiring data, analyzing data, writing the manuscript
DtB: analyzing data, writing the manuscript
TP: designing research studies, writing the manuscript
SR: conducting experiments, acquiring data
KS: acquiring data
WvW: analyzing data
AB: conducting experiments
NB: conducting experiments, acquiring data
DL: acquiring data
ES: acquiring data
PvD: acquiring data, writing the manuscript
EB: acquiring data, writing the manuscript
RL: designing research studies, writing the manuscript
MvdW: designing research studies, analyzing data, writing the manuscript
RB: designing research studies, analyzing data, writing the manuscript
CONFLICTS OF INTEREST
The authors have declared that no conflicts of interest exists.
This work was supported by grants from the Dutch Cancer Society, KWF (grant number 2008-4201) and European Union's Seventh Framework Project (grant agreement 224483: NeoMark, and grant agreement 611425: OraMod).