HNSCC is the sixth most common non-skin cancer in the world, with an incidence of ~600,000 cases per year and a mortality rate of ~50% (1). The major risk factors for HNSCC are tobacco use, alcohol consumption, and infection with human papillomavirus (HPV) (2). Despite advances in our knowledge of its epidemiology and pathogenesis, the survival rates for many types of HNSCC have improved little over the past forty years (3). As such, a deeper understanding of HNSCC pathogenesis is needed to promote the development of improved therapeutic approaches.
We performed solution-phase hybrid capture and whole exome sequencing on paired DNA samples (tumors and matched whole blood) from 92 HNSCC patients. Most anatomic sites were represented (oral cavity, oropharynx, hypopharynx, larynx, and sinonasal cavity; table S1). Of these patients, 89% and 79% reported a history of tobacco and alcohol use, respectively (table S1). Initially, 14% of all tumors and 53% of oropharyngeal tumors were found positive for HPV based on HPV-16 PCR/in situ hybridization (table S1
). Tumor copy-number analysis using SNP arrays (fig. S1
) replicated previous findings of frequent CCND1
deletions, and rarer MYC
, or CCNE1
), indicating that the collection is genetically representative of HNSCC.
Fig. 1 Mutation rates, base substitution frequencies, and rearrangements in head and neck cancers. (A) Rate of synonymous and non-synonymous mutations, expressed in mutations per megabase of covered target sequence. Non-synonymous mutation rates range from 0.43 ...
We achieved 150-fold mean sequence coverage of targeted exonic regions, with 87% of loci covered at >20-fold (figs. S2 and S3 and table S2). We excluded from further analysis 18 tumors in which initial analysis revealed extensive stromal admixture (figs. S3 and S4 and supplemental methods), leaving 74 samples for analysis. We also performed whole genome sequencing (31-fold mean coverage, table S3) on an oropharyngeal tumor and a hypopharyngeal tumor.
On average, 130 coding mutations per tumor were identified, 25% of which were synonymous. We queried 321 of these mutations by mass spectrometric genotyping and validated 288 (89.7%). However, the validation rate increased to 95.7% for mutations whose allelic fraction was >20% of total DNA, suggesting that the sensitivity of mass spectrometric genotyping may be reduced in the setting of increased stromal admixture.
The overall HNSCC mutation rate was comparable to that of other smoking-related malignancies such as small cell lung cancer and lung adenocarcinoma (5). The mutation rate of HPV-positive tumors was approximately half of that found in HPV-negative HNSCC (mean of 2.28 mutations/Mb compared with 4.83 mutations/Mb; p = 0.004), consistent with epidemiologic studies suggestive of biological differences between HPV-positive and HPV-negative disease. The two tumors that underwent whole-genome sequencing harbored 19 (HN_62469) and 111 (HN_62699) “high-confidence” somatic rearrangements, respectively (fig. S5 and tables S4 and S5).
Although base mutation rates varied widely (0.59–24 mutations/Mb), the average rate of guanosine-to-thymidine (G→T) transversions at non-CpG sites (12% ± 6%) was characteristic of tobacco exposure. Among patients who reported a smoking history, tumors with the highest fraction of G→T transversions showed a tendency toward increased overall mutation rates (p = 0.02). Thus, the G→T transversion frequency may represent a robust readout of “functional” tobacco exposure. We observed differences in mutation rates and G→T transversion frequencies by tumor site even when restricting the analysis to HPV-negative tumors. In particular, HPV-negative laryngeal cancers exhibited higher mutation rates and G→T transversion frequencies compared to HPV-negative cancers found in the oral cavity, oropharynx, hypopharynx, or sinonasal cavity (p = 0.008 and p < 0.0001, respectively; fig. S8).
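As an illustration of the two quantities discussed above, the following minimal Python sketch computes a mutation rate per megabase of covered sequence and the fraction of substitutions that are G→T transversions at non-CpG sites. The record layout, the function names, and the toy values are assumptions made for this example and are not taken from the study. The denominator (all single-base substitutions) is also an assumption, since the text does not define it. Because a G→T change on one strand appears as C→A on the other, the sketch folds each substitution onto a pyrimidine reference base before counting.

```python
# Complement map used to fold substitutions onto a C or T reference base,
# so that G>T and its reverse-complement C>A are counted together.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def fold(ref, alt):
    """Return the substitution expressed with a pyrimidine (C or T) reference."""
    if ref in ("C", "T"):
        return ref, alt
    return COMPLEMENT[ref], COMPLEMENT[alt]


def mutation_rate_per_mb(n_mutations, covered_bases):
    """Mutations per megabase of covered target sequence."""
    return n_mutations / (covered_bases / 1e6)


def gt_transversion_fraction(mutations):
    """Fraction of substitutions that are G>T (or C>A) at non-CpG sites.

    `mutations` is a list of (ref, alt, in_cpg) tuples. This layout is
    hypothetical and chosen only for the example.
    """
    if not mutations:
        return 0.0
    hits = sum(
        1
        for ref, alt, in_cpg in mutations
        if fold(ref, alt) == ("C", "A") and not in_cpg
    )
    return hits / len(mutations)


# Toy input (invented values, not study data).
toy_mutations = [
    ("G", "T", False),  # G>T at a non-CpG site: counted
    ("C", "A", False),  # same change seen on the opposite strand: counted
    ("C", "T", True),   # transition at a CpG site: not counted
    ("A", "G", False),  # transition: not counted
]

print(gt_transversion_fraction(toy_mutations))
print(mutation_rate_per_mb(150, 30_000_000))
```

A per-tumor table of these two values is the kind of input that the comparisons by smoking history and by tumor site would start from.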
Notwithstanding the overall apparent correlation between G→T transversions and mutation rates, several “outlier” tumors showed elevated mutation rates despite a low fraction of G→T transversions. Some of these tumors contained mutations in one or more DNA repair genes. Strikingly, both HNSCC tumors with the highest mutation rates occurred in non-smokers. These results raise the possibility that some HNSCC tumors may contain genetic alterations that promote elevated mutation rates apart from the effects of tobacco (SOM).
To explore the biological basis of HNSCC in an unbiased manner, we used the MutSig algorithm (7) to identify genes harboring more mutations than expected by chance, given the total number of mutations detected. This analysis revealed 39 genes with high statistical significance (false discovery rate q < 0.1; figs. S6 and S7 and tables S6 and S7). Compared to recent cancer genome projects such as ovarian cancer and multiple myeloma (7), our analysis of HNSCC revealed a larger number of significantly mutated genes. However, the majority of mutated genes did not reach statistical significance (table S6), suggesting that many may contain passenger events. Thus, we hypothesized that the MutSig algorithm identified an enriched set of genes that likely underwent positive selection during tumorigenesis. Consistent with this hypothesis, numerous significant genes had previously been implicated in HNSCC, including TP53
, and PIK3CA
) (q < 0.1), providing support for the validity of the approach. TP53, the most commonly mutated gene in HNSCC, was also disrupted by a 100 kb deletion detected by whole genome sequencing and validated by a focal copy number change detected by SNP array (fig. S9). However, most significantly mutated genes had not previously been implicated in HNSCC.
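The q-value thresholds used above (q < 0.1) refer to a false discovery rate. The following sketch shows how per-gene p-values can be converted to q-values with the Benjamini-Hochberg procedure, a standard FDR method. It is an assumption that this procedure matches the one applied here, and the sketch does not reproduce MutSig itself, which also models background mutation rates. The gene labels and p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as `p_values`."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical per-gene p-values (placeholders, not study results).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
p_values = [0.01, 0.04, 0.03, 0.20]

q_values = benjamini_hochberg(p_values)
significant = [g for g, q in zip(genes, q_values) if q < 0.1]
print(significant)
```

With these toy inputs, three of the four placeholder genes fall below q = 0.1, which shows how a fixed q threshold yields a list of "significantly mutated" genes while controlling the expected share of false positives in that list.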
To explore their biological significance, we first considered mutated HNSCC genes that also undergo frequent genetic alterations in other cancers. NOTCH1 was particularly noteworthy: point mutations affecting this gene occurred in 11% of the HNSCC tumors (tables S6 and S7), and focal deletions were seen in two additional tumors. Previous evidence from animal models had implicated Notch dysregulation in cutaneous squamous cell carcinoma (9), but somatic NOTCH1 mutations had not previously been identified in squamous malignancies. In addition, we found non-synonymous point mutations in NOTCH2 in 11% of the samples (table S6) and a focal deletion of NOTCH3 in one additional case. Whereas NOTCH1 contains activating mutations in T-cell acute lymphoblastic leukemia and chronic lymphocytic leukemia (10) and NOTCH2 contains activating mutations in diffuse large B-cell lymphoma (12), the mutations in HNSCC appeared to be loss-of-function mutations, consistent with those recently described for myeloid leukemia (13).
Fig. 2 NOTCH gene mutations identified in head and neck cancer. (A) Schematic diagram of the domain structure of NOTCH1 (domain structures of NOTCH2 and NOTCH3 are similar). All nonsense mutations occur upstream of the TAD domain, which is required for transactivation ...
Fig. 3 Genetic disruption of a squamous differentiation program in head and neck cancers. (A) Heatmap representation of individual mutations present in a series of 74 tumors, represented in columns. (Top) HPV status by tumor. (Middle) Matrix of mutations in ...
nonsense mutations in HNSCC are predicted to generate truncated proteins that lack the C-terminal ankyrin repeat domain, a region critical for transactivation of target genes (14). Five additional mutations (four missense and one in-frame deletion) cluster in highly conserved residues situated within or near the extracellular ligand binding domain. Two others are splice-site mutations that may generate truncated proteins or delete critical functional residues (e.g., ligand binding or activation by proteolytic cleavage). Together, these findings suggested that NOTCH dysregulation, and more generally mechanisms governed by NOTCH signaling, contribute to the genesis or progression of HNSCC.
To further interpret the mutations identified in HNSCC, we looked for functionally related “gene sets” harboring an excess of mutations. For this purpose, we considered an expanded list of 76 genes (q < 0.25; table S7) and looked for enrichment in functional gene sets. The highest-scoring gene set contained genes related to epidermal development (table S8). The significantly mutated genes (q < 0.25) in this gene set included NOTCH1
, and TP63
. These genes are all clearly related to squamous differentiation. The most abundant TP63
protein product in squamous epithelia, known as ΔNp63, promotes renewal of basal keratinocytes by a mechanism that requires downregulation of NOTCH1
). IRF6, in turn, has been implicated in the proteasomal degradation of ΔNp63 (18). Furthermore, terminal differentiation in squamous epithelia is induced in response to genotoxic stress by a mechanism involving p53-dependent transactivation of NOTCH1, an activity antagonized by ΔNp63 (19). Because HNSCC involves transformation of the squamous epithelial lineage, which is histologically similar to the epidermis, these findings led us to hypothesize that mutations in such genes disrupt a stratified squamous development/differentiation program in precursor cells of this malignancy.
Further inspection of recurrent mutations identified eleven additional genes carrying disruptive mutations that function in the squamous differentiation program. The evidence includes mouse knockouts with defects in squamous epithelial differentiation (Notch1
, and Dicer1
) (20
); human germline mutations causing orofacial clefting syndromes (IRF6
, and MLL2
) and knockdown or deregulated expression leading to a differentiation block and increased proliferation in cultured human keratinocytes (TP63
). Thus, many mutated genes in HNSCC may govern squamous differentiation. These mutations may promote an immature and more proliferative basal-like phenotype, consistent with known stages of progression and markers of differentiation in HNSCC.
We also found recurrent mutations in less well-characterized genes. For example, mutations in SYNE1
were observed in 20% and 8% of HNSCC samples, respectively (fig. S7
and tables S6 and S7
). These genes have been implicated in the regulation of nuclear polarity (28
), a process that operates upstream of NOTCH1 in squamous epithelia (29).
mutations were seen in 11% and 12% of cases, respectively; the corresponding proteins mediate calcium sensing (30
), another crucial process for terminal squamous differentiation (20).
Beyond the genes directly involved in squamous differentiation, we found mutations involving two apoptosis-related genes: CASP8 (8%) and DDX3X (4%) (fig. S7 and table S7). Thus, suppression of apoptosis may also contribute to HNSCC pathogenesis, perhaps in concert with disrupted squamous maturation. The histone methyltransferases PRDM9 (11%) and EZH2 (6%) also show highly significant mutation rates.
Viral infection by HPV figures prominently in the etiology of a subset of HNSCC, and is most frequently detected by in situ hybridization (ISH) or p16 immunohistochemistry. We reasoned that HNSCC genome sequencing might also offer a robust HPV detection method. We therefore utilized the PathSeq algorithm (31) and a viral sequence database to identify HNSCC sequencing reads that aligned to HPV genomes. We observed HPV-16 sequence reads in 14 tumors (19%) (range: 1–40,000 reads), 11 of which were also positive by HPV-16 PCR (p < 0.0001; table S9). The three tumors that were HPV-negative by PCR had very low HPV-16 sequence read counts (fig. S10); this may reflect reduced HPV dosage or technical contamination. We observed an inverse correlation between HPV status (determined by sequencing) and TP53 mutation, as shown previously (p = 0.006) (32). These data underscore the potential utility of massively parallel sequencing to detect both human and non-human etiologic agents in tumor specimens.
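Associations between two binary calls per tumor, such as HPV status by sequencing versus PCR or HPV status versus TP53 mutation, are commonly assessed on a 2×2 contingency table. The text does not state which test produced the reported p-values, so the sketch below uses a two-sided Fisher's exact test as one plausible choice. The function name and the example counts are hypothetical and do not reproduce the study's tables.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(
        p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical counts (placeholders, not study data):
# rows = HPV-positive / HPV-negative, columns = TP53 mutant / TP53 wild type.
p_value = fisher_exact_two_sided(1, 9, 30, 20)
print(f"two-sided p = {p_value:.4f}")
```

A small p-value from a table in which the off-diagonal cells dominate would correspond to an inverse association of the kind described for HPV status and TP53 mutation.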
Given that NOTCH pathway inhibitors have entered clinical trials, the discovery of loss-of-function NOTCH1 mutations in HNSCC may have important therapeutic implications. A recent clinical trial of a gamma secretase inhibitor (which inhibits NOTCH) was halted in part due to an increased frequency of skin cancers in the treatment arm (33). This clinical observation is consistent with mouse models in which cutaneous knockout of Notch1 promotes skin tumor formation (24). Our results suggest that patients taking gamma secretase inhibitors may require monitoring for the development of both cutaneous and head/neck squamous malignancies.
Despite the anatomical distinctions that dominate current clinical management of HNSCC, our results point to several unifying features at the molecular level. For example, TP53 inactivation, either through somatic mutation or HPV infection, appears nearly universal in this malignancy. The present study suggests that disruption of the squamous differentiation program may represent an additional overarching feature that occurs by numerous genetic mechanisms across tumors from multiple anatomic sites. Thus, HNSCC pathogenesis may involve a maturation arrest or a lineage dependency similar to that seen in other cancer types (34). However, HNSCC appears to be unusual in that the mutational etiology is diverse, in contrast to leukemia and prostate cancer, where developmental pathologies appear to be caused by lesions in only a few target genes. Rational therapeutic avenues targeting this block in squamous differentiation may require synthetic lethal approaches to identify specific cellular dependencies arising from NOTCH inactivation, TP63 alteration, or other events that deregulate the program. Finally, our results demonstrate that whole exome sequencing of large numbers of tumor/normal pairs should enable fundamental new insights into tumor biology that are relevant to many human cancers.