Timely intervention for cancer requires knowledge of its earliest genetic aberrations. Sequencing of tumors and their metastases reveals numerous abnormalities occurring late in progression. A means to temporally order aberrations in a single cancer, rather than inferring them from serially acquired samples, would define changes preceding even clinically evident disease. We integrate DNA sequence and copy number information to reconstruct the order of abnormalities as individual tumors evolve for two separate cancer types. We detect vast, unreported expansion of simple mutation sharply demarcated by recombinative loss of the second copy of TP53 in cutaneous squamous cell carcinomas (cSCCs) and serous ovarian adenocarcinomas, in the former surpassing 50 mutations per megabase. In cSCCs, we also report diverse secondary mutations in known and novel oncogenic pathways, illustrating how such expanded mutagenesis directly promotes malignant progression. These results reframe paradigms in which TP53 mutation is required later, to bypass senescence induced by driver oncogenes.
Molecular characterization of human cancers usually profiles a single point in time, yielding a catalogue of genomic and epigenetic abnormalities reflecting years of somatic change. Although recent efforts reveal some mutations associated with metastasis and recurrence(1), timing of events early in tumorigenesis remains difficult. Precursor dysplastic lesions may be sampled and compared against invasive malignancies, but in many cancer types, early lesions are not clinically identifiable, nor is it obvious which lesions will actually progress. Additionally, the increasingly apparent heterogeneity of human cancers suggests such comparisons will require very large sample sizes to reconstruct progression(2).
Primary cSCCs rank among the most common human malignancies, with an annual incidence in Caucasians of more than 150 in 100,000 individuals(3). These tumors arise at anatomic sites and in demographic groups in proportion to sunlight exposure, and acquire a mutational spectrum reflecting significant ultraviolet radiation damage(4). Although many are excised without complication, cSCCs sometimes behave aggressively with recurrence and regional spread, especially in immunosuppressed and repair-deficient genetic backgrounds(3). We hypothesized that the long-term mutational stress on these tumors might offer unique insight into the progressive events determining a cancer’s individuality.
Exome-level sequencing of eight primary cSCCs and matched normal tissue revealed a very large mutation burden of approximately 1,300 somatic single-nucleotide variants per cSCC exome (1 per ~30,000 base pairs of coding sequence; Supplementary Figure 1, Supplementary Methods and Appendix), making cSCCs the most highly mutated human malignancy. Of mutations assessed by capillary sequencing across our series, 75/75 (100%) confirmed the originally detected mutations, including four instances of dinucleotide substitution. C > T transition base substitutions at dipyrimidine sites were by far the most common change (>85%), consistent with UV damage. Past analysis of selected TP53 exons suggests that the gene is mutated in 50–90% of cSCCs(5). Our study identified TP53 mutations in 7/8 skin cancers, all coinciding with previously reported changes in the COSMIC database(6). Known changes were also found in CDKN2A, encoding the p16/p14ARF bifunctional tumor suppressor, and in the HRAS small GTPase, along with instances of numerous COSMIC mutations not previously described in cSCCs (Table 1; full list of base substitutions in Supplementary Table 1). We also detected 42 discrete chromosomal abnormalities, about five per sample (Supplementary Table 2).
The ability to temporally order successive molecular changes within an individual tumor, beginning in the initial stages of tumorigenesis, would allow discrimination between mutations forming precancerous lesions and those producing invasive carcinomas. The high prevalence of both simple mutations and copy number abnormalities in cSCCs and ovarian cancers enabled us to reconstruct the evolutionary order of some somatic changes based on the following idea: if a mutation precedes a regional duplication, its copy number is doubled, whereas mutations following a duplication event appear in haploid copy number(7). Therefore, 1) simple mutations preceding a chromosomal duplication event show discretely higher copy number compared to those occurring after duplication (Figure 1) and 2) the ratio of heterozygous to homozygous mutations, ρ, in a region of CN-LOH directly measures the age of the duplication (in evolutionary time) (Figure 2, Supplementary Figure 2).
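The timing principle above can be illustrated with a minimal sketch. It is not the authors' pipeline (their statistical method is in the Supplementary Methods and Appendix). It assumes a pure, clonal tumor sample and a two-copy CN-LOH region, so a mutation that was duplicated sits on both copies (allele fraction near 1.0) and a later mutation sits on one copy (allele fraction near 0.5). The `Mutation` class, the `classify_timing` function, the 0.75 cutoff, and all read counts are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Mutation:
    """A somatic mutation with its sequencing read support (illustrative only)."""
    gene: str
    alt_reads: int    # independent reads carrying the mutation
    total_reads: int  # all independent reads covering the site

    @property
    def allele_fraction(self) -> float:
        return self.alt_reads / self.total_reads


def classify_timing(mutation: Mutation, cutoff: float = 0.75) -> str:
    """Label a mutation in a two-copy CN-LOH region relative to the duplication.

    Assumption: pure tumor, so duplicated (earlier) mutations approach an
    allele fraction of 1.0 and later mutations approach 0.5. The cutoff is an
    arbitrary midpoint, not a value taken from the paper.
    """
    if mutation.allele_fraction >= cutoff:
        return "pre-duplication"
    return "post-duplication"


# Hypothetical read counts for two mutations in the same CN-LOH region.
early = Mutation(gene="TP53", alt_reads=95, total_reads=100)
late = Mutation(gene="GENE_X", alt_reads=48, total_reads=100)
print(early.gene, classify_timing(early))
print(late.gene, classify_timing(late))
```

In real samples, contamination by normal cells lowers both expected fractions, so a fixed cutoff like this one would need to be adjusted for tumor purity.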
We first utilized this principle to investigate the specific temporal order of mutations in areas of copy-neutral loss-of-heterozygosity (CN-LOH), in which a regional chromosomal duplication replaces the matching portion of the paired chromosome(8; 9). Of the cSCCs in our series, 4/8 showed CN-LOH at chromosome 17p, all harboring TP53 mutations reported in COSMIC. Remarkably, all four TP53 mutations were present at high allelic abundance in the CN-LOH region compared to other somatic mutations, indicating that TP53 mutations occurred and were duplicated before other mutations arose (Figure 1B,C). In aggregate, 59/63 mutations in 17p appear after loss of the second TP53 wild-type allele, 15-fold greater than those preceding loss. CN-LOH events at 17p represent 2% of coding sequence and show normalized mutation frequencies reflective of the remainder of the exome (Supplementary Figure 3). While studies establish some p53 mutations as gain-of-function with respect to cancer type(10), or biochemically dominant negative, ours is the first report that the vast majority of simple mutation – tens of thousands genome-wide in the case of cSCCs – appear sharply gated by elimination of the second copy of TP53. We further detect at least partial persistence of active DNA repair, suggesting a profound loss of damage surveillance contributes to the high number of observed mutations (Supplementary Figure 1, Supplementary Methods and Appendix) (11).
Three samples without CN-LOH at 17p demonstrate at least two distinct TP53 mutations, presumably causing biallelic mutation. In the sample in which TP53 mutations were not detected, a regionally duplicated mutation in the ATM kinase domain was observed, suggesting an alternate means of escaping damage surveillance mechanisms during telomere crisis(12).
We sought to validate our observations in an additional cancer type. Recently, full genomic sequence and copy number changes were determined for 10 ovarian serous adenocarcinomas by The Cancer Genome Atlas Project. Ovarian cancers generally show more complex karyotypic abnormalities than cSCCs(13). In the three samples with a clear, informative CN-LOH event at 17p, we again found clear evidence for complete loss of TP53 as the earliest event (Figure 1D). These initial events in ovarian tumorigenesis could not have been determined through sequencing of precursor lesions and invasive cancers(1; 14), as the asymptomatic nature of early disease precludes tissue collection.
Integrative analyses of copy number and exome sequence also reveal information about the temporal order of chromosomal abnormalities within an individual cancer(7). As described above, the ratio of heterozygous to homozygous mutations ρ, in a given region of CN-LOH, provides a direct measure of the relative age of the duplication (Figure 2A). In other words, duplications with higher ρ occur earlier than those with lower ρ. We found that ρ varied widely among regions of CN-LOH (Figure 2B–D, Figure 3) and could statistically distinguish the temporal order of aberrations within a sample (Figure 2). Overall, seven informative duplications co-occurring with 17p CN-LOH all showed a substantially lower relative ρ (Supplementary Table 2) and thus likely occurred after 17p duplication. Therefore loss of the second TP53 allele appears to precede not only a vast expansion of simple mutations, but also the development of chromosomal aberrations. As a general principle, any regional copy gain acquires a heterozygote mutation frequency uniquely reflective of the time of gain. For selected instances in our series, extension of this principle enabled temporal dissection of more complex copy gains (Figure 3), revealing that these alterations also follow complete TP53 loss.
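As a rough sketch of how ρ can rank duplications, the snippet below computes the ratio of heterozygous to homozygous mutation counts for each CN-LOH region and sorts the regions from earliest to latest. The function names `rho` and `order_duplications` are hypothetical, and the statistical test used to distinguish orderings (Figure 2) is not reproduced here. The 17p counts reuse the aggregate figures reported above (59 mutations after the duplication, 4 before, pooled across tumors); the other two regions are invented purely for illustration.

```python
import math


def rho(n_heterozygous: int, n_homozygous: int) -> float:
    """Ratio of heterozygous to homozygous mutations in a CN-LOH region.

    Heterozygous mutations arose after the duplication and homozygous ones
    before it, so a larger ratio points to an earlier duplication.
    """
    if n_homozygous == 0:
        return math.inf  # no pre-duplication mutations observed
    return n_heterozygous / n_homozygous


def order_duplications(regions: dict) -> list:
    """Return region names sorted from earliest (highest rho) to latest."""
    return sorted(regions, key=lambda name: rho(*regions[name]), reverse=True)


# region name -> (heterozygous count, homozygous count)
regions = {
    "region_A": (20, 10),  # hypothetical counts
    "17p": (59, 4),        # aggregate counts quoted in the text
    "region_B": (6, 12),   # hypothetical counts
}
for name in order_duplications(regions):
    print(name, round(rho(*regions[name]), 2))
```

With few mutations per region, the ratio is noisy, which is why a point estimate like this one should be paired with a statistical comparison before asserting an order.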
In cSCCs, we found 486 non-synonymous mutations that were sequenced deeply enough to determine copy number (> 50 independent reads) and that fell at least once in a region of CN-LOH. These included known mutations in CDKN2A, WT1, and HRAS (Table 1), each of which showed multiple instances of either wild-type allele loss or biallelic mutation, as seen for TP53. Interestingly, this pattern of recurrent biallelic inactivation was also detected at high prevalence for the suspected epithelial tumor suppressors NOTCH1 and NOTCH2 and the polycystic kidney disease gene PKHD1 (Supplementary Methods and Appendix). NOTCH1 demonstrates three instances of early truncation, two of which show wild-type loss, one case of multiple mutation, and two other mutations, one of which occurs in a splice site (Table 1). NOTCH2 shows multiple mutations in 4/8 samples, and three of these contain at least one truncating mutation.
We trace the mutational evolution of individual tumors using a novel, sequence-based assessment strategy and in doing so, provide an individualizable, patient-centric complement to more traditional “mutation-by-stage” approaches(2; 14). Our results illuminate key aspects of timing in cancer evolution without requiring large sample series, for which precursor lesions are often inaccessible. TP53 is often mutated in precursor lesions, but paradigms of oncogenesis propose p53 loss as a late requirement, overcoming senescence programs activated by prior activation of driver oncogenes(15; 16) and surviving telomere crisis(17). Furthermore, biallelic TP53 loss occurs frequently, despite evidence that p53 mutants behave dominantly structurally and also functionally with respect to phenotypes such as tumor formation(18; 10).
Our data show that decades of UV damage and inactivation of a single TP53 allele only result in about one hundred mutations in the epithelial exome. This tenacious genetic stability explains the benign behavior of clonal keratinocyte proliferations, harboring heterozygous TP53 mutation, that commonly form in sun-exposed skin(19; 20). Subsequent elimination of the second TP53 allele, often through recombination, sharply demarcates a vast expansion in simple mutation, in cSCCs reaching 50 per megabase (150,000 per genome) and making them the most mutagenized human cancers known. Because DNA repair remains at least partially active, this vast mutation burden might result from the collaborative effects of ongoing DNA damage (from intrinsic and exogenous insults) coupled with disabled DNA damage-induced apoptosis. The reproduction of this phenomenon in ovarian adenocarcinoma suggests that the vast majority of mutations follow second TP53 allele loss, irrespective of mode of DNA damage or tissue of origin.
Classic studies report that precursor lesions and invasive cancers both carry mutated driver oncogenes, but find TP53 inactivation more frequent in invasive disease(21; 22), suggesting p53 inactivation to be a late event. Activation of a key oncogene prior to biallelic TP53 loss in our series is formally possible, but few coding mutations precede 17p duplication and none recur in established oncogenes. While apparently contradictory, these findings could be reconciled by a temporal requirement that TP53 mutation precede driver oncogene mutation in precursor lesions destined to progress to invasive cancer. In this model, precursor lesions that activate oncogenes first (before TP53 inactivation) fail to progress, but would nonetheless be detected in “mutation frequency-by-stage” surveys. Alternatively, different cancer types might demonstrate distinct temporal ordering of key mutations. Application of our approach to sequence data from other cancer types, such as colon adenocarcinoma, should help distinguish these possibilities.
In selected instances we are able to show mutant TP53 duplication occurring before dosage changes in mutant alleles, such as for CDKN2A and WT1 (Figure 3). The consequences of such expanded mutagenesis and chromosomal instability emerge dramatically in the Notch signaling pathway, in which multiple family members develop mutations and wild-type alleles are lost frequently. Constitutive Notch protein activation drives subsets of acute leukemias(23), but a clear tumor suppressor phenotype has also been established in keratinocytes(24), with attenuated expression producing proliferation and invasive morphologies. Further study should clarify the breadth of epithelial cancers harboring these somatic changes, as well as their specific functional effects. Our data also confirm low-prevalence activation of known oncogenes such as Ras in human cSCCs, raising the possibility that numerous mutations in other pathways may serve as the functional oncogene in this setting(25).
Taken together, these insights imply that targeting activated oncogenes (e.g. with small molecule inhibitors) fails to address a fundamental, detectable abnormality in cancer genomes that accelerates evolution toward clinical resistance. Such temporal dissection of tumorigenesis both provides early, assayable diagnostic markers and illuminates the specific biological consequences of these aberrations. We demonstrate the utility of this method for the CN-LOH and copy gains that comprise most chromosomal aberrations. Because many cancer types carry rearrangement of substantial proportions of the genome, especially those spanning key oncogenes, extension of this method should rapidly reveal additional ordered events. The described reconstruction of genomic aberration history can be applied immediately to any cancer for which sequence data and copy number are available.
We obtained eight matched cSCC and normal tissue samples as part of a skin cancer study protocol, with all subjects providing informed consent according to procedures approved by the University of California, San Francisco Committee on Human Research. Diagnosis of cutaneous squamous cell carcinoma was confirmed for all tumors via histological examination of a standard biopsy specimen by a board-certified dermatopathologist. DNA was extracted from tumor and control samples, and allele-specific copy number analysis was performed using Affymetrix Genome-Wide Human SNP Array 6.0 chips. About 40 megabases of coding region were isolated from each sample using oligonucleotide-based hybrid capture and sequenced using the Illumina sequencing-by-synthesis platform (Supplementary Figure 4). Mutation detection was performed as previously described (see Supplementary Methods and Appendix for detail). Eighty mutations were independently validated using Sanger sequencing; 100% confirmed the originally identified somatic change. Sequence for all non-synonymous mutations will be deposited in dbGaP. Patient information and genomic profiling for ovarian serous adenocarcinomas analyzed here have been described previously(26).
Chromosomal regions with aberrant copy number, including copy neutral loss-of-heterozygosity (CN-LOH) and simple copy gains and losses, were identified based on discrete shifts in single nucleotide polymorphism copy number from both SNP array and exome sequencing data. The type of abnormality was further confirmed by assessing raw copy number depth from SNP array data. Mutations were called after alignment of sequence reads to a reference genome (NCBI36). The fraction of chromosomal copies carrying a mutation was estimated as the fraction of all independent sequence reads containing that mutation. (A detailed description of patient consent, methods and reagents used for tissue acquisition, genomic profiling, and statistical analysis of mutational evolution is provided in the Supplementary Methods and Appendix.)
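The read-fraction estimate described above depends on sequencing depth, and a small sketch can make that dependence concrete. The function below returns the mutant read fraction together with a Wilson score interval. The interval is a standard binomial approximation introduced here for illustration, not a method stated in the text, and the function name `allele_fraction_interval` and the example read counts are hypothetical.

```python
import math


def allele_fraction_interval(alt_reads: int, total_reads: int, z: float = 1.96):
    """Estimate the fraction of chromosomal copies carrying a mutation.

    Returns (point_estimate, lower, upper), where the point estimate is the
    fraction of independent reads containing the mutation and the bounds are a
    Wilson score interval (an assumption of this sketch; z=1.96 gives an
    approximate 95% interval).
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    p = alt_reads / total_reads
    denom = 1 + z ** 2 / total_reads
    centre = (p + z ** 2 / (2 * total_reads)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / total_reads + z ** 2 / (4 * total_reads ** 2)
    ) / denom
    return p, max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical sites with the same observed fraction at different depths.
for alt, total in [(30, 60), (300, 600)]:
    estimate, lower, upper = allele_fraction_interval(alt, total)
    print(f"{alt}/{total}: {estimate:.2f} ({lower:.3f} to {upper:.3f})")
```

The interval narrows as depth grows, which gives one intuition for requiring a minimum number of independent reads before assigning a mutation to a discrete copy number state.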
Our approach reveals sequential ordering of oncogenic events in individual cancers, based on chromosomal rearrangements. Identifying the earliest abnormalities in cancer represents a critical step in timely diagnosis and deployment of targeted therapeutics.
We thank Allan Balmain, Boris Bastian, Jeffrey Cheng, Douglas Brash, and Dennis Oh for early support and helpful discussions. Henrik Bengtsson, Pierre Neuvial, Hubert Stoppler, Matthew Akana, Connie Ha, Lauren Lee, Annie Poon, and Eric Dybbro provided technical assistance. W.L. is supported by a Dermatology Foundation Psoriasis Career Development Award, J.C. by the University of California Cancer Coordinating Committee, S.T.A. by NIH/NCRR/OD UCSF-CTSI Grant Number KL2 RR024130, a Canary Foundation/American Cancer Society Postdoctoral Fellowship for the Early Detection of Cancer, and a Dermatology Foundation Career Development Award, R.J.C. by a Dermatology Foundation Career Development Award and as a Samsung Biotechnology Scholar-in-Residence.
This research was supported under Joe W Gray (by the Director, Office of Science, Office of Biological & Environmental Research, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, by the National Institutes of Health, National Cancer Institute grants P50 CA 58207, the U54 CA 112970, the NHGRI U24 CA 126551, by the Department of the Army, award: W81XWH-07-1-0663 [The U.S. Army Medical Research Acquisition Activity, 820 Chandler Street, Fort Detrick, MD 21702-5014 is the awarding and administering acquisition office], and by the Stand Up To Cancer-American Association for Cancer Research Dream Team Translational Cancer Research Grant SU2C-AACR-DT0409. The content of this information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.); Eric A. Collisson (NIH/NCI K08 CA137153); Paul T. Spellman (NIH/NCI U24 CA1437991); Raymond J. Cho (by an unrestricted gift grant from the Samsung Advanced Institute of Technology); Theodoro M. Mauro (by NIH grants AR051930 and R01AG028492, and the Medical Research Service, Department of Veterans Affairs); and Allan Balmain, (by NIH/NIAMS Program Project Grant 5-P01-AR050440-05).
Financial disclosure: None reported