Bone and soft tissue sarcomas are a group of histologically heterogeneous and relatively uncommon tumors. To explore their genetic origins, we sequenced the exomes of 13 osteosarcomas, eight myxoid liposarcomas (MLPS), and seven synovial sarcomas (SYN). These tumors had few genetic alterations (median of 10.8). Nevertheless, clear examples of driver gene mutations were observed, including canonical mutations in TP53, PIK3CA, SETD2, and AKT1, and a subclonal mutation in FBXW7. Of particular interest were mutations in H3F3A, encoding the variant histone H3.3. Mutations in this gene have previously been observed only in gliomas. Loss of heterozygosity of exomic regions was extensive in osteosarcomas but rare in SYN and MLPS. These results provide intriguing nucleotide-level information on these relatively uncommon neoplasms and highlight pathways that help explain their pathogenesis.
Though the tissues in which sarcomas arise comprise 75% of human body weight, sarcomas comprise only ~1% of adult and ~15% of pediatric malignancies. It is estimated that 2,890 bone and 11,280 soft tissue sarcomas were diagnosed in the US in 2012, with estimated deaths of 1,410 and 3,900, respectively (American Cancer Society, 2012). Three of the most important and prevalent sarcomas are myxoid liposarcomas (MLPSs), synovial sarcomas (SYNs), and osteosarcomas (Fig. 1).
MLPSs account for more than a third of all liposarcomas and ~10% of all adult soft tissue sarcomas (Conyers et al., 2011). The thigh is the most common site. MLPSs have no gender preference and patients are relatively young, with a peak age ranging between 30 and 50 years. The standard treatment of surgical excision and adjuvant radiation is often, but not always, sufficient to achieve local control; ~35% of the patients develop metastases, usually precluding cure. The 5-year survival ranges between 20% and 70%, depending on whether or not a round cell histology is present (Sandberg, 2004). At the genetic level, the most characteristic change is a t(12;16)(q13;p11) chromosomal translocation that results in the fusion of the FUS and DDIT3 genes. This fusion leads to the activation of the transcription factor encoded by DDIT3 (also known as [a.k.a.] CHOP or GADD153), and is found in more than 95% of MLPSs (Goransson et al., 2008; Conyers et al., 2011). Gain of all or part of chromosome 8 occurs in 20% of MLPSs (Ohguri et al., 2006; Nishio et al., 2011). Recently, promoter mutations in TERT were identified in 80% of MLPSs (Killela et al., 2013).
SYNs are particularly aggressive soft tissue sarcomas that usually arise in the extremities, but can occur in most organs and tissues, including the heart and brain. They predominantly affect young people and confer a 50% mortality rate (Bergh et al., 1999; Malay Haldar, 2008). SYNs comprise 5–10% of all soft tissue sarcomas (Malay Haldar, 2008; Suurmeijer et al., 2013) and 15–20% of those in adolescents and young adults (Suurmeijer et al., 2013). The peak is in the third decade of life, with ~30% occurring before the age of 20. The vast majority of SYNs harbor a reciprocal translocation t(X;18)(p11;q11) resulting in fusion of the SS18 (a.k.a. SYT) and SSX genes (Clark et al., 1994). A diagnosis of SYN is usually made on the basis of histology and immunolabeling, and confirmed by the presence of the pathognomonic t(X;18) translocation (Coindre et al., 2003). The 5-year survival rate for this disease ranges from 36% to 76%, with tumor location, size, and grade as well as age at diagnosis having prognostic implications (Ferrari et al., 2004; Spurrell et al., 2005; Malay Haldar, 2008).
Though osteosarcomas arise in bone, they are clearly related to the soft tissue sarcomas in that all are mesenchymal in origin (Ottaviani and Jaffe, 2010). Osteosarcomas most commonly occur in the long bones, with 40% arising in the femur, 20% in the tibia, and 10% in the humerus. Less common locations include the skull or jaw, and the pelvis (Ottaviani and Jaffe, 2010). Osteosarcomas are one of the most common solid tumors of young people, occurring annually in ~900 individuals in the US, of whom 400 are less than 15 years of age. They are usually aggressive, and approximately one third of the young patients will die from their disease within 5 years of their diagnosis. A second peak in incidence occurs in the elderly, usually associated with underlying bone pathology such as Paget’s disease, or prior irradiation. Little is known at the genetic level about the pathogenesis of osteosarcomas. No specific cytogenetic changes have been identified, but the karyotypes are typically highly complex. Except for a small number of mutations in commonly mutated tumor suppressor genes such as TP53 and RB1, the genes driving the tumorigenesis of osteosarcomas have yet to be identified (Hansen et al., 1985; Berman et al., 2008; Martin et al., 2012).
We have here performed analyses of the exomic sequences of these three sarcoma types to advance our understanding of their genetic underpinnings.
Twenty-five fresh-frozen surgically resected sarcomas and three osteosarcoma cell lines derived from culturing primary tumors were obtained from institutional tumor banks and matched blood was obtained from patients under an institutional review board approved protocol at the Johns Hopkins Hospital (Baltimore, MD) and Lund University Hospital. Neoplastic cell content was analyzed in frozen sections, and the tumors were macrodissected to remove residual normal tissue and enhance neoplastic cellularity to >70%. Patient characteristics are described in Table 1. All of the osteosarcomas were of high grade, except for OST201, OST202, OST203, and OST205 for which this information was not available.
DNA was extracted from tumor tissues and blood using an AllPrep kit (Qiagen) following the manufacturer’s protocol. The amount of amplifiable DNA was quantified by a real time quantitative PCR based assay that amplifies human repeated sequences, using the primers and conditions previously described (Rago et al., 2007).
Genomic DNA libraries were prepared following Illumina’s (Illumina, San Diego, CA) suggested protocol with the following modifications. (1) 3 μg of genomic DNA from tumor or normal cells in 100 μl of TE was fragmented in a Covaris sonicator (Covaris, Woburn, MA) to a size of 100–500 bp. DNA was purified with a PCR purification kit (Cat # 28104, Qiagen) and eluted in 35 μl of elution buffer included in the kit. (2) Purified, fragmented DNA was mixed with 40 μl of H2O, 10 μl of 10 × T4 ligase buffer with 10 mM ATP, 4 μl of 10 mM dNTP, 5 μl of T4 DNA polymerase, 1 μl of Klenow Polymerase, and 5 μl of T4 polynucleotide Kinase. All reagents used for this step and those described below were obtained from New England Biolabs (NEB, Ipswich, MA) unless otherwise specified. The 100 μl end-repair mixture was incubated at 20°C for 30 min, purified by a PCR purification kit (Cat # 28104, Qiagen) and eluted with 32 μl of elution buffer (EB). (3) To A-tail the DNA, all 32 μl of end-repaired DNA was mixed with 5 μl of 10 × Buffer (NEB buffer 2), 10 μl of 1 mM dATP and 3 μl of Klenow (exo-). The 50 μl mixture was incubated at 37°C for 30 min before DNA was purified with a MinElute PCR purification kit (Cat # 28004, Qiagen). Purified DNA was eluted with 12.5 μl of 70°C EB to obtain 10 μl of A-tailed DNA. (4) For adaptor ligation, 10 μl of A-tailed DNA was mixed with 10 μl of PE-adaptor (Illumina), 25 μl of 2x Rapid ligase buffer and 5 μl of Rapid Ligase. The ligation mixture was incubated at room temperature (RT) or 20°C for 15 min. (5) To purify adaptor-ligated DNA, 50 μl of ligation mixture from step (4) was mixed with 200 μl of NT buffer from a NucleoSpin Extract II kit (cat# 636972, Clontech, Mountain View, CA) and loaded onto a NucleoSpin column. The column was centrifuged at 14,000 g in a desktop centrifuge for 1 min, washed once with 600 μl of wash buffer (NT3 from Clontech), and centrifuged again for 2 min to dry completely.
DNA was eluted in 50 μl elution buffer included in the kit. (6) To obtain an amplified library, ten PCRs of 25 μl each were set up, each including 13.25 μl of H2O, 5 μl of 5 × Phusion HF buffer, 0.5 μl of a dNTP mix containing 10 mM of each dNTP, 0.5 μl of Illumina PE primer #1, 0.5 μl of Illumina PE primer #2, 0.25 μl of Hotstart Phusion polymerase, and 5 μl of the DNA from step (5). The PCR program used was: 98°C 1 minute; 6 cycles of 98°C for 20 seconds, 65°C for 30 seconds, 72°C for 30 seconds; and 72°C for 5 min. To purify the PCR product, 250 μl PCR mixture (from the 10 PCR reactions) was mixed with 500 μl NT buffer from a NucleoSpin Extract II kit and purified as described in step (5). Library DNA was eluted with 70°C-warm elution buffer and the DNA concentration was estimated by absorption at 260 nm.
The human exome was captured following a protocol from Agilent’s SureSelect Paired-End Version 2.0 Human Exome Kit, or using custom kits with probes designed to capture the desired genomic regions (Agilent, Santa Clara, CA) with the following modifications. (1) A hybridization mixture was prepared containing 25 μl of SureSelect Hyb # 1, 1 μl of SureSelect Hyb # 2, 10 μl of SureSelect Hyb # 3, and 13 μl of SureSelect Hyb # 4. (2) 3.4 μl (0.5 μg) of the PE-library DNA described above, 2.5 μl of SureSelect Block #1, 2.5 μl of SureSelect Block #2, and 0.6 μl of Block #3 were loaded into one well in a 384-well Diamond PCR plate (cat# AB-1111, Thermo-Scientific, Lafayette, CO), sealed with microAmp clear adhesive film (cat# 4306311; ABI, Carlsbad, CA) and placed in a GeneAmp PCR System 9700 thermocycler (Life Sciences, Carlsbad CA) for 5 min at 95°C, then held at 65°C (with the heated lid on). (3) 25–30 μl of hybridization buffer from step (1) was heated for at least 5 min at 65°C in another sealed plate with heated lid on. (4) 5 μl of SureSelect Oligo Capture Library, 1 μl of nuclease-free water, and 1 μl of diluted RNase Block (prepared by diluting RNase Block 1:1 with nuclease-free water) were mixed and heated at 65°C for 2 min in another sealed 384-well plate. (5) While keeping all reactions at 65°C, 13 μl of Hybridization Buffer from Step (3) was added to the 7 μl of SureSelect Capture Library Mix from Step (4), followed by the entire contents (9 μl) of the library from Step (2). The mixture was slowly pipetted up and down 8 to 10 times. (6) The 384-well plate was sealed tightly and the hybridization mixture was incubated for 24 hours at 65°C with a heated lid.
After hybridization, five steps were performed to recover and amplify captured DNA library: (1) Magnetic beads for recovering captured DNA: 50 μl of Dynal MyOne Streptavidin C1 magnetic beads (Cat # 650.02, Invitrogen Dynal, AS Oslo, Norway) was placed in a 1.5-ml microfuge tube and vigorously resuspended on a vortex mixer. Beads were washed three times by adding 200 μl of SureSelect Binding buffer, mixing on a vortex for 5 seconds and then removing the supernatant after placing the tubes in a Dynal magnetic separator. After the third wash, beads were resuspended in 200 μl of SureSelect Binding buffer. (2) To bind captured DNA, the entire hybridization mixture described above (29 μl) was transferred directly from the thermocycler to the bead solution and mixed gently; the hybridization mix / bead solution was incubated in an Eppendorf thermomixer at 850 rpm for 30 min at room temperature. (3) To wash the beads, the supernatant was removed from beads after applying a Dynal magnetic separator and the beads were resuspended in 500 μl SureSelect Wash Buffer #1 by mixing on vortex mixer for 5 seconds and incubated for 15 min at room temperature. Wash Buffer#1 was then removed from beads after magnetic separation. The beads were further washed three times, each with 500 μl pre-warmed SureSelect Wash Buffer #2 after incubation at 65°C for 10 min. After the final wash, SureSelect Wash Buffer #2 was completely removed. (4) To elute captured DNA, the beads were suspended in 50 μl SureSelect Elution Buffer, vortex-mixed and incubated for 10 min at room temperature. The supernatant was removed after magnetic separation, collected in a new 1.5-ml microcentrifuge tube, and mixed with 50 μl of SureSelect Neutralization Buffer. DNA was purified with a Qiagen MinElute column and eluted in 17 μl of 70°C EB to obtain 15 μl of captured DNA library. 
(5) The captured DNA library was amplified in the following way: 15 PCR reactions each containing 9.5 μl of H2O, 3 μl of 5 × Phusion HF buffer, 0.3 μl of 10 mM dNTP, 0.75 μl of DMSO, 0.15 μl of Illumina PE primer #1, 0.15 μl of Illumina PE primer #2, 0.15 μl of Hotstart Phusion polymerase, and 1 μl of captured exome library were set up. The PCR program used was: 98°C for 30 seconds; 14 cycles of 98°C for 10 seconds, 65°C for 30 seconds, 72°C for 30 seconds; and 72°C for 5 min. To purify PCR products, 225 μl of PCR mixture (from 15 PCR reactions) was mixed with 450 μl of NT buffer from NucleoSpin Extract II kit and purified as described above. The final library DNA was eluted with 30 μl of 70°C elution buffer and DNA concentration was estimated by OD260 measurement.
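The volumes in step (5) can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the published protocol; the component labels and function names are illustrative, and only the volumes and the number of reactions come from the text above.

```python
from decimal import Decimal

# Per-reaction volumes in microliters, as listed in step (5).
PER_REACTION_UL = {
    "H2O": Decimal("9.5"),
    "5x Phusion HF buffer": Decimal("3"),
    "10 mM dNTP": Decimal("0.3"),
    "DMSO": Decimal("0.75"),
    "Illumina PE primer #1": Decimal("0.15"),
    "Illumina PE primer #2": Decimal("0.15"),
    "Hotstart Phusion polymerase": Decimal("0.15"),
    "captured exome library": Decimal("1"),
}
N_REACTIONS = 15


def reaction_volume(components):
    """Total volume of a single reaction."""
    return sum(components.values(), Decimal("0"))


def pooled_volumes(components, n_reactions):
    """Volume of each component summed over all reactions."""
    return {name: vol * n_reactions for name, vol in components.items()}


per_reaction = reaction_volume(PER_REACTION_UL)
pooled = pooled_volumes(PER_REACTION_UL, N_REACTIONS)
pooled_total = sum(pooled.values(), Decimal("0"))

# The text mixes 225 ul of pooled PCR product with 450 ul of NT buffer,
# i.e. two volumes of NT buffer per volume of PCR mixture.
nt_buffer = 2 * pooled_total

print(f"per reaction: {per_reaction} ul")
print(f"pooled:       {pooled_total} ul")
print(f"NT buffer:    {nt_buffer} ul")
```

Decimal arithmetic is used here only so that the sums compare exactly; the totals agree with the 225 μl of pooled PCR mixture and 450 μl of NT buffer stated in the text.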
Captured DNA libraries were sequenced on Illumina GAIIx or HiSeq 2000 instruments. Sequencing reads were analyzed and aligned to human genome hg18 with the Eland algorithm in CASAVA 1.6 software (Illumina). A mismatched base was identified as a mutation only when (i) it was identified by more than three distinct tags; (ii) the number of distinct tags containing a particular mismatched base was at least 15% of the total distinct tags; and (iii) it was not present in >0.5% of the tags in the matched normal sample. SNP search databases included http://www.ncbi.nlm.nih.gov/projects/SNP/ and http://browser.1000genomes.org/index.html.
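The three calling criteria can be expressed as a simple per-site filter. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the `SiteCounts` record and `passes_filters` function are hypothetical names, the inputs are assumed to be counts of distinct tags at one position, and rejecting sites with no coverage in the matched normal is a conservative choice made here rather than something the text specifies.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Distinct-tag counts at one genomic position (hypothetical record)."""
    tumor_mutant_tags: int
    tumor_total_tags: int
    normal_mutant_tags: int
    normal_total_tags: int


def passes_filters(site, min_tags=3, min_tumor_fraction=0.15,
                   max_normal_fraction=0.005):
    """Apply criteria (i)-(iii) from the text to a single site."""
    # Assumption: sites without coverage in either sample are rejected.
    if site.tumor_total_tags == 0 or site.normal_total_tags == 0:
        return False
    # (i) seen in more than three distinct tumor tags
    if site.tumor_mutant_tags <= min_tags:
        return False
    # (ii) at least 15% of the distinct tumor tags carry the mismatch
    if site.tumor_mutant_tags / site.tumor_total_tags < min_tumor_fraction:
        return False
    # (iii) not present in more than 0.5% of the matched normal tags
    if site.normal_mutant_tags / site.normal_total_tags > max_normal_fraction:
        return False
    return True


candidates = {
    "well supported": SiteCounts(20, 100, 0, 80),
    "too few tags": SiteCounts(3, 10, 0, 80),
    "low fraction": SiteCounts(10, 100, 0, 80),
    "seen in normal": SiteCounts(20, 100, 1, 100),
}
for label, site in candidates.items():
    print(label, passes_filters(site))
```

The example counts are invented to exercise each criterion in turn.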
Informative SNPs, present in dbSNP and UCSC databases, were identified through sequencing of the exomes of the matched normal DNAs. Loss of heterozygosity (LOH) was determined based on the ratio of informative alleles in the corresponding tumor. SNPs with a ratio <0.3 between the two alleles were considered to exhibit LOH. A region was considered to show LOH if it contained eight or more uninterrupted SNPs in a row with a ratio less than 0.3.
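The LOH rule can be sketched as a scan for runs of consecutive informative SNPs whose allele ratio falls below the cutoff. This is an illustrative sketch only. It assumes the ratio is the less abundant allele count divided by the more abundant one in the tumor, that the SNPs of one chromosome are supplied in positional order, and that a SNP with no tumor coverage interrupts a run; none of these details is stated in the text, and the function names are hypothetical.

```python
def allele_ratio(count_a, count_b):
    """Minor/major allele ratio in the tumor; None if the SNP is uncovered."""
    major = max(count_a, count_b)
    if major == 0:
        return None
    return min(count_a, count_b) / major


def loh_regions(snps, ratio_cutoff=0.3, min_run=8):
    """Return (start, end) positions of runs of >= min_run LOH SNPs.

    snps: iterable of (position, count_a, count_b) for SNPs that are
    heterozygous in the matched normal, sorted by position.
    """
    regions = []
    run = []
    for position, count_a, count_b in snps:
        ratio = allele_ratio(count_a, count_b)
        if ratio is not None and ratio < ratio_cutoff:
            run.append(position)
            continue
        # Run interrupted: keep it only if it is long enough.
        if len(run) >= min_run:
            regions.append((run[0], run[-1]))
        run = []
    if len(run) >= min_run:
        regions.append((run[0], run[-1]))
    return regions


# Invented example: eight skewed SNPs, one balanced SNP, then three skewed SNPs.
example = [(pos, 90, 10) for pos in range(100, 900, 100)]
example.append((900, 50, 50))
example += [(pos, 95, 5) for pos in range(1000, 1300, 100)]
print(loh_regions(example))
```

Only the first run reaches the eight-SNP minimum, so the trailing three skewed SNPs are not reported as a region.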
To detect somatic mutations in the three tumor types, we analyzed frozen tumor tissue from eight patients with MLPSs, seven with SYNs, and 13 with osteosarcomas. Clinical characteristics of the patients and their sarcomas are provided in Table 1. In each case, the tumors were carefully dissected, so that only regions containing >70% neoplastic cells remained. Following purification of the DNA from these tumors, paired-end libraries were generated using standard Illumina procedures (see MATERIALS AND METHODS). For each case, another library was generated from the DNA of peripheral blood cells. The 56 libraries (25 from dissected tumor blocks, three from tumor cell lines, and 28 from matched normal cells) were then captured with a SureSelect 2.0 exome kit that enriched the libraries for protein-coding sequences from ~21,000 genes. The captured libraries were then sequenced on Illumina GAIIx or HiSeq 2000 instruments. The average distinct coverage of each base in the targeted region varied from 75 to 109 fold, and a minimum of 91.9% of targeted bases were represented by at least 10 reads (Table 2). Using the stringent mutation-calling criteria described in MATERIALS AND METHODS, we identified 576 high-confidence somatic mutations in 538 genes. To validate these mutations, we evaluated each of them by Sanger sequencing and were able to confirm 345 of them (60%). Of the 231 unconfirmed mutations, 64 (11% of all mutations) could not be amplified by PCR because of an unusually high guanine–cytosine content, difficulty in the design of unique primers, or other unknown factors preventing specific amplification and sequencing of the locus; the remaining 167 mutations (29% of all mutations) were not present at levels detectable by Sanger sequencing. The 345 confirmed mutations are listed in Supporting Information Table 1.
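The validation figures in this paragraph are internally consistent, as the short check below illustrates. The variable names are illustrative; all counts are taken from the text.

```python
# Counts reported in the text.
called = 576           # high-confidence somatic mutations
confirmed = 345        # confirmed by Sanger sequencing
not_amplifiable = 64   # could not be amplified by PCR

unconfirmed = called - confirmed
below_sanger_detection = unconfirmed - not_amplifiable

for label, count in [
    ("confirmed", confirmed),
    ("not amplifiable", not_amplifiable),
    ("below Sanger detection", below_sanger_detection),
]:
    print(f"{label}: {count} ({count / called:.0%} of all mutations)")

# Library count: dissected tumors + cell lines + matched normals.
libraries = 25 + 3 + 28
print("libraries:", libraries)
```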
A median of 10.8 (range 3–15) somatic mutations per tumor were confirmed in the MLPSs, 8.1 (range 3–15) in the SYNs, and 15.5 (range 3–38) in the osteosarcomas. In aggregate, there were 343 single base pair (bp) substitutions, one 2-bp substitution, 17 nonsense mutations, seven indels producing frameshifts, and five mutations at positions −1, −2, +1, or +2 relative to the adjacent exon. The most common single base pair substitutions were the C:G > T:A (21.4%) and G:C > A:T (20.0%) transitions and the C:G > A:T (11.3%) transversion (Table 3).
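Substitution classes of this kind can be assigned mechanically: a change between two purines (A, G) or between two pyrimidines (C, T) is a transition, and any purine-to-pyrimidine change or the reverse is a transversion. The sketch below is illustrative; the `classify` helper and its parsing of the "C:G > T:A" notation are assumptions made here, not part of the study's analysis.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(change):
    """Classify a base-pair change written like 'C:G > T:A'.

    Only the first base of each pair is inspected, since the second
    base is its complement.
    """
    ref_pair, alt_pair = (part.strip() for part in change.split(">"))
    ref, alt = ref_pair[0].upper(), alt_pair[0].upper()
    if ref == alt:
        raise ValueError(f"not a substitution: {change!r}")
    bases = {ref, alt}
    if bases <= PURINES or bases <= PYRIMIDINES:
        return "transition"
    return "transversion"


for change in ["C:G > T:A", "G:C > A:T", "C:G > A:T"]:
    print(change, "->", classify(change))
```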
A number of mutations were identified in unequivocal driver genes in these malignancies. The most frequently mutated gene was TP53, mutated in three osteosarcomas and one SYN. Both sarcoma types have previously been shown to harbor mutations in TP53 (Schneider-Stock et al., 1999; Oda et al., 2000; Overholtzer et al., 2003; Wunder et al., 2005). Three of the four mutations were missense (C176Y, V216M, A276D), all located at commonly mutated positions in the DNA-binding domain, and the fourth was a nonsense mutation (R306X) resulting in the loss of the domain required for p53-p53 interactions (Brown et al., 2009). PIK3CA, a well-studied oncogene, was mutated in two out of the eight MLPSs, and it has been previously shown to be mutated in 18% of myxoid/round-cell liposarcomas (Barretina et al., 2010). One of the mutations (V344G) was in the helical domain, whereas the other (M1043I) was in the catalytic domain, both known hotspot positions. The product of the PIK3CA gene (PI3Kα) is known to exert its effects through AKT1, and AKT1 was found to be mutated in a third MLPS. The AKT1 mutation (E17K) was unequivocally of functional importance, as it is the canonical mutation observed in breast and colorectal cancers and has been shown to constitutively activate the kinase (Carpten et al., 2007).
The most unexpected mutation was in H3F3A, a gene encoding the histone variant H3.3, in an osteosarcoma. The mutation (G34W) was at codon 34, one of the two positions that are characteristically mutated in childhood gliomas (Schwartzentruber et al., 2012). To our knowledge, a childhood acute lymphoblastic leukemia in a study that included 1,003 cases of acute leukemias and non-Hodgkin lymphomas (Je et al., 2013) is the only neoplasm other than gliomas that has ever been found to have a mutation at either of the two critical residues (codons 27 and 34). The mutation at codon 34 presumably interferes with trimethylation at the nearby lysine (codon 36), affecting genome wide epigenetic landscapes. Another osteosarcoma had a mutation (S65L) in an evolutionarily conserved codon in the linker histone H1FX. Mutations in this gene have not been associated before with cancer, and it is not clear if indeed this mutation is a driver. In the sarcoma set we assessed, we identified one additional mutation of a chromatin-modifying gene that was unequivocally important: a truncating mutation (Q219X) of SETD2 in a SYN. SETD2 encodes a histone methyltransferase that methylates the lysine at codon 36 of histone H3. This methylation has been shown to be critical for epigenetic transcriptional activation in a variety of eukaryotic cell types.
In addition to the driver mutations described above, we identified single mutations in CBL, CDH1, EGFR, and MET among the sarcomas (Supporting Information Table 1). Though each of these genes can drive tumorigenesis when altered in specific ways, none of the mutations we identified was of the type, or at the positions, known to be functionally important for driving cancer, and we considered these mutations to be passengers.
Other potentially interesting mutations occurred in TIE1 (two osteosarcomas: T398S and A646G). TIE1 encodes a tyrosine kinase that regulates neo-angiogenesis and blood vessel stability (Jones et al., 2001). No previous mutations of TIE1 are listed in the COSMIC database, and these mutations were of interest only because two different sarcomas contained them. A third osteosarcoma harbored a mutation (E990G) in KDR, another kinase that regulates angiogenesis. These mutations were too infrequent to establish statistical significance, but it is possible that they have some role in establishing the vasculature in osteosarcomas.
There were only two other genes mutated in more than one sarcoma: MLANA and NCAN (Supporting Information Table 1). The products of these genes are known to play a role in melanocytes and the nervous system, respectively, and based on these functions the mutations are most likely passengers.
A significant fraction of the mutations identified by massively parallel sequencing were present at low levels and they were not detectable by Sanger sequencing. Usually most of these mutations are artifacts, or passenger mutations present in a fraction of the neoplastic cells in a tumor. However, in one MLPS, there was a known driver mutation in FBXW7 (R479Q) present at a low level (17.1%). To confirm this mutation, we utilized a very sensitive method for detecting rare mutations called SafeSeqS (Kinde et al., 2011). Indeed the mutation was present at levels almost identical to those determined by massively parallel sequencing (15.9%), indicating the presence of a neoplastic subclone within this tumor.
Utilizing informative SNPs present in the exomic sequences we looked for LOH in the tumor DNA. The patterns of LOH were vastly different among the different types of sarcoma. Each MLPS had from 0 to 2 chromosomes with LOH. Similarly, each SYN had 0–1 chromosome with a region of LOH, except for one SYN that had four chromosomes with LOH regions. In contrast, 10 of 13 osteosarcomas had LOH in most of their chromosomes (Fig. 2).
The low number of mutations, the presence of subclonal driver mutations and the low LOH frequency prompted us to determine the level of the translocation in two MLPSs as a measure of the neoplastic cell content. Exome sequencing does not provide information regarding the breakpoint of these translocations as they are located within introns. To identify the breakpoints, paired-end libraries were generated using standard Illumina protocols. The libraries were captured with customized baits targeting the FUS and DDIT3 genes and sequenced on Illumina GAIIx instruments. This allowed the precise mapping of the breakpoints of the translocations. The breakpoints in one sample were determined to be in FUS intron 8 and DDIT3 intron 1, whereas in the other sample they were in FUS intron 5 and DDIT3 intron 1. In both cases, the number of paired sequence tags that spanned the breakpoint compared to the number of paired sequence tags that were wild type for the locus indicated that the neoplastic cell content was > 90%.
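The text does not give the formula used to convert tag counts into neoplastic cell content. One simple model, assumed here purely for illustration, is that every neoplastic cell carries one rearranged and one intact copy of the locus, every normal cell carries two intact copies, and both alleles are captured and sequenced with equal efficiency. Under those assumptions the fraction of breakpoint-spanning tag pairs is half the neoplastic cell fraction. The function name and the example counts below are hypothetical.

```python
def neoplastic_fraction(fusion_pairs, wildtype_pairs):
    """Estimate neoplastic cell content from tag-pair counts at a breakpoint.

    Assumed model (not stated in the text): neoplastic cells have one
    rearranged and one wild-type allele, normal cells have two wild-type
    alleles, and both alleles are sampled equally.
    """
    total = fusion_pairs + wildtype_pairs
    if total == 0:
        raise ValueError("no tag pairs cover the breakpoint")
    fusion_allele_fraction = fusion_pairs / total
    # Each neoplastic cell contributes the fusion on one of two alleles.
    return min(1.0, 2 * fusion_allele_fraction)


# Hypothetical counts, chosen only to illustrate the calculation.
estimate = neoplastic_fraction(fusion_pairs=46, wildtype_pairs=54)
print(f"estimated neoplastic cell content: {estimate:.0%}")
```

Copy number changes at either locus, or unequal capture of the rearranged and intact alleles, would bias this estimate, so it should be read as a rough guide rather than a measurement.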
In sum, the three sarcoma types studied here, including the highly aggressive subtypes synovial sarcoma and osteosarcoma, contain rather few exonic mutations compared to other solid tumors (Parsons et al., 2008; Bettegowda et al., 2011; Agrawal et al., 2012). Part of the reason for this may be that patients with these sarcomas were often rather young, and tumors of younger patients often contain fewer mutations (Parsons et al., 2011; Tomasetti et al., 2013). Another reason is that MLPSs and SYNs are driven by characteristic gene fusions. These function as drivers, and perhaps only a small number of other genetic alterations are required for frank tumors to develop once the fusion has occurred. Another, nonmutually exclusive possibility is that the gene fusions function as the predominant drivers, but that subclones with distinct driver mutations in other genes arise later during tumor development. In addition, all but one of the MLPSs we evaluated in this study have promoter mutations in TERT (Killela et al., 2013) and three out of seven have mutations in the PIK3CA/AKT pathway. Although osteosarcomas do not have characteristic translocations, they do have extensive LOH, which reflects previously noted copy number variation in these tumors (Man et al., 2004; Martin et al., 2012; Rosenberg et al., 2013). This suggests the presence of genomic rearrangements that presumably alter a number of genes involved in the development of these tumors, which may explain why they contain few mutations. In osteosarcomas, changes in miRNA profiles have been associated with their pathogenesis (Jones et al., 2012). However, such changes and other epigenetic alterations were not evaluated in this study. Our results document that these additional drivers include TP53, PIK3CA, AKT1, H3F3A, SETD2, and FBXW7, and suggest that genes regulating angiogenesis (TIE1 and KDR) may play a role.
Importantly, three of these known driver genes (PIK3CA, AKT1, and SETD2) and two speculative drivers (TIE1 and KDR) have enzymatic activities and therefore offer therapeutic opportunities.
We thank our patients for their courage and generosity. We thank J. Ptak, N. Silliman, L. Dobbyn, and J. Schaeffer for expert technical assistance.
Supported by: NIH, Grant numbers: RC2DE020957, CA57345, CA121113, CA146799, CA133012, and DK087454; the Virginia and D.K. Ludwig Fund for Cancer Research, the Swedish Cancer Foundation, the National Research Council of Sweden, the Swedish Childhood Cancer Foundation and the 973 Program of Ministry of Science and Technology of China, Grant number: 2011CB707900.
Additional Supporting Information may be found in the online version of this article.