Increasingly, human mesenchymal malignancies are classified by the abnormalities that drive their pathogenesis. While many of these aberrations are highly prevalent within particular sarcoma subtypes, few are currently targeted therapeutically. Indeed, most subtypes of sarcoma are still treated with traditional therapeutic modalities and in many cases are resistant to adjuvant therapies. In this Review, we discuss the core molecular determinants of sarcomagenesis and emphasize the emerging genomic and functional genetic approaches that, coupled to novel therapeutic strategies, have the potential to transform the care of patients with sarcoma.
Sarcomas are uncommon yet diverse mesenchymal malignancies, arising in or from bone, cartilage, or connective tissues such as muscle, fat, peripheral nerve, and fibrous or related tissues (FIG. 1). Together, they affect ~11,000 individuals in the United States each year and approximately 200,000 worldwide, arise from multiple lineages, and range from indolent to highly invasive and metastatic1, 2. From a molecular genetics perspective, they have traditionally been classified into two broad categories, each of which includes clinically diverse sarcomas. The first comprises sarcomas with near-diploid karyotypes and simple genetic alterations, including translocations or specific activating mutations. The second comprises tumors with complex and unbalanced karyotypes. These tumors are typified by genome instability resulting in multiple genomic aberrations in a single tumor’s genome, and heterogeneity of aberrations across tumors of a given type. The contrasting features of these two categories, which we first highlighted in 2002 (ref. 3), have been well reviewed4. These categories are, however, broadly drawn and do not reflect the genetic diversity among tumors of a given type, the subtypes within classes, or their diverse tumor biology (FIG. 1).
Most sarcomas with simple genetic alterations are translocation-associated sarcomas (approximately one-third of all sarcomas). These tumors tend to arise de novo and, in some cases, harbor only the single defining cytogenetic abnormality that is present at initiation and retained throughout their clonal evolution. The majority of gene fusions resulting from these specific translocations encode chimeric transcription factors that cause transcriptional dysregulation of target genes, while others encode chimeric protein tyrosine kinases or autocrine growth factors5. Although well studied, the physiological roles of the individual genes in these fusions have seldom been directly linked to their respective sarcoma phenotypes, save perhaps for translocations of the myogenic transcription factor genes paired box 3 (PAX3) and PAX7 with forkhead box O1 (FOXO1) in alveolar rhabdomyosarcomas (ARMS; discussed below)6.
In contrast to translocation-associated sarcomas, some karyotypically complex sarcomas can arise from a less aggressive form and pass through discrete stages of progression accompanied by increasing genomic complexity. Examples include the progression from atypical lipoma or well-differentiated liposarcoma to dedifferentiated liposarcoma7–9, or from neurofibroma to malignant peripheral nerve sheath tumor (MPNST)10, 11, or from enchondroma to chondrosarcoma12. Importantly, however, most high-grade karyotypically complex sarcomas present de novo, without antecedent lower-grade lesions. A detailed listing of the genetic abnormalities in sarcomas and their conventional treatment is available elsewhere4, 13, 14. We focus here on the core mechanisms of pathogenesis in soft tissue sarcoma, the advanced genomic and functional genetic approaches being deployed for target discovery in this group of diseases, and the novel therapeutic approaches for their treatment.
The mechanisms that drive human sarcomagenesis fall into three broad categories: transcriptional dysregulation owing to aberrant fusion proteins resulting from genomic rearrangements (FIG. 2a), somatic mutations in key genes and signaling pathways, and DNA copy number abnormalities. The epigenetic mediators of sarcomagenesis are largely still to be determined: although specific chromatin changes are implied by translocations and subsequent transcriptional dysregulation, data on recurrent methylation in sarcoma genomes are limited. Although this review focuses on these three core oncogenic mechanisms and the distinct therapeutic modalities that may follow, some consideration of pathogenetic mechanisms relating to chromosomal translocations and genomic complexity or instability in sarcomas may be in order. The perennial question of how and why translocations arise has been the subject of recent reviews15. In sarcomas, as in many leukemias, these appear to be fundamentally random events that become fixed through natural selection within the precursor cell. In silico analysis of sequence and structure indicated that features such as overall gene size, average intron length, and the length of the longest intron were all higher in translocation partner genes16. Additional factors increase the likelihood of random breaks in two genes leading to an illegitimate recombination event. These include increased availability of translocation partner genes created by open chromatin conformation associated with gene transcription or replication, or the unexpected proximity of some partner genes due to either the three-dimensional arrangement of chromosomes in the nucleus17 or coordinated transcription within the same transcriptional hubs. Importantly, so-called recombinogenic DNA sequence elements may be anecdotally involved18, but are not more frequent in translocated genes.
More recently, binding of a transcription factor, the androgen receptor (AR), has been implicated more directly in generating DNA strand breaks and consequent gene fusions19–21, an observation so far restricted to prostate cancer, but with intriguing implications for other hormone-driven cancers. Finally, regarding external risk factors, sarcoma translocations, in particular the t(X;18) of synovial sarcoma, may be rarely related to radiotherapy-induced DNA damage22–24.
Another aspect of genomic integrity is telomere maintenance, for which two main mechanisms have been described in human tumors: telomerase activation and the alternative lengthening of telomeres (ALT). These appear to differ in frequency between the major genomic classes of sarcomas. A predominance of telomerase activation in the absence of ALT appears to characterize sarcomas with specific chromosomal translocations. By contrast, ALT is frequently seen in sarcomas with non-specific complex karyotypes25, 26, and a connection between ALT and mesenchymal stem cell biology has been proposed27.
Sarcomas with non-specific complex karyotypes, but not translocation-associated sarcomas, are also occasionally seen in some hereditary syndromes associated with genomic instability such as Werner syndrome (gene: WRN)28, Nijmegen breakage syndrome (gene: NBS1)29, and Rothmund-Thomson syndrome (gene: RECQL4)30, 31. Finally, recent low-coverage whole genome sequencing found that 3/9 osteosarcomas and 2/11 chordomas underwent a process termed chromothripsis. Rather than a multistep accumulation of unbalanced rearrangements, this is a single catastrophic genomic instability event affecting primarily a single chromosome32. Investigating the pathogenesis of chromothripsis and its occurrence in other sarcomas is of immediate interest.
Most translocation-associated sarcomas share a common biology of transcriptional target dysregulation. As noted above, most recurrent tumor-type–specific translocations in sarcomas produce gene fusions that encode aberrant transcriptional proteins. The general biology of cancer gene fusions has been well reviewed33. Likewise, general reviews of translocation-associated sarcomas, including comprehensive listings of recurrent gene fusions in sarcomas, have recently been published (FIG. 2a)5, 14. Here, we will limit ourselves to two aspects of transcriptional target gene dysregulation in translocation sarcomas that have been the focus of recent advances: the application of genome-wide transcription factor location analyses to comprehensively identify target genes of the fusion proteins, and the emerging evidence for aberrant nuclear reprogramming of mesenchymal stem cells in translocation-associated sarcomas. In the Therapeutic Avenues section below, we also discuss the use of the transcriptional targets of the fusion proteins as therapeutic targets, a third area of recent advances.
Genome-wide approaches to define the target gene repertoires of sarcoma fusions have included chromatin immunoprecipitation (ChIP) coupled to arrays (ChIP-on-chip), and, more recently, to second-generation sequencing (ChIP-seq: Box 1). Both methods identify binding sites for the aberrant fusion proteins, but ChIP-seq, unlike ChIP-on-chip using commercial promoter arrays, is not limited to regions surrounding promoters. Upon integration with expression profiles, one can determine whether the effect of a given fusion is predominantly repressive or activating.
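The integration of binding and expression data described above can be sketched in a few lines. This is a minimal illustration, not the method used in any of the cited studies: the function name, the threshold, and the input layout (a set of fusion-bound genes plus log2 fold changes measured after silencing the fusion) are all assumptions made for the example, and the gene labels are placeholders.

```python
def summarize_fusion_effect(bound_genes, log2_fc_after_knockdown, threshold=1.0):
    """Tally fusion-bound genes by their response to fusion knockdown.

    Assumption: a bound gene whose expression FALLS when the fusion is
    silenced is inferred to be activated by the fusion; one whose
    expression RISES is inferred to be repressed by it.
    """
    activated = sorted(
        g for g in bound_genes
        if log2_fc_after_knockdown.get(g, 0.0) <= -threshold
    )
    repressed = sorted(
        g for g in bound_genes
        if log2_fc_after_knockdown.get(g, 0.0) >= threshold
    )
    if len(activated) > len(repressed):
        call = "activating"
    elif len(repressed) > len(activated):
        call = "repressive"
    else:
        call = "mixed"
    return {"activated": activated, "repressed": repressed, "call": call}


# Placeholder inputs (hypothetical gene labels, not real measurements).
example = summarize_fusion_effect(
    {"GENE_A", "GENE_B", "GENE_C"},
    {"GENE_A": -2.0, "GENE_B": -1.5, "GENE_C": 1.2, "GENE_D": -3.0},
)
```

Genes that change in expression but are not bound (GENE_D here) are deliberately ignored, since they may be indirect targets.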
Second-generation sequencing is enabling nucleotide-resolution oncogenomics202. Paired-end (or mate-paired) sequencing involves sequencing of short stretches of DNA on both ends of a larger fragment and aligning to the reference genome. Atypically aligned pairs (those with unexpected position, orientation or separation distance) often reflect genomic rearrangements such as translocations (FIG. 2b). Paired-end sequencing is therefore a powerful method for ascertaining structural rearrangements and marks the first time this information is readily available in an unbiased manner. It is suitable for the detection of rearrangements from variable depth-of-coverage whole-genome sequencing, and its sensitivity increases as the fragment length increases. These methods will help detect previously unknown ‘driver’ fusions in sarcomas with highly complex karyotypes. Paired-end sequencing can also be deployed in RNA sequencing of tumor transcriptomes, as has been done for prostate cancers and other malignancies203, 204. To explore the biology of chimeric transcriptional proteins in translocation-associated sarcomas, chromatin immunoprecipitation coupled to sequencing (ChIP-seq) can determine fusion protein location, facilitating target gene discovery (see main text). Finally, deep whole-exome and whole-genome sequencing, while described in detail elsewhere202, have a central place in sarcoma genomics. Exome sequencing first captures and then deeply sequences all protein-coding exons of human genes. By contrast, whole-genome sequencing is unbiased, sequencing all accessible nucleotides in the human genome. Both experiments detect point mutations and small insertions and deletions (indels) in exons, while whole-genome sequencing can simultaneously capture genome structure (in paired-end format, described above) and critically, intergenic variation. Intergenic germline variation or somatic mutations, while under-explored currently, could play an important role in sarcomagenesis. 
This is typified by the MDM2SNP309 promoter polymorphism205, which along with MDM2 amplification and TP53 deletion and mutation, represents another mechanism of aberrant p53 activity in a broad range of sarcomas.
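The notion of atypically aligned read pairs in Box 1 can be made concrete with a small sketch. It assumes a standard forward-reverse paired-end library and an expected fragment-size window of 200 to 600 bp; the `ReadPair` type, the function name, the category labels and the window are illustrative assumptions (mate-pair libraries use a different expected orientation), and real rearrangement callers additionally cluster many supporting pairs and account for mapping quality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    chrom1: str
    pos1: int
    strand1: str  # '+' or '-'
    chrom2: str
    pos2: int
    strand2: str  # '+' or '-'


def classify_pair(pair, min_insert=200, max_insert=600):
    """Label a read pair by how it deviates from the expected layout."""
    # Unexpected position: mates map to different chromosomes.
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"  # candidate translocation
    # Order the two mates by genomic coordinate.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    # Unexpected orientation: both mates on the same strand, or facing outward.
    if left_strand == right_strand:
        return "inversion_like"
    if left_strand == "-" and right_strand == "+":
        return "everted"  # e.g. tandem duplication
    # Unexpected separation (distance between mate start positions,
    # used here as a simple proxy for fragment length).
    distance = right_pos - left_pos
    if distance > max_insert:
        return "deletion_like"
    if distance < min_insert:
        return "insertion_like"
    return "concordant"
```

A pair such as `ReadPair("chr11", 1000, "+", "chr22", 5000, "-")` is labeled `interchromosomal`, the signature expected of a translocation-spanning fragment.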
Mapping the genomic binding sites of the PAX3-FOXO1 fusion protein in ARMS cells has shown that binding is associated with activation of transcription34. PAX3-FOXO1 binds primarily to PAX3 sites outside of the immediate vicinity of transcription start sites, typically >4 kilobases (kb) downstream. Co-enrichment of target PAX3 motifs with E-box motifs suggests co-regulation of many target genes by other transcription factors that bind E-boxes34. The direct targets identified include myogenic genes such as myogenic differentiation 1 (MYOD1) and myogenic factor 5 (MYF5), as well as many biologically interesting targets such as fibroblast growth factor receptor 4 (FGFR4), anaplastic lymphoma kinase (ALK), MET, insulin-like growth factor 1 receptor (IGF1R), and MYCN, in some cases confirming previous single-gene studies35–37. The role of some of these PAX3-FOXO1 target genes in sarcomagenesis is further discussed below.
In alveolar soft part sarcoma (ASPS), the ASPL (also known as ASPSCR1) gene fuses with the transcription factor binding to IGHM enhancer 3 (TFE3) gene to form a chimeric protein that retains the TFE3 DNA binding domain and therefore its CACGTG recognition site. In ChIP-on-chip studies, we have found that ASPL-TFE3 localization is, as expected, enriched at this canonical site and exclusively associated with target gene activation, including MET (ref. 38), cytochrome P450 17A1 (CYP17A1) and uridine phosphorylase 1 (UPP1)39.
A somewhat more complicated picture has emerged for the major Ewing sarcoma fusion involving EWSR1 (also known as EWS) and the Friend leukaemia virus integration 1 (FLI1) gene. Several ChIP datasets have been generated in different Ewing sarcoma cell lines with endogenous EWS-FLI1, all using the same FLI1 antibody for immunoprecipitation of EWS-FLI1-bound DNA. The numbers of bound genomic regions in such studies have varied widely40–42. ChIP-seq subsequently demonstrated that the majority of genomic regions bound by EWS-FLI1 were intergenic and that, through its FLI1-derived ETS family DNA-binding domain, EWS-FLI1 binds avidly to GGAA microsatellites40, 41. Microsatellites containing 6 or more GGAA repeats (the core ETS domain binding sequence) are associated with EWS-FLI1 target gene upregulation40, 42. These repeats are often more than 200kb upstream of the target gene transcription start site, suggesting that chromatin looping brings distant regions together in a transcriptional hub to allow EWS-FLI1 to modulate gene expression. As microsatellites are known polymorphic sites, it has been hypothesized that higher repeat content at one or more key target genes may underlie individual or ethnic differences in Ewing sarcoma susceptibility, for instance its rarity in individuals of African descent41.
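As an illustration of how candidate GGAA microsatellites of the kind described above might be located in a sequence, the following sketch scans for uninterrupted runs of six or more GGAA units on either strand. The function name and the tuple layout are assumptions for the example; it ignores imperfect or interrupted repeats, which a real analysis would probably need to handle.

```python
import re


def find_ggaa_microsatellites(sequence, min_repeats=6):
    """Return (start, end, n_units, strand) for each perfect GGAA run.

    Coordinates are 0-based and end-exclusive. Runs of TTCC are reported
    as minus-strand hits, since TTCC is the reverse complement of GGAA.
    """
    seq = sequence.upper()
    hits = []
    for unit, strand in (("GGAA", "+"), ("TTCC", "-")):
        pattern = re.compile("(?:%s){%d,}" % (unit, min_repeats))
        for match in pattern.finditer(seq):
            n_units = (match.end() - match.start()) // len(unit)
            hits.append((match.start(), match.end(), n_units, strand))
    return sorted(hits)


# A synthetic sequence: one run of 7 units (reported) and one of 3 (ignored).
demo = "ACGT" + "GGAA" * 7 + "TTTT" + "GGAA" * 3 + "C"
print(find_ggaa_microsatellites(demo))  # [(4, 32, 7, '+')]
```

Because the repeat count is returned for each hit, the same scan could in principle be used to compare repeat lengths between alleles, in the spirit of the polymorphism hypothesis mentioned above.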
EWS-FLI1 also binds to more conventional, non-repetitive ETS motifs, and these sites are associated with genes that show either repression or activation of transcription42. A subset of EWS-FLI1 target regions show co-enrichment of sites for E2F, nuclear respiratory factor 1 (NRF1), and nuclear transcription factor Y (NFY) raising the possibility of specific cooperative interactions43. In general, the combination of genome-wide target gene identification with gene expression data should accelerate the discovery of genes crucial to tumor growth and survival in translocation sarcomas. Genes found to be directly up-regulated by specific aberrant sarcoma fusion proteins can be subjected to focused RNA interference (RNAi)-based screens to identify the genes most essential to the sarcoma in question (see Target Discovery below).
Recent efforts to generate non-embryonic stem cells have renewed interest in nuclear or lineage reprogramming44, 45. Understanding reprogramming may also inform our concepts of translocation sarcomas driven by aberrant transcription factors. Assigning lineage to translocation sarcomas has proven difficult, as is the case for Ewing and synovial sarcoma, ASPS, and others. The cell-of-origin for each of these has long been debated, especially owing to another peculiar clinical feature of these sarcoma types: their occurrence in unusual sites for tumors of bone and soft tissue, such as kidney, lung, or pancreas. One explanation for both characteristics is an origin from more than one stem or progenitor cell type or from related precursor cells in different parts of the body, with the similar undifferentiated or aberrantly differentiated phenotypes resulting from nuclear reprogramming by the aberrant transcription factors. For example, it has been shown that EWS-FLI1, the fusion defining Ewing sarcoma, can induce neuroectodermal gene expression in heterologous cell types such as fibroblasts and rhabdomyosarcoma cells46, 47.
Indeed, this scenario was postulated previously48, and is supported by compelling data from recent studies, although some disagreements remain49. Silencing EWS-FLI1 in Ewing sarcoma cell lines produces an expression profile most similar to mesenchymal stem cells (MSCs) or mesenchymal progenitor cells50, 51 and these can subsequently be induced to differentiate along adipogenic or osteoblastic lineages51. Thus, EWS-FLI1 induces a limited neuroectodermal gene expression program and imposes a differentiation block on MSCs (or a related cell type52), including a block on osteogenic differentiation by inhibiting runt-related transcription factor 2 (RUNX2) binding to genes associated with osteogenic differentiation53. In the converse experiment, EWS-FLI1 expression in human MSCs induces a Ewing sarcoma gene expression profile, especially clear in MSCs derived from younger individuals54, 55. By contrast, EWS-FLI1 expression in differentiated cell types with an intact ARF-p53 pathway induces apoptosis or growth arrest46. In human MSCs, EWS-FLI1 directly upregulates the polycomb group repressor enhancer of zeste homolog 2 (EZH2)56 and induces expression of embryonic stem cell genes POU5F1 (also known as OCT4), SRY-box 2 (SOX2) and NANOG, at least partly by repressing miR-145 expression54. Interestingly, EWSR1 also fuses with POU5F1 itself, albeit rarely, in undifferentiated bone sarcoma57, 58, myoepithelial tumors of the soft tissue59, and in certain salivary gland tumors60.
Synovial sarcomas contain fusions of the SS18 (also known as SYT) gene with either SSX1 or SSX2. In a striking analogy to the EWS-FLI1 data, synovial sarcoma cell lines also express POU5F1, SOX2 and NANOG, and silencing of SYT-SSX in these cell lines enhances their potential to differentiate along adipogenic, osteoblastic or chondrogenic lineages61. The formation of synovial sarcoma-like tumors in mice with conditional expression of SYT-SSX2 in myoblasts62 or other lineages63 can be interpreted as further evidence of nuclear reprogramming by the fusion protein in a variety of more or less committed mesenchymal lineages. Finally, the sarcoma fusions of myxoid liposarcoma [fused in sarcoma (FUS)-DDIT3 (also known as CHOP)] and ARMS (PAX3-FOXO1) have also been reported to transform mouse mesenchymal stem or progenitor cells64, 65.
Excluding the gene fusions in translocation sarcomas, few highly recurrent driver genes have been described in sarcoma. The major exception here is gastrointestinal stromal tumor (GIST). GIST, one of the more common human sarcoma types, is characterized by oncogenic mutations in KIT, or less often in platelet-derived growth factor receptor-α (PDGFRA), or rarely in BRAF 66–68. In fact, the dependence of GIST on constitutively activated KIT and PDGFRA has led to treatment with selective kinase inhibitors (discussed below), representing a paradigm of targeted therapy in solid tumors. Oncogenic mutations occur in several different domains of KIT, and the location affects sensitivity to targeted inhibitors. Levels of KIT are also high in interstitial cells of Cajal, the presumed cell of origin for GIST. Nevertheless, oncogenic KIT mutations (in the activation domain; D816V in particular) are also found in tumors of diverse lineages including mastocytosis, acute myeloid leukemia, and germ cell tumors.
Approximately 10% of adult GISTs lack a KIT or PDGFRA mutation; of these, a small subset (<1% of total GIST cases) harbor BRAF-V600E mutations (Table 1)66. Most pediatric GISTs likewise harbor no mutations in KIT, PDGFRA, or BRAF, although KIT pathway activity is high both in pediatric cases and in adult cases lacking mutations. Among these mutation-negative cases, pediatric tumors show consistent overexpression of IGF1R mRNA and protein, although the mechanism remains unknown as no genomic amplifications or activating mutations have been described at the IGF1R locus. In fact, pediatric tumors have mostly diploid genomes with few if any DNA copy-number alterations69. Therefore, ongoing deep sequencing in pediatric GISTs is expected to identify alternative oncogenic events. A particularly attractive method may be hybrid capture of protein-coding exons followed by second-generation sequencing (exome sequencing: see Box 1) given the power of its completeness and suitability for profiling small patient numbers.
Although generally sporadic, GIST can also present as part of syndromes such as familial GIST, Carney’s triad, Carney-Stratakis syndrome, and neurofibromatosis. In Carney-Stratakis syndrome, which is characterized by the co-occurrence of GIST and paraganglioma, germline mutations in genes encoding subunits of succinate dehydrogenase have been identified, as is also the case in familial paraganglioma70. Most GISTs occurring in association with neurofibromatosis type I harbor somatic inactivation of the wild-type neurofibromin 1 (NF1) allele, while very few have KIT or PDGFRA mutations71, 72.
A role for tyrosine kinases is also emerging in angiosarcoma, a highly aggressive vascular tumor, where transcriptional profiles show striking overexpression of vascular-specific receptor tyrosine kinases including kinase insert domain receptor (KDR; which encodes vascular endothelial growth factor receptor 2 (VEGFR2)), TIE1, SNF related kinase (SNRK), TEK, and fms-related tyrosine kinase 1 (FLT1)73. Sequencing of these five genes revealed KDR mutations in about 10% of cases of angiosarcoma. The VEGFR2 mutant proteins, when expressed in COS-7 cells, showed ligand-independent activation73.
A recent large-scale analysis of the genomic landscape of sarcomas encompassing seven major subtypes (myxoid/round-cell, dedifferentiated, and pleomorphic liposarcomas; myxofibrosarcoma, leiomyosarcoma, GIST, and synovial sarcoma) identified frequent mutations in TP53 (which encodes p53), NF1, and PI3K catalytic subunit-α (PIK3CA)74. TP53 mutations were identified in 17% of pleomorphic liposarcomas, consistent with these mutations being frequent in sarcomas with complex karyotypes75, 76. By contrast, in translocation-associated sarcomas, secondary genetic alterations, such as TP53 mutations or homozygous deletions of cyclin-dependent kinase inhibitor 2A (CDKN2A), are less common but, when present, are associated with a highly aggressive clinical course77. The discovery of PIK3CA mutations in 18% of myxoid/round-cell liposarcomas (Table 1) raises the possibility that secondary mutations may cooperate with the FUS-CHOP fusion protein in oncogenesis74. PIK3CA mutations clustered in the same two hot spots observed in epithelial tumors: the helical domain (E542K and E545K) and the kinase domain (H1047L and H1047R). Patients with helical domain mutations had a shorter disease-specific survival and increased AKT phosphorylation at both the mTOR complex 2 (mTORC2; also known as TORC2) and 3-phosphoinositide-dependent protein kinase 1 (PDK1) phosphorylation sites compared with those with wild-type or kinase-domain–mutant tumors74.
Another novel finding is that of NF1 alterations (point mutations or deletions) in 10% of myxofibrosarcomas and 8% of pleomorphic liposarcomas74 (Table 1). NF1 germline and somatic mutations are typically associated with NF1 inactivation in sarcomas in individuals with neurofibromatosis type 1 syndrome, but NF1 mutations had not been previously described in sporadic sarcomas.
DNA copy-number alterations are the third core mechanism of sarcomagenesis. Sarcomas span a wide range of complexity among human malignancies in their copy-number alterations78. They vary from translocation-associated sarcomas with generally few copy-number alterations, either broad or focal, to karyotypically complex subtypes that are heterogeneous, unstable and profoundly altered in genomic copy number. In addition, a recent high-resolution array-based copy-number analysis revealed a category with intermediate complexity mainly characterized by few, yet highly recurrent amplifications, exemplified by dedifferentiated liposarcomas74. These and similar genomic data support an alternative to the sarcoma classification based on low-resolution karyotypes, comprising three groups: genomically simple sarcomas, driven by pathognomonic translocations or point mutations; non–translocation-associated sarcomas of intermediate genomic complexity; and highly genomically complex sarcomas. Some subtypes, such as PAX7-FOXO1-positive ARMS, may not fit so neatly into these broad groups. Data from another copy-number analysis show that the third category can be subdivided into sarcomas with few chromosome arm or whole chromosome gains or losses and sarcoma genomes with a high level of chromosomal complexity79.
The first group, genomically simple sarcomas, harbor characteristic gene fusions or activating mutations thought to represent early events in their pathogenesis. Yet even these tumors can acquire genomic complexity in advanced stages of disease80, 81.
Intermediate complexity sarcomas are exemplified by well-differentiated and dedifferentiated liposarcomas, which are driven mainly by chromosome 12 alterations, often generating extra-chromosomal episomes, ring chromosomes and larger markers82 (FIG. 2b). These 12q gains have high prevalence (80–90%), and the co-amplified oncogenes cyclin-dependent kinase 4 (CDK4) and MDM2 can serve as confirmatory diagnostic markers83 and as potential pharmacological targets74, 84, 85. The structure, stability and reintegration of these amplicons into liposarcoma genomes can also alter their effect on oncogenic phenotypes86. Another gene affected by 12q amplification is HMGA2, which often loses its 3′ untranslated region (UTR), disrupting microRNA-mediated repression87. This genomic remodeling of chromosome 12 is likely the result of progressive rearrangement and amplification in an evolving amplicon rather than a single catastrophic event such as the recently proposed chromothripsis, seen in a subset of osteosarcomas and chordomas (Table 1)32. Similar 12q amplifications occur at lower frequencies in other mesenchymal tumors such as osteosarcomas88 as well as several epithelial tumor types78.
Other notable, albeit less recurrent, amplifications in intermediate-complexity sarcomas occur on 1p and 6q. These amplifications, which appear to be mutually exclusive, span genes in the p38 and JNK pathways of MAPK signaling including, on 1p, JUN (Table 1) and, on 6q, TAB2 and MAP3K5 (also known as ASK1), a kinase upstream of JUN (refs 9, 89–91). An additional target of genomic amplification is telomerase reverse transcriptase (TERT) (on 5p)74. Some targets of genomic amplification appear to be shared among a subset of both intermediate and highly complex sarcomas, including Yes-associated protein 1 (YAP1) and vestigial like 3 (VGLL3) on 11q22 and 3p12, respectively92.
Finally, highly complex sarcomas harbor multiple numerical and structural chromosome aberrations that are reminiscent of those in the vast majority of epithelial tumors. Molecular classification of these subtypes reflects varying levels of similarity in their genomic aberrations; some subtypes may be considered a single entity93, while others are distinct94. Broad amplifications of several chromosome arms (such as 5p95) often occur in combination with deletions affecting well-established tumor suppressors such as CDKN2A, CDKN2B, PTEN, retinoblastoma 1 (RB1), NF1 and TP53. The affected gene, if not homozygously deleted, often harbors an inactivating mutation in the remaining allele74. In fact, several of these genes have a direct role in maintaining chromosome integrity96, 97 and their loss of function may be an early event leading to genomic instability in highly complex sarcomas. In other subtypes, such as leiomyosarcoma, genomic deletions are more common than amplifications74, 98. Nevertheless, at least a subset of leiomyosarcomas depends on the specific amplification of myocardin (MYOCD), which encodes a smooth muscle-specific transcriptional coactivator of the serum response factor (SRF) (Table 1)99–101. The involvement of MYOCD in smooth muscle differentiation implies it may serve as a lineage-survival oncogene102. Therefore, while systematic catalogues of copy-number alterations point to pathways potentially activated in specific subtypes, precisely delineating the genes in these events that drive sarcomagenesis will require complementing genomic characterization with high-throughput functional genetics for target discovery.
Systematic surveys of cancer genomes with integrated genomics have proven an effective approach in identifying targetable genetic alterations in specific cancer types. The list of potential targets is expected to grow with the expanded use of second-generation sequencing technologies, which detect not only genome-wide copy-number changes, but rearrangements and mutations (Box 1).
Thus far, large genomic characterization efforts in cancer have mainly focused on epithelial and haematological cancers. Given the need for new therapies for sarcomas, their inclusion in such studies is expected in the near future. For instance, the Cancer Genome Atlas (TCGA) project is initiating a comprehensive genomic analysis of dedifferentiated liposarcoma, leiomyosarcoma and undifferentiated pleomorphic sarcoma, although this effort will have to overcome a perennial challenge in sarcoma genomic research: the scarcity of samples. Nevertheless, given the large number of differentiation lineages among diverse sarcomas, a detailed genetic characterization of these tumors is likely to benefit our wider understanding of cancer in general.
A gene recurrently altered in a sarcoma subtype does not necessarily play a role in cancer initiation or progression. In fact, the identification of recurrent lesions (Box 2) far outstrips our ability to test their importance. To determine the involvement of a gene in sarcoma biology and to credential it as a therapeutic target, systematic biological validation in genetically defined models must follow. Furthermore, even when a causal role for a given genetic alteration is experimentally supported in a particular cancer type, the critical downstream targets may remain elusive, requiring further functional studies. Yet, functional studies in sarcoma are hampered by a dearth of appropriate models. Only limited numbers of human sarcoma cell lines exist, in part because of the rarity of certain diagnoses and resulting scarcity of samples. Moreover, for each of the subtypes with complex genomes, multiple cell lines are needed to represent the diversity of genetic alterations within that subtype. Several large-scale projects now aim to genetically characterize large numbers of human cancer cell lines and screen these against a range of anti-cancer therapies to correlate drug sensitivity with genetic markers. Among these are the Cancer Cell Line Encyclopedia (J.B. personal communication) and the Sanger Cancer Cell Line Project103. The latter is assembling approximately 800 cell lines, of which only 10 (1.3%) represent complex soft-tissue sarcomas (another 38 represent Ewing sarcoma or primitive neuroectodermal tumor, rhabdomyosarcoma or osteosarcoma).
Over the course of their somatic evolution, cancer genomes can acquire an array of abnormalities. These alterations either confer a clonal growth advantage to the cell (driver) or are acquired stochastically, but are biologically neutral (passenger). The need to distinguish between these two alteration types in increasingly complex genomic data has driven the development of robust and statistically principled computational methodologies. Alteration-type-specific methods have focused in particular on DNA copy-number alterations (CNAs), one of the most common somatic genetic events not only in karyotypically complex sarcomas, but also in the genomes of epithelial cancers. Two such methods, Genomic Identification of Significant Targets in Cancer (GISTIC)206 and RAE (ref. 207), assign a statistical significance to candidate driver alterations emerging from a background of random, passenger abnormalities using their pattern of recurrence, amplitude, and extent, and also assign to each individual tumor the set of CNAs it has undergone. The outputs of these computational methods can be used in studies of clinical associations and analyses of aberrant pathway activity, integrated with orthogonal data, or used to populate large-scale functional genetic screens (see main text). Alternatively, other methods, such as iCluster208 and Copy Number and Expression in Cancer (CONEXIC)209, identify putative driver alterations by integrating multiple high-throughput data types (such as expression and copy number).
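The general idea of scoring recurrence against a background of random passenger events can be illustrated with a toy permutation test. This sketch is not GISTIC or RAE and does not reproduce either algorithm: it uses only binary amplification calls, ignores amplitude and extent, and the function name, input layout and add-one p-value estimate are assumptions chosen for brevity.

```python
import random


def recurrence_pvalues(calls, n_permutations=200, seed=0):
    """Empirical p-value for the recurrence of an alteration at each locus.

    calls: list of per-sample lists of 0/1 flags, one flag per locus.
    Null model (an assumption): each sample keeps its total number of
    altered loci, but their positions are shuffled at random.
    """
    rng = random.Random(seed)
    n_loci = len(calls[0])
    observed = [sum(sample[i] for sample in calls) for i in range(n_loci)]
    exceed = [0] * n_loci
    for _ in range(n_permutations):
        permuted = [0] * n_loci
        for sample in calls:
            shuffled = sample[:]
            rng.shuffle(shuffled)  # reposition this sample's alterations
            for i, flag in enumerate(shuffled):
                permuted[i] += flag
        for i in range(n_loci):
            if permuted[i] >= observed[i]:
                exceed[i] += 1
    # Add-one estimate so that no p-value is reported as exactly zero.
    pvalues = [(e + 1) / (n_permutations + 1) for e in exceed]
    return observed, pvalues
```

In such a scheme, a locus altered in every sample receives a very small p-value, whereas a locus altered in only one sample is indistinguishable from background. A real analysis would also need to correct for multiple testing across loci.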
There is, therefore, a pressing need to generate cell lines representative of diverse sarcoma types, mainly for the subtypes with complex karyotypes. The creation of a sarcoma cell line panel with cytogenetic and genomic profiles that mirror the diversity observed in their corresponding tumor types would represent a critical step in dissecting the influence of heterogeneity on variability of response to targeted therapies104. Such a panel could also drive genomics-guided functional genetics, either with arrayed or pooled loss-of-function RNAi screens105, 106, or gain-of-function ‘ORFeome’ approaches107 (FIG. 3).
Along these lines, we recently sought to functionally annotate the dedifferentiated liposarcoma genome by systematically knocking down genes altered by recurrent genomic amplification on 12q and elsewhere. With an arrayed loss-of-function shRNA screen, we determined which amplified genes are actually required for cell proliferation and survival74. We concentrated on dedifferentiated liposarcomas because the marked homogeneity of their genetic alterations compensates for the small number of cell lines available. Profiling of three dedifferentiated liposarcoma cell lines showed that this small panel captured a significant number of the molecular abnormalities observed in primary tumors. Using these validated cell lines, we identified several genes required for cancer cell viability, some of them potentially druggable. For instance, the hits included not only CDK4 at 12q14, confirming its importance in this sarcoma, but also aurora kinase A (AURKA; at 20q13), specific inhibitors of which are currently in clinical trials108.
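As a rough illustration of how hits might be called from an arrayed shRNA screen, the sketch below flags a gene when at least two independent hairpins reduce relative viability below a threshold. The two-hairpin rule, the threshold, the gene labels, and the values are all assumptions made for illustration; they are not the scoring scheme or data of the study cited above.

```python
def call_hits(viability, threshold=0.5, min_hairpins=2):
    """Return genes with at least `min_hairpins` hairpins below `threshold`.

    `viability` maps a gene label to the viability of cells treated with each
    hairpin against that gene, expressed relative to control hairpins (1.0 =
    no effect). Requiring more than one hairpin is a common guard against
    off-target effects of a single shRNA.
    """
    hits = []
    for gene, values in viability.items():
        n_below = sum(1 for v in values if v < threshold)
        if n_below >= min_hairpins:
            hits.append(gene)
    return sorted(hits)


# Hypothetical readout for three placeholder genes, three hairpins each.
toy_screen = {
    "GENE_A": [0.2, 0.3, 0.9],   # two hairpins strongly reduce viability
    "GENE_B": [0.4, 0.95, 1.0],  # only one hairpin scores
    "GENE_C": [1.0, 0.9, 1.1],   # no effect
}

if __name__ == "__main__":
    print(call_hits(toy_screen))
```

Under these assumed settings only the first placeholder gene is called a hit. Genes supported by a single hairpin would need independent validation.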
These studies also provided a setting where we could address an open question in cancer genetics, namely, whether focal genomic amplifications contain a single driver gene or, as recently suggested, multiple independent drivers109. We found evidence that MDM2 and YEATS4, which are frequently co-amplified with each other (and nearly always in the same tumors with CDK4 amplification), are both drivers74. MDM2 is a validated target in this disease, as drugs that inhibit the MDM2-p53 interaction induce apoptosis in dedifferentiated liposarcoma cell lines84, 85. Therefore, these data support the concept of multiple driver genes in a single amplicon and hint at a more complex effect of genomic amplification on cancer phenotypes than previously understood. Furthermore, co-amplified genes may influence phenotypes unrelated to viability, so alternative assays are needed to test their role as oncogenes. Overall, this study design establishes a framework for the systematic genomic and functional genetic characterization of other rare cancers.
In addition to cell lines, several other types of models have been used for sarcoma and are likely similarly adaptable to in vitro functional genetics for target discovery. These include ex vivo cultures of tissue slices that preserve the original tumor microenvironment110 and low-passage short-term cultures111, 112, both of which are tractable surrogates of primary tumors. Nevertheless, to test novel targeted therapies, it is essential to develop in vivo models of sarcomas. Both subcutaneous and orthotopic xenografts (injecting sarcoma cell lines in immunocompromised mice) have been used to model human sarcomas, but these model systems also have limitations. Some genetic abnormalities present in primary tumors will not be present or retained in the xenografts, and, conversely, serial passaging can introduce additional alterations that do not reflect the primary tumors. Indeed, many treatments that initially showed promise in these models have not translated successfully to the clinic. To overcome these limitations, researchers are attempting to create panels of xenografts directly from primary tumor tissues representing several sarcoma subtypes113.
An alternative to xenografts is to genetically engineer animal models that reproduce the characteristics of human tumors, but that presents specific challenges. For mouse models of translocation-associated sarcomas, the challenge is to express the fusion oncogene in the correct lineage and developmental stage. For mouse models of complex-karyotype sarcomas, the challenge is to express genuine alterations in an appropriate combination. For example, in leiomyosarcoma, the most prominent genetic alteration is deletion of chromosome 10 affecting PTEN, but this may be a secondary alteration. Nonetheless, this was modeled by genetically inactivating Pten in smooth muscle cells of mice, which led to leiomyosarcomagenesis114. Another recent mouse model introduced oncogenic Kras and mutant Trp53 in the muscle of mice; these changes were sufficient to generate high-grade sarcomas with myofibroblastic differentiation115, but KRAS is rarely mutated in human sarcomas. Sarcomas with simple karyotypes that have been successfully modeled are synovial sarcoma62, 63, ARMS116, myxoid liposarcoma117, and GIST118, 119. Nevertheless, in one model of synovial sarcoma62, the SYT-SSX fusion oncogene was targeted to the myogenic lineage, a lineage inconsistent with conventional pathologic data on this sarcoma.
These and other sarcoma models may reveal the specific roles of genes altered in primary tumors and allow identification of secondary genetic or phenotypic events that are required for sarcoma progression and/or metastasis (FIG. 3). Engineered animal models may also be adapted to in vivo RNAi screens, as has been demonstrated in models of hepatocellular carcinoma and lymphoma120–122 (FIG. 3). This approach to exploring gene function would be especially powerful in soft-tissue sarcomas with complex genotypes and numerous chromosome aberrations. Genetically engineered animal models can also be used for diagnostic or prognostic biomarker discovery, drug testing, and drug resistance studies123. Indeed, the combination of sarcoma tumor profiles, sarcoma model systems that faithfully represent the alterations characteristic of their tumor type, and in vitro and in vivo functional genetics is a powerful approach to target discovery that is also being applied to other cancers by the National Cancer Institute’s Cancer Target Discovery and Development Network124.
Despite the many advances in identifying genetic abnormalities in sarcoma and elucidating their function, cytotoxic chemotherapy remains the standard of care for most locally advanced and metastatic sarcomas. Yet, complete surgical resection is the best hope for cure, and few patients with unresectable disease are curable by cytotoxic chemotherapy. Indeed, today few specific genetic lesions in sarcoma are direct targets of therapy, unlike epithelial cancer types harboring mutations that confer sensitivity to targeted inhibitors125.
The exception among sarcomas is GIST, where the KIT kinase inhibitor imatinib achieves a partial response or stable disease in approximately 80% of patients with advanced or metastatic GIST, often within days, with some patients on therapy now for 10 years126. These responses to imatinib depend, however, on the specific site of mutation; tumors with activation loop mutations are generally insensitive. Response to imatinib has also been disappointing in patients with wild-type KIT and PDGFRA genotypes, despite KIT pathway activation. These findings lend support to a genotype-driven paradigm of kinase inhibition. This paradigm may apply across tumors of diverse histologies that share addiction to a particular mutated kinase. For example, the recent success of Raf inhibitors in BRAF-V600E mutant melanoma suggests that responses may be elicited in other tumor types with a dependence on oncogenic Raf127, 128, a possible therapeutic option for the approximately 1% of adult GIST patients with BRAF-V600E mutation66.
GIST notwithstanding, other common sarcomas, including leiomyosarcoma, high-grade undifferentiated pleomorphic sarcoma (formerly termed malignant fibrous histiocytoma) and well-differentiated/dedifferentiated liposarcoma, have shown very little sensitivity to existing tyrosine kinase inhibitors (TKIs)129, 130. Nevertheless, kinase-directed agents have produced responses in certain translocation-associated sarcomas130, 131 (Table 2). Among these are responses to imatinib in dermatofibrosarcoma protuberans (DFSP) and giant-cell tumors of the tendon sheath with collagen Iα1 (COL1A1)-platelet-derived growth factor-β (PDGFB) and collagen VIα3 (COL6A3)-colony-stimulating factor 1 (CSF1) fusions, respectively132, 133; MET inhibitor responses in ASPS and clear-cell sarcomas with ASPL-TFE3 and EWS-activating transcription factor 1 (ATF1) fusions, respectively38, 134, 135; ALK inhibitor responses in inflammatory myofibroblastic tumors with ALK fusions136; and IGF1R antibody responses in Ewing sarcoma with EWS-FLI1 or EWS-ERG fusions137–139. Yet in none of these instances have clinical responses proven to be as durable as those observed in patients with GIST treated with imatinib or other TKIs. In addition, patients with Ewing sarcoma have an approximately 10–15% response rate to anti-IGF1R therapy, yet in these tumors and in angiosarcomas, preclinical evidence would have predicted a greater response rate131, 140, 141. The reason for this discrepancy is unknown, though perhaps only a fraction of patients have overtly IGF1R-dependent tumors, as indicated by high serum IGF1 levels, which has been observed in non-small cell lung cancer142.
Among some less common sarcoma subtypes, several targeted agents appear active (Table 2). In addition, the identification of moderately or highly prevalent genetic abnormalities in some subtypes has suggested new possibilities for therapy. VEGFR-directed therapies such as bevacizumab and sorafenib are associated with approximately 15% response rates in primary and radiation-induced angiosarcoma129, 143, perhaps associated most closely with KDR mutation73. ASPS tumors are sensitive to VEGF-directed therapy such as cediranib or sunitinib144, 145. On the basis of reduced expression of tuberin (TSC2), perivascular epithelioid cell tumors (PEComas) and related conditions such as lymphangioleiomyomatosis and angiomyolipoma respond to mTOR inhibition146–148. As NF1 inactivation leads to aberrant MAPK and mTOR pathway activity149, 150, the NF1 mutations and genomic deletions recently observed in pleomorphic liposarcomas and myxofibrosarcomas74 may identify a broader range of patients who might respond to either RAF/MEK inhibitors or rapamycin and its analogs (rapalogues). In fact, deploying rapalogues in several complex subtypes could be justified on the basis of highly prevalent PTEN deletions, as in leiomyosarcoma114. However, preliminary results of a phase III study of the mTOR inhibitor ridaforolimus indicated that progression-free survival was extended by only 3.1 weeks after completion of cytotoxic chemotherapy compared with the control arm, suggesting limited utility of mTOR inhibitors in sarcoma patients not first selected on the basis of their genomic abnormalities. The finding of frequent PIK3CA mutations in myxoid/round-cell liposarcoma74 (Table 1) suggests that at least this molecular subset of patients might benefit from PI3K inhibitors; this is currently being tested in clinical trials.
Finally, and while not strictly a targeted chemotherapeutic agent, trabectedin, a DNA minor groove-binding drug now approved in Europe for use in sarcomas, shows a substantial response rate in myxoid/round cell liposarcoma with FUS-CHOP and EWS-CHOP fusions and perhaps in additional translocation-associated sarcomas151, 152. While the precise function of trabectedin in sarcomas is unclear, it appears to involve alterations in transcription downstream of histone and transcription factor binding153. Trabectedin sensitizes cancer cell lines to FAS-mediated cell death154 and sarcomas with intact nucleotide excision repair (NER) appear to be more sensitive to the drug than those with dysfunctional NER155.
Malignancies, both epithelial and mesenchymal in origin, have remarkable similarity in their mechanisms of acquired and adaptive drug resistance. Resistance to TKIs is frequently acquired through reactivation of the oncogenic kinase through second-site mutations, as in KIT-mutant GIST156–158. For patients with metastatic disease, the median time to progression on first-line imatinib therapy is approximately 2 years. Here, the nature of secondary KIT mutations depends on the location of the primary KIT mutation. For instance, GIST harboring the more common and imatinib-sensitive KIT exon 11 mutation tend to become resistant by acquiring a second-site KIT mutation in exon 11 rather than in exon 9. Resistance to broadly based second-line KIT inhibitors (sunitinib), arising on average in ~6 months, can also develop through selection for double KIT-mutant resistant clones159. Another mechanism of resistance may involve alternative oncogenic pathways or rewiring of signaling networks, as experimental evidence suggests is the case for IGF1R inhibitors in rhabdomyosarcomas and Ewing sarcoma cell lines160, 161. This adaptive resistance is consistent with the lack of IGF1R mutations observed in cancer types where these therapies are active.
To circumvent these mechanisms of drug resistance, additional novel agents and strategies will be required. A strategy currently being used for imatinib-resistant chronic myelogenous leukemia is the development of second- or third-generation inhibitors with higher kinase-binding affinity or binding to sites other than the kinase domain itself162. The newer generation KIT inhibitors in preclinical development bind to the switch pocket domain of the protein, overcoming the resistance mediated by most combinations of KIT mutations observed in clinical samples163. However, the complexity of polyclonal resistance in imatinib-resistant GIST patients suggests that a single next-generation drug is unlikely to inhibit all mutant clones in a given patient, and broader therapeutic strategies need to be considered. Strategies being examined in clinical trials include drug combinations that block specific heterodimerization of receptor tyrosine kinases or multiple levels in a single signaling pathway. One such example is an inhibitor of CDC37, a protein that links KIT to the chaperone heat shock protein 90 (HSP90); this inhibitor may potentiate KIT degradation without the potential toxicity inherent in inhibiting too many HSP90 client proteins164–166. Other approaches warrant study, including polypharmacology, the simultaneous inhibition of multiple targets167.
Considering therapeutic strategies aimed at the aberrant transcriptional proteins driving translocation sarcomas, we note that transcription factors are considered poorly druggable because their protein-protein and protein-DNA interactions have historically been difficult to inhibit with small molecules. This view may be changing168, 169, and in sarcoma, a notable example exists with a small molecule that disrupts a critical interaction of the EWS-FLI1 protein with RNA helicase A in Ewing sarcoma170. Nevertheless, the most promising current approach to discovering therapeutic targets in translocation sarcomas is identifying targets of the chimeric transcription factor and focusing on those that encode known drug targets. The receptor tyrosine kinase MET has emerged as such a target in several sarcomas. MET is a direct transcriptional target of ASPL-TFE3 in ASPS38, and apparently also of PAX3-FOXO1 in ARMS36, 171. In clear-cell sarcoma, EWS-ATF1 transactivates microphthalmia-associated transcription factor (MITF), which in turn directly activates MET transcription134, 172. The fact that these fusion proteins upregulate MET has justified a phase II multi-institutional study of the MET inhibitor ARQ197 in patients with advanced clear cell sarcoma and ASPS. The transcriptional targets of the Ewing sarcoma fusion protein EWS-FLI1 appear to affect multiple pathways including Notch173, Hedgehog-GLI174, 175, Wnt-β-catenin176, transforming growth factor-β (TGFβ)177, 178, and possibly IGF1R179. The IGF1R pathway may be dysregulated by EWS-FLI1 at several levels, including IGF1 upregulation and IGF binding protein 3 (IGFBP3) repression138, 180, 181. The IGF1R pathway is also transcriptionally upregulated by the PAX3-FOXO1 fusion in ARMS34, 35. These findings have in part provided the rationale for trials of IGF1R inhibitors in these sarcomas141, 182, 183. 
Finally, in myxoid/round cell liposarcoma, the FUS-CHOP fusion oncoprotein forms a complex with NFKBIZ at target promoters thereby upregulating nuclear factor-κB (NF-κB) target genes184. Thus, NF-κB pathway inhibition, which reduces the viability of myxoid liposarcoma cell lines185, may represent a new therapeutic option in this translocation sarcoma.
The insensitivity of many sarcomas to existing systemic therapy is driving the exploration of agents aimed at new types of targets. Among these are HSP90 inhibitors, which have been studied in GISTs (but not in other sarcomas). Other novel targets include BCL2, phosphatases involved in feedback control of oncogenic pathways, and key mediators of epigenetic regulation including histone deacetylases, histone acetyltransferases, and DNA methyltransferases. Epigenetic approaches may lead to re-expression of pro-apoptotic molecules, rendering sarcomas sensitive to other agents, or may themselves induce apoptosis or senescence, although this is unknown at present. Similarly, cell cycle regulators including CDK4 and CDK6186 have proven to be attractive but recalcitrant targets, while strategies to target components of the mitotic apparatus such as aurora kinases are actively under development. Targeting the p53-MDM2 pathway with nutlins is promising in tumors with MDM2 amplification84 (predominantly well- and dedifferentiated liposarcomas). With an increasing array of agents to test against this rare group of cancers (FIG. 4), international-scale cooperative studies are paramount, as is ensuring that patients with sarcomas are included, along with those with more common cancers, in clinical trials of biologically relevant agents.
The diagnosis and treatment of sarcoma patients is entering a period of rapid evolution. The dramatic drop in the cost of personal genome sequencing may alter the clinical and therapeutic course for sarcoma patients, as it is becoming technically possible to guide patient care by analysis of the patient’s cancer and normal genome sequences and this may soon become practically feasible as well187. Over the next few years, the catalog of mutations that drive all but the least common diseases will become known, thanks to large-scale efforts such as TCGA and the International Cancer Genome Consortium, as well as others. To prevent sarcomas from lagging behind epithelial cancers in target discovery, it will be critical that robust models of disease be developed to allow rapid functional annotation of the genetic abnormalities identified from both research and clinical sequencing.
We apologize to the many authors whose relevant work we were unable to cite here owing to space limitations. We thank N. Schultz for providing pathway expertise, C.D.M. Fletcher for critical reading, and M. Meyerson and C. Sander for advice and support. This work was supported in part by The Soft Tissue Sarcoma Program Project (P01 CA047179, S.S., M.L. and C.A.) and the SPORE in Soft Tissue Sarcoma (P50 CA140146-01, S.S., M.L., C.A. and B.S.T.).
BARRY S. TAYLOR
Barry S. Taylor, Ph.D., is the David H. Koch Fellow in cancer genomics at Memorial Sloan-Kettering Cancer Center and a visiting scientist at the Helen Diller Family Comprehensive Cancer Center at the University of California, San Francisco. His research uses statistical, integrative genomic, and functional genetic approaches for cancer genome discovery.
JORDI BARRETINA
Jordi Barretina, Ph.D., is a research scientist in the Cancer Program at the Broad Institute. He is interested in applying genomic and functional tools to the analysis of cancer genomes with a particular focus on translational research. He currently leads the Cancer Cell Line Encyclopedia (CCLE) project, which is conducting detailed genetic and pharmacologic characterization of cancer cell lines to link therapeutic vulnerabilities to genomic patterns and translate cell line genomics into patient stratification in genotype-driven clinical trials.
ROBERT G. MAKI
Robert G. Maki, M.D., Ph.D., is the leader of the sarcoma program and pediatric hematology-oncology program at Mount Sinai Medical Center, New York. He has been involved in numerous translational studies of a variety of sarcoma subtypes, and is actively seeking both novel agents and new clinical trial designs to accelerate the use of new agents in adult and pediatric sarcoma patients.
CRISTINA R. ANTONESCU
Cristina R. Antonescu, M.D. is an Associate Attending Pathologist in the Department of Pathology at Memorial Sloan-Kettering Cancer Center. She is a sarcoma pathologist and her research focuses on characterizing novel molecular markers for diagnosis and treatment of gastrointestinal stromal tumors and angiosarcoma.
SAMUEL SINGER
Samuel Singer, M.D., FACS, is an Attending Surgical Oncologist and the Chief of the Gastric and Mixed Tumor Service at Memorial Sloan-Kettering Cancer Center, where he is principal investigator of the Soft Tissue Sarcoma Program Project and the SPORE in soft tissue sarcoma. His research focuses on an integrated molecular, genetic, and biochemical analysis of soft-tissue sarcoma. Dr. Singer also maintains a large clinical practice focused on patients with soft tissue sarcoma.
MARC LADANYI
Marc Ladanyi, M.D. holds the William Ruane Chair in Molecular Oncology at MSKCC where he is Attending Pathologist in the Molecular Diagnostics Service of the Department of Pathology and Member in the Human Oncology and Pathogenesis Program. He is a molecular pathologist whose research laboratory works on the genomics and molecular pathogenesis of sarcomas and thoracic malignancies. He also co-directs the Cancer Genome Atlas (TCGA) group at MSKCC, part of the NCI’s TCGA project network.
Competing interests statement
The authors declare no competing financial interests.
The Cancer Genome Atlas (TCGA), studied cancers: http://cancergenome.nih.gov/wwd/cancers_studied_by_tcga.asp
Online Mendelian Inheritance in Man: http://www.ncbi.nlm.nih.gov/omim
Mitelman Database of Chromosome Aberrations and Gene Fusions in Cancer: http://cgap.nci.nih.gov/Chromosomes/Mitelman
NCI CTD2: http://ocg.cancer.gov/programs/ctdd.asp
The RNAi Consortium: http://www.broadinstitute.org/rnai/trc/
Human ORFeome: http://horfdb.dfci.harvard.edu/
National Cancer Institute Drug Dictionary: http://www.cancer.gov/drugdictionary/