Like other solid tumors, colorectal cancer (CRC) is a genomic disorder in which various types of genomic alterations, such as point mutations, genomic rearrangements, gene fusions, or chromosomal copy number alterations, can contribute to the initiation and progression of the disease. The advent of a new DNA sequencing technology known as next-generation sequencing (NGS) has revolutionized the speed and throughput of cataloguing such cancer-related genomic alterations. Now the challenge is how to exploit this advanced technology to better understand the underlying molecular mechanism of colorectal carcinogenesis and to identify clinically relevant genetic biomarkers for diagnosis and personalized therapeutics. In this review, we will introduce NGS-based cancer genomics studies focusing on those of CRC, including a recent large-scale report from the Cancer Genome Atlas. We will mainly discuss how NGS-based exome-, whole genome- and methylome-sequencing have extended our understanding of colorectal carcinogenesis. We will also introduce the unique genomic features of CRC discovered by NGS technologies, such as the relationship with bacterial pathogens and the massive genomic rearrangements of chromothripsis. Finally, we will discuss the necessary steps prior to development of a clinical application of NGS-related findings for the advanced management of patients with CRC.
Next-generation sequencing; Cancer genomics; Colorectal cancers; Personalized medicine; The cancer genome atlas
The accumulation of somatic mutations in genes and molecular pathways is a major factor in the evolution of oral squamous cell carcinoma (OSCC), which has prompted studies to identify somatic mutations with clinical potential. Recently, massively parallel sequencing has started to revolutionize biomedical research, owing to its rapidly increasing throughput and falling cost. Whole-transcriptome sequencing (RNA-Seq) has therefore become an attractive approach in cancer studies, because it enables the detection of somatic mutations and the accurate measurement of gene expression simultaneously.
We used RNA-Seq data from tumor and matched normal samples to investigate somatic mutation spectrum in OSCC.
By applying a sophisticated bioinformatic pipeline, we interrogated two tumor samples and their matched normal tissues and identified 70,472 tumor somatic mutations in protein-coding regions. We further identified 515 significantly mutated genes (SMGs) and 156 tumor-specific disruptive genes (TDGs), with six genes in both sets: ANKRA2, GTF2H5, STOML1, NUP37, PPP1R26, and TAF1L. Pathway analysis suggested that SMGs were enriched in cell adhesion pathways, which are frequently implicated in tumor development. We also found that SMGs tend to be differentially expressed between tumors and normal tissues, implying that the accumulation of genetic aberrations in these genes plays a regulatory role.
Our finding of known tumor genes demonstrates the utility of RNA-Seq in mutation screening, and functional analysis of the genes detected here would help to elucidate the molecular mechanism of OSCC.
RNA-Seq; Oral squamous cell carcinoma; Somatic mutations; Significantly mutated genes; Differential expression; Disruptive genes
Meningiomas are the most common primary nervous system tumor. The tumor suppressor NF2 is disrupted in approximately half of meningiomas [1], but the complete spectrum of genetic changes remains undefined. We performed whole-genome or whole-exome sequencing on 17 meningiomas and focused sequencing on an additional 48 tumors to identify and validate somatic genetic alterations. Most meningiomas exhibited simple genomes, with fewer mutations, rearrangements, and copy-number alterations than reported in other adult tumors. However, several meningiomas harbored more complex patterns of copy-number changes and rearrangements including one tumor with chromothripsis. We confirmed focal NF2 inactivation in 43% of tumors and found alterations in epigenetic modifiers among an additional 8% of tumors. A subset of meningiomas lacking NF2 alterations harbored recurrent oncogenic mutations in AKT1 (E17K) and SMO (W535L) and exhibited immunohistochemical evidence of activation of their pathways. These mutations were present in therapeutically challenging tumors of the skull base and higher grade. These results begin to define the spectrum of genetic alterations in meningiomas and identify potential therapeutic targets.
Deep sequencing techniques provide a remarkable opportunity for comprehensive understanding of tumorigenesis at the molecular level. As omics studies become popular, integrative approaches need to be developed to move from a simple cataloguing of mutations and changes in gene expression to dissecting the molecular nature of carcinogenesis at the systemic level and understanding the complex networks that lead to cancer development.
Here, we describe a high-throughput, multi-dimensional sequencing study of primary lung adenocarcinoma tumors and adjacent normal tissues of six Korean female never-smoker patients. Our data encompass results from exome-seq, RNA-seq, small RNA-seq, and MeDIP-seq. We identified and validated novel genetic aberrations, including 47 somatic mutations and 19 fusion transcripts. One of the fusions involves the c-RET gene, which was recently reported to form fusion genes that may function as drivers of carcinogenesis in lung cancer patients. We also characterized gene expression profiles, which we integrated with genomic aberrations and gene regulations into functional networks. The most prominent gene network module that emerged indicates that disturbances in G2/M transition and mitotic progression are causally linked to tumorigenesis in these patients. Also, results from the analysis strongly suggest that several novel microRNA-target interactions represent key regulatory elements of the gene network.
Our study not only provides an overview of the alterations occurring in lung adenocarcinoma at multiple levels from genome to transcriptome and epigenome, but also offers a model for integrative genomics analysis and proposes potential target pathways for the control of lung adenocarcinoma.
The research community at large is expending considerable resources to sequence the coding region of the genomes of tumors and other human diseases using targeted exome capture (i.e., “whole exome sequencing”). The primary goal of targeted exome sequencing is to identify nonsynonymous mutations that potentially have functional consequences. Here, we demonstrate that whole-exome sequencing data can also be analyzed for comprehensively monitoring somatic copy number alterations (CNAs) by benchmarking the technique against conventional array CGH. A series of 17 matched tumor and normal tissues from patients with metastatic castrate-resistant prostate cancer was used for this assessment. We show that targeted exome sequencing reliably identifies CNAs that are common in advanced prostate cancer, such as androgen receptor (AR) gain and PTEN loss. Taken together, these data suggest that targeted exome sequencing data can be effectively leveraged for the detection of somatic CNAs in cancer.
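As a rough illustration of how exome read depth can be turned into copy-number estimates, the sketch below computes per-target log2 ratios of tumor versus matched-normal coverage and centers them on the median so that one strongly amplified target does not shift the baseline. The function names, the pseudocount, the ±0.3 thresholds and the toy depth values are assumptions made for illustration only; the pipeline actually used in the study is not described here.

```python
import math
import statistics

def log2_ratios(tumor_depth, normal_depth, pseudocount=0.5):
    """Median-centered log2(tumor/normal) per capture target.

    tumor_depth / normal_depth: dicts mapping target name -> mean read depth.
    The pseudocount (an assumed value) avoids division by zero.
    """
    raw = {
        target: math.log2((tumor_depth.get(target, 0) + pseudocount)
                          / (normal + pseudocount))
        for target, normal in normal_depth.items()
    }
    center = statistics.median(raw.values())
    return {target: value - center for target, value in raw.items()}

def call_cna(ratio, gain=0.3, loss=-0.3):
    """Classify one log2 ratio; the thresholds are illustrative, not validated."""
    if ratio >= gain:
        return "gain"
    if ratio <= loss:
        return "loss"
    return "neutral"

# Toy depths (hypothetical numbers, not data from the study).
tumor = {"AR": 400, "PTEN": 20, "GAPDH": 100, "ACTB": 100}
normal = {"AR": 100, "PTEN": 100, "GAPDH": 100, "ACTB": 100}
calls = {t: call_cna(r) for t, r in log2_ratios(tumor, normal).items()}
print(calls)
```

Median centering is used here instead of scaling by total depth because a single high-level gain can otherwise make copy-neutral targets look like losses.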
Recent advances in the treatment of cancer have focused on targeting genomic aberrations with selective therapeutic agents. In rare tumors, where large-scale clinical trials are daunting, this targeted genomic approach offers a new perspective and hope for improved treatments. Cancers of the ampulla of Vater are rare tumors that comprise only about 0.2% of gastrointestinal cancers. Consequently, they are often treated as either distal common bile duct or pancreatic cancers.
Using whole genome sequencing, we analyzed DNA from a resected cancer of the ampulla of Vater and from the whole blood of a 63-year-old man who underwent a pancreaticoduodenectomy, achieving 37× and 40× coverage, respectively. We determined somatic mutations and structural alterations.
We identified relevant aberrations, including deleterious mutations of KRAS and SMAD4 as well as a homozygous focal deletion of the PTEN tumor suppressor gene. These findings suggest that these tumors have a distinct oncogenesis from either common bile duct cancer or pancreatic cancer. Furthermore, this combination of genomic aberrations suggests a therapeutic context for dual mTOR/PI3K inhibition.
Whole genome sequencing can elucidate an oncogenic context and expose potential therapeutic vulnerabilities in rare cancers.
The incidence of melanoma is increasing more than any other cancer, and knowledge of its genetic alterations is limited. To systematically analyze such alterations, we performed whole-exome sequencing of 14 matched normal and metastatic tumor DNAs. Using stringent criteria, we identified 68 genes that appeared to be somatically mutated at elevated frequency, many of which are not known to be genetically altered in tumors. Most importantly, we discovered that TRRAP harbored a recurrent mutation that clustered in one position (p.Ser722Phe) in 6 out of 167 affected individuals (~4%), as well as a previously unidentified gene, GRIN2A, which was mutated in 33% of melanoma samples. The nature, pattern and functional evaluation of the TRRAP recurrent mutation suggest that TRRAP functions as an oncogene. Our study provides, to our knowledge, the most comprehensive map of genetic alterations in melanoma to date and suggests that the glutamate signaling pathway is involved in this disease.
The arrival of both high-throughput and bench-top next-generation sequencing technologies and sequence enrichment methods has revolutionized our approach to dissecting the genetic basis of cancer. These technologies have been almost invariably employed in whole-genome sequencing (WGS) and whole-exome sequencing (WES) studies. Both WGS and WES approaches have been widely applied to interrogate the somatic mutational landscape of sporadic cancers and identify novel germline mutations underlying familial cancer syndromes. The clinical implications of cancer genome sequencing have become increasingly clear, for example in diagnostics. In this editorial, we present these advances in the context of research discovery and discuss both the clinical relevance of cancer genome sequencing and the challenges associated with the adoption of these genomic technologies in a clinical setting.
Next-generation sequencing; Exome; Cancer; Diagnostics; Familial cancer syndrome; Somatic mutation
Gene fusions arising from chromosomal translocations have been implicated in cancer. However, the role of gene fusions in BRCA1-related breast cancers is not well understood. Mutations in BRCA1 are associated with an increased risk for breast cancer (up to 80% lifetime risk) and ovarian cancer (up to 50%). We sought to identify putative gene fusions in the transcriptomes of these cancers using high-throughput RNA sequencing (RNA-Seq).
We used Illumina sequencing technology to sequence the transcriptomes of five BRCA1-mutated breast cancer cell lines, three BRCA1-mutated primary tumors, two secretory breast cancer primary tumors and one non-tumorigenic breast epithelial cell line. Using a bioinformatics approach, our initial attempt at discovering putative gene fusions relied on analyzing single-end reads and identifying reads that aligned across exons of two different genes. Subsequently, later samples were sequenced with paired-end reads and with more cycles (producing longer reads). We then refined our approach by identifying misaligned paired reads, which may flank a putative gene fusion junction.
As a proof of concept, we were able to identify two previously characterized gene fusions in our samples using both single-end and paired-end approaches. In addition, we identified three novel in-frame fusions, but none were recurrent. Two of the candidates, WWC1-ADRBK2 in the HCC3153 cell line and ADNP-C20orf132 in a primary tumor, were confirmed by Sanger sequencing and RT-PCR. RNA-Seq expression profiling of these two fusions showed a distinct overexpression of the 3' partner genes, suggesting that their expression may be under the control of the 5' partner gene's regulatory elements.
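The paired-end idea described above can be sketched as follows: read pairs whose two mates align to different genes are counted, and gene pairs supported by enough such reads are kept as fusion candidates. This is a minimal sketch under stated assumptions; the input representation (one gene label per mate), the function name and the `min_pairs` threshold are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter

def fusion_candidates(pairs, min_pairs=2):
    """Count read pairs whose mates map to two different genes.

    pairs: iterable of (gene_of_mate1, gene_of_mate2); None marks a mate
           that could not be assigned to a gene.
    Returns {(gene_a, gene_b): supporting_pair_count}, with the gene names
    sorted so that mate order does not matter.
    """
    support = Counter()
    for gene1, gene2 in pairs:
        if gene1 is None or gene2 is None or gene1 == gene2:
            continue  # unannotated or concordant pair: no fusion evidence
        support[tuple(sorted((gene1, gene2)))] += 1
    return {genes: n for genes, n in support.items() if n >= min_pairs}

# Toy input (hypothetical read pairs).
toy_pairs = [
    ("WWC1", "ADRBK2"),
    ("ADRBK2", "WWC1"),
    ("WWC1", "WWC1"),
    ("ADNP", None),
    ("TP53", "BRCA1"),
]
print(fusion_candidates(toy_pairs))
```

A real analysis would also need to filter pairs explained by read-through transcripts, paralogous genes or alignment errors, which this sketch does not attempt.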
In this study, we used both single-end and paired-end sequencing strategies to discover gene fusions in breast cancer transcriptomes with BRCA1 mutations. We found that the use of paired-end reads is an effective tool for transcriptome profiling of gene fusions. Our findings suggest that while gene fusions are present in some BRCA1-mutated breast cancers, they are infrequent and not recurrent. However, private fusions may still be valuable as potential patient-specific biomarkers for diagnosis and treatment.
Glioblastoma multiforme, the most common type of primary brain tumor in adults, is driven by cells with neural stem (NS) cell characteristics. Using derivation methods developed for NS cells, it is possible to expand tumorigenic stem cells continuously in vitro. Although these glioblastoma-derived neural stem (GNS) cells are highly similar to normal NS cells, they harbor mutations typical of gliomas and initiate authentic tumors following orthotopic xenotransplantation. Here, we analyzed GNS and NS cell transcriptomes to identify gene expression alterations underlying the disease phenotype.
Sensitive measurements of gene expression were obtained by high-throughput sequencing of transcript tags (Tag-seq) on adherent GNS cell lines from three glioblastoma cases and two normal NS cell lines. Validation by quantitative real-time PCR was performed on 82 differentially expressed genes across a panel of 16 GNS and 6 NS cell lines. The molecular basis and prognostic relevance of expression differences were investigated by genetic characterization of GNS cells and comparison with public data for 867 glioma biopsies.
Transcriptome analysis revealed major differences correlated with glioma histological grade, and identified misregulated genes of known significance in glioblastoma as well as novel candidates, including genes associated with other malignancies or glioma-related pathways. This analysis further detected several long non-coding RNAs with expression profiles similar to neighboring genes implicated in cancer. Quantitative PCR validation showed excellent agreement with Tag-seq data (median Pearson r = 0.91) and discerned a gene set robustly distinguishing GNS from NS cells across the 22 lines. These expression alterations include oncogene and tumor suppressor changes not detected by microarray profiling of tumor tissue samples, and facilitated the identification of a GNS expression signature strongly associated with patient survival (P = 1e-6, Cox model).
These results support the utility of GNS cell cultures as a model system for studying the molecular processes driving glioblastoma and the use of NS cells as reference controls. The association between a GNS expression signature and survival is consistent with the hypothesis that a cancer stem cell component drives tumor growth. We anticipate that analysis of normal and malignant stem cells will be an important complement to large-scale profiling of primary tumors.
Genomic aberrations can be used to determine cancer diagnosis and prognosis. Clinically relevant novel aberrations can be discovered using high-throughput assays such as Single Nucleotide Polymorphism (SNP) arrays and next-generation sequencing, which typically provide aggregate signals of many cells at once. However, heterogeneity of tumor subclones dramatically complicates the task of detecting aberrations.
The aggregate signal of a population of subclones can be described as a linear system of equations. We employed a measure of allelic imbalance and total amount of DNA to characterize each locus by the copy number status (gain, loss or neither) of the strongest subclonal component. We designed simulated data to compare our measure to existing approaches and we analyzed SNP-arrays from 30 melanoma samples and transcriptome sequencing (RNA-Seq) from one melanoma sample.
We showed that any system describing aggregate subclonal signals is underdetermined, leading to non-unique solutions for the exact copy number profile of subclones. For this reason, our illustrative measure was more robust than existing Hidden Markov Model (HMM) based tools in inferring the aberration status, as indicated by tests on simulated data. This higher robustness contributed to identifying numerous aberrations at several loci in melanoma samples. We validated the heterogeneity and aberration status within single biopsies by fluorescent in situ hybridization of four affected and transcriptionally up-regulated genes, E2F8, ETV4, EZH2 and FAM84B, in 11 melanoma cell lines. Heterogeneity was further demonstrated in the analysis of allelic imbalance changes along single exons from melanoma RNA-Seq.
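The non-uniqueness can be seen in a small numerical sketch. If the aggregate copy number at a locus is modeled as the fraction-weighted sum of subclonal copy numbers, several different subclonal compositions produce exactly the same aggregate value, while the coarse status (gain, loss or neither) is the same for all of them. The model below considers total copy number only and ignores allelic imbalance; the function names, the diploid baseline and the tolerance are assumptions for illustration and do not reproduce the measure used in the study.

```python
def aggregate_copy_number(fractions, copy_numbers):
    """Fraction-weighted mean copy number of a mixture of subclones."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("subclone fractions must sum to 1")
    return sum(f * c for f, c in zip(fractions, copy_numbers))

def status(aggregate, baseline=2.0, tol=0.1):
    """Coarse aberration status relative to an assumed diploid baseline."""
    if aggregate > baseline + tol:
        return "gain"
    if aggregate < baseline - tol:
        return "loss"
    return "neither"

# Three different (hypothetical) subclonal compositions at one locus.
mixtures = {
    "two clones, 50/50": ([0.5, 0.5], [2, 4]),
    "two clones, 75/25": ([0.75, 0.25], [2, 6]),
    "single clone": ([1.0], [3]),
}
for name, (fractions, copies) in mixtures.items():
    value = aggregate_copy_number(fractions, copies)
    print(name, value, status(value))
```

All three mixtures give an aggregate of 3.0, so the exact profile cannot be recovered from this signal alone, whereas the status "gain" is shared by every solution.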
These studies demonstrate how subclonal heterogeneity, prevalent in tumor samples, is reflected in aggregate signals measured by high-throughput techniques. Our proposed approach yields high robustness in detecting copy number alterations using high-throughput technologies and has the potential to identify specific subclonal markers from next-generation sequencing data.
copy number; SNP arrays; Next generation sequencing; melanoma
The joint sequencing of related genomes has become an important means to discover rare variants. Normal-tumor genome pairs are routinely sequenced together to find somatic mutations and their associations with different cancers. Parental and sibling genomes reveal de novo germline mutations and inheritance patterns related to Mendelian diseases.
Acute lymphoblastic leukemia (ALL) is the most common paediatric cancer and the leading cause of cancer-related death among children. With the aim of uncovering the full spectrum of germline and somatic genetic alterations in childhood ALL genomes, we conducted whole-exome re-sequencing on a unique cohort of over 120 exomes of childhood ALL quartets, each comprising a patient's tumor and matched-normal material, and DNA from both parents. We developed a general probabilistic model for such quartet sequencing reads mapped to the reference human genome. The model is used to infer joint genotypes at homologous loci across a normal-tumor genome pair and two parental genomes.
We describe the algorithms and data structures for genotype inference and model parameter training. We implemented the methods in an open-source software package (QUADGT) that uses the standard file formats of the 1000 Genomes Project. Our method's utility is illustrated on quartets from the ALL cohort.
Lung cancer is a leading cause of cancer related morbidity and mortality globally, and carries a dismal prognosis. Improved understanding of the biology of cancer is required to improve patient outcomes. Next-generation sequencing (NGS) is a powerful tool for whole genome characterisation, enabling comprehensive examination of somatic mutations that drive oncogenesis. Most NGS methods are based on polymerase chain reaction (PCR) amplification of platform-specific DNA fragment libraries, which are then sequenced. These techniques are well suited to high-throughput sequencing and are able to detect the full spectrum of genomic changes present in cancer. However, they require considerable investments in time, laboratory infrastructure, computational analysis and bioinformatic support. Next-generation sequencing has been applied to studies of the whole genome, exome, transcriptome and epigenome, and is changing the paradigm of lung cancer research and patient care. The results of this new technology will transform current knowledge of oncogenic pathways and provide molecular targets of use in the diagnosis and treatment of cancer. Somatic mutations in lung cancer have already been identified by NGS, and large scale genomic studies are underway. Personalised treatment strategies will improve care for those likely to benefit from available therapies, while sparing others the expense and morbidity of futile intervention. Organisational, computational and bioinformatic challenges of NGS are driving technological advances as well as raising ethical issues relating to informed consent and data release. Differentiation between driver and passenger mutations requires careful interpretation of sequencing data. Challenges in the interpretation of results arise from the types of specimens used for DNA extraction, sample processing techniques and tumour content. Tumour heterogeneity can reduce power to detect mutations implicated in oncogenesis. 
Next-generation sequencing will facilitate investigation of the biological and clinical implications of such variation. These techniques can now be applied to single cells and free circulating DNA, and possibly in the future to DNA obtained from body fluids and from subpopulations of tumour. As costs reduce, and speed and processing accuracy increase, NGS technology will become increasingly accessible to researchers and clinicians, with the ultimate goal of improving the care of patients with lung cancer.
High-throughput nucleotide sequencing; DNA sequence analysis; lung neoplasms; non-small cell lung carcinoma; small cell lung carcinoma
Significant tumor regressions have been observed in up to 70% of patients receiving adoptively transferred autologous melanoma-reactive tumor infiltrating lymphocytes (TIL) [1,2], and in pilot trials, 40% of treated patients experienced complete regressions of all measurable lesions for at least five years following treatment [3]. To evaluate the potential association between the ability of TIL to mediate durable regressions and their ability to recognize potent antigens that presumably include mutated gene products, a novel screening approach was developed that involved mining whole exome sequence data to identify the mutated proteins that were expressed in patient tumors. Candidate mutated T cell epitopes that were identified using an MHC binding algorithm [4] were then synthesized and evaluated for recognition by TIL. Using this approach, mutated antigens expressed on autologous tumor cells were identified as targets of three TIL that were associated with objective tumor regressions following adoptive transfer. This simplified approach, which avoids the need to generate and laboriously screen cDNA libraries from tumors, may represent a generally applicable method for identifying mutated T cell antigens expressed in melanoma as well as other tumor types.
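One step of such a screen, enumerating candidate peptides that span a missense mutation before they are passed to an MHC binding predictor, can be sketched as follows. The peptide length of nine residues, the function name and the toy protein sequence are assumptions for illustration; the binding prediction itself is not implemented here, and the study's actual procedure may differ.

```python
def mutant_peptides(protein, pos, alt, k=9):
    """Return every k-mer of the mutated protein that contains the mutated residue.

    protein: wild-type amino-acid sequence (one-letter codes)
    pos:     0-based index of the substituted residue
    alt:     mutant amino acid
    k:       peptide length (9 is an assumed, commonly used value)
    """
    if not 0 <= pos < len(protein):
        raise ValueError("mutation position lies outside the protein")
    mutated = protein[:pos] + alt + protein[pos + 1:]
    first_start = max(0, pos - k + 1)
    last_start = min(pos, len(mutated) - k)
    return [mutated[s:s + k] for s in range(first_start, last_start + 1)]

# Toy sequence (hypothetical, not a real protein): M at index 10 becomes A.
toy_protein = "ACDEFGHIKLMNPQRSTVWY"
peptides = mutant_peptides(toy_protein, pos=10, alt="A")
print(len(peptides), peptides[0], peptides[-1])
```

Each returned peptide would then be scored for predicted binding to the patient's MHC alleles, and only the top candidates synthesized for testing.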
The diagnosis and treatment of cancers, which rank among the leading causes of mortality in developed nations, presents substantial clinical challenges. The genetic and epigenetic heterogeneity of tumors can lead to differential response to therapy and gross disparities in patient outcomes, even for tumors originating from similar tissues. High-throughput DNA sequencing technologies hold promise to improve the diagnosis and treatment of cancers through efficient and economical profiling of complete tumor genomes, paving the way for approaches to personalized oncology that consider the unique genetic composition of the patient’s tumor. Here we present a novel method to leverage the information provided by cancer genome sequencing to match an individual tumor genome with commercial cell lines, which might be leveraged as clinical surrogates to inform prognosis or therapeutic strategy. We evaluate the method using a published lung cancer genome and genetic profiles of commercial cancer cell lines. The results support the general plausibility of this matching approach, thereby offering a first step in translational bioinformatics approaches to personalized oncology using established cancer cell lines.
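To make the matching idea concrete, the sketch below ranks cell lines by the Jaccard similarity between the set of mutated genes in a tumor and the set of mutated genes in each line. The choice of Jaccard similarity, the function names and the toy gene sets are assumptions for illustration; the abstract does not specify the similarity measure the method actually uses.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_cell_lines(tumor_genes, cell_lines):
    """Return (cell_line, similarity) pairs, most similar first."""
    scores = {name: jaccard(tumor_genes, genes) for name, genes in cell_lines.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical mutation profiles; the cell-line names are placeholders.
tumor_genes = {"KRAS", "TP53", "STK11"}
cell_lines = {
    "line_A": {"KRAS", "TP53", "CDKN2A"},
    "line_B": {"EGFR", "TP53"},
    "line_C": {"BRAF"},
}
print(rank_cell_lines(tumor_genes, cell_lines))
```

A set-overlap score treats every mutated gene equally; weighting known driver genes more heavily would be one possible refinement.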
Next generation sequencing (NGS) technologies have revolutionized cancer research, allowing the comprehensive study of cancer using high-throughput deep sequencing methodologies. These methods detect genomic alterations, nucleotide substitutions, insertions, deletions and copy number alterations. SOLiD (Sequencing by Oligonucleotide Ligation and Detection, Life Technologies) is a promising technology generating billions of 50 bp sequencing reads. This robust technique, successfully applied in gene identification, might be helpful in detecting novel genes associated with cancer initiation and progression using formalin fixed paraffin embedded (FFPE) tissue. This study’s aim was to compare the validity of whole exome sequencing of fresh-frozen vs. FFPE tumor tissue by normalization to normal prostatic FFPE tissue, obtained from the same patient. One primary fresh-frozen sample, the corresponding FFPE prostate cancer sample and matched adjacent normal prostatic tissue were subjected to exome sequencing. The sequenced reads were mapped and compared. Our study was the first to show comparable exome sequencing results between FFPE and corresponding fresh-frozen cancer tissues using SOLiD sequencing. A prior study has been conducted comparing the validity of sequencing of FFPE vs. fresh-frozen samples using other NGS platforms. Our validation further supports FFPE material as a reliable source of material for whole exome sequencing.
exome sequencing; SOLiD4; prostate cancer; next-generation sequencing
Molecular pathology of thymomas is poorly understood. Genomic aberrations are frequently identified in tumors but no extensive sequencing has been reported in thymomas. Here we present the first comprehensive view of a B3 thymoma at whole genome and transcriptome levels. A 55-year-old Caucasian female underwent complete resection of a stage IVA B3 thymoma. RNA and DNA were extracted from a snap frozen tumor sample with a fraction of cancer cells over 80%. We performed array comparative genomic hybridization using Agilent platform, transcriptome sequencing using HiSeq 2000 (Illumina) and whole genome sequencing using Complete Genomics Inc platform. Whole genome sequencing determined, in tumor and normal, the sequence of both alleles in more than 95% of the reference genome (NCBI Build 37). Copy number (CN) aberrations were comparable with those previously described for B3 thymomas, with CN gain of chromosome 1q, 5, 7 and X and CN loss of 3p, 6, 11q42.2-qter and q13. One translocation t(11;X) was identified by whole genome sequencing and confirmed by PCR and Sanger sequencing. Ten single nucleotide variations (SNVs) and 2 insertion/deletions (INDELs) were identified; these mutations resulted in non-synonymous amino acid changes or affected splicing sites. The lack of common cancer-associated mutations in this patient suggests that thymomas may evolve through mechanisms distinctive from other tumor types, and supports the rationale for additional high-throughput sequencing screens to better understand the somatic genetic architecture of thymoma.
Patients with prostate cancer may present with metastatic or recurrent disease despite initial curative treatment. The propensity of metastatic prostate cancer to spread to the bone has limited repeated sampling of tumor deposits. Hence, considerably less is understood about this lethal metastatic disease, as it is not commonly studied. Here we explored whole-genome sequencing of plasma DNA to scan the tumor genomes of these patients non-invasively.
We wanted to make whole-genome analysis from plasma DNA amenable to clinical routine applications and developed an approach based on a benchtop high-throughput platform, that is, Illumina's MiSeq instrument. We performed whole-genome sequencing from plasma at a shallow sequencing depth to establish a genome-wide copy number profile of the tumor at low cost within 2 days. In parallel, we sequenced, with high coverage, a panel of 55 high-interest genes and 38 introns with frequent fusion breakpoints, such as the TMPRSS2-ERG fusion. After intensive testing of our approach with samples from 25 individuals without cancer, we analyzed 13 plasma samples derived from five patients with castration resistant (CRPC) and four patients with castration sensitive prostate cancer (CSPC).
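One common way to derive a copy number profile from shallow sequencing is to compare normalized read counts per genomic bin against a panel of control samples. The sketch below computes a z-score per bin and labels bins that deviate strongly from the controls. Using control samples as the z-score reference, the threshold of 3, the function names and the toy values are all assumptions for illustration, not a description of the authors' implementation.

```python
import statistics

def bin_z_scores(sample, controls):
    """z-score of each bin in `sample` against the same bin across `controls`.

    sample:   list of normalized read counts, one per genomic bin
    controls: list of such lists, one per control individual
    """
    scores = []
    for i, value in enumerate(sample):
        reference = [control[i] for control in controls]
        mean = statistics.mean(reference)
        sd = statistics.stdev(reference)
        scores.append((value - mean) / sd if sd > 0 else 0.0)
    return scores

def label_bins(z_scores, threshold=3.0):
    """Label bins by z-score; the threshold is an assumed cut-off."""
    return ["gain" if z > threshold else "loss" if z < -threshold else "neutral"
            for z in z_scores]

# Toy data: three control individuals, three bins (hypothetical values).
controls = [[0.9, 0.9, 0.9], [1.0, 1.0, 1.0], [1.1, 1.1, 1.1]]
plasma_sample = [1.0, 1.5, 0.6]
z = bin_z_scores(plasma_sample, controls)
print(label_bins(z))
```

In practice, adjacent bins would be segmented before calling gains and losses, and counts would be corrected for GC content and mappability; both steps are omitted here.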
The genome-wide profiling in the plasma of our patients revealed multiple copy number aberrations including those previously reported in prostate tumors, such as losses in 8p and gains in 8q. High-level copy number gains in the AR locus were observed in patients with CRPC but not with CSPC disease. We identified the TMPRSS2-ERG rearrangement associated 3-Mbp deletion on chromosome 21 and found corresponding fusion plasma fragments in these cases. In an index case multiregional sequencing of the primary tumor identified different copy number changes in each sector, suggesting multifocal disease. Our plasma analyses of this index case, performed 13 years after resection of the primary tumor, revealed novel chromosomal rearrangements, which were stable in serial plasma analyses over a 9-month period, which is consistent with the presence of one metastatic clone.
The genomic landscape of prostate cancer can be established by non-invasive means from plasma DNA. Our approach provides specific genomic signatures within 2 days and may therefore serve as a 'liquid biopsy'.
Recurrent mutations affecting the histone H3.3 residues Lys27 or indirectly Lys36 are frequent drivers of pediatric high-grade gliomas (over 30 % of HGGs). To identify additional driver mutations in HGGs, we investigated a cohort of 60 pediatric HGGs using whole-exome sequencing (WES) and compared them to 543 exomes from non-cancer control samples. We identified mutations in SETD2, a H3K36 trimethyltransferase, in 15 % of pediatric HGGs, a result that was genome-wide significant (FDR = 0.029). Most SETD2 alterations were truncating mutations. Sequencing the gene in this cohort and another validation cohort (123 gliomas from all ages and grades) showed SETD2 mutations to be specific to high-grade tumors affecting 15 % of pediatric HGGs (11/73) and 8 % of adult HGGs (5/65) while no SETD2 mutations were identified in low-grade diffuse gliomas (0/45). Furthermore, SETD2 mutations were mutually exclusive with H3F3A mutations in HGGs (P = 0.0492) while they partly overlapped with IDH1 mutations (4/14), and SETD2-mutant tumors were found exclusively in the cerebral hemispheres (P = 0.0055). SETD2 is the only H3K36 trimethyltransferase in humans, and SETD2-mutant tumors showed a substantial decrease in H3K36me3 levels (P < 0.001), indicating that the mutations are loss-of-function. These data suggest that loss-of-function SETD2 mutations occur in older children and young adults and are specific to HGG of the cerebral cortex, similar to the H3.3 G34R/V and IDH mutations. Taken together, our results suggest that mutations disrupting the histone code at H3K36, including H3.3 G34R/V, IDH1 and/or SETD2 mutations, are central to the genesis of hemispheric HGGs in older children and young adults.
Electronic supplementary material
The online version of this article (doi:10.1007/s00401-013-1095-8) contains supplementary material, which is available to authorized users.
High-grade glioma; H3K36 methylation; SETD2; Epigenetic; Pediatric; Young adult
Background. Next-generation sequencing of cancers has identified important therapeutic targets and biomarkers. The goal of this pilot study was to compare the genetic changes in a human papillomavirus- (HPV-)positive and an HPV-negative head and neck tumor.
Methods. DNA was extracted from the blood and primary tumor of a patient with an HPV-positive tonsillar cancer and those of a patient with an HPV-negative oral tongue tumor. Exome enrichment was performed using the Agilent SureSelect All Exon Kit, followed by sequencing on the ABI SOLiD platform.
Results. Exome sequencing revealed slightly more mutations in the HPV-negative tumor (73) than in the HPV-positive tumor (58). Multiple mutations were noted in zinc finger genes (ZNF3, 10, 229, 470, 543, 616, 664, 638, 716, and 799) and mucin genes (MUC4, 6, 12, and 16). Mutations were noted in MUC12 in both tumors.
Conclusions. HPV-positive HNSCC is distinct from HPV-negative disease in terms of evidence of viral infection, p16 status, and frequency of mutations. Next-generation sequencing has the potential to identify novel therapeutic targets and biomarkers in HNSCC.
Motivation: With the advent of relatively affordable high-throughput technologies, DNA sequencing of cancers is now common practice in cancer research projects and will be increasingly used in clinical practice to inform diagnosis and treatment. Somatic (cancer-only) single nucleotide variants (SNVs) are the simplest class of mutation, yet their identification in DNA sequencing data is confounded by germline polymorphisms, tumour heterogeneity and sequencing and analysis errors. Four recently published algorithms for the detection of somatic SNV sites in matched cancer–normal sequencing datasets are VarScan, SomaticSniper, JointSNVMix and Strelka. In this analysis, we apply these four SNV calling algorithms to cancer–normal Illumina exome sequencing of a chronic myeloid leukaemia (CML) patient. The candidate SNV sites returned by each algorithm are filtered to remove likely false positives, then characterized and compared to investigate the strengths and weaknesses of each SNV calling algorithm.
Results: Comparing the candidate SNV sets returned by VarScan, SomaticSniper, JointSNVMix2 and Strelka revealed substantial differences with respect to the number and character of sites returned; the somatic probability scores assigned to the same sites; their susceptibility to various sources of noise; and their sensitivities to low-allelic-fraction candidates.
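Comparing the candidate sets returned by several callers comes down to set operations on site identifiers. The sketch below is not the authors' code (which is referenced under Availability); it assumes each site is keyed as a (chromosome, position, ref, alt) tuple, and every site listed is invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical candidate sites, keyed as (chromosome, position, ref, alt).
calls = {
    "VarScan": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C"), ("chr3", 300, "C", "T")},
    "SomaticSniper": {("chr1", 100, "A", "T"), ("chr4", 400, "T", "G")},
    "JointSNVMix2": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "Strelka": {("chr1", 100, "A", "T"), ("chr3", 300, "C", "T")},
}


def support_counts(call_sets):
    """Map each site to the number of callers that reported it."""
    return Counter(site for sites in call_sets.values() for site in sites)


def jaccard(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


support = support_counts(calls)
# Sites every caller agrees on, and sites reported by exactly one caller.
consensus = {site for site, n in support.items() if n == len(calls)}
private = {site for site, n in support.items() if n == 1}
# Pairwise agreement between callers.
pairwise = {(x, y): jaccard(calls[x], calls[y]) for x, y in combinations(calls, 2)}

for pair, score in pairwise.items():
    print(pair, round(score, 2))
```

Sites private to one caller are natural candidates for closer inspection, since they may reflect either a caller-specific sensitivity (for example to low-allelic-fraction variants) or a caller-specific source of noise.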
Availability: Data accession number SRA081939, code at http://code.google.com/p/snv-caller-review/
Supplementary data are available at Bioinformatics online.
Fibroblast growth factor receptors (FGFRs) play diverse roles in control of cell proliferation, cell differentiation, angiogenesis, and development. Activating mutations of FGFRs in the germline have long been known to cause a variety of skeletal developmental disorders, but it is only recently that a similar spectrum of somatic FGFR mutations has been associated with human cancers. Many of these somatic mutations are gain-of-function and oncogenic and create dependencies in tumor cell lines harboring such mutations. A combination of knock-down studies and pharmaceutical inhibition in preclinical models has further substantiated genomically-altered FGFR as a therapeutic target in cancer, and the oncology community is responding with clinical trials evaluating multi-kinase inhibitors with anti-FGFR activity and a new generation of specific pan-FGFR inhibitors.
FGFR; tyrosine kinase; somatic mutation; targeted therapy
Recent high-throughput genomic sequencing studies of solid tumors, including head and neck squamous cell carcinoma (SCC), ovarian cancer, lung adenocarcinoma, glioblastoma, breast cancer and lung SCC, have highlighted DNA mutation as a mechanism for aberrant Notch signaling. A primary challenge of targeting Notch for treatment of solid malignancies is determining whether Notch signaling is cancer-promoting or tumor-suppressing for a specific cancer. We compiled reported Notch receptor and ligand missense and nonsense mutations in order to glean insights into aberrant Notch signaling.
Frequencies of coding mutations differed for the four Notch genes. NOTCH1 missense or nonsense mutations were found in 4.7% of tumors. NOTCH2 and NOTCH3 had similar overall mutation rates of 1.5% and 1.3%, respectively, while NOTCH4 mutations were rarer. Notch ligand genes were rarely mutated.
The combined mutation frequency and position spectra of the four Notch paralogs across the different cancers provide an opportunity to begin to illuminate the different contributions of each Notch paralog to each tumor type and to identify opportunities for therapeutic targeting. Notch signaling pathway activators and inhibitors are currently in early clinical development for treatment of solid malignancies. Defining the status and consequences of altered Notch signaling will be important for selection of appropriate treatment.
A 44-year-old woman with recurrent solitary fibrous tumor (SFT)/hemangiopericytoma was enrolled in a clinical sequencing program including whole exome and transcriptome sequencing. A gene fusion of the transcriptional repressor NAB2 with the transcriptional activator STAT6 was detected. Transcriptome sequencing of 27 additional SFTs all revealed the presence of a NAB2-STAT6 gene fusion. Using RT-PCR and sequencing, we detected this fusion in 51 of 51 SFTs, indicating high levels of recurrence. Expression of NAB2-STAT6 fusion proteins was confirmed in SFT, and the predicted fusion products harbor the early growth response (EGR)-binding domain of NAB2 fused to the activation domain of STAT6. Overexpression of the NAB2-STAT6 gene fusion induced proliferation in cultured cells and activated EGR-responsive genes. These studies establish NAB2-STAT6 as the defining driver mutation of SFT and provide an example of how neoplasia can be initiated by converting a transcriptional repressor of mitogenic pathways into a transcriptional activator.
Gene fusions created by somatic genomic rearrangements are known to play an important role in the onset and development of some cancers, such as lymphomas and sarcomas. RNA-Seq (whole transcriptome shotgun sequencing) is proving to be a useful tool for the discovery of novel gene fusions in cancer transcriptomes. However, algorithmic methods for the discovery of gene fusions using RNA-Seq data remain underdeveloped. We have developed deFuse, a novel computational method for fusion discovery in tumor RNA-Seq data. Unlike existing methods that use only unique best-hit alignments and consider only fusion boundaries at the ends of known exons, deFuse considers all alignments and all possible locations for fusion boundaries. As a result, deFuse is able to identify fusion sequences with demonstrably better sensitivity than previous approaches. To increase the specificity of our approach, we curated a list of 60 true positive and 61 true negative fusion sequences (as confirmed by RT-PCR), and have trained an AdaBoost classifier on 11 novel features of the sequence data. The resulting classifier has an estimated value of 0.91 for the area under the ROC curve. We have used deFuse to discover gene fusions in 40 ovarian tumor samples, one ovarian cancer cell line, and three sarcoma samples. We report herein the first gene fusions discovered in ovarian cancer. We conclude that gene fusions are not infrequent events in ovarian cancer and that these events have the potential to substantially alter the expression patterns of the genes involved; gene fusions should therefore be considered in efforts to comprehensively characterize the mutational profiles of ovarian cancer transcriptomes.
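The area under the ROC curve quoted above can be read as the probability that a randomly chosen true fusion receives a higher classifier score than a randomly chosen false one. The sketch below computes that quantity directly from labels and scores. It does not reproduce deFuse's classifier or its 11 features; the function name `roc_auc`, the labels, and the scores are all hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a validated (true positive) fusion, 0 for a true negative.
    scores: classifier outputs, higher meaning "more likely a real fusion".
    Counts the fraction of (positive, negative) pairs in which the positive
    is scored higher; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of examples, which is acceptable for a curated set of about a hundred sequences. For larger sets a rank-based formulation would be the usual choice.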
Genome rearrangements and associated gene fusions are known to be important oncogenic events in some cancers. We have developed a novel computational method called deFuse for detecting gene fusions in RNA-Seq data and have applied it to the discovery of novel gene fusions in sarcoma and ovarian tumors. We assessed the accuracy of our method and found that deFuse produces substantially better sensitivity and specificity than two other published methods. We have also developed a set of 60 positive and 61 negative examples that will be useful for accurate identification of gene fusions in future RNA-Seq datasets. We have trained a classifier on 11 novel features of the 121 examples, and show that the classifier is able to accurately identify real gene fusions. The 45 gene fusions reported in this study represent the first ovarian cancer fusions reported, as well as novel sarcoma fusions. By examining the expression patterns of the affected genes, we find that many fusions are predicted to have functional consequences and thus merit experimental follow-up to determine their clinical relevance.