Follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL) are the two most common non-Hodgkin lymphomas (NHLs). To identify genes with mutations in B-cell NHL we sequenced tumour and matched normal DNA from 13 DLBCL cases and one FL case. We analysed RNA-seq data from these and another 113 NHLs to identify genes with candidate mutations, and then re-sequenced tumour and matched normal DNA from these cases to confirm 109 genes with multiple somatic mutations. Genes with roles in histone modification were frequent targets of somatic mutation. For example, 32% of DLBCL and 89% of FL cases had somatic mutations in MLL2, which encodes a histone methyltransferase. 11.4% of DLBCL and 13.4% of FL cases had somatic mutations in MEF2B, a calcium-regulated gene that cooperates with CREBBP and EP300 in acetylating histones. Our analysis thus suggests a previously unappreciated disruption of chromatin biology in lymphomagenesis.
Non-Hodgkin lymphomas (NHLs) are cancers of B, T or natural killer lymphocytes. The two most common types of NHL, follicular lymphoma (FL) and diffuse large B-cell lymphoma (DLBCL), together comprise 60% of new B-cell NHL diagnoses each year in North America1. FL is an indolent and typically incurable disease characterized by clinical and genetic heterogeneity. DLBCL is aggressive and likewise heterogeneous, comprising at least two distinct subtypes that respond differently to standard treatments. Both FL and the germinal centre B-cell (GCB) cell of origin (COO) subtype of DLBCL derive from germinal centre B cells whereas the activated B-cell (ABC) variety, which exhibits a more aggressive clinical course, is thought to originate from B cells that have exited, or are poised to exit, the germinal centre2. Current knowledge of the specific genetic events leading to DLBCL and FL is limited to the presence of a few recurrent genetic abnormalities2. For example, 85-90% of FL and 30-40% of GCB DLBCL cases3,4 harbour t(14;18)(q32;q21), which results in deregulated expression of the BCL2 oncoprotein. Other genetic abnormalities unique to GCB DLBCL include amplification of the c-REL gene and of the miR-17-92 microRNA cluster5. In contrast to GCB cases, 24% of ABC DLBCLs harbour structural alterations or inactivating mutations affecting PRDM1, which is involved in differentiation of GCB cells into antibody-secreting plasma cells6. ABC-specific mutations also affect genes regulating NF-κB signalling7,8,9, with TNFAIP3 (A20) and MYD8810 the most abundantly mutated in 24% and 39% of cases respectively. To enhance our understanding of the genetic architecture of B-cell NHL, we undertook a study to (1) identify somatic mutations and (2) determine the prevalence, expression and focal recurrence of mutations in FL and DLBCL. 
Using strategies and techniques applied to cancer genome and transcriptome characterization by ourselves and others11,12,13, we sequenced tumour DNA and/or RNA from 117 tumour samples and 10 cell lines (Supplementary Tables S1 and S2) and identified 651 genes (Supplementary Figure S1) with evidence of somatic mutation in B-cell NHL. After validation, we showed that 109 genes were somatically mutated in 2 or more NHL cases. We further characterised the frequency and nature of mutations within MLL2 and MEF2B, which were among the most frequently mutated genes with no previously known role in lymphoma.
We sequenced the genomes or exomes of 14 NHL cases, all with matched constitutional DNA sequenced to comparable depths (Supplementary Tables S1 and S2). After screening for single nucleotide variants followed by subtraction of known polymorphisms and visual inspection of the sequence read alignments, we identified 717 nonsynonymous coding single nucleotide variants (cSNVs) affecting 651 genes (Supplementary Figure S1; Methods). We identified between 20 and 135 cSNVs in each of these genomes. Only 25 of the 651 genes with cSNVs were represented in the cancer gene census (December, 2010 release)14.
We performed RNA sequencing (RNA-seq) on these 14 NHL cases and an expanded set of 113 samples comprising 83 DLBCL, 12 FL and 8 B-cell NHL cases with other histologies and 10 DLBCL-derived cell lines (Supplementary Table S2). We analysed these data to identify novel fusion transcripts (Supplementary Table S3) and cSNVs (Figure 1). We identified 240 genes with at least one cSNV in a genome/exome or an RNA-seq “mutation hot spot” (below), and with cSNVs in at least three cases in total (Supplementary Table S4). We selected cSNVs from each of these 240 genes for re-sequencing to confirm their somatic status. We did not re-sequence genes with previously documented mutations in lymphoma (e.g. CD79B, BCL2). We confirmed the somatic status of 543 cSNVs in 317 genes, with 109 genes having at least two confirmed somatic mutations (Supplementary Table S4 and S5). Of the successfully re-sequenced cSNVs predicted from the genomes, 171 (94.5%) were confirmed somatic, 7 were false calls and 3 were present in the germ line. These 109 recurrently mutated genes were significantly enriched for genes implicated in lymphocyte activation (P = 8.3×10⁻⁴; e.g. STAT6, BCL10), lymphocyte differentiation (P = 3.5×10⁻³; e.g. CARD11), and regulation of apoptosis (P = 1.9×10⁻³; e.g. BTG1, BTG2). Also significantly enriched were genes linked to transcriptional regulation (P = 5.4×10⁻⁴; e.g. TP53) and genes involved in methylation (P = 2.2×10⁻⁴) and acetylation (P = 1.2×10⁻²), including histone methyltransferase (HMT) and acetyltransferase (HAT) enzymes known previously to be mutated in lymphoma (e.g. EZH213 and CREBBP15; Methods).
Mutation hot spots can result from mutations at sites under strong selective pressure and we have previously identified such sites using RNA-seq data13. We searched our RNA-seq data for genes with mutation hot spots, and identified 10 genes that were not mutated in the 14 genomes (PIM1, FOXO1, CCND3, TP53, IRF4, BTG2, CD79B, BCL7A, IKZF3 and B2M), of which five (FOXO1, CCND3, BTG2, IKZF3 and B2M) were not previously known targets of point mutation in NHL (Supplementary Table S6; Methods). FOXO1, BCL7A and B2M exhibited hot spots affecting their start codons. The effect of a FOXO1 start codon mutation, which was observed in three cases, was further studied using a cell line in which the initiating ATG was mutated to TTG. Western blots probed with a FOXO1 antibody revealed a band with a reduced molecular weight, indicative of a FOXO1 N-terminal truncation (Supplementary Figure S2) consistent with utilization of the next in-frame ATG for translation initiation. A second hot spot in FOXO1 at T24 was mutated in two cases. T24 is reportedly phosphorylated by AKT subsequent to B-cell receptor (BCR) stimulation16 inducing FOXO1 nuclear export.
We analysed the RNA-seq data to determine whether any of the somatic mutations in the 109 recurrently mutated genes showed evidence for allelic imbalance with expression favouring one allele. Of 380 expressed heterozygous mutant alleles, we observed preferential expression of the mutation for 16.8% (64/380) and preferential expression of the wild-type for 27.8% (106/380; Supplementary Table S7). Seven genes (BCL2, CARD11, CD79B, EZH2, IRF4, MEF2B and TP53) displayed evidence for significant preferential expression of the mutant allele in at least two cases (Methods). In 27 of 43 cases with BCL2 cSNVs, expression favoured the mutant allele, consistent with the previously described hypothesis that the translocated (and hence, transcriptionally deregulated) allele of BCL2 is targeted by somatic hypermutation17. Mutations at known oncogenic hot spot sites, such as F123I in CARD1118, exhibited allelic imbalance favouring the mutant allele in some cases. Similarly, we noted expression favouring two novel hot spot mutations in MEF2B (Y69 and D83) and two sites in EZH2 not previously reported as mutated in lymphoma (A682G and A692V).
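As an illustration of how allelic imbalance of this kind can be assessed, the following Python sketch classifies a heterozygous site from its RNA-seq read counts. It is a minimal example under stated assumptions, not the procedure described in the Methods: it assumes an exact two-sided binomial test against an expected 50:50 allele ratio, and the function names, the 0.05 threshold and the read counts in the usage note are hypothetical.

```python
from math import comb


def binomial_two_sided_p(mutant_reads: int, total_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely than the
    observed mutant read count under the null allele fraction p_null.
    """
    probs = [
        comb(total_reads, k) * p_null**k * (1 - p_null) ** (total_reads - k)
        for k in range(total_reads + 1)
    ]
    observed = probs[mutant_reads]
    # Small relative tolerance so ties with the observed probability are included.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))


def classify_imbalance(mutant_reads: int, wildtype_reads: int, alpha: float = 0.05) -> str:
    """Label a site as balanced or as favouring the mutant or wild-type allele.

    Assumption: alpha = 0.05 without multiple-testing correction, for illustration only.
    """
    total = mutant_reads + wildtype_reads
    p_value = binomial_two_sided_p(mutant_reads, total)
    if p_value >= alpha:
        return "balanced"
    return "mutant-favoured" if mutant_reads > wildtype_reads else "wild-type-favoured"
```

For example, `classify_imbalance(45, 5)` labels a site with 45 mutant and 5 wild-type reads as mutant-favoured, whereas `classify_imbalance(10, 10)` is reported as balanced. A real analysis would also need to correct for the number of sites tested and for reference-allele mapping bias.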
We sought to distinguish new cancer-related mutations from passenger mutations using the approach proposed by Greenman et al19. We reasoned that this would reveal genes with strong selection signatures, and mutations in such genes would be good candidate cancer drivers. We identified 26 genes with significant evidence for positive selection (FDR 0.03, Methods), with either selective pressure for acquiring non-synonymous point mutations or truncating/nonsense mutations (Methods; Table 1; Supplementary Table S8). Included were known lymphoma oncogenes (BCL2, CD79B9, CARD1118, MYD8810 and EZH213), all of which exhibited signatures indicative of selection for non-synonymous variants.
We expected tumour suppressor genes to exhibit strong selection for the acquisition of nonsense mutations. In our analysis, the eight most significant genes included seven with strong selective pressure for nonsense mutations, including the known tumour suppressor genes TP53 and TNFRSF1420 (Table 1). CREBBP, recently reported as commonly inactivated in DLBCL15, also showed some evidence for acquisition of nonsense mutations and cSNVs (Supplementary Figure S3; Supplementary Table S9). We also observed enrichment for nonsense mutations in BCL10, a positive regulator of NF-κB, in which oncogenic truncated products have been described in lymphomas21. The remaining strongly significant genes (BTG1, GNA13, SGK1 and MLL2) had no reported role in lymphoma. GNA13 was affected by mutations in 22 cases including multiple nonsense mutations. GNA13 encodes the alpha subunit of a heterotrimeric G-protein coupled receptor responsible for modulating RhoA activity22. Some of the mutated residues negatively impact its function23,24, including a T203A mutation, which also exhibited allelic imbalance favouring the mutant allele (Supplementary Table S7). GNA13 protein was reduced or absent on Western blots in cell lines harbouring either a nonsense mutation, a stop codon deletion, a frame shifting deletion, or changes affecting splice sites (Methods; Supplementary Figure S4).
SGK1 encodes a PI3K-regulated kinase with functions including regulation of FOXO transcription factors25, regulation of NF-κB by phosphorylating IkB kinase26, and negative regulation of NOTCH signalling27. SGK1 also resides within a region of chromosome 6 commonly deleted in DLBCL (Figure 1)5. The mechanism by which SGK1 and GNA13 inactivation may contribute to lymphoma is unclear but the strong degree of apparent selection towards their inactivation and their overall high mutation frequency (each mutated in 18 of 106 DLBCL cases) suggests that their loss contributes to B-cell NHL. Certain genes are known to be mutated more commonly in GCB DLBCLs (e.g. TP5328 and EZH213). Here, both SGK1 and GNA13 mutations were found only in GCB cases (P = 1.93×10⁻³ and 2.28×10⁻⁴, Fisher exact test; n=15 and 18, respectively) (Figure 2). Two additional genes (MEF2B and TNFRSF14) with no previously described role in DLBCL showed a similar restriction to GCB cases (Figure 2).
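The subtype restriction reported here rests on a Fisher exact test applied to a 2×2 table of mutation status against cell-of-origin subtype. The Python sketch below shows how such a two-sided test can be computed from the hypergeometric distribution using only the standard library. The function name is ours, and the counts in the usage note are hypothetical: the numbers of GCB and ABC cases tested are not given in this section, so the sketch does not reproduce the reported P values.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    With the row and column totals held fixed, sums the hypergeometric
    probabilities of all tables that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(x: int) -> float:
        # Probability that the top-left cell equals x given the fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = table_probability(a)
    total = sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-9)  # tolerance keeps exact ties
    )
    return min(1.0, total)
```

As a usage example with hypothetical counts, a table with 15 mutated and 30 unmutated GCB cases against 0 mutated and 30 unmutated ABC cases would be tested with `fisher_exact_two_sided(15, 30, 0, 30)`. In practice an established implementation such as `scipy.stats.fisher_exact` would normally be preferred.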
MLL2 showed the most significant evidence for selection and the largest number of nonsense SNVs. Our RNA-seq analysis indicated that 26.0% (33/127) of cases carried at least one MLL2 cSNV. To address the possibility that variable RNA-seq coverage of MLL2 failed to capture some mutations, we PCR amplified the entire MLL2 locus (~36kb) in 89 cases (35 primary FLs, 17 DLBCL cell lines, and 37 DLBCLs). 58 of these cases were among the RNA-seq cohort. Illumina amplicon resequencing (Methods) revealed 78 mutations, confirming the RNA-seq mutations in the overlapping cases and identifying 33 additional mutations. We confirmed the somatic status of 46 variants using Sanger sequencing (Supplementary Table S10), and showed that 20 of the 33 additional mutations were insertions or deletions (indels). Three SNVs at splice sites were also detected, as were 10 new cSNVs that had not been detected by RNA-seq.
The somatic mutations were distributed across MLL2 (Figure 3A). 37% (n=29/78) of these were nonsense mutations, 46% (n=36/78) were indels that altered the reading frame, 8% (n=6/78) were point mutations at splice sites and 9% (n=7/78) were non-synonymous amino acid substitutions (Table 2). Four of the somatic splice site mutations had effects on MLL2 transcript length and structure. For example, two heterozygous splice site mutations resulted in the use of a novel splice donor site and an intron retention event.
Approximately half of the NHL cases we sequenced had two MLL2 mutations (Supplementary Table S10). We used BAC clone sequencing in eight FL cases to show that in all eight cases the mutations were in trans, affecting both MLL2 alleles. This observation is consistent with the notion that there is a complete, or near-complete, loss of MLL2 in the tumour cells of such patients.
With the exception of two primary FL cases and two DLBCL cell lines (Pfeiffer and SU-DHL-9), the majority of MLL2 mutations appeared to be heterozygous. Analysis of Affymetrix 500k SNP array data from two FL cases with apparent homozygous mutations revealed that both tumours exhibited copy number neutral loss of heterozygosity (LOH) for the region of chromosome 12 containing MLL2 (Methods). Thus, in addition to bi-allelic mutation, LOH is a second, albeit less common mechanism by which MLL2 function is lost.
MLL2 was the most frequently mutated gene in FL, and among the most frequently mutated genes in DLBCL (Figure 2). We confirmed MLL2 mutations in 31 of 35 FL patients (89%), in 12 of 37 DLBCL patients (32%), in 10 of 17 DLBCL cell lines (59%) and in none of the eight normal centroblast samples we sequenced. Our analysis predicted that the majority of the somatic mutations observed in MLL2 were inactivating (91% disrupted the reading frame or were truncating point mutations), suggesting to us that MLL2 is a tumour suppressor of significance in NHL.
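The percentages quoted for MLL2 follow directly from the counts reported in the text, and the short Python check below makes that arithmetic explicit. The variable names are ours, and the grouping of splice-site changes with the inactivating classes is an assumption made here because it is the grouping that yields the 91% figure.

```python
# Counts reported in the text for the 78 MLL2 mutations found by amplicon resequencing.
mll2_mutations = {
    "nonsense": 29,
    "frameshift indel": 36,
    "splice site": 6,
    "amino acid substitution": 7,
}


def percent(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number, as quoted in the text."""
    return round(100 * part / whole)


total_mutations = sum(mll2_mutations.values())
breakdown = {kind: percent(n, total_mutations) for kind, n in mll2_mutations.items()}

# Assumption: every class except amino acid substitutions is treated as inactivating.
inactivating = sum(
    n for kind, n in mll2_mutations.items() if kind != "amino acid substitution"
)
inactivating_percent = percent(inactivating, total_mutations)

# Prevalence of confirmed MLL2 mutations by sample type (mutated cases, cases tested).
prevalence = {
    "FL patients": percent(31, 35),
    "DLBCL patients": percent(12, 37),
    "DLBCL cell lines": percent(10, 17),
}
```

Under this grouping, 71 of the 78 mutations are counted as inactivating.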
Our selective pressure analysis also revealed genes with stronger pressure for acquisition of amino acid substitutions than for nonsense mutations. One such gene was MEF2B, which had not previously been linked to lymphoma. 20 (15.7%) cases had MEF2B cSNVs and 4 (3.1%) cases had MEF2C cSNVs. All cSNVs detected by RNA-seq affected either the MADS box or MEF2 domains. To determine the frequency and scope of MEF2B mutations, we Sanger-sequenced exons 2 and 3 in 261 primary FL samples; 259 DLBCL primary tumours; 17 cell lines; 35 cases of assorted NHL (IBL, composite FL and PBMCL); and eight non-malignant centroblast samples. We also used a capture strategy (Methods) to sequence the entire MEF2B coding region in the 261 FL samples, revealing six additional variants outside exons 2 and 3. We thus identified 69 cases (34 DLBCL; 11.4% and 35 FL; 13.4%) with MEF2B cSNVs or indels, failing to observe novel variants in other NHL and non-malignant samples. 55 (80%) of the variants affected residues within the MADS box and MEF2 domains encoded by exons 2 and 3 (Supplementary Table S11; Figure 3B). Each patient generally had a single MEF2B variant and we observed relatively few (8 total, 10.7%) truncation-inducing SNVs or indels. Non-synonymous SNVs were by far the most common type of change observed, with 59.4% of detected variants affecting K4, Y69, N81 or D83. In 12 cases MEF2B mutations were shown to be somatic, including representative mutations at each of K4, Y69, N81 and D83 (Supplementary Table S12). We did not detect mutations in ABC cases, indicating that somatic mutations in MEF2B play a role unique to the development of GCB DLBCL and FL (Figure 2).
In our study of genome, transcriptome and exome sequences from 127 B-cell NHL cases, we identified 109 genes with clear evidence of somatic mutation in multiple individuals. Significant selection appears to act on at least 26 of these for the acquisition of either nonsense or missense mutations. To the best of our knowledge, the majority of these genes had not previously been associated with any cancer type. We observed an enrichment of somatic mutations affecting genes involved in transcriptional regulation and, more specifically, chromatin modification.
MLL2 emerged from our analysis as a major tumour suppressor locus in NHL. It is one of six human H3K4-specific methyltransferases in the MLL family, all of which share homology with the Drosophila trithorax gene29. Trimethylated H3K4 (H3K4me3) is an epigenetic mark associated with the promoters of actively transcribed genes. By laying down this mark, MLLs are responsible for the transcriptional regulation of developmental genes including the homeobox (Hox) gene family30 which collectively control segment specificity and cell fate in the developing embryo31,32. Each MLL family member is thought to target different subsets of Hox genes33 and in addition, MLL2 is known to regulate the transcription of a diverse set of genes34. Recently, MLL2 mutations were reported in a small-cell lung cancer cell line35 and in renal carcinoma36 but the frequency of nonsense mutations affecting MLL2 in these cancers was not established in these reports. Parsons and colleagues recently reported inactivating mutations in MLL2 or MLL3 in 16% of medulloblastoma patients37 further implicating MLL2 as a cancer gene.
Our data link MLL2 somatic mutations to B-cell NHL. The reported mutations are likely to be inactivating and in eight of the cases with multiple mutations, we confirmed that both alleles were affected, presumably resulting in essentially complete loss of MLL2 function. The high prevalence of MLL2 mutations in FL (89%) equals the frequency of the t(14;18)(q32;q21) translocation, which is considered the most prevalent genetic abnormality in FL3. In DLBCL tumour samples and cell lines, MLL2 mutation frequencies were 32% and 59% respectively, also exceeding the prevalence of the most frequent cytogenetic abnormalities, such as the various translocations involving 3q27, which occur in 25-30% of DLBCLs and are enriched in ABC cases38. Importantly, we found MLL2 mutated in both DLBCL subtypes (Figure 2). Our analyses thus indicate that MLL2 acts as a central tumour suppressor in FL and both DLBCL subtypes.
The MEF2 gene family encodes four related transcription factors that recruit histone-modifying enzymes including histone deacetylases (HDACs) and HATs in a calcium-regulated manner. Although truncating variants were detected in our analysis of MEF2 gene family members, our analysis suggests that, in contrast to MLL2, MEF2 family members tend to selectively acquire non-synonymous amino acid substitutions. In the case of MEF2B, 59.4% of all the cSNVs were found at four sites within the protein (K4, Y69, N81 and D83), and all four of these sites were confirmed to be targets of somatic mutation. 39% of the MEF2B alterations affect D83, resulting in replacement of the charged aspartate with any of alanine, glycine or valine. Although we cannot yet predict the consequences of these substitutions on protein function, it seems likely that their effect would impact the ability of MEF2B to facilitate gene expression and thus play a role in promoting the malignant transformation of germinal centre B cells to lymphoma (Supplementary Discussion).
MEF2B mutations can be linked to CREBBP and EP300 mutations, and to recurrent Y641 mutations in EZH213. One target of CREBBP/EP300 HAT activity is H3K27, which is methylated by EZH2 to repress transcription. There is evidence that the action of EZH2 antagonizes that of CREBBP/EP30039. One function of MEF2 is to recruit either HDACs or CREBBP/EP300 to target genes40, and it has been suggested that HDACs compete with CREBBP/EP300 for the same binding site on MEF241. Under normal Ca2+ levels, MEF2 is bound by type IIa HDACs, which maintain the tails of histone proteins in a deacetylated repressive chromatin state42. Increased cytoplasmic Ca2+ levels induce the nuclear export of HDACs, enabling the recruitment of HATs such as CREBBP/EP300, facilitating transcription at MEF2 target genes. Mutation of CREBBP, EP300 or MEF2B may impact expression of MEF2 target genes owing to reduced acetylation of nucleosomes near these genes (Supplementary Figure S5; Supplementary Discussion). In light of the recent finding that heterozygous EZH2 Y641 mutations enhance overall H3K27 trimethylation activity of PRC243,44, it is possible that mutation of both MLL2 and EZH2 could cooperate in reducing the expression of some of the same target genes. Our data imply that (1) post-translational modification of histones is of key importance in germinal centre B cells and (2) deregulated histone modification due to these mutations likely results in reduced acetylation and enhanced methylation and acts as a core driver event in the development of NHL (Supplementary Figure S5).
All samples analysed contained at least 50% tumour cells. Genomes, exomes and transcriptomes were sequenced using a combination of Illumina GAIIx and HiSeq 2000 instruments to read lengths of between 36 and 100 nucleotides. Exome capture was performed using the Agilent SureSelect Target Enrichment System Protocol (Version 1.0, September 2009). Alignment was accomplished using BWA45 and variants were identified using SNVmix46. Variants were manually reviewed in IGV and were confirmed (where applicable) by PCR followed by either Sanger sequencing or Illumina re-sequencing. Structural rearrangements in genomes and transcriptomes were identified using ABySS47. Gene expression values used for subtype assignment were calculated as RPKM values48 and subtypes were assigned using an adaptation of the method developed for data from Affymetrix expression arrays49 trained with samples previously classified by this standard approach.
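For readers unfamiliar with the RPKM measure used for subtype assignment, the Python sketch below implements the standard definition: reads per kilobase of transcript per million mapped reads. The function and parameter names are illustrative, the numbers in the usage note are hypothetical, and the sketch does not cover how reads were assigned to genes in this study.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    Equivalent to read_count / (gene_length_bp / 1e3) / (total_mapped_reads / 1e6).
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)
```

For a hypothetical gene of 2,000 bp with 1,000 mapped reads in a library of 50 million mapped reads, `rpkm(1000, 2000, 50_000_000)` gives an RPKM of 10. Normalizing by both gene length and library size makes values comparable across genes and across samples sequenced to different depths.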
This study was funded in part by the National Cancer Institute Office of Cancer Genomics (Contract No. HHSN261200800001E), the Terry Fox Foundation (grant #019001, Biology of Cancer: Insights from Genomic Analyses of Lymphoid Neoplasms), Genome Canada/Genome BC Grant Competition III (Project Title: High Resolution Analysis of Follicular Lymphoma Genomes) to J.M.C., R.D.G. and M.A.M. We acknowledge support from NIH grants P50CA130805-01 “SPORE in Lymphoma, Tissue Resource Core (PI Fisher)” and 1U01CA114778 “Molecular Signatures to Improve Diagnosis and Outcome in Lymphoma (PI Chan)”. A.J.M. is a Career Development Program Fellow of the Leukemia and Lymphoma Society. N.A.J. was a research fellow of the Terry Fox Foundation (award NCIC 019005) and the Michael Smith Foundation for Health Research (ST-PDF-01793). M.A.M. is a Terry Fox Young Investigator and a Michael Smith Senior Research Scholar. R.D.M. is a Vanier Scholar (CIHR) and holds a MSFHR senior graduate studentship. M.M.L. acknowledges support from a Postdoctoral Fellowship from the Spanish Ministry of Education, under the “Programa Nacional de Movilidad de Recursos Humanos del Plan Nacional de I-D+i 2008-2011”. D.W.S. was supported by the Terry Fox Foundation Strategic Health Research Training Program in Cancer Research at Canadian Institutes of Health Research (Grant No. TGT-53912). JJS acknowledges funding from The Canadian Cancer Society and the Canadian Institutes of Health Research. RG is supported by a UBC Four Year Fellowship. IMM acknowledges the Canadian Foundation for Innovation for a Leaders Opportunity Fund. The laboratory work for this study was undertaken at the Genome Sciences Centre, British Columbia Cancer Research Centre and the Centre for Translational and Applied Genomics, a program of the Provincial Health Services Authority Laboratories. The authors would like to thank Dr. Chris Greenman for supplying his software and also gratefully acknowledge Dr. Daniela Gerhard and Dr. Samuel Aparicio for helpful discussions and guidance. Special thanks to Cecelia Suragh, Robyn Roscoe, Armelle Troussard and Adrienne Drobnies for expert project management assistance, and to the Library Construction, Sequencing and Bioinformatics Teams at the Genome Sciences Centre. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government. All raw sequence data can be accessed using the DCC portal (http://cgap.nci.nih.gov/Data_Access), study accession phs000235.v2.p1.
Author Contributions: MAM, RDG, DEH, MH and JMC conceived of the study and led the design of the experiments. RDM performed the analysis of sequence data, identified mutations and, with MML, AJM and MAM, produced Figures and wrote the manuscript. MML, AJM, DLT, SC, SC, DS, HM, JS, MM, TZ, AD, KT, YB, MF, JTW and TMS designed and performed experiments to amplify, discover, and validate mutations. RG, MG and IMM contributed to analyses and reviewed the manuscript. NAJ, MB, BW and BM prepared the samples, performed sample sorting and COO analysis and contributed to the text. ABW and JJS collected and prepared constitutional DNA samples. KLM, RC, SL, MF and SJ generated de novo assemblies and identified mutations. MK, SR, MG, OY and EYZ wrote software and contributed to Figures. RC performed copy number analysis and produced Figure and SBN performed confirmatory FISH experiments. YZ and AT produced the sequencing libraries. IB, RH, SJMJ, RM, JS, MH contributed to the development of experimental and analytical protocols. LR provided materials and reviewed the manuscript.
The authors declare no competing financial interests.
The SRA accession for the submission is SRP001599, which is linked to the dbGAP study accession phs000235.v2.p1.