Burkitt lymphoma is characterized by deregulation of the MYC gene through its translocation to one of the immunoglobulin loci. The role of collaborating genetic mutations that contribute to Burkitt lymphoma remains unknown1,2. Whereas gene expression profiles of Burkitt lymphoma and the more common diffuse large B cell lymphoma (DLBCL) have shown that these two diseases have vast molecular differences1,3, the genetic underpinnings of these differences are not known.
We identified a classic case of Burkitt lymphoma4 and performed whole-genome sequencing of tumor and germline DNA from the same affected individual using the Illumina platform. The distribution of somatic mutations observed in the Burkitt lymphoma genome is depicted in a Circos5 diagram (Fig. 1; summarized in Supplementary Table 1). The vast majority of somatic alterations were in intergenic regions. We observed 6 mutations in potential regulatory regions (loci within 2 kb of a transcriptional start site) and 42 in gene-coding regions. Through the analysis of paired-end reads6, we also identified the presence of the t(8;14) translocation (Supplementary Fig. 1) that is a defining feature of Burkitt lymphoma. Thus, in this single genome, nearly all the known hallmarks of Burkitt lymphoma were identified, including the translocation and mutation of the MYC gene.
Figure 1 Results from whole-genome sequencing of a Burkitt lymphoma tumor and germline DNA. The Circos diagram5 summarizes the somatically acquired genetic variants in a Burkitt lymphoma genome. The outermost ring depicts the chromosome ideogram oriented clockwise, ...
We further characterized the diversity of mutations in Burkitt lymphoma by performing exome sequencing on 59 affected individuals, including 51 primary Burkitt lymphoma tumors, 14 with paired normal tissue, and 8 Burkitt lymphoma cell lines, using the Illumina platform and Agilent reagents. We verified adequate sequencing quality and coverage throughout the exome (Supplementary Fig. 2). We identified genetic variants and further classified these as synonymous, missense, nonsense and small insertions and/or deletions (indels).
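The classification of single-codon changes into synonymous, missense and nonsense categories can be illustrated with a minimal sketch. It assumes the standard genetic code and a hypothetical helper, `classify_coding_change`, that takes the reference and altered codon; it is not the annotation pipeline used in the study, and indels would have to be handled separately. The `stop_lost` label is an extra category added here for completeness and is not one of the classes named in the text.

```python
# Standard genetic code, with codons enumerated using bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_coding_change(ref_codon: str, alt_codon: str) -> str:
    """Classify a substitution within one codon (hypothetical helper)."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
        raise ValueError("expected two three-base DNA codons")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid (or stop) encoded
    if alt_aa == "*":
        return "nonsense"     # new premature stop codon
    if ref_aa == "*":
        return "stop_lost"    # not a category named in the text
    return "missense"         # different amino acid encoded


if __name__ == "__main__":
    for ref, alt in [("GTG", "GTA"), ("CTC", "TTC"), ("TGG", "TGA")]:
        print(ref, alt, classify_coding_change(ref, alt))
```

The codons in the usage example are arbitrary illustrations and are not taken from the sequenced tumors.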
We verified the accuracy of genetic variant identification from our deep sequencing data by performing Sanger sequencing on 108 missense and 16 frameshift and/or indel mutations (Supplementary Note). We found that the two methods agreed for over 80% of the variants assessed (Supplementary Table 2), confirming that our sequencing and bioinformatics methods generated accurate results.
We designated the 14 Burkitt lymphoma samples with paired germline DNA as our discovery set and the remaining 45 Burkitt lymphoma samples as the validation set. We noted that transitions were the predominant form of somatically acquired genetic variation in the discovery set (P < 1 × 10⁻⁶; Fig. 2a and Supplementary Fig. 3). We identified 1,241 variants in 1,104 unique genes that were somatically mutated in at least 1 tumor-germline pair in the discovery set. We then identified additional genetic variants for these 1,104 genes in the validation set, which were similarly rare variants that were not present in databases of normal variation, including dbSNP135 (ref. 8), publicly available data from healthy individuals9–12 and data from 19 additional exomes that we sequenced from control individuals without lymphoma.
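The distinction between transitions and transversions can be made concrete with a short sketch. A transition exchanges a purine for a purine (A and G) or a pyrimidine for a pyrimidine (C and T); every other single-base change is a transversion. The function names and the input format (pairs of reference and alternate bases) are assumptions made for illustration only.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snv(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ti_tv_ratio(snvs) -> float:
    """Ratio of transitions to transversions for an iterable of (ref, alt) pairs."""
    counts = Counter(classify_snv(ref, alt) for ref, alt in snvs)
    if counts["transversion"] == 0:
        return float("inf")
    return counts["transition"] / counts["transversion"]


if __name__ == "__main__":
    # Hypothetical somatic calls from one tumor-germline pair.
    calls = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "T")]
    print(ti_tv_ratio(calls))
```

Computing this ratio per sample gives the kind of summary described for the discovery set, although the statistical test applied in the study is not reproduced here.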
Figure 2 Exome sequencing in Burkitt lymphoma. (a) The ratio of somatically acquired transitions and transversions for samples with paired normal tissue are shown for all 14 discovery set samples. (b) The heatmap indicates the mutation patterns of the 19 most ...
Among the 1,104 somatically mutated genes, we identified candidate mutated genes in the validation set of 45 Burkitt lymphomas. We annotated the 2,318 variants that were nonsynonymous and did not occur in normal controls. For a gene to be classified as being mutated in Burkitt lymphoma, it needed to have recurrent variants that were already in the Catalogue of Somatic Mutations in Cancer (COSMIC)13 or recurrent variants in close proximity to each other or affecting the same protein domain (Supplementary Fig. 4 and Supplementary Note).
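The gene-level criteria above can be sketched as a simple decision rule. The text does not state numeric thresholds, so the `window` (in amino acids) and `min_samples` parameters below are hypothetical, as are the `Variant` record and the function name. The sketch treats "recurrent" as "seen in at least `min_samples` distinct samples", which is one possible reading and not necessarily the definition used in the Supplementary Note.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Variant:
    """One nonsynonymous variant in a single gene (hypothetical record)."""
    sample: str
    aa_pos: int                   # affected amino acid position
    in_cosmic: bool = False       # variant already listed in COSMIC
    domain: Optional[str] = None  # annotated protein domain, if any


def is_recurrently_mutated(variants: Iterable[Variant],
                           window: int = 10,
                           min_samples: int = 2) -> bool:
    """Apply the three criteria to the variants of one gene."""
    variants = list(variants)

    # Criterion 1: recurrent variants that are already in COSMIC.
    cosmic_samples = {v.sample for v in variants if v.in_cosmic}
    if len(cosmic_samples) >= min_samples:
        return True

    # Criterion 2: recurrent variants affecting the same protein domain.
    by_domain = {}
    for v in variants:
        if v.domain is not None:
            by_domain.setdefault(v.domain, set()).add(v.sample)
    if any(len(samples) >= min_samples for samples in by_domain.values()):
        return True

    # Criterion 3: recurrent variants in close proximity to each other.
    ordered = sorted(variants, key=lambda v: v.aa_pos)
    for i, first in enumerate(ordered):
        nearby = {v.sample for v in ordered[i:]
                  if v.aa_pos - first.aa_pos <= window}
        if len(nearby) >= min_samples:
            return True
    return False


if __name__ == "__main__":
    hits = [Variant("BL01", 54), Variant("BL02", 64)]
    print(is_recurrently_mutated(hits))
```

Counting distinct samples rather than variants avoids calling a gene recurrent because of several variants in a single tumor.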
We identified 70 recurrently mutated genes in Burkitt lymphoma (Supplementary Tables 3), including 16 genes that have been conclusively implicated in cancer13. The number and types of mutations in genes that were mutated in 10% or more of the Burkitt lymphomas (n = 19) are shown in Fig. 2b. We noted considerable heterogeneity in the number of mutated genes per lymphoma, which ranged up to 16 of these 70 genes. Gene expression data confirmed that all of these genes were measurably expressed in Burkitt lymphomas, DLBCLs or mature B cells (an example of expression is depicted in Supplementary Fig. 5).
The most frequently mutated genes in Burkitt lymphoma were MYC (40%) and ID3 (34%). Other frequently mutated genes included the known tumor suppressor genes ARID1A, SMARCA4 and TP53, as well as the oncogenes PIK3R1 and NOTCH1. Among the recurrently mutated genes in Burkitt lymphoma, silencing events, such as nonsense and frameshift mutations, constituted a substantial proportion (~30% or more) of the events in ID3, GNA13, ARID1A, CREBBP and CCT6B, suggesting that the genetic alterations may result in loss of function.
We further investigated the genetic differences between Burkitt lymphoma and DLBCL. Through similar analyses, we identified 351 recurrently mutated genes in DLBCL (Supplementary Note and J.Z. et al., unpublished data), a number of which overlapped with those identified in previously published studies of DLBCL14–16. We identified all genes that were recurrently mutated in either Burkitt lymphoma or DLBCL at a frequency of at least 10% in our study or one of the published studies of DLBCL. We plotted the relative and absolute frequencies of the gene alterations in Burkitt lymphoma and DLBCL (Fig. 3a,b). We found a number of genes, including ID3, that were predominantly mutated in Burkitt lymphoma (P < 0.05, Fisher's exact test). In contrast, genes including PIM1 were predominantly mutated in DLBCL. A number of genes had overlapping patterns of mutation in the two diseases, including MLL3.
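The comparison of mutation frequencies between the two diseases relies on Fisher's exact test. A minimal two-sided implementation using only the standard library is sketched below; the 2 × 2 table layout and the function name are assumptions, and the counts in the usage example are invented for illustration rather than taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Rows could be disease (Burkitt lymphoma, DLBCL) and columns mutation
    status (mutated, wild type). The P value sums the hypergeometric
    probabilities of all tables that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Probability of x mutated cases in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    p_value = sum(prob(x) for x in range(low, high + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts: 20 of 59 cases mutated in one disease, 1 of 90 in the other.
    print(fisher_exact_two_sided(20, 39, 1, 89))
```

The small tolerance factor guards against floating-point ties when comparing table probabilities.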
Figure 3 Patterns of exonic mutations in Burkitt lymphoma compared to DLBCL. (a) The bar graph shows the proportion of Burkitt lymphoma and DLBCL samples containing a mutation in each gene. (b) The bar graph shows the number of cases that contain a mutation in ...
We further examined the association between the occurrence of individual gene alterations in Burkitt lymphoma and DLBCL. Notably, we found that mutations in the SWI/SNF family members SMARCA4 and ARID1A occurred in a mutually exclusive fashion, suggesting that mutation in one of these genes by itself may be sufficient to deregulate the SWI/SNF chromatin-remodeling complex. The different mutational patterns of Burkitt lymphoma and DLBCL were also related in part to the lineage-derived subsets of DLBCL18. Some genes were predominantly mutated in the activated B cell–like (ABC) DLBCLs compared to Burkitt lymphoma. GNA13 showed overlapping mutational patterns in Burkitt lymphomas and DLBCLs derived from germinal center B cells.
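Mutual exclusivity between two genes can be examined by tabulating how often their mutations co-occur across samples. The sketch below builds that table and computes a lower-tail hypergeometric probability of seeing as few or fewer co-mutated samples than observed. The function names, the sample identifiers and the choice of this particular test are assumptions for illustration; the text does not specify how exclusivity was assessed.

```python
from math import comb


def cooccurrence_table(mutated_a: set, mutated_b: set, all_samples: set):
    """Return counts of (both, only A, only B, neither) across all samples."""
    both = len(mutated_a & mutated_b)
    only_a = len(mutated_a - mutated_b)
    only_b = len(mutated_b - mutated_a)
    neither = len(all_samples - mutated_a - mutated_b)
    return both, only_a, only_b, neither


def exclusivity_p_value(both: int, only_a: int, only_b: int, neither: int) -> float:
    """Probability of observing <= `both` co-mutated samples by chance.

    Uses the hypergeometric distribution with the per-gene mutation counts
    held fixed; small values suggest less overlap than expected.
    """
    n_a, n_b = both + only_a, both + only_b
    total = both + only_a + only_b + neither
    lowest = max(0, n_a + n_b - total)
    numerator = sum(comb(n_a, k) * comb(total - n_a, n_b - k)
                    for k in range(lowest, both + 1))
    return numerator / comb(total, n_b)


if __name__ == "__main__":
    # Hypothetical sample sets, not data from the study.
    samples = {f"BL{i:02d}" for i in range(1, 21)}
    gene_a = {"BL01", "BL02", "BL03", "BL04"}
    gene_b = {"BL05", "BL06", "BL07"}
    table = cooccurrence_table(gene_a, gene_b, samples)
    print(table, exclusivity_p_value(*table))
```

With the modest mutation frequencies typical of individual genes, an absence of overlap often does not reach statistical significance on its own, so the table itself is usually worth reporting alongside any P value.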
ID3 mutations affected nearly a third of the Burkitt lymphomas and were not present in any DLBCLs, including those containing MYC translocations (Supplementary Note). Nearly all of the alterations in ID3 affected the highly conserved helix-loop-helix (HLH) domain (Fig. 4a). Of these alterations, nearly 30% represented nonsense and frameshift mutations, suggesting that the mutations have a silencing effect on the gene.
Figure 4 Recurrent ID3 mutations in Burkitt lymphomas. (a) Deep sequencing reads identify recurrent mutations affecting the HLH domain of ID3 in Burkitt lymphomas. Each colored line represents an individual somatic mutation or a rare genetic variant. The conservation ...
To better understand the biological role of ID3 mutations in Burkitt lymphoma, we began by examining gene expression in 21 Burkitt lymphomas and 87 DLBCLs. We found that Burkitt lymphomas were characterized by twofold higher expression of ID3 compared to DLBCLs (P = 0.002; Supplementary Fig. 6). Both alleles seemed to be expressed at similar levels in cases with mutations (Supplementary Fig. 7). Gene set enrichment analysis19 identified genes associated with the G1 to S-phase transition as being significantly upregulated in lymphomas with ID3 mutations (false discovery rate (FDR) < 0.05; Supplementary Fig. 8). The expression of cell cycle pathway genes corresponding to the G1 to S-phase transition, including E2F1, was significantly higher in ID3-mutant Burkitt lymphoma samples relative to those with wild-type ID3. Samples with ID3 mutation also showed higher expression of known MYC target genes (Supplementary Fig. 9). These findings provided a working hypothesis that ID3 mutations promote the G1 to S-phase transition, which we then tested experimentally.
We designed constructs expressing six different mutant forms of the ID3 gene, encoding the Val67*, Ile69fs, Leu64Phe, Leu54Val, Leu64His and Pro56Ser variants. We expressed these mutant constructs using a lentiviral vector in Jijoye, a Burkitt lymphoma cell line with wild-type ID3, and confirmed their expression using protein blot analysis and fluorescence microscopy (Supplementary Fig. 10). Cells expressing each of the six mutant constructs had a greater proportion of cells in S phase and a reduced proportion of cells in G1 phase, differences that, when averaged together and plotted, were significant compared to control cells encoding wild-type ID3 (P = 0.03, paired t test). Cell-cycle analysis over 24 h showed higher cell proliferation in all cell lines expressing mutant ID3 (P = 5.6 × 10⁻⁵, Student's t test). These results suggest that mutations in ID3 result in increased G1 to S-phase cell cycle progression in Burkitt lymphoma.
Conversely, when we expressed wild-type ID3 in the BL41 cell line encoding mutant ID3 (with the p.Val67* alteration), we found that the proportion of cells in S phase was lower in cells expressing wild-type ID3 compared to control cells overexpressing only GFP. Similarly, we observed significantly lower cell proliferation in cells expressing wild-type ID3 at 24 h in culture (P = 0.02, Student's t test).
Thus, ID3 mutants increased cell cycle progression and cellular proliferation in Burkitt lymphoma cells, whereas expression of wild-type ID3 in mutant cells gave the opposite results. These experiments support a role for ID3 as a new tumor suppressor gene in Burkitt lymphoma.
The role of MYC as a human oncogene was first discovered in Burkitt lymphoma20, and its importance has since been shown in a number of different malignancies, including carcinomas of the lung21. Little is known about the role of other genetic alterations that collaborate with MYC deregulation in Burkitt lymphoma.
Inhibitor of DNA binding (ID) proteins have been shown to be regulators of normal cellular development25. These proteins lack a DNA-binding domain and inhibit transcription through the formation of nonfunctional heterodimers with other basic helix-loop-helix (bHLH) proteins. Our data implicate ID proteins, for the first time to our knowledge, in Burkitt lymphoma and cancer, with ID3 mutations affecting over a third of Burkitt lymphomas. Predominantly silencing mutations in ID3 were associated with increased cell cycle progression and the expression of proliferation-associated genes. The ability of wild-type ID3 to decrease cell proliferation in Burkitt lymphoma suggests the possibility of using ID3 mimetics as a potential therapeutic approach in Burkitt lymphoma and other bHLH-driven cancers. The role of ID3 also highlights the importance of context in shaping the effect of genetic alterations in cancer. Affecting a single gene, mutations in ID3 seem unlikely to have a clear oncogenic role in most cancers. It is only in the setting of deregulation of MYC (and perhaps other oncogenic bHLH proteins) that inactivating ID3 mutations might have a role by significantly amplifying the actions of these oncogenes. Similar context-dependent roles may be carried out by a number of other oncogenes and tumor suppressor genes.
Our study newly implicates a number of other genes in Burkitt lymphoma. Mutations in SWI/SNF family members ARID1A and SMARCA4 occurred in a mutually exclusive fashion in Burkitt lymphoma, affecting nearly 25% of the tumors. Lineage also seems to have a key role in determining the mutations acquired in Burkitt lymphomas. GNA13, which encodes a guanine nucleotide–binding G protein, was mutated through predominantly silencing events in nearly 15% of the lymphomas and has been shown to be specifically mutated in germinal center B cell–derived DLBCLs (ref. 26 and J.Z. et al., unpublished data). Thus, alterations in GNA13 seem to be a germinal center B cell–specific oncogenic event in lymphomas, similar to those described for EZH2, which was also mutated in 7% of Burkitt lymphomas. We also observed recurrent mutations in genes including RET and in their associated pathways. These findings suggest new therapeutic possibilities in Burkitt lymphoma that can be tested in clinical trials in conjunction with approaches that assay for these mutations. Our data also implicate a number of genes for the first time in cancer, including CCT6B. These genes likely have roles in other cancers that remain to be explored.
Exome sequencing has emerged as a powerful approach for the delineation of gene-coding mutations in malignancies. However, this approach does not capture every important aspect of tumor biology. Not every gene will have adequate coverage in every instance. Exome sequencing also does not assay for structural genetic alterations, mutations in regulatory regions and epigenetic alterations that could also make critical contributions to observed tumor phenotypes. Nevertheless, exome sequencing provides a cost-effective means to identify broad patterns of mutation in diseases at a resolution that was unthinkable just a few years earlier.
Our work thus provides an important starting point for understanding the genetic landscape of mutations in Burkitt lymphomas.