Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumor genomes and their comparison to matched normal DNAs. Several new and unexpected oncogenic mechanisms were suggested by the pattern of somatic mutation across the dataset. These include the mutation of genes involved in protein translation (seen in nearly half of the patients), genes involved in histone methylation, and genes involved in blood coagulation. In addition, a broader than anticipated role of NF-κB signaling was suggested by mutations in 11 members of the NF-κB pathway. Of potential immediate clinical relevance, activating mutations of the kinase BRAF were observed in 4% of patients, suggesting the evaluation of BRAF inhibitors in multiple myeloma clinical trials. These results indicate that cancer genome sequencing of large collections of samples will yield new insights into cancer not anticipated by existing knowledge.
Multiple myeloma (MM) is an incurable malignancy of mature B-lymphoid cells, and its pathogenesis is only partially understood. About 40% of cases harbor chromosome translocations resulting in over-expression of genes (including CCND1, CCND3, MAF, MAFB, WHSC1/MMSET and FGFR3) via their juxtaposition to the immunoglobulin heavy chain (IgH) locus1. Other cases exhibit hyperdiploidy. However, these abnormalities are likely insufficient for malignant transformation because they are also observed in the pre-malignant syndrome known as monoclonal gammopathy of uncertain significance (MGUS). Malignant progression events include activation of MYC, FGFR3, KRAS and NRAS and activation of the NF-κB pathway1-3. More recently, loss-of-function mutations in the histone demethylase UTX/KDM6A have also been reported4.
A powerful way to understand the molecular basis of cancer is to sequence either the entire genome or the protein-coding exome, comparing tumor to normal from the same patient in order to identify the acquired somatic mutations. Recent reports have described the sequencing of whole genomes from a single patient5-9. While these reports are informative, we hypothesized that a larger number of cases would permit the identification of biologically relevant patterns that would not otherwise be evident.
We studied 38 MM patients (Supplementary Table 1), performing whole-genome sequencing (WGS) for 23 patients and whole-exome sequencing (WES; assessing 164,687 exons) for 16 patients, with one patient analyzed by both approaches (Supplementary Information). WES is a cost-effective strategy to identify protein-coding mutations, but cannot detect non-coding mutations and rearrangements. We identified tumor-specific mutations by comparing each tumor to its corresponding normal, using a series of algorithms designed to detect point mutations, small insertions/deletions (indels) and other rearrangements (Supplementary Fig. 1). Based on WGS, the frequency of tumor-specific point mutations was 2.9 per million bases, corresponding to approximately 7,450 point mutations per sample across the genome, including an average of 35 amino acid-changing point mutations plus 21 chromosomal rearrangements disrupting protein-coding regions (Supplementary Tables 2 and 3). The mutation-calling algorithm was found to be highly accurate, with a true positive rate of 95% for point mutations (Supplementary text, Supplementary Tables 4 and 5, and Supplementary Fig. 2).
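As a rough consistency check on the figures above, the per-base rate and the per-sample count together imply the number of bases over which point mutations were called. The short sketch below is illustrative arithmetic only: the variable names are ours, and the implied territory is an inference from the two reported numbers rather than a value stated in the text.

```python
# Values reported in the text.
rate_per_mb = 2.9             # tumor-specific point mutations per million bases
mutations_per_sample = 7450   # approximate point mutations per sample (WGS)

# Assumption: count = rate x number of bases assessed, so the territory
# follows by division. This is an inferred quantity, not a reported one.
implied_bases = mutations_per_sample / rate_per_mb * 1e6

print(f"Implied territory: {implied_bases / 1e9:.2f} Gb")
```

Under this assumption the two figures are mutually consistent with roughly 2.6 billion assessed bases per genome.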
The mutation rate across the genome varied greatly depending on base composition, with mutations at CpG dinucleotides occurring 4-fold more commonly than mutations at A or T bases (Supplementary Fig. 3a). In addition, even after correction for base composition, the mutation frequency in coding regions was lower than that observed in intronic and intergenic regions (p < 1×10^−16; Supplementary Fig. 3b), potentially owing to negative selective pressure against mutations disrupting coding sequences. There was also a lower mutation rate in intronic regions compared to intergenic regions (p < 1×10^−16), which may reflect transcription-coupled repair, as previously suggested10, 11. Consistent with this explanation, we observed a lower mutation rate in introns of genes expressed in MM compared to those not expressed (Fig. 1a).
We next focused on the distribution of somatic, non-silent protein-coding mutations. We estimated statistical significance by comparison to the background distribution of mutations (Supplementary Information). Ten genes showed statistically significant rates of protein-altering mutations (‘significantly mutated genes’) at a False Discovery Rate (FDR) of ≤0.10 (Table 1). To investigate their functional importance, we compared their predicted consequence (based on evolutionary conservation and nature of the amino acid change) to the distribution of all coding mutations. This analysis showed a dramatic skewing of functional importance (FI) scores12 for the 10 significantly mutated genes (p = 7.6×10^−14; Fig. 1b), supporting their biological relevance. Even after RAS and p53 mutations were excluded from the analysis, the skewing remained significant (p < 0.01).
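To illustrate how per-gene p-values can be converted to q-values and thresholded at an FDR of 0.10, the following is a minimal sketch of the Benjamini-Hochberg procedure. The choice of Benjamini-Hochberg is an assumption for illustration (the exact correction used is described only in the Supplementary Information), and the gene names and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical per-gene p-values (not taken from the study).
example_p = {"GENE_A": 0.01, "GENE_B": 0.04, "GENE_C": 0.03, "GENE_D": 0.20}
q = benjamini_hochberg(list(example_p.values()))

# Keep genes whose q-value meets the FDR threshold used in the text.
significant = [gene for gene, qv in zip(example_p, q) if qv <= 0.10]
print(significant)
```

In a real analysis the input would be one p-value per gene tested across the genome, so the correction accounts for the full number of hypotheses rather than only the genes of interest.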
We also examined the non-synonymous:synonymous (NS:S) mutation rate for the significantly mutated genes. The expected NS:S ratio was 2.82 ± 0.15, whereas the observed ratio was 39:0 for the significant genes (p < 0.0001), further strengthening the case that mutations in these genes are likely drivers of the pathogenesis of MM rather than passengers.
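The strength of the 39:0 observation can be appreciated with a simple binomial calculation. The sketch below assumes each coding mutation is independently non-synonymous with the probability implied by the expected ratio of 2.82; this is a simplification for illustration and is not necessarily the test used to obtain the reported p-value.

```python
expected_ns_to_s = 2.82   # expected NS:S ratio reported in the text
observed_ns = 39          # non-synonymous mutations in the significant genes
observed_s = 0            # synonymous mutations in the significant genes

# Null probability that a single coding mutation is non-synonymous.
p_ns = expected_ns_to_s / (1.0 + expected_ns_to_s)

# One-sided binomial tail: with zero synonymous mutations observed, the only
# outcome at least as extreme is "every mutation is non-synonymous".
n = observed_ns + observed_s
p_value = p_ns ** n

print(f"P(all {n} non-synonymous | null) = {p_value:.1e}")
```

Under these assumptions the probability falls well below 0.0001, in line with the significance level reported above.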
The significantly mutated genes include three previously reported to have point mutations in MM: KRAS and NRAS (10 and 9 cases, respectively (50%), p < 1×10^−11, q < 1×10^−6), and TP53 (3 cases (8%), p = 5.1×10^−6, q = 0.019). Interestingly, we identified 2 point mutations (5%, p = 0.000027, q = 0.086) in CCND1 (cyclin D1), which has long been recognized as a target of chromosomal translocation in MM, but for which point mutations have not been observed previously in cancer.
The remaining 6 genes have not previously been known to be involved in cancer, and suggest new aspects of the pathogenesis of MM.
A striking finding of this study was the discovery of frequent mutations in genes involved in RNA processing, protein translation and the unfolded protein response. Such mutations were observed in nearly half of the patients.
The DIS3/RRP44 gene harbored mutations in 4/38 patients (11%, p = 2.4×10^−6, q = 0.011). DIS3 encodes a highly conserved RNA exonuclease which serves as the catalytic component of the exosome complex involved in regulating the processing and abundance of all RNA species13, 14. The four observed mutations occur at highly conserved regions (Fig. 2a) and cluster within the RNB domain facing the enzyme's catalytic pocket (Fig. 2b). Two lines of evidence suggest that the DIS3 mutations result in loss of function. First, 3 of the 4 tumors with mutations exhibited loss of heterozygosity via deletion of the remaining DIS3 allele. Second, two of the mutations have been functionally characterized in yeast and bacteria, where they result in loss of enzymatic activity leading to the accumulation of their RNA targets15, 16. Given that a key role of the exosome is the regulation of the pool of mRNAs available for translation17, these results suggest that DIS3 mutations may dysregulate protein translation as an oncogenic mechanism in MM.
Further support for a role of translational control in the pathogenesis of MM comes from the observation of mutations in the FAM46C gene in 5/38 (13%) patients (p < 1.8×10^−10, q = 1×10^−6). There is no published functional annotation of FAM46C, and its sequence lacks obvious homology to known proteins. To gain insight into its cellular role, we examined its pattern of gene expression across 414 MM samples and compared it to the expression of 395 gene sets curated in the Molecular Signatures Database (MSigDB), using the GSEA algorithm18-20. The expression of FAM46C was highly correlated (q = 0.034 after multiple hypothesis correction; Fig. 2c) to the expression of the set of ribosomal proteins, which are known to be tightly co-regulated21. Strong correlation with eukaryotic initiation and elongation factors involved in protein translation was similarly observed. While the precise function of FAM46C remains unknown, this striking correlation provides strong evidence that FAM46C is functionally related in some way to the regulation of translation.
Notably, while not statistically significant on their own, we found mutations in 5 other genes related to protein translation, stability and the unfolded protein response (Supplementary Table 6), further supporting a role of translational control in MM. Of particular interest, two patients had mutations in the unfolded protein response gene XBP1. Over-expression of a particular splice form of XBP1 has been shown to cause an MM-like syndrome in mice, although no role of XBP1 in the pathogenesis of human MM has been described22.
Of related interest, mutations of the LRRK2 gene were observed in 3/38 patients (8%; Supplementary Table 6). LRRK2 encodes a serine-threonine kinase that phosphorylates translation initiation factor 4E-binding protein (4EBP). LRRK2 is best known for its role in the predisposition to Parkinson's disease23, 24. Parkinson's disease and other neurodegenerative diseases such as Huntington's disease are characterized in part by aberrant unfolded protein responses25. Protein homeostasis may be particularly important in MM because of the enormous rate of production of immunoglobulins by MM cells26-28. The finding is also of clinical significance because of the success of bortezomib (Velcade), a proteasome inhibitor that shows remarkable activity in MM compared to other tumor types29.
Together, these results indicate that mutations affecting protein translation and homeostasis are extremely common in MM (at least 16/38 patients; 42%), thereby suggesting that additional therapeutic approaches that target these mechanisms may be worth exploring.
Another way to recognize biologically significant mutations is to search for recurrence of identical mutations indicative of gain-of-function alterations in oncogenes. Two patients had an identical mutation (K123R) in the DNA-binding domain of the interferon regulatory factor IRF4. Interestingly, a recent RNA interference screen in MM showed that IRF4 was required for MM survival, consistent with its role as a putative oncogene30. Genotyping for this mutation in 161 additional MM samples identified two more patients with this mutation. IRF4 is a transcriptional regulator of PRDM1 (BLIMP-1), and two of 38 sequenced patients also exhibited PRDM1 mutations. PRDM1 is a transcription factor involved in plasma cell differentiation, loss-of-function mutations of which occur in diffuse large B-cell lymphoma31-35.
Some mutations deserve attention because of their clinical relevance. One of our 38 patients harbored a BRAF kinase mutation (G469A). While BRAF G469A has not previously been observed in MM, this precise mutation is known to be activating and oncogenic36. We genotyped an additional 161 MM patients for the 12 most common BRAF mutations and found mutations in 7 patients (4%). Three of these were K601N and 4 were V600E (the most common BRAF mutation in melanoma37). Our finding of common BRAF mutations in MM has important clinical implications because such patients may benefit from treatment with BRAF inhibitors, some of which show dramatic clinical activity38. Our results also support the observation that inhibitors acting downstream of BRAF (e.g. MEK) may have activity in MM39.
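Because the BRAF frequency rests on 7 mutated patients among 161 genotyped, it can be useful to see how much uncertainty surrounds the 4% point estimate. The sketch below computes a Wilson score interval; the choice of interval and the function name are ours for illustration, and no such interval is reported in the text.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96, about 95%)."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# 7 BRAF-mutant patients among the 161 additionally genotyped patients.
low, high = wilson_interval(7, 161)
print(f"BRAF frequency: {7 / 161:.1%} (approx. 95% interval {low:.1%} to {high:.1%})")
```

An interval of this kind makes explicit that the true frequency could plausibly be somewhat lower or higher than the point estimate, which matters when sizing a clinical trial of BRAF inhibitors.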
Another approach to identify biologically relevant mutations in MM is to look not at the frequency of mutation of individual genes, but rather of sets of genes.
We first considered gene sets based on existing insights into the biology of MM. For example, activation of the NF-κB pathway is known in MM, but the basis of such activation is only partially understood2, 3. We observed 10 point mutations (p = 0.016) and 4 structural rearrangements, affecting 11 NF-κB pathway genes (Supplementary Table 7): BTRC, CARD11, CYLD, IKBIP, IKBKB, MAP3K1, MAP3K14, RIPK4, TLR4, TNFRSF1A, and TRAF3. Taken together, our findings greatly expand the mechanisms by which NF-κB may be activated in MM.
We next looked for enrichment in mutations in histone-modifying enzymes. This hypothesis arose because of our observation that the homeotic transcription factor HOXA9 was highly expressed in a subset of MM patients, particularly those lacking known IgH translocations (Supplementary Fig. 4a). HOXA9 expression is regulated primarily by histone methyltransferases (HMT) including members of the MLL family. Sensitive RT-PCR analysis showed that HOXA9 was in fact ubiquitously expressed in MM, with most cases exhibiting biallelic expression consistent with dysregulation via an upstream HMT event (Supplementary Figs. 4b,c). Accordingly, we looked for mutations in genes known to directly regulate HOXA9. We found significant enrichment (p = 0.0024), with mutations in MLL, MLL2, MLL3, UTX, WHSC1, and WHSC1L1.
HOXA9 is normally silenced by histone-3 lysine-27 tri-methylation (H3K27me3) chromatin marks when cells differentiate beyond the hematopoietic stem cell stage40, 41. This repressive mark was weak or absent at the HOXA9 locus in most MM cell lines (Fig. 3a). Moreover, there was an inverse correlation between H3K27me3 levels and HOXA9 expression (Fig. 3b), consistent with HMT dysfunction contributing to aberrant HOXA9 expression.
To establish the functional significance of HOXA9 expression in MM cells, we knocked down its expression with 7 shRNAs (Supplementary Fig. 5). In 11/12 MM cell lines, HOXA9-depleted cells exhibited a competitive disadvantage (Fig. 3c and Supplementary Fig. 6).
These experiments suggest that aberrant HOXA9 expression, caused at least in part by HMT-related genomic events, plays a role in MM and may represent a new therapeutic target. Further supporting a role of HOXA9 as an MM oncogene, array-based comparative genomic hybridization identified focal amplifications of the HOXA locus in 5% of patients (Supplementary Fig. 7).
We next asked whether it would be possible to discover pathways enriched for mutations in the absence of prior knowledge. Accordingly, we examined 616 gene sets in the MSigDB Canonical Pathways database. One top-ranking gene set was of particular interest because it did not relate to genes known to be important in MM. This gene set encodes proteins involved in the formation of the fibrin clot in the blood coagulation cascade. There were 6 mutations in 5/38 patients (16%, q = 0.0054), encoding 5 proteins (Supplementary Table 8). RT-PCR analysis confirmed expression of 4 of the 5 coagulation factors in MM cell lines (Supplementary Fig. 8). The coagulation cascade involves a number of extracellular proteases and their substrates and regulators, but their role in MM has not been suspected. However, thrombin and fibrin have been shown to serve as mitogens in other cell types42, and have been implicated in metastasis43. These observations suggest that coagulation factor mutations should be explored more fully in human cancers.
Analyses of non-coding portions of the genome have not previously been reported in cancer. We focused on non-coding regions with highest regulatory potential (RP). We defined 2.4×10^6 RP regions (Supplementary Fig. 9), averaging 280 base pairs (bp). We then treated these regions as if they were protein-coding genes, subjecting them to the same permutation analysis used for exonic regions.
We identified multiple non-coding regions with high frequencies of mutation which fell into two classes (Table 2 and Supplementary Table 9). The first corresponds to regions of known somatic hypermutation. These have a 1000-fold higher than expected mutation frequency, as expected for post-germinal center B-cells (Supplementary Table 9). These regions comprise immunoglobulin-coding genes and the 5′-UTR of the lymphoid oncogene, BCL6, as reported44. Interestingly, we also found previously unrecognized mutations in the intergenic region flanking BCL6 in 5 patients, indicating that somatic hypermutation likely occurs in regions beyond the 5′ UTR and first intron of BCL6 (Table 2). Whether such non-coding BCL6 mutations contribute to MM pathogenesis remains to be established.
The second class consisted of 18 non-coding regions with mutation frequencies beyond that expected by chance (q < 0.25) (Table 2 and Supplementary Table 10). Four of the 18 regions flanked genes that also harbored coding mutations. Interestingly, we observed 7 mutations in 5 of 23 patients (22%) within non-coding regions of BCL7A, a putative tumor suppressor gene discovered in the B-cell malignancy Burkitt lymphoma45, and which is also deleted or hypermethylated in cutaneous T-cell lymphomas46, 47. The function of BCL7A is unknown, and the effect of its non-coding mutations in MM remains to be established.
Our preliminary analysis of non-coding mutations suggests that non-exonic portions of the genome may represent a previously untapped source of insight into the pathogenesis of cancer.
The analysis of MM genomes reveals that mechanisms previously suspected to play a role in the biology of MM (e.g. NF-κB activation and HMT dysfunction) may in fact play broad roles by virtue of mutations in multiple members of these pathways. In addition, potentially new mechanisms of transformation are suggested, including mutations in the RNA exonuclease DIS3 and other genes involved in protein translation and homeostasis. Whether these mutations are unique to MM or are common to other cancers remains to be determined. Furthermore, frequent mutations in the oncogenic kinase BRAF were observed – a finding that has immediate clinical translational implications.
Importantly, the majority of these discoveries could not have been made by sequencing only a single MM genome – the complex patterns of pathway dysregulation required the analysis of multiple genomes. Whole-exome sequencing revealed the substantial majority of the significantly mutated genes. However, we note that half of total protein-coding mutations occurred via chromosomal aberrations such as translocations, most of which would not have been discovered by sequencing of the exome alone. Similarly, the recurrent point mutations in non-coding regions would have been missed with sequencing directed only at coding exons.
The analysis described here is preliminary. Additional MM genomes will be required to establish the definitive genomic landscape of the disease and determine accurate estimates of mutation frequency in the disease. The sequence data described here will be available from the dbGaP repository (http://www.ncbi.nlm.nih.gov/gap) and we have created an MM Genomics Portal (http://www.broadinstitute.org/mmgp) to support data analysis and visualization.
Informed consent from MM patients was obtained in line with the Declaration of Helsinki. DNA was extracted from bone marrow aspirate (tumor) and blood (normal). WGS libraries (370-410 bp inserts) and WES libraries (200-350 bp inserts) were constructed and sequenced on an Illumina GA-II sequencer using 101 and 76 bp paired-end reads, respectively. Sequencing reads were processed with the Firehose pipeline, identifying somatic point mutations, indels, and other structural chromosomal rearrangements. Structural rearrangements affecting protein-coding regions were then subjected to manual review to exclude alignment artifacts. True positive mutation rates were estimated by Sequenom mass spectrometry genotyping of randomly selected mutations. HOXA9 shRNAs were introduced into MM cell lines by lentiviral infection using standard methods.
A complete description of the materials and methods is provided in the Supplementary Information.
This project was funded by a grant from the Multiple Myeloma Research Foundation. M.C. was supported by a Clinician Scientist Fellowship from Leukaemia and Lymphoma Research (UK). We are grateful to all members of the Broad Institute's Biological Samples Platform, Genetic Analysis Platform, and Genome Sequencing Platform, without whom this work would not have been possible.
The authors have no competing financial interests to disclose.
Author Contributions: All authors contributed to the final manuscript. K.C.A., R.F., C.C.H., S.J., A.J.J., A.K., T.L., S.L., S.V.R., D.S.S., S.T., R.V., and T.Z. collected data and provided patient materials. J.J.K., C.S., G.J.A., K.G.A., D.A., A.B., P.L.B., S.B.G., J.L., T.L., S.M., B.M., L.M.P., R.O., W.W., and J.C. processed and analyzed genetic material, including RNA/DNA extraction, fingerprinting, genotyping, data management, hybridizations, library preparation, and sequencing. M.A.C., J.J.K., A.C.S., C.L.H., M.A., and B.E.B. performed experimental work, including PCR, cloning, ChIP analyses, and RNAi experiments. M.A.C., M.S.L., J.J.K., K.C., J-P. B., Y.D., S.M., T.J.P., A.H.R., A.S., D.V., and G.G. performed data analyses. M.A.C., M.S.L., K.C., E.S.L., G.G., and T.R.G. produced the text and figures, including supplementary information. J.C., J.T., W.C.H., L.A.G., M.M., E.S.L., G.G., and T.R.G. provided leadership for the project.