Exome sequencing of human breast cancers has revealed a substantial number of candidate cancer genes with recurring but infrequent somatic mutations. To determine their mutation prevalence more accurately, we performed a mutation analysis of 36 novel candidate cancer genes in 96 human breast cancers. Somatic mutations with potential impact on protein function were observed in the genes ADAM12, CENTB1, CENTG1, DIP2C, GLI1, GRIN2D, HDLBP, IKBKB, KPNA5, NFKB1, NOTCH1, and OTOF. These findings strengthen the evidence for involvement of the Notch, Hedgehog, NF-KB, and PIK3CA pathways in breast cancer development, and point to novel processes that are likely involved.
It is widely accepted that cancer is caused by constitutional and somatic mutations in genes that control cell growth or genome stability (Vogelstein and Kinzler, 2004). Classical genetic techniques were used to discover frequently mutated cancer genes, such as TP53 and ERBB2, which subsequently guided genetic and functional characterization of the pathways in which they reside. However, the majority of recently discovered candidate cancer genes in adult solid tumors are mutated in <10% of patient tumors. Thus, the hunt for breast cancer genes mutated in a low fraction of patient tumors has necessitated unbiased mutational analyses at the gene family, exome or genome levels. Examples of such studies include (1) re-sequencing of genes encoding kinases, which uncovered a higher ratio of non-synonymous to synonymous mutations than expected by chance indicating accumulation of driver mutations in this gene set (Stephens et al, 2005); (2) exome-wide somatic mutation analyses (Sjöblom et al, 2006; Wood et al, 2007; Leary et al, 2008); (3) rearrangement analyses by paired-end sequencing, which have revealed an average of 90 chromosomal breakpoints per receptor-negative tumor (Stephens et al, 2009); and (4) whole genome sequencing, which revealed 50 somatic point mutations and small indels in coding sequences as well as 28 large deletions, 6 inversions and 7 translocations in the breast cancer metastasis studied (Ding et al, 2010). In breast cancers, exome sequencing revealed 140 candidate breast cancer genes (Sjöblom et al, 2006; Wood et al, 2007). The average receptor-negative breast cancer had point mutations or small insertions or deletions in 101 protein coding genes, 11 focal amplifications, and 7 focal deletions. In breast cancers, as well as in other tumor types, it is currently believed that the majority of somatic mutations are passengers, i.e. mutations which do not directly alter the net rate of cell growth or other phenotypes of essence to the tumor cell. 
However, the multitude of novel recurring but infrequent gene mutations discovered by exome or genome sequencing poses a challenge in distinguishing driver from passenger genes (Ali and Sjöblom, 2009). In the present study, we investigate 36 previously identified candidate breast cancer genes by mutational analysis of 96 additional tumors. Through bioinformatic analyses to predict the effect of specific mutations on protein function and the analysis of the pathways in which these mutated genes reside, we identify likely driver genes and pathways in breast tumorigenesis.
Ninety-six fresh frozen tumor samples were obtained from the Johns Hopkins Medical Institutions, the Dana-Farber Cancer Institute and the South Carolina Biorepository System and either macrodissected or laser capture microdissected to increase tumor cell fraction (Table S1). The patients had an average age at diagnosis of 54 years (range 30–89). Tumors were categorized into 3 subtypes, namely luminal, HER2+ and basal, according to the expression status of estrogen receptor (ER), progesterone receptor (PR) and HER2 by immunohistochemistry (Brenton et al, 2005). Among the 96 samples investigated in this study, 19 cases did not have sufficient expression information for classification, while the remaining 77 tumors consisted of 54 luminal breast cancers (70%), 10 HER2+ breast cancers (13%) and 13 basal breast cancers (17%).
Tumor DNA was extracted from frozen tissue or purified cell lines, and whole genome amplification (REPLI-g WGA, Qiagen) was used to provide sufficient quantity of DNA for mutational analyses.
Protein coding sequences of the 36 selected candidate cancer genes were amplified and sequenced from 96 breast tumor samples using previously described approaches (Sjöblom et al, 2006). PCR primers used to amplify targeted regions are listed in Table S2. DNA sequences were analyzed by Mutation Surveyor (SoftGenetics) followed by visual inspection to identify potential mutations. Sequence variants present in SNP databases (International HapMap Project and the 1000 Genomes Project, release 20100804) were removed. Putative mutations were sequenced de novo in the tumor DNA that had the mutation along with the patient-matched normal DNA. Two prediction tools, Cancer-Specific High-Throughput Annotation of Somatic Mutations (CHASM) (Carter et al, 2009) and MutationTaster (Schwarz et al, 2010), were used to predict functional effects of validated somatic mutations.
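The filtering steps described above (removal of database SNPs, independent re-sequencing of the tumor DNA, and comparison with patient-matched normal DNA) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `Variant` record, the `select_somatic` function, and every gene name and coordinate below are hypothetical placeholders chosen only to show the logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A sequence variant; all values used below are made-up placeholders."""
    gene: str
    chrom: str
    pos: int
    ref: str
    alt: str


def select_somatic(tumor_calls, known_snps, confirmed_in_tumor, seen_in_normal):
    """Return tumor variants that pass the three filters described in the text.

    tumor_calls        -- variants found in the initial tumor screen (list)
    known_snps         -- variants listed in SNP databases (set)
    confirmed_in_tumor -- variants seen again when the tumor DNA is re-sequenced (set)
    seen_in_normal     -- variants also present in matched normal DNA (set)
    """
    somatic = []
    for v in tumor_calls:
        if v in known_snps:
            continue  # documented polymorphism
        if v not in confirmed_in_tumor:
            continue  # not reproduced on re-sequencing, likely an artifact
        if v in seen_in_normal:
            continue  # germline variant, not somatic
        somatic.append(v)
    return somatic


# Hypothetical example data (not from the study).
v1 = Variant("GENE_A", "chr1", 1000, "G", "A")  # passes all filters
v2 = Variant("GENE_B", "chr2", 2000, "C", "T")  # known SNP
v3 = Variant("GENE_C", "chr3", 3000, "T", "G")  # also in matched normal
v4 = Variant("GENE_D", "chr4", 4000, "A", "C")  # not confirmed in tumor

somatic = select_somatic(
    tumor_calls=[v1, v2, v3, v4],
    known_snps={v2},
    confirmed_in_tumor={v1, v2, v3},
    seen_in_normal={v3},
)
print([v.gene for v in somatic])
```

Only the variant that is absent from the SNP set, reproduced in the tumor, and absent from the matched normal is retained as a validated somatic mutation.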
To calculate mutation prevalence in a larger panel of samples, we combined the mutational data from Sjöblom et al. (2006) and Wood et al. (2007) with the data presented here. Because the tumor samples used in the validation screens of these studies varied between genes, the mutation rate of a given gene was calculated using only the samples in which that gene was successfully sequenced (for pathway mutation prevalence, only the samples in which all genes of the pathway were successfully sequenced). To determine whether differences in mutation rates exist between different breast cancer subtypes, Fisher’s exact test was applied to the mutational data from samples that had subtype information. False discovery rate was controlled at 0.02 using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995) to minimize false positives caused by multiple comparisons.
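As an illustration of the multiple-testing step, the following Python sketch implements the standard Benjamini-Hochberg step-up procedure. The function name `benjamini_hochberg` is our own, and the choice of which tests form one family is an assumption: the article does not state how the comparisons were pooled, so the eight subtype-comparison P values reported in the Results of this article are used here purely as example input.

```python
def benjamini_hochberg(p_values, q=0.02):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans (same order as p_values) that is True
    where the null hypothesis is rejected at false discovery rate q.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank

    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# P values as reported in the Results; treating them as one family of
# eight tests is an assumption made for this example only.
tests = {
    "NF-KB pathway": 0.002,
    "DIP2C": 0.125,
    "GLI1": 0.084,
    "GRIN2D": 0.429,
    "HDLBP": 0.429,
    "KPNA5": 0.084,
    "OTOF": 1.000,
    "Notch pathway": 0.853,
}
rejected = benjamini_hochberg(list(tests.values()), q=0.02)
significant = [name for name, r in zip(tests, rejected) if r]
print(significant)
```

Under these assumptions only the NF-KB pathway comparison passes the threshold, since 0.002 is below (1/8) × 0.02 = 0.0025 and no larger P value meets its own rank-scaled cutoff.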
We selected 36 candidate breast cancer genes from Sjöblom et al. (2006) and Wood et al. (2007) that fulfilled the following criteria: (1) somatic mutations in at least one tumor in a discovery set of 11 breast cancers, and in at least one tumor when reassessed in a validation set of 24 breast cancers; (2) not previously demonstrated to be breast cancer genes by functional studies; (3) a mutation prevalence >3-fold higher than the estimated background somatic mutation prevalence of 1–3 mutations/Mb; and (4) mutations with predicted effects on gene function. In addition, BAP1, IKBKB, NFKB1, NFKBIA, and NFKBIE were included as they had putative loss of function mutations in discovery set tumors and were implicated in known cancer pathways.
We assessed ~130 kb of protein-encoding sequence from the 36 genes in each of 96 breast cancers, for a total of ~12.5 Mb of sequence. We observed 28 somatic mutations, comprising 13 non-synonymous, 3 frameshift, 1 truncating, 1 splice site, and 10 synonymous mutations (Table 1). Novel non-synonymous somatic mutations were observed in one-third of the genes, namely ADAM12, CENTB1, CENTG1, DIP2C, GLI1, GRIN2D, HDLBP, IKBKB, KPNA5, NFKB1, NOTCH1, and OTOF (Fig. 1).
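The totals in this paragraph can be checked with a few lines of Python. This is only a bookkeeping sketch using the rounded figures quoted in the text; the variable names are ours.

```python
# Figures quoted in the text (the per-tumor sequence length is approximate).
kb_per_tumor = 130
n_tumors = 96
n_genes = 36

# Total sequence screened, in megabases.
total_mb = kb_per_tumor * n_tumors / 1000

# Somatic mutations by class, as listed in the text.
mutation_counts = {
    "non-synonymous": 13,
    "frameshift": 3,
    "truncating": 1,
    "splice site": 1,
    "synonymous": 10,
}
total_mutations = sum(mutation_counts.values())

# Genes with novel non-synonymous mutations, as listed in the text.
mutated_genes = [
    "ADAM12", "CENTB1", "CENTG1", "DIP2C", "GLI1", "GRIN2D",
    "HDLBP", "IKBKB", "KPNA5", "NFKB1", "NOTCH1", "OTOF",
]
fraction_of_genes = len(mutated_genes) / n_genes

print(total_mb, total_mutations, fraction_of_genes)
```

The product is 12.48 Mb, which rounds to the ~12.5 Mb stated; the mutation classes sum to 28; and 12 of 36 genes is one-third.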
Mutations in NF-KB pathway components contribute to the development of hematological malignancies such as multiple myeloma (Annunziata et al, 2007). We have previously demonstrated truncating, frameshift, and splice site mutations, respectively, in NFKBIE, NFKBIA, and NFKB1 along with multiple non-synonymous mutations in IKBKB and KEAP1 in breast tumors (Sjöblom et al, 2006; Wood et al, 2007). We here identify additional non-synonymous mutations in NFKB1 and the kinase domain of IKBKB in breast cancers.
The novel IKBKB E81Q kinase domain mutation is a predicted driver missense mutation (CHASM, FDR = 0.3). At the crossroads of NF-KB and PI3K signaling are members of the Centaurin gene family. Downregulation of the GTPase-activating protein CENTB1 has been shown to enhance NF-KB signaling, which provides a plausible explanation for the early splice site and frameshift mutations observed in breast cancers (Yamamoto-Furusho et al, 2006). CENTG1 is a known proto-oncogene, amplified in ~10% of glioblastomas and an activator of PIK3CA pathway signaling, which should encourage further functional studies of the missense mutations observed in breast cancers (Liu et al, 2007). The non-synonymous mutations observed in CENTB1 and CENTG1 in the current study are predicted to be disease-causing by MutationTaster (Schwarz et al, 2010). The combined mutation prevalence of the NF-KB pathway components mentioned above is 8% of breast tumor cases. Mutations in genes of the NF-KB pathway (NFKB1, NFKBIA, NFKBIE, IKBKB, KEAP1, CENTB1 and CENTG1) are non-randomly distributed among breast tumor subtypes (luminal, n=60; HER2+, n=13 and triple-negative, n=19 from this study and from Sjöblom et al. (2006); P=0.003). These genes are mutated in 22% (7 out of 32) of HER2+ or triple-negative breast cancers, which is significantly higher than the 1.7% (1 out of 60) observed in luminal tumors (P=0.002, Fisher’s exact test, FDR=0.02). Mutation frequencies of other genes and pathways, including DIP2C (n=110, P=0.125), GLI1 (n=110, P=0.084), GRIN2D (n=110, P=0.429), HDLBP (n=110, P=0.429), KPNA5 (n=110, P=0.084), OTOF (n=110, P=1.000) and the Notch pathway (NOTCH1 and ADAM12, n=87, P=0.853), were not significantly different among breast tumor subtypes.
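The subtype comparison for the NF-KB pathway can be reproduced from the counts given above (7 of 32 HER2+ or triple-negative tumors versus 1 of 60 luminal tumors). The Python sketch below computes a two-sided Fisher's exact test directly from the hypergeometric distribution using only the standard library. The function name is ours, and the two-sided convention used (summing all tables no more probable than the observed one) is an assumption, since the article does not state which variant of the test was applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Rows: tumor group; columns: mutated, not mutated (counts from the text).
mutated_her2_tn, total_her2_tn = 7, 32
mutated_luminal, total_luminal = 1, 60

p_value = fisher_exact_two_sided(
    mutated_her2_tn, total_her2_tn - mutated_her2_tn,
    mutated_luminal, total_luminal - mutated_luminal,
)
prevalence_her2_tn = mutated_her2_tn / total_her2_tn
prevalence_luminal = mutated_luminal / total_luminal

print(f"{prevalence_her2_tn:.1%} vs {prevalence_luminal:.1%}, P = {p_value:.3f}")
```

With these counts the test gives a P value of about 0.002, in line with the value reported in the text.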
Notch signaling has previously been implicated in human oncogenesis, and the NOTCH1 gene is a target for insertion and rearrangement by the mouse mammary tumor virus (MMTV) (Yanagawa et al, 2000). We here identify NOTCH1 as a human breast cancer gene, based on the detection of a frameshift mutation in its C-terminal regulatory region. Frameshift mutations near the carboxy-terminus, such as the one newly identified in this study, are known to activate NOTCH1 in T-cell acute lymphoblastic leukemia (Weng et al, 2004). Activating non-synonymous and frameshift mutations in NOTCH1 have been observed in ~10% of non-small cell lung cancers (Westhoff et al, 2009). Similarly, Notch signaling has been found aberrantly activated in human breast cancer cell lines and tissue samples, but not in normal breast tissues. Furthermore, induced Notch signaling can transform normal human breast epithelial cells, resulting in growth beyond confluence, marked changes in cell shape, loss of cell-cell adhesion and resistance to drug-induced apoptosis (Stylianou et al, 2006). The consequences of non-synonymous mutations in the extracellular domain of NOTCH1 have not previously been described and their putative functional roles merit further investigation.
We also identified two novel non-synonymous mutations in the protease gene ADAM12. Recently, two previously identified breast cancer derived mutations in ADAM12 (D301H and G479E) were shown to prevent its insertion in the plasma membrane in a dominant-negative fashion, thereby leading to decreased shedding of the Notch ligand Delta-like 1 (Dyczynska et al, 2008). The novel T596A substitution is located in a cysteine-rich domain and has characteristics of a likely driver mutation (CHASM, FDR = 0.4). The mutations in NOTCH1 or ADAM12 in 8.4% of breast tumors (9 of 107), along with functional data, collectively point to a role for Notch pathway aberration in the development of breast carcinomas.
The sonic hedgehog effector GLI1 (glioma-associated oncogene homolog 1) is known to undergo amplification in a fraction of patients with malignant glioma (Kinzler et al, 1987). Single somatic mutations of GLI1 have previously been reported in urinary tract tumors and skin cancers (COSMIC, http://www.sanger.ac.uk/genetics/CGP/cosmic/). We have observed three non-synonymous GLI1 mutations in breast cancers, which merit further functional studies as GLI1 is a proto-oncogene expressed in normal mammary epithelial cells as well as in breast cancers (http://www.proteinatlas.org).
Several genes involved in RNA metabolism have recently emerged as putative cancer genes (Sjöblom et al, 2006). The somatic mutation prevalence (5% of cases) along with multiple frameshift mutations links the human homologue of disco-interacting protein, DIP2C (KIAA0934), to the development of breast cancer. Further, the missense mutations observed are all located in regions strongly conserved throughout evolution (data not shown) and predicted to be disease-causing (Schwarz et al, 2010). The DIP class of RNA-binding nuclear genes, which interact with the D. melanogaster gene disco during the establishment of the nervous system, has been implicated in maintenance of cell fate specification (DeSousa et al, 2003). The ubiquitously expressed HDLBP/vigilin gene, which has been connected to mRNA metabolism and estrogen-mediated stabilization of mRNAs, is composed of 15 KH nucleic acid binding domains and is essential to human cells as evidenced by siRNA knockdown (Goolsby and Shapiro, 2003). However, relatively little is known about the function of these genes, and further investigations into their roles in normal and tumor tissues are required.
Previous studies have identified mutations in genes in the nuclear pore complex and nuclear transport processes, such as NUP133, NUP214, and KPNA5, in breast cancers and other malignancies (Sjöblom et al, 2006; Mitelman et al, 2007). We here identify an additional non-synonymous mutation in the second ARM domain of the importin subunit alpha-6, KPNA5, which is thought to be involved as an adaptor in nuclear localization signal (NLS)-dependent protein import into the nucleus (Yang et al, 2010). Intriguingly, the protein products that rely on KPNA5 for nuclear import are still unknown. Further, we have identified 5 non-synonymous mutations in the ligand binding and channel-forming domains, along with one truncating mutation at the end of the channel-forming domain, in GRIN2D. The N-methyl-D-aspartate (NMDA) receptor subunit epsilon 4, GRIN2D, forms a heterotetrameric ligand-gated cation channel together with GRIN1. Interestingly, GRIN2D expression is regulated by estrogen (Ikeda et al, 2010). Functional NMDA receptor complexes containing the GRIN2D gene product have been demonstrated in human breast cancer cells and tissues, and the in vitro and in vivo tumor growth can be inhibited by NMDA receptor antagonists (North et al, 2010). This raises the possibility that the GRIN2D mutations observed here are oncogenic. Mutations in OTOF, a calcium-sensing protein that triggers membrane fusion and exocytosis, may also provide a link between calcium signaling and cancer.
Taken together, we provide data to strengthen the role of mutations in a subset of novel candidate cancer genes in breast tumorigenesis. We have identified additional somatic mutations in genes of the Notch, Hedgehog, NF-KB, and PIK3CA pathways as well as in processes not yet strongly linked to human cancer, such as RNA processing and calcium signaling. The mutation prevalence of candidate cancer (CAN) genes in this study differs from previously published work (Sjöblom et al, 2006; Wood et al, 2007). Potential explanations include the inability of mutational screens based on a low number of samples to pinpoint the true mutation prevalence, and differences in the subtype composition of the breast cancer cohorts used in the studies. We also noticed a difference between the predictions of mutation significance provided by CHASM and MutationTaster. Among the 72 non-synonymous mutations in Table 1 that have predictions from both methods, only 6 were classified as disease-causing by CHASM with the FDR controlled at 0.4, whereas 46 were predicted to be causal by MutationTaster, and only 3 mutations were identified as significant by both methods. While computational tools predicting mutation significance can be applied to prioritize targets for subsequent studies, the functional significance of mutations has to be proven through experimental analyses. The observation of multiple mutations in genes outside established cancer pathways may indicate that our understanding of these pathways is incomplete, or that hitherto unknown pathways and phenotypes are involved in tumor formation.
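The disagreement between the two prediction tools can be laid out as a 2x2 concordance table derived from the counts quoted above. The Python sketch below is an illustration only; it assumes that each tool's output is treated as a binary call (predicted functional or not), and the summary measures it prints (overall agreement and Jaccard overlap of the positive calls) are our choice and are not reported in the study.

```python
# Counts quoted in the text.
n_scored_by_both = 72   # mutations with predictions from both tools
chasm_positive = 6      # called by CHASM at FDR 0.4
taster_positive = 46    # called by MutationTaster
both_positive = 3       # called by both tools

# Remaining cells of the 2x2 concordance table.
chasm_only = chasm_positive - both_positive
taster_only = taster_positive - both_positive
neither = n_scored_by_both - (both_positive + chasm_only + taster_only)

# Fraction of mutations on which the two tools give the same call.
overall_agreement = (both_positive + neither) / n_scored_by_both

# Overlap of positive calls relative to mutations called by either tool.
jaccard = both_positive / (chasm_positive + taster_positive - both_positive)

print(chasm_only, taster_only, neither)
print(f"agreement = {overall_agreement:.1%}, Jaccard = {jaccard:.3f}")
```

Under this binary reading, 43 mutations are called only by MutationTaster, 3 only by CHASM, and 23 by neither, so the two tools agree on 26 of 72 mutations, roughly 36%.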
Supported by: Young Investigator Award and project grants from the Swedish Cancer Foundation (T.S.), Virginia and D.K. Ludwig Fund for Cancer Research, and National Institutes of Health grants CA43460, CA573445, CA62924, and CA121113.