Motivation: The sequencing of tumors and their matched normals is frequently used to study the genetic composition of cancer. Despite this fact, there remains a dearth of available software tools designed to compare sequences in pairs of samples and identify sites that are likely to be unique to one sample.
Results: In this article, we describe the mathematical basis of our SomaticSniper software for comparing tumor and normal pairs. We estimate its sensitivity and precision, and present several common sources of error resulting in miscalls.
Availability and implementation: Binaries are freely available for download at http://gmt.genome.wustl.edu/somatic-sniper/current/, implemented in C and supported on Linux and Mac OS X.
Contact: firstname.lastname@example.org; email@example.com
Supplementary information: Supplementary data are available at Bioinformatics online.
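To make the idea of comparing a tumor and its matched normal at a single site concrete, here is a minimal toy sketch. It is not the model described in the article: it assumes simple binomial read-count likelihoods for three diploid genotypes, a fixed per-read error rate, uniform priors, and independence between the two samples. The function names and the example read counts are hypothetical.

```python
import math

def genotype_likelihoods(ref_reads, alt_reads, error_rate=0.01):
    """Binomial likelihood of the read counts under ref/ref, ref/alt, alt/alt."""
    # Expected fraction of alt-supporting reads for each genotype (assumption).
    alt_fractions = (error_rate, 0.5, 1.0 - error_rate)
    coeff = math.comb(ref_reads + alt_reads, alt_reads)
    return [coeff * p ** alt_reads * (1.0 - p) ** ref_reads for p in alt_fractions]

def somatic_score(tumor_counts, normal_counts, error_rate=0.01):
    """Phred-scaled probability that tumor and normal share the same genotype.

    Each *_counts argument is a (ref_reads, alt_reads) tuple. Higher scores
    mean the two samples are less likely to have the same genotype.
    """
    tumor = genotype_likelihoods(*tumor_counts, error_rate)
    normal = genotype_likelihoods(*normal_counts, error_rate)
    # With uniform priors and independent samples, the joint posterior over
    # genotype pairs is proportional to the product of the likelihoods.
    total = sum(tumor) * sum(normal)
    same = sum(t * n for t, n in zip(tumor, normal))
    p_same = max(same / total, 1e-30)  # floor avoids log10(0)
    return -10.0 * math.log10(p_same)

if __name__ == "__main__":
    # Hypothetical site: heterozygous-looking tumor, clean normal.
    print(somatic_score((20, 20), (40, 0)))
    # Hypothetical site: both samples look homozygous reference.
    print(somatic_score((40, 0), (40, 0)))
```

In this toy setting, a site where the tumor shows balanced reference and variant reads while the normal shows none receives a high score, and a site where both samples look alike scores near zero.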
Acute promyelocytic leukemia (APL) is characterized by the t(15;17) translocation that generates the fusion protein promyelocytic leukemia–retinoic acid receptor α (PML-RARA) in nearly all cases. Multiple prior mouse models of APL constitutively express PML-RARA from a variety of non-Pml loci. Typically, all animals develop a myeloproliferative disease, followed by leukemia in a subset of animals after a long latent period. In contrast, human APL is not associated with an antecedent stage of myeloproliferation. To address this discrepancy, we have generated a system whereby PML-RARA expression is somatically acquired from the mouse Pml locus in the context of Pml haploinsufficiency. We found that physiologic PML-RARA expression was sufficient to direct a hematopoietic progenitor self-renewal program in vitro and in vivo. However, this expansion was not associated with evidence of myeloproliferation, more accurately reflecting the clinical presentation of human APL. Thus, at physiologic doses, PML-RARA primarily acts to increase hematopoietic progenitor self-renewal, expanding a population of cells that are susceptible to acquiring secondary mutations that cause progression to leukemia. This mouse model provides a platform for more accurately dissecting the early events in
Most patients with acute myeloid leukemia (AML) die from progressive disease after relapse, which is associated with clonal evolution at the cytogenetic level [1,2]. To determine the mutational spectrum associated with relapse, we sequenced the primary tumor and relapse genomes from 8 AML patients, and validated hundreds of somatic mutations using deep sequencing; this allowed us to precisely define clonality and clonal evolution patterns at relapse. Besides discovering novel, recurrently mutated genes (e.g. WAC, SMC3, DIS3, DDX41, and DAXX) in AML, we found two major clonal evolution patterns during AML relapse: 1) the founding clone in the primary tumor gained mutations and evolved into the relapse clone, or 2) a subclone of the founding clone survived initial therapy, gained additional mutations, and expanded at relapse. In all cases, chemotherapy failed to eradicate the founding clone. The comparison of relapse-specific vs. primary tumor mutations in all 8 cases revealed an increase in transversions, probably due to DNA damage caused by cytotoxic chemotherapy. These data demonstrate that AML relapse is associated with the addition of new mutations and clonal evolution, which is shaped in part by the chemotherapy that the patients receive to establish and maintain remissions.
Cytotoxic lymphocytes use the granule exocytosis pathway to kill pathogen-infected cells and tumor cells. Although many genes in this pathway have been extensively characterized (e.g., perforin, granzymes A and B), the role of granzyme C is less clear. We therefore developed a granzyme C-specific mAb and used flow cytometry to examine the expression of granzyme B and C in the lymphocyte compartments of wild-type and mutant GzmB−/− cre mice, which have a small deletion in the granzyme B gene. We detected granzyme B and C expression in CD4+ and CD8+ T cells activated with CD3/CD28 beads or MLRs. Stimulation of NK cells in vitro with IL-15 also induced expression of both granzymes. Granzyme C up-regulation was delayed relative to granzyme B in wild-type lymphocytes, whereas GzmB−/− cre cells expressed granzyme C earlier and more abundantly on a per-cell basis, suggesting that the deleted 350-bp region in the granzyme B gene is important for the regulation of both granzymes B and C. Quantitative RT-PCR revealed that granzyme C protein levels were regulated by mRNA abundance. In vivo, a population of wild-type CD8αα+ intraepithelial lymphocytes constitutively expressed granzyme B and GzmB−/− cre intraepithelial lymphocytes likewise expressed granzyme C. Using a model of a persistent murine CMV infection, we detected delayed expression of granzyme C in NK cells from infected hosts. Taken together, these findings suggest that granzyme C is activated with persistent antigenic stimulation, providing nonredundant backup protection for the host when granzyme B fails.
The genetic alterations responsible for an adverse outcome in most patients with acute myeloid leukemia (AML) are unknown.
Using massively parallel DNA sequencing, we identified a somatic mutation in DNMT3A, encoding a DNA methyltransferase, in the genome of cells from a patient with AML with a normal karyotype. We sequenced the exons of DNMT3A in 280 additional patients with de novo AML to define recurring mutations.
A total of 62 of 281 patients (22.1%) had mutations in DNMT3A that were predicted to affect translation. We identified 18 different missense mutations, the most common of which was predicted to affect amino acid R882 (in 37 patients). We also identified six frameshift, six nonsense, and three splice-site mutations and a 1.5-Mbp deletion encompassing DNMT3A. These mutations were highly enriched in the group of patients with an intermediate-risk cytogenetic profile (56 of 166 patients, or 33.7%) but were absent in all 79 patients with a favorable-risk cytogenetic profile (P<0.001 for both comparisons). The median overall survival among patients with DNMT3A mutations was significantly shorter than that among patients without such mutations (12.3 months vs. 41.1 months, P<0.001). DNMT3A mutations were associated with adverse outcomes among patients with an intermediate-risk cytogenetic profile or FLT3 mutations, regardless of age, and were independently associated with a poor outcome in Cox proportional-hazards analysis.
DNMT3A mutations are highly recurrent in patients with de novo AML with an intermediate-risk cytogenetic profile and are independently associated with a poor outcome. (Funded by the National Institutes of Health and others.)
To define the factors that modulate regulatory T (Treg) cells in the tumor setting, we co-cultured various tumor cells with either purified Treg cells, or with unfractionated splenocytes. We found that Treg expansion occurred only with unfractionated splenocytes, suggesting that accessory cells and/or factors produced by them play an essential role in tumor-induced Treg expansion. We performed gene expression profiling on tumor-associated Treg cells to identify candidate signaling molecules and studied their effects on tumor-induced Treg expansion. We inadvertently discovered that IL-12 treatment blocked Treg expansion in an IL-12 receptor-dependent fashion. Additional studies showed that IL-12 acts by stimulating Interferon-gamma mediated inhibition of Treg cell proliferation, which may partially account for the anti-tumor effects of IL-12. Furthermore, IL-12 treatment was found to decrease IL-2 production, which may lead to interferon-gamma independent inhibition of Treg cells, as IL-2 is required for their survival and expansion. Mechanistic studies revealed that Interferon-gamma signaling directly causes cell cycle arrest in Treg cells. This study demonstrates that an IL-12-Interferon-gamma axis can suppress tumor-induced Treg proliferation. This mechanism may counteract the ability of Treg cells to promote tumor growth in vivo.
Regulatory T cells; Cytokine; Interleukin 12; Interferon-gamma; Tumor clearance
The t(8;21)(q22;q22) translocation, present in ~5% of adult acute myeloid leukemia (AML) cases, produces the AML1/ETO fusion protein. Dysregulation of the POU domain-containing transcription factor POU4F1 is a recurring abnormality in t(8;21) AML. Here, we show that POU4F1 over-expression is highly correlated with, but not caused by AML1/ETO. AML1/ETO markedly increases the self-renewal capacity of myeloid progenitors from murine bone marrow or fetal liver and drives expansion of these cells in liquid culture. POU4F1 is neither necessary nor sufficient for these AML1/ETO-dependent properties, suggesting that it contributes to leukemia through novel mechanisms. To identify targets of POU4F1, we performed gene expression profiling in primary mouse cells with genetically defined levels of POU4F1 and identified 140 differentially expressed genes. This expression signature was significantly enriched in human t(8;21) AML samples and was sufficient to cluster t(8;21) AML samples in an unsupervised hierarchical analysis. Among the most highly differentially expressed genes, half are known AML1/ETO targets, implying that the unique transcriptional signature of t(8;21) AML is, in part, attributable to POU4F1 and not AML1/ETO itself. These genes provide novel candidates for understanding the biology and developing therapeutic approaches for t(8;21) AML.
POU4F1; AML1/ETO; acute myeloid leukemia; gene expression profiling
The systematic karyotyping of bone marrow cells was the first genomic approach used to personalize therapy for patients with leukemia. The paradigm established by cytogenetic studies in leukemia (from gene discovery to therapeutic intervention) has the potential to be rapidly extended by whole-genome sequencing approaches for cancer, which are now feasible. We are entering a period of exponential growth in cancer gene discovery that will provide many novel therapeutic targets for a large number of cancer types. Establishing the pathogenetic relevance of individual mutations is a major challenge that must be solved. However, after thousands of cancer genomes have been sequenced, the genetic rules of cancer will become known and new approaches for diagnosis, risk stratification and individualized treatment of cancer patients will surely follow.
array CGH; cancer; comparative genomic hybridization; genomics; next-generation sequencing; SNP array
Antiapoptotic BCL2 family members have been implicated in the pathogenesis of acute myelogenous leukemia (AML), but the functional significance and relative importance of individual proteins (e.g., BCL2, BCL-XL, and myeloid cell leukemia 1 [MCL1]) remain poorly understood. Here, we examined the expression of BCL2, BCL-XL, and MCL1 in primary human hematopoietic subsets and leukemic blasts from AML patients and found that MCL1 transcripts were consistently expressed at high levels in all samples tested. Consistent with this, Mcl1 protein was also highly expressed in myeloid leukemic blasts in a mouse Myc-induced model of AML. We used this model to test the hypothesis that Mcl1 facilitates AML development by allowing myeloid progenitor cells to evade Myc-induced cell death. Indeed, activation of Myc for 7 days in vivo substantially increased myeloid lineage cell numbers, whereas hematopoietic stem, progenitor, and B-lineage cells were depleted. Furthermore, Mcl1 haploinsufficiency abrogated AML development. In addition, deletion of a single allele of Mcl1 from fully transformed AML cells substantially prolonged the survival of transplanted mice. Conversely, the rapid lethality of disease was restored by coexpression of Bcl2 and Myc in Mcl1-haploinsufficient cells. Together, these data demonstrate a critical and dose-dependent role for Mcl1 in AML pathogenesis in mice and suggest that MCL1 may be a promising therapeutic target in patients with de novo AML.
Acute promyelocytic leukemia (APL) is characterized by the t(15;17) chromosomal translocation, which results in fusion of the retinoic acid receptor α (RARA) gene to another gene, most commonly promyelocytic leukemia (PML). The resulting fusion protein, PML-RARA, initiates APL, which is a subtype (M3) of acute myeloid leukemia (AML). In this report, we identify a gene expression signature that is specific to M3 samples; it was not found in other AML subtypes and did not simply represent the normal gene expression pattern of primary promyelocytes. To validate this signature for a large number of genes, we tested a recently developed high throughput digital technology (NanoString nCounter). Nearly all of the genes tested demonstrated highly significant concordance with our microarray data (P < 0.05). The validated gene signature reliably identified M3 samples in 2 other AML datasets, and the validated genes were substantially enriched in our mouse model of APL, but not in a cell line that inducibly expressed PML-RARA. These results demonstrate that nCounter is a highly reproducible, customizable system for mRNA quantification using limited amounts of clinical material, which provides a valuable tool for biomarker measurement in low-abundance patient samples.
Acute myeloid leukemia is a highly malignant hematopoietic tumor that affects about 13,000 adults yearly in the United States. The treatment of this disease has changed little in the past two decades, since most of the genetic events that initiate the disease remain undiscovered. Whole genome sequencing is now possible at a cost and within a timeframe that make it practical for the unbiased discovery of tumor-specific somatic mutations that alter protein-coding genes. Here we show the results obtained by sequencing a typical acute myeloid leukemia genome and its matched normal counterpart, obtained from the patient’s skin. We discovered 10 genes with acquired mutations; two were previously described mutations thought to contribute to tumor progression, and 8 were novel mutations present in virtually all tumor cells at presentation and relapse, whose function is not yet known. Our study establishes whole genome sequencing as an unbiased method for discovering initiating mutations in cancer genomes, and for identifying novel genes that may respond to targeted therapies.
We used massively parallel sequencing technology to sequence the genomic DNA of tumor and normal skin cells obtained from a patient with a typical presentation of FAB M1 Acute Myeloid Leukemia (AML) with normal cytogenetics. 32.7-fold ‘haploid’ coverage (98 billion bases) was obtained for the tumor genome, and 13.9-fold coverage (41.8 billion bases) was obtained for the normal sample. Of 2,647,695 well-supported Single Nucleotide Variants (SNVs) found in the tumor genome, 2,588,486 (97.7%) also were detected in the patient’s skin genome, limiting the number of variants that required further study. For the purposes of this initial study, we restricted our downstream analysis to the coding sequences of annotated genes: we found only eight heterozygous, non-synonymous somatic SNVs in the entire genome. All were novel, including mutations in protocadherin/cadherin family members (CDH24 and PCLKC), G-protein coupled receptors (GPR123 and EBI2), a protein phosphatase (PTPRT), a potential guanine nucleotide exchange factor (KNDC1), a peptide/drug transporter (SLC15A1), and a glutamate receptor gene (GRINL1B). We also detected previously described, recurrent somatic insertions in the FLT3 and NPM1 genes. Based on deep readcount data, we determined that all of these mutations (except FLT3) were present in nearly all tumor cells at presentation, and again at relapse 11 months later, suggesting that the patient had a single dominant clone containing all of the mutations. These results demonstrate the power of whole genome sequencing to discover novel cancer-associated mutations.
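The coverage and variant-filtering figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes that fold coverage is the total number of sequenced bases divided by the haploid genome size; that relationship is an assumption about how the figures were derived and is not stated in the abstract. All variable and function names are illustrative.

```python
# Figures quoted in the abstract.
TUMOR_BASES = 98e9        # bases sequenced for the tumor genome
TUMOR_COVERAGE = 32.7     # reported fold coverage, tumor
NORMAL_BASES = 41.8e9     # bases sequenced for the normal (skin) genome
NORMAL_COVERAGE = 13.9    # reported fold coverage, normal
TUMOR_SNVS = 2_647_695    # well-supported SNVs in the tumor genome
SHARED_SNVS = 2_588_486   # of those, also detected in the skin genome

def implied_genome_size(bases, fold_coverage):
    """Haploid genome size implied by bases / coverage (assumed relationship)."""
    return bases / fold_coverage

def shared_fraction(shared, total):
    """Fraction of tumor SNVs that were also seen in the normal sample."""
    return shared / total

# SNVs seen only in the tumor, i.e. the candidates left for further study.
tumor_only_snvs = TUMOR_SNVS - SHARED_SNVS

if __name__ == "__main__":
    print(f"implied genome size (tumor):  {implied_genome_size(TUMOR_BASES, TUMOR_COVERAGE):.3e}")
    print(f"implied genome size (normal): {implied_genome_size(NORMAL_BASES, NORMAL_COVERAGE):.3e}")
    print(f"shared fraction: {shared_fraction(SHARED_SNVS, TUMOR_SNVS):.4f}")
    print(f"tumor-only SNVs: {tumor_only_snvs}")
```

Both samples imply a haploid genome of roughly three billion bases, which is consistent with the two coverage figures having been computed the same way.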
Copy number variants (CNVs) are currently defined as genomic sequences that are polymorphic in copy number and range in length from 1000 to several million base pairs. Among current array-based CNV detection platforms, long-oligonucleotide arrays promise the highest resolution. However, the performance of currently available analytical tools suffers when applied to these data because of the lower signal:noise ratio inherent in oligonucleotide-based hybridization assays. We have developed wuHMM, an algorithm for mapping CNVs from array comparative genomic hybridization (aCGH) platforms comprised of 385 000 to more than 3 million probes. wuHMM is unique in that it can utilize sequence divergence information to reduce the false positive rate (FPR). We apply wuHMM to 385K-aCGH, 2.1M-aCGH and 3.1M-aCGH experiments comparing the 129X1/SvJ and C57BL/6J inbred mouse genomes. We assess wuHMM's performance on the 385K platform by comparison to the higher resolution platforms and we independently validate 10 CNVs. The method requires no training data and is robust with respect to changes in algorithm parameters. At a FPR of <10%, the algorithm can detect CNVs with five probes on the 385K platform and three on the 2.1M and 3.1M platforms, resulting in effective resolutions of 24 kb, 2–5 kb and 1 kb, respectively.
A dissection of the genetic networks and circuitries is described for two forms of leukaemia. By integrating transcription factor binding and gene expression profiling, networks are revealed that underlie this important human disease.
Acute myeloid leukemia (AML) comprises a group of diseases characterized by the abnormal development of malignant myeloid cells. Recent studies have demonstrated an important role for aberrant transcriptional regulation in AML pathophysiology. Although several transcription factors (TFs) involved in myeloid development and leukemia have been studied extensively and independently, how these TFs coordinate with others and how their dysregulation perturbs the genetic circuitry underlying myeloid differentiation is not yet known. We propose an integrated approach for mammalian genetic network construction by combining the analysis of gene expression profiling data and the identification of TF binding sites.
We utilized our approach to construct the genetic circuitries operating in normal myeloid differentiation versus acute promyelocytic leukemia (APL), a subtype of AML. In the normal and disease networks, we found that multiple transcriptional regulatory cascades converge on the TFs Rora and Rxra, respectively. Furthermore, the TFs dysregulated in APL participate in a common regulatory pathway and may perturb the normal network through Fos. Finally, a model of APL pathogenesis is proposed in which the chimeric TF PML-RARα activates the dysregulation in APL through six mediator TFs.
This report demonstrates the utility of our approach to construct mammalian genetic networks, and to obtain new insights regarding regulatory circuitries operating in complex diseases in humans.
Three cold shock domain (CSD) family members (YB-1, MSY2, and MSY4) exist in vertebrate species ranging from frogs to humans. YB-1 is expressed throughout embryogenesis and is ubiquitously expressed in adult animals; it protects cells from senescence during periods of proliferative stress. YB-1-deficient embryos die unexpectedly late in embryogenesis (embryonic day 18.5 [E18.5] to postnatal day 1) with a runting phenotype. We have now determined that MSY4, but not MSY2, is also expressed during embryogenesis; its abundance declines substantially from E9.5 to E17.5 and is undetectable on postnatal day 1 (adult mice express MSY4 in testes only). Whole-mount analysis revealed similar patterns of YB-1 and MSY4 RNA expression in E11.5 embryos. To determine whether MSY4 delays the death of YB-1-deficient embryos, we created and analyzed MSY4-deficient mice and then generated YB-1 and MSY4 double-knockout embryos. MSY4 is dispensable for normal development and survival, but the testes of adult mice have excessive spermatocyte apoptosis and seminiferous tubule degeneration. Embryos doubly deficient for YB-1 and MSY4 are severely runted and die much earlier (E8.5 to E11.5) than YB-1-deficient embryos, suggesting that MSY4 indeed shares critical cellular functions with YB-1 in the embryonic tissues where they are coexpressed.
Proteins containing “cold shock” domains belong to the most evolutionarily conserved family of nucleic acid-binding proteins known among bacteria, plants, and animals. One of these proteins, YB-1, is widely expressed throughout development and has been implicated as a cell survival factor that regulates the transcription and/or translation of many cellular growth and death-related genes. For these reasons, YB-1 deficiency has been predicted to be incompatible with cell survival. However, the majority of YB-1−/− embryos develop normally up to embryonic day 13.5 (E13.5). After E13.5, YB-1−/− embryos exhibit severe growth retardation and progressive mortality, revealing a nonredundant role of YB-1 in late embryonic development. Fibroblasts derived from YB-1−/− embryos displayed a normal rate of protein synthesis and minimal alterations in the transcriptome and proteome but demonstrated reduced abilities to respond to oxidative, genotoxic, and oncogene-induced stresses. YB-1−/− cells under oxidative stress expressed high levels of the G1-specific CDK inhibitors p16Ink4a and p21Cip1 and senesced prematurely; this defect was corrected by knocking down CDK inhibitor levels with specific small interfering RNAs. These data suggest that YB-1 normally represses the transcription of CDK inhibitors, making it an important component of the cellular stress response signaling pathway.
Expression of the PML-retinoic acid receptor α (PML-RARα) fusion protein is the initiating genetic event for acute promyelocytic leukemia (APL), but the molecular mechanisms responsible for disease initiation are not yet clear. Several observations have suggested that early myeloid cells are uniquely susceptible to transformation by PML-RARα. Recently, we have shown that the early myeloid-specific protease neutrophil elastase is important for APL development in the mouse. To better understand the role of neutrophil elastase for the pathogenesis of APL, we examined the consequences of PML-RARα expression in early myeloid cells with or without neutrophil elastase. We found that high-level PML-RARα expression was associated with cellular toxicity that was dependent on the expression of neutrophil elastase; a mutant form of PML-RARα that resisted neutrophil elastase cleavage was not toxic. When PML-RARα was expressed at very low levels in the early myeloid cells of mice, it induced myeloid expansion and delayed myeloid maturation; neutrophil elastase was also required for these activities. The activities of PML-RARα in early myeloid cells are therefore strongly influenced by the presence of neutrophil elastase. To assure physiologic relevance, PML-RARα functions should be evaluated in neutrophil elastase-expressing early myeloid cells.
Leukemia results from the expansion of self-renewing hematopoietic cells that are thought to contain mutations that contribute to disease initiation and progression. Studies of the gene expression profiles of human acute myeloid leukemia samples have allowed their classification based on the presence of translocations and French-American-British subtypes, but it is not yet clear whether their molecular signatures reflect the initiating mutations or mutations acquired during progression. To begin to address this question, we examined the expression profiles of normal murine promyelocyte-enriched samples, nontransformed murine promyelocytes expressing the human promyelocytic leukemia-retinoic acid receptor alpha (PML-RARα) fusion gene, and primary acute promyelocytic leukemia cells. The expression profile of nontransformed cells expressing PML-RARα was remarkably similar to that of wild-type promyelocytes. In contrast, the expression profiles of fully transformed cells from three acute promyelocytic leukemia model systems were all different, suggesting that the expression signature of acute promyelocytic leukemia cells reflects the genetic changes that contributed to progression. To further evaluate these progression events, we compared two high-penetrance acute promyelocytic leukemia models that both commonly acquire an interstitial deletion of chromosome 2 during progression. The two models exhibited distinct gene expression profiles, suggesting that the dominant molecular signatures of murine acute promyelocytic leukemia can be influenced by several independent progression events.
Gammaherpesviruses can establish lifelong latent infections in lymphoid cells of their hosts despite active antiviral immunity. Identification of the immune mechanisms which regulate gammaherpesvirus latent infection is therefore essential for understanding how gammaherpesviruses persist for the lifetime of their host. Recently, an individual with chronic active Epstein-Barr virus infection was found to have mutations in perforin, and studies using murine gammaherpesvirus 68 (γHV68) as a small-animal model for gammaherpesvirus infection have similarly revealed a critical role for perforin in regulating latent infection. These results suggest involvement of the perforin/granzyme granule exocytosis pathway in immune regulation of gammaherpesvirus latent infection. In this study, we examined γHV68 infection of knockout mice to identify specific molecules within the perforin/granzyme pathway which are essential for regulating gammaherpesvirus latent infection. We show that granzymes A and B and the granzyme B substrate, caspase 3, are important for regulating γHV68 latent infection. Interestingly, we show for the first time that orphan granzymes encoded in the granzyme B gene cluster are also critical for regulating viral infection. The requirement for specific granzymes differs for early versus late forms of latent infection. These data indicate that different granzymes play important and distinct roles in regulating latent gammaherpesvirus infection.
Because PML-RARA-induced acute promyelocytic leukemia (APL) is a morphologically differentiated leukemia, many groups have speculated about whether its leukemic cell of origin is a committed myeloid precursor (e.g. a promyelocyte) or a hematopoietic stem/progenitor cell (HSPC). We originally targeted PML-RARA expression with CTSG regulatory elements, based on the early observation that this gene was maximally expressed in cells with promyelocyte morphology. Here, we show that both Ctsg and PML-RARA targeted to the Ctsg locus (in Ctsg-PML-RARA mice) are expressed in the purified KLS cells of these mice (KLS = Kit+Lin−Sca+, which are highly enriched for HSPCs), and this expression results in biological effects in multi-lineage competitive repopulation assays. Further, we demonstrate the transcriptional consequences of PML-RARA expression in Ctsg-PML-RARA mice in early myeloid development in other myeloid progenitor compartments [common myeloid progenitors (CMPs) and granulocyte/monocyte progenitors (GMPs)], which have a distinct gene expression signature compared to wild-type (WT) mice. Although PML-RARA is indeed expressed at high levels in the promyelocytes of Ctsg-PML-RARA mice and alters the transcriptional signature of these cells, it does not induce their self-renewal. In sum, these results demonstrate that in the Ctsg-PML-RARA mouse model of APL, PML-RARA is expressed in and affects the function of multipotent progenitor cells. Finally, since PML/Pml is normally expressed in the HSPCs of both humans and mice, and since some human APL samples contain TCR rearrangements and express T lineage genes, we suggest that the very early hematopoietic expression of PML-RARA in this mouse model may closely mimic the physiologic expression pattern of PML-RARA in human APL patients.
The myelodysplastic syndromes are a group of hematologic disorders that often evolve into secondary acute myeloid leukemia (AML). The genetic changes that underlie progression from the myelodysplastic syndromes to secondary AML are not well understood.
We performed whole-genome sequencing of seven paired samples of skin and bone marrow in seven subjects with secondary AML to identify somatic mutations specific to secondary AML. We then genotyped a bone marrow sample obtained during the antecedent myelodysplastic-syndrome stage from each subject to determine the presence or absence of the specific somatic mutations. We identified recurrent mutations in coding genes and defined the clonal architecture of each pair of samples from the myelodysplastic-syndrome stage and the secondary-AML stage, using the allele burden of hundreds of mutations.
Approximately 85% of bone marrow cells were clonal in the myelodysplastic-syndrome and secondary-AML samples, regardless of the myeloblast count. The secondary-AML samples contained mutations in 11 recurrently mutated genes, including 4 genes that have not been previously implicated in the myelodysplastic syndromes or AML. In every case, progression to acute leukemia was defined by the persistence of an antecedent founding clone containing 182 to 660 somatic mutations and the outgrowth or emergence of at least one subclone, harboring dozens to hundreds of new mutations. All founding clones and subclones contained at least one mutation in a coding gene.
Nearly all the bone marrow cells in patients with myelodysplastic syndromes and secondary AML are clonally derived. Genetic evolution of secondary AML is a dynamic process shaped by multiple cycles of mutation acquisition and clonal selection. Recurrent gene mutations are found in both founding clones and daughter subclones. (Funded by the National Institutes of Health and others.)
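The abstract above infers clonality from the allele burden of somatic mutations. As a minimal sketch of that kind of reasoning, and not the authors' pipeline, the code below converts a variant allele fraction into an estimated fraction of mutant cells, assuming each mutation is heterozygous and sits in a copy-number-neutral diploid region (so the cell fraction is twice the variant allele fraction). The function names and the read counts in the example are hypothetical.

```python
import statistics

def clonal_cell_fraction(variant_reads, total_reads):
    """Estimated fraction of cells carrying a heterozygous, copy-neutral mutation."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    vaf = variant_reads / total_reads
    # One mutant allele out of two per cell, so cells = 2 x allele fraction.
    # Cap at 1.0 because sampling noise can push the estimate above 100%.
    return min(1.0, 2.0 * vaf)

def founding_clone_fraction(mutations):
    """Median cell fraction across (variant_reads, total_reads) pairs.

    Assumes every mutation passed in belongs to the founding clone.
    """
    return statistics.median(clonal_cell_fraction(v, t) for v, t in mutations)

if __name__ == "__main__":
    # Hypothetical deep-sequencing read counts for three founding-clone mutations.
    example = [(425, 1000), (430, 1000), (420, 1000)]
    print(founding_clone_fraction(example))
```

Under these assumptions, variant allele fractions near 42.5% correspond to roughly 85% of cells belonging to the clone. Mutations in regions with copy number changes or loss of heterozygosity would need a different conversion.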
Myelodysplastic syndromes (MDS) are hematopoietic stem cell disorders that often progress to chemotherapy-resistant secondary acute myeloid leukemia (sAML). We used whole genome sequencing to perform an unbiased comprehensive screen to discover all the somatic mutations in a sAML sample and genotyped these loci in the matched MDS sample. Here we show that missense mutations affecting the serine at codon 34 (S34) of U2AF1 were recurrent, occurring in 13/150 (8.7%) de novo MDS patients, with suggestive evidence of an associated increased risk of progression to sAML. U2AF1 is a U2 auxiliary factor protein that recognizes the AG splice acceptor dinucleotide at the 3′ end of introns, and mutations are located in highly conserved zinc fingers in U2AF1 [1,2]. Mutant U2AF1 promotes enhanced splicing and exon skipping in reporter assays in vitro. This novel, recurrent mutation in U2AF1 implicates altered pre-mRNA splicing as a potential mechanism for MDS pathogenesis.
Alterations in DNA methylation have been implicated in the pathogenesis of myelodysplastic syndromes (MDS), although the underlying mechanism remains largely unknown. Methylation of CpG dinucleotides is mediated by DNA methyltransferases, including DNMT1, DNMT3A, and DNMT3B. DNMT3A mutations have recently been reported in patients with de novo acute myeloid leukemia (AML), providing a rationale for examining the status of DNMT3A in MDS samples. Here, we report the frequency of DNMT3A mutations in patients with de novo MDS, and their association with secondary AML. We sequenced all coding exons of DNMT3A using DNA from bone marrow and paired normal cells from 150 patients with MDS and identified 13 heterozygous mutations with predicted translational consequences in 12/150 patients (8.0%). Amino acid R882, located in the methyltransferase domain of DNMT3A, was the most common mutation site, accounting for 4/13 mutations. DNMT3A mutations were expressed in the majority of cells in all tested mutant samples regardless of blast counts, suggesting that DNMT3A mutations occur early in the course of MDS. Patients with DNMT3A mutations had worse overall survival compared to patients without DNMT3A mutations (p=0.005) and more rapid progression to AML (p=0.007), suggesting that DNMT3A mutation status may have prognostic value in de novo MDS.
Keywords: myelodysplastic syndrome; DNMT3A; mutation
Context
The identification of patients with inherited cancer susceptibility syndromes facilitates early diagnosis, prevention, and treatment. However, in many cases of suspected cancer susceptibility, the family history is unclear and genetic testing of common cancer susceptibility genes is unrevealing.
Objective
To apply whole-genome sequencing to a patient with suspected cancer susceptibility (lacking a clear family history of cancer and with no BRCA1 or BRCA2 mutations) to identify rare or novel germline variants in cancer susceptibility genes.
Design, Setting, and Participant
Skin (normal) and bone marrow (leukemia) DNA were obtained from a patient with early-onset breast and ovarian cancer and therapy-related acute myeloid leukemia (t-AML), and analyzed with: 1) whole genome sequencing using paired-end reads; 2) SNP genotyping; 3) RNA expression profiling; and 4) spectral karyotyping.
Main Outcome Measures
Structural variants, copy number alterations, single nucleotide variants and small insertions and deletions (indels) were detected and validated using the above platforms.
Results
Whole genome sequencing revealed a novel, heterozygous 3 kb deletion removing exons 7-9 of TP53 in the patient’s normal skin DNA, which was homozygous in the leukemia DNA as a result of uniparental disomy. In addition, a total of 28 validated somatic single nucleotide variations or indels in coding genes, 8 somatic structural variants, and 12 somatic copy number alterations were detected in the patient’s leukemia genome.
Conclusion
Whole genome sequencing can identify novel, cryptic variants in cancer susceptibility genes in addition to providing unbiased information on the spectrum of mutations in a cancer genome.
Context
Whole genome sequencing (WGS) is becoming increasingly available for research purposes, but it has not yet been routinely used for clinical diagnosis.
Objective
To determine whether whole genome sequencing can identify cryptic, actionable mutations in a clinically relevant time frame.
Design, Setting, and Patient
We were referred a difficult diagnostic case of acute promyelocytic leukemia with no pathogenic X-RARA fusion identified by routine metaphase cytogenetics or interphase FISH. The patient was enrolled in an IRB-approved protocol, with consent specifically tailored to the implications of whole genome sequencing. The protocol employs a ‘movable firewall,’ which maintains patient anonymity within the entire research team but allows the research team to communicate medically relevant information to the treating physician.
Main Outcome Measure
Clinical relevance of whole genome sequencing and time to communicate validated results to the treating physician.
Results
Massively parallel paired-end sequencing allowed us to identify a cytogenetically cryptic event: a 77-kilobase segment from chromosome 15 was inserted en bloc into the second intron of the RARA gene on chromosome 17, resulting in a classic bcr3 PML-RARA fusion gene. RT-PCR subsequently validated the expression of the fusion transcript. Novel FISH probes identified two additional cases of t(15;17)-negative acute promyelocytic leukemia that had cytogenetically invisible insertions. Whole genome sequencing and validation were completed in seven weeks, and changed the treatment plan for the patient.
Conclusion
Whole genome sequencing can identify cytogenetically invisible oncogenes in a clinically relevant time frame.
Acute promyelocytic leukemia (APL) is a subtype of acute myeloid leukemia (AML). It is characterized by the t(15;17)(q22;q11.2) chromosomal translocation that creates the promyelocytic leukemia–retinoic acid receptor α (PML-RARA) fusion oncogene. Although this fusion oncogene is known to initiate APL in mice, other cooperating mutations, as yet ill defined, are important for disease pathogenesis. To identify these, we used a mouse model of APL, whereby PML-RARA expressed in myeloid cells leads to a myeloproliferative disease that ultimately evolves into APL. Sequencing of a mouse APL genome revealed 3 somatic, nonsynonymous mutations relevant to APL pathogenesis, of which 1 (Jak1 V657F) was found to be recurrent in other affected mice. This mutation was identical to the JAK1 V658F mutation previously found in human APL and acute lymphoblastic leukemia samples. Further analysis showed that JAK1 V658F cooperated in vivo with PML-RARA, causing a rapidly fatal leukemia in mice. We also discovered a somatic 150-kb deletion involving the lysine (K)-specific demethylase 6A (Kdm6a, also known as Utx) gene, in the mouse APL genome. Similar deletions were observed in 3 out of 14 additional mouse APL samples and 1 out of 150 human AML samples. In conclusion, whole genome sequencing of mouse cancer genomes can provide an unbiased and comprehensive approach for discovering functionally relevant mutations that are also present in human leukemias.