Identification of new mutations
The genetic events involved in leukemogenesis have been deciphered by using two approaches. First, genomic alterations have been identified by using karyotype analysis and DNA hybridization onto oligonucleotide arrays (SNP-arrays, array-CGH); several types of genomic profiles have been found: lack of detectable changes, uniparental disomies (UPD), losses of chromosomes or large chromosomal regions, trisomies, losses or gains of small regions or genes. Second, small gene mutations have been detected by classical Sanger sequencing [
17-
22] or, more recently, by the use of new technologies such as next generation sequencing (NGS) [
23-
31].
These studies, together with previous ones that had identified
JAK2, NPM1, MPL, RAS and
RUNX1 mutations, among others, led to the discovery of several major players in leukemogenesis:
ASXL1[
21],
BCORL1[
25],
CBL[
19],
DNMT3A[
24,
32],
EZH2[
20,
22],
IDH1/IDH2[
26],
TET2[
18] and
UTX[
33]. The mutational frequencies of these genes range from a few percent to more than 50%, or even virtually 100%, depending on the gene, the disease and the series studied. For example, almost all cases of polycythemia vera (PV) have a mutation of
JAK2[
34,
35]. Not counting the latter, mutations in
ASXL1 and
TET2 are frequently observed throughout the whole myeloid spectrum (Figure ), reaching 40-50% in CMML [
33,
36]. Mutations in
DNMT3A and
IDH1/2 are rare in the chronic stages but reach 15-20% in AML and exhibit a strong association with monocytic features [
30]. Genes encoding components of the splicing machinery, which removes introns during pre-mRNA maturation (mainly
SF3B1, SRSF2, U2AF35/U2AF1, and
ZRSR2) have been found frequently mutated in MDSs and CMML, and more rarely in MPNs and AML (Figure ) [
31,
37-
42]. Mutations in splicing factors are found in more than 60% of MDS with ring sideroblasts and in more than 50% of CMML [
31].
Mutations in leukemogenic genes have been described in detail in recent reviews [
7,
9,
10,
12-
16,
43] and will not be reviewed here. We will rather delve into the questions raised by these recent data.
Have we already identified the entire repertoire of mutated genes?
We may have identified (most of) the major culprits [
14]. First, there are hundreds of background mutations (i.e. that do not provide selective advantage) but only a limited number of driver mutations (i.e. that cause the disease) in each malignant disease. Second, many of the newly discovered mutated genes may affect the same pathways or networks as the major mutated genes. For example, deletions and mutations of
NF1, which have been recently identified [
17,
44,
45], or
PTPN11[
46] are thought to have the same effect as a
RAS mutation; a mutation of the
SHKBP1 gene [
47] or a duplication of the
SH3KBP1 gene [
48], which both encode cytoplasmic regulators of the CBL pathway, may have the same effect as a
CBL mutation [
49,
50]. Because the EZH2, EED and SUZ12 proteins all belong to the same polycomb repressive complex 2 (PRC2), the rare deletions or mutations of the
EED[
23,
51] and
SUZ12 genes [
17,
51] could have the same effect as
EZH2 mutations. Third, several genes (e.g.
ETV6[
52] or
RUNX1) can be structurally altered by mechanisms other than mutation, such as deletions and breakages. Fourth, some important regulatory genes could be affected not by structural alteration but through other mechanisms such as abnormal DNA methylation (e.g.
CDKN2A/B[
53],
TRIM33[
54],
CTNNA1[
55],
SOCS1[
56,
57]), histone modifications, mRNA splicing, microRNA or long non-coding RNA (lncRNA) modulation, or product degradation. Fifth, when all known mutated genes are analyzed in a series of cases, the percentage of samples with at least one mutated candidate driver gene varies from 50% [
58] to over 90% (in CMML; [
33]; Gelsi-Boyer et al., submitted). Moreover, most samples studied by NGS were shown to harbor gene mutations [
23,
26]. Thus, we are approaching the day when all cases can be defined by a combination of several alterations. The practical definition of leukemogenesis will then be based on a specific and limited repertoire of alterations, including translocations, mutations and copy number changes, affecting a defined set of driver genes.
However, some issues still need to be addressed. First, many genes may be mutated or deleted with a very low frequency (i.e. under 1%); their involvement and recurrence may be hard to demonstrate. Second, because NGS studies of several malignancies have shown that hundreds of genes can be mutated in a single tumor, background mutations should be discarded and driver genes validated. Third, we still lack information on some diseases such as essential thrombocythemia (ET), in which
JAK2 mutations are found in only half the cases, and
TET2 mutations in less than 10%. We also lack knowledge about the targeted genes of some frequent genomic alterations such as the 20q11-q13 deletion (
ASXL1 and
DNMT3B, more centromeric, are not involved). Fortunately, this lack of information is bound to disappear. The example of refractory anemia with ring sideroblasts (RARS) is instructive; in three-quarters of RARS, mutations have been recently found in
SF3B1, a gene encoding a subunit of both the U2 snRNP splicing factor and the STAGA histone acetyltransferase complexes [
27,
29,
31].
What are the functions of the mutated proteins?
Leukemogenic alterations mainly affect five classes of proteins (Figure ): signaling pathway components, such as ABL, CBL, CBLB, FGFR1, FLT3, JAK2, KIT, LNK, MPL, PDGFRs, PTPN11, PTPRT [
23,
59] and RAS; transcription factors (TFs), such as CEBPA, ETV6 [
58], GATA2 [
30], IKZF1 [
60], RARA and RUNX1; epigenetic regulators (ERs), such as ASXL1, BCORL1 [
25], DAXX [
23], DNMT3A, EZH2 [
20,
22], MLL, MYST3, NSD1 [
30], PHF6 [
61], SUZ12 [
17,
51], TET2 and UTX [
28]; tumor suppressors (TSGs), such as CDKN2A, TP53 and WT1; and components of the spliceosome [
27,
29,
31,
38,
39,
41,
42]. However, additional alterations occur in genes encoding proteins that it is too early to classify into these defined categories, such as DIS3, DDX41 [
23], mitochondrial NADH dehydrogenase ND4 [
62], or cohesin complex proteins [
23,
63].
In chronic stages, alterations in signaling molecules can be grouped into two major categories: a first one that is found in MPNs and affects oncogenic tyrosine kinases (ABL1, JAK2, FGFR1, PDGFRs) and the downstream JAK-STAT and/or PI3-kinase pathways, and a second one that is found in CMML and affects the RAS-MAP kinase pathway (RAS, PTPN11, NF1). CBL alterations occur in a wide variety of myeloid diseases [
50].
TFs and ERs constitute the largest classes, which involve several categories of proteins (Figure ); because there are many ways to affect gene expression, it is probable that not all of these categories are known yet. The existence of epigenetic alterations in myeloid malignancies has been known for a long time [
64,
65]. For example, alterations of MLL, a histone methyltransferase (HMT), and MYST3, a histone acetyltransferase (HAT), have shown the importance of epigenetic deregulation in AMLs with translocation [
45,
64,
65]. However, in chronic diseases and in AMLs with normal karyotype, the extent, causes, identities, exact roles and consequences of epigenetic alterations have long remained elusive. Molecular studies have recently shown that both DNA methylation and histone regulation are affected, and that epigenetic alterations may be due to genetic alterations (i.e. mutations in genes encoding epigenetic regulators). The latter phenomenon has been observed in genome-wide analyses of many neoplasias [
28,
66,
67]. However, not all epigenetic alterations may be due to an abnormal genetic background [
1,
53,
54].
The recent reports of the interrelated functions of IDH1/2 and TET2 in DNA methylation represent a major breakthrough in our understanding of leukemogenesis [
68]. It was initially hard to associate mutations of IDH1 and IDH2, two metabolic enzymes, with mutations in TET2, a gene product of then-unknown function, and equally hard to suspect their role in DNA methylation. A rapid series of elegant studies has shown i) that
IDH1/2 and
TET2 mutations are mutually exclusive in myeloid malignancies [
68], ii) that mutated IDH1 and IDH2 produce 2-hydroxyglutarate instead of alpha-ketoglutarate (αKG) [
69,
70], iii) that
TET2 encodes an αKG–dependent methyl cytosine dioxygenase whose mutation alters the conversion of 5-methylcytosine (5-mC) to 5-hydroxymethylcytosine (5-hmC) [
68,
71] and iv) that both
IDH1/2 and
TET2 mutations impact on DNA methylation and are involved in the same biochemical pathway [
72]. In addition, TET proteins can convert 5-hmC into 5-formylcytosine and 5-carboxylcytosine, but the roles of these derivatives are currently unknown [
73]. The recent studies on TET proteins suggest a role in removing aberrant DNA methylation to ensure DNA methylation fidelity [
74]. This has opened a new area of research since first, other factors involved in DNA demethylation may exist and second, several αKG–dependent enzymes, such as jumonji histone demethylases [
75] are epigenetic regulators; therefore, some of these proteins could also be involved in malignancies. However,
IDH1/2 and
TET2 mutations, while mutually exclusive, are not equivalent because
IDH1/2 mutations are more frequent in acute than in chronic myeloid diseases, whereas this is not the case for
TET2 alterations, which are more evenly distributed between chronic and acute stages. Inactivation of TET2 increases self-renewal in hematopoietic stem cells and induces a disease resembling CMML in mouse models [
76,
77]. Mutated IDH1/2 enzymes may also affect self-renewal, but to a different degree. The likely explanation is that IDH1/2 and TET2 have other, non-overlapping functions in the regulation of DNA methylation and histone marks. Also, an IDH-mutated product may depend on another, rate-limiting factor to exert a leukemogenic effect. DNMT3A is a
de novo DNA methyltransferase involved in the formation of 5-mC and has complex interactions with polycomb and HMT proteins [
78]. How DNMT3A mutations affect DNA methylation remains to be defined [
24,
30,
79]; they probably do so in a different way from
TET2 or
IDH1/2 mutations since they may co-occur with either of them. A recent study showed that DNMT3A loss leads to upregulation of hematopoietic stem cell genes and downregulation of differentiation genes but is insufficient on its own to induce a malignant disease in a mouse model [
79].
Mutations in regulators of histone marks have become a major subject of research and the relationships between them are quickly being unveiled. Central regulators of myelopoiesis and key players in leukemogenesis seem to be the polycomb repressive complexes, especially PRC2, which, in addition to direct defects of its components (EED, EZH2, SUZ12), could be affected in its concerted action with several ERs, such as ASXL1, cohesins, DNMT3A, IDH1/2, MLL, TET2 and UTX. TET proteins could regulate pluripotency and self-renewal through interaction with PRC2 [
74,
80,
81]. The cohesin complex is encoded by four genes (
SMC1, SMC3, RAD21 and
STAG2), which have been found mutated [
23] and deleted [
63]. A major interactor of the cohesin complex is CTCF. PRC2 is recruited to specific loci through interaction of SUZ12 with CTCF [
82]. Another main leukemogenic interactor of PRC2 components is ASXL1. A recent study showed that ASXL1 loss affects PRC2 complexes and H3K27me3 histone marks, and induces a strong hematopoietic phenotype consistent with an MDS in a conditional knock-out mouse model [
83]. ASXL1 would direct PRC2 to leukemogenic loci such as
HOXA genes. Thus, through direct alterations of its components or of proteins or lncRNAs [
84] that recruit the complex, PRC2 has emerged as a key node in a network regulating hematopoietic stem cell self-renewal and proliferation and as a major factor in myeloid leukemogenesis. This is also true for T-cell leukemogenesis [
85]. Correct functioning of polycomb repressive complex 1 (PRC1) also seems to be important for myeloid cells, since the loss of BMI1 (a component of PRC1) in the mouse leads to a disease similar to PMF [
86]. Structural alterations of the
BMI1 gene occur but are rare in human myeloid diseases [
87].
Whether other chromatin-associated complexes play a role in leukemogenesis should soon be revealed. ASXL1 could play a role in a cross-talk between major chromatin silencing systems: PRC1/PRC2, the HP1α/CBX5 heterochromatin repressive complex and the polycomb repressive deubiquitinase (PR-DUB) complex. Mutations in
BCOR and
BCORL1 suggest that the RAF/BCOR complex [
84,
88] might be involved in AML. The recent identification of a mutation in the
DAXX gene in an AML case [
23] further supports a wide participation of chromatin-regulatory complexes in leukemogenesis and cancer in general. DAXX and ATRX (which is mutated in X-linked α-thalassemia) are subunits of a chromatin remodeling complex and are both mutated in solid tumors [
89,
90].
The importance of the fifth class of mutated genes was more unexpected. Mutations in components of the spliceosome, which are mutually exclusive, lead to splicing defects including exon skipping, intron retention and use of incorrect splice sites [
31]. A recent study showed that a consequence of splicing gene mutations is accumulation of unspliced transcripts affecting a specific subset of mRNAs [
41].
What are the effects of the gene mutations?
The dominant-positive effects of oncogenes such as
BCR-ABL1, mutated
FLT3, JAK2 or
RAS, have been easy to understand.
CBL and
LNK mutations inactivate brakes on signaling pathways and may have a dominant-negative effect.
TET2 is inactivated in the manner of a tumor suppressor.
EZH2 is frequently associated with UPD and acts as a TSG. A frequent form of defect seems to be haplo-insufficiency [
91], which could be associated with the (generally) heterozygous loss or mutation of
ASXL1, NF1, NPM1, TP53, RUNX1 or
TET2. Neo-functionalization results from
IDH1/2 mutations, which are always mono-allelic. For genes altered through different mechanisms (mutations, deletions or translocations) such as
RUNX1 or with different types of mutations (hotspot or dispersed) such as
DNMT3A, the function might be variably affected and some mutants may have a dominant-negative effect. Mutations in spliceosome genes are mostly missense and could result in proteins with a modified but not inactivated function.
Mutations in signaling pathways, transcription networks and splicing machinery have many downstream consequences. Modifications in epigenetic regulation of DNA and histones may have a strong amplifying effect since they impact on the transcription of thousands of genes. This in turn impacts on the properties of hematopoietic stem cells, favoring self-renewal and proliferation over differentiation, thus promoting leukemogenesis [
92]. However, chimeric proteins involving TFs and ERs (e.g. MLL, MYST3, NSD1 …) may induce a stronger effect than mutations in other TFs and ERs (such as ASXL1, EZH2 or TET2), which may need to co-occur with several other alterations to trigger AML, often after a chronic phase. Perhaps like the difference between a water jet and a sprinkling rain, this difference may have to do with the specific functions of TFs and ERs [
64]. TF and ER fusion proteins assemble in complexes that are directly recruited to their target genes where they modify the local histone marks, drastically altering transcription. In contrast, mutated ERs may moderately perturb the epigenetic network, resulting in global gene deregulation.
Mutations in spliceosome components may lead to several types of deregulation, including alterations of the epigenetic control of differentiation and self-renewal; they may thus result in the same defects as TF and ER mutations. This may derive from splicing aberrations of leukemogenic genes (e.g.
RUNX1) [
41] or from other specific but indirect defects. SF3B1, for example, interacts with components of the polycomb repressive complex 1 (PRC1), and
SF3B1 mutations may compromise PRC1 regulation of leukemogenic loci [
93]. Reciprocally, the function of the pre-mRNA splicing machinery involves the reading of histone marks, and defective chromatin regulators may affect splicing [
94]. Directly or indirectly,
SF3B1 mutations, which are associated with the presence of ring sideroblasts, are likely to affect genes involved in red cell biology and mitochondrial function. Because mutations in splicing genes, in TFs and in ERs are not mutually exclusive, it is probable that the three types of alterations have additive rather than interchangeable effects.