One of the most fundamental questions in the control of gene expression in mammals is how epigenetic methylation patterns of DNA and histones are established, erased, and recognized. This central process in controlling metazoan gene expression includes coordinated covalent modifications of DNA and its associated histones. This review focuses on recent developments in characterizing the functional links between the methylation status of the DNA and of two particularly important histone marks. Mammalian DNA methylation is intricately connected to the presence of unmodified lysine 4 and methylated lysine 9 residues in histone H3. An interconnected network of methyltransferases, demethylases, and accessory proteins is responsible for changing or maintaining the modification status of specific regions of chromatin. The structural and functional interactions among members of this network are critical to processes that include imprinting and differentiation, dysregulation of which is associated with disorders ranging from inflammation to cancer.
Nucleosomes, the fundamental building blocks of eukaryotic chromatin, consist of ~146 bp of DNA wrapped ~1.8 times around a histone octamer that is extremely well-conserved evolutionarily (1). Chromatin, rather than being a passive platform for storing genetic information, can regulate transcriptional processes through postsynthetic modifications of both of its components: DNA and histones. Combinations of modifications regulate chromatin structure, thereby determining its different functional states and playing a central role in differentiation (2, 3). Serious human diseases can result from defects in DNA methylation, ranging from nine known imprinting-associated disorders (4) through cancer (5-9) and obesity (10, 11) to immune responses (12-15) and neurological disorders (16, 17). Despite its fundamental importance, much remains to be learned about how specific segments of chromatin are targeted for modifications (or demodifications) that boost or silence its transcription. One broad theme that has become increasingly clear is that a web of interactions tightly coordinates the modifications of a segment of DNA and its associated histones.
This work focuses on recent developments in characterizing the functional links between histone and DNA modifications in mammalian cells, and the mechanisms underlying these links. An overview of the links on which we will focus, represented by the dashed lines, is provided in Figure 1. As indicated in that figure, we will focus in particular on the links between modification of DNA and histone H3, which appear at this point to be the most significant. The two H3 modifications that appear most closely associated with DNA methylation are methylation of Lys4 and of Lys9 (H3K4me and H3K9me, respectively). These two sites of histone lysine modification appear to play major roles in development and differentiation (18-23). Two very recent examples illustrate the key roles played by H3K4me and H3K9me. First, H3K4 methylation status was found to be highly predictive of expression levels (from promoters having a relatively low CpG content) in human T cells (24). Second, H3K9 methylation status (even in a triple Dnmt1/3a/3b−/− background) was strongly associated with proviral silencing in mouse embryonic stem cells (25). The enzymes responsible for these methylations have largely been identified, though their relative roles are not entirely clear (26). The enzymatic mechanisms of histone methylation have been defined (27), and some specific inhibitors are available (28-31). High-throughput methods for characterizing histone modification states are becoming available (32, 33), so our understanding of H3 methylation is expected to develop rapidly. Our purpose here, however, is to focus on the linkages between H3 methylation and the associated DNA.
DNA methylation and histone modifications are intimately connected with one another (34-36). In fact, genome-scale DNA methylation profiles suggest that DNA methylation is better correlated to histone methylation patterns than to the underlying genome sequence context (37). Specifically, DNA methylation is associated with the absence of H3K4 methylation and the presence of H3K9 methylation. In a case of the exception proving the rule, the process of seeking regions with methylation of both K4 and K9 has been used to identify candidate loci subject to imprinting, where one of the two parental alleles is active and the other silenced (38).
Methylation of H3K4 has been suggested (19) to protect promoters from de novo DNA methylation in somatic cells (39, 40). There is considerable evidence of an inverse relationship between H3K4 methylation and allele-specific DNA methylation in differentially methylated regions (37, 41-44). A very recent genome-scale analysis confirmed the strong anticorrelation between DNA methylation and H3K4 methylation (while finding no correlation with methylation of H3K27) (45).
In contrast to that at H3K4, methylation at H3K9 is positively correlated with DNA methylation. In fact, inhibiting DNA methylation with the methyltransferase inhibitor 5azaC leads to a loss of H3K9 methylation (46). There is evidence that the H3K9-CpG linked methylations represent an evolutionarily conserved silencing pathway. In the filamentous fungus Neurospora, the H3K9 methyltransferase DIM-5 is required for DNA methylation (47, 48), while in the plant Arabidopsis, the H3K9 methyltransferase KRYPTONITE is required (49). With regard to mammals, mouse ES cells that lack the heterochromatin-associated H3K9 methyltransferases Suv39h1 and Suv39h2 exhibit some demethylation of satellite DNA (50). G9a and GLP (G9a-like protein), two related euchromatin-associated H3K9 methyltransferases (51), have also been implicated in DNA methylation at various loci, including imprinting centers (52, 53), retrotransposons and satellite repeats (54), a G9a/GLP target promoter (55), and a set of embryonic genes (56). In addition, as described below, G9a interacts directly with DNA methyltransferases Dnmt1 and Dnmt3a.
The correlation of DNA methylation with unmethylated H3K4 and methylated H3K9 requires a mechanism to ensure that H3K4 and H3K9 are not simultaneously methylated or demethylated. However, structural and biochemical data available to date have indicated that this does not seem to be due to a simple mechanism whereby (for example) methylated H3K4 directly inhibits binding of a H3K9 methyltransferase.
The H3K9 methyl “writers” (methyltransferases G9a and GLP) form heterodimers through their catalytic domains (51) and preferentially methylate lysines in an Arg-Lys (R8-K9) consensus sequence (57). They apparently do not require H3K4 for binding, as their ankyrin repeats bind H3 at K9me1 or K9me2 (58).
Conversely, H3K4me0 “readers” do not appear to probe H3K9 methylation. ADD domains of Dnmt3a and Dnmt3L (for ATRX-Dnmt3-Dnmt3L) interact with the first six or seven residues of H3 (59, 60), while the PHD domain of BHC80 binds the first eight residues of the H3 tail containing H3K4me0 (61). Thus, in these cases, the status of K9 modification is not important for binding.
Similarly, the H3K4 methyl “eraser” LSD1 (62) does not appear to probe the methylation status of H3K9 when the enzyme acts on methylated H3K4. This conclusion is based on studies with a 21-residue H3 amino-terminal synthetic peptide containing methionine in place of methylated K4 (K4M), which yielded a 30-fold increase in binding affinity for LSD1, making the variant peptide a strong inhibitor and an ideal candidate for structural work. In fact, Forneris et al. (63) structurally resolved the first 16 residues of the H3K4M peptide, in agreement with their previous biochemical data which showed that LSD1 is active on peptide substrates longer than 16 residues (64). This study was the first in which a long, structured histone tail was visualized in histone-modifying enzymes and protein domains that recognize (decode) methyllysine. However, the key feature of the structure, for the purposes of this review, is the fact that the side chain of K9 did not interact with LSD1 (when its active site engaged methylated H3K4), was partially disordered, and pointed toward the solvent (63).
A combination of structural, enzymological, and protein interaction studies, in organisms ranging from fission yeast to mammals, implicates the Jumonji demethylases as playing a critical role in the H3K4/K9 reciprocal methylation. In Schizosaccharomyces pombe, the Jumonji protein Lid2 appears to have alternative effects depending on which other proteins are locally present (65). In heterochromatin, Lid2 is in a six-protein complex that leaves Lid2 an active K4me3 demethylase while including the K9 methylase Clr4. In euchromatin, Lid2 forms a complex with Set1 and Lsd1 in a form that blocks Lid2 activity; Set1 methylates K4, and Lsd1 demethylates K9. This leaves open the question of what determines the local concentrations of these alternative binding partners of Lid2. We note that mammalian LSD1 is capable of demethylating H3K9 in an androgen receptor-dependent manner (66, 67).
Enzymatic and structural studies of two related human Jumonji demethylases also provided key insights into H3K4 and H3K9 methylation reciprocity (68). PHF8 (plant homeodomain finger protein 8) and KIAA1718 (also known as JHDM1D) belong to a small family of Jumonji proteins that has three members in mice and humans (PHF2, PHF8, and KIAA1718) (69). These proteins harbor two domains in the N-terminal half (Figure 2A): a PHD domain that binds H3K4me3 and a Jumonji domain that demethylates H3K9me2 and H3K27me2 (70). However, PHF8 is substantially more active on a peptide that contains both H3K4me3 and H3K9me2 (68). In contrast, H3K4me3 significantly reduces the H3K9me2 demethylase activity of KIAA1718 (while having no effect on its H3K27me2 activity). This difference in substrate specificity can be explained by the bent conformation of PHF8, which allows each of its domains to engage their respective targets, and by the extended conformation of KIAA1718, which prevents its access to H3K9me2 by its Jumonji domain when its PHD domain engages H3K4me3 (Figure 2A). The structural linkage between PHD binding to H3K4me3 and the placement of the catalytic Jumonji domains relative to this activating epigenetic landmark determines which repressive marks (H3K9me2 or H3K27me2) are removed by the demethylases (68).
The use of multiple binding domains in concert, to enhance an enzyme’s activity and its substrate specificity, may be a general mechanism for Jumonji demethylases. For example, JHDM2A-mediated histone H3K9me1/2 demethylation requires a zinc finger N-terminal to the Jumonji domain for its enzymatic activity (71). JARID Jumonji family proteins (including Lid2 in S. pombe) contain a Jumonji domain that demethylates H3K4me3 surrounded by several PHD domains, and at least one of them binds H3K9me3 (65, 72) (Figure 2B). Mutation or deletion of this PHD domain impairs the demethylase activity on H3K4me3 (65, 72). It was thought that, in a repressing environment with H3K9me3 bound by PHD, the ideal substrate for the JARID family is H3 trimethylated at both K4 and K9, allowing the enzyme to remove any local activating methyl groups of H3K4me3 by the Jumonji (68). JMJD2A contains an N-terminal Jumonji domain and C-terminal PHD and Tudor domains (Figure 2C). The JMJD2A Jumonji domain alone is capable of demethylating tri- and dimethylated H3K9 (H3K9me3/2) and H3K36 (H3K36me3/2) (73-75). On the other hand, the JMJD2A Tudor domain binds two different histone sequences (H3K4me3 and H4K20me3) via radically different approaches (76, 77). The functional connection between the methyl mark reader and eraser in JMJD2A is not clear. We speculate that each of the two demethylase activities for JMJD2A (H3K9me3/2 and H3K36me3/2) correlates with one of the methyl marks (H3K4me3 and H4K20me3) recognized by the Tudor domain.
LSD1 and 2 are two related lysine-specific demethylases whose substrates include mono- and dimethylated H3K4 (H3K4me1/2) (78, 79). Given the known association between H3K4me0 and DNA methylation, it is not surprising that disrupting the genes for mammalian LSD1 and LSD2 revealed an essential role in maintaining global DNA methylation (80) and establishing maternal DNA genomic imprints (81), respectively. The simplest explanation for LSD2-promoted DNA methylation is that demethylating H3K4 makes imprinted loci more accessible to the Dnmt3a-Dnmt3L de novo DNA methylation machinery (81).
While LSD1-promoted global DNA methylation may be explained by generation of H3K4me0, and its binding by UHRF1 or Dnmt3a (see below), an alternative mechanism is also possible. This alternative involves modulation of the stability of the maintenance DNA methyltransferase Dnmt1, via methylation of that protein (80). Dnmt1 can be methylated at Lys142 by Set7/9 (a protein lysine methyltransferase), and this results in decreased stability (82). In the absence of LSD1, Dnmt1 stability is reduced in vivo, and this may explain the progressive loss of DNA methylation (80). There is no direct evidence yet that LSD1 demethylates Lys142 of Dnmt1, but this is an intriguing possibility as the mammalian LSD1 is capable of demethylating H3K9 in an androgen receptor-dependent manner (66, 67).
A full discussion of the control of histone methylation (even just of H3) is beyond the scope of this review. However, the relationship between H3K4 and H3K9 methylation may be influenced by features we will only briefly mention here, such as the number of methyl groups attached to each Lys, the presence of acetylation, and which H3 variant is involved. Different H3K4 methyltransferase complexes have different relative propensities for generating di- versus trimethylation (83), and changes in the relative amounts and distribution of the various H3K4 methyltransferases could have significant effects on chromatin activity. Furthermore, there is an association between H3K4 methylation and acetylation elsewhere on H3 (84, 85). The H3 variants (H3.1, H3.2, and H3.3) differ at just five positions (86, 87). In particular, the first 31 residues are identical, so there is no difference in the immediate contexts of K4 and K9; however, other residues in the core histones affect H3 methylation, at least that of K4 (88). Even before incorporation into nucleosomes, some methylation at H3K9 has been reported (89), and this methylation is substantially more abundant on H3.1 than on H3.3 [which may play a role only in gametogenesis (90)].
Dnmt3L is a noncatalytic paralog of Dnmt3a and -3b that is expressed primarily in gametogenesis (91-94) but may also be involved in subtelomeric methylation (95). Dnmt3L was found to associate in vivo not only with Dnmt3b and Dnmt3a2 [a shorter isoform of Dnmt3a predominantly expressed in embryonic stem cells (96)] but also with the four core histones (59). Peptide interaction assays showed that Dnmt3L specifically interacts with the amino terminus of histone H3, only when H3K4 is not modified (H3K4me0) (59). Cocrystallization of Dnmt3L with the amino tail of H3 showed this tail bound to the N-terminal ADD domain of Dnmt3L (59). These data suggest that Dnmt3L acts as a sensor for H3K4 methylation: when methylation is absent, Dnmt3L induces de novo DNA methylation by docking Dnmt3a to the nucleosome (Figure 3).
The phenotype of Dnmt3L knockout mice is indistinguishable from that of Dnmt3a germ cell-specific conditional knockout mice, as both have lost parent-of-origin de novo DNA methylation (imprinting) in maternal germ cells, and methylation of dispersed retrotransposons in paternal germ cells (92, 97-100). While Dnmt3a and Dnmt3L are essential for methylation of imprinted genes and enhance de novo methylation of repetitive elements in growing oocytes, Dnmt3b is dispensable for mouse gametogenesis and imprinting. Dnmt3L colocalizes and co-immunoprecipitates with both Dnmt3a and Dnmt3b (101) and enhances de novo methylation by both of these methyltransferases (102-106). The interaction occurs through the C-terminal domains of both proteins (103-107) (Figure 3), as illustrated by the structure of the complex between C-terminal domains of Dnmt3a and Dnmt3L (108).
Histone-Dnmt3L-Dnmt3a-DNA interactions have recently been studied in the budding yeast Saccharomyces cerevisiae, which has no detectable DNA methylation (109) and has no orthologs of Dnmts. Introducing the murine maintenance methyltransferase Dnmt1, or Dnmt3a alone, leads to detectable but extremely low levels of DNA methylation in yeast (110). Through joint expression of murine Dnmt3a and Dnmt3L, Hu et al. (111) achieved substantially higher levels of de novo methylation. They found that the N-terminus of H3, including K4, is required for this DNA methylation, while neither the central part of the H3 tail (including K9 and K27) nor the H4 tail is necessary. The yeast DNA methylation was found preferentially in heterochromatin regions in which H3K4 methylation is rare. When genes for components of the H3K4-methylating COMPASS–Set1 complex were disrupted, in yeast cells producing the murine Dnmt3a and Dnmt3L, the overall level of DNA methylation was up to 5-fold higher. Hu et al. next used this system to explore the interaction between Dnmt3L and H3K4 described above (59). In the yeast system, deletions or targeted mutations in the ADD portion of Dnmt3L greatly reduced both the level of global DNA methylation and the level of Dnmt3L pulldown of an H3K4me0 peptide. Finally, when these Dnmt3L mutants were introduced into mouse ES cells from which native Dnmt3L had been deleted, the level of DNA methylation (at the tested promoter) was indistinguishable from that seen in Dnmt3L−/− cells (111). Thus, the interaction of Dnmt3L with unmethylated H3K4 is a central link between histone and DNA methylation.
As noted above, Dnmt3L binds to H3K4me0, and this would recruit Dnmt3a to H3K4-hypomethylated regions of chromatin. However, in somatic cells, Dnmt3a is expressed but Dnmt3L is expressed poorly if at all. This raises the question of whether Dnmt3a alone is capable of discriminating H3K4 methylation status and (if so) the structural basis for that discrimination. To determine the intranuclear distribution of Dnmt3a and Dnmt3b, Jeong et al. used sucrose density gradients of chromatin that had been fragmented by partial or complete micrococcal nuclease digestion, followed by Western blot analysis. The results revealed little free Dnmt3a or -3b in the nuclei of HCT116 human colon cancer cells (which do not express Dnmt3L) (112). Almost all of the cellular Dnmt3a and -3b (but not Dnmt1) was associated with a subset of nucleosomes containing methylated short (SINE) and long (LINE) interspersed repetitive nuclear elements and CpG islands (112). Chromatin binding of Dnmt3a and -3b required intact nucleosomal structure, though no other chromatin factors, suggesting that Dnmt3a and -3b alone are capable of direct interaction with chromatin components in addition to DNA.
Indeed, a recent structure of the Dnmt3a ADD domain in complex with an amino-terminal tail peptide from histone H3 indicates that Dnmt3a independently recognizes H3K4me0 (60). Interestingly, this Dnmt3a ADD domain was reported to bind symmetrically dimethylated Arg3 in histone H4 (H4R3me2s), in addition to H3K4me0, as shown by peptide pulldown assays (see Figure 4d of ref 113). However, this Dnmt3a–H4R3me2s interaction was not seen by others using isothermal titration calorimetry or nuclear magnetic resonance titration (60). A similar discrepancy has also been seen in the case of WDR5, a WD40 protein reported to bind methylated H3K4 in pulldown assays (114), while later structural and biophysical work revealed that it is a peptidyl arginine recognition factor (115-117), highlighting some of the difficulties in these studies.
The question of whether Dnmt3a alone or Dnmt3a and -3b together can form linear tetramers, similar to Dnmt3L-3a-3a-3L (see below), in somatic cells where Dnmt3L is not expressed remains unclear. Dnmt3a and Dnmt3b also exhibit nonoverlapping functions in development, with Dnmt3b specifically required for methylation of centromeric minor satellite repeats (118). Dnmt3a is fairly ubiquitously expressed, while Dnmt3b is expressed at very low levels in most tissues except testis, thyroid, and bone marrow (119). The residues forming the Dnmt3L–Dnmt3a interface are highly conserved in all three polypeptides (120), so it seems very likely that Dnmt3a alone or Dnmt3a and Dnmt3b could use the same interface to oligomerize (121) or form filament-like structures (122).
One unexpected feature of the Dnmt3a-3L complex structure is that the Dnmt3a-3L heterodimer further dimerizes through Dnmt3a to form a tetramer (Dnmt3L-3a-3a-3L) (108) (Figure 3). Dimerization via the 3a–3a interface brings two active sites together. Superimposing the Dnmt3a-3L tetramer structure onto a nucleosome structure (Figure 3) yields a model in which the two active sites are both located, approximately one helical turn apart, in the DNA major groove. This would imply that the central Dnmt3a dimer in the tetramer would preferentially methylate two CpGs separated by 8–10 bp, as demonstrated by in vitro methylation assays (108). Recent bioinformatic analysis revealed an 8 bp periodicity in the distribution of CpGs in human DNA, resulting mostly from Alu SINEs and imprinted regions (123).
In fact, an ~10 bp DNA methylation periodicity has since been reported in 12 known maternally imprinted regions (108), human chromosome 21 (from leukocytes of healthy donors) (124), Arabidopsis thaliana (which produces a protein, DRM2, related to mammalian Dnmt3a) (125), and more recently non-CpG methylation in human embryonic stem cells (126).
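The kind of spacing analysis referred to above can be illustrated with a minimal sketch that tallies the distances between CpG dinucleotides in a sequence. This is not the method used in the cited studies; the function names (`cpg_positions`, `cpg_spacing_counts`), the 30 bp cutoff, and the toy sequence are assumptions chosen only for illustration.

```python
from collections import Counter


def cpg_positions(seq):
    """Return the 0-based position of the C in every CpG dinucleotide."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


def cpg_spacing_counts(seq, max_distance=30):
    """Count distances (in bp) between all CpG pairs up to max_distance apart."""
    positions = cpg_positions(seq)
    counts = Counter()
    for i, first in enumerate(positions):
        for second in positions[i + 1:]:
            distance = second - first
            if distance > max_distance:
                break  # positions are sorted, so later pairs are farther apart
            counts[distance] += 1
    return counts


# Hypothetical toy sequence with one CpG every 10 bp (not real genomic data).
toy_sequence = "CGATATATAT" * 4
print(sorted(cpg_spacing_counts(toy_sequence).items()))
```

In a real analysis, an enrichment of counts at distances of roughly 8 to 10 bp (and their multiples) relative to a background expectation would be the signature of the periodicity discussed above.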
Hemimethylation, where the CpG on only one DNA strand contains 5mC, is produced by replication. Maintenance methylation conserves the methylation pattern by modifying the daughter strand CpG. One of the critical questions in the DNA methylation field is the basis for the intrinsic preference of Dnmt1 for hemimethylated CpG sites (127). Dnmt1 alone is necessary but insufficient for proper maintenance methylation, since its preference for hemi- over unmethylated CpG sites is only moderate (128).
The solution to this apparent paradox is provided by an accessory protein called UHRF1 (ubiquitin-like, containing PHD and RING finger domains 1), also called Np95 (nuclear protein of 95 kDa) in mice and ICBP90 (inverted CCAAT binding protein of 90 kDa) in humans. UHRF1 harbors five recognizable functional domains (Figure 4): a ubiquitin-like domain (UBL) at the N-terminus, followed by a tandem Tudor domain that binds H3K9me3 (129, 130), a plant homeodomain (PHD) that binds the histone H3 tail (131, 132), a SET- and RING-associated (SRA) domain that binds hemimethylated CpG-containing DNA (133-137), and a really interesting new gene (RING) domain at the C-terminus that may provide UHRF1 with E3 ubiquitin ligase activity for histones (131). However, it is not yet clear how these domains are structurally arranged or functionally coordinated.
UHRF1 binds both Dnmt1 and hemimethylated DNA (133, 134, 138), explaining this accessory protein’s ability to target Dnmt1 to newly replicated DNA. In fact, the maintenance of DNA methylation is compromised in cells deficient for UHRF1 (133, 134). The fact that UHRF1 also binds methylated H3K9 (129-132) indicates that UHRF1 is a key component in coupling maintenance methylation of DNA by Dnmt1 and histone modifications during DNA replication (Figure 4).
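The logic of maintenance methylation described above can be sketched as a toy model. It assumes an idealized, perfect-fidelity process and does not attempt to represent Dnmt1 kinetics, UHRF1 binding, or histone marks; the representation of a duplex as a list of (top strand, bottom strand) methylation flags and the function names are assumptions made only for this illustration.

```python
def replicate(duplex):
    """Semiconservative replication: each daughter keeps one parental strand.

    A duplex is a list of (top_methylated, bottom_methylated) flags, one per
    CpG site. Newly synthesized strands start out unmethylated.
    """
    top_daughter = [(top, False) for top, _ in duplex]
    bottom_daughter = [(False, bottom) for _, bottom in duplex]
    return top_daughter, bottom_daughter


def hemimethylated_sites(duplex):
    """Return indices of CpG sites methylated on exactly one strand."""
    return [i for i, (top, bottom) in enumerate(duplex) if top != bottom]


def maintain(duplex):
    """Idealized maintenance: methylate the partner strand at hemimethylated sites.

    Fully unmethylated sites are left alone, so the parental pattern is copied.
    """
    return [(top or bottom, top or bottom) for top, bottom in duplex]


# Hypothetical parental pattern: sites 0 and 2 fully methylated, site 1 unmethylated.
parent = [(True, True), (False, False), (True, True)]
daughter_a, daughter_b = replicate(parent)
print(hemimethylated_sites(daughter_a))
print(maintain(daughter_a) == parent, maintain(daughter_b) == parent)
```

The sketch makes explicit why discrimination between hemimethylated and unmethylated sites matters: any activity at unmethylated sites would alter the pattern rather than preserve it.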
Finally, UHRF1 appears to interact with Dnmt3a and Dnmt3b (139), two de novo DNA methyltransferases. However, this might be less surprising in light of evidence that Dnmt3a and -3b might also contribute to the maintenance of DNA methylation (140). It is important to understand in the coming years how these multiple binding events are coordinated and whether they are cooperative. Given its interaction with such a wide variety of epigenetic regulators, including a histone acetyltransferase (141) and the H3K9 methyltransferase G9a (142), it makes sense that UHRF1 is a target of an apoptotic pathway (143) and is a target of growing interest for drug development (144).
MLL is a family of H3K4 methyltransferase genes (the name comes from myeloid/lymphoid or mixed lineage leukemia, as MLL translocations cause that disease). These enzymes have the opposite regulatory effect compared to that of LSD1 (by methylating rather than demethylating H3K4) and G9a (by methylating K4 rather than K9 of H3). Members of the MLL family directly or indirectly prevent DNA methylation or stabilize unmethylated DNA in that state. If this is due to direct interactions, then MLLs would be expected to interact with unmethylated CpGs. In fact, MLL proteins contain CXXC domains that selectively bind unmethylated CpGs (145-147). This interaction has now been confirmed by a solution structure of an MLL1-CXXC domain in a complex with unmethylated DNA, and the structure was tested by demonstrating the predicted effects of specific mutations (148).
In addition, another H3K4 methyltransferase, Set1, appears to interact with the DNA via an accessory protein, as was the case for Dnmt3a/Dnmt3L or Dnmt1/UHRF1. On its own, Set1 lacks the CXXC domain, but it might interact directly with an accessory protein that contains the same domain, CXXC finger protein 1 (Cfp1, formerly CGBP1). Murine embryonic stem cells deficient for Cfp1 exhibit a decreased level of global DNA methylation, along with elevated global levels of histone H3K4me3 (149). This suggests that Cfp1 restricts the distribution of Set1. Further, point mutations that specifically eliminate either DNA binding or association with Set1 also severely compromise the ability of the alleles to complement the phenotype of the Cfp1 deletion. Consistent with these observations, Cfp1 is associated with Set1 distribution that is limited to euchromatin, and this effect is also compromised by point mutations to Cfp1 affecting either DNA binding or Set1 association (150).
As illustrated in Figure 1, G9a/GLP interacts with at least Dnmt3a, UHRF1, Dnmt1, and H3, making it a central player in epigenetic regulation. While the implications are less clear, G9a also binds (151) the testis-specific Zn finger protein ZNF200 (152). G9a and GLP repress transcription by mono- and dimethylating histone H3 lysine 9 (H3K9me1 and -me2), and deletion of genes for either G9a or GLP results in embryonic lethality (51, 153). Loss-of-function deletion, nonsense, and frameshift mutations of GLP (also called EHMT1 for euchromatin histone methyltransferase 1) are causative factors for the 9q34 subtelomeric deletion syndrome, with severe mental retardation being the main symptom (154, 155).
How G9a contributes to DNA methylation is not clear, though G9a appears to interact with Dnmt1 during replication (156). In addition, the G9a ankyrin repeat domain has been suggested to interact with Dnmt3a (56), a possible way for G9a to induce de novo DNA methylation (54). Furthermore, G9a binds UHRF1 (142), while UHRF1 binds methylated H3K9, the methylation product of G9a. This suggests that G9a and the resulting H3K9 methylation might also help to target UHRF1 and Dnmt1, the pair of proteins primarily responsible for DNA maintenance methylation, to newly replicated DNA.
Combinatorial readout of multiple covalent chromatin modifications (including DNA methylation) is an explicit prediction of the “histone code hypothesis” (157-159). Several histone-methylating enzymes contain components (domains) for both synthesizing and binding to a specific histone mark, such as mammalian G9a/GLP (for H3K9me1 and -2) (58) and S. pombe Clr4 (for H3K9me3) (160). These proteins contain modules for making (SET domain) and recognizing (ankyrin repeats or chromodomain) a given methyl mark. This interdomain cross talk provides a possible mechanism for propagating a methyl mark. By analogy, PHF8 and KIAA1718 contain modules, within the same polypeptide, for both recognizing (PHD) and removing (Jumonji domain) two opposing methyl marks. This cross talk also provides a possible mechanism for removing an “OFF” (or repressive) methyl mark based on an existing “ON” (or active) methyl mark. Furthermore, the Dnmt3a-Dnmt3L complex contains reader domains for H3K4me0 and DNA methyltransferase activity; the Dnmt1-UHRF1 complex contains reader domains for H3K9me3 and DNA methyltransferase activity; and MLL (or the Set1-Cfp1 complex) contains reader domains for DNA CpG and a SET domain for making methylated H3K4. This cross talk further provides a mechanism for linking DNA and histone methylation, probably on the same nucleosome. In addition, we suggest that modification(s) of epigenetic modifiers themselves [Dnmt1 by Set7/9 and potentially LSD1, and the dynamic lysine methylation of many non-histone proteins (161)] is another component of epigenetic regulation and may serve as a checkpoint for correct assembly of the machinery required to accurately modify chromatin. Understanding the function and cross talk of individual states (one methyl mark, two methyl marks, etc.) should allow scientists eventually to uncover the complex language of the histone code (162, 163).
Several important questions remain unanswered as of this writing, in addition to simply confirming the accuracy and breadth of applicability of the interactions illustrated in Figure 1.
Is there a protein that specifically recognizes unmethylated H3K9, such as (for example) a DNA demethylase?
LSD1 demethylation of H3K4 is well-documented, while the evidence for its demethylation of H3K9 is not yet as strong; if LSD1 does demethylate both lysines, how is its activity controlled so that the pattern of reciprocal methylation is maintained?
Like the histone H3K4 methyltransferases of the MLL/Set1 family, the histone H3K36me3 Jumonji demethylase JHDM1 has a CXXC domain (164); it has not yet been shown to associate with DNA. If such an association is found, is there a correlation between DNA methylation and H3K36 methylation? Interestingly, similar CXXC domains have also been found in Dnmt1 (generating 5mC) (165), methyl-CpG-binding protein MBD1 (binding 5mC) (166), and Tet1 [a Jumonji-like 2-oxoglutarate- and Fe(II)-dependent enzyme that catalyzes conversion of 5mC to 5-hydroxymethylcytosine (167)].
Do methyl-CpG-binding proteins such as MBDs play a role in the linkage between histone modification and DNA methylation? It is noteworthy that MBD4 phosphorylation enhances DNA demethylation (168). In addition, SETDB1 and -2, two related H3K9 methyltransferases, contain a putative MBD domain.
Does the ratio of MLLs/Set1 to GLP/G9a (or SETDB1 and -2) vary, and if so, how does this affect overall DNA methylation?
While the field still faces a number of critical questions, it is clear that structural analyses will continue to play a central and synergistic role along with the biochemical and genetic studies in addressing them.
†The work in the authors’ laboratories is currently supported by the U.S. National Institutes of Health (GM068680-05, DK-082678-02, and GM049245-16 to X.C.) and the National Science Foundation (MCB-0964728, to R.M.B.).
Abbreviations: H3K4 and H3K9, histone H3 lysine 4 and lysine 9, respectively; GLP, G9a-like protein; ADD, ATRX-Dnmt3-Dnmt3L; PHF8, plant homeodomain finger protein 8; SINE and LINE, short and long interspersed repetitive nuclear elements, respectively; UHRF1, ubiquitin-like, containing PHD and RING finger domains 1; Np95, nuclear protein of 95 kDa; ICBP90, inverted CCAAT binding protein of 90 kDa; UBL, ubiquitin-like; PHD, plant homeodomain; SRA, SET- and RING-associated; RING, really interesting new gene; MLL, myeloid/lymphoid or mixed lineage leukemia; CFP1, CXXC finger protein 1; EHMT1, euchromatin histone methyltransferase 1.