Posttranslational modifications of histones are coupled in the regulation of the cellular processes involving chromatin such as transcription, replication, repair, and genome stability. Recent biochemical and genetic studies have clearly demonstrated that many aspects of chromatin, and not just posttranslational modifications of histones, provide surfaces that can interact with effectors and the modifying machineries in a context-dependent manner, all as a part of the “chromatin signaling pathway”. Here, we have reviewed recent findings on the molecular basis for the recruitment of the chromatin-modifying machineries and their diverse and varied biological outcomes.
Although every cell within our body bears the same genetic information and the same set of genes, only a small subset of genes is transcribed in a given cell at a given time. The molecular mechanism underlying this cell/stage-specific transcriptional control has been the subject of intense study for many years. The genetic information encoded in our DNA is packaged within nucleosomal arrays forming what is referred to as chromatin. A nucleosome contains ~146 bp of DNA wrapped twice around an octamer composed of two copies of histones H3 and H4, and two copies of histones H2A and H2B. Initial structural studies using electron microscopy demonstrated that nucleosomes are found in arrays forming a series of beads on a string, with the beads being the individual nucleosomes, and the string the linker DNA (Kornberg, 1974; Kornberg and Lorch, 1999). Histone proteins contain a flexible amino-terminal tail and a characteristic histone fold, a globular domain that mediates substantial interactions between histones to form the nucleosome. High resolution X-ray crystallography demonstrated that histone N-terminal tails protrude outward from the nucleosome (Luger et al., 1997), and biochemical studies have confirmed that such histone tails are extensively, posttranslationally modified (Bhaumik et al., 2007).
Reported posttranslational modifications of histones so far include acetylation, phosphorylation, methylation, monoubiquitination, sumoylation and ADP ribosylation (Berger, 2007; Bernstein et al., 2007; Campos and Reinberg, 2009; Kouzarides, 2007; Shilatifard, 2006; Weake and Workman, 2008; Workman and Kingston, 1998). For almost every modification identified, there are also machineries involved in their removal (Bhaumik et al., 2007). Histone modifications can have a variety of functions; they can change the charge of a residue to disrupt protein-DNA, protein-protein, and nucleosome-nucleosome interactions; and they can form binding surfaces for a variety of proteins (Campos and Reinberg, 2009; Kouzarides, 2007). Histone modifications that form binding surfaces for other proteins have been implicated in several epigenetic processes. For example, HP1 can bind to the histone H3 tail methylated on lysine 9 as part of the process of stable maintenance of heterochromatic silencing. Polycomb (Pc) and Eed proteins can bind to histone H3K27-methylated tails at different stages of the cell cycle to help maintain the silencing of developmental genes during differentiation (Hansen et al., 2008; Kouzarides, 2007; Margueron et al., 2009). However, histone modifications are not always epigenetic. The addition and removal of histone acetylation and phosphorylation can be transient events that are associated with the initiation and repression of genes. For example, histone H2B monoubiquitination is added, then quickly removed during the process of gene activation (Henry et al., 2003). Histone H3K4 methylation is associated with counteracting Pc silencing of developmental genes when implemented by Trithorax, but is also found on many housekeeping genes where its function is unknown. Thus, the same modification can have both epigenetic and non-epigenetic functions.
Recent studies have demonstrated that the posttranslational modifications of histones do not represent a code and are no different from the posttranslational modifications associated with any other proteins in the cell (Lee et al., 2010; Schreiber and Bernstein, 2002; Sims and Reinberg, 2008). The posttranslational modifications of histones are part of signaling pathways, and their readout is context-dependent, with the biological outcomes dictated by many variables. In this review, we analyze recent reports on the different molecular mechanisms of recruitment and biological outcomes for histone-modifying machineries. These studies demonstrate that a combination of many factors, including DNA elements, protein-protein interactions, stage and origin of cells, and posttranslational modifications on transcription factors and histones, regulates the diverse biological outcomes associated with the “chromatin signaling pathway.”
Although it was known for decades that histone acetylation was associated with actively transcribed genes, it was not known if histone acetylation was a consequence of transcription or was involved in the activation process. Purification of a histone acetyltransferase from the transcriptionally active macronucleus of the ciliated protozoan Tetrahymena thermophila identified a protein homologous to yeast Gcn5, previously described as a coactivator of transcription (Berger et al., 1992; Brownell et al., 1996; Marcus et al., 1994). Coactivators were postulated to be factors required for bridging sequence-specific binding proteins and the basal transcriptional machinery (Berger et al., 1990; Pugh and Tjian, 1990). Subsequently, other previously known coactivators were shown to be histone acetyltransferases or to participate in complexes with histone acetyltransferases, including CBP/p300, SRC-1 and ACTR (Bannister and Kouzarides, 1996; Chen et al., 1997; Ogryzko et al., 1996; Spencer et al., 1997). Both Gcn5, as part of the SAGA histone H3 acetyltransferase complex, and Esa1, as part of the NuA4 histone H4 acetyltransferase complex, were shown to be recruited by certain acidic activators (Utley et al., 1998). Importantly, another Gcn5-containing complex, the ADA complex, did not interact with these activators, demonstrating specificity in the recruitment of different histone-modifying activities by distinct activators.
Concurrent with the discovery of a histone acetyltransferase as a transcriptional activator was the discovery of a known transcriptional repressor, Rpd3, as a histone deacetylase (Taunton et al., 1996). Rpd3 and related enzymes are found to act as corepressors with sequence-specific transcriptional binding factors such as nuclear hormone receptors that can recruit SMRT/NCOR histone deacetylase complexes (Alland et al., 1997; Hassig et al., 1997; Heinzel et al., 1997; Kadosh and Struhl, 1997; Nagy et al., 1997). The same repressor can recruit different deacetylase complexes. For example, Ikaros is a critical regulator of hematopoiesis that can interact with both the SIN3A and Mi-2-NuRD deacetylase complexes to repress lineage-specific genes (Kim et al., 1999; Koipally et al., 1999). Together, these studies not only demonstrated a functional role for histone-modifying activities as key regulators of transcription, but also provided the first paradigm of how histone-modifying activities could be recruited to chromatin (Figure 1).
While histone acetylation and other histone modifications can function at the promoter during transcriptional activation, histone modifications can also have roles in the body of genes. The trimethylation of histone H3 on lysine 36 is mediated by the Set2 enzyme in yeast and animals, and this modification peaks in the middle and the 3′ end of genes associated with the elongating RNA polymerase II (Krogan et al., 2003b; Shilatifard, 2006). Purification of the Set2 enzyme from yeast identified the large subunit of Pol II, Rpb1, as an interactor of Set2 (Krogan et al., 2003b; Xiao et al., 2003). This interaction depends on the phosphorylation of Rpb1’s C-terminal repeat domain on Serine 2, a marker of elongating Pol II. Thus, Set2 is recruited to gene bodies through an interaction with elongating Pol II (Figure 2).
In contrast to the direct physical interaction between Set2 and RNA Pol II, other histone-modifying activities can be recruited subsequent to transcription initiation through interaction with the Polymerase-associated factor (Paf1) complex (Krogan et al., 2003a; Ng et al., 2003a; Wood et al., 2003). The Paf1 complex is associated with elongating Pol II. It was found in a global proteomic analysis in S. cerevisiae (GPS) to be required for proper H3K4 methylation (Krogan et al., 2003a; Wood et al., 2003). Paf1 directly interacts with COMPASS (Complex of Proteins Associated with Set1), the sole H3K4 methyltransferase in yeast, and Paf1 is required for the recruitment of COMPASS to chromatin (Wood et al., 2003). These early studies in yeast set the paradigm that the Paf1 complex plays a role as a “platform” for the recruitment of histone-modifying machinery to the elongating Pol II (Gerber and Shilatifard, 2003) (Figure 2). We now know that the Drosophila and mammalian Paf1 complexes also function as a platform for the recruitment of Set1/TRX/MLL-containing complexes (Tenney et al., 2006; Wang et al., 2008).
In addition to the recruitment of histone-modifying activities to chromatin, complexes can be recruited to chromatin but remain in a relatively inactive state. The Paf1 complex appears to regulate several activities in this manner. Histone H2B can be monoubiquitinated by the Rad6/Bre1 E2/E3 ubiquitin ligase complex. This monoubiquitination is needed for the higher methylation states of H3K4 and H3K79, and Paf1 is required for H2B monoubiquitination (Krogan et al., 2003a; Wood et al., 2003). However, both Rad6/Bre1 and Dot1 appear to be recruited independently of Paf1 (Wood et al., 2003). Paf1 stimulates the enzymatic activity of Rad6/Bre1, perhaps after recruitment of these complexes to chromatin. The subsequent monoubiquitination of H2B stimulates the activity of Set1/COMPASS and Dot1 to generate the trimethylation of H3K4 and H3K79 (Dover et al., 2002; Krogan et al., 2003a; Wood et al., 2003). Several recent mammalian studies have also confirmed the generality of these early observations made in yeast regarding the role of the Paf1 complex in the regulation of histone H2B monoubiquitination by Rad6/Bre1, H3K4 methylation by Set1/COMPASS, and H3K79 methylation by Dot1 (Kim et al., 2009; McGinty et al., 2008; Pavri et al., 2006; Zhu et al., 2005).
Many of the posttranslational modifications of histones can enhance chromatin binding by other proteins through a variety of protein domains: bromodomains, found in several transcriptional activators can preferentially bind peptides with acetylated lysines; 14-3-3 and forkhead-domains can bind phosphorylated serines and threonines; chromodomains, MBT repeats, and PHD fingers can discriminate among lysines that are mono-, di- or trimethylated; and tudor domains can recognize methylated arginine or lysine residues (Maurer-Stroh et al., 2003; Taverna et al., 2007; Yaffe and Elia, 2001). The importance of the interaction between histone modifications and proteins containing these modules is exemplified by the occurrence of mutations within these domains in human disease (Matthews et al., 2007; Pena et al., 2006). Modifications can help form a landing platform for proteins and their complexes to aid in recruitment to chromatin through recognition of the modified residue. One example of this is the recruitment of deacetylase complexes to the body of transcriptionally active genes that require the chromodomain-containing protein Eaf3. Eaf3 preferentially binds to H3K36 di- and trimethylated states. While Eaf3 is a component of both histone acetyltransferase and deacetylase complexes, loss of Eaf3 leads to an increase in acetylation in the body of genes, which is proposed to have the effect of opening up the chromatin structure to allow “cryptic” transcription (Carrozza et al., 2005; Keogh et al., 2005) (Figure 3).
An important issue is the relative contribution of histone modifications in recruitment of other factors to chromatin. Since Eaf3 is also a component of the NuA4 histone acetyltransferase complex, one might expect both of the Eaf3 complexes to compete for binding to H3K36-methylated nucleosomes. However, Rco1, a component of Rpd3S, but not of NuA4, interacts with nucleosomes (with or without histone modifications) and is required for the interaction of Rpd3S with nucleosomes methylated at H3K36, indicating that both histone modification-dependent and independent mechanisms are important for targeting Rpd3S to the coding regions of transcribed genes (Li et al., 2007). Recently, Rpd3S has been demonstrated to be recruited to genes through interactions between the Rco1 subunit and the serine 5 phosphorylated form of the CTD of RNA Pol II. This finding suggests that the interaction with H3K36-methylated nucleosomes by Eaf3/Rco1 is subsequent to the initial recruitment of the complex by the CTD of RNA Pol II (Govind et al., 2010). Supporting these observations, genome-wide profiling of Rpd3S in a Set2 deletion by Robert and colleagues demonstrated that H3K36 methylation had a modest effect on the recruitment of Rpd3S to transcribed regions (F. Robert, unpublished data).
Animals have at least two homologs of Eaf3: its ortholog MRG15 and a paralog, MSL3. MRG15 participates in the NuA4-like Tip60 complex as well as in histone deacetylase complexes (Kusch et al., 2004; Lee et al., 2009; Moshkin et al., 2009; Spain et al., 2010). MSL3 is part of an H4K16-specific histone acetyltransferase complex. MSL3 can interact with H3K36-methylated nucleosomes, and mediates acetylation in the ORFs of transcribed genes (Sural et al., 2008). Although a balance of histone acetylation/deacetylation in ORFs is probably important for such processes as enhancing transcription elongation and maintaining transcription fidelity, the precise targeting of the MRG15 and MSL3 complexes to promoters or gene bodies likely involves factors other than just H3K36me3 binding. Rpd3S has not been extensively characterized in metazoans, although complexes with MRG15 and histone deacetylases have been reported (Lee et al., 2009; Tominaga et al., 2003; Yochum and Ayer, 2002). Importantly, the metazoan homologs of Rco1, PHF12 in mammals and CG3815 in Drosophila, are found in Rpd3S-like complexes, although the relative contribution of this subunit to the recruitment of Rpd3S by direct interaction with RNA Pol II, or indirect interaction with the H3K36 methylation deposited by Set2, remains to be determined.
In the coactivator model, the DNA bound transcription factors recruit histone-modifying complexes. However, the most direct way to recruit a histone-modifying activity to particular genes is through recognition of specific DNA sequences by the histone-modifying complexes. This process is perhaps at the heart of, and the basis for, all recruitment to chromatin and can be considered the main step in the epigenetic regulation as well (Berger et al., 2009). Here, we review the role of DNA elements that form target sites for histone-modifying complexes that extend sequence-dependent recruitment mechanisms beyond the simple coactivator/corepressor model.
The identification of the Polycomb Group of genes (PcG) (Ringrose and Paro, 2004) was made by the discovery of mutations in several genes in Drosophila that led to the ectopic expression of the Hox genes and a transformation of segmental identity. The PcG genes are required for the repression of Hox genes in regions where these genes have not been activated. The repression can be stably maintained for several cell generations after the transcription factors that initially determined cell identity are no longer present, constituting a form of epigenetic memory (Berger et al., 2009). Cloned regions near the Hox genes were shown to confer proper repression of reporter genes in a PcG-dependent manner, thus being known as Polycomb responsive elements (PRE). PREs are like enhancers in that they can function from large distances from genes and are sometimes found in introns of the genes they regulate. However, unlike canonical enhancer-binding proteins, PcG proteins are expressed ubiquitously and are downstream of the decision of whether a target gene is to be activated or repressed (Ringrose and Paro, 2004).
PcG proteins are found in at least three distinct complexes (Schuettengruber et al., 2007; Schwartz and Pirrotta, 2007; Simon and Kingston, 2009; Wang et al., 2004). Polycomb repressive complex 2 (PRC2) contains Suppressor of zeste 12 (Su(z)12) and extra sex combs (ESC), as well as the histone methyltransferase Enhancer of zeste (E(z), or EZH1 and EZH2 in mammals) that implements H3K27 methylation. H3K27 methylation has been shown to form a binding site for the Polycomb protein (Pc), a component of the Polycomb repressive complex 1 (PRC1). PRC1 also includes the E3 ubiquitin ligase RING, Posterior Sex Combs (PSC, or Bmi-1 in mammals) and Polyhomeotic (PH). A third complex, the PHO-repressive complex, consists of Pleiohomeotic (PHO, similar to YY1 in mammals) and Sfmbt (MBTD1 in mammals), a protein that can recognize H4K20 mono- and dimethylated states. Of the known PcG proteins, PHO, which bears several zinc fingers, is clearly a sequence-specific DNA-binding protein and is required for recruitment of PRC2 to chromatin in Drosophila. However, PHO binding is not sufficient for recruitment, and other factors have been implicated in PRC2 localization in Drosophila. While statistically significant enrichment of sequences such as the PHO/YY1 consensus can be found in several PREs, a sequence-based definition of PREs has remained elusive (Ringrose et al., 2003).
One outstanding question is whether PREs exist in mammals. Although PRC2 is conserved from flies to mammals, YY1 appears to be the only mammalian homolog of the Drosophila transcription factors implicated in sequence-specific binding to elements commonly found within Drosophila PREs. However, Jarid2, a protein with very weak DNA-binding activity was recently identified as a component of PRC2 that is required for recruitment to chromatin. The mechanism by which Jarid2 recruits PRC2 is not known (Landeira et al., 2010; Li et al., 2010; Pasini et al., 2010; Peng et al., 2009) [for review please see (Herz and Shilatifard, 2010)]. Unlike other PRC2 components, Jarid2 shows obvious tissue-specific differences in expression, being particularly highly expressed in embryonic stem cells. Additionally, Jarid2 seems to have different effects on different genes, perhaps reflecting the diversity in the PRC2 complexes (Herz and Shilatifard, 2010). A full understanding of the role of Jarid2 in the regulation of PRC2 activity and gene expression will require further detailed and comprehensive biochemical and in vivo studies using organisms amenable to genetic manipulations.
Recently, Kingston and colleagues identified a 1.8 kb region that conferred PcG responsiveness at the HoxD locus on a reporter gene (Woo et al., 2010). This region contained YY1 binding sites (Wang et al., 2004), as well as an evolutionarily conserved sequence within the HoxD locus (Beckers and Duboule, 1998). In addition to these two features, this putative mammalian PRE contains a CpG island (Illingworth and Bird, 2009). In mammals, PRC2-binding is highly correlated with the presence of CpG islands, and CpG islands are largely predictive of the presence of PRC2, suggesting that CpG islands behave as PREs in mammalian cells (Ku et al., 2008). CpG islands are regions enriched for CpG dinucleotides that are associated with promoters of ~ 60–70 % of genes (Illingworth and Bird, 2009). Most CpG dinucleotides in the mammalian genome are cytosine-methylated, but when they are sufficiently clustered together in islands, they are typically unmethylated, except during transcriptional silencing associated with imprinting, X inactivation, or silencing of tumor suppressor genes during oncogenesis. Genes that are silenced during development are not associated with the CpG methylation of their promoters, even if associated with CpG islands (Baylin and Bestor, 2002; Illingworth and Bird, 2009). Therefore, it is likely that PcG function, without DNA methylation, is the major repressor of many developmentally regulated genes, perhaps through recognition of CpG islands; however PcG proteins can also synergize with DNA methylation machineries during silencing associated with imprinting and X inactivation (Illingworth and Bird, 2009). How is PRC2 targeted to CpG islands? CpG-rich sequences can be recognized by proteins bearing a CXXC domain, a type of zinc finger that can recognize CpG sequences. However, CXXC domains are not found within PRC2 components, suggesting a novel domain recognizes these sequences, or that other motifs found within these islands help recruit PRC2.
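To make the notion of CpG enrichment concrete, the following is a minimal Python sketch of the observed/expected CpG ratio that is commonly used when annotating CpG islands. It is an illustration only and not the procedure used in the studies cited above. The function names and the default thresholds (length of at least 200 bp, G+C fraction of at least 0.5, observed/expected ratio of at least 0.6) are assumptions that follow one widely used convention; other annotation schemes apply stricter cutoffs.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, G+C fraction and observed/expected CpG ratio of a DNA sequence.

    Expected CpG count is (number of C * number of G) / length, so the ratio
    measures CpG enrichment relative to base composition.
    """
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (c + g) / n if n else 0.0
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return {"length": n, "gc_fraction": gc_fraction, "obs_exp": obs_exp}


def looks_like_cpg_island(seq: str,
                          min_len: int = 200,        # assumed cutoff (illustrative)
                          min_gc: float = 0.5,       # assumed cutoff (illustrative)
                          min_obs_exp: float = 0.6   # assumed cutoff (illustrative)
                          ) -> bool:
    """Apply simple composition-based criteria to a candidate region."""
    stats = cpg_stats(seq)
    return (stats["length"] >= min_len
            and stats["gc_fraction"] >= min_gc
            and stats["obs_exp"] >= min_obs_exp)
```

Because the outcome depends on the chosen cutoffs, regions with high CpG content can fall outside a given annotation, so changing the statistical criteria changes which regions count as islands.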
While Hox genes can be silenced with PcG complexes, they can be activated with the help of the Trithorax group of proteins (TrxG) (Eissenberg and Shilatifard, 2009). Trithorax, like its mammalian counterpart MLL, is an H3K4 methyltransferase. H3K4 and H3K27-methylated nucleosomes are usually found to occur in a mutually exclusive pattern throughout the genome. This can perhaps be explained by the competition for the same recruitment sites by the enzyme complexes that implement these modifications. Experimental evidence indicates that PREs and Trithorax recruitment elements (TREs) substantially overlap (Ringrose and Paro, 2007). DNA-binding factors that are important for recognizing TREs include the GAGA factor (also known as Trl), which recognizes GA-rich sequences (Farkas et al., 1994), and Pho and Zeste, although other combinations of factors are likely to be important for recruitment (Schwartz et al., 2006). In mammals, the TRX homologs MLL1 and MLL2 have CXXC domains within their polypeptides, suggesting that the CpG character within a PRE may constitute a distinct sequence that helps recruit the competing TrxG activity to the PREs/TREs. Interestingly, recent work from several laboratories has shown that CXXC domains are involved in targeting distinct types of chromatin-modifying activities to chromatin, confirming the importance of CpG islands in recruiting histone-modifying activities while also raising the question of which other features determine the specificity demonstrated for these diverse activities.
The CXXC domain binding of CpG was originally discovered in MBD1, a protein isolated for its homology to the methylated CpG DNA-binding protein, MeCP2, but also having three zinc fingers of the CXXC type (Cross et al., 1997). Subsequently, it was determined that while two of the CXXC domains within MBD1 bind methylated DNA, many CXXC domains are specific for unmethylated CpG sequences (Birke et al., 2002; Lee et al., 2001), including a third CXXC domain within MBD1 (Jorgensen et al., 2004) and the DNA methyltransferase DNMT1 (Pradhan et al., 2008). Thus, multiple chromatin modifiers could potentially be recruited by CpG-rich sequences. Two recent studies have tested this idea and found a critical role for CXXC domains in recruiting histone-modifying activities to chromatin (Blackledge et al., 2010; Thomson et al., 2010).
In one test of the role of the CXXC domains in targeting CpG islands, Bird and colleagues studied the recruitment of CXXC1 (Cfp1), a component of mammalian Set1/COMPASS, the major H3K4 trimethylase in mammals (Lee and Skalnik, 2005; Miller et al., 2001; Thomson et al., 2010; Wu et al., 2008). Bird and colleagues performed genome-wide profiling of CXXC1 in mouse brain. They found CXXC1 to be localized at 80% of the CpG islands, 90% of which were enriched for H3K4me3. Half of the CXXC1-negative CpG islands were previously reported to be sites of Polycomb and H3K27me3 occupancy. At Xist, a gene transcribed on only one of the two X chromosomes in females, CXXC1 was found to associate exclusively with the transcribed, unmethylated CpG copy. Together, these findings suggest that CpG islands could help recruit mammalian Set1/COMPASS through CXXC1’s affinity for unmethylated CpG sequences.
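The proportions reported for CXXC1 occupancy can be combined with simple arithmetic to see how the CpG islands partition. The short Python sketch below is illustrative only: the function and key names are hypothetical, and it assumes that the 90% figure refers to the CXXC1-bound islands rather than to all CpG islands.

```python
from fractions import Fraction


def partition_cpg_islands(bound=Fraction(80, 100),
                          k4me3_given_bound=Fraction(90, 100),
                          pcg_given_unbound=Fraction(1, 2)):
    """Express each reported subset as a fraction of all CpG islands.

    Assumption: k4me3_given_bound is conditional on CXXC1 binding.
    """
    unbound = 1 - bound
    return {
        "cxxc1_bound": bound,
        "cxxc1_bound_and_h3k4me3": bound * k4me3_given_bound,
        "cxxc1_negative": unbound,
        "cxxc1_negative_and_polycomb": unbound * pcg_given_unbound,
    }


if __name__ == "__main__":
    for name, value in partition_cpg_islands().items():
        print(f"{name}: {float(value):.0%}")
```

Under this reading, roughly 72% of all CpG islands carry both CXXC1 and H3K4me3, and about 10% are CXXC1-negative sites of Polycomb and H3K27me3 occupancy.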
To experimentally test the role of CpG islands in recruiting CXXC1, Bird and colleagues created two ES cell lines carrying a promoterless construct with eGFP and puromycin sequences that contain CpG dinucleotide densities similar to those found in CpG islands. In one cell line, the artificial CpG island was targeted to the 3′ end of Nanog, while in the other, the construct was targeted to the 3′ end of Mecp2, an X-linked gene. When located adjacent to Nanog, the cassette remained free of CpG methylation, despite lacking detectable Pol II. H3K4me3 and CXXC1 occupancy tracked CpG density within the cassette and around the insertion site. Interestingly, the cassette inserted next to the Mecp2 gene showed two-thirds CpG methylation, but still was found to be bound by CXXC1. Bisulfite sequencing of the CXXC1 and H3K4me3 ChIP DNA demonstrated that only a third of the immunoprecipitated cassette was CpG-methylated, strongly suggesting that unmethylated CpG sequences are recruitment sites for mammalian Set1/COMPASS. However, point mutations within the CXXC domain of CXXC1 would be required to demonstrate the direct role of this protein in binding CpG and the ensuing regulation of H3K4 methylation by mammalian Set1/COMPASS.
While Set1/COMPASS is the major H3K4 trimethylase in mammalian cells (Wu et al., 2008), in other studies it has been demonstrated that the loss of CXXC1 leads to an increase in global H3K4 trimethylation levels in ES cells (Tate et al., 2009). Based on this observation, Skalnik and colleagues have suggested that CXXC1 functions by restricting Set1/COMPASS methyltransferase activity. In contrast, Bird and colleagues find that H3K4me3 levels are reduced at the CpG islands in the absence of CXXC1. It will be interesting to learn where in the genome H3K4me3 is increasing upon loss of CXXC1.
An important finding by Bird and colleagues is that H3K4 methylation implemented by mammalian Set1/COMPASS is independent of Pol II and transcription, while findings in yeast have shown that transcription is required for proper H3K4 trimethylation (Krogan et al., 2003a; Ng et al., 2003b; Shilatifard, 2006). One possible explanation for this apparent difference is that the interaction of Pol II with CpG islands might be transient and not as easily detectable as the product of the process, histone H3K4 trimethylation. Therefore, sensitive RNA-seq methods, such as global run-on sequencing (GRO-seq; Core et al., 2008), could reveal transcription within these CpG islands, thus explaining the association of CXXC1 and mammalian COMPASS at these CpG islands.
Interestingly, the yeast homolog of CXXC1, Cps40, has a PHD finger in common, but lacks a CXXC domain. However, the Drosophila homolog of CXXC1, CG17446, contains both the PHD and CXXC domains. CpG islands have not been studied in Drosophila, although their existence has been predicted (Takai and Jones, 2002). It is notable that Drosophila Trithorax, unlike its mammalian counterpart MLL, lacks a CXXC domain, further demonstrating that although these H3K4 methyltransferase complexes are largely conserved in composition and function, some aspects of their recruitment can differ, perhaps reflecting differences in the size or complexity of the genome in which they are found.
Another class of histone-modifying enzymes bearing a CXXC domain is KDM2A/B. KDM2A and KDM2B are histone demethylases that preferentially use H3K36me2 as a substrate (Tsukada et al., 2006). H3K36me2 is a modification that has previously been linked to gene silencing (Bender et al., 2006) and can recruit histone deacetylases (Li et al., 2009; Youdell et al., 2008), suggesting that removal of H3K36me2 could facilitate the formation of open chromatin. Klose and colleagues tested the role of the CXXC domain in targeting KDM2A. First, they tested the DNA-binding specificity of KDM2A’s CXXC domain and found that it preferentially binds to unmethylated CpG sequences (Blackledge et al., 2010). Genome-wide profiling demonstrated that KDM2A was highly enriched at annotated CpG islands. Major peaks of KDM2A binding not corresponding to CpG islands were shown to have a high CpG content with little DNA methylation as assessed by bisulfite sequencing. Since most CpGs outside of the CpG islands are methylated, Klose and colleagues likely found novel CpG islands that were previously unnoticed due to the statistical criteria used in the annotation process. Sites of strong KDM2A binding were also found to be depleted for H3K36me2, suggesting that KDM2A localization to CpG islands results in active demethylation of H3K36me2. Indeed, knockdown of KDM2A results in increased H3K36me2 at some CpG islands; however, very little alteration in transcription was reported when KDM2A levels were reduced by RNAi, suggesting that H3K36 dimethylation at CpG islands at promoters does not have a major transcriptional regulatory role. KDM2A and the highly related KDM2B have been shown to be highly concentrated in nucleoli of cells, where they can repress ribosomal RNA transcription (Frescas et al., 2007; Tanaka et al., 2010). Interestingly, rDNA accounts for 20% of the unmethylated CpG sequences in the mouse genome (Bird et al., 1985), suggesting that KDM2A and KDM2B are targeted to rDNA in part through CpG recognition to regulate transcription by RNA Pol I.
The studies by the Bird and Klose groups were both in agreement that recruitment to unmethylated CpG islands via the CXXC domain was uncorrelated with transcriptional activity, suggesting that unmethylated CpG content is sufficient for recruitment. Since MBD1 and the DNA methyltransferase DNMT1 also have CXXC fingers that recognize unmethylated CpG, it will be important to ask how various CXXC-associated activities compete or coexist with each other on the same site and how this is regulated during development. It would not be surprising if future studies find that the interactions of the CXXC finger-containing proteins with their target sites are context-dependent and require other cellular signals for proper function. The studies by the Bird and Klose groups should stimulate investigations into the role of CpG islands in transcription and the function of histone-modifying activities in this process.
One of the best studied histone modifications is the methylation of H3 on lysine 4 (H3K4). The first methyltransferase complex for H3K4 to be identified was COMPASS (Complex of Proteins Associated with Set1) in yeast (Krogan et al., 2002; Miller et al., 2001; Roguev et al., 2001; Shilatifard, 2006). The original interest in purifying COMPASS was based on the similarity between yeast Set1 and the human MLL protein involved in leukemia, and indeed, human MLL forms a COMPASS-like complex with methyltransferase activity for H3K4 (Eissenberg and Shilatifard, 2009; Hughes et al., 2004; Miller et al., 2001; Shilatifard, 2006). While yeast has just one methyltransferase for H3K4, humans have at least six methyltransferases that can methylate H3K4, including MLL1-4 and Set1A/B which are found in COMPASS-like complexes (Cho et al., 2007b; Eissenberg and Shilatifard, 2010; Hughes et al., 2004; Lee and Skalnik, 2008; Lee et al., 2007a; Shilatifard, 2008; Wu et al., 2008). COMPASS-like complexes are defined as having a Set1 or MLL-related protein, and the core subunits Cps60/Ash2, Cps30/Wdr5, and Cps50/Rbbp5.
In addition to the common subunits, the Set1 and MLL complexes each have unique subunits. MLL1 and MLL2, which are related to Drosophila Trithorax (TRX), interact with the tumor suppressor Menin, which helps target MLL1/2 to Hox and other targets for transcription activation (Hughes et al., 2004; Wang et al., 2009). MLL3 and MLL4, which show high similarity to Drosophila trithorax-related (TRR), contain NCOA6, PTIP and PA-1 which may help target MLL3/4 to hormone-responsive genes, together with UTX, an H3K27 histone demethylase (Cho et al., 2007b; Issaeva et al., 2007; Patel et al., 2007). Two subunits found in the SETD1A and SETD1B complexes, but not in the MLL1-4 complexes, are CXXC1 and WDR82 (Lee and Skalnik, 2008; Lee et al., 2007a; Wu et al., 2008). These subunits are homologous to the COMPASS subunits Cps40 and Cps35 (Miller et al., 2001). Cps35/WDR82 has a unique role in mediating the crosstalk between histone H2B ubiquitination and H3K4 methylation (Lee et al., 2007b; Wu et al., 2008; Zheng et al., 2010).
The COMPASS and MLL1-4 complexes can be recruited to chromatin by different mechanisms, with varying functional consequences associated with H3K4 methylation. COMPASS can be recruited to actively transcribed genes through its interaction with the Paf1 complex and RNA Pol II, which is sufficient for H3K4 monomethylation (Gerber and Shilatifard, 2003; Krogan et al., 2003a; Wood et al., 2003). The Paf1 complex may also be involved in recruiting Rad6/Bre1 through a direct Paf1/Bre1 interaction, leading to ubiquitinated H2B within this region (Wood et al., 2003). Cps35/WDR82 interacts with COMPASS in a histone H2B monoubiquitination-dependent manner, and this interaction on chromatin converts COMPASS to a trimethylation-competent complex (Lee et al., 2007b; Zheng et al., 2010). Deletion of COMPASS subunits or mutation of H3K4 to alanine has little effect on transcription levels in yeast (Miller et al., 2001), consistent with the recruitment and stimulation of COMPASS activity following gene activation (Krogan et al., 2003a; Ng et al., 2003b).
In contrast to COMPASS, the MLL complexes can function as transcriptional activators. Loss of MLL leads to loss of both H3K4 methylation and transcriptional activation at Hox and other loci (Hughes et al., 2004; Wang et al., 2009). MLL1-4 complexes appear to be co-activators of nuclear receptors (Cho et al., 2007a; Dreijerink et al., 2009; Eissenberg and Shilatifard, 2010; Goo et al., 2003). H3K4 methylation by the MLL complexes could activate genes through recruitment of H3K4me3-binding proteins. For example, the basal transcription factor TFIID can be directly recruited by H3K4me3 via the PHD finger of its Taf3 subunit (Vermeulen et al., 2007). Alternatively, H3K4me3 at the promoter could help recruit histone acetyltransferase complexes or nucleosome remodeling complexes, which harbor subunits with PHD fingers capable of recognizing the H3K4me3 state (Kouzarides, 2007; Ruthenburg et al., 2007). Thus, the timing and mechanism of recruitment to a gene can influence the biological readout of H3K4 methylation (Wang et al., 2009).
In addition to protein and DNA-based recruitment of histone modifiers, noncoding RNAs have recently been shown to be major regulators of chromatin and transcription. While RNAs are well-known integral components of ribosomes and spliceosomes for the translation and splicing of mRNAs, the role of RNA molecules in transcriptional regulatory complexes has only recently gained widespread notice. One of the first noncoding RNAs associated with transcriptional regulation was the Xist RNA, whose transcription is required for somatic silencing of an X chromosome in mammalian females, part of a process of dosage compensation to equalize expression from X-linked genes between XY males and XX females (Brockdorff et al., 1992; Brown et al., 1991). Inactivation of the X chromosome is associated with accumulation of repressive histone marks and the compaction of the X chromosome into the Barr body (Chow and Heard, 2009; Senner and Brockdorff, 2009). Silencing initiates and spreads in cis from the site of Xist transcription to encompass most of the X chromosome. The ability of certain forms of Xist RNA to interact with PRC2 complexes provides a molecular explanation for the requirement of Xist for H3K27 methylation and gene silencing (Zhao et al., 2008).
Equalization of transcription between males and females of Drosophila also involves noncoding RNAs. roX1 and roX2 are transcribed from the X chromosome in males and are required for the twofold increased expression of the male X chromosome, which equalizes transcription with the two X chromosomes of females. Unlike Xist, roX RNAs can act in trans (Kelley et al., 1999). They form a ribonucleoprotein complex with the MSL proteins that includes the histone acetyltransferase MOF. MOF mediates H4K16 acetylation in the transcribed region of genes, which is proposed to promote decompaction of the X chromosome and thereby facilitate higher rates of transcription elongation by Pol II (Smith et al., 2001). The targeting of the roX-MSL complex to its X-linked targets comprises two steps: 1) recognition of high affinity sites for the MSL1-MSL2 subunits, and 2) spreading to nearby actively transcribed genes (Alekseyenko et al., 2008). The nature of the high affinity sites is unknown; they may consist of low affinity sites within a specific context (Alekseyenko et al., 2008; Fauth et al., 2010; Straub et al., 2008). Spreading from high affinity sites to nearby genes could be facilitated by the MLE subunit, an RNA helicase whose closest mammalian homolog has been shown to interact with RNA Pol II (Nakajima et al., 1997). MLE’s presence in the roX-MSL complex is dependent on the integrity of the RNA component (Ilik and Akhtar, 2009; Smith et al., 2000). Thus, roX RNAs could function in MSL spreading by interacting with the RNA helicase MLE, which in turn associates with RNA Pol II at active genes, allowing the roX-MSL complex to spread from one active gene to another. The enrichment of the MSL complexes in the middle and at the 3′ end of transcribed genes could be facilitated by interactions between the chromodomain of the MSL3 subunit and H3K36me3 implemented by Pol II-associated Set2 (Sural et al., 2008).
Placing high affinity binding sites for MSL1-MSL2 on autosomes allows a similar spreading to nearby autosomal genes, which becomes even more pronounced if particular spliced forms of roX RNAs are ectopically expressed (Park et al., 2005).
While the roX RNAs are associated with the upregulation of transcription, it is more common for noncoding RNAs to recruit repressive histone-modifying activities similarly to Xist. For example, Air and Kcnq1ot1 are noncoding RNAs expressed from the paternal chromosome that are required for silencing of neighboring genes in cis (Mancini-Dinardo et al., 2006; Sleutels et al., 2002). Both have been shown to associate with the H3K9 methyltransferase, G9a, while Kcnq1ot1 is also associated with PRC2 (Nagano et al., 2008; Pandey et al., 2008). Although Xist, Air, and Kcnq1ot1 all work to silence genes in cis, it is becoming apparent that noncoding RNAs can silence genes on other chromosomes. HOTAIR, a noncoding RNA transcribed from the HOXC cluster, recruits PRC2 to silence genes in the HOXD cluster (Rinn et al., 2007). Separate regions of HOTAIR interact with the PRC2 and LSD1/CoREST complexes (Tsai et al., 2010). LSD1 is an H3K4 demethylase that can demethylate H3K4me2, but not H3K4me3, and associates with the CoREST histone deacetylase complex (Lee et al., 2005; Shi et al., 2004). In addition to a role in developmental regulation, HOTAIR is also implicated in promoting metastasis. HOTAIR is frequently upregulated in metastatic tumors and its high expression is associated with a poor prognosis (Gupta et al., 2010). Importantly, PRC2 components are required for the matrix invasiveness of cells ectopically expressing HOTAIR, suggesting that HOTAIR is mistargeting PRC2 components when over-expressed (Gupta et al., 2010).
The potential regulatory scope of noncoding RNAs is just being realized. Genome-wide analyses have identified over 1000 “large intergenic noncoding RNAs” (lincRNAs) based on signatures of H3K4me3 at transcription start sites and H3K36me3 in transcribed regions (Guttman et al., 2009). RNA immunoprecipitation with PRC2 antibodies identified 24% of the expressed noncoding RNAs as interactors (Khalil et al., 2009). RNAi knockdown of six of these lincRNAs showed significant overlap with the genes upregulated by knockdown of PRC2 components, but no significant overlap among the six noncoding RNAs (Khalil et al., 2009). One of these, TUG1, is induced upon DNA damage in a P53-dependent manner and is required for the repression of a set of genes involved in cell cycle regulation (Guttman et al., 2009; Khalil et al., 2009). Subsequent studies aimed at finding P53-induced noncoding RNAs identified lincRNA-p21 as a gene required for repression of genes in the P53 pathway (Huarte et al., 2010). While lincRNA-p21 is located near p21, which is also a repressor in the P53 pathway, their gene targets do not significantly overlap. LincRNA-p21 interacts with PRC2 similarly to many other lincRNAs, but it also interacts with hnRNP-K (Huarte et al., 2010). hnRNP-K, when interacting with P53, can activate genes. When associating with lincRNA-p21, however, hnRNP-K mediates repression of genes, and hnRNP-K localization to these targets requires intact lincRNA-p21 (Huarte et al., 2010). The corepression of genes by hnRNP-K and lincRNA-p21 provides a hint as to why so many different noncoding RNAs associate with PRC2 for transcriptional repression. Distinct noncoding RNAs could form unique scaffolds for histone-modifying complexes to associate with gene-specific targeting factors.
P53 alone is involved in the upregulation of approximately 30 noncoding RNAs under a variety of conditions and cell types, indicating that the gene regulatory potential for these large noncoding RNAs is enormous (Huarte et al., 2010). Recent work suggests that short RNA transcripts from the PRC2 target genes can form short hairpin structures that directly recruit PRC2 (Kanhere et al., 2010), and small RNAs transcribed from LINE elements on the mammalian X chromosome could play a key role in X inactivation (Chow et al., 2010), indicating that both large and small RNAs contribute to these silencing events.
For many of the complexes we have discussed here, several mechanisms exist for recruiting the same complex. PRC2, for example, can be recruited by its previously deposited H3K27me3 mark for epigenetic memory, it can be recruited through DNA sequences such as PREs, and it can be recruited through interacting with noncoding RNAs. The opposing activity of PRC2 is mediated in part by MLL, which also displays multiple modes of targeting to chromatin. For example, MLL complexes can be recruited through MLL’s CXXC domain to CpG islands (Ayton et al., 2004; Cierpicki et al., 2010). Translocation of the MLL gene with a variety of other genes results in chimeric proteins composed of an N-terminal portion of MLL that includes the CXXC domain. Mutations in MLL’s CXXC domain that prevent DNA binding cause an increase in DNA methylation at the Hoxa9 locus, reduced Hoxa9 expression, and reduced transformation of bone marrow cells by the MLL-AF9 chimera (Ayton et al., 2004; Bach et al., 2009; Cierpicki et al., 2010). It has also recently been proposed that MLL can be recruited to genes through a direct interaction between its CXXC domain and the PAF complex. Remarkably, it was found that only MLL1, and not other Set1-related proteins, could interact with the PAF complex (Milne et al., 2010; Muntean et al., 2010). At this time, it is unclear how these data can be reconciled with demonstrated functional differences for the Set1 and MLL complexes in mammals (Wang et al., 2009; Wu et al., 2008).
Another domain that may be important for targeting MLL to chromatin is its third PHD finger (PHD3), which specifically recognizes H3K4me3 (Chang et al., 2010). MLL also contains AT hooks, which are found in proteins that bind AT-rich sequences (Reeves and Nissen, 1990), although their requirement for recruitment of MLL to Hox or other loci is not known. Aside from domains that bind DNA sequences or histone modifications, MLL could also be recruited to genes through the coactivator model, due to its interaction with Menin-LEDGF and their recruitment by nuclear hormone receptors (Dreijerink et al., 2009; Eissenberg and Shilatifard, 2010).
Histone modifications play an indisputably important role in transcription and other DNA-templated processes. However, the identity of the enzyme and the mechanism of recruitment can influence the effect of the modification. This is exemplified by the diverse functions of the highly related Set1/COMPASS and MLL COMPASS-like complexes, whose biological roles are reflected in part by their mode of recruitment. Another example is the differing role of H3S10P at the FOSL1 gene, which when mediated early during activation by MSK1 at the promoter, or later at the enhancer by Pim1, leads to distinct downstream events (Zippo et al., 2009). Furthermore, histone-modifying enzymes can themselves be multi-faceted transcription factors, with only one aspect being the ability to modify histones. For example, distinct phenotypes are found for loss of MLL and loss of just the histone-modifying Set domain within MLL, the latter of which yields viable offspring (Terranova et al., 2006). Thus, rather than simply “writing or erasing” a code that is waiting to be “read”, histone-modifying activities are integral components of gene regulatory networks in a larger “chromatin signaling pathway”. A better understanding of the multiple modes of recruitment of these histone-modifying activities is therefore essential for understanding the gene regulatory processes in which they are engaged. The growing number of noncoding RNAs involved in this process provides a challenging, yet promising, avenue for future research. Identification of the factors that bind to these RNAs could help us better understand how histone-modifying enzymes are targeted for gene activation or repression.
We thank Laura Shilatifard for editorial assistance. The studies in the Shilatifard laboratory are supported in part by grants from the National Institutes of Health: R01GM069905, R01CA150265, and R01CA89455.