Chromosomal surfaces are ornamented with a variety of posttranslational modifications of histones, which are required for the regulation of many of the DNA-templated processes. Such histone modifications include acetylation, sumoylation, phosphorylation, ubiquitination and methylation. Histone modifications can either function by disrupting chromosomal contacts or by regulating non-histone protein interactions with chromatin. In this review, recent findings will be discussed regarding the regulation of the implementation and physiological significance for one such histone modification, histone H3 Lysine 4 (H3K4) methylation by the yeast COMPASS and mammalian COMPASS-like complexes.
All DNA-templated processes are regulated by chromatin and its posttranslational modifications. The nucleosome, which is the fundamental unit of chromatin, is composed of 146 base pairs of DNA wrapped nearly twice around an octamer of the four core histones (H3, H4, H2A, and H2B)[1–3]. Structural studies demonstrated that the unstructured histone N-terminal tails protrude outward from the nucleosomes and are available for interactions with neighboring histones or non-histone proteins. Many residues within the histone N-terminal tails and a few within the histone core can be altered by posttranslational modifications. To date, at least five types of posttranslational modifications have been found on histones: acetylation, phosphorylation, ubiquitination, sumoylation, and methylation[4,5]. Almost all of these modifications have been shown to be reversible, and their implementation and removal are fundamental to the regulation of a diverse set of biological processes such as replication, repair, recombination, transcription and RNA processing [4,5]. This review will focus mainly on the most recent studies of one type of histone modification, histone H3 lysine 4 (H3K4) methylation.
Histones are methylated on their arginine and/or lysine residues[6,7]. The lysine residues are methylated on the ε-nitrogen by either SET domain-containing or non-SET domain-containing lysine methyltransferases (KMTs). A large number of enzymes capable of methylating specific lysine residues on histones have been characterized (Table 1). The ε-nitrogen can also be modified by lysine acetyltransferases (KATs), and methylation and acetylation of the same lysine are mutually exclusive. Indeed, many sites of histone methylation are also sites of acetylation. While acetylation is typically associated with active transcription or open chromatin structure, the role of histone lysine methylation appears to be more diverse.
Histone lysine methylation can occur in mono-, di-, or trimethylated forms[8–10]. The same enzyme can implement each state in a processive manner, or in some cases, different enzymes can be required for different methylation states on the same residue[8,10] (Figure 1). The biological ramifications of histone methylation in transcriptional regulation are quite diverse. Histone methylation can be associated with actively transcribing RNA polymerase II (Pol II), can be involved in setting the stage for transcriptional repression within heterochromatin, and can sometimes do both[6,9,11–13]. For example, histone H3 lysine 9 (H3K9) methylation, which is implemented by the SUV39 family of enzymes, is mostly associated with the silent regions within both euchromatin and heterochromatin[14–18]. The SUV39 family of HMTs contains a cysteine-rich pre-SET domain, which is required for specificity towards H3K9. The human Suv39 family comprises Suv39H1, Suv39H2, G9a, EuHMTase, SetDB1 and CLL8 (KMT1A-F, respectively) (Table 1). The H3K9 methylation mark plays an important role as a binding site for the chromodomain-containing protein HP1[20,21]. Interactions of HP1 with H3K9-methylated chromatin can, therefore, result in chromatin compaction and heterochromatin generation. HP1 is found in α, β, and γ forms. Both HP1α and β interact with methylated H3K9 within the heterochromatin and the silent regions of euchromatin. However, the HP1γ interactions with methylated H3K9 are found on the chromatin of actively transcribed genes[12,22]. It is not clear at this time how HP1γ distinguishes methylated H3K9 on actively transcribed genes from methylated H3K9 on repressed genes. It is possible that the presence of bivalent marks on the chromatin of actively transcribed genes, a specific role for the hinge domain of each HP1 subtype, or both could explain such binding specificity. Incidentally, a feature of the hinge domain of HP1 is multiple sites of phosphorylation, and such posttranslational modifications of the hinge domain could be instrumental in various binding activities.
Unlike H3K9 methylation, histone H3K4 methylation is a posttranslational modification that is exclusively associated with actively transcribed genes[5,6,23]. The first H3K4 methylase complex, COMPASS, was identified in the yeast S. cerevisiae and consists of Set1/KMT2 and seven other polypeptides (named Cps60-Cps15). Set1/KMT2 alone is not enzymatically active, but within COMPASS it is capable of mono-, di-, and trimethylating H3K4 [6,8,10,24–27]. Following the identification of Set1/COMPASS as an H3K4 methylase, it was demonstrated that its mammalian homologues, the MLL proteins MLL1-4 and hSet1A and B, are found in COMPASS-like complexes capable of methylating the fourth lysine of histone H3 [6,28,29] (Figure 2). While there is only one Set1 in yeast, there are at least six Set1-related proteins in mammals, all capable of catalyzing the methylation of H3K4 (Figure 2). It is evident that H3K4 methylase activities in mammals are not redundant, as deletion of either the MLL1 or the MLL2 gene is embryonic lethal [30,31]. Why do mammals need so many different H3K4 methylases? It is perhaps because mammals need to control the opposing, silencing effects of the H3K27 methylation mark at different genomic loci in different cellular contexts, while single-celled eukaryotes do not. It is possible then that mammals have a built-in, intricate network of H3K4 methylases that perform the ancient functions as well as oppose, and perhaps reverse, H3K27 methylation. However, what we have learned so far from yeast to human indicates that the compositional and functional conservation between the MLL/hSet1A and B complexes and COMPASS establishes the existence of a highly conserved, ancient molecular machinery for the modification of histone H3K4 by methylation.
It was initially demonstrated in yeast that COMPASS and H3K4 methylation are localized to punctate sites near the transcription start sites of actively transcribed genes [32,33]. This H3K4 localization pattern also holds true in higher eukaryotic organisms including mammals. However, since MLL is required for the proper transcription of Hox gene clusters, H3K4 localization studies in mammals also focused on these regions. From these studies, a unique pattern of H3K4 methylation is observed in Hox gene clusters. These studies demonstrated the presence of a large region of continuous H3K4 methylation across multiple genes, which was also present in the intergenic regions within the Hox cluster [34–37]. Given what we know about the correlation of the H3K4 methylation pattern with transcriptionally active Pol II, the extended H3K4 methylation domains on the Hox gene clusters could simply reflect transcriptional activity throughout the intergenic regions. In other words, the broad methylation pattern on the Hox cluster may indicate that non-coding RNAs are transcribed from these intergenic regions by Pol II, and that such RNAs could be essential regulators of development and differentiation. It is not clear at this time which MLLs and/or hSet1s are involved in H3K4 methylation within these regions.
Recent global histone modification studies in higher eukaryotic organisms have confirmed earlier findings in S. cerevisiae that the H3K4 methylation mark is mostly associated with the early transcribed regions of active genes and that the H3K36 methylation mark, implemented by Set2/KMT3, is associated with the body of transcribing genes [5,6]. Indeed, all transcriptionally active genes contain nucleosomes with H3K4 trimethyl and H3K9/14 acetylation modifications in conjunction with the H3K36 trimethyl mark. This pattern of modifications is now well accepted as a fingerprint of an actively transcribed gene. However, a subset of genes from Drosophila to human carries the H3K4 trimethylation mark without bearing an H3K36 trimethylation mark[38,39]. These genes, which encode most developmentally regulated proteins, contain preinitiated Pol II at their promoters. It has been postulated that the expression of these genes is somehow regulated by Pol II general elongation factors.
Studies in S. cerevisiae initially demonstrated that histone H2B monoubiquitination is required for H3K4 methylation by COMPASS. We now know that monoubiquitination of H2B is required for H3K4 di- and trimethylation in yeast and that this crosstalk pathway is highly conserved from yeast to human [6,40,41]. Although the past several years brought about a wealth of information regarding the factors/proteins required for proper H2B monoubiquitination [6,40–48], the molecular mechanism for this histone crosstalk was poorly understood. Recent studies in the yeast S. cerevisiae demonstrated that monoubiquitination of histone H2B regulates COMPASS’ compositional assembly, and therefore, proper H3K4 methylation. Initial biochemical studies indicated that COMPASS purified from a monoubiquitination-deficient background lacks one of its subunits, the WD-repeat-containing protein Cps35, and also lacks the enzymatic activity to di- and trimethylate H3K4. Cps35 can interact with chromatin in a monoubiquitination-dependent but Set1-independent manner. It was also noted that the addition of Cps35 to COMPASS purified from a monoubiquitination-deficient background can restore trimethylase activity. Based on this information, it has been proposed that the monoubiquitination-dependent recruitment of Cps35 to chromatin can activate COMPASS’ trimethylase activity on chromatin (Figure 3). It is still to be determined whether Cps35 interacts with monoubiquitinated chromatin directly or via interactions with other factors. Also, the identification of the mammalian homologue of Cps35 and the analysis of its role in the regulation of H3K4 trimethylation will be of great interest.
Many of the identified histone methylations on chromatin serve as a “mark” that can be recognized by specific proteins, resulting in their recruitment, and therefore, implementation of their enzymatic and physiological activities at the site of recruitment. Given the evolutionarily conserved pattern of the distribution of H3K4 trimethylation at the start site at the 5′ end of active genes [34,50], many laboratories have been working intensively to define the physiological significance of this mark. To date, several factors that directly associate with di- and trimethylated H3K4 have been reported [51–55]. These include the CHD1 protein and proteins containing a plant homeodomain (PHD) finger domain, such as ING2, ING4, ING5, RAG2, and BPTF[51–55]. Recent studies also demonstrated that the basal transcription complex TFIID directly binds to trimethylated H3K4 via the PHD finger of TAF3. The physiological significance of some of the recently identified interactions will be briefly described.
Earlier studies implicated a role for CHD1 after transcription initiation, during transcriptional elongation and termination, in a variety of model systems including the yeast and mammalian systems. Transcriptional elongation and termination, although not as well understood as transcriptional initiation, are highly regulated processes. The biogenesis of messenger RNA after the initiation of transcription incorporates multiple concurrent events. These events include transcript elongation, mRNA capping, splicing, polyadenylation, and termination followed by mRNA surveillance and export. Although not well understood, chromatin and its modification could play an essential role in many of these processes. Recent reports demonstrated that human CHD1’s recognition of trimethylated H3K4 functions in the recruitment of factors implicated in transcriptional elongation and pre-mRNA processing.
Affinity purification using trimethylated H3K4 as bait has resulted in the identification of CHD1 and its interacting proteins, including components of the spliceosome, as an H3K4 trimethyl-specific binding complex. In support of a role for human CHD1 in splicing, the depletion of CHD1 from extracts or its RNAi knockdown dramatically reduced splicing efficiency in vitro, and also impaired pre-mRNA splicing on active genes in vivo. Furthermore, reduction in the level of H3K4 trimethylation using Ash2 RNAi can phenocopy the pre-mRNA splicing defects, and results in a reduced association of the U2 snRNP components with chromatin on active genes. Together, these findings suggest that H3K4 trimethylation could facilitate pre-mRNA maturation by bridging the spliceosomal components and CHD1 to actively transcribed genes.
Since H3K4 trimethylation is a mark found at the start site at the 5′ end of active genes, it is not clear whether CHD1 binding to these sites serves only splicing events at the 5′ end of the active genes. Alternatively, once CHD1 interacts with the 5′ end of active genes via the H3K4 trimethylation mark, the splicing machinery could be loaded at the outset of transcription and then travel with the elongating Pol II. Furthermore, given the existence of a highly conserved machinery for the implementation of H3K4 trimethylation from yeast to human, it would be of great interest to assess the pre-mRNA splicing competency and the association of the U2 snRNP components with the chromatin of actively transcribed genes in yeast cells lacking either COMPASS components or H3K4. Although CHD1 also exists in yeast, its molecular interaction with trimethylated H3K4 has been questioned. Therefore, it would be of great interest to determine whether other factors could link spliceosomal components to actively transcribed genes in yeast through trimethylated H3K4. Given that fewer genes require splicing in yeast cells than in mammalian cells, it is plausible that higher eukaryotes have evolved to use CHD1 in a different manner than yeast.
The Recombination Activating Gene (RAG) 2 is involved in antigen-receptor gene assembly in lymphocytes. The RAG2 protein contains a PHD finger toward its C terminus. Since PHD domains have been shown to be capable of interacting with trimethylated H3K4, the association of the RAG2 PHD domain with H3K4 was recently tested. The crystal structures of RAG2-PHD alone and complexed with methylated H3K4 have been determined. These studies have demonstrated that in the absence of the modified peptide, a peptide N-terminal to RAG2-PHD occupies the substrate-binding site. This observation suggests the existence of an autoregulatory mechanism for RAG2. Furthermore, unlike other PHD domains interacting with trimethylated H3K4, the RAG2-PHD domain substitutes a carboxylate that interacts with the arginine 2 residue on the H3 peptide. This interaction results in the enhancement of the binding of RAG2 to trimethylated H3K4 when H3R2 is dimethylated. It was recently reported that H3R2 dimethylation can negatively regulate H3K4 trimethylation, and therefore, H3R2 methylation is mutually exclusive with the trimethyl form of H3K4[61,62]. It is still not clear whether RAG2 can encounter a histone H3 tail bearing both dimethylated H3R2 and trimethylated H3K4.
Since RAG2 is an essential component of the RAG1/2 V(D)J recombinase, the functional significance of its trimethyl H3K4 interaction domain for V(D)J recombination was tested. Mutations that perturb RAG2's interaction with trimethylated H3K4 result in a severe impairment of V(D)J recombination. Similarly, a reduction in the levels of trimethyl H3K4 by the overexpression of an H3K4 tridemethylase also results in a decrease in V(D)J recombination. Since the tryptophan 453 residue (W453) of RAG2 constitutes a key structural component of its trimethyl H3K4 binding surface, and since this residue is mutated in patients with immunodeficiency syndromes, the physiological significance of such interactions is of great interest. Furthermore, it would be highly valuable to define which of the MLLs are required for H3K4 trimethylation in lymphocytes to regulate RAG2’s role in recombination.
The past five years have brought about a wealth of information regarding the enzymatic machinery and molecular factors required for H3K4 methylation in yeast. We have also learned how yeast cells use different factors within and outside of COMPASS to regulate the mono-, di- and trimethylation of H3K4. The information provided by the yeast model system has proven to be a great template guiding the studies of the mechanism of H3K4 methylation in metazoans and mammals. We have also gained a great deal of information about the diverse factors that can interact with differentially methylated H3K4. Despite all of the above information, there is still a great deal to learn about the physiological significance of H3K4 methylation and why the perturbation of one of the enzymes involved in its implementation, the MLL protein, results in the pathogenesis of hematological malignancies. Furthermore, it remains an open question why there are several H3K4 methylases in mammals with nonredundant activities. The continued comprehensive and thorough application of biochemistry, genetics, genomics and proteomics in model systems ranging from yeast to mammals should provide clues to these remaining questions within the next five years.
I am grateful to Dr. Edwin Smith for critical reading of this review and Laura Shilatifard for editorial assistance. The research in Shilatifard’s laboratory is supported by grants from the National Institutes of Health (2R01CA089455 and 1R01GM069905).