The C-terminal domain (CTD) of the RNA polymerase II subunit Rpb1 undergoes dynamic phosphorylation, with different phosphorylation sites predominating at different stages of transcription. Our lab studies how various mRNA processing and chromatin-modifying enzymes interact with the phosphorylated CTD to efficiently produce mRNAs. The H3K36 methyltransferase Set2 interacts with CTD carrying phosphorylations characteristic of downstream elongation complexes, and the resulting co-transcriptional H3K36 methylation targets the Rpd3S histone deacetylase to downstream transcribed regions. Although positively correlated with gene activity, this pathway actually inhibits transcription elongation as well as initiation from cryptic promoters within genes. During early elongation, CTD serine 5 phosphorylation helps recruit the H3K4 methyltransferase complex containing Set1. Within 5' transcribed regions, co-transcriptional H3K4 dimethylation (H3K4me2) by Set1 recruits the deacetylase complex Set3C. Finally, H3K4 trimethylation at the most promoter-proximal nucleosomes is thought to stimulate transcription by promoting histone acetylation by complexes containing the ING/Yng PHD finger proteins. Surprisingly, the Rpd3L histone deacetylase complex, normally a transcription repressor, may also recognize H3K4me3. Together, the cotranscriptional histone methylations appear to primarily function to distinguish active promoter regions, which are marked by high levels of acetylation and nucleosome turnover, from the deacetylated, downstream transcribed regions of genes.
The emergence of histones was a major evolutionary milestone, as a larger amount of DNA could be compacted into each cell. An additional benefit was the ability of chromatin to silence a large percentage of genes on the chromosomes, allowing a single genome to encode transcription programs for a multitude of diverse cell types. However, the repressive properties of chromatin also created the problem of how to allow the enzymes necessary for transcription to access the wrapped DNA. A significant proportion of eukaryotic gene regulation appears to involve control of access to specific DNA sequences. This regulation is carried out by a large number of transcription factors, chromatin modifiers, and chromatin remodelers that can activate transcription by removing histones or repress transcription by stabilizing repressive chromatin.
Early studies noted a general correlation of certain histone modifications with transcription state. Histone acetylation is highest in euchromatic, active regions of the genome. In contrast, histone methylation appears to correlate with transcriptionally silent, heterochromatic regions. While some of these modifications might directly affect chromatin compaction, it appears that they are predominantly used, individually or in combinations, for binding chromatin-related proteins, including many of the modifiers and remodelers themselves. This "histone code" model (Strahl and Allis 2000) suggests that information about transcriptional status can be encoded in the pattern of histone modifications, and that this information could potentially be self-reinforcing or even heritable through DNA replication.
An example of such a positive feedback loop is seen with several histone acetyltransferase (HAT) complexes that have subunits with one or more bromodomains, a domain that often preferentially binds acetylated lysines (Lall 2007). If activation of a gene is first established by recruitment of the HAT to a specific promoter, the subsequent histone acetylation can then provide a second mechanism for maintaining the HAT at the promoter for continued acetylation and activation. Other protein complexes that contain bromodomains include the ATP-dependent chromatin remodelers of the Swi/Snf and ISW families, as well as the basal transcription factor TFIID. In aggregate, high levels of histone acetylation at the promoter appear to be essential for the more rapid nucleosome turnover that allows the RNA polymerase II transcription machinery to access the promoter DNA.
Positive feedback loops also appear to be in operation for histone methylation at heterochromatin. Unlike acetylation, where there appears to be some redundancy between different acetylation sites, specific methylated histone residues have distinct binding partners. Methylation of histone H3 at lysines 9 and 27 promotes heterochromatin formation via binding to the repressive HP1 and Polycomb complexes, respectively (Grewal and Jia 2007; Simon and Kingston 2009). The HP1 protein binds methylated H3K9, but also interacts with the H3K9 methyltransferase to reinforce this mark. Similarly, the H3K27 methyltransferase complex PRC2 can bind to this mark to create a positive feedback loop. The exact mechanisms by which K9 and K27 methylation lead to transcription repression are still not entirely clear, but at least in part involve recruitment of HDACs and/or prevention of HAT recruitment.
Given the connection of H3K9me and K27me to transcription repression, it was a surprise to find that methylation of two other residues, H3K4 and H3K36, is strongly correlated with transcribing genes (Hampsey and Reinberg 2003). Indeed, these modifications have been used as diagnostic markers for RNA polymerase II transcription, helping to identify new genes and transcripts that do not encode proteins. It was originally assumed that these marks promote transcription, with many publications referring to K4 and K36 methylations as "activating" marks. Our lab and others have been addressing how these marks get targeted to active genes and exploring how they actually affect gene expression. It is clear that thinking of these transcription-associated methylations as activating transcription is an oversimplification.
Single gene and genome-wide mapping of H3K4 and K36 methylation showed that these marks have different distributions over active genes (Liu et al. 2005; Pokholok et al. 2005). H3K36me2 and me3 are underrepresented at promoters, but found at high levels in transcribed regions. H3K4 methylation changes throughout the gene, with trimethylation highest at the very 5' end of the transcribed region and dimethylation peaking slightly further downstream in the middle of genes. The clear correlation of methylation pattern with transcription direction suggests direct communication between transcription complexes and histone methyltransferases (HMTs) and/or demethylases.
One mechanism for this coupling is direct interaction of HMTs with the elongating RNA polymerase II (RNApII). The C-terminal domain (CTD) of the RNApII largest subunit consists of a repeating seven amino acid sequence (YSPTSPS). A series of targeted kinases and phosphatases generates a stereotypical pattern of CTD phosphorylation changes, with phosphorylation of serines 5 and 7 highest at 5' ends of genes and serine 2 peaking further downstream (Buratowski 2003). This CTD cycle is used to recruit the mRNA processing and termination enzymes at the appropriate time when they are needed (Buratowski 2009).
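As an illustration of the residue numbering used for the CTD, the following minimal Python sketch locates the serines within the consensus heptad and records where along a gene each phosphorylation is described as peaking. It assumes an idealized CTD in which every repeat matches the consensus; the function names, the `n_repeats` parameter, and the `PEAK_REGION` table are hypothetical helpers introduced only for this example.

```python
# Consensus heptad from the text; positions are numbered 1-7 within each repeat.
HEPTAD = "YSPTSPS"


def serine_positions(repeat=HEPTAD):
    """Return the 1-based positions of serine (S) within one heptad repeat."""
    return [i for i, residue in enumerate(repeat, start=1) if residue == "S"]


def label_ctd_serines(n_repeats):
    """List every serine in an idealized CTD of n_repeats consensus heptads.

    Each entry is (repeat_number, position_in_repeat), so (1, 2) is Ser2 of
    the first repeat. Assumption: all repeats match the consensus sequence.
    """
    return [(r, p) for r in range(1, n_repeats + 1) for p in serine_positions()]


# Qualitative summary of the phosphorylation pattern described in the text:
# Ser5 and Ser7 phosphorylation are highest at 5' ends of genes, Ser2 further downstream.
PEAK_REGION = {2: "downstream", 5: "5' end", 7: "5' end"}

if __name__ == "__main__":
    for pos in serine_positions():
        print(f"Ser{pos}: phosphorylation peaks at {PEAK_REGION[pos]}")
```

The sketch only encodes the qualitative pattern; it says nothing about the kinases, phosphatases, or quantitative phosphorylation levels involved.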
Remarkably, this same CTD cycle also helps generate the H3K4/K36 methylation pattern. When the H3K36 methyltransferase Set2 was purified from yeast cells, it was found associated with phosphorylated RNA polymerase II (Li et al. 2002; Krogan et al. 2003; Schaft et al. 2003; Xiao et al. 2003; Li et al. 2009). Structural studies suggest that Set2 preferentially binds to CTD phosphorylated at both serine 2 and serine 5 (Li et al. 2005). Set2 crosslinked throughout transcribed regions, and both this crosslinking and H3K36 methylation were lost when the major CTD serine 2 kinase Ctk1 was deleted. Similarly, the CTD serine 5 kinase Kin28 has been found to be important for H3K4 methylation at the 5' end of genes (Ng et al. 2003). In higher eukaryotes, which have multiple H3K4 methyltransferases, transcription activators may also recruit some H3K4 methyltransferases to specific promoters. While these mechanisms target H3K4 methylation to 5' regions of genes, it remains unclear what generates the offset between H3K4me3 and me2 peaks. It may involve regulation of Set1 activity by other associated proteins. The yeast Set1 complex has multiple subunits, as do homologous higher eukaryotic complexes, and some subunits appear to be necessary specifically for H3K4 trimethylation.
The association of Set2 with transcribing RNA polymerase II led to suggestions of a positive role in elongation. However, deletion of SET2 actually produced phenotypes more consistent with inhibition of transcription. First, while most elongation mutants are sensitive to the drug 6-azauracil (6AU), set2Δ cells are actually more resistant (Li et al. 2003; Keogh et al. 2005). Second, phenotypes caused by mutations in several positive elongation factors can be suppressed by set2Δ. Suppressed mutations include loss of the Bur1 kinase (Bur1 is the yeast P-TEFb) (Keogh et al. 2005), defective Spt16 (Biswas et al. 2006), and defective Spt5 (Quan and Hartzog 2010). Suppression is expected if the combined deletion of both a positive and negative elongation factor restores a balance to allow normal elongation. The inhibitory effects of Set2 are directly due to H3K36 methylation, since bur1Δ lethality is also suppressed by overexpression of H3K36 demethylases or mutation of K36 (Keogh et al. 2005; Kim and Buratowski 2007).
Clustering of genetic interaction patterns showed that set2Δ strains behaved very similarly to deletion strains for two other proteins: the PHD finger protein Rco1 and chromodomain protein Eaf3 (Keogh et al. 2005). Biochemical purifications showed that these two proteins were subunits of a complex that also contained the known histone deacetylase Rpd3 and its accessory subunits Sin3 and Ume1 (Carrozza et al. 2005b; Keogh et al. 2005). This complex was designated the Rpd3 Small (Rpd3S) Complex, and is distinct from the Rpd3 Large (Rpd3L) complex that carries out targeted, promoter-specific transcription repression (see below). Deletion of genes encoding Rpd3S-specific proteins, but not Rpd3L-specific proteins, led to higher levels of acetylated histones in downstream regions of genes, as well as suppression of bur1Δ lethality and improved growth of other elongation mutant strains (Keogh et al. 2005; Quan and Hartzog 2010).
A molecular model proposed to explain the connection between Set2 and Rpd3S invokes binding of the Eaf3 chromodomain to the H3K36 methylation deposited by Set2. Point mutation of H3K36 or the Eaf3 chromodomain causes the same increase in acetylation as deletion of Rpd3S. However, Eaf3 alone cannot be the sole determinant of Rpd3S targeting, since it is also a subunit of the NuA4 histone acetyltransferase complex, which is found predominantly at promoters. The PHD domain of Rco1 also appears to be critical for proper Rpd3S function in downstream, transcribed regions (Li et al. 2007). The most parsimonious model proposes that the combination of H3K36 methylation and whatever chromatin feature is recognized by Rco1 would be sufficient for targeting of Rpd3S. However, two recent papers propose that Rpd3S instead is recruited first to genes by interactions with the phosphorylated CTD of elongating RNApII, but then only deacetylates where H3K36 methylation is found (Drouin et al. 2010; Govind et al. 2010). This model would entail simultaneous binding of both the H3K36me "writer" (Set2) and "reader" (Eaf3/Rpd3S) to the CTD. Supporting this model, both Set2 and Rpd3S preferentially bind doubly phosphorylated CTD (Ser2P/Ser5P) in vitro. This coupling may allow for the most efficient targeting of both activities during chromatin reassembly in the wake of the elongation complex.
Histone methylation and deacetylation would generally be predicted to be repressive for transcription. Indeed, the phenotypic suppression of positive elongation factor mutants when the Set2/Rpd3S pathway is disrupted suggests that the Set2/Rpd3S pathway inhibits elongation. However, another striking effect seen in the absence of Set2/Rpd3S is the appearance of new transcripts initiating within many transcribed genes (Carrozza et al. 2005b). These internal cryptic promoters are normally repressed by H3K36 methylation-targeted deacetylation. In the absence of these modifications, the disruption of chromatin by transcribing RNApII exposes sequences that can act as promoters. Other factors that produce similar cryptic promoters include Spt6 and Spt16 (Cheung et al. 2008), factors thought to be involved in replacing chromatin behind transcription elongation complexes. Clearly, these mechanisms exist for replacing nucleosomes in the wake of RNApII and keeping them in a repressive configuration.
The peak of H3K4 dimethylation is found from 5' to middle regions of genes, downstream of the promoter-proximal trimethylation peak (Liu et al. 2005; Pokholok et al. 2005). Given the positive correlation between H3K4 methylation and transcription, it is widely assumed that histone modifications at this residue promote transcription. However, deletion of SET1 has only a few obvious transcription effects that have been clearly documented. Chromatin immunoprecipitation in a set1Δ strain reveals an unexpected increase in acetylation of histones H3 and H4 in transcribed regions, predominantly in the first 500–1000 base pairs (Kim and Buratowski 2009) (Figure 1). Experiments using mutations in the Set1 complex or Rad6-Bre1 H2B ubiquitination pathway that differentially affect methylation states show that H3K4me2, but not me3, is necessary for maintaining the lower level of acetylation.
A similar increase in acetylation is seen when H3K4 is mutated (Kim and Buratowski 2009), suggesting there could be either a histone acetyltransferase that is inhibited by K4me2 or a deacetylase recruited to the H3K4-dimethylated nucleosomes. Using a candidate approach, we screened deletions of genes for several PHD proteins that bind methylated K4 in vitro (Shi et al. 2007) and are also components of known deacetylase complexes (Kim and Buratowski 2009) (Figure 2). The Rpd3 Large complex contains two such proteins: Pho23 and Rxt1 (also known as Cti6). Neither deletion produced the same acetylation increase seen in set1Δ cells, although the pho23Δ strain did have a reproducible increase in acetylation near promoters (discussed below). In contrast, deletion of the gene for the PHD protein Set3 produced acetylation changes in transcribed regions essentially identical to those seen with set1Δ (Kim and Buratowski 2009).
The Set3 protein is quite interesting because it apparently has a methyl-lysine "reader" module (a PHD finger) and a lysine methyltransferase "writer" (a SET domain). Set3 is most closely related to the MLL5 protein of higher eukaryotes, which has a similar domain structure. A target for the Set3 methyltransferase has not yet been identified. MLL5 has been suggested to be a H3K4 methyltransferase (Fujiki et al. 2009), but this is based largely on sequence similarity to other MLL proteins and some in vitro experiments, and has not been definitively demonstrated in vivo.
Yeast Set3 is one subunit of a complex that contains two histone deacetylase subunits, the Rpd3-like protein Hos2 and the sirtuin Hst1 (Pijnappel et al. 2001). Additional subunits include the WD40 protein Sif2 and the SANT domain protein Snt1. Deletion of any of these subunits also leads to increased histone acetylation in 5' transcribed regions (Kim and Buratowski 2009). The subunit composition of the Set3 Complex (Set3C) is remarkably similar to that of the higher eukaryotic NCoR-SMRT complexes (Pijnappel et al. 2001). MLL5 has not been reported to physically associate with NCoR-SMRT, but an RNAi screen found that MLL5 and NCoR2 have very similar phenotypic profiles (Kittler et al. 2007). NCoR-SMRT has been identified as a corepressor for hormone receptors and other sequence-specific transcription regulators that directly recruit the HDAC to specific promoters for repression.
As with Rpd3S, the simplest model proposes that recognition of the H3K4me2 mark by Set3C is sufficient for recruiting the complex to the appropriate location. Indeed, we found that point mutations in the Set3 PHD finger that inactivate histone binding cause the same increase in acetylation seen with set3Δ (Kim and Buratowski 2009). These mutations also cause loss of Set3C crosslinking in a ChIP assay. An alternative mechanism has been proposed in which Set3C first interacts with the phosphorylated CTD of RNApII for recruitment to active genes, but then only deacetylates nucleosomes that have the appropriate H3K4 methylation (Govind et al. 2010).
The function of Set3C remains unclear. Loss of Set3C results in premature expression of several genes involved in sporulation and meiosis, suggesting a repressive role in transcription (Pijnappel et al. 2001). However, set3Δ cells also have delayed induction kinetics of galactose-responsive genes, indicating a positive role (Wang et al. 2002). Interestingly, many of the genes affected by Set3 have an overlapping, antisense transcript (data not shown). Therefore, Set3C may affect the balance between antisense transcript pairs, either by affecting transcription elongation or initiation from the downstream promoter. Both positive and negative effects could result from perturbing this balance.
We are currently testing three non-mutually exclusive models for how Set3C may function in transcription. The first is that deacetylation by Set3C helps suppress cryptic internal initiation, much like the Set2/Rpd3S system. Set3C would carry out this function in more promoter-proximal regions while Rpd3S would function at the more downstream transcribed regions. Our preliminary experiments indicate that set3Δ does not activate the same cryptic start sites seen in set2Δ. However, some examples of Set3-suppressed cryptic initiation sites have been found (TaeSoo Kim et al., manuscript in preparation).
A second possible Set3C function is to restrict the spread of chromatin acetylation and remodeling to promoters only. Recruitment of HATs to upstream regulatory sites may lead to acetylation beyond the promoter-associated nucleosomes. Even if the original acetylation is promoter-restricted, the association of HAT and remodeler bromodomains with acetylated histone tails could lead to spreading of disrupted chromatin to adjacent nucleosomes unless kept in check by HDACs like Set3C and Rpd3S. Reduced nucleosome occupancy in transcribed regions may adversely affect transcription elongation kinetics or other aspects of gene expression. The H3K4me3 to me2 transition established during elongation may therefore mark a boundary to limit the region of highly accessible chromatin to promoter DNA.
There is a third potential function for Set3C, which may also help explain the function of NCoR-SMRT as a transcription repressor. H3K4me3 appears to be the one methylation most likely to have a true positive effect on transcription (see below). For many years it was believed that histone methylations were largely irreversible. However, this persistence could be problematic when it becomes important for cells to rapidly repress a gene's expression. The discovery of certain JmjC proteins that demethylate H3K4me3 to me2 (Lall 2007) suggests a model for a rapid transition between active and repressed states. When the gene is induced, H3K4me3 helps recruit factors positive for transcription (see below). However, recruitment of a H3K4me3 demethylase would not only lead to loss of these positive factors, but would also lead to rapid recruitment of Set3C or NCoR-SMRT to promoter nucleosomes that now carry H3K4me2. The subsequent deacetylation would further contribute to shutting down the gene. The effects of such a mechanism might manifest primarily in the kinetics rather than the extent of repression.
H3K4me3 correlates strongly with active transcription and peaks at the most 5' nucleosome. In higher eukaryotes, several important transcription factors have domains that recognize this modification. The Taf3 subunit of the basal factor TFIID contains a PHD finger that is important for association of TFIID with chromatin (Vermeulen et al. 2007). Also, the Chd1 protein, a presumed chromatin remodeler thought to function in early elongation, recognizes H3K4me3 via its chromodomains (Flanagan et al. 2005). Interestingly, yeast Taf3 lacks the PHD finger, and structural predictions suggest the yeast Chd1 protein is probably not able to bind H3K4me3 (Flanagan et al. 2005; Okuda et al. 2007). Therefore, these interactions that presumably promote transcription are not conserved in unicellular eukaryotes.
One family of conserved proteins that recognize H3K4me3 is the INGs (Inhibitors of Growth). Initially discovered as tumor suppressors, these PHD finger proteins are components of histone acetyltransferases and deacetylases (Doyon et al. 2006). There are three ING homologues in S. cerevisiae: Yng1, Yng2, and Pho23. Yng1 is a component of the NuA3 histone H3 HAT complex and Yng2 is a subunit of the NuA4 histone H4 HAT. Both of these complexes promote transcription by acetylating promoter-proximal nucleosomes, which in turn is thought to promote binding of chromatin remodelers and accessibility of the underlying DNA. NuA4 can be targeted to promoters by transcription activators (Ikeda et al. 1999; Reeves and Hahn 2005). The PHD finger of Yng2 is not absolutely required for NuA4 recruitment to chromatin, but may provide a second, activator-independent mechanism for enhancing association with promoter-proximal nucleosomes.
Recruitment of NuA3 appears to involve direct binding to histones. NuA3 binding to nucleosomes is dependent upon both Set1 and Set2 methyltransferases (Martin et al. 2006b). The interaction is apparently due to the combined actions of the PHD domain and a second histone-binding domain in Yng1 (Martin et al. 2006a; Taverna et al. 2006; Chruscicki et al. 2010) and possibly binding of the PHD finger in Nto1 to H3K36me (Shi et al. 2007). While the Yng1 PHD finger can be deleted without obvious phenotypes, the ChIP signal for NuA3 is biased towards 5' ends and this enrichment may be due to interaction with H3K4me3 (Taverna et al. 2006). Clearly, recruitment of NuA3 and NuA4 is not by a single mechanism and probably involves multiple interactions that contribute differently at individual genes and at different points in gene expression.
The Rpd3L complex, which contains the third yeast ING protein known as Pho23, is also targeted by multiple mechanisms. The Rpd3L complex was first identified genetically as a gene-specific repressor, and at these genes the complex is strongly recruited by sequence-specific binding proteins such as Ume6 and Ash1 (Carrozza et al. 2005a). The homologous mammalian complexes (containing Sin3 and one of several Rpd3-like HDACs) also interact with factors that bind specific sequences. Localized deacetylation of promoter nucleosomes is thought to inhibit the transcription machinery from accessing the DNA and represents one of the most common mechanisms for gene repression.
Surprisingly, there is also evidence suggesting that Rpd3L functions at active promoters. The first low-resolution, genome-wide ChIP analysis of Rpd3 reported a positive correlation between Rpd3 crosslinking and transcription level (Kurdistani et al. 2002). This study predated the discovery of the two Rpd3 complexes, but a recent report of high-resolution ChIP-seq results found that two Rpd3L-specific subunits (Rxt2 and Sds3) showed a peak over active promoters (Drouin et al. 2010). In contrast, inactive promoters did not show such a peak unless they were direct targets of Rpd3L (i.e. they used a direct targeting mechanism as described in the previous paragraph). Similarly, experiments in mammalian cells also found multiple HDACs at active promoters (Wang et al. 2009). Deletion of the gene for Pho23 led to a measurable increase in acetylation at promoter-proximal chromatin of several active yeast genes (Kim and Buratowski 2009) (Figure 2). This effect was not seen upon deletion of the gene for Rxt1/Cti6, the second PHD protein in Rpd3L. Therefore, Rpd3L may be recruited to active or recently active promoters via recognition of H3K4me3 by Pho23.
It may seem paradoxical that an ostensibly repressive complex should be found at active promoters. In this regard, it is worth noting that promoter binding of the well-characterized transcription inhibitors Mot1 and NC2 also correlates very strongly with gene expression (for review, see Sikorski and Buratowski 2009). One can imagine several possible scenarios consistent with these results. The first is that the association of Rpd3L with promoters actually occurs as they are being shut down. In yeast, most "active" genes are actually transcribed very infrequently (Holstege et al. 1998) and so Rpd3L may be associating only during their inactive periods. A second possibility is that active promoters have an optimal level of acetylation that is acquired by competition between ongoing acetylation and deacetylation. Surprisingly, deletion of Rpd3L subunits actually causes sensitivity to 6-azauracil (Keogh et al. 2005) and several studies suggest this HDAC has a positive role in transcription at some genes (Sharma et al. 2007; Alejandro-Osorio et al. 2009). The balance between HATs and HDACs might be coupled to cycles of initiation (i.e., one round of acetylation and deacetylation per transcription activation event). However, even without such linkage, an active competition between HATs and HDACs allows for faster gene expression changes in response to changing cellular conditions.
The overlapping effects of the cotranscriptional histone modification pathways described here generate at least three different zones along active genes (Figure 3; note that these zones undoubtedly overlap, but are drawn as discrete regions for clarity). At promoters, a highly acetylated state is established by recruitment of HATs and remodelers by transcription activators, resulting in overall lower nucleosome density. This state is maintained by the association of several bromodomain-containing HATs and remodelers with acetylated histones. In addition, the act of transcription leads to H3K4 trimethylation near the promoter, which also serves to localize several PHD-containing HATs. In higher eukaryotes, additional PHD and chromodomain proteins such as Taf3 and Chd1 may also recognize H3K4me3 as a signal to promote transcription. At the next few nucleosomes in active genes, transcription-dependent H3K4me2 targets the activity of the Set3C HDAC, resulting in lower acetylation and more stable nucleosome association. Further downstream, the association of Set2 with elongating RNApII methylates H3K36, which in turn targets the activity of the Rpd3S HDAC.
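The three-zone model can be summarized as a small lookup table. The Python sketch below is only a schematic restatement of the model as described in this review: the `Zone` type, its field names, and the `hdac_for_mark` helper are hypothetical, the zones are treated as discrete even though they overlap in vivo, and the promoter entry omits Rpd3L, whose possible recognition of H3K4me3 is discussed separately above.

```python
from typing import List, NamedTuple, Optional


class Zone(NamedTuple):
    """One schematic zone along an active gene (illustrative field names)."""
    region: str
    methyl_mark: str
    targeted_hdac: Optional[str]  # HDAC targeted by the mark, if any
    acetylation: str              # qualitative acetylation level


# Ordered from promoter to 3' end, following the model in the text.
ZONES: List[Zone] = [
    Zone("promoter", "H3K4me3", None, "high"),
    Zone("5' transcribed region", "H3K4me2", "Set3C", "low"),
    Zone("downstream transcribed region", "H3K36me", "Rpd3S", "low"),
]


def hdac_for_mark(mark: str) -> Optional[str]:
    """Return the HDAC complex that the model links to a methyl mark, or None."""
    for zone in ZONES:
        if zone.methyl_mark == mark:
            return zone.targeted_hdac
    return None


if __name__ == "__main__":
    for zone in ZONES:
        hdac = zone.targeted_hdac or "none (HATs predominate)"
        print(f"{zone.region}: {zone.methyl_mark}, HDAC: {hdac}, "
              f"acetylation {zone.acetylation}")
```

Keeping the table ordered from promoter to 3' end mirrors the direction of transcription, which is the feature of the model that ties each mark to a position along the gene.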
The functions of these cotranscriptional modification pathways remain to be completely understood. Minimally, they appear to help reinforce transcription from "real" promoters while suppressing initiation from transcribed sequences that serve as cryptic promoters when exposed by the transit of RNApII. However, there are an increasing number of cases where such cryptic non-coding transcription (although usually not the transcript itself) actually plays an active role in proper gene regulation (Martens et al. 2004; Hongay et al. 2006; Berretta et al. 2008; Houseley et al. 2008; Camblong et al. 2009). Overlapping transcription can result in overlaps between the transcription-deposited modifications, raising the possibility of complex networks where regulation of an internal transcript positively or negatively affects expression of its host or neighboring genes.
Another likely function of the cotranscriptional histone modifications is to affect transcription elongation rates. It is becoming more appreciated that elongation is a critical point of gene regulation. Not only do many genes appear to be limited by early elongation events that pause or terminate RNApII before a full transcript is made, but elongation kinetics also appear to strongly affect cotranscriptional mRNA splicing events. All of these elongation effects are clearly influenced by cotranscriptional histone modifications (Buratowski 2009; Schwartz and Ast 2010). Strikingly, many of the proteins involved in cotranscriptional histone methylation and acetylation have been linked to particular cancers and other diseases, making it that much more important to understand their molecular functions.
We are extremely grateful to our lab members and collaborators who carried out the work described here. T. Kim is a Special Fellow of the Leukemia and Lymphoma Society. This work was supported by NIH award GM46498 to S.B.