The C-terminal domain of the RNA polymerase II largest subunit undergoes dynamic phosphorylation during transcription, and the different phosphorylation patterns that predominate at each stage of transcription recruit the appropriate set of mRNA processing and histone modifying factors. Recent papers help explain how the changes in CTD phosphorylation pattern are linked to the progression from initiation through elongation to termination.
Although all three eukaryotic RNA polymerases are very similar in structure and subunit configuration, RNA polymerase II (RNApII) uniquely possesses an extra C-terminal domain (CTD) on its largest subunit, Rpb1. Many functions have been proposed for this domain, including interactions with nucleic acids and displacement of nucleosomes. However, the preponderance of evidence indicates that the CTD primarily functions as a binding platform for other proteins involved in transcription, mRNA processing, and histone modifications. The CTD consists of multiple repeats of the heptamer sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7, and as expected from this sequence, this region is highly phosphorylated in transcribing RNApII. Serines 2 and 5 were identified as major phosphorylation sites, and multiple functions for these modifications have been elucidated. More recently, phosphorylation of serine 7 and other covalent modifications have been described. Different phosphorylation states predominate at each stage of transcription, and each preferentially binds a distinct set of factors. These dynamic interactions provide a means for coupling and coordinating specific stages of transcription with other events necessary for proper gene expression.
During preinitiation complex (PIC) assembly, the Mediator co-activator complex bridges upstream activators and RNApII. Mediator binds unphosphorylated polymerase, but when incorporated into the PIC it strongly stimulates the CTD kinase of basal transcription factor TFIIH. This phosphorylation disrupts Mediator binding (Max et al., 2007). Therefore, once Mediator has completed its mission of helping deliver polymerase to the promoter, it triggers its own release from the CTD (Fig 1A). While polymerase proceeds on to elongation, Mediator may remain associated with the promoter as part of the Scaffold complex to facilitate subsequent rounds of polymerase recruitment and reinitiation (Yudkovsky et al., 2000).
TFIIH phosphorylates the CTD on Ser5. Chromatin immunoprecipitation (ChIP) experiments indicate that Ser5P levels remain high as RNApII transcribes the first few hundred nucleotides of genes, but decline further downstream (Komarnitsky et al., 2000). Several events are physically or functionally linked to this modification. The mRNA capping enzyme binds directly to CTD Ser5P residues (Fabrega et al., 2003). Because the CTD lies close to the mRNA exit channel, this physical tethering positions the capping machinery for rapid modification of the mRNA 5′ end as it emerges from the polymerase. Binding of the CTD can also allosterically affect guanylyltransferase activity to further enhance coupling. ChIP experiments suggest that the capping enzyme triphosphatase-guanylyltransferase complex does not remain associated with the elongation complex. In contrast, the cap methyltransferase crosslinks to transcribed regions and may contribute to the processivity of the elongation complex (Komarnitsky et al., 2000; Schroeder et al., 2000; Schroeder et al., 2004). Why the guanylylation and methylation components of the capping machinery respond differently to the changing CTD pattern remains unclear.
Like mRNA processing enzymes, several histone-modifying enzymes make use of the changing CTD pattern to distinguish promoter-proximal and -distal regions (reviewed in Hampsey and Reinberg, 2003). Active genes are marked by a stereotypical set of histone methylations: H3K4 is trimethylated near promoters while H3K36 methylation marks downstream transcribed regions. Targeting of the yeast H3K4 methyltransferase (Set1) complex to 5′ transcribed regions requires CTD Ser5P (Ng et al., 2003). This is also presumed to be true for at least some of the homologous mammalian MLL1 and Set1 complexes, although other factors such as transcription activators may also contribute to their localized recruitment or regulation of methylation activity.
Surprisingly, recent reports show that TFIIH can also phosphorylate CTD Ser7 in vitro (Akhtar et al., 2009; Glover-Cutter et al., 2009; Kim et al., 2009). The ChIP signals for Ser5P and Ser7P are biased towards 5′ ends and in vivo inhibition of the yeast TFIIH kinase results in the loss of both marks. Ser7 is the most degenerate position in the CTD heptamer, appearing in only half of the human repeats and even fewer (7/44) in Drosophila. It has been suggested that divergence at position 7 provides a means for repeats to acquire specialized functions. However, a CTD with only consensus repeats functions normally in yeast (West and Corden, 1995) or mammalian cells (Chapman et al., 2005). Conversely, a heterologous CTD from red algae, which has no Ser7 or Thr4, can support viability in yeast (Stiller and Cook, 2004). So far, the one function assigned to Ser7P is in recruiting the Integrator complex for mammalian snRNA 3′ processing (Egloff et al., 2007). Transcripts in this class are short and not polyadenylated (see below for discussion of termination and 3′ end processing). If there are other functions for Ser7P, they may involve events that occur early during transcription.
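The notion that position 7 is the most degenerate position of the heptamer can be made concrete with a small tally of how often each position matches the consensus. The following is only a minimal sketch: the function name and the toy sequence are hypothetical, the toy sequence is invented for illustration and is not a real CTD, and the code assumes the repeats are already aligned in register, which real CTDs do not always satisfy.

```python
CONSENSUS = "YSPTSPS"  # Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7


def position_conservation(ctd_sequence: str) -> list[float]:
    """Fraction of heptads that match the consensus at each of the 7 positions.

    Assumes the sequence is a whole number of heptads already in register.
    """
    if len(ctd_sequence) == 0 or len(ctd_sequence) % 7 != 0:
        raise ValueError("sequence must be a non-empty whole number of heptads")
    heptads = [ctd_sequence[i:i + 7] for i in range(0, len(ctd_sequence), 7)]
    return [
        sum(heptad[pos] == CONSENSUS[pos] for heptad in heptads) / len(heptads)
        for pos in range(7)
    ]


# Hypothetical four-repeat sequence: two consensus repeats and two repeats
# that diverge only at position 7 (invented for illustration, not real data).
toy_ctd = "YSPTSPS" + "YSPTSPK" + "YSPTSPN" + "YSPTSPS"
conservation = position_conservation(toy_ctd)
for pos, frac in enumerate(conservation, start=1):
    print(f"position {pos}: {frac:.2f} consensus")
```

Applied to a real, properly aligned CTD sequence, the same tally would be expected to show the lowest conservation at position 7 if the degeneracy described above holds.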
As polymerase elongates further downstream, levels of Ser5P drop and Ser2P levels increase. It is important to note that lower levels of Ser5P persist throughout elongation, so at least some CTD repeats in transcribing RNApII are likely to be phosphorylated at both Ser2 and Ser5 (and possibly also Ser7). The drop in Ser5P is not dependent upon Ser2 phosphorylation, since Ser5P does not increase in cells in which Ser2 phosphorylation is blocked (Cho et al., 2001; Liu et al., 2009; Qiu et al., 2009; Zhou et al., 2009). Several recent discoveries help clarify the events mediating the CTD transition as RNApII moves into later elongation.
Mosley et al. (2009) recently identified the long-postulated Ser5P phosphatase, an evolutionarily conserved RNApII-binding protein called Rtr1. Deletion of Rtr1 leads to persistently high Ser5P levels during elongation. Interestingly, Rtr1 is not essential for viability in yeast, although this may be due to the presence of a related and genetically redundant protein (Rtr2). Ssu72 is an essential transcription termination factor that also has Ser5P phosphatase activity (Krishnamurthy et al., 2004), but Ssu72 mutants have not been found to affect the promoter-proximal Ser5P drop. One possibility is that Rtr1 reduces Ser5P in early elongation phase, while Ssu72 removes the remaining Ser5P further downstream during termination. It is also worth noting that another Ser5P phosphatase called Scp1 has been identified in metazoans as a repressor of gene expression in neurons (Zhang et al., 2006). Perhaps this enzyme dephosphorylates the CTD at a very early step before initiation or capping can occur.
Although the correlation between Ser2P and elongation was discovered a number of years ago (Komarnitsky et al., 2000), recent discoveries have uncovered interesting complexities in the responsible kinases. In mammalian cells, the Cdk9 kinase subunit of the positive elongation factor P-TEFb phosphorylates CTD Ser2 as well as the elongation factor Spt4/Spt5 (also known as DSIF) (Peterlin and Price, 2006). Reminiscent of the RNApII CTD, Spt5 contains multiple repeats of a short sequence containing the phosphorylation site. However, the Spt5 repeat differs from that of the RNApII CTD, so it will be interesting to see how the same kinase active site recognizes both substrates.
Yeast have two kinases that resemble mammalian Cdk9. It has been proposed that Cdk9 function is split in S. cerevisiae, with Ctk1 responsible for CTD Ser2 phosphorylation (Cho et al., 2001) and Bur1 phosphorylating Spt4/5 (Zhou et al., 2009). However, recent papers indicate the story is not so simple (Liu et al., 2009; Qiu et al., 2009). These new studies show that, although Ctk1 provides the bulk of Ser2 phosphorylation, Bur1 also contributes to Ser2P just downstream of the promoter, and this early phosphorylation helps promote more extensive phosphorylation by Ctk1. Therefore, much like mammalian Cdk9, Bur1 phosphorylates both the CTD and Spt5 to contribute to effective transcription. Similarly, the S. pombe Bur1 homolog (SpCdk9) can phosphorylate both CTD and Spt5, while the Ctk1 homolog (Lsk1) contributes the majority of Ser2 phosphorylation (Viladevall et al., 2009). Given that Bur1/SpCdk9 resembles metazoan Cdk9 in targeting both Spt5 and CTD Ser2 during early elongation, it is interesting to consider whether higher eukaryotes may have a second kinase that functions like Ctk1/Lsk1 to more extensively phosphorylate Ser2 further downstream.
What triggers the progression of kinases during the transition from initiation to elongation? Several plausible mechanisms have been described by which Ser5P would help recruit the Ser2 kinases. In S. cerevisiae, Bur1 directly binds to the Ser5P CTD, providing a very simple mechanism for having Ser2P come after Ser5P (Qiu et al., 2009). In S. pombe, Cdk9 is complexed with the mRNA cap methyltransferase. Since capping enzymes bind the Ser5P CTD, and methylation is the final step of capping, this coupling makes Ser2 phosphorylation dependent upon earlier events (Viladevall et al., 2009). In turn, Ctk1 helps release RNApII from the basal initiation factors, including TFIIH (Ahn et al., 2009). Given that CTD Ser2 and Spt5 phosphorylation are critical for later events (see below), these interactions may constitute a “5′ checkpoint” that makes the activity of the Ser2/Spt5 kinases contingent upon Ser5P and mRNA capping (Fig 1B).
Levels of CTD Ser2P increase gradually as RNApII moves away from the promoter. Several mechanisms appear to contribute to this gradient. First, there appears to be ongoing dephosphorylation of Ser2P during elongation by the Fcp1 phosphatase, particularly in 5′ regions where the basal factor TFIIF may stimulate Fcp1 activity (Cho et al., 2001). Second, the kinetics of CTD phosphorylation are not uniform and may be slow compared to transcript elongation. In vitro experiments suggest that early phosphorylation events can “prime” the CTD for more efficient subsequent modification. Both SpCdk9 (Viladevall et al., 2009) and S. cerevisiae Ctk1 (Jones et al., 2004) are much more active on CTD substrates that already have some phosphorylation. Several possible mechanisms for stimulation can be envisioned. There could be a non-catalytic phosphoserine binding site on the kinase that tethers the enzyme to the substrate for efficient CTD phosphorylation (i.e. processivity). An even simpler model is that the early phosphates cause the CTD to adopt an extended conformation that makes it a more accessible substrate. In either case, Ser2 phosphorylation may accelerate while the polymerase is elongating, causing Ser2P to peak further downstream.
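The priming argument above can be illustrated with a toy simulation in which the kinase becomes more active as the fraction of phosphorylated repeats rises, while a phosphatase removes Ser2P at a constant per-site rate. This is a sketch under stated assumptions, not a fitted model: the function name, the rate law, and every parameter value (elongation speed, basal kinase rate, priming factor, dephosphorylation rate) are hypothetical, and the position-dependent stimulation of Fcp1 by TFIIF is not modeled.

```python
def ser2p_profile(gene_length_nt=3000, speed_nt_per_s=25.0, k_basal=0.01,
                  priming=10.0, k_dephos=0.005, dt=1.0):
    """Return (position_nt, fraction_Ser2P) pairs for one elongating polymerase.

    All parameter values are hypothetical. The kinase rate per unphosphorylated
    site is k_basal * (1 + priming * fraction), so existing phosphates speed up
    further phosphorylation; k_dephos is a constant phosphatase rate.
    """
    fraction = 0.0   # fraction of Ser2 sites carrying a phosphate
    position = 0.0   # distance from the transcription start site
    profile = [(position, fraction)]
    while position < gene_length_nt:
        k_on = k_basal * (1.0 + priming * fraction)
        rate = k_on * (1.0 - fraction) - k_dephos * fraction
        fraction += rate * dt              # forward Euler step in time
        position += speed_nt_per_s * dt    # polymerase keeps moving meanwhile
        profile.append((position, fraction))
    return profile


primed = ser2p_profile()
unprimed = ser2p_profile(priming=0.0)
for (pos, f_primed), (_, f_unprimed) in list(zip(primed, unprimed))[::20]:
    print(f"{pos:6.0f} nt  primed={f_primed:.2f}  unprimed={f_unprimed:.2f}")
```

In this sketch the primed profile rises slowly at first and then accelerates, so the Ser2P level climbs with distance from the promoter simply because phosphorylation takes time relative to elongation. Setting `priming=0.0` removes the acceleration and gives a lower level at the end of the gene for the same basal rate.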
Phosphorylation at CTD Ser2 appears to mark RNApII molecules that are competent for the long-range elongation necessary to produce most mRNAs. Indeed, at many non-expressed genes, RNApII is seen only at the promoter and carries Ser5P but not Ser2P (reviewed in Margaritis and Holstege, 2008). How does the CTD transition affect elongation? Phosphorylation of the CTD does not directly affect elongation rate, but instead mediates interactions between the polymerase and other factors. Therefore, the Ser5P to Ser2P transition could either promote the association and activity of positive elongation factors, or inhibit pathways that cause RNApII to pause or terminate early in elongation. In fact, both types of mechanisms appear to operate.
It is increasingly clear that, as in prokaryotes, initiation of eukaryotic transcription does not necessarily result in production of a full-length transcript. In yeast, there is an early transcription termination pathway that functions at snoRNAs and cryptic unstable transcripts (CUTs) (Steinmetz et al., 2006). For both of these classes of transcripts, termination occurs within the first few hundred nucleotides of elongation. Furthermore, snoRNA terminator sequences do not function efficiently when placed further downstream (Gudipati et al., 2008). The Sen1/Nrd1/Nab3 termination complex is targeted to 5′ ends through a combination of sequence-specific RNA binding and the association of Nrd1 with CTD Ser5P (Vasiljeva et al., 2008). The physical association of the Sen1 complex with the Exosome complex links transcription termination to a 3′ exonuclease activity that can trim snoRNA ends or completely degrade cryptic transcripts. Phosphorylation of Ser2 suppresses use of the Sen1/Nrd1/Nab3 termination pathway downstream (Gudipati et al., 2008), providing one mechanism by which Ser2 phosphorylation could enhance downstream elongation in yeast.
It remains to be proven whether higher eukaryotes have a similar early termination pathway, but available data suggest the existence of one or more such mechanisms. Higher eukaryotes do have a Sen1 homolog, although it has not yet been implicated in termination. Like yeast snoRNAs, mammalian snRNAs are short and non-polyadenylated and their terminator sequences only function efficiently within the first few hundred nucleotides of the initiation site (Ramamurthy et al., 1996). ChIP studies in mammalian and Drosophila cells show that a significant percentage of mRNA genes have a large peak of RNApII at the promoter, but much lower levels further downstream (reviewed in Margaritis and Holstege, 2008). Although these peaks are generally interpreted as “paused” or “poised” polymerases, the same pattern would be expected if a significant proportion of RNApII terminated early in elongation. Recent transcript sequencing analyses in mammalian, Drosophila, and yeast cells have uncovered the widespread existence of short, promoter-associated transcripts that are reminiscent of CUTs (see Carninci, 2009 for review). The fact that at least some of these short transcripts are rapidly degraded by the Exosome (Preker et al., 2008) indicates they are released from the transcription elongation complex, which is to say they are termination products rather than precursors to full-length transcripts.
In vitro experiments led to the discovery of the negative elongation factor NELF, which appears to inhibit the release of RNApII into downstream elongation and is therefore often described as a 5′ pausing factor (Peterlin and Price, 2006). Remarkably, NELF has also been shown to be required for proper termination of both snRNAs (Egloff et al., 2009) and cell-cycle regulated histone mRNAs (Narita et al., 2007; Wagner et al., 2007), two classes of transcripts that are relatively short and not polyadenylated in metazoans. Therefore, NELF may also function in one or more early termination pathways in higher eukaryotes. Importantly, the inhibitory effect of NELF is alleviated when P-TEFb phosphorylates Spt5 and CTD Ser2, underscoring the critical nature of the kinase progression in gene expression.
Given the existence of an early termination pathway, one obvious question is why RNApII doesn't always terminate within the first several hundred nucleotides of transcription. Instead of assuming each gene exclusively uses either an early or late termination pathway, a more useful and plausible model is that each gene has a certain probability of using either pathway. The ratio may be determined by a kinetic competition between the early termination pathway and the phosphorylation events that trigger more efficient elongation and downstream termination. Indeed, the yeast Nrd1/Nab3/Sen1 complex crosslinks to 5′ ends of mRNA as well as snoRNA genes, presumably via the Nrd1-Ser5P interaction (Kim et al., 2006; Vasiljeva et al., 2008). Yeast snoRNAs are strongly biased to the early Sen1 pathway because they contain multiple binding sites for Nrd1 and Nab3, both of which are sequence-specific RNA binding proteins. Particular sequences, RNA structures, or chromatin configurations might also slow RNApII during early elongation, increasing the amount of time available for early termination factors to “catch” RNApII.
In this kinetic competition model for termination, early removal of negative elongation factors or recruitment of positive elongation factors will increase the probability of moving quickly past the high Ser5P region, biasing the gene towards elongation and the downstream termination pathway (see below). Phosphorylation of the Spt5/DSIF elongation factor by Cdk9/Bur1 is likely to be one such event. Interestingly, Spt5 phosphorylation is important for recruitment of the PAF complex, which in turn promotes both co-transcriptional histone methylation and mRNA 3′ end processing (Liu et al., 2009, and references therein). Phosphorylation of CTD Ser2 also helps recruit Spt6, a factor that helps polymerase elongate through nucleosomes (Yoh et al., 2007). Gene-specific mechanisms to recruit Cdk9/P-TEFb to 5′ ends of genes, such as the Tat/TAR system of the human immunodeficiency virus or recruitment by transcription activators, would also be expected to increase the percentage of polymerases that pass through to full elongation (discussed in Peterlin and Price, 2006).
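The kinetic competition described above can be sketched as a race between two first-order processes: capture by the early termination machinery and escape into productive elongation. For such a race, the probability that one process wins is its rate divided by the sum of both rates. The function name and all rate values below are hypothetical and chosen only to illustrate the direction of each effect; they are not measurements.

```python
def early_termination_probability(k_term: float, k_escape: float) -> float:
    """Probability that early termination wins a race against escape.

    k_term:   hypothetical rate of capture by early termination factors
    k_escape: hypothetical rate of escape into productive elongation
    Both are treated as independent first-order processes.
    """
    if k_term < 0 or k_escape < 0:
        raise ValueError("rates must be non-negative")
    total = k_term + k_escape
    if total == 0:
        raise ValueError("at least one rate must be positive")
    return k_term / total


# Illustrative scenarios with made-up rates (arbitrary units).
scenarios = {
    "many Nrd1/Nab3 sites (snoRNA-like)": (3.0, 1.0),
    "few binding sites (mRNA-like)": (0.25, 1.0),
    "same mRNA-like gene, faster escape": (0.25, 4.0),
}
for label, (k_term, k_escape) in scenarios.items():
    p_early = early_termination_probability(k_term, k_escape)
    print(f"{label}: P(early termination) = {p_early:.2f}")
```

In this framing, additional Nrd1 and Nab3 binding sites or a slower polymerase raise `k_term` relative to `k_escape`, whereas Spt5 and Ser2 phosphorylation or activator-mediated recruitment of Cdk9/P-TEFb raise `k_escape`, shifting the same gene toward full-length transcription without requiring an all-or-none choice of pathway.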
Once RNApII is past the early decision point described above, the increasing concentration of CTD Ser2P mediates several new interactions (Fig 1C). The H3K36 methyltransferase Set2 binds to CTD doubly phosphorylated at both Ser5 and Ser2, and so the H3K36 methylation pattern strongly resembles that of CTD Ser2P (see Hampsey and Reinberg, 2003). Once the H3K4/H3K36 methylation pattern is established on an active gene, these modifications then function to recruit other complexes that affect histone acetylation and chromatin remodeling. In this manner, positional information first encoded in the CTD phosphorylation pattern is propagated and at least temporarily recorded in the overlying chromatin. These modifications are established by transcription, but may then influence subsequent rounds of transcription.
Polyadenylation and transcription termination of mRNA encoding genes are generally tightly linked. Two proteins that preferentially bind CTD Ser2P are involved in this coupling. Pcf11 is an essential polyadenylation factor that preferentially binds the Ser2P CTD, where it may help tether the polyA machinery to elongating RNApII so it can scan for emergence of the appropriate RNA sequences (Licatalosi et al., 2002; Meinhart and Cramer, 2004). Upon cleavage of the nascent mRNA at the polyA site, the downstream RNA is rapidly degraded by the Rat1/Xrn2 5′ to 3′ exonuclease, leading to transcription termination by the “torpedo” model (Kim et al., 2004; West et al., 2004). The termination complex contains a CTD Ser2P binding protein called Rtt103 that, while not absolutely required for Rat1 recruitment, may help target the Rat1 complex to RNApII that has higher levels of Ser2P (Kim et al., 2004). Some factors (including Pcf11) seem to be involved in both the early and late termination pathways, so the choice is not simply a matter of which CTD phosphorylations or RNA sequences are present. A better understanding of the 3′ processing/termination mechanisms will be needed before the decision-making process is completely clear.
There is still much left to discover about how the CTD works to couple transcription with other processes. There may be additional factors that specifically recognize other phosphorylated residues (Ser7, Thr4, and Tyr1), and of course all the hydroxyl groups could be substrates for other covalent modifications such as glycosylation. There is strong circumstantial evidence that isomerization of the CTD prolines is also relevant, since the prolyl isomerase Ess1/Pin1 recognizes phosphorylated Ser-Pro pairs and has been implicated in 3′ end formation (Singh et al., 2009). Although this review has concentrated on how modifications of the CTD are used to recruit various RNA processing and chromatin modifiers, it appears that the communication can be multidirectional. Recent studies (reviewed in Zhong et al., 2009) find that certain proteins binding to the nascent RNA transcript can also affect the elongation rate of RNApII. Chromatin modifications and mRNA processing events may also be linked, as illustrated by the recent observation that regions of chromatin encoding exons often have higher levels of H3K36 methylation than the adjacent introns (Kolasinska-Zwierz et al., 2009). One can imagine direct interactions between histones and mRNA processing factors, or more likely chromatin and RNA processing can affect the rate of transcription elongation and thereby affect the kinetics of the other linked processes (see Sims and Reinberg, 2009 for review). Finally, the events that occur after release of polymerase from the template remain murky, but it is likely that interactions of factors with the CTD will be involved in recycling of RNApII (perhaps through the postulated looping of 5′ and 3′ ends) as well as transport to other locations within the nucleus (such as the various RNApII-containing speckles and granules that have been described). 
Although the CTD appears to be a simple seven amino acid repeat, it's clear that its various modification patterns and interactions lead to multiple complex and interesting biological functions.
I deeply appreciate helpful discussions on this topic with many colleagues in the field, and I apologize for not being able to cite all relevant papers due to space restrictions.