In this report we analyzed the genome-wide localization of RNA pol II CTD phosphorylation and of the CTD-associated termination factors Nrd1 and Pcf11. Our results provide strong confirmation that a CTD phosphorylation "code" which "specifies the position of pol II in the transcription cycle" 2 applies generally: on a genome-wide scale, polymerases early in the cycle have a pattern of CTD phosphorylation distinct from that of polymerases late in the cycle. We found that CTD S5 and S2 phosphorylation at 5’ ends is relatively “hard-wired” to produce a delay of about 450 bases between the onset of these two modifications at most mRNA coding genes. Regardless of gene length, the S2:S5 phosphorylation ratio increases within the first 500 bases of the transcription unit, then remains fairly constant within the body of long genes, and peaks in the 3’ flanking region prior to termination. As a result, the dual gradient model of increasing S2-PO4 and decreasing S5-PO4 across the entire gene applies well to short genes, but not to long ones. Because S2- and S5-PO4 dynamics are not scaled to compensate for differences in gene length, the ratio of S2-PO4:S5-PO4 is not an unambiguous measure of pol II position within a gene 5.
S5-PO4 and S7-PO4 are regulated quite differently, although both are appended by the same kinase, Kin28. Whereas S5-PO4 is remarkably uniform in its profile across genes, high at 5’ ends and low at 3’ ends, S7-PO4 profiles vary in a gene-specific way, with discrete peaks at the 5’ end, the 3’ end or both ends. Intriguingly, S7-PO4 is specifically enriched over introns, suggesting a potential role in co-transcriptional assembly or disassembly of splicing complexes.
The CTD kinases Kin28 and Ctk1 have quite profound effects on pol II elongation that previously went unnoticed because they are not manifested on all genes. Inhibition of Kin28 impedes pol II progress through long genes, resulting in accumulation of pol II at 5’ ends and depletion further downstream (Supp. Fig. 2B–E). This defect could result from diminished recruitment of the cap methyltransferase Abd1, which can enhance elongation 36, or from premature termination of uncapped transcripts 16. It is also possible that an abnormally high ratio of S2-PO4 to S5/7-PO4 after inactivating Kin28 could impair elongation, particularly on long genes. Consistent with this idea, a high S2-PO4:S5-PO4 ratio is characteristic of terminating pol II. Short genes might be resistant to the effect of Kin28 inhibition because on average they have lower relative levels of S2-PO4. In summary, these results reveal a new function for Kin28 in enhancing polymerase progress through long genes, in addition to its previously established role in co-transcriptional capping.
In contrast, Ctk1 depletion caused pol II to pile up at the 3’ ends of over 500 genes with strong consensus poly(A) sites (Supp. Fig. 2I). At this subset of genes, low S2-PO4 may impair disassembly of 3’ end processing complexes on strong poly(A) sites, thereby inducing transcriptional pausing. Previously the Glc7 phosphatase 37, the export factor Mex67 38 and the Tho/Sub2 complex 39 have been implicated in disassembly of 3’ end processing complexes. We speculate that the function of this pause could be to compensate for slower RNP maturation when S2 CTD phosphorylation is low.
Nrd1 is not confined to the 5’ ends of protein-coding genes by a strict association with CTD S5-PO4, as previously thought 9. Instead we found a strong correlation between Nrd1 and CTD S7-PO4 profiles on ncRNA and protein-coding genes. We propose that, in addition to the binding of Nrd1 to S5-PO4 heptads 9 and to specific sequences in the nascent RNA 40, S7-PO4 plays an important role in directing recruitment of Nrd1. It is also possible that, conversely, Nrd1 binding enhances S7 phosphorylation.
Nrd1 recruitment is a characteristic feature of non-coding CUTs and SUTs, including those that overlap protein-coding genes (Supp. Fig. 3B–J). Such overlapping transcription might serve to localize Nrd1 on these genes for the purpose of repressing their expression, possibly through interactions with the nuclear exosome and chromatin silencing factors 41,42. Consistent with this idea, Nrd1 was found at PHO84, GAL1-10 and NDT80, where antisense transcription is implicated in establishing repressed chromatin states (Supp. Fig. 3I, J).
Nrd1 also localized to intronic snoRNAs, where we observed Rat1 recruitment and evidence of premature transcription termination (Supp. Figs. 4C, D, 5A–C). Co-transcriptional cleavage by Rnt1 43 has been implicated in processing of intronic snoRNAs and might promote premature termination by providing an entry point for the Rat1 exonuclease 15,32. Whether Nrd1 contributes to processing of intronic snoRNAs and/or premature termination remains to be investigated. Together these results suggest that under some conditions, perhaps when splicing is slow, early termination, possibly involving Nrd1 and Rat1, may produce a transcript that serves as a snoRNA precursor, but not as an mRNA precursor.
The cleavage/polyadenylation factor Pcf11 binds S2-PO4 CTD and is enriched at 3’ ends, but its distribution does not precisely mimic that of S2-PO4 (Supp. Fig. 6E). Unlike S2-PO4, Pcf11 is present throughout the length of most genes and often shows a discrete peak at the poly(A) site. In addition to the S2-PO4 mark on the CTD, poly(A) consensus elements in the nascent transcript are therefore likely to enhance Pcf11 binding to the TEC. Frequently, we observed coincident peaks of Pcf11 and Nrd1 at poly(A) sites (Supp. Fig. 6A–D) and ncRNAs (Supp. Fig. 6F–H). In addition, Pcf11 inactivation appeared to impair termination downstream of regulatory ncRNAs including those that overlap the 5’ ends of SER3. The widespread overlap of Nrd1 and Pcf11 suggests that these two factors cooperate or back one another up to achieve pol II termination at many yeast genes under normal conditions.
Pcf11 accumulated quite unexpectedly at pol I (Supp. Fig. 1B) and pol III transcribed genes as well as centromeres (Supp. Fig. 8), presumably by pol II-independent recruitment mechanisms. It is not known whether Pcf11 at these locations is part of the CF1A cleavage/polyadenylation complex or in another form. The function of Pcf11 at centromeres is a mystery. In contrast to centromeres, tRNA genes showed high levels of Rat1 co-localized with Pcf11. Pol II and associated factors have previously been detected at tRNA genes in yeast and mammalian cells 44,45. There are several possible functions for Pcf11 and Rat1 at tRNAs, including termination of convergently transcribed pol II genes (Supp. Fig. 8F–H), degradation of improperly modified tRNAs 46 and establishment of chromatin boundaries. How Pcf11 functions at telomeres is unclear, but together with Rat1, it might influence telomere length by regulating termination and 3’ end processing of TERRA transcripts 35. In summary, Pcf11 was previously known only as a cleavage/polyadenylation factor, but our genome-wide localization of this factor at centromeres, and at pol I and pol III transcribed genes, shows that this view must be revised.
The most salient result to emerge from genome-wide analysis of CTD-PO4 is that the CTD "code" is inscribed differently on different genes. Distinct patterns of phosphorylation distinguish pol II on the different classes of pol II transcription unit for mRNAs, snoRNAs, CUTs and SUTs. Furthermore, there are significant variations in the CTD "code" among protein-coding genes. Highly expressed protein-coding genes and genes with promoters occupied by nucleosomes (OPN) have higher S5-PO4, and lower S7-PO4, relative to total pol II, than poorly expressed genes and those with nucleosome-depleted promoters (DPN). Although much remains to be learned about exactly what the phosphorylation symbols in the CTD "code" mean, the results reported here establish that CTD phosphorylation is differentially regulated between genes. Furthermore, they suggest the intriguing hypothesis that promoters can dictate how the CTD "code" will be written.