The carboxyl-terminal domain (CTD) of the largest subunit of RNA polymerase II (pol II) comprises multiple tandem repeats of the heptapeptide Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7. This unusual structure serves as a platform for the binding of factors required for expression of pol II-transcribed genes, including the small nuclear RNA (snRNA) gene-specific Integrator complex. The pol II CTD specifically mediates recruitment of Integrator to the promoter of snRNA genes to activate transcription and direct 3′ end processing of the transcripts. Phosphorylation of the CTD and a serine in position 7 are necessary for Integrator recruitment. Here, we have further investigated the requirement of the serines in the CTD heptapeptide and their phosphorylation for Integrator binding. We show that both Ser2 and Ser7 of the CTD are required and that phosphorylation of these residues is necessary and sufficient for efficient binding. Using synthetic phosphopeptides, we have determined the pattern of the minimal Ser2/Ser7 double phosphorylation mark required for Integrator to interact with the CTD. This novel double phosphorylation mark is a new addition to the functional repertoire of the CTD code and may be a specific signal for snRNA gene expression.
The CTD of the largest subunit of pol II is an evolutionarily conserved structure, specific to eukaryotic organisms, which plays a fundamental role in transcription and RNA processing (1). It comprises multiple tandemly repeated heptapeptides with the consensus sequence Tyr1-Ser2-Pro3-Thr4-Ser5-Pro6-Ser7, with 52 repeats in the mammalian protein (2). The CTD serves as a scaffold for the interaction of a wide range of nuclear factors involved in various processes linked to transcription. The ability of the CTD to interact specifically with appropriate proteins during the transcription cycle in vivo is determined largely by the phosphorylation status of the three serine residues within the consensus heptapeptide (Ser2/Ser5/Ser7) (3). Different patterns of CTD serine phosphorylation correlate with the position of pol II along transcribed protein-coding genes and allow the recruitment of the appropriate factors at different stages of the transcription cycle. Differential modification thus enables a wide range of signaling combinations to be read as a code (3–5). Phosphorylation of Ser2 and phosphorylation of Ser5 are the best studied CTD modifications. Ser5 phosphorylation is directed by the CDK7 kinase as part of the TFIIH complex. Ser5 phosphorylation occurs early in the transcription cycle, is highest near the promoter, and activates capping of mRNAs. Ser2 phosphorylation, directed by the CDK9 kinase subunit of the positive transcription elongation factor b (P-TEFb) complex, occurs later in the transcription cycle and is generally highest toward the 3′ end of the genes. This modification plays roles in splicing and 3′ end processing of transcripts. There is some evidence that CTD heptapeptides can bear both Ser2 and Ser5 marks, as factors implicated in transcription, such as Set2, preferentially recognize a combination of both in vitro (3, 6). Some Ser2 phosphorylation must therefore occur before the removal of the phosphate on Ser5. 
Recently, it has been shown that Ser7 of the heptapeptide repeat is also phosphorylated during transcription (7–10), further expanding the number of possible phosphorylation combinations on the CTD. The CDK7 component of the general transcription factor TFIIH has been implicated in Ser7 phosphorylation both in vivo and in vitro (7, 10, 11). The CDK9 component of P-TEFb has also been shown to have Ser7 kinase activity (10). Ser7 phosphorylation has been found on both protein-coding genes and the pol II-transcribed snRNA genes (7–11). Human snRNA genes transcribed by pol II are structurally different from protein-coding genes (12, 13). snRNAs are neither spliced nor polyadenylated, and instead of a polyadenylation signal, the genes contain a conserved 3′ box RNA-processing element downstream of the snRNA-encoding region (14). However, these two different gene types share a requirement for the pol II CTD for efficient transcription and RNA processing (15–19). The demonstration that the snRNA gene-specific Integrator complex, which is required for 3′ box recognition, binds to the pol II CTD (20) provided a molecular link between transcription and 3′ processing of snRNA gene transcripts. More recently, we have shown that the serine in position 7 of the CTD heptapeptide plays a pivotal role in expression of snRNA genes (9). Mutation of Ser7 to alanine abolishes the association of the snRNA gene-specific Integrator complex with snRNA promoters and drastically affects transcription of snRNA genes and 3′ end processing of transcripts. In addition, we have shown that CTD phosphorylation is critical for efficient binding of the Integrator complex to the CTD (9). However, the precise mark on the CTD required for binding of the Integrator complex was not fully defined. Here, we precisely identify the CTD mark required for Integrator binding. We show that phosphorylation of both Ser2 and Ser7 is necessary for efficient binding of Integrator to the CTD. 
Interestingly, efficient binding requires two heptapeptide repeats and occurs when Ser7 in the first repeat and Ser2 in the second repeat are phosphorylated, providing a clear demonstration that recognition by some CTD-binding proteins spans more than a single heptapeptide repeat. This doubly phosphorylated mark may represent a novel gene type-specific CTD mark as Integrator complex recruitment is restricted to snRNA genes (20). In addition, we demonstrate that Ser7 kinase activity in HeLa cell nuclear extracts is largely attributable to DNA-PK and that recombinant DNA-PK specifically phosphorylates Ser7 of GST-CTD fusion proteins with 25 heptapeptide repeats in vitro.
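The expansion of the CTD code by Ser7 phosphorylation can be made concrete with a small enumeration. The sketch below is an illustration only, under simplifying assumptions that are not part of this study: it treats each of the three serines in a consensus heptad as simply phosphorylated or not, and it ignores Tyr1, Thr4, and non-consensus repeats. The function name `phospho_states` is hypothetical.

```python
from itertools import product

# Assumption: only the three serines of the consensus heptad are considered,
# each as a simple on/off phosphorylation site.
SERINES = ("Ser2", "Ser5", "Ser7")


def phospho_states(n_repeats):
    """Yield every on/off serine phosphorylation pattern over n_repeats heptads.

    Each pattern is a frozenset of (repeat_number, serine_name) pairs that
    carry a phosphate; repeat numbers start at 1.
    """
    sites = [(r, s) for r in range(1, n_repeats + 1) for s in SERINES]
    for flags in product((False, True), repeat=len(sites)):
        yield frozenset(site for site, on in zip(sites, flags) if on)


one_repeat = list(phospho_states(1))
two_repeats = list(phospho_states(2))

# 3 sites give 2**3 = 8 patterns; 6 sites give 2**6 = 64 patterns.
print(len(one_repeat), len(two_repeats))

# The Ser7 (first repeat) / Ser2 (second repeat) pattern is one of the 64.
double_mark = frozenset({(1, "Ser7"), (2, "Ser2")})
print(double_mark in two_repeats)
```

Under these assumptions, a reader that spans two repeats can in principle distinguish 64 serine phosphorylation patterns rather than 8, which is why a mark spanning adjacent repeats enlarges the code.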
200 μl of HeLa nuclear extract in 20 mM Hepes, pH 7.9, 100 mM KCl, 20% glycerol, 0.5 mM phenylmethylsulfonyl fluoride, 0.5 mM dithiothreitol, 0.2 mM EDTA (Buffer D) (21) was loaded onto a Superose 6 column, and 0.5-ml fractions were collected. Molecular weight protein standards were run through the same column under the same conditions.
0.1 μg of GST-CTD (25 repeats) was incubated with 20 μl of fraction or 0.2–1 μl of HeLa nuclear extract in a 40-μl final volume of 20 mM Hepes-NaOH, pH 7.9, 8 mM MgCl2, 20 mM creatine phosphate, 5 mM β-glycerol phosphate, 50 mM KCl, 10% glycerol, 1 mM phenylmethylsulfonyl fluoride, 1 mM dithiothreitol, 1 mM ATP, 0.4 μl of Sigma phosphatase inhibitor mixture 1 (kinase buffer) for 15 h at 30 °C. For recombinant DNA-PK (Promega) studies, 50 units were incubated with 0.1 mg of GST-CTD (25 repeats) according to the manufacturer's instructions. For inhibitor studies, wortmannin (Invitrogen) was preincubated with the reaction mix for 10 min at 30 °C prior to the addition of ATP and GST-CTD substrate. Activated calf thymus DNA (Sigma) was added to the reaction mix as indicated. Reactions were stopped by boiling in SDS buffer, and phosphorylation was analyzed by Western blot with phospho-CTD-specific antibodies.
GST-CTD pulldowns were performed as described previously (9).
CTD peptides were synthesized (Cambridge Peptides) with one, two, or four repeats of YSPTSPS or YAPTSPA and amino-terminal biotinylation. Ser2/Ser7 phosphorylated peptides were also made. For the Integrator binding assay, 20 μl of streptavidin-coated Sepharose beads was resuspended in 0.1 M KCl HEGN, mixed with 10 μg of biotinylated peptides, and incubated at 4 °C for 1 h. After washing four times with 0.1 M KCl HEGN, 1 ml of the P11 0.5 M KCl fraction (precleared with streptavidin beads for 1 h) was added, and the mixture was incubated at 4 °C for 3 h. After washing the beads five times with 1 ml of 0.3 M KCl HEGN, protein-bound beads were boiled and loaded directly on SDS-PAGE gels.
Western blotting was carried out as described (22) using antibodies against Int11 (23), CTD-Ser(P)2, Ser(P)5, and Ser(P)7 (8), CDK9 (Santa Cruz Biotechnology, SC484), cyclin T1 (Santa Cruz Biotechnology, SC8128), Int7 (Bethyl Laboratories, DKFZP434B168), CDK7 (Santa Cruz Biotechnology, SC529), and DNA-PKcs (Santa Cruz Biotechnology, SC1552).
The Integrator complex comprises 12 subunits totaling ~2 MDa and has been shown to interact with the CTD of pol II in vitro (9, 20). It is associated with U1 and U2 snRNA genes, as shown by chromatin immunoprecipitation, and is required for recognition of the snRNA gene-specific 3′ box (9, 20), which directs 3′ end formation of pre-snRNA (13, 14). The Integrator complex may also play a role in activating transcription of snRNA genes, at a step downstream of pol II recruitment (9). The role of most of the 12 Integrator subunits is unclear, but two subunits, Int9 and Int11, display sequence homology to the CPSF-100 and CPSF-73 subunits of the cleavage and polyadenylation specificity factor, respectively (20, 23). Int11 is therefore thought to carry the endonuclease activity responsible for the first cleavage during the formation of the 3′ end of pre-snRNAs (20). We have previously shown that in vitro binding of Int11 to the CTD requires phosphorylation of the CTD (9). Mutation of Ser7 in the heptapeptide consensus sequence affects the interaction between the CTD and Integrator in vitro and prevents its recruitment to snRNA genes in vivo, suggesting that Ser7 phosphorylation participates in Integrator recruitment to snRNA genes (9). However, the CTD is the site of multiple phosphorylations, with all three serines phosphorylated in vitro (9). To investigate the role of Ser2, Ser5, and Ser7 phosphorylation, we performed glutathione S-transferase (GST) pulldown analysis with GST-CTDs carrying 25 consensus (wild type), S2A, S5A, or S7A heptapeptide repeats (Fig. 1). Detection of the catalytic subunit, Int11, was used as a diagnostic for Integrator binding to the CTD. As expected, Integrator interacts strongly with the consensus repeats after phosphorylation of the CTD. Mutation of either Ser2 or Ser7 to alanine has a drastic effect on Integrator binding, demonstrating that both residues are critical for this interaction. 
In contrast, mutation of Ser5 does not reduce the binding of Integrator but rather enhances it, excluding any positive role of Ser5 and Ser5 phosphorylation in this process. Together, these results demonstrate that Integrator binding to the CTD requires a serine in positions 2 and 7 of the CTD repeat and that phosphorylation of Ser2 and/or Ser7, but not Ser5, participates in this interaction.
The requirement for CTD phosphorylation and serines at positions 2 and 7 can be explained by three distinct scenarios. Binding of the Integrator complex to the CTD requires: 1) a phospho-serine in position 2 and a serine in position 7; 2) a serine in position 2 and a phospho-serine in position 7; or 3) phospho-serines in positions 2 and 7. To discriminate between these possibilities, we separated Ser2 and Ser7 kinase activities in vitro. HeLa nuclear extract was fractionated on a Superose 6 size exclusion column, and the ability of each fraction to phosphorylate GST-CTD containing 25 consensus repeats in vitro was tested by Western blot with antibodies specific to phospho-Ser2, phospho-Ser5, and phospho-Ser7 (8) (Fig. 2A). The fractions were also tested for the presence of the CTD kinases CDK7 and CDK9, the cyclin T1 partner of CDK9, and the Int7 subunit of the Integrator complex. Interestingly, Ser7 kinase activity, which is present in fractions 16–36 and peaks in fractions 16 and 26–28, can be separated from Ser2 and Ser5 kinase activities, which are present in fractions 28–36, with only a small amount of Ser2 kinase activity in fraction 16. Thus, a high level of Ser7 kinase activity fractionates together with CDK9, cyclin T1, and the large Integrator complex with an apparent molecular mass of >2 MDa (20). A second peak of Ser7 kinase activity fractionates with an apparent molecular mass of ~500–700 kDa, whereas Ser2 and Ser5 kinases fractionate between 50 and 500 kDa. As expected, the pattern of Ser5 phosphorylation closely follows the distribution of the two isoforms of CDK7. In contrast, the distribution of the two major isoforms of the Ser2 kinase CDK9 does not closely correlate with the Ser2 phosphorylation pattern, although the antibody detects some bands between the two isoforms in the fractions active for Ser2 phosphorylation. 
Similarly, Ser7 phosphorylation does not correlate closely with the presence of either CDK7 or CDK9, as might be expected (7, 10, 11), although some Ser7 kinase activity is present in the fractions where CDK7 and CDK9 are detected. However, Ser7 kinase activity in fractions 16–30 correlates closely with the presence of the 460-kDa catalytic subunit of the DNA-activated kinase DNA-PK, which has CTD kinase activity (24) and has been shown to phosphorylate CTD peptides on Ser7 (25). In addition, the DNA-PK inhibitor, wortmannin (26), effectively and selectively inhibits Ser7 phosphorylation by nuclear extracts (Fig. 2B) and fractions 16, 26, 28, and 30 (data not shown). Furthermore, Ser7 phosphorylation by nuclear extracts (but not Ser2 or Ser5 phosphorylation) is enhanced by ~4-fold by adding double-stranded DNA, as expected for DNA-PK (27) (Fig. 2B). In addition, recombinant DNA-PK phosphorylates the GST-CTD fusion protein on Ser7, and this activity is inhibited by wortmannin and activated by double-stranded DNA (Fig. 2C). DNA-PK was also previously found to phosphorylate CTD peptides on Ser2 (25). Recombinant DNA-PK also has some Ser2 kinase activity toward the GST-CTD substrate. However, this is neither inhibited by wortmannin nor activated by double-stranded DNA and is significantly lower than Ser7 kinase activity (Fig. 2C). Taken together, these results indicate that under the conditions used, DNA-PK is the major Ser7 kinase in HeLa nuclear extract.
Separation of Ser2 and Ser7 kinase activities by Superose 6 fractionation facilitated the study of Integrator binding to the CTD using fractions containing the Ser7 kinase but not the Ser2 kinase (fraction 26), the Ser2 kinase but not the Ser7 kinase (fraction 36), or both kinase activities (fraction 30). Fractions 26, 30, and 36 were independently used to phosphorylate a GST-CTD containing 25 consensus repeats. As expected, the GST-CTD is differentially modified depending on the fraction used for in vitro phosphorylation (Fig. 3). CTDs were then tested for their ability to interact with Integrator using the binding assay described in Fig. 1. Interestingly, Int11 does not interact with the CTD phosphorylated with either fraction 26 or fraction 36, showing that neither Ser2 nor Ser7 phosphorylation alone can support efficient Integrator binding (Fig. 3). In contrast, simultaneous phosphorylation of Ser2 and Ser7 results in a strong interaction with Integrator. Taken together, these results demonstrate that Integrator binding to the CTD requires phosphorylation of both Ser2 and Ser7, raising the possibility that together, these modifications create a new CTD mark specifically recognized by the Integrator complex.
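The logic used to discriminate between the three scenarios can be written out as a small consistency check. This is a minimal sketch of the reasoning, not an analysis pipeline used in the study. It assumes that the substrate is the wild-type GST-CTD (so a serine is always present at positions 2 and 7), that each fraction yields only the phosphorylation state described in the text, and that each scenario is treated as both necessary and sufficient for binding in this assay. The names `OBSERVED`, `SCENARIOS`, and `is_consistent` are hypothetical.

```python
# Phosphorylation state of the wild-type GST-CTD after treatment with each
# Superose 6 fraction, and whether Int11 binding was seen (as described for Fig. 3).
OBSERVED = {
    26: {"pSer2": False, "pSer7": True, "binds": False},   # Ser7 kinase only
    36: {"pSer2": True, "pSer7": False, "binds": False},   # Ser2 kinase only
    30: {"pSer2": True, "pSer7": True, "binds": True},     # both kinase activities
}

# Assumption: each scenario is modeled as a predicate that predicts binding.
# A serine is always present at positions 2 and 7 on the wild-type substrate,
# so only the phosphorylation requirement differs between scenarios.
SCENARIOS = {
    1: lambda s: s["pSer2"],                  # phospho-Ser2 plus a serine at position 7
    2: lambda s: s["pSer7"],                  # a serine at position 2 plus phospho-Ser7
    3: lambda s: s["pSer2"] and s["pSer7"],   # phospho-Ser2 and phospho-Ser7
}


def is_consistent(predict):
    """Return True if the scenario predicts the binding result for every fraction."""
    return all(predict(state) == state["binds"] for state in OBSERVED.values())


surviving = [number for number, predict in SCENARIOS.items() if is_consistent(predict)]
print(surviving)
```

Scenario 1 fails on fraction 36 and scenario 2 fails on fraction 26, which leaves only scenario 3, in agreement with the conclusion drawn from Fig. 3.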
As a final step, we designed a set of N-terminally biotinylated peptides to further characterize the Integrator binding motif on the CTD (Fig. 4). First, we determined the minimal number of CTD repeats required for efficient Int11 binding. CTD peptides consisting of 2–4 heptapeptide repeats were tested for their ability to interact with Integrator before or after in vitro phosphorylation (Fig. 4A). As expected, non-phosphorylated peptides were unable to bind Integrator. In contrast, Int11 interacts with in vitro phosphorylated CTD peptides comprising two and four repeats. This result demonstrates that two repeats are sufficient for Integrator binding and is in accordance with the previous demonstration in yeast that the basic unit of the CTD comprises two repeats (28). Importantly, an in vitro phosphorylated CTD peptide containing only one repeat failed to interact with Int11, as did an in vitro phosphorylated peptide carrying two repeats with alanine instead of serine at positions 2 and 7 (Fig. 4B). These controls demonstrate the specificity of the binding assay and confirm the requirement for Ser2 and Ser7 for binding. To more precisely characterize the motif required for Integrator binding, five CTD peptides consisting of two repeats with different combinations of phospho-Ser2/phospho-Ser7 were tested for their ability to bind Integrator (Fig. 4C). The peptide phosphorylated on both Ser2 and Ser7 residues (phosphates in positions 2, 7, 9, and 14) binds Integrator almost as well as the in vitro phosphorylated CTD peptide, suggesting that this peptide contains all the modifications required for Integrator binding. Although the peptides with phosphates in positions 2/7, 9/14, or 2/14 do not appreciably interact with Integrator, the peptide with phosphates in positions 7/9 interacts with Integrator, although less well than the double repeat with both Ser2 and Ser7 residues phosphorylated. 
These results demonstrate that the minimal mark required for Integrator binding is a phospho-Ser7 on the first CTD repeat followed by a phospho-Ser2 on the second CTD repeat. None of the other combinations of two phosphates are sufficient for Integrator binding. In addition, Integrator binding to peptides phosphorylated by incubation with nuclear extract is lost when either Ser7 in the first repeat or Ser2 in the second repeat is replaced by alanine, S7A and S9A, respectively (Fig. 4D), demonstrating the specificity of this new CTD motif. However, phosphorylation of an additional Ser2, Ser7, or both enhances binding.
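The position numbering used for the two-repeat peptides (2, 7, 9, and 14) and the minimal mark can be expressed as a short check. The sketch below is illustrative only: it encodes the qualitative rule stated above (phospho-Ser7 in one repeat followed by phospho-Ser2 in the next) and does not model binding strength, so it cannot capture the enhancement seen with additional phosphates. The function names `site_name` and `has_minimal_mark` are hypothetical.

```python
HEPTAD = "YSPTSPS"
PEPTIDE = HEPTAD * 2  # two consensus repeats, residues numbered 1-14


def site_name(position):
    """Map a 1-based peptide position to (repeat number, position within the heptad)."""
    repeat, index = divmod(position - 1, len(HEPTAD))
    return repeat + 1, index + 1


def has_minimal_mark(phospho_positions, n_repeats=2):
    """Return True if a phospho-Ser7 is directly followed by a phospho-Ser2 in the next repeat.

    Assumption: this is a yes/no rule only; it says nothing about how
    strongly a given peptide binds.
    """
    phospho = set(phospho_positions)
    # Ser7 of repeat r sits at position 7*r; Ser2 of repeat r+1 sits at 7*r + 2.
    return any(7 * r in phospho and 7 * r + 2 in phospho for r in range(1, n_repeats))


# The five synthetic two-repeat phosphopeptides described in the text.
tested = [(2, 7, 9, 14), (2, 7), (9, 14), (2, 14), (7, 9)]
for positions in tested:
    sites = [site_name(p) for p in positions]
    print(positions, sites, has_minimal_mark(positions))
```

For example, position 9 maps to Ser2 of the second repeat, so the 7/9 peptide carries the mark, whereas the 2/7, 9/14, and 2/14 peptides do not.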
The RNA pol II CTD plays the role of scaffold for the recruitment of numerous factors involved in regulation of transcription and co-transcriptional RNA processing (3). The recent discovery of a new phosphorylation mark on serine 7 of the YSPTSPS CTD heptapeptide (3, 8) extends the repertoire of CTD modifications and raises the possibility of the existence of new regulatory phosphorylation combinations. The requirement for a double mark has only previously been demonstrated for factors that require phosphorylation of both Ser2 and Ser5, for example, Set2 (3, 6). The double mark comprising Ser7 phosphorylation on the first repeat of a CTD heptapeptide pair, followed by Ser2 phosphorylation on the second, is therefore the first example involving Ser7 phosphorylation and the first where different marks on two repeats are required for protein binding. Thus, this discovery emphasizes the increasing complexity of CTD modifications during RNA pol II transcription. Phosphorylation of both Ser2 and Ser7 occurs on snRNA genes (9, 22, 29). Although the currently available antibodies specific for phosphorylation on each of the three serines are very useful in studying transcription, they are unable to give precise information about the juxtaposition of modifications on individual heptads (or pairs). It would therefore be of great interest to raise an antibody against the Ser7/Ser2-specific mark and test its presence on snRNA genes by chromatin immunoprecipitation. Interestingly, phosphorylation of Ser2 and Ser7 is also detected at the 3′ end of human protein-coding genes (8, 29), but the Integrator complex is not. It is therefore possible that within the CTD, the location of phosphorylation of Ser2 and Ser7 relative to each other differs from snRNA to protein-coding genes or that this mark works in synergy with promoter-specific factors/elements to specifically recruit Integrator to snRNA genes but not protein-coding genes. 
In addition, Ser5 and its phosphorylation appear to have a negative impact on the recognition of the double mark by Integrator, indicating that some phosphorylation marks interfere with CTD-protein interaction, adding a further layer of complexity to the CTD code.
Surprisingly, our study also indicates that the majority of the in vitro Ser7 kinase activity in HeLa nuclear extract is attributable to the large kinase, DNA-PK. So far, no role for DNA-PK as a CTD Ser7 kinase in vivo has been demonstrated, and further investigation of this possibility is warranted. Nevertheless, the Ser7-specific activity obtained by fractionation of nuclear extract and recombinant DNA-PK will provide useful tools to selectively phosphorylate CTD heptapeptides in GST fusion proteins on Ser7, which would, for example, facilitate further studies of the role of this modification in recruitment of proteins to the CTD.
We thank Zbig Dominski for the Int11 antibody and Dirk Eick for the phospho-Ser2, phospho-Ser5, and phospho-Ser7 antibodies.
*This work was supported by a grant from the Medical Research Council (UK) and Wellcome Trust grants (to S. M.).