Users may view, print, copy, download and text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
Symplekin (Pta1 in yeast) is a scaffold in the large protein complex that is required for 3′-end cleavage and polyadenylation of eukaryotic messenger RNA precursors (pre-mRNAs) 1–4, and also participates in transcription initiation and termination by RNA polymerase II (Pol II) 5,6. Symplekin mediates interactions among many different proteins in this machinery 1,2,7–9, although the molecular basis for its function is not known. Here we report the crystal structure at 2.4 Å resolution of the N-terminal domain (residues 30–340) of human symplekin (Symp-N) in a ternary complex with the Pol II C-terminal domain (CTD) Ser5 phosphatase Ssu72 7,10–17 and a CTD Ser5 phosphopeptide. The N-terminal domain of symplekin has the ARM or HEAT fold, with seven pairs of anti-parallel α-helices arranged in the shape of an arc. The structure of Ssu72 has some similarity to that of low-molecular-weight phosphotyrosine protein phosphatase 18,19, although Ssu72 has a unique active site landscape as well as extra structural features at the C-terminus that are important for interaction with symplekin. Ssu72 is bound to the concave face of symplekin, and engineered mutations in this interface can abolish interactions between the two proteins. The CTD peptide is bound in the active site of Ssu72, unexpectedly with the pSer5-Pro6 peptide bond in the cis configuration, which contrasts with all other known CTD peptide conformations 20,21. While the active site of Ssu72 is about 25 Å away from the interface with symplekin, we found that the symplekin N-terminal domain stimulates Ssu72 CTD phosphatase activity in vitro. Furthermore, the N-terminal domain of symplekin inhibits polyadenylation in vitro, but importantly only when coupled to transcription. As catalytically active Ssu72 overcomes this inhibition, our results demonstrate a role for mammalian Ssu72 in transcription-coupled pre-mRNA 3′-end processing.
Human symplekin contains 1274 amino acid residues (Fig. 1a) and its sequence is well conserved among higher eukaryotes (Supplementary Fig. 1). In comparison, symplekin shares only weak sequence similarity with yeast Pta1 1 (Supplementary Fig. 2), and Pta1 lacks the C-terminal 500 residues of symplekin (Fig. 1a). Symplekin and Pta1 do not have any recognizable homology with other proteins. Secondary-structure predictions suggest the presence of an all-helical segment in the N-terminal region of symplekin and Pta1 (Fig. 1a and Supplementary Figs. 1–2). Recent studies in yeast suggested that the N-terminal segment of Pta1 is important for interaction with Ssu72 9. Ssu72 is required for pre-mRNA 3′-end cleavage in yeast 7, although its phosphatase activity is not necessary for this function 13. The catalytic activity of Ssu72 may instead be important for Pol II transcription and termination, and gene looping 17. Ssu72 is highly conserved among the eukaryotes (Supplementary Fig. 3), but to date no evidence exists implicating mammalian Ssu72 in 3′-end processing.
To determine the structure of a symplekin-Ssu72-CTD phosphopeptide ternary complex, residues 30–360 of human symplekin and full-length human Ssu72 were over-expressed and purified separately. The two proteins were mixed, with Ssu72 in slight molar excess, and the symplekin-Ssu72 complex was purified by gel-filtration chromatography. This procedure also demonstrated strong interactions between the two human proteins, consistent with observations on their yeast counterparts 9. The 10-mer CTD phosphopeptide used in this study, SY1S2P3T4(pS5)P6S7YS, where Ser5 is phosphorylated, contained an entire CTD heptad repeat as well as a Ser from the previous repeat and Tyr-Ser from the following repeat. To prevent hydrolysis, the active site nucleophile Cys12 of Ssu72 was mutated to Ser in the ternary complex. We have also determined the crystal structure of the symplekin-Ssu72(wild-type) binary complex and the structures of the symplekin N-terminal domain alone (for residues 30–395, 30–360, or 1–395). All the structures have excellent agreement with the crystallographic data and the expected geometric parameters (Supplementary Table 1).
The structures show that residues 30–340 of symplekin (Symp-N) form seven pairs of anti-parallel α-helices, while residues 1–29 and 341–395 are disordered (Supplementary Fig. 1). The pairs of helices are arranged in the shape of an arc, with the first helix in each pair, the αA helix, being located on the convex face of the arc, and the αB helix on the concave face (Fig. 1b). Most of the loops connecting the helices are short, except for that linking helices α4B and α5A, with 31 residues (Fig. 1b). The overall fold of Symp-N is found in many other proteins, including those with the ARM or HEAT repeats. These structures are often involved in protein-protein interactions, consistent with the proposed scaffold function of symplekin.
The structure of the N-terminal domain of Drosophila symplekin (residues 22–270) was reported recently 22. Its overall conformation is similar to that of human Symp-N, with rms distance of 1.0 Å among their equivalent Cα atoms, although the Drosophila symplekin structure is missing the two pairs of helices (α6 and α7) at the C-terminus (Fig. 1c). Noting the good sequence conservation of residues in this region (Supplementary Fig. 1), it is likely that these helices are also present in the N-terminal domain of Drosophila symplekin. Notably, our studies show that helix α6B is important for interactions with Ssu72 (Fig. 1b, see below).
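The rms distance quoted above is the usual root-mean-square deviation over paired, equivalent Cα atoms. The following is a minimal sketch of that arithmetic, assuming the two structures have already been superposed and the equivalent atoms already paired; the text does not state which program was used for the comparison, and the coordinates below are hypothetical values chosen for illustration, not taken from either structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance (same units as the input, e.g. Å)
    between two equally long lists of paired (x, y, z) coordinates.

    Assumes the two coordinate sets are already superposed; no fitting
    is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical Cα positions: every atom in the second set is displaced
# by exactly 1.0 Å along z relative to the first set.
example_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
example_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
example_rmsd = rmsd(example_a, example_b)
```

Because every paired atom in this toy example is separated by the same distance, the rms distance equals that distance.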
The structure of Ssu72 contains a central 5-stranded β-sheet (β1 through β5) that is surrounded by helices on both sides (Fig. 2a, Supplementary Fig. 4). The closest structural homolog of Ssu72 is the low-molecular-weight phosphotyrosine protein phosphatase (Fig. 2b) 18,19, as suggested earlier 11,12,14, even though the two proteins share only 16% sequence identity. Our studies show however that Ssu72 possesses three unique structural features as compared to this other phosphatase (Fig. 2b), which are formed by highly conserved residues (Supplementary Fig. 3) and have important functions. A small, two-stranded anti-parallel β-sheet (β2A and β2B) is located near the active site (Fig. 2a). The αD helix is in a different conformation in Ssu72 and also contributes to phosphopeptide binding. Finally, Ssu72 contains an extra helix (αG) and β-strand (β5) at the C-terminus, which are essential for interactions with symplekin (Fig. 1b).
Our structure of the ternary complex showed that the CTD phosphopeptide, with good electron density for residues Thr4 to Ser7 (Fig. 2c), is bound with the peptide bond between pSer5 and Pro6 in the cis configuration (Fig. 2d). This is in sharp contrast to the conformations of the CTD phosphopeptides observed in other structures, which all have the Pro residue(s) in the trans configuration (Supplementary Fig. 5) 20,21. With the cis configuration, the backbone of the phosphopeptide makes a 180° turn at the pSer-Pro residues, while the peptide in the trans configuration (Supplementary Fig. 5) would clash with Ssu72. Therefore, Ssu72 can only bind and dephosphorylate CTD substrates with the pSer-Pro peptide bond in the cis configuration, in contrast to all other known CTD phosphatases (Supplementary text).
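The cis or trans state of a peptide bond is conventionally read from the ω torsion angle, defined by the atoms Cα(i), C(i), N(i+1) and Cα(i+1): ω near 0° is cis and ω near 180° is trans. The sketch below shows one way to compute and classify ω from four atomic positions. The function names, the 90° cutoff, and the idealized planar coordinates are assumptions made for illustration; they are not the procedure or the coordinates used for the structures reported here.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / length, b1[1] / length, b1[2] / length)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def classify_omega(omega_deg):
    """Label a peptide bond 'cis' or 'trans' (illustrative 90° cutoff)."""
    return "cis" if abs(omega_deg) < 90.0 else "trans"


# Idealized, hypothetical positions for Cα(i), C(i), N(i+1), Cα(i+1).
ca_i, c_i, n_next = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
omega_cis = dihedral(ca_i, c_i, n_next, (1.0, 1.0, 0.0))     # Cα atoms on the same side
omega_trans = dihedral(ca_i, c_i, n_next, (1.0, -1.0, 0.0))  # Cα atoms on opposite sides
```

For the pSer5-Pro6 bond, the four atoms would be Cα and C of pSer5 followed by N and Cα of Pro6.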
Our observation of a cis configuration for the CTD also provides a different interpretation for the role of the peptidyl-prolyl isomerase Pin1 (Ess1 in yeast) in regulating Pol II transcription 23–25. It has been proposed that Pin1/Ess1 promotes the trans configuration of the CTD for dephosphorylation by Ssu72 24,25, while our structure indicates that the opposite must be true. Our in vitro phosphatase assays demonstrate that Pin1 strongly stimulates the phosphatase activity of Ssu72 (Supplementary text and Supplementary Fig. 6), consistent with its specificity for the cis configuration.
The active site of Ssu72 is located at the bottom of a narrow groove (Fig. 2e), one wall of which is formed by the small β-sheet (β2A and β2B) and the loop linking the two strands (Fig. 2d). This severely limits the possible conformation of the CTD, ensuring that only the cis configuration of the pSer-Pro peptide bond can be accommodated in the active site. In fact, the Thr4-pSer5 peptide bond is π-stacked with the Pro6-Ser7 peptide bond (Fig. 2d), suggesting a highly restrained conformation for the CTD phosphopeptide in this region. Residues Thr4, pSer5 and Pro6 of the same repeat as well as Tyr1 of the following repeat have interactions with the enzyme (Supplementary text, Fig. 2d), explaining the preference for pSer5 by Ssu72 and consistent with results from earlier biochemical studies on yeast Ssu72 14.
The phosphate group of the peptide is bound deepest in the structure, having extensive ion-pair and hydrogen-bonding interactions with the enzyme (Fig. 2d). In addition, the main-chain amide group of pSer5 is hydrogen-bonded to the main-chain carbonyl of Lys43 (in β2A). The catalytic nucleophile of Ssu72, Cys12, is located directly below the phosphate group and can be in the correct position for the inline nucleophilic attack on the phosphorus atom to initiate the reaction (Supplementary Fig. 7). The side chain of Asp143 is located 3.5 Å from the Oγ atom of Ser5, consistent with its role as the general acid to protonate the leaving group. There are some conformational changes in the active site region of Ssu72, especially for the β2A-β2B loop, upon binding of the CTD phosphopeptide (Supplementary Fig. 7), although this loop appears to be flexible and can assume different conformations in the various structures.
In the structures of the binary and ternary complexes, Ssu72 is bound to the concave face of the Symp-N (Fig. 1b). Approximately 950 Å² of the surface area of each protein is buried in the interface of this complex, which involves helices α3B through α6B of Symp-N (Fig. 3a, Supplementary Fig. 1), and helix αE, the following αE-β4 loop, helix αG and strand β5 of Ssu72 (Supplementary Fig. 3). In addition, residue Arg206, at the tip of the long loop connecting helices α4B and α5A of Symp-N, is also located in the interface (Fig. 3a). Ion-pair, hydrogen-bonding as well as hydrophobic interactions make contributions to the formation of this complex (Supplementary text). In particular, the side chains of Val191 and Phe193 of Ssu72 (in strand β5) establish hydrophobic interactions with those of Lys185 (α4B) and Ile251 (α5B) of symplekin in the center of this interface (Fig. 3a). In addition, the side chain hydroxyl group of Thr190 (β5) of Ssu72 is hydrogen-bonded to the side chain of Asn300 (α6B) of symplekin. The relative positions of Symp-N and Ssu72 appear to be somewhat variable among the binary and ternary complexes (Supplementary text, Supplementary Fig. 8).
The symplekin-Ssu72 interface is located about 25 Å from the active site of Ssu72 (Fig. 1b). Unexpectedly however, phosphatase assays measuring hydrolysis of a p-nitrophenylphosphate (pNPP) model substrate 11,12 showed that Symp-N stimulated Ssu72 activity (Fig. 3b), and maximal activation was achieved when the two proteins were at 1:1 molar ratio. To assess whether this stimulation also occurs with a natural substrate, we first used the 10-mer CTD phosphopeptide in the assay, monitoring the release of inorganic phosphate, and observed a similar stimulation (Supplementary Fig. 6). We next prepared a GST-CTD fusion protein that had been phosphorylated on Ser2 and Ser5 with HeLa nuclear extract 26. As demonstrated by Western blotting with a pSer5 specific antibody, Ssu72 dephosphorylated this protein on Ser5, importantly in a manner that was also stimulated by Symp-N (Fig. 3c). Ssu72 was specific for dephosphorylating pSer5, as Ser2 phosphorylation, as monitored by a pSer2 specific antibody, was not affected (data not shown).
Our data indicate that the symplekin-Ssu72 interaction activated Ssu72 phosphatase activity, likely through stabilization of the Ssu72 structure and/or an allosteric mechanism. This is consistent with previous studies on the R129A mutant (ssu72-2) of yeast Ssu72, equivalent to Arg126 in human Ssu72 (Supplementary Fig. 3). This mutant displays a two-fold reduction in catalytic activity compared to wild-type Ssu72 and produces a severe growth defect at the non-permissive temperature 16. Arg126 is far from the active site and is in fact near the interface with symplekin (Fig. 3a). However, it does not directly contribute to interactions with symplekin, and the R126A mutation did not disrupt interaction with Symp-N (data not shown).
To assess the importance of individual residues for the stability of the symplekin-Ssu72 complex, we introduced mutations in the interface and characterized their effects on the complex using gel-filtration chromatography and phosphatase assays. The presence of wild-type Ssu72 gave rise to a clear shift of the peak for Symp-N from a gel-filtration column (Fig. 3d), corresponding to the formation of the symplekin-Ssu72 complex. Ssu72 was present in two-fold molar excess in this experiment, and only half of this protein was incorporated into the complex (Fig. 3d), demonstrating a 1:1 stoichiometry for the complex. Mutation of a symplekin residue in the interface, K185A (Fig. 3a), essentially abolished the interaction with wild-type Ssu72 (Fig. 3e), and mutation of three Ssu72 residues in the interface, T190A/V191A/F193A, abolished the interaction with wild-type symplekin. The chromatographic behavior of the mutants alone was similar to that of the wild-type protein (Fig. 3e), suggesting that the mutations did not disrupt the structure of the proteins. This was also confirmed by the crystal structure of the K185A mutant (data not shown). Consistent with the gel-filtration data, the symplekin K185A mutant failed to stimulate Ssu72 phosphatase activity, and the T190A/V191A/F193A mutant of Ssu72 could not be stimulated by wild-type Symp-N (Fig. 3b).
We next wished to assess the functional importance of the symplekin-Ssu72 interaction with respect to 3′-end formation. Given the roles of their yeast counterparts in both transcription and polyadenylation, we used a transcription-coupled 3′-end processing assay 27. HeLa nuclear extract was pre-incubated with increasing concentrations of Symp-N, which led to a pronounced inhibition of polyadenylation (Fig. 4a), similar to an effect observed earlier with the yeast Pta1 N-terminal domain in a transcription-independent assay 9. Transcription, as measured by accumulation of unprocessed pre-mRNA, was not affected (Fig. 4b). Unexpectedly, RNase protection assays showed that 3′-end cleavage was also not affected (Supplementary Fig. 9), indicating that Symp-N affects only the polyadenylation step of 3′-end formation. Inclusion of purified Ssu72 during the pre-incubation with Symp-N blocked the inhibition, while Ssu72 alone had no effect (Fig. 4c). Importantly, the K185A mutation in Symp-N abolished this inhibitory effect, while the T190A/V191A/F193A mutant of Ssu72 failed to overcome the inhibition by wild-type Symp-N (Fig. 4c). These results provide strong evidence that the inhibitory effect of Symp-N reflects its interaction with Ssu72, and thus implicate Ssu72 in mammalian 3′-end processing. Somewhat unexpectedly, based on studies in yeast 7,9, the catalytically inactive C12S mutant of Ssu72 failed to overcome this inhibition (Fig. 4c), and Symp-N had no detectable effect on transcription-independent polyadenylation (Fig. 4d and data not shown). Together, these results indicate that Ssu72 phosphatase activity is required for polyadenylation of pre-mRNAs, but only when processing is coupled to transcription.
Our finding that a CTD phosphopeptide is bound to Ssu72 with the pSer-Pro peptide bond in the cis configuration indicates the existence of a novel CTD conformation. While Ssu72 has been well studied in yeast and has functions in transcription and 3′-end processing, essentially nothing was known about its mammalian counterpart. In fact, while the yeast enzyme is a stable component of the polyadenylation machinery and required for processing, mammalian Ssu72 has not been found associated with polyadenylation factors, and was not detected in a recent proteomic analysis of the assembled polyadenylation complex 28. Consistent with this, our results provide evidence that in mammals Ssu72 is only necessary for polyadenylation when processing is coupled to transcription. A parsimonious model is that symplekin recruits Ssu72 to the transcription complex and activates its phosphatase activity, which promotes polyadenylation. Conceivably this occurs by facilitating recruitment of poly(A) polymerase, known for many years to be only weakly associated with other 3′-end processing factors 28,29, to the complex. Given that the CTD is necessary for efficient 3′-end formation in mammalian cells 26,30, and that CTD pSer5 is the only known target of Ssu72, CTD pSer5 dephosphorylation may well be important in facilitating polyadenylation during transcription.
The N-terminal domain of human symplekin and full-length human Ssu72 were over-expressed separately in E. coli and purified. The symplekin-Ssu72 complex was purified by gel filtration of a mixture of the two proteins. Crystals were obtained by the sitting-drop vapor diffusion method, and the structures were determined by the seleno-methionyl single-wavelength anomalous diffraction method and the molecular replacement method.
Transcription-coupled polyadenylation was carried out using a DNA construct containing GAL4 binding sites upstream of the adenovirus E4 core promoter and the SV40 late poly(A) site downstream. Recombinant symplekin and Ssu72 proteins were preincubated with HeLa nuclear extract before transcription was started by adding the DNA template and purified GAL4-VP16. RNA products were purified, separated into nonpolyadenylated and polyadenylated fractions and analyzed on a 5% denaturing gel. Radioactivity was detected using a Phosphorimager. Assays were repeated multiple times with consistent results.
Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.
We thank Farhad Forouhar and Jayaraman Seetharaman for help with data collection, Thierry Auperin for initial experiments with symplekin, Randy Abramowitz and John Schwanof for setting up the X4C beamline and Stuart Myers for setting up the X29A beamline at the NSLS, Martin Heidemann and Dirk Eick for providing the 3E10 antibody, Nishta Rao and Charlotte Logan for HeLa nuclear extract and help with in vitro assays, and John Decatur for characterizing the phosphopeptide by NMR. This research is supported in part by grants from the NIH to LT (GM077175) and JLM (GM028983).
Author Contributions. K.X., S.X., T.K. and M.M.B. carried out protein expression, purification and crystallization experiments. K.X., S.X. and L.T. carried out crystallographic data collection, structure determination and refinement. T.N. and K.X. carried out polyadenylation experiments. K.X. carried out Ssu72 phosphatase assays. All authors commented on the manuscript. L.T. and J.L.M. designed the experiments, analyzed the data and wrote the paper.
Author Information. The atomic coordinates have been deposited at the Protein Data Bank (accession codes 3O2Q, 3O2S, 3O2T). Reprints and permissions information is available at www.nature.com/reprints. The authors declare no competing financial interests.