Evidence that pre-mRNA processing events are temporally and, in some cases, mechanistically coupled to transcription has led to the proposal that RNA polymerase II (Pol II) recruits pre-mRNA splicing factors to active genes. Here we address two key questions raised by this proposal: (i) whether the U1 snRNP, which binds to the 5′ splice site of each intron, is recruited cotranscriptionally in vivo and, (ii) if so, where along the length of active genes the U1 snRNP is concentrated. Using chromatin immunoprecipitation (ChIP) in yeast, we show that elevated levels of the U1 snRNP were specifically detected in gene regions containing introns and downstream of introns but not along the length of intronless genes. In contrast to capping enzymes, which bind directly to Pol II, the U1 snRNP was poorly detected in promoter regions, except in genes harboring promoter-proximal introns. Detection of the U1 snRNP was dependent on RNA synthesis and was abolished by intron removal. Microarray analysis revealed that intron-containing genes were preferentially selected by ChIP with the U1 snRNP. Thus, U1 snRNP accumulation at genes correlated with the presence and position of introns, indicating that introns are necessary for cotranscriptional U1 snRNP recruitment and/or retention.
Pre-mRNA splicing is a two-step transesterification reaction carried out by the spliceosome, a large and dynamic multicomponent RNA-protein complex (52). The first steps in the assembly of the spliceosome on pre-mRNA involve the recognition of the 5′ and 3′ ends of each intron (5′ and 3′ splice sites) by small nuclear ribonucleoprotein particles (snRNPs) and non-snRNP splicing factors. Regulation of this process determines splice site usage in alternative pre-mRNA splicing (50). A report that 40 to 60% of human genes are alternatively spliced to produce multiple gene products (26) underscores the importance of understanding splice site recognition and subsequent spliceosome assembly. Although much progress has been made in recent years toward understanding the biochemical activities of many splicing regulators, it has been difficult to establish systems for examining the roles of such regulators on endogenous pre-mRNAs in vivo and the mechanisms by which they are recruited.
An important clue to understanding how splicing factors might initially assemble on pre-mRNA is provided by observations that splicing begins and is sometimes completed cotranscriptionally (for a review, see reference 39). For a number of genes, intron removal has been detected in nascent RNAs still tethered to the DNA axis by RNA polymerase II (Pol II) (3, 5, 42, 53, 54, 56). Evidence that transcription rates and promoter identity influence alternative splice site selection is consistent with a cotranscriptional splicing mechanism in humans (9, 21, 45) and yeast (K. J. Howe, C. M. Kane, and M. Ares, unpublished data). The findings that the C-terminal domain (CTD) of RNA Pol II is required for efficient capping, splicing, and polyadenylation of pre-mRNA (33) and specifically stimulates splicing in humans (14) have led to the proposal that Pol II itself recruits splicing factors to nascent RNA (4, 15, 31). Thus, splicing factors may resemble capping enzymes, which bind directly to Pol II via the CTD (7, 32) and do not appear to require RNA recognition for initial targeting to Pol II transcripts.
However, splicing need not always occur cotranscriptionally. A significant fraction of introns are excised after transcription termination (3, 54, 56, 57). Observations of recursive splicing, in which pre-mRNAs are spliced and then respliced, also indicate that not all splicing events are coupled directly to transcription (17, 29). Although cotranscriptional splicing in yeast is suggested by the kinetics of mRNA appearance (13), it has not been directly observed, and a report of recursive splicing has been used to argue against cotranscriptional splicing in yeast (29). Moreover, it is well known that purified pre-mRNAs synthesized by viral RNA polymerases can be spliced in vitro (25). Unlike the capping enzymes, many splicing regulators bind to sequence-specific elements in the pre-mRNA (50), suggesting that direct pre-mRNA binding may be sufficient for splicing in vivo. Thus, major questions in the field remain: to what extent are pre-mRNA splicing factors recruited cotranscriptionally and what are the requirements for pre-mRNA splicing factor recruitment in vivo?
Here we address these questions with respect to the U1 snRNP, the activity of which is required for pre-mRNA splicing in all species, from yeast to humans. The U1 snRNA base pairs with the 5′ splice site, thereby determining 5′ splice site usage, and the U1 snRNP is a component of the earliest biochemically defined splicing complexes (6, 36, 47-49, 58). Recently, it has been shown that the U1 snRNP-specific protein U1C also contacts the 5′ splice site in yeast (12). The U1 snRNP is not present in the active spliceosome, in which the U6 snRNA base pairs with the 5′ splice site to accomplish the catalytic steps (52). Although antigens shared among spliceosomal snRNPs have been detected by immunocytochemistry at active intron-containing transcription units in metazoans (22, 37), cotranscriptional U1 snRNP recruitment has never been specifically demonstrated or examined in detail.
Because nascent RNP complexes are likely to lie adjacent to the DNA axis (55), it seemed possible that splicing factors bound to nascent RNA could be detectable by chromatin immunoprecipitation (ChIP). Studies of chromatin and transcriptional regulation have been advanced by the ChIP technique, which has been used to localize specific factors and chromatin modifications to particular genomic regions (41). Because pre-mRNA splicing patterns are well understood in the yeast Saccharomyces cerevisiae and introns and exons annotated with respect to the complete genome sequence (30, 51), we chose to examine the association of the U1 snRNP with transcription units by ChIP in yeast cells. Here we show that the U1 snRNP accumulates on transcriptionally active intron-containing genes, a finding consistent with a cotranscriptional splicing mechanism in yeast cells. In contrast to capping enzymes, detectable at all promoters, the highest levels of the U1 snRNP coincided with regions of intron synthesis.
Yeast strains (Table 1) were grown in YP medium plus 2% glucose (YPD), YPD medium plus G418 (200 mg/ml), synthetic complete medium plus 2% glucose without histidine or uracil, or YP medium plus 2% galactose as necessary. Strains with protein A-tagged proteins Nam8p and Prp42p (BSY593 and BSY646) have been previously described (16, 43) and were a generous gift of Bertrand Seraphin. A PCR-based strategy (23) to epitope tag endogenous genes was used to generate strain YKK19, which has a hemagglutinin (HA) tag on the C terminus of Prp42p and a Myc tag on the C terminus of Rpo21p, from strain LG1, and to generate strains YKK20 and YIB37K, which both have an HA tag on the C terminus of Prp42p, from DBY120 and YIB37, respectively. The HA-tagged Prp42p was derived from pYM1 containing three copies of the HA epitope and KanMx6 selection marker. The Myc-tagged RPO21 was derived from pYM5 containing three copies of the Myc epitope and His selection marker. Tags were verified by Western blotting. Strain YGL130w contains a tandem affinity purification (TAP)-tagged Ceg1p (44) and was obtained from Cellzome. GAL1 gene expression was induced by growth in 2% galactose for 5 h. In the temperature shift assay, cells were grown at 24°C until an optical density at 600 nm of ~0.600 was achieved; half of the cells then continued growth at 24°C, while the remaining half were shifted to 37°C for 45 min.
ChIP was performed as described previously (18). Rabbit immunoglobulin G (IgG)-agarose beads (Sigma) were used with protein A-tagged proteins and TAP-tagged Ceg1p; anti-HA monoclonal antibody (MAb) 12CA5 (Boehringer Mannheim), followed by the addition of gamma-bind G Sepharose beads (Amersham), was used with HA-tagged proteins; anti-Myc MAb 9E10 (Santa Cruz) was used with Myc-tagged proteins; and MAb 8WG16 (Babco) was used against Pol II itself. In ChIPs of strains tagged with protein A tags or TAP tags, Cl-4b beads (Sigma) were used as a negative control, while rabbit IgG-agarose beads (Sigma) were used to specifically retrieve these tags.
DNA was analyzed by PCR with multiplex primer sets (sequences available upon request) along the genes of interest. Cycling was for 3 min at 94°C, followed by 24 cycles with 1 min at 94°C, 1 min at 56°C, 2 min at 72°C, and then finally 7 min at 72°C. Three concentrations of each template were used over a 30-fold range, and PCR products were distinguished on high-resolution 2.3% MetaPhor agarose gels (BioWhittaker) and stained with Gelstar. Lanes were chosen for quantitation and figures, based on the intensity of the signal lying in the linear range. Negative control lanes represent PCRs in which the amount of template matched the amount of experimental template used. When results were quantified (ImageQuant software; Molecular Dynamics), the gel background was subtracted, and signals were normalized for the intensity of bands generated from the input material to adjust for differences in PCR efficiency. Signals in intron and exon 2 regions are expressed relative to promoter levels. Note that Prp42p ChIP signals at the promoter are very close to background; therefore, background signal from the control ChIPs was not subtracted.
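The quantitation steps described above (background subtraction, normalization to the input band, and expression relative to the promoter) can be sketched in a few lines. This is only an illustrative sketch: the function names and the numbers in the usage note are hypothetical, and it assumes that the gel background is subtracted from both the ChIP band and the input band, which the text does not state explicitly.

```python
def normalized_signal(chip_band, input_band, background=0.0):
    """Return a background-subtracted ChIP band intensity divided by the
    background-subtracted input band for the same primer pair.

    Assumption: the same gel background applies to both bands.
    """
    chip = chip_band - background
    inp = input_band - background
    if inp <= 0:
        raise ValueError("input band must exceed the gel background")
    return chip / inp


def relative_to_promoter(region_signals, promoter_key="promoter"):
    """Express each region's normalized signal relative to the promoter."""
    promoter = region_signals[promoter_key]
    if promoter <= 0:
        raise ValueError("promoter signal must be positive")
    return {region: value / promoter
            for region, value in region_signals.items()}
```

For example, with hypothetical normalized signals of 0.5 at the promoter and 3.0 in the intron region, `relative_to_promoter({"promoter": 0.5, "intron": 3.0})` reports the intron at 6.0 times the promoter level.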
Cy3- and Cy5-labeled probes were prepared by linker-mediated PCR of the HA-Prp42p and Myc-Pol II ChIP templates (see above) and sonicated genomic DNA fragments present in the starting extract according to the published protocol (44). Two-color competitive hybridization experiments with S. cerevisiae cDNA microarrays were performed at the Fred Hutchinson Cancer Research Center (Seattle, Wash.). Microarray construction, target labeling, and hybridization protocols were adapted from those described previously (10). Yeast microarrays were constructed employing a set of 6,229 open reading frame (ORF)-specific PCR primer pairs (Research Genetics, Huntsville, Ala.), which were used to amplify ≤1-kb 3′-end portions of each ORF. Individual PCR products were verified as unique via gel electrophoresis and purified by using ArrayIt 96-well PCR purification kits (TeleChem International, Sunnyvale, Calif.). Purified PCR products were mechanically “spotted” in 3× SSC (450 mM sodium chloride and 45 mM sodium citrate; pH 7.0) onto polylysine-coated microscope slides by using an OmniGrid high-precision robotic gridder (GeneMachines, San Carlos, Calif.). Probes were cohybridized to microarrays for 16 h at 63°C and sequentially washed at room temperature in 1× SSC-0.03% sodium dodecyl sulfate for 2 min, 1× SSC for 2 min, 0.2× SSC with agitation for 20 min, and 0.05× SSC with agitation for 10 min. Arrays were immediately centrifuged until dry and scanned by using a GenePix 4000 scanner (Axon Instruments, Union City, Calif.). Image analysis was performed by using GenePix Pro 3.0. In each independent experiment, data points were eliminated if there were defects over particular spots or if fluorescence intensities in either channel were unreliably low relative to the local background.
To identify ORFs enriched by HA-Prp42p and Myc-Pol II ChIP procedures, the ratios of Cy3 and Cy5 fluorescence intensities were expressed as a ChIP score of log10(median intensity fluorochrome 1/median intensity fluorochrome 2)/√. Frequency histograms of the ChIP scores indicated a normal distribution of values for genomic-genomic hybridizations and a major peak with an outlying second peak for HA-Prp42-genomic and Myc-Pol II-genomic data sets. The means and standard deviations (SD) of all ChIP scores in each experiment were determined, and ORFs with ChIP scores that were >2 SD away from the mean were chosen for further analysis. Outlying ORFs selected by these criteria from five HA-Prp42p ChIP experiments (three with Cy5-Prp42 probes and two with Cy3-Prp42 probes) and three Myc-Pol II ChIP experiments (two with Cy3-Pol II probes and one with Cy5-Pol II probes) were pooled and analyzed for the reproducibility of their identification and the presence of introns within the genomic segment. Note that, because the probes were double stranded, ORFs may be hit on either DNA strand. Therefore, in scoring for introns, we considered any ORF to be intron containing if the ORF itself has an intron or if the ORF is within 500 nucleotides (nt) of an intron-containing gene on either strand. Because of the density of ORFs within the yeast genome, this occurred fairly frequently (see Results).
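The selection of outlying ORFs by the 2 SD criterion can be illustrated with a short sketch. It is not the analysis code used for the study: the function names are hypothetical, the score is taken here as the plain log10 ratio of median intensities (the additional divisor in the ChIP score is not reproduced), the sample SD is assumed, and "away from the mean" is read as a deviation in either direction.

```python
import math
import statistics


def log_ratio_scores(median_1, median_2):
    """log10(median intensity 1 / median intensity 2) for each ORF.

    Assumption: ORFs with a non-positive intensity in either channel
    are skipped, mirroring the removal of unreliable spots.
    """
    return {
        orf: math.log10(median_1[orf] / median_2[orf])
        for orf in median_1
        if orf in median_2 and median_1[orf] > 0 and median_2[orf] > 0
    }


def outlying_orfs(scores, n_sd=2.0):
    """Return ORFs whose score lies more than n_sd SD from the mean."""
    values = list(scores.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption
    return {orf for orf, s in scores.items() if abs(s - mean) > n_sd * sd}
```

Applied to each hybridization separately, the resulting sets can then be pooled across experiments to assess how reproducibly each ORF is selected.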
To determine whether the U1 snRNP accumulates cotranscriptionally and is detectable by ChIP, we constructed strain YKK19 (Table 1), harboring tagged copies of endogenous Pol II (Myc-Rpo21p) and Prp42p (HA-Prp42p), a U1 snRNP-specific protein (34, 43). Figure 1A shows schematically the proposed binding of the tagged U1 snRNP to nascent RNA, which is in turn tethered to the DNA axis by Pol II. Tagging of both essential proteins had no effect on the growth rate of the strain (data not shown), indicating that the normal functions of the Prp42p and Pol II were not disrupted. Two endogenous, intron-containing genes, ASC1 and DBP2 (Fig. 1B and C), were initially selected for analysis by ChIP because they are highly transcribed and have relatively large first exons (>500 bp) (19, 51). The DNA shearing procedure abolished PCR detection of ≥1-kb stretches along the genes of interest while preserving detection of ≤400-bp stretches (data not shown). Therefore, if the U1 snRNP associates with these genes, the system should resolve signals before and after intron synthesis.
Three gene regions from both DBP2 and ASC1 were well represented in the ChIPs of Pol II, whether the anti-Myc tag antibody (Fig. 1B, lanes 10 to 12; Fig. 1C, lanes 4 and 9) or an antibody against Pol II itself (Fig. 1B, lanes 7 to 9; Fig. 1C, lanes 5 and 10) was used. In agreement with previous results with tagged versions of various subunits of Pol II (24, 46), the distribution of Pol II along both genes was found to be fairly uniform, usually decreasing slightly downstream from the promoter. PCR products specifically resulted from ChIP of Pol II, since the same products were detected only very weakly in the nonimmune control ChIP (Fig. 1B, lanes 2 and 3; Fig. 1C, lanes 2 and 7), and Pol II was not detected at either the transcriptionally repressed GAL1 gene or the untranscribed telomere VI R (see Fig. 3B and 4). In contrast, HA-Prp42p ChIP templates yielded only very low levels of PCR product corresponding to either of the promoter regions, whereas intron and exon 2 regions of both genes were well detected (Fig. 1B, lanes 4 to 6; Fig. 1C, lanes 3 and 8). Quantitation of PCR products within the linear range (see Materials and Methods) revealed that Prp42p ChIPs contained 6.4 ± 1.1 (mean ± SEM, n = 4 independent experiments)-fold-higher levels of DBP2 intron DNA than promoter-proximal DNA compared to Pol II ChIP templates prepared in parallel. Similarly, the DBP2 second exon was detected at 6.3 ± 2.4 (n = 4)-fold-higher levels than the promoter region. We conclude that Prp42p is concentrated in downstream regions of both DBP2 and ASC1 genes both during and after intron synthesis.
To confirm that Prp42p detection reflects the cotranscriptional accumulation of the U1 snRNP, we performed ChIP experiments with strains harboring other tagged versions of U1 snRNP-specific proteins. First, we examined protein A-tagged Prp42p and Nam8p, a second U1-specific protein (16). These strains were used previously to show that both PA-Prp42p and PA-Nam8p associate with the U1 snRNA by immunoprecipitation (16) and to study the interaction of PA-Nam8p with pre-mRNA in commitment complex formation (43). Figure 1D shows that ChIP with rabbit IgG-coated beads preferentially selects the ASC1 intron region relative to the promoter in both strains. This experiment also indicates that the HA tag introduced into YKK19 does not produce different results from other tagged versions. Second, HA-tagged Prp40p, another U1 snRNP component which has been shown to bind hyperphosphorylated Pol II CTD by Far Western analysis (38), was detected on DBP2 and ASC1 in an identical pattern to Prp42p (data not shown).
The present observation of U1 snRNP accumulation in downstream regions of ASC1 and DBP2 contrasts strongly with previous studies of the capping enzymes Ceg1p, Cet1p, and Abd1p, which have been detected at promoter regions of transcriptionally active genes (24, 46). Note that, unlike Abd1p, which remains associated with downstream regions, Ceg1p and Cet1p are preferentially concentrated in promoter regions (24, 46). To facilitate a direct comparison between U1 snRNP and capping enzyme dynamics on ASC1 and DBP2, ChIP was performed by using a strain containing TAP-tagged Ceg1p, the mRNA guanylyltransferase (Fig. 2). As expected, TAP-Ceg1p was highly concentrated at the promoter regions of the intronless gene PDR5 assayed in a previous study (24). Similarly, TAP-Ceg1p was concentrated on both ASC1 and DBP2 promoter regions, verifying the differential distribution of capping and splicing factors on two intron-containing genes.
To address the possibility that U1 snRNP may accumulate in downstream regions of all genes, we assayed U1 snRNP and Pol II distributions along three intronless genes. ChIP with anti-Myc-Pol II or anti-Pol II showed robust signals for promoter and downstream regions in RPS3, ADH1, and PDR5 genes (Fig. 3A, lanes 4, 5, 9, 10, 14, and 15). In contrast, HA-Prp42p was detected at very low levels relative to the control at every position along RPS3 and PDR5, while somewhat higher levels of HA-Prp42p were observed on ADH1 (Fig. 3A, lanes 3, 8, and 13). Relative to levels obtained with Pol II ChIP templates, signals from HA-Prp42p ChIP either decreased or remained the same along the length of each gene. Therefore, U1 snRNP accumulation on DBP2 or ASC1 downstream regions cannot be attributed to generic changes in affinity for elongating Pol II, nonspecific binding to nascent RNA, or recruitment by the 5′ cap.
Because a low level of U1 snRNP association was detected within intronless genes relative to the nonimmune controls (Fig. 3A, compare lanes 2 and 3, 7 and 8, and 12 and 13), we sought to determine whether gene induction is sufficient for U1 snRNP accumulation. GAL1 gene transcription was induced and changes in Prp42p association with the GAL1 promoter were determined. When YKK19 was grown in glucose, Pol II was detected at the ADH1 promoter but not on GAL1 or a transcriptionally inactive telomeric region on chromosome VI-R (Fig. 3B, lane 4). Low levels of HA-Prp42p were detected at ADH1 only (Fig. 3B, lane 3), as expected (see Fig. 3A). After 5 h of growth in galactose, Pol II was present on both ADH1 and GAL1, but Prp42p was not detected on the GAL1 promoter (Fig. 3B, lanes 7 and 8) or in the downstream region (data not shown) relative to the nonimmune control. This suggests that U1 snRNP detection on chromatin is not due to transcriptional activity per se.
If U1 snRNP accumulation reflects specific binding to cognate sites in nascent RNA as proposed (Fig. 1A), then it is expected to depend on RNA synthesis. To test this prediction, we introduced an HA tag into the endogenous copy of Prp42p in strain DBY120, harboring the temperature-sensitive rpb1-1 allele of the Pol II large subunit (35, 40). A previous study showed that in this strain the Pol II holoenzyme dissociates from previously active transcription units when the cells are shifted to the nonpermissive temperature (46). As expected, Pol II was detected at the ADH1 promoter and along DBP2 in DBY120 cells grown at 24°C but not at GAL1 or the telomere VI R (Fig. 4). After 45 min of growth at 37°C, Pol II was not detectable above background levels at any gene region tested (Fig. 4, lanes 7 and 15). Similarly, the U1 snRNP was detectable on the intron and exon 2 regions of DBP2 and ADH1 promoter in DBY120 cells grown at 24°C but was undetectable after the shift to 37°C (Fig. 4, lanes 4, 8, 12, and 16). A similar loss of Pol II and U1 snRNP detection at 37°C was observed for ASC1 (data not shown). These data indicate that active transcription is required for U1 snRNP accumulation and that the U1 snRNP is not associated with chromatin independent of transcriptional activity.
To test whether association of the U1 snRNP with downstream regions depends on the intron, we assayed U1 snRNP levels along the DBP2 gene in a strain lacking the DBP2 intron. In this strain, the endogenous DBP2 ORF was replaced by the DBP2 cDNA; transcription levels of this intronless gene were previously found to be twofold higher than wild-type levels (2). We had previously detected elevated U1 snRNP levels on both the intron and second exon of wild-type DBP2 (see Fig. 1). Figure 5 shows that removal of the intron abolishes accumulation of the U1 snRNP on downstream DNA corresponding to the second exon. Normalizing to Pol II ChIP signals, the ratio of Prp42p downstream to Prp42p at the promoter was only 0.68 (versus 6.3-fold in wild-type). Thus, despite Pol II-driven expression, elevated U1 snRNP levels were not observed at the DBP2 allele lacking its intron.
Higher levels of U1 snRNP were detected on the DBP2 gene when the allele contained an intron (Fig. 1 and 5). Therefore, we postulated that intron-containing genes, constituting only ~5% of yeast ORFs, might accumulate more U1 snRNP than intronless genes. To test this directly, we performed genome localization analysis (20, 44). Cy3- and Cy5-labeled probes were generated from sheared YKK19 genomic DNA and HA-Prp42p and Myc-Pol II ChIP templates by linker-mediated PCR and hybridized with microarrays representing 6,229 yeast ORFs. The ratios of Cy3 and Cy5 median fluorescence intensities were used to analyze the data (see Materials and Methods). Microarrays hybridized with Cy3-genomic DNA versus Cy5-genomic DNA revealed a normal distribution of the data, as expected (Fig. 6, top panel). Interestingly, the data obtained from microarrays probed with HA-Prp42p ChIP templates versus genomic DNA yielded a major peak and a minor second peak of several hundred ORFs in each experiment, reflecting relatively higher scores for the labeled HA-Prp42p ChIP template probe (Fig. 6, center panel). Similarly, a second outlying peak was obtained from microarrays probed with Myc-Pol II ChIP templates and genomic DNA (Fig. 6, bottom panel). The striking difference between the shapes of the curves obtained with both ChIP templates compared to the genomic-genomic distribution indicates that specific sets of ORFs were selected by HA-Prp42p and Myc-Pol II ChIPs.
It was likely that the second outlying peaks obtained with HA-Prp42p and Myc-Pol II ChIP templates represented the population of ORFs associated with relatively high concentrations of Prp42p and Pol II. The mean values and the SD were determined for each experiment, and datum points >2 SD away from the mean were selected for further analysis. Results from five experiments comparing HA-Prp42p-genomic DNA and three experiments comparing Myc-Pol II-genomic DNA showed a high degree of reproducibility in the ORFs identified, by using the “2 SD criterion.” A list of HA-Prp42 ORF hits is provided in Table A1. Of the 388 ORFs hit in the HA-Prp42p arrays, 77 ORFs were hit every time (100%) and 161 ORFs were hit more than once (≥40%, Table 2). Of 373 ORFs hit in the Myc-Pol II arrays, 234 occurred more than once (≥67%, Table 2). A comparison of the Myc-Pol II hits with transcriptional frequency data (19) revealed that the outlying peak indeed contained highly transcribed ORFs (data not shown).
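The reproducibility tally described above amounts to counting, for each ORF, the number of independent experiments in which it passed the 2 SD criterion. The following is a minimal sketch of that bookkeeping; the function names are hypothetical and the ORF identifiers used in any call are placeholders, not data from Table A1.

```python
from collections import Counter


def tally_hits(experiments):
    """Count in how many experiments each ORF was selected.

    `experiments` is an iterable of collections of ORF names, one per
    hybridization; duplicates within one experiment count once.
    """
    counts = Counter()
    for hits in experiments:
        counts.update(set(hits))
    return counts


def hit_at_least(counts, min_experiments):
    """Return the ORFs selected in at least `min_experiments` experiments."""
    return {orf for orf, n in counts.items() if n >= min_experiments}
```

With five experiments, `hit_at_least(counts, 2)` corresponds to ORFs hit more than once (≥40%), and `hit_at_least(counts, 5)` to ORFs hit every time.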
To determine whether the outlying peak observed with the HA-Prp42p ChIP templates was enriched in intron-containing ORFs, all ORFs were evaluated according to the yeast intron database (http://www.cse.ucsc.edu/research/compbio/yeast_introns.html). Because genome localization analysis has the potential to detect ORFs on either the Watson or the Crick strand and because the resolution of the ChIP assay is ~400 bp, each ORF hit was also examined with respect to its position within the genome to determine whether introns occurred <500 nt away from the ORF, on either the same or opposite strand of DNA. If an ORF hit contained an intron or was found to be proximal to an intron-containing gene by the above criteria, the hit was scored as intron containing. Table 2 shows the results obtained for all of the ORFs hit more than once in the HA-Prp42p and Myc-Pol II microarrays. For the ORFs hit in every HA-Prp42p experiment, 92% were intron containing compared to 51% for Myc-Pol II. In contrast, only 4.2% of the ORFs distributed outside 2 SD for the genomic-genomic distribution were intron containing, reflecting the fact that only 5% of the yeast ORFs contain introns. Thus, we conclude that the HA-Prp42p ChIP template is highly enriched for intron-containing ORFs with respect to the genome overall.
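The scoring rule for intron-containing hits can be expressed as a simple interval check. The sketch below is illustrative only: the `Feature` record, its field names, and the coordinates in any example are hypothetical, the distance is measured between an ORF and the nearest intron-containing gene on the same chromosome (an intron-level distance would require intron coordinates), and strand is ignored because a hit on either strand counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    chrom: str
    start: int  # leftmost coordinate, regardless of strand
    end: int    # rightmost coordinate, regardless of strand
    has_intron: bool = False


def gap_nt(a, b):
    """Distance between two intervals on one chromosome; 0 if they overlap."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def scored_as_intron_containing(orf, genes, max_gap=500):
    """True if the ORF has an intron or lies within max_gap nt of a gene
    that does, on either strand of the same chromosome."""
    if orf.has_intron:
        return True
    return any(
        g.has_intron and g.chrom == orf.chrom and gap_nt(orf, g) < max_gap
        for g in genes
        if g.name != orf.name
    )
```

Because neighboring ORFs are closely spaced in yeast, a rule of this kind will score some intronless ORFs as intron containing purely by proximity, which is the intended behavior given the ~400-bp resolution of the assay.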
The yeast genome is predicted to contain 239 to 255 spliceosomal intron-containing ORFs (http://www.cse.ucsc.edu/research/compbio/yeast_introns.html and http://www-db.embl-heidelberg.de/jss/servlet/de.embl.bk.wwwTools.GroupLeftEMBL/ExternalInfo/seraphin/yidb.html). Although we detected 118 intron-containing genes by genome localization of Prp42p, we did not detect all of them. Because U1 snRNP accumulation is transcription dependent (Fig. 4), some intron-containing genes may not be expressed at levels high enough to be detected in the outlying peak. However, because the array was produced by using oligonucleotide pairs to amplify ≤1 kb of the 3′ end of each ORF (see Materials and Methods), we also considered the possibility that genes containing relatively long second exons may exhibit diminished U1 snRNP accumulation in the 3′ regions represented on the arrays, either because the U1 snRNP has already left the nascent mRNP due to spliceosome assembly or because the tags become inaccessible to antibodies in downstream gene regions. To address this concern, we examined U1 snRNP accumulation on upstream and downstream regions of two such genes, ECM33 and SAC6. Both of these genes contain introns very close to their promoters (Fig. 7), and indeed significant U1 snRNP accumulation was detected in promoter-proximal regions by ChIP (Fig. 7, lanes 3 and 7). Interestingly, HA-Prp42p detection was reduced by ~70% in downstream regions of both genes. Neither ORF was well represented in the HA-Prp42p microarray results (Table 3), suggesting that other ORFs with relatively long second exons might not have been detected by the microarray analysis performed here.
The earliest events in the life of an RNA are difficult to study in vivo. Nascent RNA represents only a tiny fraction of any given RNA, and it is difficult to establish an order of events from the detection of rare RNA species within the greater pool. A recent view of RNA Pol II transcription units holds that, in addition to being the site of RNA synthesis, they are also RNA processing units at which 5′-end capping, pre-mRNA splicing, and polyadenylation occur while the pre-mRNA is still being synthesized by Pol II (39). However, it has only been possible to demonstrate cotranscriptional RNA processing for a limited number of genes, species, and biochemical events. Here we report the use of ChIP for the detection of U1 snRNP levels at active yeast transcription units. The data provide evidence that (i) the U1 snRNP is recruited cotranscriptionally; (ii) U1 snRNP accumulation along the length of intron-containing transcription units is dynamic, with high concentrations detectable in regions of intron synthesis; and (iii) the U1 snRNP is preferentially concentrated on intron-containing genes with respect to the largely intronless yeast genome.
In the present study, yeast strains harboring specific epitope tags on three endogenous U1 snRNP components—Prp42p, Nam8p, and Prp40p—were used to monitor the sites of accumulation of the U1 snRNP within transcription units. Using ChIP followed by PCR for the detection of specific gene regions, we found that all three U1 snRNP proteins are highly enriched in the downstream regions of two intron-containing genes: ASC1 and DBP2. The resolution of the assay is sufficient to detect differences in U1 snRNP accumulation at the promoters versus intron-containing regions of ASC1 and DBP2, and indeed ~6.5-fold increases in the levels of Prp42p were detected downstream of the 5′ splice site. The distribution of the U1 snRNP in gene regions distal to the promoter contrasts dramatically with that of the capping enzyme Ceg1p, which was detected preferentially at both promoters. Because Prp42p detection on downstream regions of ASC1 and DBP2 in a conditionally mutant Pol II (rpb1-1) strain was abolished upon shift to the nonpermissive temperature, we conclude that U1 snRNP accumulation is dependent on active transcription. Moreover, neither Prp42p nor Pol II was detected on transcriptionally inactive chromatin, such as the telomere VI R or the GAL1 gene under conditions of glucose repression.
Two hypotheses explaining the observed increase in U1 snRNP in downstream regions of ASC1 and DBP2 were tested. First, events occurring at all active Pol II transcription units were considered. These include changes in the Pol II itself as it progresses 5′ to 3′ (e.g., changes in the CTD phosphorylation state, association of elongation factors, and/or association of a network of other factors, such as polyadenylation factors), the affinity of the U1 snRNP for the 5′ cap (8, 27), or possible nonspecific association with nascent RNA which becomes progressively longer toward the 3′ end of any gene. These possibilities were addressed by examining the highly transcribed intronless genes RPS3, ADH1, and PDR5. As expected, Pol II was detected at high levels relative to the background along the lengths of all three genes. Prp42p was detected at low levels above background, with no detectable changes in levels along the lengths of any of the three genes, even though PDR5 (4,539 nt) is 1,896 nt longer than DBP2 and 3,308 nt longer than ASC1 (see Fig. 3). Therefore, we tested the second possibility: that U1 snRNP accumulation was specified by the presence of the intron. Indeed, removal of the DBP2 intron abolished accumulation of the U1 snRNP on the downstream region (see Fig. 5). These data indicate that the enhanced detection of the U1 snRNP at downstream regions of ASC1 and DBP2 reflects the presence of an intron in the gene rather than events common to all Pol II transcription units.
As a definitive test of the proposal that the U1 snRNP accumulates cotranscriptionally on intron-containing genes, genome localization studies were carried out. In this approach, fluorescently labeled probes are synthesized from the ChIP templates and hybridized with microarrays containing 6,229 confirmed and predicted ORFs in the yeast genome. Genome localization analysis has been previously used to identify gene targets of a variety of transcription factors (20, 44). Several hundred ORFs were enriched in the HA-Prp42p ChIP template compared to sheared genomic DNA, producing an outlying second peak of the ratio of median intensities (see Fig. 6). Up to 92% of the ORFs reproducibly identified in five microarray experiments were intron containing (see Table 2 and Results). As a positive control for the assay, microarrays were also hybridized with probes synthesized from the Myc-Pol II ChIP templates; several hundred ORFs were identified in the outlying population, and up to 51% of these were intron containing. Although only ~5% of ORFs in the yeast genome contain introns, these genes tend to be among the most highly transcribed (1). Thus, the prevalence of intron-containing genes detected in the Pol II genome localization analysis was expected. The fact that a specific set of transcriptionally active genes is reproducibly identified by HA-Prp42p genome localization analysis confirms that U1 snRNP accumulation is cotranscriptional.
Close examination of the ORFs identified by Prp42p genome localization analysis raises several noteworthy points. First, because cotranscriptional U1 snRNP accumulation depends on transcription of the gene in the first place, the use of genome localization analysis to identify RNA processing targets is likely to be biased by the transcriptional activity of a given gene relative to the rest of the transcriptome. This prediction is borne out by the present data, in which the ORFs identified by HA-Prp42p ChIP were found to be highly transcribed (see Tables 3 and A1). Transcriptionally inactive intron-containing genes, such as those induced in meiosis, were not detected. Other intron-containing genes are probably not expressed highly enough to be represented in the outlying second peak.
Second, a number of intronless ORFs were consistently identified by HA-Prp42p as well as Myc-Pol II (see Tables 2, 3, and A1). This suggests either that the U1 snRNP specifically accumulates at some intronless genes or that it accumulates nonspecifically at highly transcribed genes. Interestingly, ADH1 was among the ORFs identified by both HA-Prp42p and Myc-Pol II ChIP templates in the genome localization analysis. We had observed in previous experiments that Prp42p was reproducibly and evenly detected along the length of ADH1 relative to the nonimmune control ChIP (see Fig. 3). ADH1 does not contain any consensus or nonconsensus sequence that has been shown to support U1 snRNA base pairing in the context of splicing (51). Thus, we cannot exclude the possibility that the U1 snRNP has a role in ADH1 gene expression through an unknown mechanism.
Finally, the absence of some highly transcribed intron-containing ORFs in the set of ORFs identified by HA-Prp42p genome localization led us to examine U1 snRNP distribution in ORFs containing a very short first exon followed by a very long second exon. We found that two such genes, ECM33 and SAC6, exhibit a high level of Prp42p at their 5′ ends and a reduced level at their 3′ ends (see Fig. 7). Possible explanations include either the loss of the U1 snRNP due to spliceosome assembly or the inaccessibility of the tags further downstream in the ORF. Thus, in spite of U1 snRNP accumulation at their 5′ ends, these ORFs were not reproducibly identified in the genome localization analysis, probably because the array used here was biased toward the 3′ end of each ORF. We conclude that genome localization analysis for RNA processing factors is feasible and informative, but it is more complex than paradigms based on DNA binding, as in the case of transcription factors. Genes not identified as targets of RNA processing factors should not be eliminated as candidates until they are analyzed in more detail.
Taken together, the results presented here support the conclusion that the U1 snRNP is recruited cotranscriptionally to intron-containing genes in yeast. If pre-mRNA splicing catalysis occurs cotranscriptionally in yeast, as it does in other species, then additional splicing factors, such as those recognizing the 3′ splice site and components of the spliceosome, may also accumulate cotranscriptionally. Recent models have emphasized the importance of coupling between transcription and pre-mRNA splicing, focusing on the CTD of Pol II as a platform for direct molecular interactions (4, 15, 31). However, the CTD is not required for efficient splicing in yeast (11, 28). Indeed, our data indicate that Pol II is not sufficient for U1 snRNP accumulation; Pol II is abundantly detectable along intron-containing as well as intronless genes, and yet U1 snRNP levels do not correlate with the distribution of Pol II. Instead, elevated U1 snRNP levels within transcription units coincide with sites of intron RNA synthesis, suggesting a dominant role for RNA splicing signals in U1 snRNP accumulation. The observation that uniform and usually reduced levels of the U1 snRNP were detected on intronless genes suggests that the U1 snRNP may have low affinity for some additional element(s) present at transcription units, which may contribute to initial recruitment and/or retention of the U1 snRNP at intron-containing genes. One candidate for such an interaction may indeed be the CTD of Pol II, since it has been shown that Prp40p, another component of the U1 snRNP, binds the phosphorylated CTD in vitro (38). It is interesting that Prp40p was not detected at promoters but mirrored Prp42p in its distribution (data not shown), suggesting that Prp40p may bind the CTD in vivo only after U1 snRNP recruitment to intron-containing gene regions.
Thus, it is clear that the mechanism of U1 snRNP recruitment is fundamentally different from the more straightforward case of the capping enzymes, which bind to the hyperphosphorylated CTD to ensure capping of every Pol II transcript. The demonstration that intronless genes accumulate relatively less U1 snRNP overall than intron-containing genes in the transcriptome suggests a more complex and dynamic mechanism for recruitment which may support greater plasticity in splicing, particularly in higher organisms where alternative splicing is a common mode of gene regulation.
We thank L. Goetsch, B. Seraphin, and R. Iggo for strains; E. Young and W. Zacharaie for the tagging plasmids; and M. Ares for communicating unpublished results. We are grateful to the Gottschling lab for sharing their ChIP protocol, J. Delrow and J. Howard for help with the microarray analysis, and A. Hopper, D. Stanek, F. Stewart, J. Valcarcel, and A. Weiner for discussions and comments on the manuscript.
K.M.K. is the recipient of a predoctoral fellowship from Boehringer Ingelheim Fonds. This work was supported by the Max Planck Gesellschaft and a Research Project Grant (RPG-00-110-01-MGO) from the American Cancer Society.
The ORFs selected more than once by HA-Prp42p genome localization analysis were determined and are presented in Table A1.