In order to study the intragenic profiles of active transcription, we determined the relative levels of active RNA polymerase II present at the 3′- and 5′-ends of 261 yeast genes by run-on. The results obtained indicate that the 3′/5′ run-on ratio varies among the genes studied by over 12 log2 units. This ratio seems to be an intrinsic characteristic of each transcriptional unit and does not significantly correlate with gene length, G + C content or level of expression. The correlation between the 3′/5′ RNA polymerase II ratios measured by run-on and those obtained by chromatin immunoprecipitation is poor, although the genes encoding ribosomal proteins present exceptionally low ratios in both cases. We detected a subset of elongation-related factors that are important for maintaining the wild-type profiles of active transcription, including DSIF, Mediator, factors related to the methylation of histone H3-lysine 4, the Bur CDK and the RNA polymerase II subunit Rpb9. We conducted a more detailed investigation of the alterations caused by rpb9Δ to find that Rpb9 contributes to the intragenic profiles of active transcription by influencing the probability of arrest of RNA polymerase II.
In the last decade, the importance of transcription elongation regulation has been brought into focus. Many factors have been associated with this key step of gene expression, and it has been proved that several biological processes are connected to this transcription phase, including response to stress, development and viral infections (1,2).
Chromatin immunoprecipitation (ChIP) (3,4) using antibodies against different phosphorylated forms of RNA polymerase II (RNA pol II) (5) enables the measurement of elongation rates and processivity (6). In addition, the combination of RNA pol II ChIP with DNA arrays and massive sequencing has provided pictures of the distribution of RNA pol II in several genomes (7). Studying transcription elongation in vivo has also involved the use of other techniques, including the depletion of the intracellular pools of ribonucleotide triphosphates by drugs like 6-azauracil (8) and mycophenolic acid (9), or the comparison of reporter genes of different lengths (10).
One of the drawbacks of the ChIP of RNA pol II is its lack of specificity against the active, elongation-competent form of the polymerase. In vitro studies have shown that RNA pol II often becomes arrested during elongation in the chromatin context (11), while molecular modeling has suggested that backtracking during elongation is indeed a frequent phenomenon in vivo (12).
The run-on technique has proved highly appropriate to deal with these issues. It enables the measurement of the density of actively transcribing RNA polymerases by labelling nascent mRNA in the presence of high salt and sarkosyl, which inhibits a new round of transcription initiation without affecting the elongation reaction (13). Global transcription analyses have been carried out by combining run-on with either DNA arrays hybridization (14) or massive sequencing (15). Using this genomic run-on (GRO) approach, we have recently shown that some functional gene categories are controlled at the elongation step by modulating the fraction of RNA polymerases that become inactive during transcription (16).
In the present work, we have used the run-on technique and a new type of custom-developed DNA arrays to quantitatively analyse the intragenic distribution of active RNA pol II. By probing run-on preparations with the DNA sequences of the two ends of a broad set of genes, we found that the 3′/5′ ratio of actively transcribing polymerases is gene-specific. Among the tested genes, those encoding structural ribosomal proteins (RP) showed the lowest 3′/5′ run-on ratios. We also measured these ratios under several conditions and in mutant backgrounds, and we detected a clear influence of some elements of the transcriptional machinery on the intragenic distribution of active RNA pol II.
All the strains, except W303 (MAT a ade2-1 can1-100 ura3-1 leu2-3,112 his3-11,15 trp1-1) (17) and its derivatives SET1-myc and set1AA-myc (18), were purchased from EUROSCARF and were isogenic to BY4741 (MAT a; his3Δ1; leu2Δ0; met15Δ0; ura3Δ0) with the corresponding gene substituted by a KANMX4 cassette.
Rich (YPD) and minimal (SC) media were used as previously described (19). Osmotic shock was produced by adding NaCl at a final concentration of 0.5 M.
DNA macroarrays were produced in the Sección de Chips de DNA-S.C.S.I.E of the University of Valencia following the procedure described in (20). The double-stranded DNA probes of 300 bp obtained by PCR were printed onto positively charged nylon membranes using a BioGrid™ robot (BioRobotics), with 20–30 ng of DNA spotted on each position.
Some of the probes were amplified using genomic DNA as a template, while some others were amplified using plasmids containing the individual ORFs (20). The primers used (Supplementary Table SI) are available upon request.
A Hybond N+ membrane (Amersham) was placed into a PR600 Slot Blot device (Hoefer Scientific Instruments). Each well of the device was loaded with 500 ng of dsDNA probes diluted in 0.5 ml 0.4 M NaOH 200 mM EDTA, previously boiled to denature the DNA. The solution was forced through the membrane by applying a vacuum pump to the slot blot device. The membrane was then washed by rinsing it in 0.6 M NaCl 60 mM Na citrate (SSC 2×). Probes were obtained by PCR using the primers listed in Supplementary Table SII.
Run-on assays were performed as described in (21) with minor modifications. Briefly, 50 ml of yeast culture were collected at OD600 0.5 by centrifugation at 4°C. Cells were washed in 5 ml cold TMN solution (10 mM Tris–HCl pH 7.4, 5 mM MgCl2, 10 mM NaCl), centrifuged again and resuspended in 950 µl cold H2O. Next, 50 µl of 10% sarkosyl were added to a final concentration of 0.5%. Cells were incubated in this solution for 20 min at 4°C for permeabilization. Afterward, cells were centrifuged and the supernatant was completely removed. The transcription reaction was performed in 150 µl of transcription buffer (50 mM Tris–HCl pH 7.9, 100 mM KCl, 5 mM MgCl2, 1 mM MnCl2, 2 mM dithiothreitol, ATP, GTP and CTP 0.5 mM each and 100 µCi [α-33P]UTP (3000 Ci/mmol)). The mix was incubated for 3 min at 30°C. The reaction was stopped by adding 1 ml cold TMN solution. RNA was immediately extracted following the acid-phenol protocol (22).
ChIP experiments were performed as previously described (23) with minor modifications. Briefly, 50 ml of yeast culture were collected at OD600 0.5. Crosslinking was performed by adding formaldehyde to 1% to the culture and by incubating at room temperature for 15 min. Then 2.5 ml 2.5 M glycine were added and the culture was incubated for 5 min. Cells were then harvested and washed four times with 25 ml Tris–saline buffer (150 mM NaCl 20 mM Tris–HCl pH 7.5) at 4°C. Cell breakage was performed in 300 µl of lysis buffer (see the above reference) with glass beads, and cell extracts were sonicated in a Bioruptor sonicator (Diagenode) for 30 min in 30 s on/30 s off cycles (shearing the chromatin to an average size of 300 bp). Immunoprecipitation was performed with magnetic beads coated with pan anti-IgG antibodies (Dynal), which were incubated with the 8WG16 monoclonal antibody beforehand.
Finally, real-time PCR reactions of 25 µl were performed to quantify immunoprecipitation using a dilution of 1:1500 for the input samples and another of 1:10 for the immunoprecipitated samples. Immunoprecipitation was defined as the ratio of each specific probe product in relation to that of a non-transcribed region (chromosome V, coordinates 9716 to 9863). The primers used are listed in Supplementary Table SIII.
For ChIP on chip purposes, the immunoprecipitated DNA was amplified and labelled following the procedure described in (24).
First, 50–20 ng of yeast genomic DNA was diluted in 35 µl H2O and then boiled. Afterward, 5 µl of the 10× random hexanucleotides mix (ROCHE), 5 µl of a solution containing dATP, dGTP, dTTP, 0.5 mM each, 20 µCi [α-32P]dCTP (3000 Ci/mmol) and 2 U of Klenow polymerase were added to a final volume of 50 µl. The reaction mix was incubated for 1 h at 37°C. Non incorporated nucleotides were eliminated by purifying the sample through a Sephadex G-50 column.
Radio-labelled RNA from the run-on was fragmented and denatured prior to hybridization by adding NaOH to the sample at a final concentration of 50 mM and by incubating for 5 min on ice. Fragmentation was stopped by adding HCl to a concentration of 50 mM. Genomic DNA was denatured before hybridization by boiling it for 5 min. Hybridization with labelled RNA (run-on) or DNA (ChIP or genomic DNA) samples was performed as described in (16). Membranes were exposed in Fuji BAS screens for 5–7 days and were developed with a FUJIX FLA3000 device. Signals were quantified using the Array Vision software, version 8.0 (Imaging Research Inc.).
Only those spots with a signal 1.3 times over the background were considered. After this filtering and background subtraction, the average signal of the two replicas of each probe was only obtained if both values were over the threshold value. For RNA hybridizations, this value was corrected by the number of uracils present in the coding strand of the probe. Then the 3′/5′ ratio was obtained. Similarly, genomic DNA signals were corrected for the numbers of cytidines present in the probes. The run-on ratio was divided by the genomic DNA ratio to correct probe to probe differences in DNA amount and hybridization efficiency. Finally, the log2 of this ratio was obtained. At least three replicas of each experiment were performed.
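The normalization steps described above can be sketched in a few lines of Python. This is only an illustrative reading of the procedure, not the software actually used: the function names (`probe_signal`, `log2_ratio`) and the input layout are assumptions, and the base count stands for the number of uracils (run-on RNA) or cytidines (genomic DNA) of each probe.

```python
import math

BACKGROUND_FACTOR = 1.3  # spots must exceed 1.3x the background, as stated in the text


def probe_signal(replicates, background, base_count):
    """Average the background-subtracted replicate signals of one probe and
    divide by the number of labelled bases (hypothetical helper).

    Returns None when any replicate fails the background threshold.
    """
    if any(r <= BACKGROUND_FACTOR * background for r in replicates):
        return None
    mean = sum(r - background for r in replicates) / len(replicates)
    return mean / base_count


def log2_ratio(run_on_3p, run_on_5p, gdna_3p, gdna_5p):
    """log2 of the run-on 3'/5' ratio divided by the genomic DNA 3'/5' ratio."""
    signals = (run_on_3p, run_on_5p, gdna_3p, gdna_5p)
    if any(s is None for s in signals):
        return None  # the gene is discarded if any probe failed the filter
    run_on_ratio = run_on_3p / run_on_5p
    gdna_ratio = gdna_3p / gdna_5p
    return math.log2(run_on_ratio / gdna_ratio)
```

With this convention, a negative value means that more active polymerase is detected at the 5′ probe than at the 3′ probe of the gene.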
Comparisons between the results obtained for each strain or condition were analysed with a Student’s two-tailed paired t-test (25).
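As a minimal sketch of the paired comparison, the t statistic can be computed from the gene-by-gene differences between two conditions. The function name `paired_t_statistic` is hypothetical; the two-tailed P-value would then be read from a Student's t distribution with n − 1 degrees of freedom (for instance with `scipy.stats.ttest_rel`, which is not required by the code below).

```python
import math


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for the paired samples a and b,
    e.g. the log2(3'/5') ratios of the same genes in two strains."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    if var == 0:
        raise ValueError("differences have zero variance")
    t = mean / math.sqrt(var / n)
    return t, n - 1
```

Pairing by gene matters here because the ratios are gene-specific: the test asks whether each gene shifts between conditions, not whether the two pooled distributions differ.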
In order to determine the intragenic distribution of transcriptionally active RNA pol II in a large number of genes, we designed a DNA macroarray containing probes of about 300 bp corresponding to the 5′ and 3′ ends of 377 S. cerevisiae ORFs (Supplementary Table SI). We also designed a second array containing similar probes for a subset of 76 highly expressed genes (26) (Supplementary Table SI). We hybridized these membranes with RNA labelled by the run-on technique, so that the signal obtained in each probe was proportional to the density of the active RNA pol II present in this particular piece of the genome. Then we divided the signal obtained in the 3′ probe by the signal obtained in the 5′ probe of each transcriptional unit. This ratio was used as a parameter to reflect the intragenic distribution of the transcriptionally active, i.e. transcriptionally competent, RNA pol II in the genes represented in the membrane (see ‘Materials and Methods’ section for the normalization and quality control procedures). A log2 scale is used; therefore, positive values represent a higher density of active RNA polymerase at the 3′-end than at the 5′-end, while negative values mean that more active RNA polymerases are present at the 5′-end than at the 3′-end.
First, we tested the intragenic distribution of transcriptionally active RNA pol II in a BY4741 wild-type strain. The results of 261 transcriptional units were obtained after the quality controls and normalizations (see ‘Materials and Methods’ section). We did not obtain consistent signals (repetitively over the background threshold) for at least one of the two probes of the other 116 genes represented in the array. The average results are depicted in Figure 1A as a dot plot in a log2 scale with the standard deviations shown as error bars. The 3′/5′ ratio range covered more than 12 log2 units (from –5.49 in RPL25 to 6.66 in YRF1-2; Supplementary Table SIV) and fitted a normal distribution (Figure 1A, Supplementary Figure S2A and B). Conventional run-on assays of five highly expressed genes confirmed that the intragenic distribution of active RNA pol II was not uniform (Supplementary Figure S2C). Additional evidence indicating that the dispersion of the 3′/5′ results did not happen by chance was obtained by randomly pairing the 3′ probe signals with the 5′ probe signals of a particular experiment. The results of five random distributions were much more dispersed than the results of the real experiment (Supplementary Figure S3). Therefore, in spite of the background noise that the use of single probes might have introduced in these results, we conclude that the distribution of active transcription is gene-specific.
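The random-pairing control can be illustrated with a short Python sketch. It assumes that dispersion is summarized by the standard deviation of the log2 ratios, which the text does not specify, and the function names are hypothetical.

```python
import math
import random
import statistics


def log2_ratios(signals_3p, signals_5p):
    """log2(3'/5') for probe signals paired by position in the two lists."""
    return [math.log2(a / b) for a, b in zip(signals_3p, signals_5p)]


def shuffled_spreads(signals_3p, signals_5p, n_shuffles=5, seed=0):
    """Standard deviation of log2(3'/5') after randomly re-pairing the 3'
    signals with the 5' signals, repeated n_shuffles times."""
    rng = random.Random(seed)  # fixed seed so the control is reproducible
    spreads = []
    for _ in range(n_shuffles):
        shuffled = list(signals_3p)
        rng.shuffle(shuffled)
        spreads.append(statistics.stdev(log2_ratios(shuffled, signals_5p)))
    return spreads
```

The observed spread, `statistics.stdev(log2_ratios(signals_3p, signals_5p))`, is then compared with the values returned for the shuffled pairings.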
In order to understand the cause of these differences in the 3′/5′ run-on ratios, we studied whether there was a correlation between the 3′/5′ ratios and several gene properties. We found no correlation with either the G + C content (Supplementary Figure S4A) or the distance to the nearest telomere (Supplementary Figure S4B), although the subtelomeric YRF1 genes exhibited the highest 3′/5′ ratios (Supplementary Table SIV). Since the 5′ probes were situated inside the ORFs, we looked for a possible correlation between the 3′/5′ run-on ratios and the length of the 5′ UTRs. By using the data of the transcription start sites published by Zhang and Dietrich (27), we found no correlation between these two parameters (Supplementary Figure S4C). The data also dismissed a significant influence of overlapping cryptic unstable transcripts (CUTs) or stable unannotated transcripts (SUTs) (28) on the 3′/5′ run-on ratios (Supplementary Figure S4D and E). We observed a slightly positive correlation of the 3′/5′ run-on ratios with the length of the transcription unit (Supplementary Figure S4F) and a weak negative correlation with the expression level according to Holstege et al. (26) (Supplementary Figure S4G). We also found that the average 3′/5′ ratios of intron-containing genes were slightly lower than those of intron-less genes (Supplementary Figure S4H).
These results suggest that short, highly expressed, intron-containing genes should have the lowest average 3′/5′ run-on ratios. The functional group of genes that best fitted these three features is that of the RP genes. The positive correlation between gene length and the 3′/5′ run-on ratio, and the negative correlation between this and the expression level, diminished when the RP genes were excluded (Supplementary Figure S4F and G), while the average 3′/5′ run-on ratio of the intron-containing genes, once non-RP genes were excluded, was even more significantly different to intron-less genes (Supplementary Figure S4H). Accordingly, we found an overrepresentation of the RP genes in the minimal values of the 3′/5′ ratios distribution (Figure 1A; Supplementary Figure S2A), where the average 3′/5′ ratio (−1.351 log2 units) was significantly lower than that of those genes without a functional link to the ribosomes (0.375 log2 units; P = 4.34 × 10–6; Figure 1A). The low 3′/5′ run-on ratio is a very specific feature of those genes encoding structural RPs since other ribosome-related genes belonging to the RiBi regulon (29) displayed a scattered distribution of 3′/5′ ratios, while their average ratio (0.74 log2 units) did not show a statistically significant difference to the non ribosome-related genes (P = 3.04 × 10–1; Figure 1A and Supplementary Figure S2A). Gene ontology analysis of the data, using the Fatigo application from Babelomics (30), revealed that RP-related categories were the only ones exhibiting a significantly biased distribution of active transcription. ‘Structural constituent of ribosomes’ was the item showing the most significant adjusted P-value (2.68 × 10–6).
Run-on assays measure the density of the actively elongating RNA polymerases present in a given sequence, but they do not detect enzymes that are not competent for elongation because they have not yet initiated, have already terminated, or have been arrested and backtracked during elongation (13). In order to test whether the different run-on profiles of the genes were related to different proportions of active RNA pol II (active versus total), we performed ChIP on chip experiments using the 8WG16 antibody, which recognizes the carboxy-terminal domain of the largest RNA pol II subunit (5), and the same 5′–3′ macroarrays that we used in the run-on experiments, a protocol that we called RPCC (RNA Pol ChIP on Chip) in a previous paper (16). The 3′/5′ ChIP ratios ranged between –3.35 (RPL24B) and 2.99 (GAL7), a considerably narrower range than the one we found for the run-on ratios (Figure 1B). Of the analysed genes, 80% showed ChIP ratios close to 1 (between 0.5 and 2; Supplementary Figure S2B), whereas only 53% exhibited run-on ratios within this range (Supplementary Figure S2A). The biological meaning of this difference is uncertain and it might be related to the shorter dynamic range of RPCC vs GRO (16).
RP genes were also overrepresented in the lowest part of the 3′/5′ ChIP ratios distribution, and the average ChIP ratio for RP genes (−0.91 log2 units) differed even more significantly from that of the non ribosomal genes (0.28 log2 units; P = 2.04 × 10–11) than the average run-on ratio did (Figure 1B). However, the overall correlation between run-on and ChIP ratios was poor (Figure 1C) and exhibited a Pearson’s coefficient of only 0.375. These results suggest that the 3′/5′ run-on ratio is not just a direct consequence of the distribution of total RNA pol II within the genes, but reflects a combination of this parameter with others that contribute to set the proportion of active enzyme along the gene. We also measured the distribution of RNA pol II by ChIP and quantitative PCR in the individual genes tested before. The difference between the run-on and ChIP profiles was clear in most cases (Supplementary Figure S2C). We conclude that RNA pol II has the tendency to become transcriptionally inactive by becoming arrested during elongation in certain gene contexts.
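For reference, the Pearson coefficient used to compare the run-on and ChIP ratios can be computed as in the following sketch. The function name `pearson_r` and the assumption that both lists hold log2(3′/5′) values for the same genes in the same order are illustrative choices, not details taken from the analysis itself.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists,
    e.g. run-on and ChIP log2(3'/5') ratios of the same genes."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the samples is constant")
    return sxy / math.sqrt(sxx * syy)
```

A coefficient near 1 would mean that the run-on profile simply mirrors the distribution of total polymerase; a value such as 0.375 leaves most of the gene-to-gene variation in run-on ratios unexplained by ChIP.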
In spite of the low overall correlation between the ChIP and run-on ratios, some functional categories, like RP and RiBi, presented more significant correlations (Figure 1D). This suggests that the distribution of active RNA pol II within a particular gene could be affected by its regulatory mechanisms. To test this hypothesis, we investigated whether the run-on ratios of a subset of 76 highly expressed genes globally respond to physiological changes. First, we compared the distribution of the RNA pol II in cells grown in a rich medium (YPD) and in a minimal medium (SC) for two wild-type strains: BY4741 and W303 (Supplementary Figure S5A and B). In none of the comparisons made in this work did we find a homogeneous behaviour of all the genes represented in the array, as there was always a fraction of genes that increased the ratios while another fraction decreased them. Therefore, the same absolute value of average increment can reflect a modest, but very general effect on the gene population, or a stronger effect in a subpopulation of genes. To display this information, we presented the data using box-and-whisker diagrams in which the minimum, the maximum, the average, the median and the 25th and 75th percentiles are all represented. To test whether the two data distributions significantly differed, we performed Student’s two-tailed t-tests for paired samples. In BY4741, there was an apparently slight displacement of active polymerases toward the 3′ end of the transcription units in the minimal medium (Figure 2). The average log2 (3′/5′ ratio) was −0.29 for the minimal medium and −0.42 for the rich medium (Table 1). In W303, a small shift was observed in the opposite direction, toward the 5′ end (with averages of 0.09 and −0.01 for the rich and the minimal media, respectively). However, at the level of confidence that we set for this work (P <0.01), these changes are not statistically significant (Figure 2, Table 1).
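The six statistics shown in the box-and-whisker diagrams can be obtained as in this short sketch. The function name `box_summary` is hypothetical, and the percentile convention (the default "exclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not state which one was used.

```python
import statistics


def box_summary(values):
    """Return the statistics drawn in a box-and-whisker diagram for a list
    of log2(3'/5') ratios."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    # Quartiles with the default 'exclusive' method (assumed convention).
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": statistics.fmean(values),
        "q3": q3,
        "max": max(values),
    }
```

Showing the mean together with the median and the quartiles helps to distinguish a small shift shared by most genes from a large shift confined to a subpopulation, which would move the mean and one whisker more than the median.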
We then investigated the effect of osmotic stress on the intragenic distribution of active RNA pol II. We performed run-on assays on a BY4741 strain with or without adding NaCl at a final concentration of 0.5 M and after a 20-min incubation. No general change was observed in the 3′/5′ ratios (Supplementary Figure S5C). The difference between the averages of both conditions was only 0.12 log2 units (−0.29 and −0.41 for the non-treated and the NaCl-treated cultures, respectively), with no statistical significance (Figure 2, Table 1). We conclude that the physiological stimuli tested do not have a general effect on the intragenic distribution of active RNA pol II.
We then compared the 3′/5′ run-on ratios of the two genetic backgrounds studied. The general distribution in both strains was conserved. Those genes with a high 3′/5′ run-on ratio in the BY4741 background also showed a high ratio in the W303 background and the transcription units with a low 3′/5′ run-on ratio in BY4741 also presented a low one in W303 (Supplementary Figure S5D). The distribution of the cells grown in the YPD medium showed a high Pearson’s correlation coefficient (r = 0.806). However, the average distribution of active polymerases shifted to the 3′-end in W303 in relation to BY4741. The average 3′/5′ ratios were −0.33 log2 units for BY4741 and 0.15 for W303, with a very significant statistical difference (P = 2.39 × 10–6; Figure 2, Table 1). Very similar results were obtained in the minimal medium (Figure 2, Supplementary Figure S5E). In this case, the correlation between the two series of data was even higher (r = 0.902) and the difference between the averages of both strains was 0.28 logarithmic units, while the P-value in the Student’s t-test was 5.25 × 10–5 (Table 1). We conclude that the genetic background has a general influence on the intragenic distribution of active RNA pol II.
In order to test whether a single-gene mutation can alter the intragenic distribution of active pol II, we performed a series of run-on assays with an spt4Δ mutant, which lacks one of the two subunits of the well-known transcription elongation factor DSIF (31–35). Figure 3A shows a typical run-on hybridization experiment to compare a wild-type BY4741 with an isogenic spt4Δ mutant. A visual inspection reveals that the 3′/5′ ratio in several genes is lower in the mutant than in the wild type. The quantification of the ratio in both strains confirmed this observation (Figure 3B). The statistical analysis of the data showed the two distributions to be different (P = 8.38 × 10–4) with a log2 average difference of −0.31 units (Table 1).
Given this result, we wondered whether the difference observed between BY4741 and W303 might be due to a single mutation. One of the genetic specificities of the W303 background is the SSD1 gene, which is involved in several aspects of transcription and mRNA biogenesis (36–41). The BY4741 strain bears the dominant SSD1-v allele, whereas the W303 background possesses the ssd1-d allele that encodes a truncated form of the Ssd1 protein (42). We analysed the 3′/5′ run-on ratios in an ssd1Δ strain (BY4741 background). The results obtained showed a clear shift of the run-on signal toward the 3′-end in ssd1Δ (a difference of 0.39 log2 units and a P-value of 6.20 × 10–3) (Figure 3C and D). A comparison between the ssd1Δ mutant in the BY4741 background and the wild-type W303 strain revealed no significant difference, with average ratios of 0.09 and 0.15 log2 units, respectively, and a P-value of 6.87 × 10–1 (Supplementary Figure S6A). We conclude that the ssd1 mutation in W303 is the most likely reason for its different RNA pol II intragenic distribution, as compared with BY4741.
These results might suggest that any alteration of the RNA pol II machinery has a significant unspecific impact on the intragenic profiles of active transcription. Alternatively, Spt4 and Ssd1 might belong to a more select group of elongation factors that are involved in shaping these profiles. In order to test these hypotheses, we decided to carry out a more systematic genetic analysis of the intragenic distribution of active RNA pol II. We analysed the 3′/5′ run-on ratios in a collection of mutants defective for the elements involved in RNA pol II transcription elongation. First, we studied mutants lacking one of the two non essential subunits of the RNA pol II, namely Rpb4 and Rpb9 (43,44). On the one hand, the deletion of RPB4 resulted in a slight shift of the active RNA pol II distribution toward the 5′-end of −0.27 log2 units (Figures 3D and Supplementary S6B); however, this shift was not statistically significant (P = 1.53 × 10–2; Table 1). On the other hand, the deletion of RPB9 led to a very marked change in the distribution of active RNA pol II in the opposite direction (Figures 3D and Supplementary S6C). The average 3′/5′ ratio in the rpb9Δ mutant was 0.76 log2 units, while it was −0.27 in the wild type, yielding an extremely significant P-value of 1.68 × 10–12 (Table 1).
The next mutant we studied was bur2Δ. It lacks a component of the BUR cyclin-kinase complex, which phosphorylates not only the carboxy-terminal domain of the DSIF component Spt5 (45,46), but possibly also the CTD of Rpb1 (47–49). The deletion of BUR2 led to a general shift of active RNA pol II toward the 3′-end of the transcription units (Figure 3D and Supplementary S6D), presenting a statistically significant average difference of 0.33 log2 units (P = 3.00 × 10–3; Table 1).
The lack of TFIIS/Dst1, a cleavage factor which rescues RNA pol II from arrested situations (50–52), led to an accumulation of the run-on signal toward the 5′-end of the transcribed regions (Figures 3D and Supplementary S6E). This alteration is statistically significant according to the Student’s t-test (P = 8.61 × 10–3; Table 1) despite being slight (an average log2 difference of −0.21 units).
We have recently reported that mutants lacking subunits of the Mediator (53) and the CCR4–Not complex (54) show transcription elongation defects both in vitro and in vivo (55). The med2Δ mutant, lacking a subunit of the Mediator tail domain, produced a clearly significant (P = 3.17 × 10–4) displacement of the run-on signal toward the 5′-end with an average difference of −0.47 log2 units (Figures 3D and Supplementary S6F, Table 1). With regard to the CCR4–Not complex, we studied ccr4Δ and not5Δ. Although the not5Δ mutant presented no significant differences with the wild-type strain (Figures 3D and Supplementary S6G, Table 1), the ccr4Δ mutant was clearly affected (Figures 3D and Supplementary S6H) and showed a general accumulation of the run-on signal at the 3′-ends of the transcription units (increment of 0.65 log2 units, P = 3.70 × 10–4; Table 1).
Next, we decided to analyse some of the mutations affecting the chromatin-related factors involved in the covalent modifications of histones. We chose spt7Δ and spt20Δ which lack subunits of SAGA (56,57); cdc73Δ and rtf1Δ which lack subunits of the PAF complex (58,59); set1Δ which lacks the catalytic subunit of the COMPASS H3-K4 methylase complex (60,61); and bre1Δ which lacks the E3 ubiquitin ligase involved in H2B ubiquitination (62). We found no significant difference in the two SAGA mutants when compared with the wild type (Figures 3D and Supplementary S6I-J, Table 1). The same happened with cdc73Δ, rtf1Δ and set1Δ (Figures 3D and Supplementary S6K-M, Table 1). In contrast, bre1Δ presented a significant decrease in the average log2 (3′/5′ ratio) when compared with the wild type (−0.55 versus −0.23 with a P-value of 7.93 × 10–3) (Figures 3D and Supplementary S6N and Table 1).
H2B ubiquitination and H3-K4 methylation are closely connected as the former is a covalent modification required for H3-K4 di- and tri-methylation (18,63). Since set1Δ abolishes any kind of H3-K4 methylation and bre1Δ only eliminates H3-K4 di- and tri-methylation, we tested set1AA, a point mutation in an RRM domain of Set1 that almost abolishes H3-K4 tri-methylation, but accumulates significant levels of dimethylation at the 5′-end of the transcribed region (64). The analysis of the set1AA mutant shows a clear shift of the run-on signal toward the 5′-end (increment of −0.40 log2 units, P = 1.15 × 10–4; Figures 3D and Supplementary S6O, Table 1), which suggests that the imbalance between tri- and mono-/di-methylation in the transcribed regions brings about an aberrant intragenic distribution of active RNA pol II.
Finally, we investigated the consequences of two mutations related to translation, fun12Δ and fes1Δ, which somehow cause transcription elongation-related phenotypes (55). FUN12 encodes a translation initiation factor (65). Its deletion brought about an important change in the 3′/5′ run-on ratio: the average 3′/5′ ratio was −0.07 log2 units, as opposed to the −0.40 log2 units of the wild type (P = 5.02 × 10–3; Figures 3D and Supplementary S6P, Table 1). The last factor studied, Fes1, is the nucleotide exchange factor of the Ssa1 and Ssb1 chaperones (66,67), both of which are associated with elongating ribosomes. This mutation had the opposite effect on the active RNA pol II distribution (Figures 3D and Supplementary S6Q, Table 1), since it produced a lower average 3′/5′ ratio than the wild type (−0.84 versus −0.47 log2 units, P = 3.17 × 10–4). These results reinforce the connection of these two translation-related proteins to transcription elongation.
The genetic analysis carried out with the array of 76 highly expressed genes indicates that the intragenic profiles of active transcription do not respond to all the alterations of the transcriptional machinery, but to a subset of elongation-related factors. In order to investigate in depth rpb9Δ, which presented the most extreme alteration in the 3′/5′ run-on ratio, we repeated the analysis using the extended array that we had previously used for characterizing the 3′/5′ run-on ratios in the wild type. The results confirmed that the shift in the run-on ratios produced by rpb9Δ was highly significant (0.66 log2 units on average, P = 2.54 × 10–31) (Figure 4A). The effect is general (r = 0.888), including both the RP (0.80 log2 units on average) and the RiBi genes (0.73 log2 units; Figure 4A). We also analysed some individual genes in detail. The results confirmed a stronger decrease of active RNA pol II density at the 5′-end of HXK2 and HXT1 in rpb9Δ in relation to the wild type than at the 3′ end (Figure 4B). In contrast, we detected neither comparable changes in the distribution of total RNA pol II by ChIP along these two genes (Figure 4B) nor general changes in the 3′/5′ ratios of the total RNA pol II for the genes present in the array. As shown in Figure 4C, the variation in the ChIP ratios produced by rpb9Δ was in the opposite direction to the change in run-on ratios and much less significant. The RP genes were the exception, since their average 3′/5′ ratio slightly increased (0.26 log2 units), unlike the slight decrease noted (−0.14 log2 units) for non ribosome-related genes (P = 1.34 × 10–4; Table 1). This coordinated shift of run-on and ChIP ratios for the RP genes gave rise to a higher correlation between run-on and ChIP ratios for this regulon, whereas the correlation for the rest of the genes did not increase (non ribosome-related), or even decreased (RiBi) (compare Figures 1C and 4D).
We conclude that rpb9Δ globally alters the intragenic distribution of active RNA pol II by changing the proportion of enzymes that become transiently arrested at the 5′-end of the transcribed region.
The use of the run-on technique in combination with DNA arrays has allowed us to study the distribution of transcriptionally competent polymerases in a broad set of yeast genes. The results obtained reveal that the intragenic distribution of active transcription varies vastly from gene to gene. CUTs or SUTs overlapping the probes do not explain this variation, since the number of genes affected in the array was very low (32 out of 377). Even after excluding 40 additional genes located in the vicinity of CUTs and SUTs (<200 bp away from any of the probes present in the array) the run-on ratios were clearly gene-specific (Supplementary Figure S4E). The effect of the tested mutants on the run-on ratios was also unaffected by the exclusion of CUTs and SUTs overlapping genes (Supplementary Figure S6R). The intragenic distribution of active transcription does not correlate extensively with any of the other parameters that we analysed (length, G + C content, expression level, presence of introns). It seems to be an intrinsic feature of each transcription unit as it is roughly conserved in all the mutants and under all study conditions (Supplementary Figure S7A).
When we compare the distributions of RNA pol II determined by run-on and ChIP, we observe a much more uniform profile by ChIP, with the majority of genes showing 3′/5′ ChIP ratios close to 1. The detailed analysis of individual genes shows an almost constant distribution of total RNA pol II along their bodies, whereas their run-on profiles differ markedly from gene to gene and fluctuate considerably along the transcription units. The more homogeneous patterns produced by ChIP are unlikely to be merely a consequence of the lower resolution of this technique, since the RP genes (generally short and, therefore, more sensitive to cross-contamination between the 3′ and 5′ ChIP signals) presented the most biased ChIP ratios. Rather, we think that the simplest explanation for this phenomenon is the existence of arrested, transcriptionally inactive polymerases along the transcribed regions, which do not produce a run-on signal despite being detected by ChIP.
Although pausing and arrest seem to be common phenomena in each transcription cycle in vivo (68), we show that the tendency of RNA pol II to become arrested in an inactive state varies among genes and is likely influenced by the genomic environment. It is possible that nucleosome positioning, a genomic feature that is considerably gene-specific (69), increases the probability of RNA pol II becoming arrested. This hypothesis is in good agreement with the conclusions drawn from in vitro experiments showing a high tendency of RNA pol II to backtrack in front of a nucleosome (11).
Taking the whole gene as the unit of analysis, we recently reported that a significant proportion of the total RNA pol II that is engaged in transcription under standard growth conditions is not detectable in a GRO assay (16). We showed that this accumulation of inactive RNA pol II is especially relevant in the RP genes (16), which, according to the present work, exhibit the most 5′-biased distribution of total RNA pol II. It is noteworthy in this respect that the transcribed regions of the RP genes show some specific chromatin features, including a significantly lower 5′ nucleosome spacing than the rest of the genome (70). Transcription through these highly packaged genes would involve a higher probability of pausing and arrest and a subsequent 5′ bias in the distribution of total RNA pol II. However, Weiner et al. (70) have provided evidence that transcriptional activity modulates chromatin packaging, and not vice versa. Accordingly, we failed to detect a significant quantitative correlation between the 3′/5′ run-on ratios and the corresponding ratios of sensitivity to micrococcal nuclease when we used the data kindly provided by O. Rando (results not shown). Therefore, if nucleosome positioning plays a role in restricting RNA pol II activity, additional chromatin elements should be involved. The genetic analysis carried out in the second part of this work sheds some light on this question (see below).
Gene specificity does not contradict coordinated regulation. The RP genes exhibit an exceptionally high positive correlation between the ChIP and run-on ratios in both the wild type and rpb9Δ. This fact suggests that, despite their differences, the intragenic distributions of total and active RNA pol II can be interdependent and that there should be a mechanism connecting them. This mechanism might be characteristic of each regulon. In this sense, and unlike the RP regulon, the RiBi regulon shows a higher correlation between ChIP and run-on ratios in the wild type than in rpb9Δ, which suggests that Rpb9 plays a more prominent role in linking the intragenic distributions of total and active RNA pol II in the RiBi genes. We have previously shown that the proportion of active versus total RNA pol II during elongation is differentially controlled in response to the carbon source in three regulons (RP, RiBi and mitochondrial), which represent >10% of yeast genes (16). Accordingly, the gene-specific profiles of active and total RNA pol II might be the consequence of the local effects caused by the regulatory mechanisms controlling transcription at the elongation level.
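The regulon-by-regulon comparison described above can be illustrated with a short sketch that computes a Pearson correlation between run-on and ChIP log2 ratios separately for each gene group. The grouping labels, gene names and values below are invented, and the helper names are hypothetical. The sketch assumes that one log2(3′/5′) value per gene is available for each technique.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)

def correlation_by_group(run_on, chip, groups):
    """Return {group label: Pearson r} for genes measured by both techniques.

    run_on and chip map gene -> log2(3'/5') ratio; groups maps gene -> label.
    Groups with fewer than three shared genes are skipped.
    """
    pairs = {}
    for gene, label in groups.items():
        if gene in run_on and gene in chip:
            xs, ys = pairs.setdefault(label, ([], []))
            xs.append(run_on[gene])
            ys.append(chip[gene])
    return {label: pearson_r(xs, ys)
            for label, (xs, ys) in pairs.items() if len(xs) >= 3}

# Invented values, for illustration only.
run_on = {"g1": -2.0, "g2": -1.0, "g3": 0.0, "g4": -1.0, "g5": 0.0, "g6": 1.0}
chip = {"g1": -1.0, "g2": -0.5, "g3": 0.0, "g4": 0.2, "g5": -0.2, "g6": 0.2}
groups = {"g1": "RP", "g2": "RP", "g3": "RP",
          "g4": "other", "g5": "other", "g6": "other"}

result = correlation_by_group(run_on, chip, groups)
```

In this toy input the first group is perfectly correlated and the second is uncorrelated, which mimics only the qualitative contrast between regulons and none of the measured coefficients.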
We have shown here that some single mutations can have general effects on the intragenic profiles of active transcription, whereas other mutations, including those affecting important general transcription factors like SAGA or the Paf complex, do not significantly affect the 3′/5′ run-on ratios. All the tested mutations were viable and, therefore, their average effects on the 3′/5′ run-on ratios were modest (1 log2 unit at the most; Table 1). These moderate alterations of the active RNA pol II profiles are fully compatible with normal cell growth under laboratory conditions. W303, for instance, which has been considered the reference strain in many gene-expression studies, displays significantly higher 3′/5′ run-on ratios than BY4741. We have shown that this difference is likely due to the presence in W303 of a recessive allele (ssd1-d) that encodes a truncated form of Ssd1 lacking an RNase II-like domain (42). Ssd1 participates in mRNA processing (39,40), binds RNA (38) and physically interacts with the hyperphosphorylated form of the RNA pol II CTD (71). Our results indicate an important role of Ssd1 during transcription elongation and help explain why some gene expression-related phenotypes are specific to the W303 background (37,41).
The results of our genetic analysis confirm the involvement in transcription elongation of some protein factors (Ccr4, Fun12 and Fes1) whose molecular connection with the transcriptional machinery is uncertain (55). In other cases, previous knowledge of the function of the factors allows us to approach the molecular basis of this phenomenon. This is the case for bre1Δ and set1AA, which produce a shift in the distribution of active RNA pol II toward the 5′-end of the genes. The deletion of BRE1 abolishes H2B ubiquitination and, subsequently, H3-K4 di- and trimethylation (18,63), whereas set1AA causes a defect in H3-K4 trimethylation without eliminating dimethylation (64). In contrast, the deletion of SET1, which abolishes all forms of H3-K4 methylation, does not produce any significant shift in the intragenic localization of active RNA pol II. We propose that the balance between the different forms of H3-K4 methylation (mono-, di- and trimethylation) regulates RNA pol II activity during transcription elongation, either directly or by affecting additional chromatin modifications (72).
The other mutations that produce general shifts in the 3′/5′ run-on ratios affect general transcription factors that interact directly with RNA pol II. Of these, the deletion of the polymerase subunit Rpb9 is particularly striking. The comparison of the run-on results with those produced by ChIP indicates the presence of transcriptionally incompetent polymerases in the 5′ half of the genes studied, suggesting a higher tendency of Rpb9-defective polymerases to become arrested in that particular region. Such arrested polymerases would need TFIIS to resume transcription, which is consistent with the synthetic lethality exhibited by rpb9Δ and dst1Δ (73). However, unlike rpb9Δ, dst1Δ does not produce a 3′-biased profile of active transcription, but a weaker bias toward the 5′-end. This result indicates that wild-type RNA pol II does not tend to become preferentially inactive at the 5′-end, as it does in the absence of Rpb9, but is inactivated more modestly at the 3′-end, where the inactivation may be efficiently compensated by the action of Dst1.
The deletion of the cyclin Bur2 also displaces the distribution of active RNA pol II toward the 3′-end. The main function of the Bur1-Bur2 cyclin-dependent kinase is to phosphorylate the Spt5 subunit of yDSIF (45,46). This phosphorylation is essential for overcoming the early elongation pause. It is therefore conceivable that a lack of Spt5 phosphorylation produces a higher frequency of RNA pol II arrest at the 5′-end, hence lowering the run-on signal in this region. This is in agreement with the results obtained by Chu et al. (74) in PMA1, where bur2Δ has been seen to bring about a decrease in histone H3-K36 trimethylation, a chromatin mark of transcription elongation, at the 5′-end of this gene without affecting the total amount of RNA pol II present in this region. If this hypothesis is true, and given that spt4Δ produces a bias of active transcription in the opposite direction, the two subunits of yDSIF would play distinct, complementary roles during elongation. Although we favour this interpretation, we cannot rule out that the deletion of Bur2 perturbs the intragenic distribution of active RNA pol II because it affects the phosphorylation status of the Rpb1 CTD (49).
The last intriguing finding of our genetic analysis of active RNA polymerase distribution is the marked 5′-bias produced by med2Δ. The Med2 protein belongs to the Mediator complex, which was originally described for its role in the regulation of pre-initiation complex assembly. Following a completely different experimental approach, we have already linked the Mediator complex with the elongation phase (55). Human Mediator has also been connected to post-initiation transcription, since it was found to be involved in the DSIF-dependent activation of transcription (75). This fact suggests a mechanism by which Mediator might indirectly modulate the intragenic distribution of active RNA pol II.
As part of the general tendency of each gene to maintain a characteristic 3′/5′ run-on ratio in all backgrounds and under all conditions, there is a certain degree of overlap among the genes when comparing all the average ratios obtained for the 18 mutants tested (Supplementary Figure S7A). As expected, the clustering analysis of these data recapitulates the functional relationship of some mutants (set1AA and set1Δ) or their average effect on the run-on ratios (rpb9Δ, ccr4Δ and ssd1Δ) (Supplementary Figure S9C). With some minor exceptions, all the genes present in the array were influenced in their 3′/5′ run-on ratios by the mutants tested, exhibiting a variation of at least 1 log2 unit in at least one of the mutant strains (Supplementary Figure S7B). Some genes tended to shift their run-on ratio toward the 3′-end in the mutants analysed, whereas others did so toward the 5′-end (Supplementary Figure S7B). When we plot, for each gene, the average difference between the mutant and wild-type 3′/5′ run-on ratios against its 3′/5′ run-on ratio in the wild type, we see that most genes exhibiting significantly biased distributions of active RNA pol II in the wild type tend to equilibrate them in the mutant backgrounds (Figure 5). We can conclude that the default situation is the unbiased distribution of active transcription and that transcription elongation factors are collectively involved in shaping the profile of active RNA pol II along the transcribed region. Accordingly, gene-specific profiles would be the result of combining local chromatin with the action of the transcription elongation machinery in response to gene-specific regulatory signals. The recent development of more sensitive high-resolution methods of genomic run-on (15) should facilitate the testing of this model.
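The comparison summarized in Figure 5 plots, for each gene, the average change in log2(3′/5′) across mutants against the wild-type log2(3′/5′) ratio. One simple way to describe the trend in such a plot is the slope of an ordinary least-squares line. This choice is an assumption made here for illustration and is not stated to be the analysis behind the figure. The function name and the values below are hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    return sxy / sxx

# Invented per-gene values, for illustration only.
# x: log2(3'/5') run-on ratio in the wild type
# y: average (mutant minus wild type) change in that ratio across mutants
wt_ratio = [-2.0, -1.0, 0.0, 1.0, 2.0]
mean_increment = [1.0, 0.5, 0.0, -0.5, -1.0]

slope = least_squares_slope(wt_ratio, mean_increment)
print(slope)  # -0.5
```

A negative slope means that genes with a 5′-biased wild-type profile tend to move toward the 3′-end in the mutants and vice versa, which is the pattern described as equilibration in the text.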
Spanish Ministry of Education and Science (BFU2007-67575-CO3-01/BMC to J.E.P.-O. and BFU2007-67575-CO3-02/BMC to S.C.); Regional Valencian Government (ACOMP2009/368 to J.E.P.-O.); Regional Andalusian Government (P07-CVI02623 to S.C.). A.R.-G. was covered by a fellowship from the University of Seville. Funding for open access charge: Andalusian Government.
Conflict of interest statement. None declared.
Supplementary Data are available at NAR Online.
The authors thank O. Rando for sharing data, F. Posas, E. de Nadal and the members of our labs for comments and advice, and Helen Warburton for English corrections.