Genome-wide association of Pol III transcription factors. To obtain a genome-wide profile of occupancy by the Pol III machinery, we investigated DNA association in vivo by proteins representing each of the three multisubunit components of the Pol III transcription apparatus: TFIIIB, TFIIIC, and Pol III itself. Chromatin immunoprecipitation was performed on cells grown in synthetic complete medium, using antibodies against the (HA)3-tagged TFIIIB subunit Brf1, the TFIIIB subunit Bdp1, the TFIIIC subunit Tfc4, and the Pol III subunit Rpc34. Input and immunoprecipitated samples were amplified and then applied to a microarray spotted in duplicate with 6,528 PCR products corresponding to almost all yeast intergenic sequences. A fraction of the nonamplified immunoprecipitated material from each experiment was reserved for quantitative analysis of specific target genes.
With a few notable exceptions (see below), the occupancy profiles of the four factors are extremely similar (Fig. ). To assess the relative occupancies by the different factors, we analyzed the 1,200 spots for which at least one of the four factors is present at least twofold over the median ratio. When the median ratio of Bdp1 occupancy to Brf1 occupancy at these 1,200 spots is set to 1.0, the standard deviation of values around this median is 0.63, indicating good cooccupancy, as expected, by these two subunits of TFIIIB. Bdp1 occupancy is also well correlated with Rpc34 occupancy, with a standard deviation of 0.75. The Tfc4 distribution is considerably broader; when the median ratio of Tfc4 occupancy to Rpc34 is set to 1.0, the standard deviation is 1.98. Similarly, the standard deviation of the Tfc4 to Bdp1 ratio is 2.26. Thus, although Tfc4 is required by all Pol III genes, the amount of Tfc4 relative to other factors is somewhat variable across different Pol III genes (several extreme examples will be discussed in a later section). These observations suggest that, in general, the Pol III factors associate with genomic sequences in the context of the intact Pol III machinery.
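The pairwise comparison described above can be illustrated with a minimal sketch. It assumes that the ratios are handled on a linear scale and that the spread is the root-mean-square deviation around the rescaled median of 1.0; the text does not state either detail, and the function name and input values are hypothetical.

```python
import math
import statistics


def ratio_spread(occ_a, occ_b):
    """Spread of the occ_a/occ_b ratios after scaling their median to 1.0.

    Assumption: linear-scale ratios and root-mean-square deviation around
    the median; the source does not specify how the value was computed.
    """
    ratios = [a / b for a, b in zip(occ_a, occ_b)]
    med = statistics.median(ratios)
    normalized = [r / med for r in ratios]
    # After scaling, the median is 1.0, so deviations are taken around 1.0.
    return math.sqrt(sum((r - 1.0) ** 2 for r in normalized) / len(normalized))


# Hypothetical IP/input ratios for five spots (not data from this study).
bdp1 = [4.0, 6.0, 10.0, 3.0, 8.0]
brf1 = [2.0, 3.0, 5.0, 2.0, 4.0]
spread = ratio_spread(bdp1, brf1)
```

A small value indicates that the two factors track each other closely across spots, as reported for Bdp1 and Brf1; a larger value corresponds to the broader distribution reported for Tfc4.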
The Pol III machinery associates with all tRNA genes, but with a fivefold range in occupancy. As a first step to define targets of the Pol III machinery, we identified 633 spots on the array that yield immunoprecipitation/input ratios that are at least twice the median for all four factors. As expected, the targets of the Pol III machinery are overwhelmingly tRNA promoters or sequences adjacent to the 3′ termini of tRNAs. Because of the small size (71 to 133 bp) of tRNA genes and the resolution limitations of the array experiment, we obtained an extremely high signal for any sequence abutting a tRNA gene, regardless of whether it is upstream or downstream of the tRNA gene. A given tRNA therefore typically contributes to a positive signal in at least two spots. In a few instances, several closely spaced tRNAs are represented by a single spot on the microarray; their individual contributions to the occupancy value of the spot cannot be distinguished by the array alone. A total of 232 tRNAs are represented by distinct spots on our array and yielded high-quality data for all replicates of the four factors we tested. Of these, 217 (94%) yielded a Pol III signal in the top 10% of microarray spots in our overall ranking. Among these is the tX(XXX)D tRNA gene, for which no anticodon has been annotated at SGD. Three tRNAs are unexpectedly low (in the bottom third) on our ranked list of Brf1-occupied sequences, but direct analysis reveals these to be significantly associated with Pol III transcription factors and, hence, false negatives on the array. Taken together, our results indicate that the Pol III machinery associates with all tRNA genes.
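The selection criterion used to define the 633 candidate spots (a ratio at least twice the median for all four factors) can be sketched as follows. The data layout, the function name, and the spot identifiers are illustrative assumptions, not the authors' analysis code.

```python
import statistics

FACTORS = ("Brf1", "Bdp1", "Tfc4", "Rpc34")


def select_targets(spots, fold=2.0):
    """Return IDs of spots whose IP/input ratio is at least `fold` times
    the array-wide median for every factor.

    `spots` maps a spot ID to a dict of factor -> IP/input ratio
    (a hypothetical layout chosen for this example).
    """
    medians = {
        f: statistics.median(ratios[f] for ratios in spots.values())
        for f in FACTORS
    }
    return sorted(
        spot_id
        for spot_id, ratios in spots.items()
        if all(ratios[f] >= fold * medians[f] for f in FACTORS)
    )


# Hypothetical ratios for five spots (not data from this study).
spots = {
    "iA": {"Brf1": 0.8, "Bdp1": 0.8, "Tfc4": 0.8, "Rpc34": 0.8},
    "iB": {"Brf1": 1.0, "Bdp1": 1.0, "Tfc4": 1.0, "Rpc34": 1.0},
    "iC": {"Brf1": 1.1, "Bdp1": 1.1, "Tfc4": 1.1, "Rpc34": 1.1},
    "iD": {"Brf1": 5.0, "Bdp1": 5.0, "Tfc4": 5.0, "Rpc34": 5.0},
    "iE": {"Brf1": 1.0, "Bdp1": 1.0, "Tfc4": 6.0, "Rpc34": 1.0},
}
targets = select_targets(spots)
```

In this toy example only spot iD passes; iE is enriched for Tfc4 alone and is therefore excluded by the all-four-factors requirement.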
Although the array experiments are suggestive of a spectrum of occupancy levels, the spotted PCR products vary in size and are often considerably longer than a tRNA, with the result that any positive occupancy signal is diluted to various extents by the unbound neighboring DNA in the same PCR product. We therefore used quantitative PCR with primers targeted to individual small regions of DNA to measure the occupancy of Pol III factors at a variety of Pol III genes (Fig. ). All four factors yield exceptionally robust occupancy values, with Bdp1, Brf1, and Rpc34 typically enriched >200-fold at Pol III promoters relative to the control DNA; even the weakest immunoprecipitation results, those of Tfc4, averaged well over 20-fold enrichment at the Pol III promoters. The difference in enrichment values between the Tfc4 immunoprecipitation and those of the other factors may be attributable to a combination of different cross-linking efficiencies and differences in the antibodies.
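For readers unfamiliar with how fold enrichment is commonly derived from quantitative PCR, the sketch below shows one widely used calculation. It is an assumption for illustration only: the text does not describe the quantification method, and the threshold-cycle (Ct) values, the function name, and the perfect-doubling efficiency are all hypothetical.

```python
def fold_enrichment(ct_ip_target, ct_input_target,
                    ct_ip_control, ct_input_control,
                    efficiency=2.0):
    """Fold enrichment of a target region over a control region.

    Each region's IP signal is first normalized to its input, then the
    target is compared with the control. `efficiency` is the assumed
    amplification factor per PCR cycle (2.0 means perfect doubling).
    """
    delta_target = ct_ip_target - ct_input_target
    delta_control = ct_ip_control - ct_input_control
    # A lower Ct means more template, so enrichment grows as
    # delta_target falls below delta_control.
    return efficiency ** (delta_control - delta_target)


# Hypothetical Ct values (not data from this study).
fold = fold_enrichment(ct_ip_target=22.0, ct_input_target=20.0,
                       ct_ip_control=30.0, ct_input_control=20.0)
```

With these example values the target is recovered eight cycles earlier than the control after input normalization, which corresponds to 256-fold enrichment under the perfect-doubling assumption.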
For all tRNA promoters tested, the relative associations of the different components of the Pol III apparatus were fairly constant, strongly suggesting that the intact Pol III machinery is present and required for stable association. This pattern is similar to that observed for basal factors involved in Pol II transcription (e.g., TBP, TFIIA, TFIIB, and Pol II) but different from that observed with TBP-associated factors, whose occupancy can be relatively high or low depending on the promoter (20-23). The overall level of the Pol III factors at different tRNAs varies over a considerable range. To take an extreme example, at tY(GUA)F2, Bdp1 occupancy is approximately seven times greater than it is at tG(GCC)E, and Rpc34 occupancy is approximately five times higher (Fig. ). Occupancy of typical tRNA genes (defined by six of the nine genes individually tested) by Pol III factors averages ca. 60% of the maximal observed level. Although four tRNAs [tD(GUC)K, tI(UAU)D, tI(UAU)L, and tP(UGG)F] can be transcribed in vitro in the absence of TFIIIC (6), association of Tfc4 at these promoters in vivo occurs at a level commensurate with that found at other tRNAs. This finding is in accord with the observation that SNR6 requires TFIIIC for transcription in the context of chromatin (1), even though it can be transcribed in vitro by a TFIIIC-independent mechanism (17).
Occupancy of the 5S and SCR1 genes by Pol III factors is low compared to typical tRNA genes. In addition to the tRNAs, genes encoding all other previously known Pol III-transcribed RNAs (RDN5, RPR1, SCR1, and SNR6) show a high level of occupancy by Pol III factors. We observe no significant Brf1 occupancy at NME1 or TLC1, two genes encoding nontranslated RNAs. Pol III factors are also not associated with the RUF genes, which encode a recently described group of non-protein-coding RNAs (26). In contrast, NME1, TLC1, and the RUF genes show high occupancy by the Pol II-specific transcription factor TFIIB (data not shown), confirming that these genes are transcribed by Pol II in S. cerevisiae.
The occupancy levels of these Pol III factors at RPR1 and SNR6 are similar to those at typical tRNA promoters, whereas the 5S rRNA gene (RDN5) associates with almost 10-fold-lower levels of Brf1, Bdp1, and Rpc34 than a typical tRNA. This may be at least partially explained by the fact that there are more than 100 tandem copies of the ribosomal DNA in yeast, of which only half are actively transcribed (4). If 50% of the 5S copies are fully active, with the remaining 50% completely inactive, then Pol III occupancy of active 5S copies is actually fivefold lower than the level at a typical tRNA (Fig. ). Our results cannot exclude a scenario in which a low level of Pol III is present at every 5S copy, but this is unlikely because factor occupancy is highly correlated with transcription. The lower Pol III occupancy at the 5S gene might also reflect less efficient recruitment of other Pol III factors by TFIIIA.
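The fivefold figure follows from a simple correction for the fraction of active copies. As a worked sketch, let f be the fraction of 5S copies that are active (assumed here to be 0.5, with inactive copies contributing no signal), and take the measured, copy-averaged occupancy as roughly one-tenth of the level at a typical tRNA; the symbol names are introduced only for this illustration.

```latex
\[
O_{\text{measured}} = f\,O_{\text{active}} + (1-f)\cdot 0
\quad\Longrightarrow\quad
O_{\text{active}} = \frac{O_{\text{measured}}}{f}
\approx \frac{0.1\,O_{\text{tRNA}}}{0.5}
= 0.2\,O_{\text{tRNA}}
\]
```

That is, each active 5S copy would carry about one-fifth of the Pol III factor occupancy of a typical tRNA gene.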
The SCR1 gene is associated with approximately fivefold lower amounts of Brf1, Bdp1, and Rpc34 than a typical tRNA, and Tfc4 occupancy at this locus is only 5% that of an average tRNA (Fig. ). SCR1 is unusual in that it is far longer (522 nucleotides [nt]) than all other known yeast Pol III transcripts. However, SCR1 has both A and B blocks in typical intragenic positions, and it assembles all components of the Pol III apparatus, although apparently less efficiently than most tRNA genes.
SNR52 is a Pol III gene. In attempting to identify previously unidentified targets of the Pol III machinery, we observed that only 13 of the 500 genes with the highest Brf1 occupancy are not adjacent to a known Pol III gene or to a retroelement. Retroelements are present in multiple copies, many of which are adjacent to tRNAs, and they may cross-hybridize with other copies in non-tRNA-containing spots. All 13 potential sites of Pol III occupancy were tested by real-time quantitative PCR. With two exceptions (SNR52 and ZOD1, see below), we found no occupancy of these loci by Pol III factors, indicating that most of these spots are false positives on the microarray (most likely due to incorrect PCR products). Spots ranked lower on the Brf1 array, between 500 and 800, often represent loci that, although not immediately adjacent to Pol III genes, are very close to tRNAs, with only very small intervening genes. We tested 18 other genomic locations within this range for which there is no Pol III gene nearby; all were negative by quantitative PCR. These results suggest that there are few, if any, other Pol III-associated loci on our arrays.
One previously unsuspected site of Pol III occupancy is SNR52, which encodes a small nucleolar RNA involved in ribose methylation of rRNA (24). Due to the proximity of SNR52 to tRNA tH(GUG)E2, we mapped the location of the Pol III machinery along this chromosomal region (Fig. ). A distinct peak of Pol III factor occupancy is observed over SNR52, at a level similar to that of SNR6. Occupancy of the Pol III machinery is specific to SNR52, as can be seen by the sharp drop in occupancy before a second specific Pol III peak appears over tH(GUG)E2. SNR52 RNA levels are comparable to those of other Pol III transcripts (data not shown), and there is minimal occupancy by the Pol II transcription factor TFIIB over the SNR52 promoter (data not shown). The SNR52 locus contains the sequence GTTCGAAAC 35 bp upstream of the start of the mature RNA coding sequence; this sequence corresponds fairly well to the B block consensus GTTCRANYC (7), and its position is similar to that of the B block 30 bp upstream of the mature RPR1 RNA (23). Thus, SNR52 is a previously unsuspected Pol III-transcribed gene.
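The statement that the SNR52 sequence corresponds "fairly well" to the B block consensus can be checked position by position using the standard IUPAC degenerate-base codes (R = A or G, Y = C or T, N = any base). The helper below is a minimal sketch written for this illustration; only the two sequences come from the text.

```python
# Subset of the IUPAC nucleotide codes needed for this consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}


def mismatch_positions(seq, consensus):
    """Return 1-based positions at which `seq` violates `consensus`."""
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must have equal length")
    return [
        i + 1
        for i, (base, code) in enumerate(zip(seq, consensus))
        if base not in IUPAC[code]
    ]


# SNR52 upstream sequence versus the B block consensus quoted in the text.
snr52_mismatches = mismatch_positions("GTTCGAAAC", "GTTCRANYC")
```

The comparison yields a single mismatch, at position 8, where the SNR52 sequence has an A in place of the pyrimidine (Y) in the consensus.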
ZOD1, a functional Pol III promoter. Our array results indicate significant occupancy by all four Pol III factors in the intergenic region upstream of UFO1, a gene encoding an F-box protein required for the ubiquitin-mediated degradation of the HO endonuclease (18). DNA sequencing of a PCR-amplified region from position −447 to position +165 with respect to the UFO1 translational start site shows this sequence in our strain to be identical to the published sequence, thus eliminating the trivial possibility that a tRNA might be present in this region in our particular strain (data not shown). In addition, all Pol III factors associate with this region in two different S. cerevisiae strains (data not shown). Mapping experiments with a set of tightly spaced primer pairs reveal coincident peaks of Brf1, Bdp1, Tfc4, and Pol III occupancy between positions −344 and −252 with respect to the UFO1 translational start site (Fig. ). Interestingly, occupancy by Tfc4 is relatively high at this locus, and occupancy by Pol III is disproportionately low, yielding an unusual Tfc4/Rpc34 ratio that is 15.5-fold higher than that of tF(GAA)G. For this and other reasons to be described below, we term this locus ZOD1 (for zone of disparity).
As expected from the occupancy of Pol III factors, ZOD1 contains putative A and B blocks whose spacing is similar to that found in Pol III-transcribed genes. The sequence GGTTCGAACTC at position −205 relative to the translational start of UFO1 is a good match to the B block consensus, and the sequence TTGGCGCTTTGG at position −237 is a fairly good match to the consensus for the A block. Taken together, our results indicate that ZOD1 contains a functional Pol III promoter, i.e., one that assembles a complete transcription apparatus.
ETC loci that are occupied by TFIIIC but not by other Pol III factors. As mentioned above, pairwise comparisons of the genomic profiles and quantitative analysis of individual genes indicate that the association levels of the Pol III factors are strongly correlated. Unexpectedly, seven loci representing the upstream regions of TFC6-ESC2, ADE8-SIZ1, ARG8, BCK1, RAD14-ERG2, RAD2, and WTM2-YOR228C (gene names separated by a hyphen indicate that the locus is between these two divergently transcribed genes) are occupied by Tfc4 but not occupied to a significant extent by the other Pol III factors tested. Quantitative analysis (Fig. ) indicates that the levels of Tfc4 association at these seven atypical loci (14- to 35-fold enrichment relative to the background) are roughly comparable to those at other Pol III genes. In contrast, occupancy by Bdp1, Brf1, and Rpc34 is not observed, even though typical Pol III genes exceed 200-fold enrichment above background for these factors. One of these loci, the intergenic region between RAD14 and ERG2, contains the gene encoding RNA170, a Pol III RNA of unknown function (31); the others have not been identified in any screen for noncoding RNAs or Pol III genes. We have designated these loci ETC (for extra TFIIIC) (Fig. ). An eighth ETC locus, in the RPB5-CNS1 intergenic region, was identified by sequence analysis (see below).
ZOD1 and the ETC loci contain B blocks and additional conserved residues. Although ZOD1 differs from the ETC loci in that it recruits the complete Pol III apparatus, the disproportionately high Tfc4 occupancy at ZOD1 is reminiscent of the ETC loci. We therefore examined all eight loci for common sequence elements. The AlignAce motif-finding program (35) reveals a common sequence resembling an extended B block in all eight of these intergenic regions, with a maximum a posteriori score of 18 and a specificity score of 2.8e−12. Seven of these sequences are high-quality matches to the B block consensus, and the more degenerate sequence at ETC3 is also aligned (Fig. ). Interestingly, there is 100% conservation of three additional nucleotides located 6 to 10 bases downstream of the B block consensus (Fig. ). Alignment of the B blocks of 274 S. cerevisiae tRNA genes indicates that one of these bases, the final C, is significantly conserved in tRNA sequences (204 instances out of 274 tRNAs). Perfect conservation of all three nucleotides is found in only 21 of 274 tRNAs but at all ETC loci. The presence of a B block makes it likely that TFIIIC association at these loci occurs in a manner generally similar to its association with typical Pol III genes. At the same time, the perfect conservation of three extra nucleotides suggests that there is something special about these otherwise apparently normal TFIIIC interaction sites.
A search of yeast intergenic sequences for loci matching the derived ETC consensus (CNRTTCGAAYCCNNRNYGRNGC), allowing for two mismatches, revealed two additional matches in loci not adjacent to tRNAs. Quantitative analysis indicates that the region upstream of SIP1 does not associate with any Pol III factors (data not shown), whereas the region upstream of RPB5-CNS1 displays the Pol III occupancy profile of an ETC locus (Fig. ). We termed this locus ETC8; its alignment with ZOD1 and ETC1 to ETC7 is shown in Fig. .
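A consensus search of this kind, with a tolerance of two mismatches, can be sketched as a sliding-window scan using IUPAC degenerate-base codes. The sketch below is illustrative and is not the search tool used in the study: it scans one strand only, the function name is an assumption, and the test sequence is a made-up example built around one exact instance of the consensus.

```python
# Subset of the IUPAC nucleotide codes needed for the ETC consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}

ETC_CONSENSUS = "CNRTTCGAAYCCNNRNYGRNGC"  # consensus quoted in the text


def find_sites(sequence, consensus=ETC_CONSENSUS, max_mismatches=2):
    """Return (start, mismatches) for every window of `sequence` that
    matches `consensus` with at most `max_mismatches` mismatches.

    `start` is 0-based; only the given strand is scanned.
    """
    hits = []
    width = len(consensus)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = sum(
            base not in IUPAC[code] for base, code in zip(window, consensus)
        )
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Hypothetical intergenic sequence containing one exact consensus instance.
site = "CAGTTCGAATCCAAGATGGAGC"
hits = find_sites("ATATAT" + site + "TATATA")
```

A genome-wide search would additionally scan the reverse complement of each intergenic region and exclude loci adjacent to tRNA genes, as described in the text.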
Phylogenetic conservation of ETC and ZOD1 sequences. Functionally meaningful sequences tend to be highly conserved across Saccharomyces species, with protein-coding sequences having a much higher level of cross-species identity than intergenic sequences (2, 19). Interestingly, the B block of each ETC locus is highly conserved among the four Saccharomyces sensu stricto yeasts: S. cerevisiae, S. mikatae, S. bayanus, and S. paradoxus. Of the 11 positions within the B block consensus, 8 to 10 positions are identical (the boldface residues in Fig. ) in the four yeasts, and many of the nonidentities are at positions that are less conserved among B blocks in S. cerevisiae. In addition, the three extra nucleotides downstream of the B block core that are characteristic of the ZOD1 and ETC loci are extremely well conserved across the four yeast species. This striking conservation suggests that Tfc4 is likely to associate with these regions in the other yeast species and that Tfc4 association is biologically meaningful.
In general, the cross-species identity of Pol III-transcribed RNAs within the genus Saccharomyces is fairly high. For example, the sequence identity across the four Saccharomyces sensu stricto yeasts is 79% for RPR1, 86% for SCR1, and 100% for tC(GCA)P2. RNA170, which corresponds to ETC5, has 70% identity if the RNA is defined by the further upstream of its two 3′ termini; however, the next 70 nt until the downstream 3′ end are only 21% conserved. At the remaining seven ETC loci, sequence conservation across these yeasts is variable. Conservation of ETC4 (63% identity over 86 nt near the putative B block) and ETC3 (58% identity over a 74-nt region) is below that of all known RNA genes, making it unclear whether these loci encode a defined RNA species. In contrast, ETC7 has a 113-nt region with 89% identity across the four species, and ETC2 has a 67-nt region with 84% identity. This very high degree of sequence conservation suggests that ETC2 and ETC7 might encode heretofore undescribed RNAs transcribed by Pol III.
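One way to express cross-species identity of this kind is the percentage of alignment columns at which all species share the same base. The sketch below makes that assumption explicit; the text does not state how the identity values were calculated, and the four aligned sequences are invented for illustration.

```python
def percent_identity(aligned):
    """Percentage of alignment columns identical in every sequence.

    `aligned` is a list of equal-length, pre-aligned sequences; columns
    containing a gap character ('-') are counted as nonidentical.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    identical = sum(
        1 for column in zip(*aligned)
        if len(set(column)) == 1 and column[0] != "-"
    )
    return 100.0 * identical / length


# Hypothetical four-species alignment (not sequences from this study).
alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGAACGTAC",
    "ACGTACGTTC",
]
identity = percent_identity(alignment)
```

In this toy alignment, 8 of 10 columns are identical in all four sequences, giving 80% identity.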
For ZOD1, the putative B block and 8 of 12 bases of the putative A block are identical across the Saccharomyces sensu stricto species. The presence of highly conserved and appropriately spaced A and B blocks, together with occupancy by Pol III factors, is suggestive of a biologically relevant Pol III function, although we cannot exclude the possibility that these conserved sequences are involved in a UFO1-related function. In contrast to the A and B blocks, sequences corresponding to the peak of Pol III, TFIIIC, and TFIIIB occupancy (positions −200 to −400 with respect to the UFO1 translation start) are only 26% identical, indicating that this region does not encode a conserved RNA. A 122-nt region closer to UFO1 (positions −165 to −44) is 63% identical across these yeasts, but this region corresponds poorly to the location of Pol III and thus may represent UFO1 regulatory sequences. Hence, ZOD1 contains highly conserved A and B blocks yet may not encode a defined RNA species.
Functional properties of ZOD1 and the ETC loci are conserved in another Saccharomyces species. To address whether ZOD1 and the ETC loci represent fortuitous binding sites of Pol III factors or a conserved biological function, we examined occupancy of Pol III factors at these loci in S. mikatae. Exploiting the significant protein sequence identity across Saccharomyces sensu stricto yeasts, we used antibodies against S. cerevisiae Bdp1, Tfc4, and Rpc34 to immunoprecipitate chromatin from S. mikatae. The occupancy profiles of the S. mikatae ETC and ZOD1 loci (Fig. ) are extremely similar to those observed in S. cerevisiae, including the skewed TFIIIC/Pol III ratio at ZOD1. Just as in S. cerevisiae, high occupancy by Tfc4 at the ETC loci in S. mikatae is unaccompanied by significant Bdp1 or Pol III occupancy. Interestingly, ETC5, the only ETC locus that coincides with a known S. cerevisiae RNA (RNA170), may show a very slight occupancy by Bdp1 and Pol III in S. mikatae. The striking functional similarities of ZOD1 and the ETC loci in S. cerevisiae and S. mikatae strongly suggest that these loci possess conserved biological functions.
Attempted identification of a ZOD1 RNA. Given the apparent disparity between the highly conserved A and B blocks and the lack of sequence conservation in the region of high occupancy by Pol III factors, we used two methods to address whether ZOD1 encodes a defined RNA species. First, randomly primed reverse transcriptase PCR with the closely spaced primer set described above revealed the expected UFO1 transcript but no RNA corresponding to the region occupied by Pol III factors. The spacing of our PCR primers is such that we would have detected only transcripts of at least 100 nt. Second, we performed S1 nuclease protection on total RNA with overlapping oligonucleotide probes (55 to 71 nt in length) complementary to both strands across the region from positions −491 to −60 relative to the UFO1 translational start site. This method yielded no evidence of discrete RNA transcripts, although this approach requires that a transcript contain approximately 35 (or more) bases of homology from the 5′ end of a given oligonucleotide probe. Thus, we have been unable to detect a significant ZOD1 transcript, although there are some limited locations at which a small transcript might have been missed because of the placement of the probes used in the analysis.