To investigate the function and architecture of the open complex state of RNA polymerase II (Pol II), Saccharomyces cerevisiae minimal open complexes were assembled by using a series of heteroduplex HIS4 promoters, TATA binding protein (TBP), TFIIB, and Pol II. The yeast system demonstrates great flexibility in the position of active open complexes, spanning 30 to 80 bp downstream from TATA, consistent with the transcription start site scanning behavior of yeast Pol II. TFIIF unexpectedly modulates the activity of the open complexes, either repressing or stimulating initiation. The response to TFIIF was dependent on the sequence of the template strand within the single-stranded bubble. Mutations in the TFIIB reader and linker region, which were inactive on duplex DNA, were suppressed by the heteroduplex templates, showing that a major function of the TFIIB reader and linker is in the initiation or stabilization of single-stranded DNA. Probing of the architecture of the minimal open complexes with TFIIB-FeBABE [TFIIB–p-bromoacetamidobenzyl–EDTA-iron(III)] derivatives showed that the TFIIB core domain is surprisingly positioned away from Pol II, and the addition of TFIIF repositions the TFIIB core domain to the Pol II wall domain. Together, our results show an unexpected architecture of minimal open complexes and the regulation of activity by TFIIF and the TFIIB core domain.
Transcription initiation is a multistep process that is conserved in all organisms (6, 16, 28). RNA polymerase (Pol) first recognizes and binds to promoter DNA with the assistance of one or more factors forming a state termed the closed complex. Subsequently, DNA surrounding the transcription start site is unwound, and the template strand is positioned in the Pol active site, forming the open complex (23). Transcription initiation then commences, initially producing short RNA products in an abortive initiation reaction, until Pol releases contacts with the promoter and transitions into a processive elongation mode. Each of these intermediate steps can be targeted to regulate transcription.
The closed-complex state of eukaryotic Pol II is termed the preinitiation complex (PIC) and contains Pol II and 6 general transcription factors (TFs) (TFIIA, TATA binding protein [TBP], TFIIB, TFIIF, TFIIE, and TFIIH) (11, 24). Unlike the closed complexes of other Pols, the Pol II PIC is stable and does not spontaneously form open complexes. Instead, open complex formation requires ATP hydrolysis and xeroderma pigmentosum B (XPB) helicase activity to unwind the DNA strands from positions ~−10 to +2 with respect to the human transcription start site (13, 27). These open complexes are unstable, with a half-life of ~1 min, and require continuous ATP hydrolysis to remain in this state (8, 13, 29).
For the Archaea and many eukaryotes, the bubble of unwound DNA in the open complex overlaps the transcription start site, located about 30 bp distant from the TATA element. One exception is the yeast Saccharomyces cerevisiae, where transcription starts within a window of ~50 to 120 bp downstream from TATA, even though yeast PICs are assembled surrounding the TATA (11a, 15, 19, 30). In vivo permanganate probing suggested that the unwinding of yeast promoter DNA begins, with respect to TATA, at about the same position as that in mammals and extends to the distant transcription start site (10). However, it is not known if all DNA is unwound at once in a large single-stranded bubble or whether a smaller bubble is propagated downstream while Pol II scans for an appropriate initiation site. It is also not known whether start site scanning involves the release of Pol II from the general factors and promoter DNA. Finally, it is not clear why yeast and mammalian start site selection is different, since Pol II and the general factors are well conserved. It was suggested previously that differences in Pol II and TFIIB can account for start site preference (17), and mutations that have modest effects on start site preference have been isolated in Pol II, TFIIF, and TFIIB (reviewed in reference 9).
Models for the architecture of the PIC at a TATA-containing promoter have been proposed based on biochemical probes positioned within the PIC (4, 9, 19) and from X-ray structures of the Pol II-TFIIB complex (14, 18). In these models, the TFIIB core domain binds both TATA-TBP and the wall domain of Pol II, positioning downstream promoter DNA over the Pol II central cleft, with upstream DNA directed toward the top surface of Rpb2, the second largest Pol II subunit. Two structured domains of TFIIF are positioned at separate sites on Rpb2 (5, 9), and one of these domains, the winged helix of the TFIIF small subunit, is near upstream promoter DNA, where it may bind and stabilize the PIC (9).
Two major questions concerning the mechanism of Pol II initiation are as follows: what is the architecture of the Pol II open complex, and how does Pol II scan for the transcription start site? Two models for open complex formation were recently proposed by combining PIC models and the X-ray structure of the Pol II elongation complex (14, 18). In one model, the unwound template strand is positioned in the enzyme active site, with upstream single-stranded DNA near a flexible region of TFIIB termed the B-reader, which was proposed previously to recognize DNA 8 bp upstream from the transcription start site (14). Another TFIIB element, termed the B-linker, is positioned near the junction of single- and double-stranded promoter DNAs and was proposed previously to function in DNA melting (14, 18). In both models, the TFIIB core domain, TBP, and upstream promoter DNA remain in the same location compared to the PIC. These models, however, do not explain how yeast Pol II can initiate mRNA synthesis at distant downstream sites.
In this work, we examine the activity and architecture of a minimal Pol II open complex. We observed remarkable flexibility in the open complex state that is consistent with downstream initiation, an unexpected sequence-dependent modulation of open complex activity by TFIIF, and surprising differences from the previously proposed open complex models for the position of the TFIIB core domain and the path of upstream promoter DNA.
DNA templates were generated by PCR from pSH1271 (containing a single Gal4 binding site upstream from a modified HIS4 promoter [see below]) with primers pBio965 (biotin-TAATGCAGCTGGCACGACAGG) and pNOT (GGCCGCTCTAGCTGCATTAATG). The 629-bp product was used to generate immobilized templates as described previously by Ranish et al. (22) and used for transcription and FeBABE [p-bromoacetamidobenzyl–EDTA–iron(III)] cleavage assays. Alternatively, the product was digested with DraIII and AlwNI restriction enzymes (NEB), yielding fragments of 363 bp, 92 bp, and 174 bp. These fragments were separated and purified from 2.5% agarose gels and used for the generation of heteroduplex templates. Heteroduplexes were formed by annealing phosphorylated 92-base oligonucleotides containing 12-bp mismatches at the positions indicated in Fig. 1A. The oligonucleotides were designed to leave overhangs complementary to DraIII and AlwNI sites in pSH1271, thus allowing the replacement of the HIS4 promoter from the TATA box through the start sites of in vitro transcription. The 92-bp heteroduplexes were purified on 2.5% agarose gels. Heteroduplex promoters were generated by the ligation of the mismatched 92-bp promoter inserts with the 363-bp and 174-bp fragments from pSH1271 templates overnight at 16°C. T4 DNA ligase (NEB) was heat inactivated (10 min at 65°C), and the reaction mixtures were run on 2% agarose gels. The 629-bp products were purified by use of a gel extraction kit (Qiagen) and ethanol precipitated, and the DNA was resuspended in 10 mM Tris (pH 8.0). The heteroduplex templates were quantified by use of an ND-1000 spectrophotometer (NanoDrop) and were used to generate immobilized templates as described above.
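As a quick consistency check on the digest described above, the three reported fragment lengths should add up to the length of the PCR product. The sketch below is only illustrative bookkeeping: it treats the fragment sizes as simple lengths and ignores the short single-stranded overhangs left by DraIII and AlwNI, which is an assumption and not something stated in the text. The variable and function names are hypothetical.

```python
# Sizes (bp) taken from the text: a 629-bp PCR product digested with
# DraIII and AlwNI into three fragments.
PCR_PRODUCT_BP = 629
DIGEST_FRAGMENTS_BP = [363, 92, 174]


def fragments_account_for_product(fragments_bp, product_bp):
    """Return True when the fragment lengths sum to the product length.

    Assumption: overhangs are ignored, so lengths are simply additive.
    """
    return sum(fragments_bp) == product_bp


print(fragments_account_for_product(DIGEST_FRAGMENTS_BP, PCR_PRODUCT_BP))  # True
```

The same arithmetic applies to the ligation step, in which the 92-bp heteroduplex insert replaces the 92-bp wild-type fragment and restores a 629-bp product.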
The sequence of the modified HIS4 promoter from plasmid pSH1271 used to generate immobilized templates is as follows, with the HIS4 TATA underlined: TAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTGAGCGCAACGCAATTAATGTGAGTTAGCTCACTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGCTCGTATGTTGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAGCTATGACCATGATTACGCCAAGCGCGCAATTAACCCTCACTAAAGGGAACAAAAGCTGGGTACCGGGCCCCCCCTCGAGGTCGACGGTATCGATAAGCTTGATATCGAATTCCTGCAGCCCGGGGGATCGATCCGGGTGACAGCCCTCCGAATTCGAGCTCGGTACCCGGGGATCTGTCGACCTCGAGAACAGTAGCACGCTGTGTATATAATAGCTATGGAACGTTCGATTCACCTCCGATGTGTGTTGTACATACATAAAAATATCATAGCACAACTGCGCTGTGTCAGCGACTGAATAGTAATACAATAGTTTACAAAATTTTTTTTCTGAATAATGACCGGATCCGGAGCTTGGCTGTTGCCCGTCTCACTGGTGAAAAGAAAAACCACCCTGGCGCCCAATACGCAAACCGCCTCTCCCCGCGCGTTGGCCGATTCATTAATGCAGCTAGAGCGGCC.
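The relationship between the two PCR primers and the template sequence above can be checked with a few lines of code. The following is a minimal sketch, not part of the published method: it uses only the first and last bases of the listed sequence (copied from the text), omits the 5′ biotin from pBio965, and the helper names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


FORWARD_PRIMER = "TAATGCAGCTGGCACGACAGG"   # pBio965, biotin omitted
REVERSE_PRIMER = "GGCCGCTCTAGCTGCATTAATG"  # pNOT

# Only the two ends of the template are needed for this check.
TEMPLATE_START = "TAATGCAGCTGGCACGACAGGTTTCCCGAC"
TEMPLATE_END = "GATTCATTAATGCAGCTAGAGCGGCC"


def primers_flank_template(start, end, forward, reverse):
    """True if the forward primer matches the 5' end of the listed strand
    and the reverse primer is complementary to its 3' end."""
    return start.startswith(forward) and end.endswith(reverse_complement(reverse))


print(reverse_complement(REVERSE_PRIMER))  # CATTAATGCAGCTAGAGCGGCC
print(primers_flank_template(TEMPLATE_START, TEMPLATE_END,
                             FORWARD_PRIMER, REVERSE_PRIMER))  # True
```

The check confirms that the listed sequence begins with pBio965 and ends with the reverse complement of pNOT, as expected for the full PCR product.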
Twenty-microliter reaction mixtures contained 20 mM HEPES (pH 7.6), 100 mM potassium acetate, 1 mM EDTA, 5 mM magnesium acetate, 3 mM dithiothreitol (DTT), 38 mM creatine phosphate, 0.03 units creatine phosphokinase, 2 μg bovine serum albumin (BSA), 4 units RNase Out (Invitrogen), 0.05% NP-40, and 1 μg poly(dG-dC) competitor DNA. One microliter of immobilized template (86 ng template) was added to each reaction mixture, along with 24 ng VP16 activator and 120 μg nuclear extract or purified factors (TFs), as follows: 68 ng TBP, 26 ng TFIIB, 54 ng TFIIF, 12 ng TFIIE, 85 ng TFIIH, and 280 ng Pol II. With the exception of TFIIH, all factors were saturating for activity at these concentrations. Preinitiation complexes were allowed to form for 30 to 40 min at room temperature (RT). Transcription was initiated by the addition of 1 μl nucleoside triphosphates (NTPs) (10 mM each) and stopped after 3 min by the addition of 180 μl stop mix of 100 mM sodium acetate, 10 mM EDTA, 0.5% SDS, and 17 μg/ml tRNA (Sigma) to the mixture. Reaction mixtures were phenol-chloroform (2:1) extracted once, and the RNA was precipitated, washed in ethanol, and dried. The pellets were resuspended in 10 μl primer annealing mix containing 5 mM Tris (pH 8.3), 75 mM potassium chloride, 1 mM EDTA, and either 32P-labeled lacI primer (~5 × 10^5 cpm) or 65 μM 700IR fluorescently labeled lacI (Li-Cor Biosciences). Reaction mixtures were incubated for 45 min at 48°C (32P-labeled lacI) or 55°C (700IR-lacI). Next, 20 μl cDNA synthesis mix (25 mM Tris [pH 8.3], 75 mM potassium chloride, 4.5 mM magnesium chloride, 15 mM DTT, 150 μM deoxynucleoside triphosphates [dNTPs], and 100 units Moloney murine leukemia virus reverse transcriptase [MMLV-RT] [Invitrogen]) was added to each reaction mixture, and the mixture was incubated for 30 min at 37°C. Reactions were stopped by ethanol precipitation. 
The pellets were washed with 80% ethanol, dried, resuspended in 3 μl RNase A (40 μg/ml), and incubated for 3 min at room temperature before the addition of 3 μl formamide loading dye containing bromophenol blue. Just before electrophoresis, samples were heated for 1 min at 90°C, transferred onto ice, and run on denaturing 8% acrylamide gels. Gels were visualized with a PhosphorImager (32P; Molecular Dynamics) or with an Odyssey scanner (700-IR; Li-Cor).
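The NTP addition in the transcription protocol above implies a simple dilution, which can be made explicit. The sketch below assumes that the reaction volume is 20 μl at the moment 1 μl of the 10 mM NTP stock is added; the text does not say whether the template and factors are counted within the 20 μl, so the result is an estimate. The function name is hypothetical.

```python
def final_concentration(stock_conc, added_volume, starting_volume):
    """Concentration after adding `added_volume` of stock to `starting_volume`.

    Units only need to be consistent (here mM and microliters).
    """
    return stock_conc * added_volume / (starting_volume + added_volume)


# Assumption: 20 ul reaction before NTP addition, 1 ul of 10 mM stock added.
ntp_mM = final_concentration(stock_conc=10.0, added_volume=1.0, starting_volume=20.0)
print(f"{ntp_mM:.3f} mM of each NTP")  # 0.476 mM of each NTP
```

Under that assumption each NTP is present at a little under 0.5 mM during the 3-min transcription reaction.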
Reaction mixtures were assembled under the same conditions as those used for in vitro transcription with purified factors (TFs). Complexes were allowed to form for 30 to 40 min at RT on immobilized heteroduplex templates. Transcription was initiated from bubble 3 with 1 μl limiting NTPs (10 mM ATP and CTP plus 1 mM UTP) plus 5 μCi [α-32P]UTP (Perkin-Elmer) or from bubble 15 with 1 μl limiting NTPs (10 mM CTP and UTP plus 1 mM GTP) plus 5 μCi [α-32P]GTP and was stopped after 30 min by the addition of 180 μl transcription stop mix to the mixture. The reaction mixtures were extracted once with phenol-chloroform (2:1), and the labeled RNA was precipitated, washed in ethanol, and dried. The pellets were resuspended with formamide loading dye, heated to 65°C for 30 s, and transferred onto ice before loading onto denaturing 20% acrylamide urea gels. The RNA products were visualized with a PhosphorImager (Molecular Dynamics).
TFIIB was expressed as an N-terminal SUMO fusion protein from pLH237 [pET21a-6(His)-SUMO-yTFIIB] in BL21(DE3)RIL cells. Two liters of cells was collected by centrifugation and resuspended in 20 ml lysis buffer (50 mM HEPES [pH 7.0], 500 mM NaCl, 10% glycerol, 40 mM imidazole). The cells were treated with lysozyme (0.5 mg/ml) for 45 min at 4°C and disrupted by sonication. The extract was clarified by centrifugation and purified by using Ni-Sepharose affinity medium (GE Healthcare). TFIIB was eluted with a solution containing 50 mM HEPES (pH 7.0), 0.5 M NaCl, 0.5 M imidazole, 0.05% NP-40, and 10% glycerol. The eluate was dialyzed to remove imidazole and to reduce the NaCl concentration to 375 mM and subsequently treated with SUMO protease Ulp1 (1.7 μg/ml) for 1 h at RT. Following cleavage, Ni-Sepharose was used to capture the His6-SUMO tag and protease. TFIIB was dialyzed in a solution containing 20 mM Tris (pH 7.8), 150 mM potassium acetate (KOAc), 50 μM zinc acetate (ZnOAc), and 10% glycerol and was further purified over a BioRex 70 column (Bio-Rad) with a linear gradient of 7.5% to 30% buffer B (20 mM Tris [pH 7.8], 2 M KOAc, 50 μM ZnOAc, 10% glycerol). TFIIB eluted at ~540 mM KOAc and was stored at −80°C. All buffers were supplemented with 1 mM DTT and phenylmethylsulfonyl fluoride (PMSF).
SHY810 [RAD3-(Flag)1-TAP tag] cells were grown at 30°C in yeast extract-peptone-dextrose (YPD) (3% glucose, 0.002% adenine) to an optical density at 600 nm (OD600) of 5.0. Cells were washed with cold TAP extraction buffer (20 mM HEPES [pH 7.6 at 4°C], 0.2 M KOAc, 20% glycerol, 1 mM EDTA). Cells were resuspended in 1 ml TAP extraction buffer per gram wet pellet and homogenized with chilled 425- to 600-μm glass beads (Sigma) in a Bead Beater instrument (BioSpec Products). The extract was clarified in steps by centrifugation at 4°C for 20 min at 25,000 × g, followed by 90 min at 200,000 × g. The clarified extract was added to 33 μl IgG Sepharose (GE Healthcare) per gram of pellet and incubated for 2 h at 4°C. IgG beads were collected by centrifugation and washed twice with TAP extraction buffer and once with tobacco etch virus (TEV) cleavage buffer (20 mM HEPES [pH 7.6 at 4°C], 0.2 M KOAc, 20% glycerol, 1 mM EDTA, 0.1% NP-40). One volume of TEV cleavage buffer was added to IgG beads along with 10 μg TEV protease per ml IgG and incubated for 60 min at RT. IgG beads were collected by centrifugation, and the TFIIH eluate was removed. One volume of TEV cleavage buffer was added to IgG beads and incubated for 10 min at RT. The beads were collected, the eluate was transferred, and a third 10-min elution step was carried out. The eluates were combined, added to 3 volumes of calmodulin binding buffer (20 mM HEPES [pH 7.6 at 4°C], 0.2 M KOAc, 20% glycerol, 1 mM EDTA, 0.1% NP-40, 1 mM magnesium acetate [MgOAc], 1 mM imidazole, 2 mM CaCl2), and adjusted to 3 mM CaCl2. This solution was added to calmodulin affinity resin (Stratagene) and incubated for 90 min at 4°C. The resin was collected by centrifugation and washed twice with calmodulin binding buffer and once with calmodulin binding buffer with a low level of NP-40 (0.01%). 
Two volumes of calmodulin elution buffer (20 mM HEPES [pH 7.6 at 4°C], 0.2 M KOAc, 20% glycerol, 1 mM EDTA, 0.01% NP-40, 1 mM MgOAc, 1 mM imidazole, 3 mM EGTA) were added to the resin and incubated for 20 min at RT. The resin was collected as described above, and the elution was repeated twice. TFIIH-containing eluates were combined and concentrated with Microcon YM-100 filters (Millipore). DTT was added to 5 mM, and TFIIH was stored at −80°C. Buffers were supplemented with 1 mM DTT, 1 mM PMSF, 2 mM benzamidine, 3 μM leupeptin, 2 μM pepstatin, and 3.3 μM chymostatin.
Cells of yeast strain SHy808 (His6 tag at the N terminus of Rpb3) were grown at 30°C in YPD (3% glucose, 0.002% adenine) to an OD600 of 5.0, harvested by centrifugation, and weighed. Pol II was typically prepared from 30 liters of cells. The cells were resuspended in 0.33 ml freezing buffer (150 mM Tris [pH 7.9 at 4°C], 3 mM EDTA, 30 μM ZnCl2, 30% glycerol, 3% dimethyl sulfoxide [DMSO], 30 mM mercaptoethanol, 3× protease inhibitors) per gram wet pellet and flash frozen before being stored at −80°C. The cell suspension was thawed in a room-temperature water bath and homogenized by use of a Bead Beater instrument (BioSpec Products), and the extract was clarified by centrifugation as described above for the purification of TFIIH. The clarified extract was transferred into a glass beaker and stirred overnight at 4°C with 291 mg/ml ammonium sulfate. The sample was centrifuged for 30 min at 25,000 × g, and the supernatant was discarded. The pellet was resuspended in 75 μl HSB-0/10 per gram of cells harvested, and conductivity was adjusted to 400 μS/cm with HSB-0/10 (50 mM Tris [pH 7.9 at 4°C], 1 mM EDTA, 10 μM ZnCl2, 10 mM imidazole, 10% glycerol, 10 mM β-mercaptoethanol). The sample was added to HSB-1000/10 (HSB-0/10 plus 1 M KCl)-equilibrated Ni-Sepharose (10 μl resin per gram of cells harvested) and incubated 2 h at 4°C. The beads were collected by centrifugation, washed for 5 min at 4°C with 5 volumes of HSB-1000/10, and washed twice more with Ni-20 (20 mM Tris [pH 7.9 at 4°C], 150 mM KCl, 10 μM ZnCl2, 20 mM imidazole, 10% glycerol, 10 mM mercaptoethanol). Pol II was eluted three times by a 10-min incubation at RT in 2.5 volumes of Ni-200 (Ni-20 with 200 mM imidazole). The desired eluates were combined and slowly adjusted to a 55-μS/cm conductivity with Q buffer A (20 mM Tris [pH 7.9 at 4°C], 0.5 mM EDTA, 10 μM ZnCl2, 10% glycerol, 10 mM DTT). 
Pol II was further purified over a Source 15 Q column (GE Healthcare) using 2 linear gradients of Q buffer B (Q buffer A plus 1.5 M KOAc): 10 to 35% over 15 column volumes (CV) and 35 to 70% over 30 CV. The desired fractions were pooled and dialyzed in 3 steps (1 liter for 1.5 h for each step) in Pol II buffer [20 mM HEPES (pH 7.6 at 4°C), 20% glycerol, 8 mM MgSO4, 60 mM (NH4)2SO4, 10 μM ZnCl2, 10 mM DTT]. Pol II was concentrated by using Amicon Ultra-30K filters (Millipore) and stored at −80°C. Buffers were supplemented with protease inhibitors as described above for the purification of TFIIH.
TBP was expressed from pSH713 (pET21a-6His-TBP) in BL21(DE3)RIL cells. Four liters of cells was grown to log phase, induced, and harvested by centrifugation. The cells were washed and resuspended in 40 ml of a solution containing 20 mM Tris (pH 7.8), 250 mM KCl, 10% glycerol, and 5 mM mercaptoethanol. Cells were treated with 0.5 mg/ml lysozyme for 30 min and disrupted by sonication. The extract was clarified by centrifugation and purified by using Ni-Sepharose affinity medium (GE Healthcare). TBP was eluted with a solution containing 20 mM Tris (pH 7.8), 0.25 M KCl, 0.25 M imidazole, 10% glycerol, and 5 mM mercaptoethanol. TBP was adjusted for conductivity to 50 μS/cm by dilution with buffer A (20 mM Tris [pH 7.8], 10% glycerol, 1 mM DTT) and further purified over a Source 15 S column (GE Healthcare) using a linear gradient of 2.5 to 30% buffer B (buffer A plus 2 M KCl) over 20 column volumes. TBP eluted at approximately 200 mM KCl and was stored at −80°C. All buffers were supplemented with 1 mM PMSF. The TFIIBN-TBP fusion was also purified by using this method.
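The salt concentrations spanned by the linear gradient used for TBP follow directly from the buffer definitions above (buffer A without KCl, buffer B equal to buffer A plus 2 M KCl). The sketch below is a hedged illustration of that arithmetic. It assumes ideal linear mixing of the two buffers, and the function names are hypothetical.

```python
def salt_at_percent_b(percent_b, salt_a_mM, salt_b_mM):
    """Salt concentration (mM) when a gradient is at `percent_b` % buffer B."""
    fraction_b = percent_b / 100.0
    return salt_a_mM + fraction_b * (salt_b_mM - salt_a_mM)


def percent_b_for_salt(target_mM, salt_a_mM, salt_b_mM):
    """Percent buffer B at which the gradient reaches `target_mM` salt."""
    return 100.0 * (target_mM - salt_a_mM) / (salt_b_mM - salt_a_mM)


# TBP purification: buffer A has 0 mM KCl, buffer B has 2,000 mM KCl.
KCL_A_MM, KCL_B_MM = 0.0, 2000.0

start = salt_at_percent_b(2.5, KCL_A_MM, KCL_B_MM)
end = salt_at_percent_b(30.0, KCL_A_MM, KCL_B_MM)
elution = percent_b_for_salt(200.0, KCL_A_MM, KCL_B_MM)

print(f"gradient: {start:.0f} to {end:.0f} mM KCl")      # gradient: 50 to 600 mM KCl
print(f"TBP elutes near {elution:.0f}% buffer B")        # TBP elutes near 10% buffer B
```

Under this assumption the 2.5 to 30% gradient runs from about 50 to 600 mM KCl, and the reported elution at approximately 200 mM KCl corresponds to roughly 10% buffer B.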
Recombinant TFIIF containing Saccharomyces mikatae Tfg1 and S. cerevisiae Tfg2 was expressed and purified as described previously (4). Recombinant TFIIE typically had a ~2-fold-lower specific activity than yeast-purified TFIIE using the reconstituted transcription system, so yeast-purified TFIIE was used for all assays. SHY392 [TFA1-(1× Flag)-TAP tag] cells were grown at 30°C in YPD (3% glucose, 0.002% adenine) to an OD600 of 5.0. Twelve liters of cells was harvested by centrifugation, and the cells were washed with 100 ml extraction buffer (40 mM HEPES-KOH [pH 7.5], 350 mM NaCl, 10% glycerol, 0.1% Tween 20, 0.5 mM DTT). The cell pellet was resuspended in 1 ml extraction buffer per gram of cells. The cells were homogenized, and the extract was clarified as described above for the purification of TFIIH. The clarified extract was bound to 3 ml IgG Sepharose (GE Healthcare), washed 3 times with 10 volumes of extraction buffer without DTT, and incubated for 3 h at 4°C. The beads were collected by centrifugation and washed twice with extraction buffer and once with TFIIE cleavage buffer (10 mM Tris [pH 8.0], 150 mM NaCl, 0.5 mM EDTA, 0.1% NP-40, 10% glycerol, 1 mM DTT). Three milliliters of TFIIE cleavage buffer plus 30 μg TEV protease were added to the washed beads and incubated for 45 min at RT. The beads were spun down, and the supernatant was collected for elution. A total of 1.3 ml cleavage buffer was added to the beads and incubated for 15 min before being collected as elution 2. This step was repeated for a third elution. The eluates were combined and added to 3 volumes of binding buffer (10 mM Tris [pH 8.0], 1 mM MgOAc, 1 mM imidazole, 2 mM CaCl2, 0.1% NP-40, 10% glycerol, 0.5 mM DTT) and adjusted to 3 mM CaCl2. This solution was added to 2 ml calmodulin affinity resin (Stratagene) and incubated for 2 h at 4°C. 
The resin was collected by centrifugation and washed once with binding buffer and then twice with wash buffer (binding buffer with a reduced concentration of NaCl [150 mM] and without NP-40). One volume of elution buffer (10 mM Tris [pH 8.0], 150 mM NaCl, 1 mM MgOAc, 1 mM imidazole, 3 mM EGTA, 10% glycerol, 0.5 mM DTT) was added to the resin and incubated for 20 min at RT. The resin was collected as described above, and the elution was repeated twice. TFIIE-containing eluates were combined, adjusted to 0.01% NP-40, and concentrated 10-fold with an Amicon Ultra-4 (10,000-molecular-weight-cutoff [MWCO]) filter device (Millipore) before being stored at −80°C. All buffers were supplemented with protease inhibitors as described above for TFIIH purification.
Human Pol II open complexes can be formed in vitro by two methods. In the first method, PICs containing Pol II and all the general transcription factors are incubated with ATP or dATP, leading to the unwinding of ~10 bp surrounding the transcription start site, monitored by KMnO4 reactivity with single-stranded DNA (13, 27). The maintenance of this state requires the continued hydrolysis of ATP, since the addition of an excess of ATPγS reverts the open complex back to the PIC (8, 13, 29). In contrast, the addition of ATP to S. cerevisiae PICs has not yet been observed to generate KMnO4-sensitive DNA between TATA and the transcription start site. One possible reason for this is that ATP may also induce start site scanning so that the single-stranded DNA is not localized to a single position.
An alternative method of open complex formation involves assembling factors on promoter DNA containing a preformed heteroduplex bubble (12, 20, 25). In the human system, the optimal position for the bubble is variable, depending on the promoter used, but is generally located from positions ~−9 to +2 relative to the transcription start site. Transcription from these complexes requires only the factors TBP, TFIIB, and Pol II (20). TBP and TFIIB are presumably necessary to tether Pol II near the heteroduplex DNA and to assist in the positioning of the DNA within the Pol II active site. TFIIF was reported previously to either stimulate or have little effect on the activity of these minimal human open complexes (20, 25). TFIIE and TFIIH are unnecessary for the activity of human heteroduplex complexes, probably because they act primarily in DNA strand separation and/or the stabilization of the open state.
Yeast transcription initiates at variable distances downstream from the site of PIC formation, and it is not clear why the yeast system does not initiate at the same position as that of human Pol II. One model consistent with previous results is that initiation at ~30 bp downstream from TATA is blocked, forcing Pol II to scan downstream sequences for an appropriate start site. Because of this behavior, it was not clear whether yeast Pol II open complexes could be formed by using heteroduplex templates and, if so, where best to position the single-stranded bubble. To test whether open complexes could be formed, a series of 10 heteroduplex templates was generated based on the yeast HIS4 promoter, each containing 12 bases of unpaired single-stranded DNA. These bubbles span the region beginning 18 bp downstream of TATA, through the normal HIS4 initiation sites ~80 bp downstream of TATA (Fig. 1A). The promoter derivatives also contained an additional 372 and 165 bp of upstream and downstream DNA (22) and were attached to magnetic beads via biotin at the 5′ end of the promoter. The major, most upstream HIS4 transcription start site is defined as position +1.
We initially characterized the activities of two bubble templates, one coincident with the position of mammalian transcription initiation (bubble 3) and the other overlapping the normal HIS4 transcription start site (bubble 15). Figure 1B shows the activity of the bubble 3 heteroduplex template compared to transcription using the double-stranded HIS4 promoter. In all experiments, nucleotides were added for 3 min to preformed protein-DNA complexes to limit transcription to approximately one round of initiation. Figure 1B (lanes 1 and 2) shows that VP16 activated transcription using yeast nuclear extracts on the double-stranded HIS4 template. This transcription activity is comparable to the level of basal transcription (no activator) using a system containing highly purified and recombinant yeast factors (TFs) (Fig. 1B, lanes 3 and 4). As expected, both the crude and reconstituted complete systems require hydrolyzable ATP, as ATPγS, a substrate for RNA synthesis but not open complex formation, does not promote transcription when substituted for ATP (Fig. 1B, lanes 2 and 4).
Very high levels of transcription were observed using both the bubble 3 and bubble 15 promoters (Fig. 1B and C). High-level transcription from bubble 3 required Pol II, TBP, and TFIIB (Fig. 1B, lanes 5 to 8), and a similar behavior was observed with bubble 15. As expected, transcription initiation from these promoters was independent of β-γ-hydrolyzable ATP, since the substitution of ATPγS for ATP gave similar levels of mRNA (Fig. 1C, lanes 1, 2, 7, and 8).
Unexpectedly, transcription from bubble 3 using the complete set of general factors was 3-fold lower than that from the minimal system (Fig. 1C, lanes 1, 2, 5, and 6). This repression appeared to be caused by TFIIF, since the addition of TFIIF to the minimal set of factors also repressed transcription (Fig. 1C, lane 4). In contrast, TFIIF had no effect on transcription from bubble 15 at the normal site of HIS4 initiation (Fig. 1C, lanes 7 to 12). Our results demonstrate a surprising flexibility of the yeast system for the position of the heteroduplex bubble, ranging over ≥60 bp, and show that there is nothing preventing the minimal set of Pol II factors from initiation at the position used by mammalian Pol II. Importantly, TFIIF inhibits initiation by yeast Pol II at the mammalian start site position in heteroduplex HIS4 templates. This mechanism may also contribute to the inhibition of initiation from the mammalian start site position in double-stranded DNA when all factors are present.
To further investigate the flexibility of open complexes and the ability of TFIIF to repress initiation, the complete set of heteroduplex templates was tested for transcription activity and repression by TFIIF (Fig. 2A). Bubbles 1 to 7, spanning the mammalian start site position from 18 to 42 bp downstream of TATA, were all active as templates for the minimal set of factors. Bubble 7 had a significant background of transcription from Pol II alone, while the other templates all gave significantly higher levels of transcription when TBP and TFIIB were added. Transcription from all these templates was repressed by the addition of TFIIF. In contrast, bubbles 9, 10, and 12 (single-stranded DNA 42 to 65 bp from TATA) gave little or no transcription. Finally, bubbles 14 to 16, which all overlap the normal HIS4 initiation sites, gave high levels of initiation that were either stimulated by or indifferent to the addition of TFIIF. Bubble 14 also had a high background level of transcription from Pol II alone.
An analysis of initiation from bubble 3 using a high-resolution gel showed that initiation in the absence of TFIIF begins from sites within and just downstream of the single-stranded region and that TFIIF has its strongest repressive effect on starts within single-stranded DNA (Fig. 2B, with brackets indicating the region of single-stranded DNA). In contrast, transcription from bubble 15 initiates almost entirely within the single-stranded region, and transcription from at least two initiation sites is stimulated by TFIIF.
To test whether TFIIF repressed transcription initiation from bubble 3 rather than a later step, such as the transition to the elongation mode, minimal open complexes were incubated with ATP, CTP, and [α-32P]UTP for 30 min, generating a series of short RNAs (Fig. 3, lane 2). The synthesis of these short RNAs was inhibited by TFIIF, showing that TFIIF inhibits initiation (Fig. 3, lane 3). In contrast, the addition of TFIIF stimulated the production of short RNAs from bubble 15 and the nucleotides CTP, UTP, and [α-32P]GTP (Fig. 3, lanes 5 and 6). Thus, TFIIF appears to act by the modulation of transcription initiation.
We next investigated why transcription from the different bubble templates showed different responses to TFIIF. Possible variables include the distance of the bubbles from TATA or DNA sequence differences upstream of and/or within the bubbles. To test if the sequence upstream of the single-stranded region was important for regulation by TFIIF, 12 bp of DNA upstream of bubble 3 (repressed by TFIIF) was replaced by 12 bp upstream of bubble 15 (stimulated by TFIIF) (bubble 3 positions −54 to −43) (Fig. 4A). The replaced upstream DNA is underlined in Fig. 4A. Transcription from this new promoter variant was still repressed to the same extent by the addition of TFIIF compared to bubble 3 (Fig. 4B, lanes 1 to 4), showing that the sequence upstream of the bubble has no effect on the response to TFIIF. To test the importance of the bubble sequence for the TFIIF response, we replaced the single-stranded bubble 3 sequence with that of bubble 15 (bubble 3::15) (Fig. 4A). Surprisingly, we found that transcription from this promoter variant was slightly stimulated by TFIIF (Fig. 4B, lanes 7 and 8). High-resolution gel analysis showed that initiation from this template used two primary start sites, at positions −34 and −32, and that the TFIIF addition caused a strong preference for the transcription start site at position −32 (Fig. 4C, lanes 5 and 6).
Since TFIIF altered the sequence preference of Pol II at the transcription start site, we tested the effect of changing the base at one of these TFIIF-dependent starts. At bubble 3, initiation was evenly distributed among three start sites, with two being within the single-stranded bubble (Fig. 4C, lane 1). Upon the addition of TFIIF, transcription within the bubble was repressed, while initiation within double-stranded DNA at −29G was only modestly repressed. However, if −29G was changed to T (a nonpreferred base [bubble G −29T]), the addition of TFIIF repressed nearly all transcription, since there was no optimal transcription start site remaining (Fig. 4B, lanes 5 and 6, and C, lanes 3 and 4). These results show that TFIIF imposes a strong and unexpected sequence preference on the transcription start site.
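The distinction drawn above between start sites inside the single-stranded bubble and start sites in the adjacent double-stranded DNA can be expressed as a simple interval test. The sketch below is illustrative only. The bubble 3 coordinates (−42 to −31) are an inference, not a value stated in the text: they assume that the 12 bp described as lying upstream of bubble 3 (positions −54 to −43) directly abut a 12-base bubble. The function name is hypothetical.

```python
# Assumption (inferred, not stated in the text): bubble 3 occupies
# positions -42 to -31, i.e. the 12 bases immediately downstream of the
# "upstream of bubble 3" segment at -54 to -43.
BUBBLE_3 = (-42, -31)


def classify_start_site(position, bubble):
    """Say whether a start site lies in the bubble or in flanking duplex DNA."""
    first, last = bubble
    if position < first:
        return "upstream duplex"
    if position <= last:
        return "single-stranded bubble"
    return "downstream duplex"


# Start-site positions named in the text for templates at the bubble 3 site.
for site in (-34, -32, -29):
    print(site, classify_start_site(site, BUBBLE_3))
```

With these assumed coordinates, positions −34 and −32 fall within the bubble and −29 falls in the downstream double-stranded DNA, which matches the description of −29G as a start site within double-stranded DNA.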
We next investigated why transcription initiates poorly from the bubbles located between the mammalian initiation position and the normal HIS4 transcription start sites (bubbles 9 to 12; 42 to 65 bp downstream from TATA) (Fig. 2A). One possibility is that this region is at a nonoptimal distance from TATA and perhaps generates a strained, inactive open complex. Alternatively, the DNA sequence within this region may not be a good substrate for initiation. Several promoter variants were constructed to test these two possibilities (Fig. 5A). The single-stranded region of bubble 15 was moved 20 bp closer to TATA in two different ways: (i) 20 bp of internal promoter sequence was deleted (bubble 15[Δ20]), such that the bubble 15 sequence was moved to the position occupied by the inactive bubble 10, with the sequence immediately upstream being the same as that in bubble 15, and (ii) the bubble 10 single-stranded DNA sequence was precisely replaced by that of bubble 15 (bubble 10::15). In contrast to the nearly inactive bubble 10 template, these two new variants promoted initiation although with less efficiency than that of bubble 15 (Fig. 5B, lanes 1, 3, and 5). The sequence upstream of the bubble contributed to the efficiency of transcription, since bubble 15[Δ20] was transcribed more efficiently than bubble 10::15 (Fig. 5B, lanes 3 and 5), although neither was transcribed as well as bubble 15. These results show that the sequence of the bubble is critical for transcription activity, but the position with respect to TATA and the upstream DNA sequence can also influence transcription efficiency. In agreement with the above-described finding that the sequence of the heteroduplex region determines the responsiveness to TFIIF, transcription from both of these new bubble variants was stimulated by TFIIF (Fig. 5B, lanes 5 to 8), consistent with the fact that their single-stranded DNA is identical to that of bubble 15, which is normally stimulated by TFIIF.
To further test the finding that the DNA sequence of the bubble determines TFIIF responsiveness, we replaced the bubble 15 sequence with that of bubble 3 (bubble 15::3). As predicted, transcription from this new bubble was partially repressed by TFIIF (Fig. 5B, lanes 11 and 12). High-resolution analysis showed that TFIIF strongly repressed initiation within the bubble while slightly stimulating initiation in downstream double-stranded DNA, analogous to the behavior observed with bubble 3 (Fig. 5C, lanes 1 and 2).
The DNA sequences of bubbles 3 and 15 just upstream from the 3′ single-strand–double-strand junction are CTC (bubble 3) and CGC (bubble 15), where the G in bubble 15 (position +14) is the major initiation site in the presence of TFIIF. To test if this sequence difference is responsible for the different TFIIF responses, we altered bubble 3 base T at position −32 to G (Fig. 6A) and measured transcription with and without TFIIF. In comparison to bubble 3, where initiation within the single-stranded DNA was repressed by TFIIF, the single-base change in bubble 3 switched the response to TFIIF so that transcription from position −32 was now stimulated by TFIIF. Combined, our results show that the sequence of the heteroduplex near the single-strand–double-strand junction can have a strong influence on the response to TFIIF.
In principle, the modulation of transcription by TFIIF could be in response to changes in the template or nontemplate strand or both strands of the heteroduplex. To test which DNA strands are responsible for the TFIIF response, four heteroduplex variants, diagramed in Fig. 6B, were tested for transcription with and without TFIIF. All these variants were created at the site of bubble 15 (Fig. 6B). We chose to pair the bubble 6 sequence with bubble 15, since these two sequences are not complementary. Bub15/Bub15 is identical to bubble 15, with the wild-type HIS4 sequence on the template strand and identical bases on the opposite nontemplate strand. The other variants have either the bubble 6 heteroduplex or bubble 15 and 6 on either the template or nontemplate strands, as diagramed.
Figure 6C shows that initiation from bubble 15 at the G at position +14 is stimulated by TFIIF and that this pattern is the same when bubble 15 is only on the template strand (Bub6/Bub15) (lanes 2, 3, 8, and 9). In contrast, initiation from the single-stranded region of bubble 6 is repressed by TFIIF, with initiation starting primarily within the double-stranded region of the promoter. This behavior is identical to that observed when bubble 6 is present only on the template strand (Bub15/Bub6) (Fig. 6C, lanes 5, 6, 11, and 12). Also, note that the patterns of transcription initiation on all the templates with Pol II alone are nearly identical to those observed when TBP and TFIIB are also present, although at a lower level. Therefore, Pol II has an inherent preference for the initiation sites within the single-stranded bubbles that is enhanced by TBP and TFIIB. Together, our results demonstrate that it is the template strand that determines the transcription initiation pattern and the responsiveness to TFIIF.
The bubble templates allowed us to test the function of the TFIIB B-reader and B-linker regions. Previously, it was shown that mutations in the B-reader of the archaeal factor TFB were suppressed to various extents by the preopening of the DNA and that mutations in the B-linker were suppressed by archaeal TFE, which was proposed to function by stabilizing the DNA bubble (14). Several mutations were generated in the TFIIB reader and linker regions and tested by using the complete reconstituted transcription system on double-stranded DNA (Fig. 7, lanes 1 to 5). These TFIIB mutations resulted in little or no transcription. In contrast, all of these mutations were almost fully suppressed by the heteroduplex templates bubble 3 and bubble 15 (Fig. 7, lanes 6 to 15). The fact that these mutations are suppressed so efficiently by preopened DNA suggests that a primary function of the B-reader and B-linker regions is in the formation and/or stabilization of single-stranded DNA in the open complex state.
High-resolution analysis of initiation using the reader and linker mutants showed that they initiate from the same positions within single-stranded DNA as those of wild-type TFIIB; however, they initiate poorly from double-stranded DNA just downstream from the bubble. Transcription using the TFIIB reader mutants, as with wild-type TFIIB, is repressed by the addition of TFIIF. In contrast, transcription using the TFIIB linker mutant L110P is repressed by TFIIF for single-stranded initiation but is stimulated for initiation within downstream double-stranded DNA. This distinct behavior shows that the roles of the linker and reader are not identical in the responses to TFIIF.
An important question is how the architecture of the open complex differs from that of the PIC. To probe the structure of the minimal open complexes, TFIIB-FeBABE derivatives were used to form either PICs with double-stranded DNA and the complete reconstituted system or minimal open complexes with the bubble templates. The activation of FeBABE with H2O2 generates hydroxyl radicals that cut polypeptides within ~30 Å and allows the mapping of protein-protein interactions in large complexes (3, 7). The cleavage of the Pol II subunit Rpb1 was monitored by Western blotting using an antibody reactive against the N terminus of Rpb1. Figure 8A (lanes 2 and 3) shows that in the PIC, FeBABE positioned within the TFIIB Zn ribbon at either residue 37 or 53 generates strong cleavage within the Rpb1 active site and dock region (A/D) and within the Rpb1 clamp domain, as previously shown (2, 3). The probing of open complexes formed on bubble 3 with these same TFIIB derivatives gave an identical pattern, showing that the positions of the Zn ribbon domain are similar in PICs and open complexes (Fig. 8A, lanes 8 and 9). Similarly, FeBABE positioned at TFIIB residues 67 and 118, in the B-reader and B-linker regions, respectively, gave similar cleavage patterns in both PICs and open complexes (Fig. 8A, lanes 4, 5, 10, and 11), showing that the reader and linker loops are positioned similarly in both complexes.
In contrast, the cleavage generated by FeBABE linked to the TFIIB core domain at residue 135 (green residue in Fig. 8D) differed between the two complexes. In fully assembled PICs, this derivative generates strong cleavage in the Rpb1 clamp domain (Fig. 8A, lane 6, and D, blue highlighted surface) and in the fork/protrusion domain of Rpb2 (pink surface) (2). Importantly, this strong Rpb1 cleavage is absent in the open complex (Fig. 8A, lane 12). These results suggest that while the TFIIB Zn ribbon and reader/linker regions are positioned on Pol II similarly in both the PIC and open complexes, the position of the TFIIB core domain is very different, with the TFIIB core domain in the minimal open complex positioned away from the Pol II wall domain.
These mapping results suggested that one of the other general factors is responsible for the positioning of the TFIIB core domain within the PIC. To test whether TFIIF contributes to TFIIB positioning, PICs or minimal open complexes were assembled with TFIIB-FeBABE (at residue 135) and with or without TFIIF (Fig. 8B). PICs assembled lacking TFIIF contained all added general factors and Pol II (not shown), likely due to the high concentrations of factors used for assembly. These incomplete PICs were not active in initiation from double-stranded DNA. Rpb1 cleavage was monitored by using an antibody reactive against the N terminus of Rpb1. These results show that Rpb1 cleavage is observed only when TFIIF is present.
To extend these findings, TFIIB derivatives with FeBABE at either residue 135 or 184 on the core domain were used to probe the cleavage of Rpb2 containing a triple-Flag tag at the C terminus (Fig. 8C and D). In complete PICs, FeBABE at residue 184 cleaves primarily the Rpb2 wall domain (brown surface in Fig. 8D), while FeBABE at position 135 cleaves the fork/protrusion. Rpb2 cleavage from both these FeBABE-labeled positions was observed only upon the addition of TFIIF (Fig. 8C, lanes 2 and 4). Together, these results show that the TFIIB core domain is positioned differently in the PIC and minimal open complexes and that TFIIF is primarily responsible for this difference.
Since TFIIF repressed transcription from many of the bubble templates and has a dramatic effect on the location of the TFIIB core domain, we tested if the TFIIF-dependent positioning of the core domain contributes to repression. To test this hypothesis, the positioning of the TFIIB core needed to be unlinked from the presence of TFIIF. Given the results presented above, we reasoned that the Zn ribbon and possibly the reader/linker would be necessary for the full activity of the open complexes but that the TFIIB core domain would be dispensable. To generate a construct lacking the TFIIB core domain, the N terminus of TFIIB containing the ribbon and reader/linker regions was fused to the N terminus of TBP (Fig. 9A). The first 60 residues of yeast TBP are not conserved and likely serve as a flexible linker between the TFIIB N terminus and the TBP conserved domain.
This recombinant factor was purified and, as expected, had no activity in the reconstituted transcription system with double-stranded DNA (not shown). In striking contrast, the TFIIBN-TBP fusion worked nearly as well to promote transcription from bubble 3 as did TBP and TFIIB (Fig. 9B, compare lanes 2 and 4). If the TFIIF-dependent positioning of the TFIIB core domain on Pol II contributes to repression, then transcription using the fusion construct lacking the core domain should be resistant to TFIIF. As predicted by this model, the addition of TFIIF had no repressive effect, in contrast to the system with TBP and TFIIB, which was repressed by TFIIF (Fig. 9B, lanes 4 and 5). High-resolution analysis showed that the TFIIBN-TBP fusion allowed initiation at the same position in the single-stranded DNA bubble as that of wild-type factors, but the fusion was defective in promoting initiation from downstream double-stranded DNA (Fig. 9C). Together, our results suggest that the positioning of the TFIIB core domain on the Pol II wall at bubble 3 is inhibitory to initiation.
Although Pol II and the general transcription factors are highly conserved, there is a clear difference in the positions and mechanisms of transcription start site selection between S. cerevisiae and mammals. Here we have examined the ability of the yeast system to initiate transcription at variable distances from TATA using a series of premelted HIS4 promoters, forming minimal open complexes with Pol II, TFIIB, and TBP. We found that yeast Pol II has remarkable flexibility in the ability to initiate transcription from these bubbles spaced over >50 bp of promoter DNA. Within this window, we found that the sequence of the bubble was the most important determinant of promoter activity but that the position of the bubble and the sequence immediately upstream of the bubble also contributed to initiation efficiency. The activity of these templates required only Pol II, TBP, and TFIIB, the same components required for the human system to transcribe premelted promoters (12, 20, 25). The most active HIS4 promoter derivatives were those with bubbles overlapping either the mammalian start site (~30 bp downstream of TATA) or the normal HIS4 start sites (~70 bp downstream). However, bubbles of appropriate sequence positioned between these two optimal locations did support initiation, although less efficiently.
An unexpected finding was that TFIIF could modulate the activity of the minimal open complexes. At most bubble derivatives, TFIIF repressed initiation within the single-stranded region while permitting initiation 2 to 4 bp downstream of the bubble when an appropriate sequence was present. In contrast, bubbles surrounding the normal HIS4 start sites were either slightly stimulated by TFIIF or insensitive to its presence. Surprisingly, we found that the response to TFIIF was mediated by the sequence of the bubble. For example, the replacement of bubble 3 (repressed by TFIIF and overlapping the mammalian start site) with the bubble 15 sequence (stimulated by TFIIF and located at the yeast start site) created a promoter that was stimulated by TFIIF but that initiated transcription close to the mammalian start site. Conversely, the replacement of bubble 15 with the bubble 3 sequence gave a promoter that initiated far from TATA and was repressed by TFIIF. Additional experiments showed that the sequence of the single-stranded template strand, within a few bases upstream of the single-strand–double-strand junction, can determine whether an open complex is repressed or stimulated by TFIIF (Fig. 6A). What sequence feature of the heteroduplex region dictates the response to TFIIF? Heteroduplex templates that are not repressed by TFIIF tend to have some A/T character at the 5′ end of the bubble and G/C at the 3′ end, while bubbles repressed by TFIIF tend to have G/C spread throughout the bubble sequence. Heteroduplex regions that do not work as efficient promoters are very A/T rich (bubbles 9, 10, and 12).
Combined, our results show that the HIS4 promoter sequence is optimized to direct initiation from the in vivo initiation region located ~60 to 80 bp from TATA. Pol II, attempting to initiate at the mammalian position, would presumably be inhibited by TFIIF. Furthermore, the sequence of HIS4 ~40 to 60 bp downstream from TATA does not support initiation when it is single stranded. However, there must be additional control over transcription start site selection. The positioning of an active initiator (the bubble 15 sequence) 30 bp downstream from TATA in double-stranded DNA does not allow initiation (not shown). Consistent with this, the insertion of the strong SNR14 initiator at variable distances from the HIS4 TATA shows that transcription cannot initiate closer than ~50 bp from TATA (not shown). Thus, there are at least two levels of control that dictate the transcription start site in the yeast system: (i) the promoter sequence and (ii) an inherent property of Pol II and/or the general factors.
The complete purified transcription system also allowed us to test the function of the TFIIB reader and linker regions. B-linker mutations were found to block transcription from double-stranded DNA in the yeast and archaeal systems, and archaeal TFB linker mutants were rescued by the premelting of promoter DNA (14). Here we found that mutations in the TFIIB linker could also be rescued by the premelting of the DNA, as transcription from both bubbles 3 and 15 occurred at near-normal levels using the TFIIB L110P linker mutant. Similarly, three B-reader mutants were nearly inactive on double-stranded DNA but were rescued by premelted promoter DNA. The B-reader was previously reported to be critical for TFIIB function and to assist in transcription start site selection (1, 14, 21, 25, 26). Together, our new results show that both the B-reader and B-linker regions play a major role in melting and/or the stabilization of the melted DNA in the open complex; detrimental effects of the TFIIB mutations on the transcription of double-stranded DNA are almost completely reversed at heteroduplex promoters.
Finally, the minimal open complex system allowed us to probe the architecture of these complexes compared to that of PICs. Although the TFIIB ribbon, reader, and linker regions were positioned similarly in PICs and open complexes, there was a striking difference in the positions of the TFIIB core domain in the two complexes. In PICs, the TFIIB core domain binds the Pol II wall, while this TFIIB domain is positioned away from the wall in the minimal open complex containing TBP, TFIIB, Pol II, and the heteroduplex bubble. The addition of TFIIF caused a shift in the positioning of the TFIIB core domain to the location on the Pol II wall observed in PICs, and this occurred at both bubbles 3 and 15. We found that the elimination of the TFIIB core domain also eliminated the ability of TFIIF to repress transcription at bubble 3, showing that the repositioning of TFIIB contributes to the repression of transcription by TFIIF. These results give a different model for the architecture of the open complex state compared to previous proposals based on merging models for the PIC and the Pol II elongation complex (Fig. 10) (14, 18). In the previously reported models, the positioning of the TFIIB core domain on Pol II results in a sharp bend in the template strand, 14 bp upstream from the transcription start site, at the junction of single- and double-stranded DNAs. In contrast, the repositioning of the TFIIB core domain away from Pol II would eliminate this bend and the presumed resulting strain on the stability of the complex.
To develop a working model for the architecture of the TFIIF-containing open complexes, we need to account for the finding that TFIIF does not repress all minimal open complexes. Recall that open complexes formed on bubble 15 were slightly stimulated by TFIIF, while at the same time, TFIIF caused a repositioning of the TFIIB core on bubble 15 (Fig. 8B). One model consistent with our data is that TFIIF, either directly or indirectly, can “read” the sequence of the single-stranded template strand and help position this DNA within Pol II in an active and/or stable state. By this model, the stable positioning of the bubble 15 template strand would be assisted by TFIIF. In contrast, when TFIIF is added to open complexes that are repressed by TFIIF (e.g., bubble 3), the resulting bend in the template strand, caused by the binding of TFIIB to the Pol II wall, may pull the DNA into a nonfunctional position, leading to the repression of initiation. From the modeling of the PIC and the structure of the TFIIB-Pol II complex, we know that the TFIIB B-reader and B-linker as well as the unstructured linker in the TFIIF small subunit are close to or within the Pol II active-site cleft (9, 14, 18). In future work to test this model, it will be informative to probe protein-DNA contacts between the single-stranded bubble, TFIIB, TFIIF, and Pol II to more precisely determine the path of single-stranded DNA in both active and inactive minimal complexes and to probe for direct interactions between TFIIB, TFIIF, and promoter DNA.
We thank Hung-Ta Chen for initial design of the heteroduplex bubble strategy, Bruce Knutson for sequence analysis, Patrick Cramer for communication of an RNA Pol II purification method, and members of the Hahn laboratory for advice and comments on the manuscript.
This work was supported by grant GM053451 from the National Institutes of Health.
Published ahead of print 24 October 2011