The key step in bacterial promoter opening is recognition of the -10 promoter element (T-12A-11T-10A-9A-8T-7 consensus sequence) by the RNA polymerase σ subunit. We determined crystal structures of σ domain 2 bound to single-stranded DNA bearing -10 element sequences. Extensive interactions occur between the protein and the DNA backbone of every -10 element nucleotide. Base-specific interactions occur primarily with A-11 and T-7, which are flipped out of the single-stranded DNA base-stack and buried deep in protein pockets. The structures, along with biochemical data, support a model where the recognition of the -10 element sequence drives initial promoter opening as the bases of the non-template strand are extruded from the DNA double-helix and captured by σ. These results provide a detailed structural basis for the critical roles of A-11 and T-7 in promoter melting, and reveal important insights into the initiation of transcription bubble formation.
Transcription initiation is a major point for the regulation of gene expression, and the DNA-dependent RNA polymerase (RNAP) is the central enzyme of transcription. In bacteria, the promoter-specificity σ factor combines with the core RNAP to form the holoenzyme, which carries out all steps of initiation (Murakami and Darst, 2003).
The σ factor recruits core RNAP to sites of transcription initiation through recognition of specific DNA sequences called promoters (Shultzaberger et al., 2006). The group 1, or primary, σ factors (σ70 in Escherichia coli, σA in Thermus aquaticus) are responsible for the bulk of transcription during log phase growth and are essential for viability.
The -10 element (or Pribnow box) is the most highly conserved and essential bacterial promoter motif (Figures 1A, 1B; Hook-Barnard and Hinton, 2007; Shultzaberger et al., 2006). Once bound to the promoter in a closed (double-stranded) complex (RPc), σA-holoenzyme spontaneously isomerizes to the transcription-competent open complex (RPo), in which the DNA from within the –10 element downstream to the transcription start site (TSS, +1) is strand separated to form the transcription bubble (deHaseth et al., 1998). The numbering scheme for the transcription register, with the transcription start site as +1, and negative and positive numbers corresponding to upstream and downstream positions, respectively, is denoted in Figure 1A.
Based on the observation that σ conserved region 2 (Lonetto et al., 1992) contains a number of invariant aromatic and basic residues (Figure 1C), Helmann and Chamberlin (1988) proposed that a function of σ is to bind one of the DNA strands within the nascent transcription bubble to stabilize the strand-separated state. Subsequent studies have shown that the invariant aromatic and basic residues of σ play key roles in –10 element recognition and transcription bubble formation (deHaseth and Helmann, 1995). Alanine substitutions of many of the invariant aromatic residues result in promoter-melting defects (Juang and Helmann, 1994). Holoenzyme sequence-specifically binds single-stranded DNA (ssDNA) corresponding to the –10 element nontemplate-strand (nt-strand) sequence (Marr and Roberts, 1997; Roberts and Roberts, 1996; Savinkova et al., 1988), and this activity has been mapped to structural domain 2 of σ (σ2; Severinova et al., 1996; Young et al., 2001). Free σ, or fragments containing σ2, have sequence-specific ssDNA binding activity for the nt-strand –10 element sequence (Feklistov et al., 2006; Sevostyanova et al., 2007; Zenkin et al., 2007). In fact, the essence of RNAP promoter melting activity is localized to nt-strand –10 element/σ2 contacts (Young et al., 2004). These findings support a model whereby σ2-mediated capture of nt-strand bases of the –10 element extruded from the DNA double-helix underlies the initiation of strand separation and provides crucial stability to RPo. In addition to its role in initiation, sequence-specific contact between σ2 and ssDNA can regulate transcript elongation by inducing a pause when σ2 recognizes –10-element-like sequences in the nt-strand of the transcription bubble (Ring et al., 1996).
The structure of T. aquaticus (Taq) σA-holoenzyme with a fork-junction promoter fragment revealed that the invariant aromatic residues of σA2 are perfectly positioned to interact with exposed bases of the -10 element nt-strand, but details were not revealed at 6.5 Å-resolution (Murakami et al., 2002). To provide a high resolution structural description of the key protein/DNA interaction in RPo formation, we determined X-ray crystal structures of a Taq σA fragment comprising structural domains 2 and 3 (σA2-3; Campbell et al., 2002) bound to ssDNA containing the –10 element nt-strand sequence (Figures 1C-E).
The oligonucleotide sequence used for cocrystallization (Figure 1D) is based on the conserved motif discovered in ssDNA aptamers selected for binding to free Taq σA in vitro (Feklistov et al., 2006). The ssDNA oligonucleotides bound Taq σA2-3 (Figure 1C) with a dissociation constant (Kd) more than three orders of magnitude lower than a control, anticonsensus sequence (Figure S1A). The most detailed model (with the T-12A-11C-10A-9A-8T-7 –10 element; Figures 1D, E) was refined at 2.1 Å-resolution to an R/Rfree of 0.194/0.237 (Table S1, Figure S1B). The structure with the T-12A-11T-10A-9A-8T-7 –10 element was essentially identical (root-mean-square deviation of 0.165 Å over all atoms). Although biochemical analysis of the crystal contents established that σA3 was present in the crystals, electron density for the domain was absent and it was presumed disordered.
The ssDNA is draped across a highly conserved, positively-charged surface of σA2 (Figures 1E, S1C), with a 90° turn in the DNA backbone between the -11 and –10 positions (Figure 1E). Extensive interactions occur between the protein and the DNA backbone of every nucleotide from –12 to –6 (Figure 2A). Base-specific interactions occur primarily with T-12, A-11, A-8, and T-7, especially A-11 and T-7, which are notably flipped out of the ssDNA base-stack and entirely buried in protein pockets (Figure 1E).
The ssDNA/protein complex results in the burial of 1,096 Å2 total molecular surface area. On the DNA, the interactions occur almost entirely within the –10 element (92% of the buried surface area). The DNA interacts with residues from all of the σ conserved regions (1.2, 2.1, 2.2, 2.3, and 2.4; Figures 1C, E; Lonetto et al., 1992), with the bulk of the interactions occurring within σ regions 2.3 (73% of the buried surface area) and 2.1 (23%).
Based on previous studies, we expected to observe interactions between the downstream discriminator element (G-6G-5G-4) and residues of σA region 1.2 (Feklistov et al., 2006; Haugen et al., 2006; Haugen et al., 2008). In fact, the bases of the discriminator element do not interact with the protein, but peel away and participate in crystal contacts with GGG motifs from symmetry-related complexes, forming an unexpected G-quadruplex structure that is unlikely to be relevant to σ factor function but plays a critical role in crystal packing (Figure S1D). Nearby, the σA structure features a shallow, positively charged channel that likely accommodates the GGG motif in the physiological complex (Figure S1C; Haugen et al., 2008).
T is strongly favored at –12, the upstream position of the –10 element (Figure 1B). In general, promoter mutations that substitute T-12 with another base weaken promoter activity/binding, and mutations that substitute another base with T strengthen promoter activity/binding (Moyle et al., 1991). Optimal binding of RNAP holoenzyme to fork-junction promoter probes, which have a mostly single-stranded –10 element, required the base-pair at –12 (Guo and Gralla, 1998), suggesting that the –12 position normally remains base-paired even in RPo (Figure 1A). Nevertheless, T-12 can also be recognized in the context of ssDNA (Feklistov et al., 2006; Roberts and Roberts, 1996; Sevostyanova et al., 2007).
The three 5'-nucleotides of the ssDNA (5’T-14G-13T-12) hang off the edge of the protein structure, with only T-12 making significant interactions with the protein (Figures 1E, 2). The three nucleotides maintain a structure similar to one strand of a B-form double-helix, except that the base of G-13 is in the syn conformation. Modeling in a B-form double-helix reveals that there is ample space for the paired t-strand (-14 to –12), but continuation of the double-helix downstream to –11 is blocked by the invariant W-dyad (W256/W257) and other elements of the protein (Figures 2B, 3A).
T-12 is propped against the W-dyad, with W256 making extensive van der Waals interactions primarily with the T-12 deoxyribose moiety (as well as with the base), and W257 making an ‘edge-on’ van der Waals interaction with the T-12 pyrimidine ring (Figure 2B). The T-12 5'-phosphate [-12(P)] is held by polar interactions with R246. These interactions would likely occur regardless of the identity of the –12 base. Two further contacts explain the preference for T at this position: R237 reaches over from the σ region 2.2 α-helix and forms a hydrogen-bond (H-bond) with the O4 atom of T-12 [T-12(O4)], and the aliphatic side-chain of K241 makes van der Waals contact with the T-12(C5-methyl) (Figure 2B).
The primary role of σ2 in -10 element recognition was first uncovered in genetic screens for σ mutants that suppressed single-base substitutions in the -10 element. Specifically, it was shown that substitutions at the position corresponding to E. coli σ70 Q437 (Taq σA Q260), which is absolutely conserved among Group I σ's (Campbell et al., 2002; Gruber and Bryant, 1997; Figure 1C), to H or R allow efficient transcription from mutant promoters having a T to C substitution at position -12 (Kenney et al., 1989; Waldburger et al., 1990).
In the Taq σA2/-10 element DNA structure, electron density maps show clear evidence for two, roughly equally populated conformations of Q260. In either conformation, however, the shortest distance between any atom of Q260 and T-12 is 6.3 Å – too far for the genetic results to be explained by a direct interaction between Q260 and the T-12 base (Figure 1E). However, modeling of the t-strand A base-paired to T-12 places the major-groove edge of this base within H-bonding distance of Q260 (Figure 3A). Surveys of protein/DNA interactions (Hoffman et al., 2004; Luscombe et al., 2001) point to a strong preference for Q to interact with the major-groove edge of A, while H and R strongly prefer G (Figure 3A), which corroborates our modeling and explains previous genetic results.
Q260 and other σ region 2.4 residues implicated in -10 element recognition lie on a long α-helix that is roughly perpendicular to the trajectory of the promoter DNA double-helix (Murakami et al., 2002). Because of this, amino acid side chains of σ do not appear to be able to establish sequence-specific interactions with the double-stranded -10 element (as in RPc) due to the depth of the major groove. Structural modeling (Figure 3A) suggests that Q260 can recognize the major-groove edge of the -12 bp only when the -11 position and downstream are strand-separated, allowing the -12 position of the resulting upstream fork-junction to move closer to the σ region 2.4 α-helix.
The most highly conserved position of the –10 element is A-11 (Figure 1B). Only a few percent of σA promoters have a base other than A at this position. Mutations from the consensus A i) often completely inactivate promoters (Lee et al., 2004; Lim et al., 2001), ii) cause severe defects in RPo stability (Fenton and Gralla, 2001, 2003a, b; Guo and Gralla, 1998; Lim et al., 2001; Matlock and Heyduk, 2000; Schroeder et al., 2007), and iii) cause defects in binding to nt-strand ssDNA oligonucleotides (Roberts and Roberts, 1996; Figure S1A).
In addition to interacting with the T-12 backbone, the first W of the invariant W-dyad, W256, also interacts with the A-11 backbone and occupies the space where the A-11 ribose moiety would be if the DNA double-helix extended downstream from –12 (Figures 1E, 2B, 3A). This necessitates a flip of the entire A-11 nucleotide which, in turn, removes the A-11 base from the upstream base-stack formed by T-14G-13T-12 (Figure 1E). Instead, the A-11 base is completely buried in a hydrophobic protein pocket (Figures 2, 3B, 3C; Tsujikawa et al., 2002).
The A-11 pocket is perfectly shaped to fit an A and would poorly accommodate any other base, explaining the high conservation of A-11 in the –10 element (Figure 1B) and the severe effect of promoter mutations at this position. On one face of the A-11 base (the front, or downstream face), Y253 makes a π-stack, as predicted by Schroeder et al. (2007). On the opposite face, R246 stacks on the base, forming a cation-π interaction (Wintjens et al., 2000; Figures 1E, 2B, 3B). The position of the R246 side-chain is stabilized through a polar interaction with the -12(P) (Figures 2, 3B). In the absence of DNA, the R246 side chain is free to swing into an open configuration, allowing the A-11 base to slip into the pocket.
Studies using nucleotide analogs in place of A-11 revealed a strict requirement for a purine base with no side groups at the N1 and C2 positions (Lee et al., 2004; Matlock and Heyduk, 2000). Furthermore, methylation of A-11 at N3 interfered with holoenzyme binding (Johnsrud, 1978; Siebenlist et al., 1980). In the structure, the back wall of the A-11 pocket forms a tight steric fit with the base that is only possible if the N1, C2, and N3 positions are unsubstituted (Figure 3C). F242 makes a hydrophobic contact with A-11(C2), while the polypeptide backbone between residues 241 to 243 makes several H-bonds with the A-11 base. The A-11 pocket is topped by F248, which makes van der Waals contact with A-11(N3) (Figures 3B, 3C). The side-chain of E243 does not contact the A-11 base, but appears to play an important role by forming the bottom part of the pocket and by making polar interactions with R246 to help stabilize its position (Figure 2B).
The path of the DNA backbone wraps around the surface of the protein with a 90° turn between the –10(P) and –9(P) (Figure 1E). T252 interacts with –9(P) and serves as a fulcrum of the DNA backbone turn (Figures 1E, 4), which is consistent with its critical role. Of all the highly conserved residues of σ region 2.3 (Figure 1C), T252 is the least tolerant to substitution (Waldburger and Susskind, 1994). Changes at this position yield the most severe promoter melting defects in vitro (Schroeder et al., 2008), and the only substitution that yields functional σ in vivo is the highly conservative S (Waldburger and Susskind, 1994). Comparison of promoter binding vs. melting activities of the E. coli σ70 T429A substitution (corresponding to Taq σA T252A) suggests that T252 exerts its critical role at the strand separation step, after formation of RPc (Schroeder et al., 2008), consistent with the structure.
After A-11, the most highly conserved position of the –10 element, the next three nucleotides, T-10A-9A-8, are the least conserved (Figure 1B), and promoter mutations at these positions generally have less effect on promoter activity than mutations at the –12, -11, or –7 positions. In line with these observations, the –10/-9/-8 nucleotides are primarily bound through extensive interactions with the DNA sugar-phosphate backbone (Figures 2A, 4). The three bases are stacked together and point away from the protein; only the base of A-8 contacts the protein, making van der Waals contact with T255 and water-mediated contacts to the R259 side-chain and T252 main-chain (Figure 4).
The –9(P) and –8(P) make extensive polar contacts with protein side-chains and main-chain, whereas the -7(P) does not interact with the protein (Figures 4, S2). This explains chemical probing results, which found that ethylation of the –9(P) or –8(P), but not other phosphates in the –10 element, interfered with promoter binding by RNAP (Johnsrud, 1978; Siebenlist et al., 1980).
The downstream position of the –10 element, T-7, is almost as highly conserved as A-11 (Figure 1B). Promoter mutations at this position also generally have severe consequences for promoter activity (Moyle et al., 1991). The base-stack of C-10A-9A-8 is prevented from continuing in the downstream direction by T255 and R259 (Figure 4). Instead, the entire T-7 nucleotide is flipped out of the base stack (as predicted by Schneider, 2001) and buried in another protein pocket formed by residues from conserved regions 1.2, 2.1, and 2.3 of σ (Figure 4). Unlike the A-11 protein pocket, the T-7 pocket is i) spacious compared with the size of the base, and ii) hydrophilic in nature. The T-7 pocket accommodates well-ordered water molecules that participate in the recognition of the base (Figures 2A, 4). Every potential interacting moiety of the T-7 base is recognized by the protein.
Although the T-7 pocket is relatively spacious compared to the pyrimidine base, purine bases cannot be accommodated in the pocket. The spatial arrangement of H-bond donors and acceptors of a pyrimidine C-7 are not compatible with the T-7 pocket, and a favorable hydrophobic van der Waals interaction would be lost due to the absence of the C5-methyl (Figure S2A).
A salt bridge between R208 and the –6(P) is the final biologically relevant DNA/protein contact (Figure 4). Downstream, G-6G-5G-4 turn away from the protein to form the intermolecular G-quartet structure that participates in crystal packing (Figures 1E, S1D).
It has been established that σ2 sequence-specifically recognizes the nt-strand of the -10 element in RPo (Marr and Roberts, 1997; Roberts and Roberts, 1996; Savinkova et al., 1988), mediated by universally conserved aromatic residues of σ region 2.3 (Figures 1C, 1E; Juang and Helmann, 1994). The role, if any, of sequence-specific recognition of the duplex -10 element in RPc is less clear, due to the transient nature of this intermediate. Current thinking posits that the -10 element (or at least its upstream part) may be recognized sequence-specifically in dsDNA form (i.e. in RPc) by residues of σ region 2.4, while upon strand separation and RPo formation, residues of σ region 2.3 recognize the nt-strand bases of the -10 element (reviewed in deHaseth et al., 1998; Helmann and deHaseth, 1999; Hook-Barnard and Hinton, 2007). In contrast to this view, our structural modeling suggests that sequence-specific interactions between σ2 and the duplex -10 element are unlikely to form prior to strand-separation beginning at A-11: we hypothesize that recognition of the -10 element sequence only occurs when strand separation is initiated, as the A-11 and T-7 bases are captured in their σ2 pockets (Figures 2–4).
To test our hypothesis, we investigated the binding of dsDNA containing -10 element sequences to E. coli RNAP holoenzyme compared with DNA lacking the -10 element [anti(-10) DNA] under conditions favoring RPc (4°C). To monitor DNA binding, we employed a recently reported ‘RNAP beacon assay’ (Mekler et al., 2011), which takes advantage of the sensitivity of light emission from a fluorophore attached on the σ–surface near the cluster of aromatic residues implicated in -10 element recognition. In free RNAP, the probe fluorescence is quenched due to photoinduced electron transfer from the cluster. Upon binding of -10 element DNA, the contacts between the aromatic residues and the fluorophore become disrupted, resulting in increased fluorescence signal. The assay is ideal for our purposes since it reports on specific σ2/-10 element interactions while being ‘blind’ to non-specific protein/DNA binding elsewhere on the holoenzyme that can mask weak, specific interactions in conventional binding assays.
To focus on RNAP/-10 element interactions and avoid the contribution of other promoter elements to binding affinity, we chose a dsDNA fragment (-22 to +4) based on the lacUV5 promoter (Figure 5), for which a stable RPc has been reported at low temperatures (Spassky et al., 1985). We observed specific binding of the dsDNA fragment (Kd = 1.0 ± 0.4 μM at 4°C; Figures 5A, 5D). A single-base substitution at the -11 position (A to G) resulted in a significant drop of affinity (Kd = 33.7 ± 14.9 μM), almost to the level of the anti(-10) sequence (Kd = 48.9 ± 21.1 μM; Figure 5A). In the two extremes, the observed specific interaction could be between RNAP holoenzyme and the fully duplex DNA fragment (as postulated in RPc) or, alternatively, the -10 element may be bound to RNAP holoenzyme with the A-11 and T-7 bases in their respective σ2 pockets (Figure 1E). To distinguish between these two scenarios, we introduced modified bases at the -11 and -7 positions of the -10 element duplex designed to prevent binding of the bases in their σ2 pockets but preserve the recognition surfaces of the dsDNA, and measured their effects on binding.
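The affinity differences reported above can be put on a common scale with a short calculation. The following is a minimal sketch, not part of the original analysis: it takes the central Kd values quoted in the text (ignoring their uncertainties) and converts each ratio to an apparent binding free-energy difference using the standard relation ΔΔG = RT ln(Kd,variant/Kd,reference) at 4°C. The function and variable names are illustrative assumptions.

```python
import math

# Gas constant in kcal/(mol*K); a standard physical constant.
R_KCAL_PER_MOL_K = 1.987204e-3


def fold_change(kd_variant_um, kd_reference_um):
    """Ratio of dissociation constants (variant over reference)."""
    return kd_variant_um / kd_reference_um


def ddg_kcal_per_mol(kd_variant_um, kd_reference_um, temp_c=4.0):
    """Apparent free-energy penalty, ddG = RT * ln(Kd_variant / Kd_reference).

    temp_c defaults to 4 degrees C, the temperature used for the binding
    measurements described in the text.
    """
    temp_k = temp_c + 273.15
    return R_KCAL_PER_MOL_K * temp_k * math.log(
        fold_change(kd_variant_um, kd_reference_um)
    )


# Central Kd values (micromolar) quoted in the text; error bars are ignored.
KD_UM = {
    "consensus -10 duplex": 1.0,
    "A-11 to G": 33.7,
    "anti(-10)": 48.9,
}

if __name__ == "__main__":
    reference = KD_UM["consensus -10 duplex"]
    for name, kd in KD_UM.items():
        print(
            f"{name}: {fold_change(kd, reference):.1f}-fold, "
            f"ddG = {ddg_kcal_per_mol(kd, reference):.2f} kcal/mol"
        )
```

Because the reported uncertainties are large relative to the difference between the A-11 to G and anti(-10) values, the two penalties computed this way should be read as comparable rather than as distinct.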
According to available RPc models (Murakami et al., 2002; Shultzaberger et al., 2006), the (A/T)-11 base pair of the duplex -10 element is exposed to σ2 via its major groove (Figure 6). With 2,6-diaminopurine (diAP) in place of A-11, the major groove profile and overall geometry of the -11 base pair remain intact (Figure 5D; Cheong et al., 1988), but binding of the base in its σ2 pocket (Figures 2B, 3) is compromised due to steric clash of the exocyclic amine at the 2-position (Figure S2B; Lee et al., 2004). The diAP-11 incorporated into the duplex -10 element fragment caused a 14-fold decrease of binding (Figures 5B, 5D). We presumed that the residual binding of (diAP/T)-11 dsDNA [compared to anti(-10)] was due to recognition of the -10-like sequence on the bottom strand, where 4 bases out of 6 match the consensus (Figure 5D). Neutralizing this second -10 element with a 2-aminopurine (2AP) modification in the bottom strand opposite T-7 resulted in a loss of binding nearly to the level of the anti(-10) DNA (Figures 5B, 5D). The (T/2AP)-7 modification by itself does not affect the recognition of the nt-strand -10 element (Figure 5C). Introduction of diAP into each of three possible positions of a -35 element-containing duplex promoter fragment (-41 to -12) had no effect on binding (Figure S2C), ruling out possible effects of diAP on DNA helix geometry that could affect the putative -10 element dsDNA mode of binding.
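The notion of a -10-like sequence that matches the consensus at some number of positions out of 6 can be made concrete with a small scoring routine. This is a minimal sketch, assuming the TATAAT consensus given in the abstract; the helper names are hypothetical, and the sequence in the usage example is made up for illustration and is not the fragment shown in Figure 5D.

```python
CONSENSUS = "TATAAT"  # -10 element consensus, positions -12 to -7

# Translation table for Watson-Crick complements of unmodified bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def match_count(hexamer, consensus=CONSENSUS):
    """Number of positions at which a hexamer agrees with the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must have the same length")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus))


def scan(seq, consensus=CONSENSUS):
    """Score every window of a strand: list of (offset, window, matches)."""
    seq = seq.upper()
    n = len(consensus)
    return [
        (i, seq[i:i + n], match_count(seq[i:i + n], consensus))
        for i in range(len(seq) - n + 1)
    ]


def best_site(seq, consensus=CONSENSUS):
    """Highest-scoring window on one strand (first one wins on ties)."""
    return max(scan(seq, consensus), key=lambda hit: hit[2])


if __name__ == "__main__":
    top = "GGTATAATGG"  # hypothetical top strand, for illustration only
    print("top strand:", best_site(top))
    print("bottom strand:", best_site(reverse_complement(top)))
```

Scanning both the top strand and its reverse complement is what allows a second, weaker -10-like site on the bottom strand to be found in the same pass.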
Introducing modified bases into the t-strand would not be expected to affect the binding of the nt-strand A-11 in its σ2 pocket. Indeed, 5-methyl isocytosine (MeiC) or 3-nitropyrrole (3-NP) opposite A-11 alters the minor (MeiC, 3-NP) and major (3-NP) grooves of the base pair, but these modifications have no significant effect on DNA binding (Figures 5B, 5D).
In RPc, the (T/A)-7 bp is expected to face σ2 via its minor groove (Murakami et al., 2002; Shultzaberger et al., 2006; Figure 6). Replacing the (T/A)-7 bp with C/H (H, hypoxanthine) preserves the disposition of functional groups within the minor groove, but prevents binding of the -7 nt-base (Figures 5C, S2A). This modification resulted in a 30-fold increase in the Kd (Figures 5C, 5D). Introducing 2AP or even a universal base (5-nitroindole; 5-NI) in the t-strand opposite T-7 had no effect on dsDNA fragment binding even though these modifications significantly alter both major and minor groove profiles of the dsDNA (Figures 5C, 5D).
Finally, (H/C)-11 and (2-sT/A)-7 (2-sT, 2-thiothymidine) modifications address the unlikely cases where, in RPc, the (A/T)-11 bp faces σ2 from its minor groove and the (T/A)-7 bp faces σ2 from its major groove (Figure 5D). Again, even though these alterations preserve the respective dsDNA grooves that could be facing the σ surface, these modifications to the nt-strand A-11 and T-7 bases compromise the fit in their σ2 pockets and result in loss of binding affinity (Figure 5D).
In summary, we find that modifications to the A-11 or T-7 bases of the nt-strand of the -10 element expected to disrupt ssDNA binding (Figure 1E) compromise binding of the dsDNA fragment (marked red in Figure 5D). At the same time, dramatic alterations to the major and minor groove structures of the dsDNA do not significantly affect binding, as long as the A-11 or T-7 bases remain intact (marked green in Figure 5D). In combination, these results can only be explained if the critical nt-strand A-11 and T-7 bases are bound by σ2 in the single-stranded state and not in the context of fully closed dsDNA. We conclude that the specific binding observed for the unmodified duplex -10 element fragment is due to recognition of the A-11 and T-7 bases in their σ2 pockets, and that within the limits of detection of our assay, sequence-specific recognition of the duplex -10 element does not occur.
The -10 element was discovered more than three decades ago (Pribnow, 1975). Nevertheless, the molecular details of its recognition by the RNAP holoenzyme have, until now, been unknown. The crystal structures presented here reveal a high-resolution view of the sequence-specific interactions between the bacterial RNAP promoter-specificity σ factor and the nt-strand of the –10 element. These interactions are critical for nucleation of melting and stabilization of the initial strand separated state that ultimately allow the formation of the transcription bubble, providing the RNAP active site access to the DNA t-strand for coding of the transcript sequence.
Base-specific interactions between σ and the ssDNA quantitatively reflect the conservation at each position of the –10 element (Figure 1B). The observed ssDNA/σ2 interactions are completely consistent with previous footprinting and chemical probing experiments within the –10 element (Johnsrud, 1978; Siebenlist et al., 1980) and explain the strict requirements on the base at the –11 position (Figure 3C; Lee et al., 2004; Lim et al., 2001; Matlock and Heyduk, 2000).
All of the protein amino acid side-chains that interact with the ssDNA are highly conserved (most of them invariant; Figure 1C). Many σ2 residues implicated previously in promoter binding [R237, K241 (Tomsic et al., 2001); K249 (Waldburger and Susskind, 1994); R259 (Fenton et al., 2000)] or promoter-melting [F248, Y253, W256, W257 (Juang and Helmann, 1994; Schroeder et al., 2009); T252 (Schroeder et al., 2008)] are seen to play important roles in the complex. The conserved aromatic residues of σ region 2.3 were presumed to fulfill their promoter melting role through stacking interactions with the –10 element nucleotide bases (Helmann and Chamberlin, 1988), but only Y253 participates in the complex in this way by stacking on the flipped-out A-11 (Figures 2B, 3B; Schroeder et al., 2009). Many residues not previously implicated in promoter binding appear to make critical interactions (for instance, L108, N206, R208, L209, Figure 4; R246, Figures 2, 3B).
During promoter opening RNAP unwinds about 1.3 turns of the dsDNA (from -11 to +3) without any external energy input (such as ATP hydrolysis), utilizing instead the binding free energy of interactions with promoter DNA. Promoter melting, therefore, is driven by RNAP affinity towards the “final state” i.e. the conformation of promoter DNA existing in RPo. The crucial role of σ2 in the nucleation of promoter opening is to provide favorable interactions with the melted -10 element DNA. Specific recognition of the dsDNA in the region to be melted would stabilize the closed DNA and would therefore be unfavorable for melting. Indeed, our structural modeling and biochemical data argue that -10 element sequence read out is coupled with the nucleation of strand separation.
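The figure of about 1.3 turns follows from counting positions in the transcription register. The sketch below is a rough check under two stated assumptions: the register has no position 0 (the TSS is +1 and the base immediately upstream is -1), and B-form DNA has an average helical repeat of about 10.5 bp per turn. The function names are illustrative.

```python
# Assumed average helical repeat of B-form DNA in solution (bp per turn).
BP_PER_TURN = 10.5


def register_span(start, end):
    """Count positions from start to end inclusive in the transcription register.

    The register skips 0: ..., -2, -1, +1, +2, ...
    """
    if start == 0 or end == 0:
        raise ValueError("the transcription register has no position 0")
    if start > end:
        raise ValueError("start must be upstream of (or equal to) end")
    span = end - start + 1
    if start < 0 < end:
        span -= 1  # correct for the missing position 0
    return span


def helical_turns(start, end, bp_per_turn=BP_PER_TURN):
    """Number of helical turns spanned by the given register interval."""
    return register_span(start, end) / bp_per_turn


if __name__ == "__main__":
    n_bp = register_span(-11, 3)
    print(f"-11 to +3 covers {n_bp} bp, "
          f"about {helical_turns(-11, 3):.2f} helical turns")
```

With these assumptions the interval from -11 to +3 spans 14 base pairs, which is in line with the value quoted in the text.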
We show here that even under conditions that favor RPc (4°C), RNAP specifically recognizes only the melted state of the -10 element with nt-strand bases at -11 and -7 flipped out of the DNA base stack (Figure 5). Complexes formed between RNAP holoenzyme and duplex -10 element DNA at 4°C appear “closed” on the basis of non-reactivity towards MnO4- oxidation, a technique used to reveal unstacked/solvent-exposed T bases (data not shown, Niedziela-Majka and Heyduk, 2005). This suggests that in intermediate complexes observed on various promoters at low temperatures, strand separation may be initiated but the T bases may remain stacked and/or protected by protein contacts and thus be non-reactive to MnO4- (Davis et al., 2007). Proceeding from this initial recognition state to the final stable transcription-competent RPo requires a combination of additional factors, such as auxiliary promoter elements, negative supercoiling, or elevated temperatures.
The cellular RNAPs from all three domains of life are conserved in sequence, structure, and catalytic mechanism (Lane and Darst, 2010), but initiation scenarios are distinct. Eukaryotic RNAPs (I, II, and III) employ an arsenal of general transcription factors (GTFs) that assemble at promoter elements located upstream and downstream of the TSS (Roeder, 2005), preparing a platform for RNAP recruitment. In contrast, bacterial σ is unable to recognize promoters on its own and functions, therefore, as a dissociable RNAP subunit.
Some scattered functional analogies between GTFs and σ seem to have resulted from convergent evolution. In the case of RNAP II (the best understood eukaryotic initiation system), the TATA-box (reviewed in Nikolov and Burley, 1997), despite striking sequence similarity, is analogous to the -10 element neither in function nor in recognition mechanism. The fact that it is recognized in double-stranded form (Nikolov and Burley, 1997), and is located ~20 base pairs upstream of the origin of melting and ~30 base pairs upstream of the TSS, makes its role more similar to that of the bacterial -35 element. Continuing this analogy, the GTFs that assemble around the TATA-box (TBP and elements of TFIIB; Kostrewa et al., 2009; Liu et al., 2010) play roles analogous to bacterial σ4. The B-linker and B-reader of TFIIB, possible counterparts of bacterial σ2, may be responsible for stabilization of melting and TSS selection, respectively, but the mechanism of their action remains unknown (Kostrewa et al., 2009).
The hallmark of our model for the role of σ2 in -10 element recognition and melting is that these steps are spatially and temporally coupled. By contrast, in eukaryotic initiation these steps appear to be independent and follow complex, step-wise mechanisms allowing for multiple layers of regulation (Kostrewa et al., 2009; Liu et al., 2010; Roeder, 2005).
Many studies have suggested that strand separation initiates at –11 and then propagates downstream (Chen and Helmann, 1995; Heyduk et al., 2006; Lim et al., 2001). Recent computational modeling found that initiation of strand separation at –11 resulted in efficient kinetics of RPo formation, while bubble initiation elsewhere yielded inefficient trajectories (Chen et al., 2010). A structural basis for these observations, as well as other insights into the initiation of transcription bubble formation, is provided by a comparison of the structure presented here (representing RPo) with a simple RPc model (adapted from Murakami et al., 2002; Figure 6).
The flipping of the A-11 base is key to initiating the strand separation process (Heyduk et al., 2006; Lim et al., 2001). We noted that the absolutely conserved σ2-W256 resides on the downstream face of T-12, precisely where A-11 would be if the dsDNA continued downstream to -11 (Figures 2B, 3A). In the holoenzyme structure the bulky W256 side-chain protrudes from the bottom of a shallow, electrostatically basic trough with dimensions to accommodate dsDNA and formed by surfaces of σ2, σ3, and the β-subunit (Figure 6B). This prominent position of W256 at the -11 register of RPc suggests that when dsDNA is loaded in this trough, directed by electrostatic interactions with the DNA backbone, the W256 side chain may act as a ‘wedge’ for disrupting the (A/T)-11 base pair, initiating A-11 flipping (active mechanism) or stabilizing the conformation of the spontaneously flipped A-11 base (passive mechanism). These structural considerations corroborate initial suggestions for a crucial role of W256 in A-11 flipping based on the observation that the substitution W433A in E. coli σ70 (corresponding to Taq σA W256) had no effect on single-stranded -10 element binding but dramatically slowed the rate of double-stranded promoter DNA opening (Tomsic et al., 2001).
The proposed role of W256 in A-11 recognition invokes a mechanistic analogy to base flipping by other DNA-binding proteins that also use a ‘wedge’ residue (often an aromatic side chain) to invade the DNA double-helix and fill the space vacated by the flipped base, stabilizing its extrahelical conformation. Examples of such base-flipping proteins are numerous and include base excision repair proteins (Lau et al., 1998; Yang et al., 2009) and Tn5 transposase, which also uses a W residue as a ‘wedge’ for base flipping (Davies et al., 2000).
Although the -12 position likely remains base-paired even in RPo, structural considerations indicate that the (T/A)-12 base pair can only be recognized after A-11 is removed from the double-helix (Figure 3A), suggesting that sequence-specific readout of the first two upstream positions of the -10 element occurs at the A-11 flipping step. Subsequently, DNA helix untwisting continues downstream, driven by σ2 interaction with the sugar-phosphate backbone of the central non-conserved part of the -10 hexamer, T-10A-9A-8 (Figure 4). This brings T-7 closer to its pocket on σ2. The recognition of T-7, and of the discriminator element further downstream, may serve as ‘check-points’ along the pathway of propagation of melting towards the TSS (Figure 6; Matlock and Heyduk, 2000).
Formation of the final RPo requires interactions between RNAP core-subunits in the active-site cleft and promoter DNA downstream of the -10 element (Saecker et al., 2002), but σ2/-10 element recognition is sufficient for the initial strand separation (Young et al., 2004). The key bases recognized by σ2, A-11 and T-7, are roughly 180° apart on the DNA helical axis (Figure 6C); their binding in the pockets, as observed in our crystal structures, therefore results in untwisting of the double helix by about half a turn. This initial untwisting also necessitates a sharp bend in the nt-strand (a 90° kink between positions -11 and -10, as observed in our structure; Figure 1E), which is likely responsible for positioning the downstream dsDNA in the RNAP active-site channel (Saecker et al., 2002; Figure 6A).
Whether σ actively disrupts the (A/T)-11 base pair or passively captures transiently exposed base(s) remains to be established – the two pathways are not mutually exclusive. Double-stranded DNA is thermodynamically stable but kinetically labile; individual base pairs have an average lifetime on the order of milliseconds (Gueron and Leroy, 1995) and are in equilibrium with flipped-out bases. Particularly unstable are ‘TA’ steps (as found in the –10 element) due to the relatively weak stacking interactions (Protozanova et al., 2004). Indeed, solution studies of bacterial promoters have shown that the –10 element has an altered structure, even in the absence of proteins (Drew et al., 1985; Spassky et al., 1988). The detailed mechanism of initial base flipping in the -10 element poses intriguing questions for future research.
Full details of Experimental Procedures are presented in the Supplemental Information.
The Taq σA2-3 fragment was subcloned into a pET28a-derived expression vector, transformed into E. coli BL21(DE3) cells, overexpressed, and purified using standard methods. The purified σA2-3 was concentrated to 40 mg/ml by centrifugal filtration (VivaScience) in 10 mM Tris, pH 8.0, 150 mM KCl, 0.1 mM EDTA, then flash frozen and stored at -80°C. The PAGE-purified oligonucleotides (Oligos Etc.) were dissolved in water to a concentration of 3 mM prior to use.
The ssDNA/σA2-3 complex was prepared on ice (molar ratio 2:1) at a final protein concentration of 10 mg/ml. Crystals were grown at 22°C using hanging-drop vapor diffusion by mixing equal volumes of the complex and a reservoir solution of 100 mM Tris, pH 8.5, 5% (w/v) PEG 8000, 20% (w/v) PEG 300, 10% (v/v) glycerol, and 0.15% (w/v) mellitic acid. For data collection, crystals were flash-frozen in liquid nitrogen directly from the mother liquor.
Diffraction data were collected at the Advanced Photon Source (Argonne National Laboratory) beamline NE-CAT 24 ID-E and at the National Synchrotron Light Source (Brookhaven National Laboratory) beamline X29. The structure was solved by molecular replacement. Iterative rounds of model building and refinement yielded the final models (Table S1).
The RNAP beacon assay was performed essentially as described (Mekler et al., 2011), except that Alexa 555 fluorophore was used.
We thank R. Shultzaberger and T. Schneider for providing Figure 1B and for helpful discussions; P. deHaseth, S. Malik, A. Mustaev, R. Saecker, and M. Schapira for helpful discussions; K.R. Rajashankar and F. Murphy at APS NE-CAT beamline 24ID-E, and W. Shi at NSLS beamline X29, for support with synchrotron data collection; and R. MacKinnon for use of the fluorescence plate-reader. A.F. is a Merck Postdoctoral Fellow at The Rockefeller University. This work was based, in part, on research conducted at the APS and the NSLS, supported by the US Department of Energy, Office of Basic Energy Sciences. The NE-CAT beamlines at the APS are supported by award RR-15301 from the NCRR at the NIH. This work was supported by NIH R01 GM053759 to S.A.D.
σ subunit of RNA polymerase binds and opens the promoter sequence.
The X-ray crystallographic coordinates and structure factor files have been deposited in the Protein Data Bank with accession IDs 3O0B (TGTACAATGGG oligo) and 3O0C (TGTATAATGGG oligo).
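For readers who want to relate the two crystallized oligonucleotides to the -10 element numbering, the following minimal Python sketch aligns each oligo, without gaps, to the consensus hexamer (T-12A-11T-10A-9A-8T-7) and assigns a promoter position to every nucleotide. The oligo sequences and accession IDs are those listed above; the function names and the assumption that the register can be read off from the best ungapped match to the consensus are illustrative only and are not part of the deposited data or of the procedures used in this work.

```python
# Consensus -10 hexamer; its first base sits at position -12 (T-12 ... T-7).
CONSENSUS = "TATAAT"
FIRST_POSITION = -12

# Oligonucleotides present in the two deposited structures.
OLIGOS = {"3O0B": "TGTACAATGGG", "3O0C": "TGTATAATGGG"}


def best_offset(oligo, motif=CONSENSUS):
    """Return (offset, identities) of the best ungapped match of motif in oligo."""
    best = (0, -1)
    for offset in range(len(oligo) - len(motif) + 1):
        # zip stops at the end of the motif, so only len(motif) bases are compared.
        identities = sum(a == b for a, b in zip(oligo[offset:], motif))
        if identities > best[1]:
            best = (offset, identities)
    return best


def register(oligo):
    """Assign a promoter position to each base (assumed register, see text)."""
    offset, _ = best_offset(oligo)
    return [(base, FIRST_POSITION + i - offset) for i, base in enumerate(oligo)]


if __name__ == "__main__":
    for pdb_id, oligo in OLIGOS.items():
        labelled = " ".join(f"{base}{pos}" for base, pos in register(oligo))
        print(pdb_id, labelled)
```

Under this assumed alignment, both oligos place an A at -11 and a T at -7, the two positions that make base-specific contacts with σ2, and the oligos differ only at the non-conserved -10 position.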
Supplemental Data include one table and two figures and can be found with this article online at http://www.cell.com/supplemental/***.