RNA polymerases catalyze RNA chain growth rapidly (20–70 nucleotides/sec for RNA polymerase II) in a template-directed manner (1). They do not, however, move monotonically forward on the template. Rather, they oscillate between forward and backward movement at every step in the process (5). Three states of a transcribing complex are observed (Fig. 1): a pre-translocation state, in which the nucleotide just added to the growing RNA chain is still in the nucleotide addition site; a post-translocation state, in which the enzyme has moved forward on the template, making the nucleotide addition site available for entry of the next nucleoside triphosphate (NTP); and a backtracked state, in which the enzyme has retreated on the template, extruding the 3’-end of the RNA (6). Forward movement is favored by NTP binding, which traps the complex in the post-translocation state. Backtracking predominates when forward movement is impeded, for example by damage in the template or by nucleotide misincorporation in the RNA (8). Backtracking by one or a few residues is reversible, whereas backtracking a greater distance leads to arrest, from which recovery is only possible by cleavage of the transcript in the polymerase active center, induced by SII (TFIIS) in eukaryotes and GreA in bacteria (reviewed in (10)). Backtracking and cleavage enable proofreading of the transcript, through the excision of misincorporated nucleotides and resynthesis (9).
Fig. 1 The three states of a pol II transcription elongation complex. RNA transcript is red, DNA template is blue. The nucleotide base just added to the 3’-end of RNA, and the complementary base in the DNA template, are represented by cyan and green ...
X-ray crystal structures of pol II transcribing complexes in the pre- and post-translocation states have been obtained, illuminating the mechanisms of RNA synthesis and product release (22). Here we report X-ray structures of backtracked complexes, showing their definitive nature and completing the overall picture of the transcription process. Some of the backtracked structures include TFIIS, revealing interactions responsible for transcript cleavage, proofreading, and recovery from arrest.
Backtracked complexes were produced by two approaches. DNA-RNA hybrids with mismatched nucleotides at the 3’-end of the RNA were bound to pol II, directly recreating a backtracked state. Alternatively, DNA-RNA hybrids bearing DNA damage downstream of the 3’-end of the RNA were transcribed by pol II, with the expectation that the enzyme encountering the impediment would retreat to a backtracked state. In both cases, crystal structures were obtained showing extra electron density for the transcript downstream of the nucleotide addition site, indicative of an extruded 3’-end. The structures were, moreover, closely similar.
Fig. 2 Structure of pol II elongation complex in the backtracked state. (A) Complex with one mismatched residue at the 3’-end of the RNA (12mer RNA). The view is a standard one, from the “Rpb2 side,” as in the past (22–25). Difference ...
In a complex with a hybrid containing one mismatched residue at the 3’-end of the RNA (12mer RNA), the last matched residue (UMP) was in the nucleotide addition site (designated +1) and the mismatched residue (GMP) was in a location downstream (+2) not observed in any previous transcribing complex structure (Fig. 2). The UMP base was paired with its complement in the DNA template but tilted 15 to 20° out of the plane. Between the UMP and GMP, the backbone of the backtracked RNA was bent over 120° out of the path of the hybrid helix, enabling the GMP base to hydrogen bond with the AMP base two residues away (position −1) in the RNA chain.
Interactions with the bridge helix, trigger loop, and other pol II residues created a binding pocket for the backtracked GMP and caused the deviations from hybrid helix geometry observed. Bridge helix residue Rpb1 Thr 827, Rpb2 Tyr 769, and Rpb2 529–531 contacted the GMP. Rpb1 Asn 479 and Arg 446 contacted the UMP one nucleotide and the AMP two nucleotides away from the GMP. The trigger loop was in a novel conformation, intermediate between the “open” and “closed” conformations previously observed (22). Trigger loop residues Rpb1 Asn 1082 and Gln 1078 contacted the phosphate group between the UMP and GMP, while His 1085, believed to play a role in catalysis in the closed conformation (25), was directed away from the RNA and appeared to contact Rpb1 Ser 769 and Gly 772.
In a complex with a hybrid containing two mismatched residues at the 3’-end (13mer RNA), the last matched residue (UMP) and the first of the mismatched residues (GMP) were in locations similar to those for the complex with one mismatched residue. The UMP base, paired with its complement in the DNA template, was tilted 20 to 30° out of the plane, and the backbone of the backtracked RNA was bent 80 to 90° out of the path of the hybrid helix. The next backtracked (second mismatched) residue was not revealed in the structure, due to motion or to static disorder. In molecular dynamics simulations (SOM), this backtracked residue was highly mobile, arguing in favor of motion rather than static disorder.
Backtracked complexes containing additional mismatched residues and with different mismatched sequences showed the same conformation of the first mismatched residue. A complex with three mismatched residues (14mer RNA) also showed no ordering of residues beyond position +2, but a complex with seven mismatched residues (18mer RNA) produced more electron density downstream (Fig. 3). Backtracked nucleotides at positions +3 and +4 could be built into this density, though the base and sugar at +4 were disordered. The backbone of the backtracked RNA was sharply bent between positions +2 and +3, due to salt bridges of Rpb2 residues Gln 763 and Arg 766 with the phosphate between +2 and +3. Rpb1 Lys 752, Rpb2 Ser 1019, and Rpb2 Arg 1020 appeared to form hydrogen bonds or salt bridges with the phosphate between +3 and +4. Three regions of pol II seen to interact with the backtracked RNA, bridge helix residues Rpb1 824–827 and Rpb2 regions 760–772 and 529–531, also interacted with one another, forming a network presumably enhancing the stability of the backtracked state.
Fig. 3 Structure of backtracked complex with seven mismatched residues at the 3’-end of the RNA (18mer RNA). (A) Same representation as , except that ribonucleotide at +3 position is magenta and phosphate group at +4 position is orange. (B) Interactions ...
We also solved backtracked complexes containing RNA of the same length (13mer) but with a G<>U rather than G<>G mismatch, and obtained nearly identical structures (data not shown). In the more extensively backtracked complex structures, the trigger loop was in an “open” conformation, remote from the nucleotide addition site. Part of the trigger loop (1078–1081), however, remained near enough to interact with the backtracked RNA.
Cleavage of backtracked RNA in the presence of TFIIS depends upon an Asp 290 – Glu 291 dipeptide in a hairpin loop of domain III of the protein (29), a zinc ribbon motif. Domain II, a three-helix bundle, and the linker between domains II and III, are responsible for binding to pol II (30). Domain I, an N-terminal four-helix bundle, is nonessential in vitro but appears to play a role in transcription initiation (37). A crystal structure of transcribing pol II in the post-translocation state, complexed with TFIIS (26), showed domain III inserted in the pol II secondary channel (pore and funnel) but gave limited insight into the TFIIS mechanism, due to the absence of backtracked RNA. Superposition of this structure (1Y1V) with our backtracked complex structure reveals a steric clash of the hairpin loop of TFIIS domain III with backtracked RNA (Fig. 4). In particular, the catalytic Asp 290 and Glu 291 residues of TFIIS come too close to the phosphate group between the −1 and +1 positions in the RNA, and a clash with the side chain of Rpb2 Tyr 769 may be noted as well. This TFIIS structure might reflect the state following RNA cleavage, but to investigate interactions relevant to cleavage itself, we determined the structure of pol II in the backtracked state, with 13mer RNA, complexed with a point mutant of TFIIS (E291H) unable to cleave the RNA (32). Initial phases were obtained by molecular replacement with the previous structure (1Y1V), omitting TFIIS, Rpb4/7, the trigger loop, and nucleic acids. The initial electron density map clearly showed the three-helix bundle of domain II of TFIIS, part of domain III, a long alpha-helix of the interdomain linker, the pol II trigger loop, and nucleic acids. Trigger loop and nucleic acid models were manually built into the electron density. TFIIS was then manually docked and rigid body refinement was performed, followed by manual adjustment of the conformation of the hairpin loop of domain III.
Fig. 4 Structure of backtracked complex with TFIIS. (A) Clash of backtracked RNA with published structure of TFIIS. Backtracked complex with one mismatched base (12mer RNA), depicted as in Fig. 2A, was superposed with TFIIS (1Y1V), with the use of Coot, based ...
Parts of the electron density map were of sufficient quality that side chains could be placed and specific interactions between TFIIS and pol II identified (for example, the interdomain linker - pol II interaction). In the case of the hairpin loop, only the main chain could be built, due to limitations of resolution and less defined secondary structure. The structure clearly showed, however, a different position of the hairpin loop from that seen previously. The position of the backtracked RNA was also affected, as the residue at +2 was disordered. The remainder of TFIIS was similar in conformation between the present and previous structures (26). The trigger loop was in an “open” conformation in both structures (26).
Superposition of our backtracked pol II – TFIIS complex structure with our structures of backtracked pol II alone revealed a steric clash of the hairpin loop with the RNA, but one less severe than that observed with the previous post-translocation complex – TFIIS structure. The clash was more pronounced for longer backtracked RNAs (15mer, 18mer, and 24mer RNAs). Evidently, some rearrangement is required to accommodate both TFIIS and RNA in the complex, as shown by the disorder of the residue at +2 in the 13mer RNA cocrystal.
In the presence of wild-type TFIIS, backtracked complexes containing one, two, three, four, and seven mismatched residues were cleaved at positions two, three, four, five, and eight residues from the 3’-end of the RNA, respectively (Figs. S4A, S4C). Based on the structures of the complexes, we conclude that cleavage occurs between the nucleotide addition site (+1) and the preceding position (−1). In effect, cleavage represents the reversal of nucleotide addition, except with water rather than pyrophosphate as the nucleophile (40).
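The correspondence between the number of mismatched residues and the observed cleavage position can be restated as a short calculation. The following Python sketch is illustrative only: the function name `cleavage_product_length` is hypothetical, and the rule it encodes (last matched residue at +1, mismatched residues from +2 onward, scissile phosphate between −1 and +1) is an assumption taken from the structural interpretation given here, not an independent measurement.

```python
def cleavage_product_length(n_mismatched: int) -> int:
    """Length of the 3'-terminal fragment released by cleavage.

    Assumption: the last matched residue occupies +1, the mismatched
    residues occupy +2 onward, and cleavage occurs between -1 and +1.
    """
    if n_mismatched < 1:
        raise ValueError("a backtracked complex needs at least one mismatched residue")
    # Released fragment = residue at +1 plus every mismatched residue beyond it.
    return n_mismatched + 1


# Mismatched residues -> cleavage position from the 3'-end, as reported in the text.
observed = {1: 2, 2: 3, 3: 4, 4: 5, 7: 8}
consistent = all(cleavage_product_length(n) == pos for n, pos in observed.items())
```

Under these assumptions, every reported cleavage position is reproduced, and the one-residue-backtracked complex releases a dinucleotide.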
The cleavage rates of mismatched complexes were 15- to 30-fold greater than those of matched complexes (Fig. S4A and SOM), in keeping with results of previous studies (9). This preferential cleavage of mismatched complexes is the presumed basis for the error correction capability of TFIIS (21). Selective removal of mismatched residues has been observed for E. coli RNA polymerase stimulated by the TFIIS homolog GreA (42), archaeal RNA polymerase stimulated by TFS (43), and pol III with its intrinsic counterpart of TFIIS (45).
Selective removal of mismatched residues has also been observed in the absence of GreA for T. aquaticus RNA polymerase (46). This “intrinsic” cleavage is much slower than the TFIIS-stimulated reaction (Fig. S4 and data not shown). The half time for intrinsic cleavage of a backtracked complex with one mismatched residue (12mer RNA) was about 440 sec, compared with about 10 sec for cleavage in the presence of 100 nM TFIIS (Keq for TFIIS-pol II interaction is about 100 nM; Fig. S4C and SOM). Intrinsic cleavage of longer backtracked complexes was barely detectable. More rapid rates of intrinsic cleavage reported for T. aquaticus RNA polymerase (46) likely reflect differences in reaction conditions, such as higher Mg2+ concentration, pH, and temperature, as well as the different RNA polymerases involved.
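To put the two half times on a common scale, they can be converted to apparent rate constants. The sketch below is a rough illustration, not part of the reported analysis: it assumes simple first-order decay of the uncleaved complex (k = ln 2 / t½) and treats the quoted Keq of about 100 nM as a dissociation constant for 1:1 binding with free TFIIS approximately equal to total TFIIS. The helper names `rate_constant` and `fraction_bound` are hypothetical.

```python
import math


def rate_constant(half_time_s: float) -> float:
    """Apparent first-order rate constant (1/s), assuming k = ln(2) / t_half."""
    return math.log(2) / half_time_s


def fraction_bound(ligand_nM: float, kd_nM: float) -> float:
    """Fractional occupancy for simple 1:1 binding (free ligand ~ total ligand)."""
    return ligand_nM / (ligand_nM + kd_nM)


k_intrinsic = rate_constant(440.0)        # roughly 1.6e-3 per second
k_tfiis = rate_constant(10.0)             # roughly 6.9e-2 per second
stimulation = k_tfiis / k_intrinsic       # equals the ratio of half times, 440 / 10
occupancy = fraction_bound(100.0, 100.0)  # TFIIS concentration equal to Keq
```

Under these assumptions, the TFIIS-stimulated reaction at 100 nM is about 44-fold faster than intrinsic cleavage, at a TFIIS concentration expected to occupy about half of the pol II complexes.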
In our system, longer backtracked RNAs were also cleaved more slowly in the presence of TFIIS. The cleavage rates of the backtracked 12mer, 13mer, and 14mer were 1200-, 480-, and 60-fold faster, respectively, than that of the backtracked 18mer (SOM). This trend presumably relates to the rearrangement required to accommodate both TFIIS and backtracked RNA in the pol II secondary channel. Longer RNA may undergo rearrangement less readily, due to more extensive interaction in the secondary channel.
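The relative rates quoted above are all referenced to the 18mer, so the step-to-step differences between shorter RNAs follow by division. The short Python sketch below makes that arithmetic explicit; the dictionary `relative_rate` and the helper `fold_difference` are hypothetical names, and the values are simply the fold-differences stated in the text.

```python
# TFIIS-stimulated cleavage rate relative to the backtracked 18mer (values from the text).
relative_rate = {"12mer": 1200.0, "13mer": 480.0, "14mer": 60.0, "18mer": 1.0}


def fold_difference(faster: str, slower: str) -> float:
    """Ratio of the cleavage rates of two backtracked complexes."""
    return relative_rate[faster] / relative_rate[slower]


step_12_vs_13 = fold_difference("12mer", "13mer")  # 2.5
step_13_vs_14 = fold_difference("13mer", "14mer")  # 8.0
step_12_vs_14 = fold_difference("12mer", "14mer")  # 20.0
```

On these numbers, each additional backtracked residue slows TFIIS-stimulated cleavage, with a larger drop between the 13mer and 14mer than between the 12mer and 13mer.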
To further investigate the influence of base mismatch upon cleavage in the presence of TFIIS, we prepared a backtracked complex with a G<>U mismatch instead of a G<>G mismatch. Because G<>U can form a wobble base pair, it might occupy the nucleotide addition site at +1 rather than backtracking to +2. We observed two cleavage products, of two and three residues, instead of the single product of three residues found for a G<>G mismatch (Fig. S4B). Evidently a wobble base pair does form to a limited extent.
Our results lead to two conclusions. First, pol II backtracked by one residue represents a discrete, stable state of the transcribing enzyme. In the course of backtracking, pol II pauses at this position. The evidence is threefold: the observation of a defined structure of the 12mer RNA complex, in which the backtracked nucleotide is revealed at full occupancy; the absence of density for additional nucleotides in 13mer and 14mer RNA complexes; and molecular dynamics simulations, showing that nucleotides beyond the first backtracked residue are expected to be highly mobile. The demonstration of a defined one-residue-backtracked state supports the long-held but largely conjectural notion of polymerases in diffusional equilibrium between forward and retrograde motion during transcription. Backtracking by one residue is favorable, whereas backtracking by two or three more residues confers no greater energetic benefit. Longer backtracked RNAs do make additional interactions, possibly contributing to arrest (irreversible backtracking) and, by alteration of the RNA conformation, to the ability of TFIIS to rescue the complex from the arrested state.
The second conclusion concerns the significance of the one-residue-backtracked state. It is readily cleaved in the presence of TFIIS, releasing a dinucleotide, supporting the previous ideas that cleavage occurs in the pol II active site, and that an important role of this cleavage is the removal of misincorporated nucleotides. Pol II structure is evidently well suited to the purpose. In the event of misincorporation, forward translocation is disfavored, due to distortion of the RNA-DNA hybrid helix, and the diffusional equilibrium of the enzyme is shifted towards the backtracked state. A significant lifetime in this state, due to binding of the misincorporated nucleotide in the +2 position, leads to cleavage. The stability of the one-residue-backtracked state thus underlies the proofreading capability of the pol II system.
Cleavage in the one-residue-backtracked state can occur both in the presence of TFIIS and in its absence (“intrinsic cleavage”). Since the reaction with TFIIS is more than a hundred times faster in vitro, it is likely the predominant mechanism in vivo. Intrinsic cleavage may nevertheless play an important role, as disruption of the gene for TFIIS causes at most a ten-fold greater error rate in transcription (47). Moreover, it was recently shown that double mutation of TFIIS, changing Asp 290 and Glu 291 residues in the hairpin loop to alanines, interferes with intrinsic cleavage and is lethal in vivo (Sigurdssen et al., accompanying paper). Our structural results, showing that the hairpin loop causes movement of the backtracked RNA, may explain this phenotype. Intrinsic cleavage depends on a particular RNA conformation, which may be prevented by TFIIS. Consistent with this idea, deletion rather than mutation of Asp 290 and Glu 291 residues, diminishing the steric clash with the RNA, has no effect on intrinsic cleavage or cell viability (Sigurdssen et al., accompanying paper).