Cellular RNA polymerases (RNAPs) are complex molecular machines that combine catalysis with concerted conformational changes in the active center. Previous work showed that kinking of a hinge region near the C-terminus of the Bridge Helix (BH-HC) plays a critical role in controlling the catalytic rate.
Here, new evidence for the existence of an additional hinge region in the amino-terminal portion of the Bridge Helix domain (BH-HN) is presented. The nanomechanical properties of BH-HN emerge as a direct consequence of the highly conserved primary amino acid sequence. Mutations that are predicted to influence its flexibility cause corresponding changes in the rate of the nucleotide addition cycle (NAC). BH-HN displays functional properties that are distinct from BH-HC, suggesting that conformational changes in the Bridge Helix control the NAC via two independent mechanisms.
The properties of two distinct molecular hinges in the Bridge Helix of RNAP determine the functional contribution of this domain to key stages of the NAC by coordinating conformational changes in surrounding domains.
RNA polymerases (RNAPs) play a central role in the regulation of gene expression. Like the majority of the enzymes involved in fundamental biological information-processing functions (for example, replication, transcription, recombination, repair), RNAPs are probably best viewed as intricate molecular machines. The movement of nucleic acid substrates, coupled with various types of active site chemistries, requires a precisely orchestrated sequence of conformational changes of protein domains during the transcription cycle (for recent reviews see [1-4]).
The nanomechanical mechanisms guiding the structural rearrangements of domains within the active site are still very poorly understood. Thus far, models of the fundamental reaction catalyzed by RNAPs, the nucleotide addition cycle (NAC), have predominantly been derived from a series of crystal structures that contain RNAPs as apoenzymes (for example [5-9]), or complexed with various substrates and inhibitors (for example [10-15]). Such structures, revealing (among other features) pre- and post-translocation states of RNAPs, have provided the basis for various hypotheses concerning the molecular mechanism of the NAC [1-4,16,17]. There are, however, two potential shortcomings associated with such approaches. First, in order to 'freeze' the RNAPs in a crystallizable conformation, substrate analogs or inhibitors need to be chosen that stop the reaction cycle at a specific point. This may result in the adoption of 'off-pathway' conformations that do not represent normal enzyme states. A second, more fundamental, problem is that short-lived intermediate structures cannot be captured in crystals because they are thermodynamically or kinetically unstable. Yet, it is likely that an awareness of the existence and functional significance of such intermediates will be required to develop a deeper understanding of the mechanisms operating within molecular machines.
We have designed new experimental tools to complement ongoing structural investigations. Based on the ability to assemble an active RNAPII-like enzyme from recombinant subunits in vitro, it is possible to modify any residue within an intact RNAP to introduce a variety of targeted mutations into any functional domain participating in the NAC. Such a strategy not only allows specific predictions based on available X-ray structures to be tested, but also can be used to explore systematically the functional contributions of individual domains to biochemically detectable activities. New robotic methods that facilitate the labor-intensive high-throughput mutagenesis/assembly steps and transcription assays allow this approach to be implemented on a large scale. In a recent study we systematically replaced each of 17 consecutive residues of the Bridge Helix domain with all other 19 possible amino acid side chains. The Bridge Helix, a 35 amino acid α-helix spanning the RNAP active site, controls the flow of nucleic acid substrates and nucleotide precursors through the catalytic site (Figure 1A). Some of the earliest models of the NAC were based on the observation that the C-terminal portion of the Bridge Helix was kinked in some X-ray crystal structures of bacterial RNAPs [5,8], but appeared straight in numerous other crystals of bacterial, archaeal and eukaryotic RNAPs. The existence of two alternative Bridge Helix conformations made it seem likely that a periodic oscillation between straight and kinked conformations would be implicated in translocating RNAPs in single base-pair steps along the DNA template strand [7,10,21-24]. High-throughput mutagenesis of the C-terminal portion of the Bridge Helix provided unexpectedly clear evidence in support of Bridge Helix kinking; mutations destabilizing the normal α-helical conformation in certain positions cause a substantial increase in the specific activity of RNAPs.
This phenomenon, referred to as superactivity, relieves a constraint on the catalytic mechanism by increasing the frequency of Bridge Helix isomerization between straight and kinked conformations [20,25].
While the combination of structural observations and mutagenesis data clearly highlights the functional contribution of the C-terminal portion of the Bridge Helix towards controlling the rate of the NAC, the role of the N-terminal portion of the Bridge Helix has thus far remained enigmatic. The primary sequence of this region is exceptionally highly conserved during evolution; for example, the sequences of the N-terminal 15 amino acids are identical between the archaeon Methanocaldococcus jannaschii and humans, and differ by only a single residue from yeast (Figure 1B). Such a high degree of structural conservation over more than two billion years of evolution can be partially accounted for by the fact that the Bridge Helix N-terminus is tightly surrounded by other domains and may therefore be spatially and evolutionarily constrained due to the need to maintain an extensive network of protein-protein interactions (Figure 1A; Additional files 1, 2 and 3; [12,15,25,26]). In apparent agreement with this view, all available X-ray structures of RNAPs show the N-terminal portion of the Bridge Helix in a rigidly α-helical conformation, suggesting the absence of significant conformational changes. For this reason, none of the current models of RNAP function consider the Bridge Helix N-terminus to play any dynamic role during the NAC [1-4,16,17].
New evidence presented here, based on a combination of high-throughput mutagenesis studies and molecular dynamics simulations, demonstrates that such a static view of the Bridge Helix N-terminus is untenable. The results show that this region contains a highly localized molecular hinge, and that the conformation of this site has a substantial influence on the rate of the NAC. In combination with the previously identified C-terminal hinge region, the data reinforce the overarching concept that the Bridge Helix plays a predominantly nanomechanical role during the translocation stage of the NAC by coordinating conformational changes in surrounding domains.
The N-terminus of the Bridge Helix of the RNAP from the euryarchaeon Methanocaldococcus jannaschii was dissected by high-throughput mutagenesis [19,20]. In this automated approach, each amino acid within the target region (Figure 1B) was replaced with all other 19 residues to reveal local structural requirements. The mutants were then assayed in robotic promoter-independent transcription assays, which provide a consistent measure of the synthetic rate of the NAC and correlate directly with results obtained from a variety of promoter-dependent, abortive and elongation transcription assays ([18-20]; S. Wiesler and ROJW, unpublished data).
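The substitution scheme described above can be sketched in a few lines of Python. This is an illustrative enumeration only, not the robotic pipeline used in the study: the wild-type residues for positions 806 to 813 are taken from the library list given later in this paper (mjA' H806 to G813), while the function name `saturation_library` and the way variant names are assembled are assumptions made for the example.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Wild-type residues of the mjA' Bridge Helix N-terminus (positions 806-813),
# as named in the library list of this paper.
WILD_TYPE = {806: "H", 807: "A", 808: "M", 809: "G",
             810: "G", 811: "R", 812: "E", 813: "G"}


def saturation_library(wild_type):
    """Return every single substitution, named in the style 'M808-P'."""
    variants = []
    for position in sorted(wild_type):
        original = wild_type[position]
        for replacement in AMINO_ACIDS:
            if replacement != original:  # skip the wild-type residue itself
                variants.append(f"{original}{position}-{replacement}")
    return variants


library = saturation_library(WILD_TYPE)
print(len(library))  # 8 positions x 19 substitutions each
```

Each position contributes 19 variants, so the eight N-terminal positions together yield 152 single-substitution constructs before any replicate assemblies are counted.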
The results reveal unexpected insights into the function of the Bridge Helix N-terminus (Figure 2A). The most eye-catching phenotypes are associated with the mjA' M808 position, which constitutes - both qualitatively and quantitatively - a hotspot for superactivity caused by a chemically diverse range of substitutions. This suggests that the M808 position is structurally very delicately balanced and substitutions with more hydrophilic, bulky hydrophobic, and/or charged residues cause a substantial local disturbance. The degree of superactivity in the strongest N-terminal substitution (mjA' M808-P; >240% superactivity) greatly exceeds the highest level of superactivity displayed by the strongest C-terminal mutant (mjA' S824-P; approximately 170% superactivity; Figure 2A). The observation that the replacement of M808 with proline results in the highest increase in specific activity immediately suggests that the phenotypes are caused predominantly by disruption of the local secondary structure; proline residues destabilize α-helices due to a lack of hydrogen bonding and steric interference with the backbone of the preceding turn. In a manner highly reminiscent of the previously characterized mjA' S824-P substitution phenotype, proline substitutions of the residues immediately surrounding mjA' M808 cause a sharp drop in RNAP catalytic activity, emphasizing the highly localized effect of such conformationally induced changes (Figure 2B). The locations of two functionally acceptable proline substitutions thus mark the presence of two discrete and separate molecular hinges, which will subsequently be referred to as BH-HN ('Bridge Helix - Hinge N-terminal'; typified by mjA' M808-P) and BH-HC ('Bridge Helix - Hinge C-terminal'; typified by mjA' S824-P).
The shared functional property of M808-P and S824-P substitutions, causing maximal levels of superactivity, emphasizes that increased Bridge Helix kinking at the two hinges correlates directly with an increased rate of nucleotide addition [20,25].
Extensive database searches, covering completed genome sequences of a large variety of pro- and eukaryotic species, show that naturally occurring proline substitutions in the Bridge Helix primary sequences are exceptionally rare. The only known instances of proline residues occurring naturally anywhere in the N-terminal part of the Bridge Helix are found in the bacteria Orientia tsutsugamushi [28,29], and certain isolates of Arcobacter butzleri and Bacillus subtilis. In each case, the substituted position is precisely orthologous to mjA' M808 (Figure 2C). In the C-terminal half of the Bridge Helix, the highly divergent plant RNAPIV and RNAPV enzymes display a strong tendency for a proline residue at the position orthologous to archaeal S824 (Figure 2C; [32,33]).
Insights into the conformational changes of the BH-HC are based on certain X-ray structures of bacterial RNAPs that were fortuitously crystallized in a kinked Bridge Helix conformation [5,8]. Kinking of BH-HC is stabilized by intramolecular interactions between amino acid side chains flanking each side of the hinge ([8,20]; Heindl et al., unpublished observations). The location of the BH-HC and its kinking properties are thus pre-determined by the local amino acid sequence. This raises the question as to whether intrinsic structural features could also account for the molecular mechanism underlying BH-HN function. Inspection of the primary sequence surrounding BH-HN, coupled with insights obtained from molecular dynamics (MD) simulations [34-36], provides a plausible explanation for the structural basis of the BH-HN hinge. Systematic sampling of conformational states at five-picosecond (ps) intervals in a series of 27 independent 200 ps semi-quantitative MD simulations highlights the presence of distinct areas prone to local unfolding, with the most prominent peak centered around mjA' G810 (Figure 3A). Under these simulation conditions, most other regions of the Bridge Helix, including the N- and C-termini, maintain their α-helical conformations at all stages throughout the simulations. Closer inspection of the simulated structures formed by BH-HN kinking reveals a molecular switching mechanism that can be rationalized directly on the basis of the primary amino acid sequence and α-helical geometry (Figure 3B). The kinking of BH-HN critically involves the two glycine residues G809 and G810 that are located immediately C-terminal to M808 and are essentially invariant in all archaeal and eukaryotic RNAPIIs (Figure 1B). Glycine residues display low helix-forming propensity because their high conformational flexibility is entropically unfavourable within geometrically constrained α-helical structures [37,38].
This increased flexibility allows G809 and G810 to flip out of the α-helical conformation to create a flexible hinge (Figure 3B). The flipped conformation is then stabilized further through a variety of non-covalent interactions of M808 with R811 and E812 (Figure 3C). The kinking model is based on elementary structural and thermodynamic principles (as represented by the MD force field), but is also strongly supported by the phenotypes of the G809 and G810 mutagenesis series (Figure 2A). In both positions, any residue other than glycine causes a severe reduction in the catalytic activity of RNAP due to an increased helix propensity, which reduces the likelihood of BH-HN kinking. In contrast, another glycine residue located slightly more C-terminal, G813, is noticeably less sensitive to change (Figure 2A). Similarly, the stabilization of BH-HN through van der Waals contacts between M808 and R811 and E812 appears to be relatively non-specific, so that a chemically diverse group of side chains is either acceptable (especially in the R811 position), or results in enhanced stabilization of the BH-HN kink (M808 substitution series). The fundamental requirement for a large side-chain in the M808 position is particularly evident from the fact that certain amino acids with smaller groups (G, A) are not capable of causing the superactivity associated with enhanced BH-HN kinking. It is therefore apparent that the molecular properties of BH-HN, such as its structural stability, are directly determined by the primary amino acid sequence and emerge spontaneously from MD simulations as a property inherent in the primary sequence of the Bridge Helix domain. Mutagenesis changes the biophysical properties of BH-HN, which is, in turn, directly reflected in altered NAC rates (Figure 2A).
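To make the sampling scheme concrete: 27 independent runs of 200 ps, sampled every 5 ps, give 40 snapshots per run after the starting structure, or 1,080 snapshots in total. The sketch below shows one way such snapshots could be reduced to a per-residue unfolding frequency. It assumes, hypothetically, that each snapshot has already been converted to a string of secondary-structure codes with `H` marking an α-helical residue; the function names and the toy input are illustrative and do not reproduce the analysis actually used for Figure 3A.

```python
def snapshots_per_run(run_length_ps=200, interval_ps=5):
    """Number of sampled frames per run, not counting the t = 0 structure."""
    return run_length_ps // interval_ps


def unfolding_frequency(snapshots):
    """Fraction of snapshots in which each residue is NOT helical.

    `snapshots` is a list of equal-length strings, one per sampled frame,
    pooled over all runs; 'H' marks an alpha-helical residue.
    """
    n_residues = len(snapshots[0])
    counts = [0] * n_residues
    for frame in snapshots:
        if len(frame) != n_residues:
            raise ValueError("all snapshots must cover the same residues")
        for i, code in enumerate(frame):
            if code != "H":
                counts[i] += 1
    return [c / len(snapshots) for c in counts]


# Toy example (invented): 4 frames of a 6-residue helix with a labile centre.
toy = ["HHHHHH", "HHHCHH", "HHCCHH", "HHHCHH"]
profile = unfolding_frequency(toy)
total_frames = 27 * snapshots_per_run()
```

A peak in such a profile marks a residue that leaves the helical state in many snapshots, which is the kind of signal reported around mjA' G810.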
The presence of two distinct hinges in the Bridge Helix raises the question whether BH-HN and BH-HC are involved in the same mechanism during the NAC. Kinking of either of the two hinges will result in considerable local distortions, predicted to include a spatial redeployment of amino acid side chains and changes in the overall length, flexibility and general topology of the Bridge Helix domain. Kinking of BH-HN could result in altered interactions with adjacent domains, such as the βD-II, Link and F-Loop domains (Figure 1A; Additional files 1, 2 and 3), whereas hinge movements in BH-HC are expected to affect the position and/or mobility of the DNA-RNA hybrid and Trigger Loop conformation (Figure 1A).
Intriguingly, RNAP IV Bridge Helices also contain an additional two amino acid deletion, which would cause an even more radical change in the Bridge Helix by creating a local 180° twist of the α-helical structure (Figure 2C). This class of mutants combines strictly confined effects (removal of two adjacent side chains) with complex long range effects (realignment of remaining side chains on either side of the deletion to new positions and localized underwinding of the helical structure), which may possibly be coupled with the propagation of stress forces to the N- and C-terminal anchoring points of the Bridge Helix. The radical nature of such twisting mutations serves as an ideal tool to gain a deeper insight into structural constraints acting on the entire Bridge Helix. A series of two-amino acid deletions was prepared (Figure 4A). The results show that the BH-HC region is indeed remarkably resistant to such major conformational changes (Figure 4B). Although X-ray structures provide apparently persuasive arguments for certain residues of the Bridge Helix domain assisting in catalytic functions (for example, residues orthologous to T821 contacting the 3' end of the nascent transcript and/or the incoming rNTP), the deletion phenotypes described here prove that such contributions are either redundant or non-existent in our archaeal system, which is highly conserved in this region. The results furthermore illustrate that, although BH-HC is located much closer than BH-HN to the substrates involved in the NAC, there are surprisingly few topological restraints.
In marked contrast, none of the two-amino acid deletions in the N-terminal part of the Bridge Helix supports significant activity (Figure 4B), proving that conformational changes are much more restricted in that region (which is consistent with the higher degree of evolutionary identity in the N-terminal half of the Bridge Helix compared to the C-terminal half; Figure 1). The results from the two amino acid deletion scan thus provide a first indication that the local conformational requirements for BH-HN and BH-HC differ quite radically.
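The geometric consequence of a two-residue deletion follows from textbook α-helix parameters: about 3.6 residues per turn, hence roughly 100° of rotation and 1.5 Å of rise per residue. The short sketch below applies these idealized values; real helices deviate from them, and the function name is purely illustrative. Removing two residues rotates everything downstream of the deletion by about 200°, which is close to the half-turn twist described above, and shortens the helix by about 3 Å.

```python
# Idealized alpha-helix geometry (textbook values, not fitted to any structure).
ROTATION_PER_RESIDUE_DEG = 100.0   # 360 degrees / 3.6 residues per turn
RISE_PER_RESIDUE_ANGSTROM = 1.5    # translation along the helix axis


def deletion_effect(n_deleted):
    """Return (rotation in degrees, shortening in angstroms) for a deletion.

    The rotation is the angular offset imposed on all residues downstream
    of the deletion, reduced to the range 0-360 degrees.
    """
    rotation = (n_deleted * ROTATION_PER_RESIDUE_DEG) % 360.0
    shortening = n_deleted * RISE_PER_RESIDUE_ANGSTROM
    return rotation, shortening


rotation, shortening = deletion_effect(2)
print(f"rotation: {rotation:.0f} degrees, shortening: {shortening:.1f} angstroms")
```

Under the same idealization a single-residue deletion would rotate the downstream side chains by only 100°, so a two-residue deletion comes closest to swapping the two faces of the helix.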
The substitutions mjA' M808-P and S824-P provide the strongest pieces of evidence for the existence and functional significance of BH-HN and BH-HC, respectively (Figure 2B). Work in other systems has shown that the physicochemical properties of residues adjacent to prolines contribute considerably to the prevalence of cis/trans peptide bond isomers and therefore strongly influence kink geometry [39-42]. Additional proline substitutions of the positions immediately N- and C-terminal to either M808 or S824 (that is, mjA' A807-P/M808-P; M808-P/G809-P and Q823-P/S824-P; S824-P/G825-P) revealed examples of further intriguing differences between BH-HN and BH-HC: the double-proline substitution in BH-HN abolished the superactivity caused by M808-P, whereas the presence of an additional proline residue N-terminal to S824 continued to support the superactivity of S824-P (Figure 5B). The activity of the Q823-P/S824-P double mutant proves explicitly that the presence of two proline residues in this particular location of the Bridge Helix is not only compatible with catalytic function, but also compatible with superactivity. Prolyl-proline sequences preferentially adopt an elongated polyproline II structure (87%), or less frequently (13%) a β-turn (Additional file 4A; [43,44]). Either of the structures would cause a substantial local increase in the flexibility of BH-HC.
In order to investigate this unexpected tolerance to the presence of two adjacent prolines in positions 823 and 824 in more detail, a complete substitution series of the residues around position mjA' S824-P was prepared, generating an assortment of systematic double-mutants (Figure 5A; mjA' Q823-X/S824-P and S824-P/G825-X; with X denoting 19 different variants). All substitutions N-terminal to S824-P (that is, mjA' Q823-X/S824-P) displayed an almost completely invariant degree of extensive superactivity that was indistinguishable from the original S824-P mutant (Figure 5B). This result is remarkable because previous mutagenesis of Q823 revealed a broad spectrum of activity, ranging from substantial loss of function (Q823-C or I) to superactivity (Q823-D or E). It is evident that in the double-mutants the chemical nature of the side-chain residue in position 823 exerts no further functional influence, presumably because of the major distortion already caused by the proline substitution in position 824. Once such a gross structural alteration has occurred in S824-P, any additional changes in the adjacent N-terminal residue become structurally irrelevant.
Because Q823-P/S824-P displayed no loss of superactivity, the residue immediately N-terminal to the double-proline substitution was also permutated, resulting in variants containing three adjacent substitutions in the BH-HC hinge region (Figure 5A; mjA' A822-X/Q823-P/S824-P). The majority of these substitutions, including the triple proline mutant A822-P/Q823-P/S824-P (Figure 5C; Additional file 4B), still displayed clearly detectable superactivity, albeit at a slightly reduced level. The tolerance of the BH-HC hinge to radical mutagenesis, as previously observed in the two-amino acid deletion screen, is therefore also reflected in the unexpectedly high tolerance to multiple proline substitutions in that region. This geometric freedom is, however, also spatially limited: substitutions in positions C-terminal to S824-P (Figure 5A) were mostly inactive, indicating that despite the structural flexibility of the positions N-terminal to S824-P, the C-terminal positions are functionally constrained (Figure 5D; a similar trend is also apparent in the two amino acid deletion scan data [see Figure 4B]).
Taken together, the extensive body of mutagenesis data leads to the conclusion that the BH-HN and BH-HC structures operate within intramolecular environments that vary widely in their conformational constraints. The chemistry of the NAC is typically catalyzed by two magnesium ions, immobilized non-covalently within the active site (via the 'metal A' and 'metal B' motifs; for example, [45,46]). A complete substitution of Mg2+ by Mn2+ reduces the activity of archaeal RNAP to approximately 40% (data not shown); the reduced activity is probably caused by the slightly larger Mn2+ ions causing suboptimal spacing, and Mn2+-catalyzed reactions being more promiscuous in their requirements for alignment of the reactive groups. Intriguingly, assays of superactive BH-HN and BH-HC mutants in the presence of Mn2+ demonstrated that they fell into distinct categories. mjA' M808-D, M808-E and M808-P continued to display superactivity, whereas the activities of S824-P or Q823-P/S824-P only reached wildtype levels under these conditions (Figure 6). We can therefore conclude that the superactive substitutions in the BH-HN and BH-HC regions have different consequences because conformational changes in different parts of the Bridge Helix affect separate stages of the NAC. The reduced catalytic activity caused by the presence of Mn2+ ions in the active site becomes rate-limiting in Q823-D/E or S824-P, whereas M808-P overcomes this limitation to a large extent by stimulating transcription through an independent pathway, most likely involving the β-D II and Link domains.
The results presented here reveal several new surprising insights, including compelling evidence for the existence of a molecular hinge region in the N-terminal portion of the Bridge Helix and evidence for an unexpectedly large degree of tolerance to radical structural changes in the C-terminal part of this domain. It is apparent that the Bridge Helix domain displays a much greater conformational freedom than anticipated from currently available X-ray structures of RNAPs. Few, if any, of the residues of the Bridge Helix appear to make any specific contribution to catalysis other than through defining the nanomechanical properties intrinsic to the α-helical structure. The implications for mechanistic models aimed at describing the NAC are manifold, ranging from a re-evaluation of the structural basis of the RNAP translocation mechanism, to highlighting the hitherto neglected role of highly conserved domains in the catalytic site, and to obtaining a better understanding of the evolutionary diversity of Bridge Helices in different organisms.
Currently, we have only a limited understanding of the forces acting on the Bridge Helix that could drive localized conformational changes. Attempts to model the full NAC using molecular dynamics studies are severely limited by the large size of multi-subunit RNAPs and the immense computational effort that would be required to simulate the molecular events expected to last from 10s to 100s of milliseconds for the extension of a nascent transcript by a single nucleotide (approximately 30 ms/rNTP incorporation under optimal in vivo conditions). The study of the intrinsic structural properties of individual domains by fully atomistic computer simulations reveals, however, interesting nanomechanical properties that have functional implications for the RNAP translocation mechanism [50-52]. The Bridge Helix domain contains intrinsically unstable α-helical regions that undergo spontaneous kinking motions, even in the absence of externally applied forces (Figure 3A). At least two of these unstable regions correspond precisely to the biochemically-mapped BH-HN and BH-HC regions. Strategically-placed glycine residues, such as mjA' G809, G810 (for BH-HN) and G825 (for BH-HC) provide the structural basis for forming these molecular hinges, with surrounding residues determining additional kinking parameters, such as the likelihood of kinking and/or the half-life of the kink after its isomerization (Figure 2A; Heindl et al., unpublished observations). Interestingly, the simulations also highlight a potential third structurally labile region near the center of the Bridge Helix domain, spanning residues mjA' Q817 to R820. The relative sensitivity of this sequence to proline substitutions (Figure 2B) suggests, however, that this area of instability behaves functionally differently to BH-HN and BH-HC.
It is possible that structural fluctuations in the central part of the Bridge Helix play a more dynamic role in supporting short-lived conformational changes that can either compensate for major structural rearrangements due to BH-HN and BH-HC kinking, or act as a store of 'fast' motions to lubricate the kinking of the hinge regions kinetically. Although such hypotheses are currently beyond experimental proof, it is interesting to note that several structurally unexplained superactivity mutants map to this central sequence (for example, mjA' D816-N; Q817-S/T/C/K and V819-K; Figure 2A), and that this region is also exceptionally tolerant to radical twisting of the α-helical axis induced by deletions of two-amino acid segments (Figure 4B). Furthermore, data from anisotropic network mode analysis suggest that rigid body movements of the clamp domain may exert forces onto the center of the Bridge Helix via the Switch domains, potentially linking transcription to a ratchet-like translocation mechanism (Additional file 5). The Bridge Helix thus appears to have evolved specific nanomechanical properties that result in the controlled and highly localized isomerization of its conformation in response to allosteric alterations in the surrounding protein domains and nucleic acid substrates.
The Bridge Helix N-terminus is tightly surrounded by other domains, such as the β-D loop II, the experimentally uncharacterized Link domain [25,55], and the F-Loop. As evident from the exceptionally high degree of sequence conservation (Additional files 1A, 2A and 3A), each of these domains is likely to play key roles in the NAC. The β-D II domain is a loop-like structure that interacts extensively in a side-way interaction with the central part of the Bridge Helix, while simultaneously maintaining direct physical contact with the most recently incorporated nucleotide (i-1 position). The interaction between the β-D II domain and nascent transcript also creates an extended binding pocket for the rNTP (Additional file 1B, C). Similarly, the highly conserved Link domain is strategically placed to interact with the Bridge Helix N-terminus, β-D II domain, nascent transcript (i-1 and i-2 positions) and the incoming rNTP (Additional file 2B). Finally, an N-terminal extension of the Bridge Helix, the F-Loop, forms an extensive cap-like structure that contacts the Link domain and the tip of the Trigger Loop (Additional file 3B). The differential response of superactive substitutions in BH-HN and BH-HC to the presence of Mn2+ in the catalytic site supports the view that conformational changes in these regions cause a distinct effect in the catalytic site of RNAP. While the C-terminal Bridge Helix operates predominantly by influencing Trigger Loop conformation, kinking of the N-terminus via BH-HN most likely alters the positions and/or conformations of the β-D II and Link domains, which are in direct physical contact with the nucleotide and nucleic acid substrates.
The existence and biological relevance of the experimentally determined hinge positions that tolerate proline substitutions are independently confirmed by a small number of naturally occurring Bridge Helix variants. Considering the large number of genomes sequenced thus far (currently including approximately 1,200 microbial and approximately 800 eukaryotic genomes; http://www.ncbi.nlm.nih.gov/sites/genome/), it can already be stated with confidence that naturally occurring proline substitutions are an exceedingly rare phenomenon. The three bacterial species that contain naturally occurring proline-substitutions in BH-HN represent two diverse bacterial phyla (Figure 7), and each of these phyla contains other closely related species with sequenced genomes, which do not contain any prolines in their Bridge Helices. Proline substitutions in the BH-HN region therefore appear to evolve spontaneously and independently in different bacterial lineages and subsequently remain restricted to individual species or strains. In at least some cases this evolutionary adaptation may be associated with significant simplifications of the transcriptional machinery, such as loss of the ω-RNAP subunit and absence of transcription-coupled repair (O. tsutsugamushi; [28,29]). These proline-containing Bridge Helix variants not only provide strong and independent in vivo confirmation of the results identified in the high-throughput mutagenesis screen (Figure 2B), but also prove that the results obtained in an archaeal model system apply to the bacterial and eukaryotic domains; Bridge Helix kinking via two structurally independent molecular hinges is a universal mechanism operating across the entire evolutionary spectrum.
The application of a high-throughput in vitro mutagenesis approach to the N-terminal portion of the M. jannaschii Bridge Helix domain has revealed a range of new insights that could not be anticipated from previously available structural and genetic data. The data sets (Figures 2A, 4B and 5B) clearly illustrate that many of the most interesting insights were derived from substitutions that would almost certainly not have been designed using a rational, structure-led approach (for example, mjA' M808-P; ΔD816/Q817; A822-P/Q823-P/S824-P). Furthermore, widely used methods, based predominantly on alanine-scanning mutagenesis, are also limited in their capacity to uncover some of the most interesting phenotypes (Figure 2A). It is therefore clear that automated high-throughput methods for generating site-directed mutants and assaying their phenotypic consequences will play an increasingly important role in exploratory investigations of protein structure/function relationships as part of a diverse strategy aimed at obtaining new insights into complex biological systems.
Combinatorial permutation libraries, containing all 19 variants with codon replacements optimized for expression in E. coli, were purchased for mjA' H806, A807, M808, G809, G810, R811, E812, G813, Q823-X/S824-P, A822-X/Q823-P/S824-P, S824-P/G825-X, and S824-P/G825-P/Y826-X from GeneArt (Regensburg, Germany). The mutations, located within a BstBI-SbfI fragment of the codon-optimized C-terminus of the Bridge Helix, were transferred to a pET21a bacterial expression vector for the production of full-length, intein-free mjA' subunits. The presence of the desired mutations in the expression constructs was verified by DNA sequencing. DNA constructs containing the two amino-acid deletions across the Bridge Helix were purchased from GeneArt as synthetic gene fragments and transferred to bacterial expression plasmids as described above.
The robotic procedures for high-throughput growth of bacterial expression strains, recombinant subunit purification and automated assembly into recombinant RNAPs (the 'RNAP Factory') have been described previously [19,20,58]. Briefly, bacterial constructs expressing the Bridge Helix mutants were transformed into chemically competent Acella cells (ΔendAΔrecA derivatives of E. coli BL21 [DE3]; EdgeBio, Gaithersburg, Maryland, USA). After growth for 16 to 18 hours at 37°C in 24-well plates in autoinduction medium (OverNight Express, Novagen, Nottingham, UK), the bacteria were harvested and used for a centrifugation-based robotic inclusion body purification protocol. The mjA' inclusion bodies were automatically solubilised in the presence of 8 M urea and quantitated at 562 nm with the bicinchoninic acid assay. Recombinant RNAPs containing the mjA' variants were assembled in a 96-well format dialysis cell using a urea gradient from 6 M to urea-free spanning 16 hours at room temperature. Each mutant subunit was expressed, purified and assembled in vitro at least in quadruplicate to ensure consistency and reproducibility. The assembled RNAPs were harvested and used immediately for robotic transcription assays (see below). The assembly efficiencies of key mutants (including, among others, mjA' M808-P; A807-P/M808-P; M808-P/G809-P; S824-P; Q823-P/S824-P; S824-P/G825-P; A822-P/Q823-P/S824-P) were compared to assembly rates achieved with the wild-type mjA' subunit by assaying the reconstituted polymerases at limiting and saturating template DNA concentrations (see Additional file 19 in  for details and examples); no differences between wild-type and mutant enzymes were detected. For replacement of Mg2+ ions in the catalytic site with Mn2+ (for the transcription assays shown in Figure 6), the in vitro assembly process was carried out as described above, but with dialysis and transcription buffers containing 10 mM Mn2-O-acetate instead of Mg2-O-acetate.
The robotically implemented high-throughput trichloroacetic acid (TCA) precipitation assays, measuring the incorporation of (α-32P) rUTP into TCA-insoluble products, were carried out exactly as previously described [19,20]. Briefly, assay mixtures were incubated for 45 minutes at 70°C in thin-wall PCR plates before precipitating the radiolabeled transcripts by the addition of ice-cold TCA solution. After incubation for 30 minutes at 1°C, the nucleic acid precipitates were collected by vacuum filtration on a 96-GF/F glass fiber filter plate (Whatman, Maidstone, UK) and extensively washed with further aliquots of ice-cold TCA solution. After drying the filters, scintillant (MicroScint-O; Perkin-Elmer, Cambridge, UK) was added and the amount of incorporated (α-32P) rUTP quantified with a microplate counter (TopCount NXT, Packard, Cambridge, UK).
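Activities of mutant enzymes are reported elsewhere in this article as a percentage of wild-type activity. The minimal sketch below shows one way such a value could be derived from replicate scintillation counts. It is an assumption about the arithmetic, not a description of the published analysis pipeline: the function name is hypothetical, the counts in the usage note are invented, and any background subtraction is left out.

```python
# Sketch: express replicate mutant counts as a percentage of the mean
# wild-type count. Names and procedure are illustrative assumptions.
from statistics import mean, stdev


def relative_activity(mutant_counts, wild_type_counts):
    """Return (mean %, sample standard deviation %) relative to wild type."""
    if len(mutant_counts) < 2:
        raise ValueError("at least two mutant replicates are required")
    if not wild_type_counts:
        raise ValueError("at least one wild-type measurement is required")
    reference = mean(wild_type_counts)
    if reference <= 0:
        raise ValueError("wild-type reference must be positive")
    # Normalize each replicate separately so the spread is kept.
    percentages = [100.0 * count / reference for count in mutant_counts]
    return mean(percentages), stdev(percentages)
```

For example, four invented mutant replicates of 1,500 counts each against wild-type replicates of 1,000 counts each give a mean relative activity of 150% with zero spread.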
Molecular dynamics (MD) simulations were performed using GROMACS. In preparation for MD simulations, the archaeal Bridge Helix was modelled on the S. cerevisiae 'active elongation' RNAPII structure (PDB #2E2H) using the SwissModel server in automated mode. The simulation production runs were executed in a fully solvated atomistic production mode without restraints. The energies of the modelled structures were initially minimized in vacuum using GROMACS with an AMBER force field (http://ambermd.org/) on a CPU cluster of the National Grid Service (NGS). During pre-processing the system was warmed to 200 K under the control of a Berendsen thermostat with a coupling constant of 1.0 ps. All structures were energy-minimized in pre-equilibrated simulation boxes filled with TIP3P water, and sodium and chloride ions were added to a final concentration of approximately 150 mM. For production runs the temperature was increased to 300 K (27°C). The equations of motion were integrated using a step size of two femtoseconds. The trajectories generated by 27 independent 200-picosecond simulation runs were analyzed using STRIDE, as implemented in VMD. The frequencies of particular residues adopting a 'coil' conformation during 5 ps analysis windows were plotted relative to the Bridge Helix sequence.
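The per-residue coil frequency described above can be sketched as follows. This is a minimal illustration, not the analysis script used for this study: it assumes that each 5 ps analysis window has already been reduced to one string of one-letter secondary-structure codes (one character per residue) and that 'C' denotes coil. The function and constant names are hypothetical. The constants restate the run parameters given in the text.

```python
# Sketch: per-residue coil frequency from secondary-structure assignments.
# Assumes one string of one-letter codes per analysis window, 'C' = coil.

RUN_LENGTH_PS = 200   # length of each production run (from the text)
TIME_STEP_FS = 2      # integration step size (from the text)
WINDOW_PS = 5         # analysis window (from the text)
N_RUNS = 27           # number of independent runs (from the text)

STEPS_PER_RUN = RUN_LENGTH_PS * 1000 // TIME_STEP_FS   # integration steps
WINDOWS_PER_RUN = RUN_LENGTH_PS // WINDOW_PS           # analysis windows
TOTAL_SIMULATED_PS = N_RUNS * RUN_LENGTH_PS            # summed over all runs


def coil_frequency(frames, coil_code="C"):
    """Return, for each residue, the fraction of frames assigned as coil."""
    if not frames:
        raise ValueError("at least one frame is required")
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("all frames must cover the same residues")
    counts = [0] * length
    for frame in frames:
        for index, code in enumerate(frame):
            if code == coil_code:
                counts[index] += 1
    return [count / len(frames) for count in counts]
```

With these parameters each run comprises 100,000 integration steps and 40 analysis windows, and the 27 runs add up to 5.4 ns of simulated time. Plotting the returned list against the residue numbers gives a profile of the kind described in the text.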
BH-HC: molecular hinge located within the carboxy-terminal portion of the Bridge Helix; BH-HN: molecular hinge located within the amino-terminal portion of the Bridge Helix; NAC: nucleotide addition cycle; rNTP: ribonucleotide triphosphate; ps: picosecond; RNAP: RNA polymerase; TCA: trichloroacetic acid.
Evolutionary conservation and structure of the β-D II domain. (A) Alignment of β-D II domain sequences from bacteria (E. coli K12, T. thermophilus) and eukaryotes (S. cerevisiae and H. sapiens) against the archaeon M. jannaschii. Residues identical to the archaeal sequence are shown in red. The numbers flanking the sequences represent the location of the sequences within the open reading frame of the complete subunit. One of the residues in close contact with the nascent transcripts is boxed and identified with an arrow. (B) Arrangement of the β-D II domain relative to the RNAP active site. Most structures are shown in space-filling mode to emphasize spatial connections. The Bridge Helix is shown in green (with the BH-HN region (corresponding to mjA' M808-E812) highlighted in yellow), the β-D II domain in turquoise, the template DNA is pale blue, the RNA is red, the NTP in the insertion site shown as a pink stick model and catalytic metal ions as magenta spheres. The rNTP binding pocket is indicated with a white-dashed oval. (C) Close-up view of the β-D II domain. Note the extensive contacts of the β-D II domain with the rNTP and the i-1 position of the nascent transcript. Two potential β-D II/Bridge Helix contacts are mediated via residues orthologous to mjA' R811 and L814. (PDB #2E2H; visualized with PyMOL).
Evolutionary conservation and structure of the Link domain. (A) Alignment of Link domain sequences from bacteria (E. coli K12, T. thermophilus) and eukaryotes (S. cerevisiae and H. sapiens) against the archaeon M. jannaschii. Residues identical to the archaeal sequence are shown in red. The numbers flanking the sequences represent the location of the sequences within the open reading frame of the complete subunit. Residues in close contact with the rNTP or nascent transcripts are indicated by boxes and arrows. (B) Arrangement of the Link domain relative to the RNAP active site. Most structures are shown in space-filling mode to emphasize spatial connections. The Bridge Helix is shown in green (with the BH-HN region (corresponding to mjA' M808-E812) highlighted in yellow), the Link domain in light purple, the template DNA is pale blue, the RNA is red, the NTP in the insertion site shown as a pink stick model and catalytic metal ions as magenta spheres. The rNTP binding pocket is indicated with a white-dashed oval.
Evolutionary conservation and structure of the F-Loop domain. (A) Alignment of F-Loop domain sequences from bacteria (E. coli K12, T. thermophilus) and eukaryotes (S. cerevisiae and H. sapiens) against the archaeon M. jannaschii. Residues identical to the archaeal sequence are shown in red. The numbers flanking the sequences represent the location of the sequences within the open reading frame of the complete subunit. (B) Arrangement of the F-Loop domain relative to the RNAP active site. Most structures are shown in space-filling mode to emphasize spatial connections. The Bridge Helix is shown in green (with the BH-HN region (corresponding to mjA' M808-E812) highlighted in yellow), the F-Loop domain in lime, the template DNA is pale blue, the RNA is red, the NTP in the insertion site shown as a pink stick model and catalytic metal ions as magenta spheres.
Analysis of proline conformational space in proteins. (A) Conformation of peptide backbones containing a single (X-P-X; X is any other non-proline residue; left panel), two (X-P-P-X; central panel), or three consecutive proline residues (X-P-P-P-X) in protein structures displayed as Ramachandran plots (data generated using the web-based server described in ). The relative frequency of occurrence of a particular φ/ψ angle combination is encoded by the brightness of the square at the intersection of the coordinates. The φ/ψ angle combination compatible with standard α-helical conformation is indicated with a dashed purple oval (left and central panels). Single proline residues conform to α-helical geometry when present at the extreme N- and C-termini of the α-helix, thus accounting for the occurrence of single prolines in the α-helical part of the plot in the left panel. For two or three adjacent proline residues, the only conformational space is in the top left quadrant of the plot, corresponding to polyproline-specific conformations. (B) Model of the extended polyproline stretch in the mjA' A822-P/Q823-P/S824-P triple proline substitution mutant. The triple substitution mutant displays approximately 150% of activity in comparison to the wild-type enzyme (Figure 4B). The three proline substitutions are shown as yellow stick models and T821 is shown in red as a reference point pointing towards the catalytic site. The structure shown here was constructed using the M. jannaschii Bridge Helix sequence and conforms to the typical φ/ψ angle combinations observed in α-helices and in polyproline structures. The structure is not necessarily an accurate model, but serves to demonstrate the increased local flexibility due to the presence of three consecutive proline residues. The model was created with Abalone (http://www.biomolecular-modeling.com/Abalone/index.html).
Gaussian Network Model Analysis. The yeast RNAP elongation complex (PDB #2E2H) was subjected to Gaussian Network Model simulation (ignm.ccbb.pitt.edu; 6 Å cut-off) to assess the distribution of forces within intact RNAPs. Slow-mode motions (rank 4, 6 and 8) particularly affect the center of the Bridge Helix (as indicated by green/gold color-coding), whereas the N- and C-terminal regions remain immobile. The slow modes predominantly identify the response of individual domains to mechanical forces exerted on them by other structures during the simulation. It should be noted that such simulations do not take the chemical nature of residues into account and would therefore not be able to detect the intrinsic kinking properties of the BH-HN and the BH-HC regions.
This work was supported by a Wellcome Trust project grant [078043/Z/05/Z] to ROJW. I would like to thank Dominic Conquest for help with the subcloning of the M808-X and G813-X expression clone libraries into expression vectors. I would also like to thank Hans Heindl for advice on the molecular dynamics simulations, Noam Weingarten for programming and Tamas Kiss and Gabor Terstyanszki for advice and provision of computing resources on the UK National Grid Service. I also appreciate the helpful comments on the manuscript received from Martin Buck, Patricia Burrows and Simone Wiesler.