The vectorial (5′-to-3′ at varying velocity) synthesis of RNA by cellular RNA polymerases creates a rugged kinetic landscape, demarcated by frequent, sometimes long-lived pauses. In addition to myriad gene-regulatory roles, these pauses temporally and spatially program the co-transcriptional, hierarchical folding of biologically active RNAs. Conversely, these RNA structures, which form inside or near the RNA exit channel, interact with the polymerase and adjacent protein factors to influence RNA synthesis by modulating pausing, termination, antitermination, and slippage. Here we review the evolutionary origin, mechanistic underpinnings, and regulatory consequences of this interplay between RNA polymerase and nascent RNA structure. We categorize and attempt to rationalize the extensive linkage between the transcriptional machinery and its product, and provide a framework for future studies.
Life is enabled by the folding of one-dimensional, heteropolymeric macromolecules into three-dimensional, functional devices. Among the known biological heteropolymers, RNA is recognized as the most ancient, versatile, and self-sufficient . RNAs carry out a large repertoire of key cellular functions such as templating protein synthesis, catalyzing peptidyl transfer [2, 3], transesterification , and hydrolysis [5, 6] reactions, and regulating cellular function  via elaborate structures that rival the complexity of large proteins [6, 8-10].
A key evolutionary milestone in the ancient RNA-dominant world was the delegation of RNA biogenesis from ribozymes to a new heteropolymer, proteins. The emerging RNA-peptide world hypothesis posits that primordial short peptides stabilized and protected the RNA, enabling lengthening and coevolution of both heteropolymers to achieve greater complexity [11-13]. One of these ancient peptides, NADFDGD, synthesized from the earliest amino acids, may be preserved as the universal catalyst found in the active site of essentially all extant, multisubunit, DNA-dependent RNA polymerases (RNAPs) [14, 15]. Similarly, the conserved “palm” domain shared by T7 RNAP, reverse transcriptase, primase, etc., which houses a similar active site, may have evolved from an ancient RNA-recognition motif (RRM) subsequently elaborated with metal-coordinating residues .
After displacing their ribozyme counterparts, proteinaceous DNA-dependent RNAPs continue to co-evolve with their RNA products and now are responsible for vectorial (5′-to-3′ at varying velocity) synthesis of RNA molecules for all free-living organisms (excluding some organelles and viruses that use single-subunit RNA polymerases related to DNA polymerases). The functional importance of protein-based RNA biogenesis exposed the transcription machinery to extensive selection, elaboration, and adaptation , producing for instance an RNA exit channel that allows RNAP to alter catalytic rate and processivity in response to interactions with different nascent RNA structures [18, 19].
Two comparative examples using Escherichia coli and Bacillus subtilis RNAPs illustrate the lineage-specific evolution of RNAPs to co-transcriptionally fold and respond to regulatory structures in their nascent RNA products. In one example, E. coli RNAP strongly recognizes the well-characterized hairpin-stimulated his pause, whereas B. subtilis RNAP (and mammalian RNAPII) completely ignores this RNA-mediated signal. These different responses of distantly related enzymes to the same RNA structure likely reflect divergence driven by distinct regulatory needs. Conversely, divergent RNAPs have evolved kinetic or chaperone behavior needed to fold specific RNA structures. For example, by monitoring the co-transcriptional folding of P RNA into a catalytically active form, Pan and colleagues showed that the cognate E. coli RNAP is significantly more proficient than the noncognate B. subtilis RNAP in correctly folding the E. coli P RNA [22, 23]. We will elaborate in this review on both examples, which together demonstrate the intimate interplay between transcribing RNAP and its nascent RNA product that arose as the structures and properties of the enzyme and the transcript co-evolved.
The extensive coupling among vectorial RNA synthesis by RNAP, RNAP structure, RNA folding, and nascent RNA interaction with RNAP and transcription factors has long been recognized to play key roles in prokaryotic transcription attenuation mechanisms [22-25]. In prokaryotes, the coupling can be modulated by ribosomes, regulatory proteins, or small molecules. These interactions also can guide proper folding of biologically active RNAs. The connections among RNAP, nascent RNA folding, and regulators are likely of equal if not greater importance in eukaryotes where they are far less well understood.
In this review, we categorize the mechanistic underpinnings of RNAP-nascent RNA interactions as mediators of gene regulation, covering both bacteria and the developing understanding of instances in eukaryotes. What emerges is an overarching theme that RNAP pausing and nascent RNA structure can act as both receivers and carriers of regulatory information. We highlight (i) the different ways nascent RNA structures can affect transcriptional pausing, termination, and antitermination, (ii) the essential roles of RNAP pausing in guiding co-transcriptional RNA folding, (iii) the possible roles of elongation factors and RNAP itself as RNA folding chaperones, and (iv) the potential importance of RNAP-nascent RNA interactions in splicing and miRNA biogenesis.
Nascent RNA structures formed within and just upstream of the enzyme's RNA exit channel control the activity of RNAP in myriad ways, most prominently by dissociating transcription complexes at ρ-independent terminators in bacteria, by recruiting regulators like λN that suppress bacterial termination, by directly interacting with RNAP to suppress termination, and by either increasing or decreasing transcriptional pausing (Figure 1). Because pausing plays key roles in termination and antitermination and because its interplay with nascent RNA structure is particularly complex, we will first describe the fundamental mechanisms of pausing and then describe different ways nascent RNA structures influence pausing as well as termination, antitermination, and other RNAP activities.
As elongating multi-subunit RNAPs traverse individual transcriptional units, the rates of nucleotide addition (and thus the speed of the enzyme) vary by several orders of magnitude among the sites of RNA synthesis along the DNA template [26, 27]. This variation in reaction velocity produces a rugged kinetic landscape (as opposed to a smooth continuum). Prominent landmarks on this landscape are long-lived pause sites at which a fraction of RNAPs abruptly enter a transient, catalytically inactive state.
Transcriptional pauses can be classified by their mechanisms based on studies of model bacterial RNAPs (see Box 1) [28-31]: (i) elemental pauses, at which incompletely understood conformational rearrangements of RNAP involving the clamp and bridge helix and mediated by interactions of the RNA-DNA scaffold inhibit nucleotide addition without backtracking, possibly by disfavoring the fully translocated register necessary for productive NTP binding (Figures 1B and 2) [29, 32-34]; (ii) backtrack pauses, in which RNA-DNA pairing energetics drive their reverse translocation through RNAP, removing the RNA 3′ nucleotide from the active site into the secondary channel (Figures 1C and 2) [28, 35, 36]; and (iii) hairpin-stabilized pauses, at which a nascent RNA structure invades the RNA exit channel, stabilizes an open-clamp conformation of the enzyme, and increases the pause dwell time (Figures 1D, 2, and 3) [28, 37]. Both backtrack and hairpin-stabilized pauses appear to form after the elongation complex (EC) enters an initial elemental pause state.
Because the elemental pause may be an obligate intermediate from which long-lived, regulatory pauses are derived (on the order of seconds in vivo), it is important to understand its mechanistic origin and the pathways through which it leads to more stable pauses. The elemental pause appears to arise when RNAP loses its grip on the nucleic-acid scaffold during translocation from one template position to the next, allowing loosening of the RNAP clamp domain and changes in the conformation of the bridge helix and trigger loop [33, 34, 37, 38]. The collective effect of these reversible changes generates an inactive offline state that is unable to complete translocation [39-42]. This transient, reversible inactivation of the catalytic center is linked to both nucleic-acid sequence and global RNAP conformation. The fraction of elongating RNAP molecules that undergo this rearrangement at a given template position depends on the underlying sequence, but at least some RNAP molecules move past the site without pausing even at strong elemental pause sites. Recent genome-scale studies establish a consensus sequence for the elemental pause and document its occurrence on average about once per hundred bp in E. coli .
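The partitioning described above, in which only a fraction of RNAPs enter the paused state at a given site, can be illustrated with a minimal two-branch kinetic sketch. It assumes that nucleotide addition and pause entry compete as single-step processes, and that a paused complex escapes directly to the next template position. The function names and all rate constants are hypothetical and chosen only for illustration; they are not measured values from the studies cited here.

```python
def pause_efficiency(k_elongate: float, k_pause: float) -> float:
    """Fraction of RNAPs that enter the paused state at one site.

    Assumes nucleotide addition (k_elongate) and pause entry (k_pause)
    compete as first-order processes (both in s^-1).
    """
    return k_pause / (k_elongate + k_pause)


def mean_time_at_site(k_elongate: float, k_pause: float, k_escape: float) -> float:
    """Mean time (s) an RNAP spends at one template position.

    The first term is the mean waiting time before RNAP either adds a
    nucleotide or pauses; the second term is the mean pause duration
    (1 / k_escape) weighted by the fraction of RNAPs that pause.
    """
    decision_time = 1.0 / (k_elongate + k_pause)
    return decision_time + pause_efficiency(k_elongate, k_pause) / k_escape


if __name__ == "__main__":
    # Hypothetical rates: addition 30 s^-1, pause entry 10 s^-1, escape 0.5 s^-1.
    print(pause_efficiency(30.0, 10.0))
    print(mean_time_at_site(30.0, 10.0, 0.5))
```

Under these assumptions a pause efficiency below 1 follows whenever the rate of nucleotide addition is nonzero, which is consistent with the observation that some RNAP molecules bypass even strong elemental pause sites.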
Elemental pauses can give rise to long-lived gene-regulatory pauses through hairpin-stabilization or backtracking (Figure 2), and are obligate precursors to irreversible transcription termination [33, 34]. These two types of stabilized pauses synergize with nascent RNA structure in different ways, as described in the next section. Biochemical studies show that hairpin-stabilized pauses inhibit the trigger loop-trigger helices transition required for nucleotide addition and cause the RNA 3′ nucleotide to fray away from the DNA template in the pretranslocated register (Figure 2) [38, 43, 44]. Both effects may result from nascent hairpin-stabilization of a fully open clamp conformation. A role of the trigger loop in pausing also is supported by pause-altering effects of lineage-specific sequence insertions, transcription factors (TFs), and small molecules (such as tagetitoxin for bacterial RNAP and alpha-amanitin for RNAPII).
The existence of the elemental pause is best documented in bacteria; it has been argued that eukaryotic RNAPII passes directly into a backtrack pause state without an elemental pause intermediate [49, 50], but current data on this point are inconclusive . Compared to the bacterial enzyme, what happens in the catalytic center of RNAPII during pausing is less well understood, and hairpin-stabilization of RNAPII pausing has not been described to date. However, RNAPII shares essentially all the active-site components of the bacterial enzyme, such as the highly conserved trigger loop/helices and bridge helix, making it reasonable to infer mechanistic similarities between these enzymes.
Although the mechanistic classes of pauses described above and illustrated in Figure 2 provide the most useful way to distinguish effects on the RNAP active site, a different classification of pauses based on effects of nascent RNA structure best illustrates the main points of this review. Thus, from the perspective of nascent RNA folding, pauses can be placed in three categories (Figure 3): (i) hairpin-enabling pauses, which allow time for nascent RNA structures to form or rearrange prior to pause escape and whose dwell time is unaffected by the changes in nascent RNA structure (these pauses may be either elemental or backtrack pauses); (ii) hairpin-inhibited pauses, in which dwell time is decreased by a nascent RNA structure that favors forward translocation by pulling on the RNA in the hybrid (principally backtrack pauses, but could also be elemental pauses); and (iii) hairpin-stabilized pauses, in which an initial elemental pause allows time for formation of a pause-prolonging RNA hairpin 11-12 nt from the RNA 3′ end that interacts with the RNA-exit channel, stabilizes an open-clamp RNAP conformation, and shifts the RNA 3′ nucleotide into a frayed position in the active site. The hairpin-stabilized and hairpin-inhibited pauses were originally described as class I and class II pauses, respectively. The hairpin-enabling pauses emerged as a new category thanks to improved understanding of pausing-guided RNA folding [51-53]. Taken together, this trio of transcriptional pause types reflects an intimate interplay between RNA hairpin formation and RNAP kinetics that exerts temporal effects on elongation. Besides providing kinetic instructions for concurrent RNA synthesis and folding, RNA hairpins also exert strong spatial effects on the register of translocating EC on DNA.
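The dwell-time criterion that separates these three categories can be written as a simple decision rule. The sketch below is illustrative only: it assumes that dwell times have been measured with and without the nascent RNA structure (for example, before and after disrupting the hairpin), and the function name and the 20% tolerance are hypothetical choices rather than established thresholds. It captures only the dwell-time part of the definitions; whether a structure actually forms during a hairpin-enabling pause must be established separately.

```python
def classify_pause(dwell_with_structure: float,
                   dwell_without_structure: float,
                   tolerance: float = 0.2) -> str:
    """Classify a pause by how nascent RNA structure changes its dwell time.

    Both dwell times are in the same units (e.g. seconds). `tolerance` is a
    hypothetical fractional change below which the dwell time is treated
    as unaffected by the structure.
    """
    if dwell_with_structure <= 0 or dwell_without_structure <= 0:
        raise ValueError("dwell times must be positive")

    ratio = dwell_with_structure / dwell_without_structure
    if ratio > 1.0 + tolerance:
        return "hairpin-stabilized"   # structure prolongs the pause
    if ratio < 1.0 - tolerance:
        return "hairpin-inhibited"    # structure shortens the pause
    return "hairpin-enabling"         # dwell time unaffected by structure


if __name__ == "__main__":
    # Hypothetical measurements (seconds): with structure, without structure.
    print(classify_pause(10.0, 1.0))
    print(classify_pause(0.3, 1.0))
    print(classify_pause(1.05, 1.0))
```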
As robust mechanical devices with defined dimensions, shape, and stability, nascent RNA hairpins effectively restrict lateral movements of RNAP on DNA. Hairpin inhibition of pausing by restriction of backtracking has been documented for both bacterial RNAP and eukaryotic RNAPII. These effects extend beyond pausing; backtracking also plays key roles in transcription error correction and transcription-coupled repair. Backtracking entails not only extrusion of the 3′ RNA into the secondary channel (or funnel-pore) and shifting the nucleotides in the DNA bubble and RNA-DNA hybrid, but also movement of upstream RNA back into RNAP (Figures 2 and 3). Formation of RNA hairpins and other structures at the upstream edge of RNAP generates a mechanical barrier to this movement of the upstream RNA, and thus indirectly suppresses backtracking and promotes forward RNAP motion (Figure 1E). Consistent with this idea, increased GC content in RNA sequences, which should increase both the prevalence and stability of nascent RNA secondary structures, correlates with reduced pause frequency and duration. Ablation of nascent RNA structures by RNase treatment, as expected, abolishes the impact of GC content on pausing. For at least RNAPII, and likely also for bacterial RNAP on noncoding RNA genes, numerous RNA structures that form behind the RNAP act as interspaced barriers that aid RNAP in maintaining its forward-moving motion. However, strong hairpins that zip behind the RNAP may push the enzyme forward faster than nucleotide addition, leading to shearing of the DNA-RNA hybrid, as observed in slippage and termination [57, 58].
When transcribing homopolymeric tracts of thymidines or adenosines, both single- and multi-subunit RNAPs exhibit an enhanced propensity for transient shearing of the DNA-RNA hybrid, followed by realignment of the template and transcript in a shifted register. Such transcript slippage, or reiterative transcription, produces a heterogeneous population of transcripts that contain insertions and deletions, many of which shift the translation frame of mRNAs and thus affect the ratio of truncated versus full-length proteins with distinct biochemical properties. These slippage events can be affected by formation of nascent RNA structures. For example, formation of a nascent RNA hairpin was recently reported to cause RNAP to skip a nucleotide in a transposase gene of the bacterium Roseiflexus; this slippage event is required to produce the active transposase. This nascent RNA hairpin is proposed to partially melt the upstream end of a rU5C4 RNA-DNA hybrid and thus promote forward slippage (Figure 1F). Notably, this RNA structure also causes robust slippage and realignment for yeast RNAPII, suggesting a general capability of such nascent RNA devices to alter RNAP register on DNA.
In addition to effects on slippage and frameshifting, if the mechanical shearing force on the DNA-RNA hybrid exerted by a strong RNA hairpin exceeds a certain threshold (especially when coupled with an adjacent weak homopolymeric rU-dA hybrid) then the transcript 3′ end can be pulled out of the active site, ultimately leading to release of the transcript from the EC (Figure 1G). Thus, long (≥7 bp), GC-rich RNA hairpins followed by ≥3 tandem uridines constitute an intrinsic, or Rho-independent, transcription terminator. This signal is a ubiquitous cis-acting RNA device that dissociates bacterial RNAP from DNA and RNA 7-9 nt 3′ from the hairpin and demarcates the end of transcription units. Progressive formation of a stable RNA hairpin, in particular the annealing of the 2-3 bp at the bottom of the terminator hairpin, exerts a strong lateral pulling force that extracts the RNA from the upstream hybrid, accompanied either by hybrid shearing if the 3′ hybrid is predominantly rU-dA or by forward translocation without nucleotide addition if the 3′ hybrid is stronger. Losing the major source of thermodynamic stability, the EC then collapses and dissociates (Figure 1G) [30, 57, 61, 62].
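The sequence signature of an intrinsic terminator stated above (a GC-rich hairpin with a stem of at least 7 bp, followed by at least 3 tandem uridines) can be turned into a naive sequence scan. The following is a minimal sketch and not a validated terminator predictor. It assumes perfect Watson-Crick stems without bulges or G-U wobble pairs, a U tract that begins immediately after the stem, and hypothetical cutoffs for loop size (3 to 8 nt), maximum stem length, and GC fraction. The function name is likewise illustrative.

```python
# Watson-Crick pairs only; G-U wobble pairs are ignored in this sketch.
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def find_intrinsic_terminators(rna, min_stem=7, max_stem=15,
                               loop_sizes=range(3, 9), min_u=3, min_gc=0.6):
    """Return (start, stem_bp, loop_nt, hairpin_end) for candidate terminators.

    `start` is the 0-based index of the first stem nucleotide and
    `hairpin_end` is the index just past the last stem nucleotide, where
    the U tract must begin. All cutoffs except min_stem=7 and min_u=3
    are hypothetical defaults.
    """
    seq = rna.upper().replace("T", "U")
    hits = []
    for start in range(len(seq)):
        hit = None
        # Try the longest stem first so each start reports one hairpin.
        for stem in range(max_stem, min_stem - 1, -1):
            for loop in loop_sizes:
                end = start + 2 * stem + loop
                if end + min_u > len(seq):
                    continue
                five_arm = seq[start:start + stem]
                three_arm = seq[end - stem:end]
                paired = all((a, b) in WATSON_CRICK
                             for a, b in zip(five_arm, reversed(three_arm)))
                if not paired:
                    continue
                gc_fraction = sum(base in "GC" for base in five_arm) / stem
                if gc_fraction >= min_gc and seq[end:end + min_u] == "U" * min_u:
                    hit = (start, stem, loop, end)
                    break
            if hit:
                break
        if hit:
            hits.append(hit)
    return hits


if __name__ == "__main__":
    # Synthetic example: 7-bp GC stem, UUCG loop, then a U tract.
    example = "AAUA" + "GCCGGCG" + "UUCG" + "CGCCGGC" + "UUUUUU"
    print(find_intrinsic_terminators(example))
```

A realistic search would score hairpin stability thermodynamically and tolerate imperfect stems; the scan above only demonstrates how the two qualitative criteria in the text combine.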
In addition to acting at or within the RNA exit channel, nascent RNA structures can bind and become anchored to sites on the RNAP surface, modifying the elongation properties of the enzyme (Figure 1H). One prominent example is the Lambdoid bacteriophage HK022 putL RNA, which consists of two conserved hairpins linked by an unpaired guanosine. The 65-nt putL nascent RNA exerts a local anti-pausing effect via suppression of RNAP backtracking, and more significantly, modifies the RNAP to render it resistant to multiple intrinsic and factor-dependent terminators over long distances, thus increasing expression of downstream viral genes . Another RNAP-anchored antiterminator RNA, the first found in Gram-positive bacteria, is the recently described EAR RNA structure (~120 nt); EAR enhances the expression of downstream exopolysaccharide genes and thus drives biofilm formation in Bacillus subtilis .
The broad-spectrum effects of putL and EAR RNA on several different transcription terminators suggest they affect a common step towards termination. putL was proposed to act by interfering with either the formation or the action of the terminator hairpin. In light of the newly revealed connection between clamp opening and RNAP pausing , it is also possible that putL and EAR RNA might suppress pausing and termination by physically preventing clamp movement.
The ability of anchored nascent RNA structures to create pause- and termination-resistant ECs resembles protein- and ribonucleoprotein-based antitermination systems such as λN, λQ, RfaH, and Nus factor-mediated antitermination in the E. coli rrn operons [64-66]. Notably, the transcription of early λ genes is activated by a short, two-part RNA sequence (the linear boxA sequence and the boxB hairpin), which recruits the λN and Nus proteins to modify the RNAP. In contrast, λQ and RfaH act through binding the DNA rather than nascent RNA. It remains unclear if these RNA-based, protein-based, and hybrid antitermination systems utilize a unified fundamental mechanism, or target distinct steps of the termination pathway or different RNAP structures (e.g., the active site, flap, or clamp) to confer pause- and termination-resistance.
Reciprocal to the myriad effects of nascent RNA structures on RNAP, RNAP and associated TFs control how nascent RNA folds. In principle, RNAP and regulators can modulate RNA folding in at least three ways. First, the simple fact of 5′-to-3′ synthesis creates a positional bias that favors formation of local structures over structures involving distant segments. Second, transcriptional pausing can create an extended time window for the formation of slow-folding RNA structures, in essence selectively augmenting the consequences of vectorial synthesis at key positions. These positional and temporal effects on RNA folding are well recognized, although only a limited set of examples are understood in detail. Such effects can kinetically trap RNA in structures that may not be thermodynamically favored in the full-length RNA. Alternatively, vectorial synthesis and pausing may aid RNA folding into transient structures that protect the nascent RNA against kinetic traps until more 3′ segments required to form long-range interactions are synthesized. Third and finally, the transcriptional machinery may actively modulate nascent RNA folding through protein-RNA interactions that either favor or disfavor formation of particular RNA structures. These interactions may be direct contacts of RNAP to RNA or may be contacts made by accessory elongation factors like NusA that can be thought of as nascent RNA chaperones.
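The second effect, in which a pause lengthens the time window for a slow-folding structure, can be made concrete with a simple kinetic competition. The sketch below assumes that folding and pause escape are independent single-step (exponential) processes, so that the probability of folding before RNAP escapes depends only on the ratio of the two rates. The function name and the rate values are hypothetical and serve only to show the trend.

```python
def probability_folds_before_escape(k_fold: float, k_escape: float) -> float:
    """Probability that an RNA structure forms before RNAP leaves the pause.

    Assumes folding (k_fold) and pause escape (k_escape) are independent
    first-order processes, both in s^-1.
    """
    return k_fold / (k_fold + k_escape)


if __name__ == "__main__":
    k_fold = 0.1  # hypothetical slow-folding structure (s^-1)
    # Hypothetical escape rates: a brief pause versus a 10-fold longer pause.
    for k_escape in (1.0, 0.1):
        p = probability_folds_before_escape(k_fold, k_escape)
        print(f"k_escape = {k_escape} s^-1 -> P(fold during pause) = {p:.2f}")
```

In this toy model, slowing pause escape tenfold raises the chance that the structure forms during the pause from roughly 9% to 50%, which illustrates how pause duration could bias which structure forms without any change in the sequence of the RNA.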
To illustrate these effects, we will describe documented examples of (i) pause-guided folding of gene-regulatory RNAs in attenuation and riboswitch-regulated transcription units; (ii) pause-guided folding of catalytic RNAs; and (iii) pause-guided processing of nascent mRNAs made by RNAPII (see also Table 1).
The best-characterized cases of pause-guided RNA folding occur in the 5′ leader regions of bacterial operons that are regulated by attenuation [31, 68, 69]. Strong pauses were first discovered just downstream from the first significant RNA structures of the alternatively folded leader transcripts of certain amino-acid biosynthetic operons in enterobacteria (e.g., the trp, leu, or his operons [26, 68, 70-73]). These attenuator RNAs operate by coupling the position of the translating ribosome with formation of alternative RNA structures, which dictate the expression of the downstream coding genes. Translation of tandem codons for an amino acid that is synthesized by the downstream gene products makes the position of the translating ribosome sensitive to the level of the cognate aminoacyl-tRNA, which is a function of intracellular supply of that amino acid. These pauses may both guide formation of RNA hairpin structures and allow time for ribosomes to initiate translation of the leader peptide, thus enabling ribosome movement-governed alternative RNA structure formation. Which RNA structure forms in turn determines whether intrinsic termination occurs upstream of the first structural gene.
In the trp and his cases, these RNA structures are located 11 nt before the paused RNA 3′ end and increase the pause dwell time, apparently by stabilizing the open-clamp conformation of RNAP [18, 28, 37, 74, 75]. Ribosomes, once loaded on the leader RNA, appear to release the paused RNAP by disrupting the pause hairpin, which thus has two distinct synchronizing functions (Figure 4A). First, by delaying RNAP, the pause hairpin allows time for ribosomes to initiate translation and translocate closer to the RNAP. Second, by triggering release of the paused RNAP when melted by the ribosome, the pause hairpin synchronizes RNAP and ribosome movements to enable the attenuation decision.
The his and trp pause hairpins increase the dwell time of pauses by a factor of ~10 alone, or ~30 in the presence of NusA [24, 26, 76], although even larger hairpin effects have been reported for other pauses. This pause-stabilizing activity of NusA is attributable to interaction of the NusA N-terminal domain (NTD) with a segment of the RNA duplex at the mouth of the exit channel [78, 79]. Little NusA effect is evident when a short duplex is present but shielded within the exit channel or when the 8-nt loop of the his pause hairpin is converted to a UUCG tetraloop [18, 74]. This effect of NusA NTD likely includes stabilization of the pause hairpin since NusA NTD increases the rate of artificial duplex formation by a factor of 2. However, both the extent of RNA structure stabilization by NusA and possible effects of RNA sequence, shape, and structure on the extent of stabilization are unknown.
Interestingly, similarly positioned pause hairpins appear to have similar pause-prolonging functions that synchronize RNAP movements with binding of the trp RNA-binding attenuation protein (TRAP) at two distinct sites in the B. subtilis trp leader region [80, 81]. Notably, the mechanisms by which tryptophan controls gene expression differ significantly among enterobacteria and different firmicutes lineages (via the extent of tRNATrp charging sensed by a translating ribosome, or tRNATrp charging sensed by a T-box riboswitch, or direct tryptophan binding to TRAP). These B. subtilis hairpin-stabilized pauses establish that the broad parameters by which RNAP and NusA guide RNA structure formation are likely similar across bacterial genera, although a lack of effect of the E. coli his pause site on B. subtilis RNAP suggests that some as-yet unexplained, species-specific differences exist. Further, B. subtilis NusG significantly enhances the B. subtilis trp leader pauses, whereas E. coli NusG has little effect on hairpin-stabilized pauses and its E. coli paralog RfaH can suppress hairpin-stabilized pauses. Thus, at least some accessory elongation factors differ significantly in their effects on the pause-nascent RNA structure interplay (see next sections).
Recently, another clear example of hairpin-stabilized pausing governing folding of regulatory leader RNAs was reported for the mgtA operon encoding a Mg2+ transporter in enterobacteria ; in this case, enhancing the pause may both guide formation of alternative leader RNA structures, one of which is a Mg2+-binding riboswitch, and serve as the site of Rho-dependent termination when Mg2+ levels are high. The striking similarity of the mgtA leader pause to the his and trp leader pauses suggests that hairpin-stabilized pauses are more widespread in bacteria than has been appreciated.
Additional well-documented cases of pausing that guides nascent RNA folding occur in the leader regions of operons controlled by riboswitch-based attenuation mechanisms (Figure 4B). For instance, in the leader region of the B. subtilis ribDEAHT operon, which encodes flavin mononucleotide (FMN) biosynthetic enzymes, the lifetimes of two pause sites located downstream of the FMN-binding aptamer domain determine the effective concentrations at which FMN can stabilize the aptamer prior to leader RNA rearrangement into an antitermination structure  (Table 1). These riboswitch pauses are stimulated by NusA. However, whether transient pause hairpins affect these long-lived pauses and how the hairpin pauses alter nascent RNA folding pattern are unknown. The same is true for several other riboswitches for which pausing has been reported to affect ligand binding or RNA folding, such as in the leader regions of the E. coli btuB, thiM, and alx genes [25, 52, 53, 85] (Table 1). Folding of the B. subtilis pbuE adenine riboswitch is another potential example, but the role of pauses here remains uncertain despite high-resolution, single-molecule mapping of positions of nascent RNA folding during riboswitch synthesis [86-88]. For these potential riboswitch-regulating pauses, additional studies are needed to understand the interplay between pausing and nascent RNA folding, the precise nature of the pause mechanisms, and the possible roles of NusA or NusG in their functions.
Definitive evidence that the interplay of RNAP pausing and nascent RNA dictates the folding of biologically active RNA structures comes from landmark work on folding of P-RNA, the RNA component of RNase P, which is responsible for 5′ endonucleolytic maturation of pre-tRNA and other RNA precursors [22, 23]. In an initial study of a circularly permuted form of B. subtilis P-RNA, a strong NusA-enhanced pause was found to accelerate folding of the P-RNA into a catalytically active form. Either a mutant RNAP that abrogated pausing or omission of NusA slowed correct folding dramatically. Subsequently, a similar pause-dependent enhancement of folding was found for the wild-type E. coli P-RNA, which appears to depend on a strong consensus pause. Interestingly, this pause occurs 13 nt after the P8 helix of P-RNA, is detectable as a strong pause in vivo, but is unaffected either by disruption of the P8 helix or inclusion of NusA. Thus, the E. coli P-RNA pause appears to be a strong hairpin-enabling elemental pause that neither backtracks (which would presumably increase upon P8 disruption) nor is hairpin-stabilized. This pause appears to allow time for the formation of a meta-stable, non-native structure that sequesters and protects the 5′ segments from being trapped by stable, catalytically inactive structures (Figure 4C). This elegant demonstration of how transcriptional pausing can kinetically guide nascent RNA folding into biologically active structures by enabling formation of transient “protecting” structures is the best understood example of what is likely a general paradigm for pause-guided nascent RNA folding (Figure 4C). Similar effects are proposed for E. coli signal-recognition-particle RNA and hybrid transfer-messenger RNA, but those pauses are less evident in vivo and will require further study.
NusA, present in bacteria and possibly archaea, and NusG (Spt5), present in all three domains of life, are the most universal regulators of transcript elongation. E. coli NusA associates with most if not all ECs in vivo [22, 23, 90]. Although the NusA NTD is sufficient for effects on pausing and termination, most obviously by promoting formation of RNA duplexes [37, 74], NusA also contains other putative RNA binding domains: an OB-fold S1-like domain and two KH domains. These domains appear to interact with nascent RNA in at least some contexts [79, 92, 93]. Thus, a key question is whether the ability of NusA to promote folding of P-RNA or other RNAs into biologically active form resides only in its pause-promoting NTD or might involve RNA chaperone-like functions of the other domains. These domains could either sequester upstream ssRNA to promote duplex formation closer to RNAP or they could directly bind to and stabilize duplex formation, like the NTD. Conceivably, their functions could differ among different nascent RNAs and could connect to other RNA chaperones known to promote RNA folding into biologically active forms [94, 95], leaving fertile area for study.
The effects of NusG on pauses in E. coli and B. subtilis clearly differ, either reducing or enhancing pausing, respectively [96, 97]. Mycobacterial and T. thermophilus NusG also appear to enhance pausing [98, 99]. NusG appears to associate with most ECs in vivo , and to exert its effect in E. coli by inhibiting backtracking or promoting forward translocation . Although NusG alone can bind weakly to RNA in vitro , its interactions in the context of an EC appear to be with DNA . Thus, the effects of NusG on RNA folding are likely to be indirect through modulating pausing and thereby time available for RNA folding. However, direct effects have not been excluded and may be a greater possibility in the eukaryotic ortholog of NusG, Spt5, which contains multiple C-terminal KOW domains that could reach to the nascent RNA from the binding site of the NTD on the RNAP clamp domain.
The roles of transcriptional pausing in eukaryotic gene expression, especially in metazoans and particularly in the promoter-proximal region, have recently become widely appreciated [101, 102]; however, its roles in guiding RNA folding are much less well understood than in bacteria. Detailed studies that define and dissect specific hairpin-enabling, hairpin-inhibited, or hairpin-stabilized pauses by RNAPII have yet to occur. Nonetheless, multiple recent findings suggest that pausing by RNAPII likely guides folding and processing of nascent mRNAs and primary microRNAs. Two recent native elongating transcript sequencing (NET-seq) studies that reveal RNAPII pause sites in mammalian cells at high resolution give a genome-scale perspective to these effects [103, 104]. Many of the RNAPII pauses cluster in promoter-proximal regions, poly(A) sites, termination regions, and at splice-site junctions, especially 3′ splice site junctions of retained but not skipped exons (Figure 4D), consistent with prior studies at lower resolution [105-107].
The major function of promoter-proximal pauses, which are present on most metazoan RNAPII transcription units, is thought to be enabling regulated gene expression by maintaining the promoter in a nucleosome-free state. Thus RNAPII remains available for activation and the extent of pausing vs. activated transcription is subject to regulation by a balance of TF activities. Pause regulation appears to be accomplished principally by the pause-promoting negative elongation factor, NELF, and the positive-acting protein kinase complex, P-TEFb [101, 108-110]. However, the possible interplay of nascent RNA structure and promoter-proximal pausing or even the sequence-dependence of promoter-proximal pausing has not yet received as much attention. The bacterially defined consensus pause clearly affects mammalian RNAPII [42, 111] and some evidence exists both for a vaguely consensus pause-like sequence (the “pause button”) and for backtracking as contributors to promoter-proximal pausing [105, 106, 112, 113]. However, DNA-bound TFs also clearly contribute significantly to pause dwell times either by restraining RNAPII through tethering interactions when bound upstream of the enzyme or by roadblocking RNAPII progress when bound downstream. Pause determination by complex sets of interactions [101, 105] is consistent with the multipartite pause signals defined for bacterial RNAPs , but also means that effects of RNA structures on pausing and conversely of pausing on nascent RNA folding may have easily been overlooked.
One model system of promoter-proximal pausing, the HIV-1 leader region, has been studied in some detail and suggests analogies to the mechanisms observed in bacteria. Here, RNAPII enters a backtrack pause in vitro at +62 just before the nascent RNA rearranges from an initial hairpin structure to the TAR hairpin that recruits Tat and P-TEFb [115-119] (Table 1). These ~60 nt RNAs accumulate in vivo prior to full HIV-1 activation . Interestingly, NELF-E, the RRM-containing, RNA-binding component of NELF  that aids HIV-1 leader pausing  interacts with the loop of TAR sequence-specifically . Thus, NELF-E may be a chaperone of nascent RNA structure formation for RNAPII, possibly analogous to the function of NusA for bacterial RNAP. The HIV-1 model system offers the best defined opportunity to fully dissect the nascent RNA structure – RNAPII interplay.
Strong evidence for an interplay between RNA structure formation and promoter-proximal pausing by RNAPII also comes from recent findings by Steitz, Sharp, and their co-workers that some microRNA precursors in mouse embryonic stem cells, which form distinctive RNA secondary structures and are processed by Dicer to generate Argonaute-programming miRNAs, are derived from promoter-proximal pause RNAs (Figure 4D) [124, 125]. These so-called TSS-miRNAs are proposed to destabilize paused ECs through effects on RNAP conformation similar to those documented for bacterial hairpin-stabilized paused ECs [18, 37]. Thus, it will be of particular interest to determine whether pre-miRNA structures function as pause hairpins during promoter-proximal pausing by RNAPII, whether promoter-proximal pauses play enabling roles in nucleation of pre-miRNA structures, and whether pause duration or alternative folding of nascent RNAs play regulatory roles in production of TSS-miRNAs.
The same questions apply to the better known route for biogenesis of pre-miRNAs, which are excised from primary RNAPII transcripts or intronic RNAs by the microprocessor complex of RNaseIII-like Drosha and its dsRNA-binding partner Pasha . Although NET-seq results confirm prior evidence for co-transcriptional action of microprocessor , the extent of pausing near the processing sites and thus the potential for interplay between nascent RNA structure formation and pausing in co-transcriptional miRNA biogenesis is as yet unclear. The detection of heterogeneous 3′ ends may be attributed either to pausing or to processed RNAs bound to RNAPII, pointing to the need for further study.
Transcriptional pauses in the vicinity of 3′ splice-site junctions also appear to be crucial in determining whether an exon is included in mature mRNA or skipped in favor of splicing to a downstream 3′ splice-site junction (Figure 4D) [103-105]. The interplay of pausing and splicing may be analogous to the bacterial interplay between pausing and translation . Similar to promoter-proximal pausing, however, the relative roles of DNA-RNA sequence, chromatin structure, DNA modifications, regulatory factor binding, and nascent RNA structure in determining pause site selection and dwell time at splice-site junctions remain to be determined. Given the complex RNA structural rearrangements involved in spliceosome assembly and catalysis, ample opportunity exists for an interplay between nascent RNA folding and splice-site proximal pausing to play regulatory roles in alternative splicing. Making the situation even more complex, exon-skipping appears to allow splice-site RNA to instead be processed into pre-miRNA in metazoans, including humans  (Figure 4D). Thus, a competition between binding of microprocessor and spliceosomal components at splice-site junctions may underlie these alternative RNA fates, leading to the obvious speculation that an interplay between RNA folding and transcriptional pausing could influence this competition. Detailed studies of good model systems will be required to characterize such interplay if it occurs.
Finally, pausing also can be detected at sites of transcript cleavage and polyadenylation, and at sites of RNAPII termination  (Figure 4D). Less is known about the possible roles of nascent RNA structures in these concluding stages of mRNA synthesis, but ample possibilities exist for alternative nascent RNA folding to modulate both pausing and the interactions of cleavage, polyadenylation, and termination factors that mediate these events.
Our review of the current understanding of the interplay among nascent RNA structures, RNAP, and transcriptional pausing reveals unambiguous evidence from bacteria that nascent RNA structure formation exerts extensive control of the transcribing RNAP at multiple steps (pausing, termination, antitermination, backtracking, slippage, etc.) from both within and outside of the RNA exit channel. In turn, RNAP kinetics, pausing, and TFs like NusA guide nascent RNA structure formation for gene-regulatory and catalytic RNAs. These findings suggest that linkage of pausing and folding of nascent RNA into biologically active structures is robust and likely widespread. Nonetheless, important questions remain (see Outstanding Questions). Among these, a fundamental question is whether the effect of RNAP on RNA folding is purely kinetic (i.e., providing necessary temporal instructions), or alternatively if the RNAP exit channel may actively aid RNA folding as a chaperone through physical interactions with RNA structures. The latter, if true, would be reminiscent of the ribosomal exit channel, which induces helicity in nascent polypeptides during co-translational protein folding . For eukaryotic transcription, however, much less is known, even though many leads point to an even more complex and interesting interplay between pausing and nascent RNA folding. Study of good model systems, such as HIV-1, will be the key to gaining definitive insights. Effective methods now exist to map nascent RNA secondary structures [130-132] and to halt RNAP at desired positions on DNA either kinetically or by nucleotide deprivation . Genome-wide methods to study transcription pausing [42, 103, 113, 134] and RNA structure [135-138] at single-nucleotide resolution likely can be combined to generate in vivo maps of nascent RNA folding.
A combination of such powerful genome-scale approaches with detailed biochemical study of good model systems will provide the most productive path to a complete understanding of the fascinating interplay between transcriptional pausing and nascent RNA folding.
Elemental pause: A pause that results from rearrangements in the transcription complex active site and that does not involve backtracking. The elemental pause appears to be favored by a consensus sequence in bacteria, and to be a precursor to backtrack pauses or hairpin-stabilized pauses.
Backtrack pause: A long-lived pause caused by reverse translocation of RNA and DNA through RNAP. Backtracking threads the 3′ RNA segment through a proofreading site, which is occupied by the 3′ ribonucleotide in the 1-nt backtracked state, and into the secondary channel (pore and funnel in RNAPII) in multi-nt backtrack states. The newly synthesized nascent RNA ≥15 nt from the 3′ end is drawn back into the RNA exit channel and eventually into the RNA:DNA hybrid in multi-nt backtrack states.
Hairpin-enabling pause: An elemental or backtrack pause that allows time for nascent RNA structure formation or rearrangement before further synthesis allows more of the RNA chain to emerge from the RNA exit channel of RNAP and participate in RNA folding, but whose dwell time is unaffected by formation or rearrangement of nascent RNA structure.
Hairpin-inhibited pause: An elemental or backtrack pause that allows time for nascent RNA structure formation or rearrangement and at which the new nascent RNA structure reduces the pause dwell time by inhibiting backtracking or favors forward translocation because the new structure pulls ssRNA out of the RNA exit channel.
Hairpin-stabilized pause: A long-lived pause in which an RNA duplex forms in the RNA exit channel 11 or 12 nt from the RNA 3′ end, favors an open-clamp conformation of RNAP, and increases the duration of the pause up to 100-fold in the presence of NusA. NusA NTD interacts with the loop or duplex region of RNA structures at the mouth of the exit channel ~5 bp or more from the base of the duplex within the exit channel.
Consensus pause: Pausing at a consensus sequence whose strongest determinants are G−10nnnnnnnnY−1R+1 (where the pause RNA 3′ nucleotide is −1) but for which sequences at +2 to +8, −9 to −2, and −11 and upstream also can contribute. The contributions to pausing of interactions between RNAP and the nucleic acids in these different segments are generally additive  and can contribute to elemental, backtrack, or hairpin-stabilized pause classes depending on which subsequent rearrangements are favored once the initial elemental pause forms.
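As an illustration of the consensus pause definition, the following minimal Python sketch scans a sequence for the strongest determinants only: G at −10, any eight nucleotides at −9 to −2, a pyrimidine (Y) at −1, and a purine (R) at +1. It assumes that Y occupies −1 (as the eight intervening n positions imply) and that the input is the non-template DNA strand or the equivalent RNA sequence. The function name `find_consensus_pauses` is hypothetical, and the contributions of the +2 to +8 and upstream segments are ignored. A match marks a candidate site only and does not imply that RNAP pauses there.

```python
import re

# Strongest determinants only: G(-10), eight unspecified nt (-9..-2),
# Y = C/T at -1, R = A/G at +1. The lookahead lets overlapping
# candidate sites be reported rather than only the first of each cluster.
CONSENSUS = re.compile(r"(?=(G[ACGT]{8}[CT][AG]))")


def find_consensus_pauses(sequence):
    """Return 0-based indices of the -1 position (the pause RNA 3' nucleotide)
    for every match to the G(-10) N8 Y(-1) R(+1) motif.

    Assumption: `sequence` is non-template-strand DNA or RNA, written 5' to 3'.
    """
    seq = sequence.upper().replace("U", "T")  # accept RNA or DNA letters
    # The match starts at the -10 G, so the -1 nucleotide is 9 positions later.
    return [m.start() + 9 for m in CONSENSUS.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence built for illustration only (not from any real gene).
    print(find_consensus_pauses("AAGACGTACGTCGAA"))
```

A fuller treatment would replace the regular expression with a position-weighted score, since the glossary entry notes that flanking segments also contribute and that the contributions are generally additive.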
We thank David Brow, Andreas Mayer, and members of the Landick lab for many helpful suggestions on the manuscript. Research in the laboratory of R.L. is supported by a grant from the NIH (GM38660). Research in the laboratory of J.Z. is supported by the intramural research program of NIDDK, NIH. We apologize in advance to our colleagues that lack of space precluded us from citing all the relevant prior publications.