Recent years have witnessed a sea change in our understanding of transcription regulation: whereas traditional models focused solely on the events that brought RNA polymerase II (Pol II) to a gene promoter to initiate RNA synthesis, emerging evidence points to the pausing of Pol II during early elongation as a widespread regulatory mechanism in higher eukaryotes. Current data indicate that pausing is particularly enriched at genes in signal-responsive pathways. Here the evidence for pausing of Pol II from recent high-throughput studies will be discussed, as well as the potential interconnected functions of promoter-proximally paused Pol II.
Higher organisms have evolved sophisticated mechanisms for responding in an integrated and balanced manner to a variety of developmental, environmental and nutritional cues by precisely modulating transcription output. In response to both intra- and extracellular cues, organisms must execute complex programs that require exquisite regulation of both the timing and level of gene expression. These different transcriptional regulatory programs are orchestrated by the concerted action of sequence-specific transcription factors that recruit the transcription machinery. The enzyme that transcribes messenger RNA from protein-encoding genes is Pol II. With the help of a constellation of accessory factors, Pol II executes a series of distinct steps: it binds to promoters, initiates RNA synthesis and then pauses in early transcriptional elongation. The paused Pol II remains stably associated with the nascent RNA and is fully capable of resuming elongation; however, further signals are needed to elicit the transition to a productive elongation complex. Once this maturation occurs, the polymerase processively progresses through the gene, terminates and eventually re-initiates transcription. Understanding how developmental and homeostatic transcriptional programs operate requires that we know the transcription factors that are involved and their targets. But just as important is an understanding of the mechanisms by which the interplay between Pol II and regulatory factors leads to highly specific, yet readily modulated transcription profiles.
Traditional models of eukaryotic gene regulation were based largely on studies in S. cerevisiae, which emphasized primarily the recruitment step in the transcription cycle and assumed that little regulation occurred after the formation of a Pre-Initiation Complex (PIC). However, recent findings in metazoan systems have revealed that much of transcription regulation occurs well after the recruitment of Pol II and the transcription machinery to a gene promoter, through controlling pausing and the efficiency of early elongation. Thus, we are in the midst of a paradigm shift in our understanding of gene regulation as it applies to higher eukaryotic systems.
Here, we focus on the promoter-proximal pausing of Pol II and its regulated escape into productive elongation1. In this Review, we use the shorthand of calling promoter-proximal pausing simply Pol II pausing, although (as discussed below) there is evidence that the polymerase can pause during productive elongation as well. We describe the basic biochemical properties of paused Pol II and recent evidence from genome-wide studies indicating that this type of regulation is widespread in metazoans2–9. We then discuss how Pol II pausing can influence chromatin structure at promoters to facilitate gene activity, and how pausing might lead to rapid or synchronous transcriptional responses when cells are exposed to an activation signal. We also highlight how regulation of early elongation can interplay with factors regulating Pol II recruitment and the RNA processing machineries to finely modulate transcription in response to distinct signals occurring during development, homeostasis, and disease.
Although pausing has only recently been recognized to be a prevalent regulatory strategy, evidence that transcription elongation could be a rate limiting step in gene expression surfaced more than thirty years ago. A number of studies in mammalian cell culture in the late 1970’s and early 80’s indicated that transcription, once initiated, did not obligatorily produce a full-length transcript10, 11. Insight into when this post-initiation block occurred came from in vivo analyses of the uninduced Drosophila melanogaster heat shock (Hsp) genes in the Lis laboratory (using UV protein-DNA crosslinking12, nuclear-run on13, permanganate footprinting14 and analysis of the short, capped RNAs15) (Box 1). These studies revealed that transcriptionally engaged polymerase accumulates just downstream of the Hsp promoters, associated with 20–60 nucleotide-long nascent RNA13, 15. The properties of these promoter-associated Pol IIs were strikingly similar to those ascribed by Roberts and colleagues to E. coli RNA polymerases that pause at the start of the lambda late gene transcription unit16. Thus, the Lis group referred to the promoter-proximal Pol II found at the Hsp genes as ‘paused’17.
Chromatin immunoprecipitation (ChIP) involves protein-DNA crosslinking coupled with immunoprecipitation. When an antibody targeting Pol II is used, ChIP can identify regions of DNA that are bound by Pol II. We note that several antibodies are available that recognize different phosphorylation states of the Pol II CTD, including the early elongation form characterized by predominant Ser7/Ser5-P and the productive elongation form that is also phosphorylated at Ser2 (ref. 91). However, because our knowledge is incomplete concerning how the many reported modifications of the CTD affect the affinity of these antibodies for their target epitope and because we do not know the complete modification status of Pol II at every step of the transcription cycle, we caution against using phospho-CTD antibodies as the sole method for establishing the presence of a paused Pol II.
Advantages: A snapshot of Pol II distribution can be achieved through rapid cross-linking of whole cells. Analysis of individual genes is straightforward using quantitative PCR. ChIP is readily adapted for high-throughput genome-wide studies, either by hybridizing immunoprecipitated DNA to an array (ChIP-chip) or through high-throughput sequencing of Pol II-bound DNA (ChIP-seq).
Disadvantages: Low spatial resolution and sensitivity. ChIP signal and specificity are highly dependent upon the antibody used.
Permanganate footprinting detects locally melted regions of DNA, such as those arising from paused polymerase, by selectively modifying unpaired thymines within a stable, open transcription bubble. Modified thymines are then converted to strand breaks that are visualized by Ligation-Mediated (LM) PCR.
Advantages: Can be performed directly on whole cells or tissues. Achieves nucleotide-level resolution for mapping paused polymerase. Does not require antibodies.
Disadvantages: Low throughput, since the readout involves LM-PCR on individual genes. As a result, the application is limited to genes where good primers for primer extension and LM-PCR can be designed, making permanganate footprinting challenging in mammalian systems.
Run-on assays detect elongation-competent RNA polymerases through their ability to incorporate a label into nascent RNA in isolated nuclei. GRO-seq is a genome-wide nuclear run-on method that enables high resolution mapping of transcriptionally-engaged Pol II. Transcriptionally engaged Pol II are allowed to elongate for ~100 nucleotides in the presence of Br-UTP. The RNAs are then base hydrolyzed to ca. 100 nucleotides in length and affinity purified using anti-BrU beads, and specific linkers are added to the 5’ and 3’ ends before the samples are submitted to Next-Generation Sequencing. The specific 5’ primer allows the orientation of the RNAs to be determined, while three affinity purifications at various points in the sample preparation provide a very low background.
Advantages: Specifically reveals transcriptionally engaged and active polymerase, with high sensitivity and low background. Adaptable for high-throughput genome-wide applications. Can be used in various organisms.
Disadvantages: Technically challenging and requires preparation of nuclei. Resolution for mapping of paused polymerase is reduced by the necessity to allow polymerase to run-on and incorporate labelled nucleotides into RNA.
Analysis of short, capped RNAs involves the direct isolation and identification of short RNA species derived from promoter-proximal Pol II. Initial use of this technique isolated RNAs produced at individual genes using complementary sequence specific probes15, 98. Extending this technique genome-wide by scRNA-seq7 involves isolation of nuclei, size selection of short (<100nt) RNA species, and enzymatic degradation of RNAs lacking the 5’-cap prior to directional linker addition and high-throughput sequencing. This strategy allows for highly sensitive detection of RNA produced by promoter-proximal Pol II, including RNA species generated by Pol II that pauses only transiently or terminates transcription prematurely.
Advantages: Sequencing of short capped RNAs pinpoints the start site of transcription and the final nucleotide added by paused polymerase at single nucleotide resolution. Low background and high sensitivity assay well suited for high-throughput genome-wide applications. Does not require antibodies, cell treatment or labelling. Can be used in various organisms.
Disadvantages: Technically challenging and requires preparation of nuclei. Does not distinguish between RNA species that remain associated with paused Pol II and those that have been released through transcription termination.
Importantly, additional work performed in the late 1980’s and early 90’s revealed that other promoters displayed paused Pol II. In fact, a large fraction of Drosophila genes investigated in detail (6 out of 10 genes) showed characteristics of Pol II pausing17, 18. Moreover, a handful of mammalian genes, including key cell regulatory genes such as human c-myc and Fos, showed an enrichment of engaged Pol II just downstream of the transcription start site (TSS) that was effectively indistinguishable from that seen at the Drosophila Hsp genes19–21. Pol II was also found to accumulate on promoters of the HIV LTR, although this regulatory system displayed several features that distinguished it from pausing at endogenous genes. First, the nascent RNA transcribed at the HIV LTR forms a functionally important unique secondary structure4, 22 and second, the HIV LTR produces an abundant, 59 nucleotide-long RNA that results from premature termination of the early elongation complex22. In contrast, there is no current evidence suggesting high levels of promoter-proximal termination by Pol II at endogenous genes. Nonetheless, these studies of multiple gene systems provided evidence of regulation after recruitment of Pol II to a gene promoter, raising the question of how widespread these ‘alternate’ mechanisms of gene regulation might be.
The findings described above, while appreciated by the field, were eclipsed by studies of transcription in the powerful yeast model system that demonstrated that recruitment of Pol II to promoters was a major mode of gene regulation and provided no compelling evidence for promoter-proximally paused Pol II23, 24. The interest in promoter-associated polymerase was recently reignited by the ability to carry out Pol II chromatin immunoprecipitation (ChIP) assays genome-wide using ChIP-chip or ChIP-seq techniques (See Box 1 on methods for detecting Pol II). These studies have provided evidence for widespread post-recruitment regulation of gene expression in metazoans.
The global localization of Pol II occupancy in multiple species has revealed that Pol II exhibits a variety of distributions along genes that provide insights into the mechanics of their regulation. In yeast, Pol II usually displays a relatively uniform distribution across the transcription unit25, as expected from models where, once recruited, Pol II experiences few regulatory barriers. In striking contrast, Pol II in Drosophila4, 6, 9 and mammalian cells26, 27 is frequently distributed non-uniformly on the bodies of genes. In these higher eukaryotes, a large fraction of genes display Pol II signal that is concentrated near transcription start sites (TSSs), indicating that polymerases recruited to these promoters are not released efficiently downstream into the gene. However, initial genome-wide studies employed Pol II ChIP to localize the polymerase, which in itself is not sufficient to distinguish between species that are paused during early elongation and those that are blocked at another post-recruitment step in the transcription cycle (Figure 1, Box 1: see ChIP). Therefore, elucidating the true status of poised, promoter-associated Pol II (Figure 1e) required the development and use of additional assays.
Defining the status of promoter-proximal Pol II is important for understanding the regulation of polymerase release from the promoter region into productive synthesis. For example, a recruited polymerase that fails to initiate RNA synthesis and is trapped as a PIC (Figure 1a) would involve significantly different mechanisms for release than a Pol II that had synthesized a short transcript but was blocked in early elongation. Moreover, early elongation complexes that accumulate downstream of promoters can be present in several conformations that are not all competent to resume RNA synthesis. Paused Pol II can be readily induced to re-start transcription (Figure 1b), whereas arrested and terminating elongation complexes cannot (Figure 1c–d) and require either rescue or re-initiation in order to generate a productive transcript.
The first genome-wide ChIP study of Pol II distribution in human primary lung fibroblasts26 in 2005 referred to promoter-proximal accumulation of Pol II as PICs (Figure 1a), because the peak of Pol II mapped near the TSSs and because extensive studies in vitro had firmly established the concept of a PIC as an intermediate that occurs early in the transcription cycle. However, in 2007, genomic analysis of Pol II in human ES cells revealed that the Pol II accumulation was accompanied by chromatin signatures of gene activity, suggesting that these Pol II had undergone transcription initiation27. Concurrent ChIP-chip analyses in Drosophila S2 cells and early embryos also identified a widespread accumulation of promoter-associated Pol II, and followed up with permanganate footprinting (Box 1) to investigate whether or not the observed Pol II had paused during elongation through the promoter-proximal region6, 9. This permanganate footprinting demonstrated the presence of stably melted DNA located 20–60 bases downstream of the TSSs of dozens of genes4, 6, 9, which is diagnostic of a transcriptionally engaged polymerase. In addition, depletion of a negative elongation factor that induces pausing (NELF, discussed below) released many of these Pol II complexes from promoter regions, further supporting their designation as engaged, but paused species6. However, it was unclear what fraction of these promoter-proximal elongation complexes was competent to resume RNA synthesis, because permanganate footprinting cannot distinguish between paused, arrested and terminating complexes (Figure 1b–d). Thus, these Pol II species were initially referred to as ‘stalled’4, 6, 9, which is a general term that includes all of these different forms of engaged Pol II (Figure 1f).
Global nuclear run-on assays (GRO-seq, see Box 1) in human primary lung fibroblasts in 2008 confirmed that many of the promoter-associated Pol II molecules are indeed paused, by demonstrating that they are largely capable of resuming transcription in vitro following treatment with the detergent sarkosyl2. Sarkosyl is thought to remove pause-inducing factors from the elongation complex, allowing for RNA synthesis to continue. Importantly, arrested or terminating elongation complexes cannot be induced to “run-on” in this assay, such that the peak of signal observed near promoters by GRO-seq clearly represents Pol II in a paused state2, 5, 28.
Furthermore, although backtracking and arrest of early elongation complexes (Figure 1c) were found to occur commonly in vitro using metazoan transcription systems29, promoter-proximal backtracked complexes were found to be rapidly rescued from arrest in vivo. Indeed, genomic analyses of short, capped RNAs generated by paused Pol II in Drosophila demonstrated that such backtracking is efficiently followed by TFIIS-mediated cleavage of the extruded RNA7. Thus, current evidence suggests that a large fraction of promoter-associated Pol II is in a stably paused state that is competent to resume RNA synthesis. However, more detailed analyses of Pol II status, and in particular the contribution of premature transcription termination to the promoter-proximal Pol II signal, will be required to conclusively address this issue.
Many metazoan genes display significantly higher levels of Pol II on their promoters than on their gene body, but this ratio of promoter to gene body Pol II density, termed the pausing index, can vary dramatically among genes (examples shown in Figure 2). Due to this broad spectrum of pausing indices, and the inherent difficulties in applying a discrete threshold to continuous data sets, calculations of the fraction of genes that display Pol II pausing in mouse embryonic stem (ES) cells have produced estimates ranging from ~30% to ~90% 5, 8. Therefore, rather than reflecting a biological difference, the reported differences in prevalence of pausing are likely a consequence of using different methods and different statistical criteria to define Pol II pausing. Notably, when GRO-seq and a consistent data analysis method are used to measure Pol II occupancy in human primary lung fibroblasts, mouse ES cells, or Drosophila cell culture, a similar fraction of genes are found to display paused Pol II in all cases: ~30% of all genes2, 5, 28. Thus, despite the fact that the definition of what constitutes ‘pausing’ is highly variable among the different groups studying this phenomenon, the proportion of genes that exhibit Pol II pausing appears to be relatively constant across species and developmental stages reported to date.
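The threshold dependence described above can be illustrated with a minimal sketch. It assumes the pausing index is computed as promoter read density divided by gene-body read density, as defined in the text; the window lengths, the read counts, the cutoffs and all names (`GeneSignal`, `pausing_index`, `fraction_paused`) are hypothetical choices made for illustration and do not reproduce the method of any cited study.

```python
from dataclasses import dataclass


@dataclass
class GeneSignal:
    """Read counts for one gene (all field names are hypothetical)."""
    name: str
    promoter_reads: float  # reads in the promoter-proximal window
    promoter_len: int      # promoter window length in bp
    body_reads: float      # reads in the gene-body window
    body_len: int          # gene-body window length in bp


def pausing_index(gene: GeneSignal) -> float:
    """Ratio of promoter read density to gene-body read density."""
    promoter_density = gene.promoter_reads / gene.promoter_len
    body_density = gene.body_reads / gene.body_len
    if body_density == 0:
        # No gene-body signal: unbounded if the promoter has signal,
        # undefined (NaN) if neither window has any reads.
        return float("inf") if promoter_density > 0 else float("nan")
    return promoter_density / body_density


def fraction_paused(genes, threshold: float) -> float:
    """Fraction of genes whose pausing index meets an arbitrary cutoff."""
    if not genes:
        return 0.0
    return sum(pausing_index(g) >= threshold for g in genes) / len(genes)


# Invented counts, for illustration only (not real data).
genes = [
    GeneSignal("geneA", 100, 100, 100, 1000),
    GeneSignal("geneB", 50, 100, 500, 1000),
    GeneSignal("geneC", 300, 100, 100, 1000),
    GeneSignal("geneD", 40, 100, 100, 1000),
]

for cutoff in (2, 20):
    print(cutoff, fraction_paused(genes, cutoff))
```

With these invented counts, three of the four genes are called paused at a cutoff of 2 but only one at a cutoff of 20, which shows how the choice of statistical criterion alone can shift the apparent prevalence of pausing on an unchanged data set.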
Interestingly, in all systems evaluated thus far, genes that exhibit pausing are enriched in signal-responsive pathways including development, cell proliferation, and stress or damage responses. This enrichment is of particular interest in pluripotent cells such as ES cells, where pausing has been suggested to play a role in cell differentiation27. Underscoring that the presence and level of paused Pol II can be highly regulated, the specific genes that are paused in various cell types and under varying conditions such as cell stress or cell cycle regulation can differ dramatically5, 30.
Genomic analysis of Pol II distribution also indicates that pausing occurs at genes across the spectrum of expression levels3, 7, 8. In fact, recent global analyses of Pol II distribution by GRO-seq indicate that very few paused genes are transcriptionally inactive (<1%)2, 5. This argues strongly against a common perception that Pol II pausing is predominantly a mechanism to silence gene expression31, 32. It is consistent with previous data on paused Pol II: for example, all of the traditionally defined paused genes (Drosophila Hsps, β-tubulin, mammalian c-myc, Fos) exhibit considerable basal expression, and the Drosophila heat shock genes continue to undergo pausing during activation1. Based on these data, we argue that pausing should be considered a mechanism for tuning expression from active genes and perhaps poising them for future changes in expression, rather than as a means of gene inactivation.
We note that Pol II can reduce its elongation velocity and/or pause during productive synthesis as well, although the factors involved and the mechanisms governing pausing within the gene appear to be distinct from those regulating promoter-proximal Pol II33. Slowing of productive elongation is best characterized at the 3’-end of genes, where considerable accumulation of Pol II is observed just downstream of the Poly-A site (Figure 2e)2, 34. This slowing of Pol II at the end of the transcription unit is thought to facilitate the coupling of RNA cleavage with transcription termination35. Likewise, pausing within exons has been reported36 where it is proposed to play a role in promoting splicing. Accordingly, evidence suggests that Pol II elongation rates can impact alternative splicing, with slower elongation favouring inclusion of exons with inherently weak splice sites37. As such, we now appreciate that gene expression can be regulated at virtually every step in the transcription cycle, from PIC formation through productive elongation and RNA processing.
The establishment of paused polymerase requires both bringing Pol II to promoters, and stably retaining the early elongation complex within the pause region. These depend on the intrinsic strength of the core promoter38 and specific transcription factors that recruit chromatin remodelling proteins and the transcription machinery (Figure 3a–b; e.g. transcription factor TF1). After formation of the PIC, the promoter DNA is locally unwound, allowing the polymerase to initiate RNA synthesis and undergo promoter escape, wherein it releases many of the contacts with promoter-bound general transcription factors (GTFs)39. During this process, the GTF TFIIH phosphorylates Serine residues within the C-terminal heptapeptide repeat domain (CTD) of the Pol II largest subunit. The early elongation complex then extends the nascent RNA as it moves downstream into the gene. However, detailed analyses of early elongation have demonstrated that this process is fraught with difficulty29.
Work done largely in the Handa and Price labs in the early 90’s demonstrated that Pol II elongates inefficiently through the promoter-proximal region, displaying a strong tendency to halt or terminate within the first 100 nucleotides29, 40, 41. These studies provided key mechanistic insights into Pol II pausing by revealing that the block in early transcription elongation results in part from the association of two pause-inducing factors with the early elongation complex (Figure 3c). These factors, called DRB sensitivity inducing factor (DSIF)41 and negative elongation factor (NELF)42 are together sufficient to inhibit early elongation in a purified system, indicating that they work directly on the polymerase to help establish the paused elongation complex. Consistent with the lack of evidence for Pol II pausing in S. cerevisiae, homologs of the pause-inducing NELF proteins are absent in yeast, but are conserved from Drosophila to man43.
Despite the clear importance of DSIF/NELF in establishing paused polymerase, growing evidence suggests that these factors are not alone in affecting the residence time of promoter-associated Pol II. Recent in vitro work suggests that additional factors, such as Gdown1 and the general transcription factor TFIIF may also influence the stability or lifetime of the paused polymerase, perhaps by affecting the susceptibility of the early elongation complex to premature termination44. Although it remains unclear whether termination in the promoter-proximal region occurs in vivo, Pol II ChIP-seq studies have provided evidence for premature termination within transcribed units (i.e. downstream of +500)45, suggesting that the processivity of elongating Pol II is continually subject to regulation. Thus, much yet remains to be learned about how the efficiency of early elongation is regulated at the mechanistic and biochemical level.
The maturation of paused Pol II to a productively elongating form requires the kinase activity of P-TEFb40, 46, 47. P-TEFb phosphorylates the repressive DSIF/NELF complex, causing NELF to dissociate from Pol II and transforming DSIF to a state that promotes Pol II elongation (Figure 3d)33. P-TEFb also performs additional phosphorylation of Serine residues within the Pol II CTD, creating a platform for binding of RNA processing and chromatin modifying factors that facilitate productive RNA synthesis33, 48. Given its key role in pause release, there is considerable interest in understanding how P-TEFb is targeted to particular gene promoters (shown in Figure 3d as TF2). Befitting the diversity of genes that exhibit Pol II pausing, a large repertoire of factors has been reported to perform this activity, including the acetylated histone-binding protein Brd449, 50, DNA-binding transcription activators such as c-myc and NF-κB8, 51, 52 and the Med26 component of the Mediator complex53. Moreover, P-TEFb is found to be associated with a large number of other elongation factors and chromatin modifying proteins in the “Super Elongation Complex”, suggesting that these factors work together to stimulate productive elongation54–57.
Interestingly, although only a subset of genes appear to accumulate high levels of paused Pol II, most Drosophila or mammalian promoters display a detectable enrichment of polymerase near the promoter compared with the gene body3, 8. In addition, analysis of the location of factors that regulate the establishment and release of pausing suggests that transient Pol II pausing is a general feature of the transcription cycle. For example, the vast majority of active promoters are bound by the pause-inducing factors DSIF and NELF3, 8. The levels of DSIF/NELF at promoters correspond extremely well with total promoter Pol II, suggesting that these factors associate with most early elongation complexes. Further, treatment of cells with the specific P-TEFb inhibitor, Flavopiridol, blocks the entry of most Pol II into productive synthesis in both Drosophila and mammals8, 58, indicating that polymerase release from the promoter region typically requires the activity of P-TEFb. Taken together, these data suggest that the early elongation complex comes under control of DSIF/NELF at most genes, and that the escape of Pol II into productive elongation involves release of this repressive complex by P-TEFb. Thus, we envision that the rate of P-TEFb recruitment would be critical for determining both gene expression levels and the appearance of paused Pol II. At many genes, P-TEFb recruitment to promoters may immediately follow transcription initiation, leading to rapid release of polymerase into the gene. However, at other genes, P-TEFb recruitment may be a much slower event, permitting accumulation of paused Pol II.
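The proposed dependence of paused Pol II levels on the rate of P-TEFb recruitment can be made concrete with a deliberately simplified sketch. It is a toy model introduced here for illustration and is not taken from the studies discussed: a single pause site is assumed to be filled at rate `k_init` when empty and emptied into productive elongation at rate `k_release`, which stands in for P-TEFb-dependent release. Both rate names and the single-site assumption are hypothetical.

```python
def steady_state(k_init: float, k_release: float):
    """Return (pause-site occupancy, productive output) at steady state.

    Toy single-site model (an assumption, not an established mechanism):
        dP/dt = k_init * (1 - P) - k_release * P
    Setting dP/dt = 0 gives P = k_init / (k_init + k_release),
    and the productive output is k_release * P.
    """
    if k_init < 0 or k_release < 0 or k_init + k_release == 0:
        raise ValueError("rates must be non-negative and not both zero")
    occupancy = k_init / (k_init + k_release)
    return occupancy, k_release * occupancy


# Hold initiation fixed and vary the release rate (arbitrary units).
for k_release in (0.1, 1.0, 10.0):
    occ, out = steady_state(1.0, k_release)
    print(f"k_release={k_release}: occupancy={occ:.2f}, output={out:.2f}")
```

Under these assumptions, slow release yields high pause-site occupancy with low output, whereas fast release yields low occupancy with output approaching the initiation rate. This matches the qualitative picture in which slow P-TEFb recruitment permits accumulation of paused Pol II; the model ignores termination, arrest and any dependence of initiation on chromatin state.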
Given the prevalence of paused Pol II at genes within critical developmentally- and environmentally-responsive pathways, identifying the functional roles of paused Pol II has become an active topic of research. We discuss models for four functions below, some of which may be interconnected (Figure 4).
The wrapping of promoter DNA around histone proteins to form nucleosomes can present a barrier to transcription by rendering critical recognition sequences inaccessible. As a result, the remodelling of promoter chromatin to remove or displace nearby nucleosomes is often required to permit recruitment of the transcription machinery and gene expression (shown in Figure 3a)59. Whereas many genes, especially those in yeast60, 61, have been shown to temporally couple this nucleosome remodelling with gene activation, genes with paused polymerase have been shown to undergo nucleosome removal to open promoters prior to and independently from gene activation62, 63. Moreover, paused genes have been shown to persist in a nucleosome-deprived, regulatory factor accessible state that is dependent on the presence of the paused Pol II (Figure 4a)3, 64–66.
A relationship between the paused polymerase and the lack of promoter nucleosomes is apparent at the Drosophila heat shock genes62, 63, whose promoter regions were shown to be nucleosome-deprived even in the un-induced state. Further studies of Hsp70 transgenes indicated that promoter-proximal mutations that affected the levels of paused Pol II also disrupted the binding of HSF to its target sites during heat shock and subsequent gene activation64, 67. Notably, this work suggested that pausing could help maintain an open and accessible promoter structure to facilitate binding by regulatory transcription factors as well as the transcription machinery.
The link between paused Pol II and maintenance of a nucleosome-deprived promoter has recently been demonstrated at a genome-wide level in Drosophila3. Genes with paused polymerase were globally shown to possess low levels of promoter nucleosome occupancy which was dependent on the presence of promoter-associated Pol II: depletion of the pause-inducing factor NELF, which considerably reduced promoter Pol II levels at highly paused genes, led to a concomitant increase in promoter nucleosome occupancy at these genes3, 65. Thus, paused promoters display a dynamic competition for promoter binding between nucleosomes and Pol II. Importantly, genes affected in this way often decreased their expression levels upon NELF-depletion and loss of paused Pol II.
Interestingly, the underlying DNA sequence appears to contribute to the requirement for promoter-proximal Pol II to prevent nucleosome assembly over many TSSs. Packaging of DNA into nucleosomes requires that the underlying sequences are somewhat flexible and amenable to making regular bends as they wrap around the histone proteins, and it has been shown that certain sequences are particularly well- or ill-suited for this purpose68, 69. Strikingly, genes with high levels of paused Pol II in Drosophila possess promoter sequences that are very nucleosome friendly and predicted to promote chromatin assembly3. Genes at which less paused Pol II is present tend to disfavour nucleosome assembly3. Likewise, paused Pol II is enriched in mammals at CpG-island promoters2, which tend to possess open chromatin70. Although currently a subject of debate, evidence suggests that mammalian promoters with moderately high CG content intrinsically favour nucleosome formation71–73, suggesting that, as in Drosophila, the transcription machinery helps maintain accessible chromatin architecture around these promoters.
Thus, it is tempting to speculate that highly-regulated promoters, many of which exhibit paused Pol II, have evolved DNA sequences that enable a dynamic competition between paused Pol II and nucleosomes74. For example, it has been proposed that the presence of paused Pol II poises genes in ES cells for expression during development, in part by altering promoter chromatin27. As such, the loss of paused Pol II later in development could enable nucleosome occlusion and permanent gene repression. The formation of repressive chromatin may be further enhanced by the recruitment of Polycomb Repression Complexes, PRC1 and PRC2. Notably, in ES cells, bivalent genes75 that contain both PRC complexes have significantly less paused Pol II than genes that lack Polycomb5, 76. Likewise, in Drosophila, mutations in a key component of the PRC2 complex that would presumably create a more accessible chromatin structure allow an increase in Pol II recruitment and pausing on thousands of promoters in the early embryo77. Importantly, the chromatin opening function of paused Pol II would be connected to the other potential functions of pausing. For example, the presence of paused Pol II might allow genes that are transcribed at lower basal levels to be continually accessible and primed for bursts of transcription activation in response to specific cues or for generating synchronous transcriptional responses to signalling (see below).
While rapid gene activation at many genes involves mechanisms that are independent of pausing78, 79 the presence of Pol II is an appealing way to generate an accessible promoter region that can be bound quickly by activators and coactivators. Importantly, the presence of paused polymerase would allow a promoter to be readily switched from experiencing long-lived pausing to undergoing productive elongation simply through the binding of transcription activators that associate with P-TEFb (Figure 4b), bypassing a number of potentially slow or stochastic steps involved in PIC formation. The open promoter and scaffold of GTFs that remain after Pol II escape80 may ensure continuous rapid entry of a succession of Pol II complexes on the activated gene (Figure 3b). Moreover, the nucleosome-deprived status of paused promoters is likely to facilitate transcription factor binding, resulting in more efficient, reliable activation81.
In support of a role for paused Pol II in rapid activation, pausing has been observed at Drosophila genes that are rapidly induced, such as the Hsps1 and a number of genes involved in early embryonic development6, 9, 82, leading to the idea that pausing facilitates synchronous changes in gene expression83. Consistent with this, many mammalian genes with paused Pol II (c-myc, fos, junB, TNF-α) have fast, transient expression kinetics20, 84, 85.
However, not all rapidly induced genes display paused Pol II prior to activation86–88, nor are the majority of paused genes highly inducible. In fact, recent work surveying the prevalence of paused genes across several signal transduction networks in Drosophila and murine ES cells revealed that pausing was more enriched at the promoters of genes encoding the constitutively expressed components of signal transduction pathways (e.g. receptors, kinases, transcription factors) than at the inducible downstream targets of these pathways89. Moreover, pausing was shown to regulate network activity largely by affecting the basal expression of signal transduction machineries89. Thus, the role of pausing within stimulus-responsive networks is not limited to poising inducible genes for activation. Instead, Pol II pausing can regulate the expression of key molecules such as transcription factors and signalling proteins, thereby tuning cellular responsiveness to external cues.
Pausing represents an additional regulatory step in the transcription cycle, beyond Pol II recruitment. Accordingly, this could allow activators that influence pause release to work together with factors that stimulate recruitment to exert combinatorial control of transcription levels (Figure 4c)30, 90. Indeed, most promoters contain binding sites for multiple transcription activators. Importantly, some activators specifically function to recruit general transcription factors (GTFs) or establish a paused Pol II (e.g. transcription factors Sp190 and GAGA factor4; shown as TF1 in Figure 4c), some factors bring P-TEFb to the promoter (e.g. c-myc and HIV Tat8, 52, 90; shown as TF2), and others appear to both recruit and release paused Pol II (e.g. NF-κB and herpes virus VP-16 protein51, 90). Thus, the particular combination of transcription activators that bind near any promoter would determine the rates of Pol II recruitment and pause release, thereby defining the rate-limiting step for transcription. In this way, cellular events that alter the levels or activity of individual transcription factors could be integrated on a gene-by-gene basis, depending on the sequence context and associated factors on the promoter and enhancer regions.
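The idea that the slower of two steps defines the rate-limiting step can be illustrated with a deliberately simplified two-step sketch. It assumes, purely for illustration, that recruitment and pause release are sequential first-order steps and that only one Pol II occupies the promoter-proximal region at a time, so the mean time per transcription cycle is the sum of the two mean waiting times. The function names and rate values are hypothetical and are not taken from any of the studies cited here.

```python
def steady_state_output(k_recruit: float, k_release: float) -> float:
    """Polymerases entering productive elongation per unit time.

    Assumption: recruitment and pause release happen one after the other,
    each with an exponentially distributed waiting time, so the mean cycle
    time is 1/k_recruit + 1/k_release.
    """
    if k_recruit <= 0 or k_release <= 0:
        raise ValueError("rates must be positive")
    return 1.0 / (1.0 / k_recruit + 1.0 / k_release)


def rate_limiting_step(k_recruit: float, k_release: float) -> str:
    """Name the slower of the two steps (hypothetical labels)."""
    if k_recruit < k_release:
        return "recruitment"
    if k_release < k_recruit:
        return "pause release"
    return "balanced"


if __name__ == "__main__":
    # Hypothetical paused gene: fast recruitment, slow pause release.
    base = steady_state_output(k_recruit=10.0, k_release=0.1)
    # A TF1-like activator that boosts recruitment tenfold.
    more_recruit = steady_state_output(k_recruit=100.0, k_release=0.1)
    # A TF2-like activator that boosts pause release tenfold.
    more_release = steady_state_output(k_recruit=10.0, k_release=1.0)
    print(rate_limiting_step(10.0, 0.1), base, more_recruit, more_release)
```

Under these assumptions, raising the rate of a step that is already fast barely changes output, whereas raising the rate of the slow step changes it substantially, which is one way to picture how different activators could act on different steps at the same promoter.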
Pol II coordinates the efficient processing of nascent RNA: adding a cap to the 5’-end, coupling splicing events to transcription, and facilitating the 3’-end processing of RNAs. By coupling RNA processing to the status and activity of Pol II itself, the cell ensures that nascent RNA is properly protected from degradation and efficiently matures into a functional mRNA. Pol II is phosphorylated on its CTD at various positions, providing a binding platform to recruit an entourage of protein factors that can execute both early and later events of RNA processing91. Phosphorylation of Ser5 within the CTD creates a binding platform for interaction with the 5’ capping enzyme (Figure 4d), and stimulates the activity of this enzyme92. In vivo, 5’ capping occurs as the nascent RNA is extended from 20 to 30 nucleotides in length, and the bulk of RNAs associated with paused Pol II are capped7, 15. This was determined initially by detailed analysis of the Hsp genes15 and extended by recent global analyses7 in Drosophila. Interactions have also been reported between the RNA capping machinery and the pause-regulatory factor DSIF93, 94. Thus, pausing may provide both a kinetic “window of opportunity” and an interaction surface to facilitate addition of the 5’-methyl cap to the nascent RNA prior to the transition to productive elongation.
As mentioned above, phosphorylation of paused Pol II by P-TEFb provides a binding platform for complexes that carry out 3’-end processing95. As such, the requirement for P-TEFb activity to phosphorylate the DSIF-NELF complex and trigger pause release may also ensure that Pol II does not proceed into the gene before it is appropriately modified for binding by the RNA processing factors (Figure 4d). While rigorous tests of pausing as an obligatory checkpoint for Pol II CTD modification are lacking, the fact that the P-TEFb kinase phosphorylates both DSIF-NELF and Pol II might functionally couple pause release to this Pol II modification.
In the last few years, a new picture of transcription regulation has emerged: genome-wide data in metazoans now point to the widespread importance of Pol II pausing in transcription regulation. Indeed, escape of paused Pol II into productive elongation is regulated during environmental stress6, immunological signalling85 and development96.
Studies of pausing over the decades, coupled with an explosion of interest in recent years, have led to considerable understanding of the characteristics and function of paused Pol II. Nonetheless, three major categories of questions remain. The first concerns the pervasiveness and patterns of pausing in eukaryotes. Studies underway in many labs will sample a broad swath of cell types and organisms beyond the Drosophila, mouse and human systems studied thus far. These studies should identify common features of genes regulated by this mechanism, as well as reveal cell type- or condition-specific patterns of paused Pol II. Quantitative genome-wide studies should also assess whether paused polymerases constitute nearly all promoter-associated Pol II or whether there are promoters with significant amounts of other forms, e.g. PICs or arrested Pol II, indicative of alternative modes of regulation (Figure 1).
The second category deals with mechanistic questions designed to understand pausing in molecular terms. We know several factors41, 42 and DNA elements7, 82 that are involved in stabilizing the paused state, but the full repertoire of factors and their interactions remains to be determined. Moreover, we know very little about how these factors interact to mediate efficient pausing. How stable are paused Pol II complexes, what are the relative levels of termination and escape to productive elongation, and how might this balance be controlled? It will also be important to further elucidate how P-TEFb is either directly or indirectly targeted to promoters, and how its kinase activity is regulated. Several mechanisms for P-TEFb recruitment have been documented, but surely more are to be discovered33. Future work should also elucidate exactly how the paused Pol II complex is transformed into a productively elongating machine. These events need to be examined in living cells with optical and biochemical methods that provide detailed information on the position and dynamics of paused Pol II and the accompanying protein and DNA interactions. Improvements in inhibitor discovery and in the already powerful molecular (Box 1) and microscopic technologies97 provoke optimism that these challenging mechanistic goals will be achieved.
The third category contains questions addressing the function of this regulation. We emphasize in this review varying levels of evidence for four potential roles of pausing (Figure 4). These proposed functions will be clarified by further rigorous tests that include global studies as well as targeted analysis of specific genes and phenotypic analysis following systematic disruption of pausing. It will be interesting to define whether pausing serves different roles at different functional classes of genes and how these putative roles are interconnected. For example, the transient checkpoint established by pausing could be particularly useful at highly active genes, to ensure that the nascent RNA is properly processed. On the other hand, the opening of chromatin structure by paused Pol II could both fine-tune the basal expression of signalling proteins89 and facilitate a rapid transcriptional response64. In this review, we highlight our current but incomplete understanding of Pol II pausing at promoters and its role in gene regulation. After decades of research and numerous cycles of simplifying and confounding theories and observations, we now have a framework and many of the tools needed to understand transcription and its regulation mechanistically and genome-wide.
We thank the members of the Lis and Adelman labs for their helpful discussions on this review. Funding for this work was provided by NIH grant GM25232 to J.T.L. and the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences (Z01 ES101987) to K.A.
John Lis is the Barbara McClintock Professor of Molecular Biology and Genetics at Cornell University. He did his Ph.D. thesis research at Brandeis University and his postdoctoral studies at Stanford University, as a Helen Hay Whitney Foundation Fellow. Dr. Lis joined the faculty at Cornell in 1978, where his laboratory has developed and used a variety of strategies to probe the regulation of gene expression and the structure of promoters and genes in living cells. The lab’s primary model system has been the highly inducible heat shock genes of Drosophila, and more recently genome-wide assays have been used in Drosophila and mammals to assess the generality of findings.
Karen Adelman is an investigator at the National Institute of Environmental Health Sciences (NIEHS). She received her Ph.D. from the Université de Paris VI, working at the Institut Pasteur as a National Science Foundation Pre-doctoral Fellow. Dr. Adelman did her post-doctoral research at Cornell University before joining the NIEHS in 2005. Her laboratory uses genetic, genomic and biochemical techniques to study gene regulation in Drosophila and murine systems, focusing on signalling pathways that respond to environmental and developmental cues.
Pre-Initiation Complex (PIC) is an entry form of Pol II in complex with general transcription factors, where the polymerase is bound to the promoter DNA but has not yet initiated RNA synthesis.
Heat shock (Hsp) genes are a set of highly conserved genes encoding molecular chaperones. These genes are rapidly induced in cells or organisms in response to a variety of cellular stresses, including an increase in temperature of several degrees.
The Human immunodeficiency virus (HIV) promoter resides within a long terminal repeat (LTR) region. Transcription from this promoter produces both viral proteins and new RNA genomes.
CpG islands are regions of higher than normal CpG sequence content that are on average 1000 base pairs in length. Such regions contain ~70% of all mammalian promoters, including those of both highly regulated and broadly expressed genes.
Polycomb group proteins regulate chromatin structure to contribute to epigenetic inheritance of a repressed state. They form several complexes, broadly defined as Polycomb Repressive Complexes 1 and 2 (PRC1 and PRC2), which are thought to compact chromatin structure.
Ligation-Mediated (LM) PCR is a technique that can be used to map precisely the ends of DNA fragments from a specific region of the genome. Small DNA linkers are added to the ends of DNA samples, and then a primer complementary to this linker is combined with a sequence-specific primer to amplify the DNA of interest by PCR. The resulting DNA can then be sequenced by any of a variety of methods, or its size examined by gel electrophoresis.
The promoters of bivalent genes exhibit histone modifications characteristic of both gene repression and activation. These genes display very low levels of Pol II occupancy and activity, and are hypothesized to be poised for activation during development.