It is widely assumed that the key rate-limiting step in gene activation is the recruitment of RNA polymerase II (Pol II) to the core promoter2. Although there are well-documented examples where Pol II is recruited to a gene but stalls3–14, a general role for Pol II stalling in development has not been established. We have performed comprehensive Pol II ChIP-chip assays in Drosophila embryos and identified three distinct Pol II binding behaviors: active (uniform binding across the entire transcription unit), no binding, and stalled (binding at the transcription start site). The striking feature of the ~10% of genes that are stalled is that they are highly enriched for developmental control genes that are repressed at the time of analysis or poised for activation during subsequent stages of development. We propose that Pol II stalling facilitates rapid temporal and spatial changes in gene activity during development.
Pol II stalling is probably best studied at heat shock genes in Drosophila, where Pol II engages in transcription but pauses immediately downstream of the transcriptional start site3,4,22. Upon activation by heat shock, Pol II is able to rapidly transcribe these genes. Regulation of Pol II activity after recruitment has also been described in bacteria23, yeast13 and mammalian cell lines3,6–12, and includes instances where Pol II is found in an inactive pre-initiation complex24,25. We will collectively refer to inactive Pol II near the transcription start site as stalled Pol II.
To determine at which genes Pol II stalling occurs during development, we analyzed global Pol II occupancy in whole Drosophila embryos. While this is one of the few systems where genomics approaches can easily be applied to developmental questions, interpretation is complicated by the presence of multiple tissues. To reduce the complexity, we used Toll10b embryos (2–4 hours after fertilization), a well-characterized mutant that contains a homogeneous population of mesodermal precursor cells at the expense of neuronal and ectodermal cells15–21. In Toll10b mutants, mesodermal genes are uniformly activated while genes required for the development of ectodermal and neural tissues are repressed throughout the embryo15–17. Previous whole-genome microarray experiments have identified the transcript levels of all genes in these mutants19,20. To distinguish between stalled and active Pol II, we used a mixture of antibodies that recognizes both the initiating and elongating forms of Pol II (see Methods), and performed whole-genome ChIP-chip assays as previously described21.
The results show that many genes known to be repressed in Toll10b embryos display strikingly high levels of Pol II near the transcription start site (Fig. 1A–D). In some cases the prominent Pol II peak is tightly restricted to the promoter region (e.g. at the tup gene in Fig. 1A), while at other genes Pol II is also found at low levels throughout the transcription unit (e.g. the sog and brk genes in Fig. 1C, D). This is consistent with previous evidence that some genes such as sog are transiently activated but then repressed at later stages26, while other genes such as tup are never activated in Toll10b mutants19,20.
The Pol II profile of repressed genes is clearly distinct from that of active genes (Fig. 1E, F). For example, the Hbr gene, which encodes an FGF receptor specifically expressed in mesodermal precursors (Fig. 1E), and ribosomal genes such as RpL3 (Fig. 1F), show uniformly high levels of Pol II throughout the transcription unit. Furthermore, genes that are silent in the early embryo simply lack Pol II binding altogether (Fig. 1G, H). Thus, there appear to be three distinct classes of genes: those with Pol II distributed throughout the transcription unit, those with preferential enrichment of Pol II at the transcription start site, and those that lack Pol II binding altogether.
To further characterize these three groups, we developed a principled method that classifies genes based on their Pol II enrichment profiles (Fig. 2, Supplemental Materials). Similar to an analysis performed in E. coli23, we calculated the ratio between Pol II enrichment at the transcription start site and Pol II enrichment at internal regions of the transcription unit (Fig. 2). We were able to assign 76% of the protein-coding genes (10,220 of 13,448 genes) to one of three classes (Fig. 2B). At least 27% of all genes display an active Pol II profile in which Pol II is detected uniformly throughout the transcription unit. At least 12% of all genes (1,614 of 13,448) show disproportionate accumulation of Pol II near the transcription start site. Among this group, Pol II is tightly restricted to the transcription start site at 62% of genes. At the remaining 38% of genes, Pol II is also detected within the transcription unit, presumably because these genes, like sog, are expressed at low levels in at least a subset of cells during the timeframe of the analysis (2–4 hrs after fertilization). Finally, 37% of all genes lack Pol II binding altogether.
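The ratio-based assignment described above can be illustrated with a minimal sketch. It is not the classifier used in the study, which is specified in the Supplemental Materials; the cutoffs `MIN_ENRICHMENT` and `STALL_RATIO`, the `GeneProfile` record and the example values are all hypothetical, and the sketch does not model the genes that could not be assigned to a class.

```python
from dataclasses import dataclass

# Both cutoffs are illustrative assumptions, not values taken from the study.
MIN_ENRICHMENT = 2.0  # hypothetical fold enrichment needed to call Pol II "bound"
STALL_RATIO = 4.0     # hypothetical start-site/internal ratio that calls a gene stalled


@dataclass
class GeneProfile:
    name: str
    tss: float   # Pol II enrichment at the transcription start site
    body: float  # Pol II enrichment across internal regions of the transcription unit


def classify(gene: GeneProfile) -> str:
    """Assign a gene to 'no Pol II', 'stalled' or 'active' from two enrichment values."""
    if gene.tss < MIN_ENRICHMENT and gene.body < MIN_ENRICHMENT:
        return "no Pol II"
    # Floor the denominator at background level (1.0) so that a near-zero
    # internal signal does not inflate the ratio without bound.
    ratio = gene.tss / max(gene.body, 1.0)
    return "stalled" if ratio >= STALL_RATIO else "active"


# Hypothetical profiles, one per class.
examples = [
    GeneProfile("geneA", tss=10.0, body=1.2),
    GeneProfile("geneB", tss=6.0, body=5.0),
    GeneProfile("geneC", tss=1.1, body=0.9),
]
labels = {g.name: classify(g) for g in examples}
```

Flooring the denominator is one possible design choice for keeping the ratio stable when internal signal is close to background; other regularizations would serve the same purpose.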
Several lines of evidence confirm that the ~1,600 genes with disproportionate enrichment of Pol II at the transcription start site represent a form of stalled Pol II (Fig. 3). First, all heat shock genes, which are the classical example of Pol II stalling4,22, fall into this class (Fig. 3A). Second, the Pol II peaks map an average of ~50 bp downstream of the transcription start site, consistent with the location of stalled Pol II at heat shock genes4,5,22 (Fig. 3B). Since this is an average profile, it is possible that a fraction of Pol II occupancy comes from inactive pre-initiation complexes. However, the majority of detected Pol II signal appears to come from Pol II that is stalled downstream of the transcription start site. Third, Pol II stalling at these genes is consistent with comprehensive expression analysis using whole-genome tiling arrays20. Genes with Pol II tightly restricted to the transcription start site are either silent or only weakly expressed in Toll10b mutants (Fig. 3C). In contrast, genes with similar levels of Pol II binding, but uniform distribution throughout the transcription unit, are expressed at significant levels in these mutants (Fig. 3C). Finally, we used permanganate footprint assays as an independent method to confirm stalled Pol II at selected genes5,14. For example, the rho gene displays clear permanganate sensitivity downstream of the transcription start site (+36 bp), consistent with the Pol II stalling profile seen in Toll10b mutants (Fig. 3D; see Fig. 1B).
There are significant differences in the expression and functions of genes in the active, stalled or no Pol II classes based on in situ expression patterns (ImaGO)27 and functional annotations (GO)28 (Fig. 4). The set of genes with stalled Pol II is highly enriched for developmentally regulated genes, particularly those expressed in ectodermal and neuronal precursor cells (Fig. 4A). Consistent with the in situ hybridization data, genes with stalled Pol II are highly enriched for functions in development, including neurogenesis, ectoderm development and muscle differentiation (Fig. 4B). Many of these genes encode sequence-specific transcription factors (Hox, T-box, bHLH, zinc fingers, and HMG) and components of cell signaling pathways (FGF, Wnt, Notch, EGF, TGFβ, JNK, and TNF) (see Supplemental Materials).
In contrast, the set of genes with uniform Pol II binding is highly enriched for ubiquitously expressed genes (Fig. 4A), which function mostly in metabolism and cell proliferation (Fig. 4B). The set of genes that lacks Pol II binding is highly enriched for genes that show no staining in whole-embryo in situ hybridizations, confirming that they are not expressed during early embryogenesis (Fig. 4A). Many of these genes function in adult tissues and encode, for example, cuticle proteins or proteins required for vision (Fig. 4B).
Pol II stalling could reflect two nonexclusive developmental functions. It could be indicative of active transcriptional repression, or it could prepare genes for activation at later stages of embryogenesis. The second model is particularly attractive since Pol II stalling has already been shown to prepare heat shock genes for rapid activation4,22. We found evidence for both models.
Pol II stalling is particularly prevalent among genes expressed in the neuroectoderm and dorsal ectoderm, which are repressed in Toll10b embryos. To test whether Pol II stalling is specific for repressed genes, we examined the Pol II profile of repressed genes in mutant embryos in which these genes are active. For this, we used two well defined mutants, Toll rm9/rm10 and gd7 (2–4 h), in which cells adopt neurectodermal and dorsal fate, respectively15,19,20,29. Indeed, at these genes, Pol II is redistributed into the transcription unit in these mutants (Fig. 5) and some genes now display the active Pol II profile (Fig. 5B–D). These results demonstrate that Pol II stalling is associated with cell-type specific repression and thus is subject to dynamic changes during development.
Previous studies have shown that the repression of a large set of genes in Toll10b embryos depends on Snail, a well-studied repressor that is constitutively expressed in Toll10b embryos but not in Toll rm9/rm10 and gd7 embryos16,17,19–21,30. We found a statistically significant association between repression by Snail and Pol II stalling (Supplemental Materials). For example, among the 139 genes that are occupied by Snail21 and display reduced expression in the Toll10b mutant19, 54% exhibit stalled Pol II, while only 19% of all genes with reduced expression display Pol II stalling (p < 10^-23, Supplemental Materials). This suggests that Pol II stalling in Toll10b embryos may be regulated by Snail. A role for developmental repressors in regulating Pol II stalling is also consistent with a recent study of Drosophila segmentation14.
Multiple lines of evidence suggest that Pol II stalling also occurs at genes that are poised for activation in older embryos. Genes with stalled Pol II are highly over-represented among genes that are rapidly induced within 12 hours after the timeframe of our analysis (p < 10^-27, Supplemental Materials). Moreover, genes with stalled Pol II are enriched for genes expressed in the derivatives of the mesoderm precursors present in Toll10b mutants, such as the developing heart and muscle cells (p < 10^-15, Supplemental Materials). These genes, such as Drop and bagpipe, are not yet activated at the timeframe of the analysis31,32, but nonetheless show high levels of Pol II near the transcription start site (Supplemental Materials).
To confirm that muscle genes indeed show stalled Pol II before activation, we performed permanganate assays on wild-type Drosophila embryos at 2–4 hours after fertilization (Fig. 6). Drop and ladybird show a clear permanganate footprint downstream of the transcription start site. These footprints were specific to the early embryo stage, since S2 cells, a cell line derived from older embryos, did not show a permanganate footprint under the same conditions (Fig. 6). These results confirm that Pol II stalling is dynamically regulated and suggest that one of its functions is to prepare genes for activation.
Our genome-wide analysis revealed that genes in Drosophila embryos are found in three distinct dynamic states: active, stalled or no Pol II. Stalled Pol II is particularly associated with developmental genes that are repressed or poised for activation. We propose that Pol II stalling allows genes to rapidly respond to developmental signals and thus facilitates the dynamic temporal and spatial expression patterns of developmental control genes.
Toll10b is a dominant gain-of-function mutation of the maternal gene Toll15. Embryos were collected from Toll10b/+ females obtained directly from the balanced stock (Toll10b/TM3 Sb Ser and Toll10b/OR60). Toll rm9 and Toll rm10 are recessive Toll mutations15, and embryos were collected from Toll rm9/Toll rm10 females. gd7 embryos were collected from gd7/gd7 females29. Wild-type embryos were white1118.
The antibodies used were 8WG16 and H14 (see Supplemental Materials for further information).
We used Drosophila whole-genome tiling arrays printed by Agilent as described21. Probes of 60 nucleotides (60mers) span the entire euchromatic portion of the Drosophila melanogaster genome. While the spacing of these probes is ~280 bp on average, an additional probe is present between the two probes that flank each known transcription start site (TSS). Thus, the resolution around transcription start sites is ~140 bp. The data are available from ArrayExpress and our web site http://web.wi.mit.edu/young/pol2/.
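The effect of the additional probe on local resolution can be shown with a short sketch. It assumes, for illustration only, that the extra probe sits at the midpoint between the two flanking probes; the function name `add_tss_probes` and the coordinates are hypothetical and do not describe the actual array design.

```python
import bisect


def add_tss_probes(probes, tss_sites):
    """Return sorted probe positions with one extra probe placed midway
    between the two probes that flank each transcription start site."""
    probes = sorted(probes)
    extra = []
    for tss in tss_sites:
        i = bisect.bisect_right(probes, tss)  # index of the first probe beyond the TSS
        if 0 < i < len(probes):               # the TSS has a probe on each side
            extra.append((probes[i - 1] + probes[i]) // 2)
    return sorted(set(probes) | set(extra))


# Hypothetical probes spaced 280 bp apart, with one TSS at position 300.
positions = add_tss_probes([0, 280, 560, 840], [300])
gaps = [b - a for a, b in zip(positions, positions[1:])]
```

With these inputs the gap spanning the start site is split into two intervals of 140 bp, while probes elsewhere keep their 280 bp spacing.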
We used the Rosetta error model to control for noise at individual probes and required a probe to have a p-value < 0.001. We did not use our previous algorithms for detecting bound probes and then assigning genes. Rather, we calculated parameters indicating Pol II enrichment directly for each gene (see Supplemental Materials for a detailed description). A combination of Pol II enrichment at the start site and median enrichment across the gene was used to classify Pol II as absent, stalled or active.
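As a rough illustration of how two such per-gene parameters might be derived from probe-level data, consider the sketch below. Only the p-value cutoff of 0.001 comes from the text; the window size `TSS_WINDOW`, the use of the maximum significant probe near the start site, the restriction to plus-strand coordinates and the function `gene_parameters` are assumptions made for this example, and the actual procedure is described in the Supplemental Materials.

```python
from statistics import median

P_CUTOFF = 0.001   # probe-level significance threshold stated in the text
TSS_WINDOW = 300   # hypothetical half-width (bp) of the window around the start site


def gene_parameters(probes, tss, gene_end):
    """Compute (start_site_enrichment, median_body_enrichment) for one gene.

    probes: list of (position, enrichment, p_value) tuples for a gene on the
    plus strand, with tss < gene_end.
    """
    # Start-site value: strongest probe near the TSS that passes the error-model cutoff.
    near_tss = [e for pos, e, p in probes
                if abs(pos - tss) <= TSS_WINDOW and p < P_CUTOFF]
    # Body value: median over all probes downstream of the start-site window.
    body = [e for pos, e, p in probes if tss + TSS_WINDOW < pos <= gene_end]
    start_site = max(near_tss, default=0.0)
    body_median = median(body) if body else 0.0
    return start_site, body_median


# Hypothetical probe measurements for a single gene.
example_probes = [
    (100, 8.0, 1e-5),
    (240, 3.0, 0.01),
    (600, 1.0, 0.5),
    (900, 1.2, 0.4),
    (1200, 0.8, 0.6),
]
start_site, body_median = gene_parameters(example_probes, tss=100, gene_end=1500)
```

Taking the median across the gene makes the body estimate robust to a single outlying probe, which is one plausible reason for preferring it to the mean.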
For the identification of gene sets that are over-represented in the three classes of genes, we used the hypergeometric distribution test (see Supplemental Materials for further information).
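A one-sided hypergeometric test of over-representation can be sketched as follows. This is a generic textbook formulation using only the Python standard library, not the code used for the analysis; the function name `hypergeom_pvalue` and the toy numbers are chosen for illustration.

```python
from math import comb


def hypergeom_pvalue(N, K, n, k):
    """Probability of observing at least k class members when n genes are
    drawn without replacement from N genes, of which K belong to the class."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Toy example: 10 genes, 5 of them in the class; a set of 5 genes contains all 5.
p_toy = hypergeom_pvalue(N=10, K=5, n=5, k=5)
```

Here `N` would be the number of genes in the background set, `K` the number of background genes in the class of interest (for example, stalled), `n` the size of the gene set being tested and `k` the number of genes in that set that fall into the class. Exact integer arithmetic with `comb` avoids rounding error in the individual terms, although extremely small p-values may still underflow to zero when converted to floating point.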
Transcription bubble assays with KMnO4 were performed as described previously5,14. Embryos were collected 2–4 h AED (after egg deposition), dechorionated and partially homogenized before treatment with KMnO4. Embryos were treated with 20 mM or 40 mM KMnO4 for 60 s on ice. The transcription start sites of the examined genes were identified and confirmed using ESTs in FlyBase and previous expression analysis using tiling arrays20. The linker primers and gene-specific primers used for ligation-mediated PCR are listed in Supplemental Materials.
We would like to thank Robert Zinzen for collecting the Toll10b, Toll rm9/rm10 and gd7 embryos, Tom Volkert and Jennifer Love for microarray experimental support, and members of the Young lab for critical reading of the manuscript. This research was supported in part by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences (K.A.), by NIH grants HG002668 and GM069676 to R.A.Y., GM46638 to M.L., a grant by the Moore Foundation and a postdoctoral fellowship by Schering (A.S.). R.A.Y. consults for Agilent Technologies.
Author Contributions: J.Z. and M.L. designed the experiment. J.Z. designed the arrays and carried out the experiments and analysis. A.S., M.K. and J.Z. analyzed expression data and functional categories. J.-W.H., S.N. and K.A. carried out the permanganate footprint assays. J.Z., M.L. and R.A.Y. prepared the manuscript.