The cohesin complex mediates sister chromatid cohesion and regulates gene transcription. Prior studies show that cohesin preferentially binds and regulates genes that control growth and differentiation, and that even mild disruption of cohesin function alters development. Here we investigate how cohesin specifically recognizes and regulates genes that control development in Drosophila.
Genome-wide analyses show that cohesin selectively binds genes in which RNA polymerase II pauses just downstream of the transcription start site. These genes often have GAGA factor (GAF) binding sites 100 base pairs (bp) upstream of the start site, and GT dinucleotide repeats 50 to 800 bp downstream in the plus strand. They have low levels of histone H3 lysine 36 trimethylation (H3K36me3) associated with transcriptional elongation, even when highly transcribed. Cohesin depletion does not reduce polymerase pausing, in contrast to depletion of the NELF (Negative Elongation Factor) pausing complex. Cohesin, NELF and Spt5 pausing/elongation factor knockdown experiments indicate that cohesin does not inhibit binding of polymerase to promoters or physically block transcriptional elongation, but at genes that it strongly represses, hinders transition of paused polymerase to elongation at a step distinct from those controlled by Spt5 and NELF.
Our findings argue that cohesin and pausing factors are recruited independently to the same genes, perhaps by GAF and the GT repeats, and that their combined action determines the level of actively elongating RNA polymerase.
The cohesin protein complex is essential for sister chromatid cohesion, participates in DNA repair, and regulates gene expression; mild disruption of its activity causes developmental deficits in Drosophila, zebrafish, mice, and humans [1-3]. The Smc1, Smc3 and Rad21 (Mcd1/Scc1) cohesin subunits form a ring-like structure, and Stromalin (SA, Scc3, Stag) interacts with Rad21. Cohesin binds to chromosomes, and likely participates directly in cohesion, repair, and transcription. How cohesin carries out these functions is not well understood.
To understand cohesin function, it is essential to know how, when, and where it binds to chromosomes. Cohesin binds chromosomes throughout interphase in multiple modes that differ in stability [4, 5]. These binding modes are regulated by multiple factors. Key among these is Nipped-B (Scc2, Mis4, Nipbl), which complexes with the Mau-2 (Scc4, Ssl3) protein and loads cohesin onto chromosomes [7-11]. Nipped-B and cohesin both bind chromosomes in a stable mode with a half-life of several minutes, and Nipped-B dosage dictates the amount of stable cohesin. Substantial evidence indicates that stable cohesin topologically encircles DNA [12, 13].
In Drosophila, Nipped-B co-localizes with cohesin along chromosome arms [14, 15]. Cohesin and Nipped-B bind some 20% of active genes, with the highest levels at transcription start sites and DNA replication origins [15, 16]. Cohesin is excluded from inactive genes, including genes silenced by Polycomb complexes. The pattern in mammalian cells is similar to that in Drosophila, except that cohesin, but not Nipped-B (NIPBL), associates with sites that bind the CTCF protein, which regulates transcription by various mechanisms [17-20].
Cohesin also regulates genes via multiple mechanisms. It facilitates looping and communication between transcriptional enhancers and gene promoters, and between sites that bind CTCF [17, 21-26]. Cohesin represses many genes as well, acting in concert with the Polycomb group (PcG) repressor proteins at a few genes in Drosophila cells, and many in mouse embryonic stem cells [1, 17, 27]. The genes bound and regulated by cohesin are enriched for those that control growth and development [1, 17, 27].
It is unknown how Nipped-B and cohesin selectively bind growth and development genes. To tackle this question we looked for DNA sequence and other features that are enriched in cohesin-binding genes in Drosophila cells. We find that Nipped-B and cohesin bind genes in which RNA polymerase II is paused just downstream of the transcription start site. Our results indicate that at genes repressed by cohesin, cohesin hinders an early step in transcription distinct from, but closely coupled to, the steps regulated by the DSIF and NELF pausing factors.
To find DNA motifs that might target Nipped-B and cohesin to specific genes, we compared sequences around the transcription start sites of genes that bind cohesin, and active genes that do not. Using genome-wide chromatin immunoprecipitation (ChIP-chip) data, we defined active genes that bind cohesin as those that bind RNA polymerase II (PolII) and cohesin (Nipped-B and Smc1) at the transcription start sites (TSS) in both the Sg4 (S2 subline) cells of embryonic origin, and ML-DmBG3 (BG3) cells derived from larval central nervous system. Genes that do not bind cohesin were defined as those that bind only PolII at the TSS in both cell lines. This identified 506 cohesin-binding and 1040 non-binding genes (Supplemental Files S1 and S2).
We counted the occurrences of all possible five nucleotide sequences in the plus strand from 1 kb upstream to 1 kb downstream of the TSS in both gene sets. This identified GAGAG and its reverse complement CTCTC as enriched in the cohesin-binding over non-binding genes. It also identified GTGTG, but not the reverse complement, as enriched in cohesin-binding genes. GAGAG and CTCTC occur 50 to 150 bp upstream of the TSS, and enrichment increases with the stringency of the p value used to call cohesin binding (Figure 1A,B).
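The counting step described above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' program: the function names (`revcomp`, `count_kmers`, `kmer_frequencies`) and the toy sequence are hypothetical, and the frequency definition (occurrences divided by the total number of five-nucleotide windows) follows the Methods. The last line checks that the totals reported in the Methods (1,009,976 and 2,075,840) equal 506 × 1996 and 1040 × 1996, as expected if each gene contributes a 2000-nucleotide window containing 1996 overlapping five-nucleotide sequences.

```python
from collections import Counter
from itertools import product

K = 5  # length of the sequences being counted


def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def count_kmers(seqs, k=K):
    """Count every overlapping k-mer in the given plus-strand sequences.

    Returns (counts, total), where total is the number of k-nucleotide
    windows examined, including windows with ambiguous bases.
    """
    counts = Counter()
    total = 0
    for seq in seqs:
        s = seq.upper()
        for i in range(len(s) - k + 1):
            kmer = s[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
            total += 1
    return counts, total


def kmer_frequencies(seqs, k=K):
    """Frequency of each possible k-mer: occurrences / total windows."""
    counts, total = count_kmers(seqs, k)
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: counts[kmer] / total for kmer in all_kmers}


# Toy plus-strand sequence (hypothetical, for illustration only).
toy_counts, toy_total = count_kmers(["GAGAGAGTGTG"])

# A 2000-nt window holds 2000 - 5 + 1 = 1996 overlapping 5-mers.
windows_per_gene = 2000 - K + 1
totals_match = (506 * windows_per_gene, 1040 * windows_per_gene) == (1009976, 2075840)
```

Because only the plus strand is scanned, a motif and its reverse complement are counted separately, which is what allows GTGTG to be scored as enriched while its reverse complement is not.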
GTGTG is enriched 50 to 800 bp downstream of the TSS, is plus strand-specific, occurs only in non-coding sequences (5′ UTR and intron), and thus likely functions in nascent RNA (Figures 1C-E). The GTGTG motif is a subset of extended GT or TG repeats, as revealed by sorting the cohesin-binding and non-binding genes according to the frequency with which a GT or TG dinucleotide follows another (Figure 1F). The TBPH protein binds UG repeats, but RNAi depletion of TBPH in BG3 cells had minimal effects on expression of several cohesin-regulated genes (data not shown).
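One possible reading of "the frequency with which a GT or TG dinucleotide follows another" is sketched below in Python. The scoring rule is an assumption, since the text does not define it precisely: for every GT or TG dinucleotide in a sequence, the sketch asks whether the next two nucleotides are also GT or TG, and reports the fraction for which this is true. The function names and toy sequences are hypothetical.

```python
GT_TG = {"GT", "TG"}


def gt_tg_follow_frequency(seq):
    """Fraction of GT/TG dinucleotides immediately followed by another GT/TG.

    Only positions with a full following dinucleotide are considered.
    Returns 0.0 when the sequence contains no such position.
    """
    s = seq.upper()
    starts = 0
    followed = 0
    for i in range(len(s) - 3):
        if s[i:i + 2] in GT_TG:
            starts += 1
            if s[i + 2:i + 4] in GT_TG:
                followed += 1
    return followed / starts if starts else 0.0


def rank_genes(gene_seqs):
    """Sort gene names from highest to lowest GT/TG follow frequency."""
    return sorted(
        gene_seqs,
        key=lambda name: gt_tg_follow_frequency(gene_seqs[name]),
        reverse=True,
    )


# Toy plus-strand sequences (hypothetical, for illustration only).
toy_genes = {
    "a": "GTGTGTGT",
    "b": "GTAAGTAA",
    "c": "GTGTAAAA",
}
ranking = rank_genes(toy_genes)
```

A sequence with an uninterrupted GT repeat scores 1.0, whereas isolated GT dinucleotides score 0.0, so sorting on this value separates genes with extended repeats from genes that only contain scattered GT or TG dinucleotides.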
The GAGAG sequence binds GAGA factor (GAF) encoded by Trithorax-like (Trl). GAF has been mapped genome-wide in BG3 and S2 cells by ChIP-chip [29, 30]. GAF binds near the TSS of many cohesin-binding genes, indicating that the GAGAG sites identified by sequence analysis (Figure 1A) are functional. Accordingly, there is a 0.42 genome-wide correlation between Nipped-B binding in Sg4 cells and GAF in S2 cells, and a 0.50 correlation between Nipped-B and GAF in BG3 cells (Figure S1E,F).
GAF predicts genes that bind the NELF complex, which causes promoter-proximal polymerase pausing [30-32]. The genome-wide correlation between Nipped-B in Sg4 cells and NELF in S2 cells (0.45, Figure S1B) is similar to the GAF-NELF correlation (0.40, Figure S1D), indicating that Nipped-B also predicts NELF binding. As positive controls, the correlation between the B and E subunits of NELF in S2 cells is 0.77, and between Nipped-B and the Smc1 cohesin subunit in BG3 cells is 0.90 (Figure S1A,C). Thus Nipped-B and cohesin associate preferentially with genes that bind GAF and NELF. Nearly identical correlations between cohesin (Rad21) and GAF (0.39), and between cohesin and NELF (0.45) were found in quantitative co-immunostaining experiments with salivary polytene chromosomes (data not shown).
Both cohesin and the NELF pausing factor selectively associate with a subset of genes that lack the histone H3 lysine 36 trimethylation (H3K36me3) mark made during transcriptional elongation, providing further evidence that they bind the same genes. Comparing cohesin ChIP-chip data in BG3 cells to genome-wide data for H3K36me3 and H3K36me1 (monomethylation) revealed that cohesin-binding genes have H3K36me1 (r = 0.53, Figure S1I) but not H3K36me3 (r = −0.01, Figure S1H). Like cohesin, NELF-binding genes also do not have H3K36me3 (r = 0.08, Figure S1G). Cohesin-NELF binding genes are expressed, and thus we conclude that elongation does not require H3K36me3.
Paused RNA polymerase produces short transcripts with 3′ ends just downstream of the TSS, which have been mapped genome-wide in S2 cells. These data reveal that cohesin-binding genes are highly enriched for short promoter transcripts compared to non-binding genes, confirming that cohesin selectively binds genes with paused polymerase (Figure 2A).
We tested whether cohesin regulates polymerase pausing by KMnO4 footprinting after cohesin depletion in BG3 cells. KMnO4 modifies T residues in single-stranded DNA created by paused polymerase. We examined five genes that bind NELF and cohesin, two that are activated by cohesin (path, dm), and three that are repressed (HLHm3, invected, rho). As a positive control, NELF depletion decreased KMnO4 reactivity at all five genes (Figure 2C). In contrast, Rad21 or Nipped-B depletion had no discernible effects. GAF knockdown decreased pausing at path, dm and HLHm3, had little effect at rho, and slightly increased pausing at invected. Western blots showed that NELF subunits were reduced by at least 90%, and that GAF was reduced by some 70% (Figure 2B). Nipped-B and Rad21 were reduced by ~80% (not shown) as described previously. RT-PCR experiments confirmed that Nipped-B and Rad21 knockdown increased rho, HLHm3, and invected transcripts, and decreased path and dm transcripts (not shown).
Ecdysone receptor (EcR) gene expression decreases in the mushroom body γ neuron and salivary glands when cohesin is absent, suggesting that cohesin activates EcR transcription in vivo [35, 36]. Cohesin binds throughout the 80 kb EcR transcribed region in BG3 and Sg4 cells (Figure 3A) and NELF binds the active promoters in S2 cells. GAF binds all three promoters in BG3 and S2 cells [29, 30]. Although expressed at a high basal level, EcR lacks H3K36me3 (Figure 3A).
We knocked down cohesin and NELF to compare how they regulate basal and ecdysone-induced EcR transcription. All three promoters (p1, p2, p3) are active in BG3 cells prior to ecdysone hormone treatment, and depletion of Rad21, Nipped-B, NELF-B, or GAF slightly increased transcripts from p1 (Figure 3B). Ecdysone increased transcripts from the p3 promoter and total transcripts 4-fold within an hour, but knockdown of Nipped-B, Rad21, NELF-B, or GAF did not alter the induced levels. Thus cohesin does not activate EcR in BG3 cells.
Cohesin-dependent activation of EcR in vivo may require tissue-specific enhancers that are inactive in BG3 cells, but we also considered the possibility that the cohesin binding along EcR interferes with elongation, so that cohesin knockdown could simultaneously decrease activation and increase elongation, resulting in little change in transcript levels. We thus used ecdysone-induction time course experiments to see if depletion of Rad21, Nipped-B, NELF or GAF altered induction or elongation kinetics, following the induced wave of RNA synthesis along the gene with probes shown in Figure 3A. In the control, the first increases in p3 and intron A RNA occurred 10 min after induction, the first increases in intron B and C RNA at 20 min, and the first 3′ exon RNA increase at 30 min. We infer from this wave of RNA synthesis that elongating polymerase moves from p3 to the intron C site at 2 to 2.5 kb per min. An increase in terminal exon RNA was delayed, suggesting that splicing slows movement from intron C to the end of the gene. Within the resolution of the experiments, knockdown of Rad21, Nipped-B, NELF-B or GAF did not alter the induction or elongation kinetics, indicating that cohesin does not alter polymerase recruitment or elongation (Figure 3C, Figure S2).
Depletion of target proteins was confirmed by westerns (not shown), and we detected effects on other genes. For the experiments in Figure 3 and Figure S2, Rad21 and Nipped-B knockdown increased HLHm3 transcripts up to 30-fold, GAF knockdown increased HLHmγ 5-fold and HLHm3 transcripts 2-fold, and NELF-B knockdown increased HLHm3 expression 2-fold.
Cohesin has more direct negative than positive effects on gene expression in BG3 cells. Little is known about how cohesin represses transcription, and the above data show that cohesin depletion can increase transcripts without reducing polymerase pausing, and that cohesin does not physically hinder transcriptional activation or elongation at the EcR gene. We thus considered the possibility that cohesin may repress transcription at a step that occurs just after pausing.
big bang (bbg, Figure 4A), terribly reduced optic lobes (trol, Figure 5A), the Enhancer of split complex [E(spl)-C], which contains HLHmδ, HLHm3, HLHm7 and other bHLH genes, and the invected-engrailed (inv-en) complex (Figure 6A), are 11 of the 32 cohesin-binding genes that increase four-fold or more in expression with cohesin or Nipped-B knockdown in BG3 cells, and thus typify cohesin-repressed genes. All play roles in nervous system and imaginal disc development. They bind Nipped-B, cohesin and GAF in BG3 cells (Figures 4-6), and NELF in S2 cells (not shown). None have H3K36me3 in the transcribed region with the exception of the 3′ end of trol, where cohesin binding is negligible (Figure 5A). All have H3K36me1 in the transcribed region except the E(spl)-C, although H3K27me3 made by the E(z) PcG protein shows that histone H3 is present [15, 27]. The E(spl)-C and inv-en complex are targeted by PcG silencing proteins in BG3 cells, but bbg and trol, like most cohesin-binding genes, are not [15, 27].
RNAi knockdown of Rad21 or Nipped-B in BG3 cells reduced cohesin and Nipped-B binding to bbg, trol, invected, engrailed, and E(spl)-C genes (Figure S3 and not shown). Rad21 and Nipped-B knockdown over 3 to 6 days also increased transcripts from all these genes, as previously observed. Total bbg transcripts, as measured by a 3′ exon probe, increased 5- to 20-fold (Figure 4C, Figure S4) and total trol transcripts increased 4- to 15-fold, with increases from all active promoters (Figure 5B, Figure S5). E(spl)-C transcripts increased 5- to 20-fold (Figure S6), and invected and engrailed transcripts increased 5- to 30-fold (Figure 6B, Figure S7). Rad21 and Nipped-B depletion had nearly identical effects, except that Nipped-B RNAi often required longer treatment. These data agree with the finding that Nipped-B and Rad21 depletion have virtually identical effects on gene expression genome-wide in BG3 cells. Rad21 and Nipped-B proteins were reduced ~80% by day 4 (Figure 4B and not shown). With this reduction cell division is slightly delayed, but there is no discernible change in sister chromatid cohesion or chromosome segregation.
Despite the transcript increases, cohesin depletion did not increase polymerase binding at the promoters as measured by ChIP. Rad21 knockdown did not alter Ser2P PolII presence at the bbg or trol promoters, but using anti-Rpb3, decreased total PolII at the proximal p3 promoter of bbg by 40% (Figure 4D, Figure 5C), although transcripts from this promoter increased. Rad21 knockdown increased Ser2P elongating RNA polymerase on the 3′ terminal exons of both bbg and trol by some 2-fold (Figure 4D, Figure 5C). Cohesin depletion increased total and Ser2P PolII 1.5-fold at the invected promoter, and 2-fold or more 700 bp downstream (Figure 6C). Low transcription precluded quantitative assessment of elongating polymerase for engrailed and the E(spl)-C. We conclude that cohesin does not repress these genes by inhibiting polymerase binding, and posit that it diminishes how much paused polymerase transitions to elongation.
NELF-B depletion had smaller and more promoter-specific effects than cohesin depletion. At bbg, NELF knockdown had no significant effect on transcripts from the p2 promoter, but increased p3 transcripts 3- to 5-fold (Figure 4C, Figure S4). NELF RNAi had no significant effects on trol, E(spl)-C or engrailed transcripts, but increased invected transcripts 3- to 5-fold (Figure 5B, Figure 6B, Figures S5-S7).
Spt5 depletion (>90%, Figure 4B and not shown) also increased transcripts of the cohesin-repressed genes to a lesser extent and in a more promoter-specific manner than cohesin depletion. bbg p3 promoter and total transcripts increased 2- to 3-fold (Figure 4C, Figure S4). Similar results were obtained using a different Spt5 RNAi target, and Spt4 RNAi (not shown). Spt5 depletion increased trol transcripts 2- to 3-fold, primarily from the p2 promoter, E(spl)-C transcripts 2- to 15-fold, and invected and engrailed transcripts by 5- to 10-fold (Figure 5B, Figure 6B, Figures S5-S7). Simultaneous knockdown of NELF and Spt5 had virtually the same effect as Spt5 depletion alone at all genes, consistent with the idea that Spt5 cooperates with NELF to induce pausing (Figures 4-6, Figures S4-S7).
Spt5 depletion did not detectably increase Ser2P or total PolII at the terminal exons of bbg and trol, despite the transcript increases (Figure 4D, Figure 5C). Unexpectedly, Spt5 knockdown decreased both Ser2P and total PolII at the distal promoters of bbg and trol, although transcripts from these promoters did not change. Spt5 depletion had minor effects on Ser2P and total PolII at the bbg p3 promoter (Figure 4D), although transcripts from this promoter increased as much as 5-fold, and increased total and Ser2P PolII at the trol p2 promoter 2- to 3-fold (Figure 5C), which correlates with the increase in p2 transcripts. Thus Spt5 depletion alters PolII binding at bbg and trol promoters, but except for the trol p2 promoter, the changes do not correlate with transcript levels.
Spt5 depletion did not alter Nipped-B or cohesin binding to bbg, invected, or engrailed, but increased their binding to the trol p2 promoter, where PolII also increased (Figure S3). Thus Spt5 is not required for cohesin binding, and we posit that an increase in trol p2 transcription caused by Spt5 depletion increases cohesin binding at this promoter.
We performed simultaneous knockdowns of cohesin and pausing factors to test for epistatic and functional interactions. At bbg, combined NELF and cohesin or Nipped-B knockdown caused a 50-fold synergistic increase in total transcripts (Figure 4C, Figure S4). In contrast, at trol, combining NELF-B with Nipped-B or Rad21 knockdown had effects similar to Rad21 or Nipped-B knockdown alone (Figure 5B, Figure S5). In sharper contrast, combining NELF-B with Nipped-B or Rad21 knockdown decreased the level of E(spl)-C transcripts obtained with Nipped-B or Rad21 depletion alone (Figure S6). Combining NELF-B RNAi with Rad21 or Nipped-B depletion increased the effect of cohesin knockdown on invected, but decreased the effect on engrailed (Figure 6B, Figure S7). Because combined cohesin-NELF depletion rarely had the same effect as NELF or cohesin knockdown alone, we conclude that NELF and cohesin are not epistatic to each other. The robust interactions in some cases, however, indicate that the steps they regulate are closely linked. Because cohesin depletion did not increase PolII binding to promoters or reduce pausing, these findings argue that cohesin regulates a step downstream of pausing.
In contrast to NELF depletion, Spt5 depletion is epistatic to Nipped-B and cohesin knockdown. Combining Spt5 RNAi with Rad21 or Nipped-B knockdown diminished the transcript increase caused by cohesin or Nipped-B knockdown for most genes, including bbg, the E(spl)-C genes, invected, and engrailed (Figure 4C, Figure 6B, Figure S4, Figure S6, Figure S7). Spt5 knockdown decreased the effect of cohesin depletion on trol p1 transcripts, but had little effect, or with extended knockdown, increased the effect of cohesin RNAi on p2 and total trol transcripts (Figure 5B, Figure S5). Because Spt5 is also required for elongation in addition to pausing, the finding that Spt5 depletion is epistatic to cohesin knockdown argues that cohesin regulates a step between pausing and elongation.
Genome-wide studies show that Nipped-B (NIPBL) and cohesin preferentially bind a subset of active genes, and have direct negative and positive effects on gene expression [15, 17, 18, 27, 35]. Many positive effects likely reflect a role for cohesin in facilitating enhancer-promoter looping [17, 21, 22], similar to its role in looping between CTCF sites [23-26]. Our studies address the less-understood mechanisms by which cohesin represses genes. They indicate that cohesin selectively binds genes with paused RNA polymerase, and at repressed genes, interferes with transition of paused polymerase to elongation at a step distinct from those regulated by the DSIF and NELF pausing factors (Figure 7).
Genome-wide correlation between cohesin and NELF binding, and high levels of short transcripts at cohesin-binding promoters provide compelling evidence that cohesin selectively binds genes with paused polymerase. This agrees with the finding that cohesin preferentially regulates genes that function in growth and development [17, 27], because these ontologies are also enriched among genes with paused polymerase [37, 38].
If cohesin enhanced pausing by physically impeding elongation, it would provide a simple explanation for the association of cohesin with genes with paused polymerase. We find, however, by permanganate footprinting, that cohesin depletion altered gene expression without reducing pausing, and by time course experiments, that cohesin does not impede elongation along the EcR gene. Thus we posit that cohesin regulates a specific step in transcription in a context-dependent manner.
Based on an ecdysone-induced wave of RNA synthesis we infer that PolII moves at just over 2 kb per minute along EcR. Cohesin, which has a chromosomal binding half-life of 2 to 9 minutes depending on the cell cycle stage, binds to the entire length of EcR. Thus the cohesin half-life appears incompatible with the elongation rate unless polymerase passes through cohesin rings, or unless cohesin opens and recloses without release from the chromosome. The interior diameter of cohesin is approximately 35 by 50 nm, and the elongating holopolymerase is less than 20 nm in diameter, and thus the idea that polymerase passes through cohesin is tenable.
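The arithmetic behind this comparison can be made explicit with a small Python sketch. It uses only figures stated in the text (an 80 kb EcR transcribed region, an elongation rate of about 2 kb per minute, and a cohesin half-life of 2 to 9 minutes); rounding the rate down to exactly 2 kb per minute is an assumption made for illustration, and the variable names are hypothetical.

```python
# Values taken from the text; the rate is rounded to 2 kb/min as an assumption.
gene_length_kb = 80.0           # EcR transcribed region
rate_kb_per_min = 2.0           # inferred PolII elongation rate
half_lives_min = (2.0, 9.0)     # range of cohesin chromosomal half-lives

# Time for one polymerase to traverse the whole transcribed region.
transit_time_min = gene_length_kb / rate_kb_per_min

# Distance a polymerase covers during one cohesin half-life.
distance_per_half_life_kb = [rate_kb_per_min * h for h in half_lives_min]
```

Under these assumptions a polymerase needs about 40 minutes to cross EcR and travels 4 to 18 kb within a single cohesin half-life, so it must repeatedly encounter cohesin that is still bound rather than wait for each complex to dissociate.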
The step regulated by cohesin at repressed genes is not polymerase recruitment. Cohesin depletion did not increase polymerase binding to bbg and trol promoters, and also did not decrease pausing at the genes tested, although it altered transcript levels. Thus at genes it represses, cohesin likely hinders a step downstream of polymerase pausing, and prior to elongation (Figure 7). Consistent with this idea, depletion of Spt5, which is required for both pausing and elongation [41-43], diminished the increase in transcripts caused by cohesin knockdown, and cohesin depletion increased the presence of Ser2P PolII, the elongating form, further downstream in the genes examined.
What is the step regulated by cohesin? The Set2 histone methyltransferase is required for H3K36me3, binds to the phosphorylated CTD of elongating PolII, and in yeast lacking Set2, transcription is sensitive to 6-azauracil, suggesting impaired elongation [44-47]. Cohesin-binding genes lack H3K36me3, and thus we posit that cohesin might inhibit binding of Set2 to the phosphorylated PolII CTD, or Set2 activity (Figure 7B). At genes repressed by cohesin, Set2 may be limiting for transition from pausing to elongation.
Cohesin does not repress all genes, and thus the cohesin-regulated step is not always limiting for transcription. One possibility is that the net effect of cohesin is a balance between facilitating long-range activation and promoter repression, and another is that the cohesin-regulated step is limiting when other proteins also repress a gene. This latter idea is consistent with the finding that the E(spl)-C and invected-engrailed genes are partially repressed by PcG proteins in BG3 cells. bbg and trol are not PcG targets in BG3 cells, but may be targeted by unknown repressors.
Cohesin usually does not bind PcG-targeted regions, but in the rare overlaps, cohesin repression is stronger than average. These genes show expression levels that range from low to high, indicating that co-repression by cohesin and PcG proteins restrains, but does not silence, transcription. This restrained state is frequent in mouse embryonic stem cells, and many genes that increase in expression with cohesin depletion in these cells have both the H3K4me3 histone modification characteristic of active genes, and the H3K27me3 PcG silencing modification [1, 17, 48]. Cohesin and PcG proteins may target closely linked steps in the transition from pausing to elongation, because PcG proteins also bind preferentially to genes with paused polymerase, and cohesin interacts with PcG complexes in nuclear extracts.
Association of cohesin with paused polymerase genes has implications for factors that control gene expression through chromatin structure, including domain boundaries, insulators, and dosage compensation complexes. Paused promoters have insulator activity that blocks long-range enhancer-promoter interactions, and cohesin may facilitate this activity, as it does for CTCF in mammalian cells [19, 20]. H3K36me3 facilitates binding of the MSL dosage compensation complex that upregulates X-linked gene expression in males, and thus lack of H3K36me3 may decrease compensation of cohesin-binding genes. MSL increases elongating polymerase levels, suggesting that cohesin and MSL may directly oppose each other at the transcriptional level.
Our data argue that the presence of cohesin and the NELF and DSIF pausing factors at the same genes are not interdependent. We postulate that the GAF factor binding sites, and/or the downstream TG repeats at these genes facilitate recruitment of both pausing factors and cohesin. GAF may recruit NELF at some genes, but GAF depletion does not mimic cohesin knockdown, indicating that GAF is not required for cohesin binding. The strand-specific enrichment of TG repeats in the transcribed region is particularly striking. We speculate, therefore, that UG repeats in the initial nascent transcripts may bind proteins that recruit Nipped-B, which then loads cohesin. The same proteins might also recruit NELF, which is transferred back to promoter-bound polymerase to induce pausing.
Published microarray chromatin immunoprecipitation (ChIP-chip) data were processed using the MAT program. The MAT score is a measure of enrichment that accounts for differences in oligonucleotide sequences, and is used to estimate probability of binding at defined p values. Nipped-B, Smc1 and PolII ChIP-chip data in BG3 and Sg4 cells were from reference 15 (GEO GSE16152), and GAF, NELF-B and NELF-E data from S2 cells were from reference 30 (ArrayExpress E-MEXP-1547). Histone H3K36 methylation and GAF ChIP-chip data for BG3 cells were from the modENCODE project [submit.modencode.org/submit/public/list, 29]. Correlations between MAT scores were calculated using R [R Foundation for Statistical Computing, Vienna, 2007; ISBN 3-900051-07-0; www.R-project.org].
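As an illustration of the correlation step, the following Python sketch computes a Pearson correlation coefficient between two equal-length lists of scores. The text does not state which coefficient was used; Pearson is assumed here because it is the default method of R's `cor` function. The function name `pearson_r` and the toy score vectors are hypothetical and are not real MAT scores.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Toy per-probe scores for two factors (hypothetical values).
scores_a = [1.0, 2.0, 3.0, 4.0]
scores_b = [2.0, 4.0, 6.0, 8.0]
r_toy = pearson_r(scores_a, scores_b)
```

In practice the two vectors would hold the MAT scores of two factors at the same genomic positions, so that each pair of values refers to the same probe or window.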
We wrote computer programs to identify DNA sequences that are enriched in cohesin-binding genes. Using annotation type “gene” in Drosophila genome release 5.28, we identified genes that bind PolII, Nipped-B and Smc1 with p value ≤ 10⁻³ at the start site in both BG3 and Sg4 cells (Supplemental File S1), and genes that bind PolII but not Nipped-B or Smc1 in both cell types (File S2). We determined the number of occurrences of all possible five nucleotide sequences at each position from −1000 to +1000 relative to the TSS in both gene sets. The frequency of a sequence in a gene group was defined as the total number of occurrences in the group divided by the total number of five nucleotide DNA sequences (1,009,976 for cohesin-binding genes, 2,075,840 for non-binding). Differences in sequence frequencies between the two groups were normalized to identify large differences:
To determine if an enriched sequence occurred in a specific position, we measured the frequency of that sequence at each nucleotide position relative to the TSS, where the frequency was defined as the number of occurrences at each position divided by the number of genes in each group. Frequency plots were smoothed by averaging the frequencies at the surrounding 25 nucleotides for each position.
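The positional analysis can be sketched as follows in Python. The per-position frequency follows the definition in the text (occurrences at a position divided by the number of genes). The smoothing window is an assumption: the text says only that the surrounding 25 nucleotides were averaged, so the sketch uses a centered window of 25 positions that is truncated at the ends of the sequence. Function names and toy sequences are hypothetical.

```python
def positional_frequency(seqs, motif):
    """Per-position motif frequency across TSS-aligned, equal-length sequences.

    The value at index i is the number of sequences in which the motif
    starts at position i, divided by the number of sequences.
    """
    if not seqs:
        raise ValueError("at least one sequence is required")
    length = len(seqs[0])
    k = len(motif)
    counts = [0] * max(length - k + 1, 0)
    for seq in seqs:
        if len(seq) != length:
            raise ValueError("all sequences must have the same length")
        s = seq.upper()
        for i in range(length - k + 1):
            if s[i:i + k] == motif:
                counts[i] += 1
    return [c / len(seqs) for c in counts]


def smooth(values, window=25):
    """Average each value with its neighbors in a centered window.

    The window is truncated at the ends of the list (an assumption).
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Toy TSS-aligned plus-strand windows (hypothetical, far shorter than 2 kb).
toy_seqs = ["AAGTGTGAA", "GTGTGAAAA"]
toy_freq = positional_frequency(toy_seqs, "GTGTG")
toy_smooth = smooth(toy_freq, window=3)
```

With real data each sequence would be the 2000-nucleotide window around a TSS, and index 0 of the result would correspond to position −1000.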
To compare sequence frequencies in non-coding (5′ UTR or intron) and coding sequences, we plotted the frequency at each nucleotide position relative to the TSS separately for each class: for the non-coding plot, genes in which that nucleotide position is coding were removed from the gene group, and for the coding plot, genes in which that position is non-coding were removed.
BG3 cells were cultured and proteins depleted by dsRNA treatment as described. PCR primers used to make templates for dsRNA synthesis are in Table S1, except for Rad21 and Nipped-B, which are described elsewhere. Templates were designed to avoid off-target matches of ≥19 nucleotides using Drosophila RNAi Screening Center (http://www.flyrnai.org/) online tools. The EcR gene was induced by addition of ecdysone (20-hydroxyecdysone, Sigma) to the culture media at 2 micromoles per liter.
To quantify protein depletion, total cell protein extracts were subjected to SDS-PAGE western blots as described. The GAF, NELF-B, NELF-E, Rad21, Nipped-B, and Spt5 antibodies are described elsewhere [14, 30, 55, 56].
Total RNA was isolated using columns according to the manufacturer’s protocol (Zymo Research), and quantified by real-time quantitative PCR. Primers for quantitative real-time reverse transcription-PCR are in Table S1 and elsewhere.
Chromatin immunoprecipitation was performed as described [15, 55]. Karen Adelman (NIEHS) provided Rpb3 antibodies. Precipitated DNA was quantified using real-time PCR. Enrichment was calculated relative to a genomic region upstream of engrailed that does not bind PolII or cohesin (Chr 2R nucleotides 7436002-7436080).
Cohesin selectively binds genes that have paused RNA polymerase
Cohesin depletion alters gene expression without reducing polymerase pausing
Cohesin and pausing factors regulate closely linked steps at repressed genes
Cohesin inhibits transition of paused polymerase to elongation at repressed genes
The authors thank Karen Adelman and John Lis for generously providing antibodies, and Tomasz Heyduk for helpful comments on the manuscript. This work was supported by grants from the NIH to DD (GM055683) and DG (GM047477).