We have used paired-end sequencing of yeast nucleosomal DNA to obtain accurate genomic maps of nucleosome positions and occupancies in control cells and cells treated with 3-aminotriazole (3AT), an inducer of the transcriptional activator Gcn4. In control cells, 3AT-inducible genes exhibit a series of distinct nucleosome occupancy peaks. However, the underlying position data reveal that each nucleosome peak actually consists of a cluster of mutually exclusive overlapping positions, usually including a dominant position. Thus, each nucleosome occupies one of several possible positions and consequently, different cells have distinct local chromatin structures. Induction results in a major disruption of nucleosome positioning, sometimes with altered spacing and a dramatic loss of occupancy over the entire gene, often extending into a neighbouring gene. Nucleosome-depleted regions are generally unaffected. Genes repressed by 3AT show the same changes, but in reverse. We propose that yeast genes exist in one of several alternative nucleosomal arrays, which are disrupted by activation. We conclude that activation results in gene-wide chromatin remodelling and that this remodelling can even extend into the chromatin of flanking genes.
The DNA of eukaryotic cells is organized into chromatin to facilitate packaging into the nucleus and to regulate access to genetic information. The basic structural unit of chromatin is the nucleosome, which includes the nucleosome core, the linker DNA between nucleosomes and histone H1 (1). The nucleosome core is composed of an octamer of the four core histones (H2A, H2B, H3 and H4), around which is wrapped ~147bp of DNA in 1.75 negative superhelical turns. The nucleosome core can be isolated as a metastable intermediate, the ‘core particle’, by digesting chromatin with micrococcal nuclease (MNase). Indeed, a low-resolution crystal structure of native core particles has been described (2). High-resolution structures were obtained later using core particles reconstituted with defined DNA (3,4).
Digestion of chromatin by MNase proceeds through several stages. Initially, MNase cuts the relatively unprotected linker DNA, resulting in a series of discrete DNA fragments corresponding to integral numbers of nucleosomes (appearing as a nucleosome ‘ladder’ in an agarose gel). Thus, nucleosomes are regularly spaced along the DNA in vivo. In most cell types, the average length of DNA associated with a nucleosome is ~190bp (5), but in budding yeast it is only ~165bp (6,7). Later in digestion, MNase begins to trim the ends of the linker DNA, eventually reaching a transient block in the form of the chromatosome, a particle containing ~165bp of DNA and H1 (8). Finally, MNase removes the remaining ~20bp of linker DNA to yield the core particle. The core particle is relatively stable, but MNase destroys it eventually.
In vitro, the histone octamer binds more strongly to some DNA sequences than to others; the strongest of these are referred to as positioning sequences [reviewed in (9)]. In vivo, the distribution of nucleosomes on DNA is also strongly dependent on the underlying sequence (10–14), suggesting that eukaryotic DNA possesses a nucleosome positioning code (12,13). However, the requirement for nucleosome spacing, the presence or otherwise of sequence-specific transcription factors (14,15) and the activities of ATP-dependent nucleosome mobilizing complexes (11,16) are expected to modify how the information specified by the nucleosome code is used (17).
The biological role of nucleosome positioning is a subject of intense interest and some controversy [reviewed in (18)]. The position of a nucleosome is defined by the DNA sequence it occupies, i.e. the DNA within the core particle. Therefore, all nucleosomes have a position with respect to the genomic sequence. A useful concept is to imagine the genome as a series of overlapping ~147-bp windows, each of which has the potential to be occupied by a nucleosome (19). The occupancy of each potential position might be very low (very few cells in a population have a nucleosome at this position, e.g. a nucleosome-depleted region), or very high (maximum occupancy is when all cells in a population have a nucleosome at this position). Thus, a strong position is one with a high occupancy (i.e. the same nucleosome is present in most cells).
The advent of DNA microarrays and massively parallel sequencing has revolutionized the study of positioning. In the first of these pioneering studies, hybridization of nucleosomal DNA to microarrays was used to measure average occupancies over an entire yeast chromosome (20) and later for the entire genome (21). However, this method cannot determine nucleosome positions very accurately because the precise borders of the nucleosomes cannot be ascertained by hybridization. Sequencing of nucleosomal DNA can determine positions very accurately (22,23) and is now possible on a genome-wide scale. Recent genome-wide single-end nucleosome sequencing studies have resulted in important insights into nucleosome positioning (13,14,24–27). There is, however, a significant drawback to this approach: only one end of each nucleosome is determined. This end sequence might be derived from a fully trimmed nucleosome (a core particle), thereby providing an accurate position, or it might be derived from an incompletely trimmed nucleosome (containing residual linker DNA), or from an over-digested nucleosome (cut internally), resulting in an inaccurate position. This problem is resolved by paired-end sequencing, a refinement of next-generation sequencing, which provides sequence reads from both ends of the same DNA molecule. Accordingly, after alignment with the genome sequence, the exact length of the DNA fragment can be deduced. Paired-end sequencing has been used recently to investigate the genomic distributions of several classes of MNase-resistant particle derived from chromatin (28).
Accurate positions are crucial for an understanding of the sequence determinants of nucleosome positioning. A fully trimmed nucleosome core particle should contain ~147bp, as in the crystal structures (3,4,29). Here, we describe the results of a paired-end nucleosome sequencing study aimed at defining accurate positions for nucleosomes in the budding yeast, Saccharomyces cerevisiae. We compare nucleosome positions in control cells with those in 3-aminotriazole (3AT)-treated cells. 3AT inhibits the enzyme encoded by HIS3, which is required for histidine biosynthesis, resulting in induction of the amino acid starvation pathway through translational control of the transcriptional activator Gcn4 (30).
We show that mono-nucleosome preparations are composed of a mixture of particles containing DNA of different lengths, as expected. We determine accurate nucleosome positions by considering the subset of nucleosome sequences derived from core particles (145–150bp in length). Our results are best described using the concept of ‘nucleosome position clusters’, which specify sets of mutually exclusive overlapping positions and usually include a dominant position (11,16,17,31). Thus, each nucleosome can adopt one of the several alternative positions. To account for position clusters, we propose that yeast genes exist in one of several alternative, overlapping, nucleosomal arrays.
Activation by 3AT results in a dramatic loss of canonical nucleosomes from some genes and the position cluster organization of the remaining nucleosomes is disrupted. Furthermore, chromatin disruption often extends into neighbouring genes. Thus, activation-induced chromatin remodelling events are gene wide and can even spread farther, disturbing the chromatin of flanking genes.
YDC111 (MATa ade2-1 can1-100 leu2-3,112 trp1-1 ura3-1 RAD5+) (16) was grown to late log phase in synthetic complete (SC) medium (control) or SC medium lacking histidine to which 3AT was added to 10mM for 20min just before harvesting (3AT treated). Core particle DNA was prepared by MNase digestion of nuclei, gel purified, repaired and checked for DNA size and quality as described (Figure 1) (32). Paired-end sequencing was performed as described (33). Control cells yielded 16.6 and 12.3 million aligned paired reads of 40nt each for the first and second experiments, respectively; 3AT-treated cells gave 13.1 and 13.5 million aligned paired reads, respectively. Paired reads were aligned to the S. cerevisiae genome using ELAND. Reads with mismatches were excluded from the analysis. The GEO accession number for the data presented here is GSE26493.
An algorithm was written to extract nucleosome positioning information from the sequence data. First, it was assumed that all nucleosome sequences between 145 and 150bp represent accurate positions. These sequences were used to define a set of accurate positions (SAC) adopted by nucleosomes, represented by their midpoint coordinates (i.e. nucleosome dyad axis). Secondly, to include as much of the data as possible, the midpoints of all remaining reads (those <145bp and >150bp) were calculated and then these reads were allocated to the nucleosome in SAC with the closest midpoint, provided that the two midpoints were <10bp apart. Finally, the data were smoothed using a 6-bp window. Nucleosome position maps were obtained in which the number of sequences corresponding to a specific nucleosome position is plotted against the chromosomal coordinate of the dyad axis. The scripts written to analyse the data are given in Supplementary Data.
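The allocation rules described above can be sketched as follows. This is a minimal illustration in Python and not the scripts provided in the Supplementary Data: the function names, the input format (a list of `(start, end)` fragment coordinates, 1-based and inclusive) and the integer rounding of the midpoint are all assumptions made for the example. The 6-bp smoothing step is omitted because its exact form is not specified in the text.

```python
from bisect import bisect_left
from collections import Counter

def midpoint(start, end):
    """Dyad coordinate of a fragment with 1-based inclusive ends (rounded down)."""
    return (start + end) // 2

def call_positions(fragments, core_min=145, core_max=150, max_dist=10):
    """Return a Counter mapping each accurate dyad coordinate to its sequence count."""
    counts = Counter()
    others = []  # midpoints of fragments outside the 145-150 bp range
    for start, end in fragments:
        length = end - start + 1
        if core_min <= length <= core_max:
            counts[midpoint(start, end)] += 1  # core particle: defines an accurate position
        else:
            others.append(midpoint(start, end))

    accurate = sorted(counts)  # the set of accurate positions, by dyad coordinate
    for mid in others:
        i = bisect_left(accurate, mid)
        neighbours = accurate[max(i - 1, 0):i + 1]  # closest candidates on either side
        if not neighbours:
            continue
        nearest = min(neighbours, key=lambda a: abs(a - mid))
        if abs(nearest - mid) < max_dist:  # allocate only if midpoints are <10 bp apart
            counts[nearest] += 1
    return counts

# Hypothetical fragments: two core particles, one poorly trimmed, one subnucleosomal
example = [(100, 246), (100, 246), (98, 260), (300, 420)]
positions = call_positions(example)
```

In this toy input, the 163-bp fragment has a midpoint 6bp from the accurate position and is allocated to it, whereas the 121-bp fragment lies far from any accurate position and is left out.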
Nucleosome core particles were prepared by MNase digestion of nuclei prepared from control and 3AT-treated cells (Figure 1). We obtained between 12 and 17 million aligned paired reads per sample. The yeast genome can accommodate approximately 75000 nucleosomes, given that the haploid genome is ~12.1Mb and the nucleosome spacing is ~165bp. Consequently, approximately 200 sequences per nucleosome should be expected. This read depth should provide data with low statistical sampling error.
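The coverage estimate above is simple arithmetic and can be checked with a short sketch. The function names are hypothetical, and the calculation assumes, as the text does, that the whole genome is packaged at the average spacing and that reads are spread evenly over nucleosomes.

```python
def expected_nucleosomes(genome_bp=12.1e6, repeat_bp=165):
    """Number of nucleosomes needed to cover the genome at the given spacing."""
    return genome_bp / repeat_bp

def reads_per_nucleosome(paired_reads, genome_bp=12.1e6, repeat_bp=165):
    """Average number of nucleosome sequences expected per nucleosome."""
    return paired_reads / expected_nucleosomes(genome_bp, repeat_bp)

n_nucleosomes = expected_nucleosomes()   # roughly 73,000, i.e. approximately 75000
depth_low = reads_per_nucleosome(12e6)   # lower end of the 12-17 million range
depth_high = reads_per_nucleosome(17e6)  # upper end of the range
```

Under these assumptions the expected depth falls between roughly 160 and 230 sequences per nucleosome, in line with the figure of approximately 200 quoted above.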
To maximize the fraction of fully trimmed core particles, a balance must be struck between the full trimming required to obtain accurate position data and the tendency for MNase to begin cutting within the core particle. A typical nucleosome length distribution (Figure 1E) suggested the presence of three different populations: (i) core particles (peaking at 149bp); these are mono-nucleosomes with little or no linker DNA remaining. Consequently, their DNA content defines accurate positions; (ii) mono-nucleosomes with residual linker DNA (peaking at ~157bp and at ~165bp). Incomplete trimming might reflect the binding of H1, which is present at relatively low levels in yeast and so only some nucleosomes would be expected to contain it (34), although poorly trimmed nucleosomes of about chromatosome size have been observed even in the absence of H1 (5); and (iii) subnucleosomal particles containing less than ~140bp. These probably derive primarily from internal cleavage of core particles by MNase, perhaps following spontaneous uncoiling of DNA from the ends of the nucleosome (35). Alternatively, some might represent remodelled nucleosomes or transcribed nucleosomes lacking an H2A-H2B dimer (36,37). The data presented below belong to the first of two independent experiments, which gave essentially the same results.
The PHO5 promoter was chosen as a control region because it is one of the best studied loci in the chromatin literature. Mapping of the PHO5 promoter by indirect end-labelling has established that the repressed promoter is organized into an array of positioned nucleosomes numbered −1 to −5 (38). There is a gap between nucleosomes −2 and −3, where binding sites for the transcription factors Pho4 and Pho2 are located. Induction disrupts this ordered chromatin structure and increases accessibility of the promoter DNA (38). In our experiments, cells were grown under conditions such that PHO5 should be repressed.
The nucleosome occupancy profile is a plot of the chromosome base coordinate versus the number of nucleosome sequences that contain that particular base. It is therefore a measure of the probability of a base being contained within a nucleosome. Occupancy profiles for the PHO5 promoter in control and 3AT-treated cells are shown (Figure 2A); all aligned nucleosome sequences were included. The data were not subjected to mathematical manipulation, except that the 3AT data were multiplied by 1.27 to compensate for the fact that fewer total sequences were obtained relative to the control. The agreement between the profiles for control and 3AT-treated cells was excellent; the traces superimposed in places and showed limited quantitative variation.
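The definition of the occupancy profile given above can be illustrated with a short Python sketch. The function name, the `(start, end)` input format (1-based, inclusive) and the `scale` parameter are assumptions for illustration only. The scaling factor of 1.27 is consistent with the ratio of aligned read totals reported for the first experiment (16.6/13.1 ≈ 1.27).

```python
def occupancy_profile(fragments, region_start, region_end, scale=1.0):
    """Number of fragments covering each base of a region, times an optional scale factor.

    Coordinates are 1-based and inclusive; the result has one value per base.
    """
    size = region_end - region_start + 1
    diff = [0.0] * (size + 1)  # difference array: +1 where a fragment starts, -1 after it ends
    for start, end in fragments:
        lo = max(start, region_start)
        hi = min(end, region_end)
        if lo > hi:
            continue  # fragment lies entirely outside the region
        diff[lo - region_start] += 1
        diff[hi - region_start + 1] -= 1

    profile, running = [], 0.0
    for d in diff[:size]:
        running += d  # running sum gives the coverage at each base
        profile.append(running * scale)
    return profile

# Toy example: two overlapping fragments on a 5-bp region
toy = occupancy_profile([(1, 3), (2, 5)], 1, 5)
```

Passing `scale=16.6 / 13.1` for the 3AT-treated sample would reproduce the kind of depth adjustment described above, on the assumption that the factor was derived from the read totals.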
The occupancy profiles for the PHO5 promoter exhibited peaks corresponding to the five reported nucleosomes (Figure 2A). Importantly, although the peaks were quite obvious, the troughs between them did not dip close to the baseline, indicating that many nucleosome sequences included what should be linker DNA between the reported nucleosomes. To assess whether this was due to poorly trimmed nucleosomes (i.e. nucleosomes significantly >150bp and therefore including some linker DNA), the plot was restricted to data for nucleosome sequences 145–155bp in length (Figure 2B), corresponding to 50% of all nucleosomes. The occupancy profiles were marginally sharper (Figure 2B), but not essentially different from the profiles corresponding to all nucleosomes (Figure 2A). This suggests that the excluded nucleosomes (those >155bp and <145bp) are derived from the same nucleosomes as those in the restricted data set (145–155bp).
The accurate nucleosome positions are those between 145 and 150bp, corresponding to core particles (comprising ~30% of nucleosome sequences from control and 3AT-treated cells). These positions were defined by their midpoints (dyad axes). To include as much data as possible, the midpoints of all remaining sequences (those <145bp and >150bp) were calculated and then assigned to the accurate position with the closest midpoint. These simple rules yield a position map in which the number of sequences corresponding to the midpoint of each specific nucleosome position is plotted against the chromosomal coordinate of the dyad axis.
The position map for the PHO5 promoter showed that some positions were strongly favoured, particularly those corresponding to nucleosome −3 and the PBY1 coding region (Figure 2C, D). However, in all cases, there were some less prominent midpoints near each major midpoint, corresponding to positions which overlap the major position to different extents. Thus, each dominant midpoint was associated with a cluster of midpoints, which we term a ‘position cluster’. Each position cluster specifies a set of overlapping positions, which are mutually exclusive because canonical nucleosomes cannot physically overlap on the same DNA molecule. Therefore, in some cells the nucleosome is present at the dominant position in each cluster while in other cells, the nucleosome is at one of the alternative positions. All of the PHO5 promoter nucleosomes corresponded to clusters including a dominant position, except nucleosome −4, which was a cluster of alternative positions with similar probabilities. PBY1 also exhibited position clusters with a particularly dominant position for the first nucleosome. In conclusion, the chromatin structure of the PHO5 promoter is best described in terms of nucleosome position clusters, rather than uniquely positioned nucleosomes.
Immediately downstream of TRP1 is ARS1, a well-studied replication origin. Examination of the chromatin structure of a TRP1 ARS1 plasmid by indirect end-labelling has revealed a hypersensitive site at the ARS consensus sequence (ACS), where the origin recognition complex binds, and three well-positioned nucleosomes (39). Our occupancy profile for TRP1 ARS1 showed the expected nucleosome-depleted region at the ACS and three clear nucleosome peaks downstream (Figure 2E). Once again, the agreement between the profiles for control and 3AT-treated cells was excellent. As for the PHO5 promoter, analysis of nucleosome positioning on TRP1 and ARS1 indicated the presence of position clusters, rather than unique positions (Figure 2E).
The GAL1 and GAL10 genes are transcribed from a divergent promoter and should be repressed under our growth conditions. A quite regular series of nucleosome occupancy peaks was observed across both genes, which corresponded to a series of position clusters (Figure 2F). There were nucleosome-depleted regions in the divergent promoter (corresponding to Gal4-binding sites), within the GAL10-coding region [corresponding to Reb1 sites required for activation of a ncRNA gene (40)] and at the FUR4 promoter.
In conclusion, nucleosome position clusters were detected at all three regions examined (Figure 2), indicating that these genes can exist in any of several alternative chromatin structures.
Previously, we mapped nucleosome positions on HIS3 at high resolution using the monomer extension technique (16). In the absence of the Gcn4 activator, HIS3 is organized into a dominant array of five nucleosomes, D1–D5, with a background of alternative, overlapping positions. Activation by Gcn4 results in increased occupancy of the alternative positions, and the D-positions are no longer dominant. This study (16) provides positioning data of sufficiently high resolution for direct comparison with the current study.
Both HIS3 and the neighbouring PET56 gene are induced by 3AT (30). The occupancy profiles indicated that HIS3 is flanked by nucleosome-depleted regions, corresponding to the HIS3-PET56 and DED1 promoters (Figure 3A). The profile for control cells indicated five nucleosome peaks, although the separation between the third and fourth peaks was relatively indistinct and their occupancies were lower. The profile for 3AT-treated cells was somewhat different: the distinction between the third and fourth nucleosome peaks was even less clear, the fifth nucleosome peak was shifted a little upstream, and the overall occupancy was lower. The effect of 3AT on PET56 was more subtle, with slightly reduced occupancy at both ends of the coding region, but no obvious change in the fairly regular set of nucleosome peaks.
Five position clusters were present on HIS3 in control cells (Figure 3C). The midpoints of the most prominent peaks were +15, +179 (with slightly weaker peaks at +156 and +203), +327, +491 and +642/+672. These midpoints predict an array with a range of linker lengths and an average spacing of 164bp (typical of yeast). They correspond reasonably well to the five dominant positions mapped previously (16): +8, +163, +327, +527 and +683, except for D4 that was mapped at +527 rather than at +491. In the paired-end data, the strongest peak in the cluster was at +491, but there was a smaller peak at +516.
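The average spacing quoted above follows directly from the dyad coordinates. The sketch below is a worked check in Python with hypothetical function names. It assumes that +672 is taken as the dominant midpoint of the fifth cluster and that each core particle protects 147bp.

```python
def average_spacing(dyads):
    """Mean dyad-to-dyad distance for an ordered array of nucleosomes."""
    dyads = sorted(dyads)
    return (dyads[-1] - dyads[0]) / (len(dyads) - 1)

def linker_lengths(dyads, core_bp=147):
    """Linker DNA between neighbouring nucleosomes, assuming a 147-bp core."""
    dyads = sorted(dyads)
    return [b - a - core_bp for a, b in zip(dyads, dyads[1:])]

paired_end = [15, 179, 327, 491, 672]        # dominant midpoints from this study
monomer_extension = [8, 163, 327, 527, 683]  # dominant positions mapped previously (16)

spacing_now = average_spacing(paired_end)          # 164.25 bp
spacing_before = average_spacing(monomer_extension)  # 168.75 bp
linkers = linker_lengths(paired_end)               # [17, 1, 17, 34]
```

The linker values make the "range of linker lengths" concrete: with these midpoints the second and third nucleosomes are almost abutting, whereas the fourth and fifth are separated by about 34bp.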
There were significant changes in the position clusters on HIS3 in 3AT-treated cells (Figure 3D). In the first cluster, the D1 position remained the most probable but its dominance was reduced relative to an overlapping position ~20-bp downstream. The same was true of the fifth cluster in which the dominance of the +672 position was reduced relative to that at +642. The major changes occurred in the D2, D3 and D4 clusters: the dominant position in the D2 cluster was shifted downstream to the +203 position; the D3 and D4 clusters were replaced by a very weak cluster centred on +425. Thus, the 3AT-induced HIS3 gene had only four nucleosomes on average, rather than the array of five in control cells. Since the positions adopted by the nucleosome at each end of the array (D1 and D5) did not change significantly, the average spacing of the nucleosomes was much greater on the 3AT-induced gene (218bp) than in control cells (164bp).
In conclusion, the paired-end data are in good agreement with our previous monomer extension studies, providing some validation. Induction with 3AT reduced the occupancies of the dominant positions relative to the alternative positions and resulted in removal of one nucleosome and some re-positioning of the remaining nucleosomes.
ARG1, another Gcn4-dependent gene, is strongly induced by 3AT (30). In control cells, ARG1 was organized into nine nucleosome peaks, flanked by nucleosome-depleted regions corresponding to the ARG1 and YOL057W promoters (Figure 4A). The 5′ half of YOL057W was also well organized, displaying six nucleosome peaks before becoming more irregular. In 3AT-treated cells, there was a massive loss of occupancy across the entire ARG1 gene, extending into the 3′-flanking gene (YOL057W) (Figure 4A), even though its expression is not affected by 3AT (30). Furthermore, the regular nucleosome peaks observed on ARG1 in control cells merged into one another in 3AT-treated cells. The position clusters present on ARG1 and YOL057W in control cells were heavily disrupted in 3AT-treated cells (Figure 4B, C). Thus, ARG1 induction was associated with loss of more than half of its canonical nucleosomes; those remaining were no longer organized into clusters with dominant positions. Furthermore, these effects were propagated downstream into YOL057W.
To find other genes displaying similarly dramatic, 3AT-induced effects on chromatin structure, the numbers of nucleosome sequences per coding region in control and 3AT-treated cells were compared using a whole-genome survey. This analysis ranked all genes using a ‘disruption score’, corresponding to the ratio of nucleosome sequences in 3AT-treated cells to sequences in control cells (after adjustment for the difference in the total number of nucleosome sequences obtained for control and 3AT-treated cells). A disruption score of <1 indicates that a gene has fewer nucleosome sequences in 3AT-treated cells, like ARG1. A cut-off score of 0.75 was set, requiring that a gene has >25% fewer sequences in 3AT-treated cells. Forty-nine genes, including ARG1, had an average disruption score of <0.75 (Table 1). In addition, 13 genes showed the reverse effect, with an equivalent cut-off score of >1.32 (25% fewer nucleosome sequences in control cells than in 3AT-treated cells).
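The disruption score described above can be sketched in Python as follows. The function names and the per-gene counts are hypothetical; the sketch also assumes that the adjustment for sequencing depth is a simple division by the total number of nucleosome sequences in each sample. The two cut-offs are the values given in the text.

```python
def disruption_score(n_3at, n_control, total_3at, total_control):
    """Ratio of 3AT to control sequence counts for one coding region, depth-adjusted."""
    if n_control == 0:
        raise ValueError("no control sequences for this coding region")
    return (n_3at / total_3at) / (n_control / total_control)

def classify(score, low=0.75, high=1.32):
    """Label a gene by its score using the cut-offs quoted in the text."""
    if score < low:
        return "fewer sequences in 3AT-treated cells"
    if score > high:
        return "fewer sequences in control cells"
    return "below cut-off"

# Hypothetical gene with 100 control and 60 3AT sequences,
# using the read totals reported for the first experiment
score = disruption_score(60, 100, total_3at=13.1e6, total_control=16.6e6)
label = classify(score)
```

In this example the raw ratio is 0.60, but the depth-adjusted score is about 0.76, so the hypothetical gene narrowly misses the 0.75 cut-off. This illustrates why the adjustment for total sequence number matters. Table 1 reports an average score, so per-experiment values would presumably be averaged before classification.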
The expression microarray study (30) found 305 genes that are induced >2-fold by 10mM 3AT and 104 genes that are repressed >2-fold. Of the 49 genes with disruption scores equal to or <0.75, 29 were induced >2-fold [(30), Table 1]. If the three genes for which there are no data are excluded, 63% of genes with heavily disrupted chromatin are induced by 3AT. In addition, four of the genes unaffected by 3AT are located next to genes that are induced by 3AT (ERV2/YPR036W-A, YSC83/ARG4, YIR035C/LYS1 and COX9/IDP1). Of the 13 genes with heavily disrupted chromatin in control cells, three are repressed by 3AT (33% of genes for which data are available; Table 1). Genome-wide, there was a good correlation between the disruption score and the fold induction by 3AT (Supplementary Figure S1). A small fraction of genes which were strongly induced or repressed by 3AT showed only weak chromatin disruption. This could be because the fold-change in expression of these genes is high but the absolute level of transcription is not very high (see ‘Discussion’ section). It should also be noted that the 3AT expression data are for a different yeast strain grown in a different medium (30) and so some genes might be affected differently by 3AT.
By comparing changes in expression in wild-type and gcn4Δ cells, Natarajan et al. (30) reported a list of 539 Gcn4 target genes. Of the 46 genes with disrupted chromatin structure in 3AT-treated cells and for which there are expression data, 38 are Gcn4 targets (83%). Only eight genes are not Gcn4 targets and three of these are neighbours of affected genes (Table 1). As expected, none of the genes with disrupted chromatin in control cells are Gcn4 targets. Thus, most of the genes identified by the disruption survey are known Gcn4 targets and are induced by 3AT.
Occupancy profiles for some genes with heavily disrupted chromatin structures identified by the whole-genome survey (Table 1) are shown (Figure 5). In all cases, there was a dramatic loss of occupancy over the coding region in 3AT-treated cells. In control cells, the chromatin structures of LYS1 (Figure 5D), the 5′- and 3′-ends of HIS4 (Figure 5B) and the 5′-half of IDP1 (Figure 5E) were quite regular, displaying well-defined nucleosome peaks, corresponding to position clusters with dominant positions. The chromatin structures of ARG4 (Figure 5C), ICY2 (Figure 5A), the central region of HIS4 and the 3′ half of IDP1 were less regular, indicating a more complex position cluster organization. The occupancy profiles of these genes in 3AT-treated cells were very different from those of control cells, indicating that the remaining nucleosomes had been rearranged. In the case of HIS4, all regularity was lost, indicating the absence of dominant positions (Figure 5B). The occupancy profile of LYS1 remained quite regular in 3AT-treated cells, but there were only six clear peaks, which were out of phase with the peaks in control cells, with the exception of the sixth peak (Figure 5D). This indicated a change in the average positions and spacing of the nucleosomes, as observed for HIS3 (Figure 3).
In striking contrast to the effects of 3AT on nucleosome occupancy of the coding regions, there was little effect on occupancy at the promoters and 3′-ends of these genes, which were all significantly depleted of nucleosomes in both control and 3AT-treated cells.
The disruption of ARG1 chromatin in 3AT-treated cells extended far into the gene downstream (Figure 4A). This was also true for HIS4 and ARG4 (Figure 5). None of these downstream genes are Gcn4 targets and all are unaffected by 3AT (30). Indeed, nucleosome occupancy on YSC83, downstream of ARG4, was reduced so strongly that it was scored as a gene with heavily disrupted chromatin structure (Table 1). In all three cases, occupancy of the downstream gene at the end farthest from the target gene was similar to that in control cells, revealing that the disruptive effect diminished with distance from the target gene.
The chromatin structures of the genes downstream of ICY2, LYS1 and IDP1 were unaffected (Figure 5). However, in these cases, the chromatin structure of the upstream gene was disrupted. Most strikingly, YIR035C, upstream of LYS1 (Figure 5D), was heavily disrupted, even though it is not a Gcn4 target gene and is unaffected by 3AT (30).
In summary, the chromatin structures of 49 genes were heavily disrupted in 3AT-treated cells: the entire coding region was heavily depleted of canonical nucleosomes. In some cases, this disruption extended to flanking genes, either upstream or downstream. Nucleosome positioning was heavily disrupted with major reductions in occupancies of dominant positions observed in control cells. At HIS3 and LYS1, the average number of nucleosomes on the gene was reduced, implying changes in nucleosome spacing.
There were 13 genes with extreme disruption scores in the opposite sense: more nucleosome sequences were obtained from 3AT-treated cells than from control cells (Table 1). Three of these genes (URA1, OLE1 and MOG1) are repressed by 3AT (30). MOG1 had a more ordered chromatin structure in 3AT-treated cells than in control cells (Figure 6A), with four nucleosome peaks corresponding to position clusters (Figure 6A). In control cells, the first nucleosome peak on MOG1 and the corresponding position cluster were greatly diminished. MOG1 shares a divergent promoter with OPI3, the coding region of which was somewhat depleted of nucleosomes (Table 1), but it is unclear whether this effect was communicated from MOG1, because OPI3 is also repressed by 3AT (30). The gene with the most disrupted chromatin in control cells was URA1. There was a major loss of occupancy over the coding region in control cells relative to 3AT-treated cells (Figure 6B). All nine position clusters located between the nucleosome-depleted regions flanking URA1 in 3AT-treated cells were disrupted in control cells (Figure 6B). In conclusion, MOG1 and URA1 are repressed by 3AT (30) and their chromatin structures showed the opposite transition from 3AT-induced genes.
Nucleosome length distributions indicate that each sample contains fully trimmed nucleosome core particles, together with some incompletely trimmed nucleosomes and damaged core particles. This was expected because of MNase digestion kinetics, the possible influence of H1, and slower trimming of the final ~20bp of linker DNA. DNA length is essential information for determining accurate nucleosome positions. If the DNA is significantly >150bp, there is uncertainty in the position of the nucleosome, because it occupies only 145–150bp. If the DNA is significantly <145bp, the position is also unclear, because the nucleosome from which it is derived must have been cleaved internally or trimmed excessively from one or both ends. Sequences which are too long or too short can be selectively excluded from paired-end data, but not from single-end data.
In our analysis, we consider only DNA fragments of approximately mono-nucleosome size (any protected DNA fragments much larger or much smaller than the nucleosome are not present in our data sets, because the DNA was gel purified). Thus, we are considering only canonical nucleosomes, which are really defined by their ability to protect ~147bp. We expect that there might be a small fraction of ~147-bp sequences scored as canonical nucleosomes that are not canonical nucleosomes. Some of these sequences might correspond to internal cleavage sites in neighbouring nucleosomes. If so, such sequences would have to contain an intact linker, which is unlikely and cannot be quantitatively very significant because such cleavages would smear the nucleosomal repeat pattern. Another possible problem is that a transcription factor bound adjacent to a nucleosome might protect some linker DNA after the very extensive digestion used to make core particles, but we are not aware of any studies indicating that transcription factors offer strong protection against MNase digestion. It is worth noting that transcription factors bind reversibly to DNA (unlike histones in the nucleosome) and so would be expected to offer less protection. In addition, even histone H1, which binds tightly to the nucleosome, offers only transient protection from MNase under these conditions. Overall, we believe that the vast majority of ~147-bp sequences are indeed canonical nucleosomes.
There is potential for bias in genome-wide sequencing studies, particularly because two different DNA amplification steps are involved. We do not believe that bias is a major problem in our study because: (i) we have validated our current mapping data by comparison with some famous examples in the classical literature (Figures 2 and 3); (ii) the average nucleosome occupancy is very consistent across the genome; and (iii) our data are very reproducible.
The chromatin structures reported here for the PHO5 promoter and TRP1 ARS1 (Figure 2) are consistent with previous studies using low-resolution indirect end-labelling (38,39). However, the higher resolution provided by paired-end sequencing reveals that each positioned nucleosome reported by indirect end-labelling is in fact an average of several overlapping positions (a position cluster). We and others have described similarly complex chromatin structures previously (11,16,23,31,39), but it has been generally assumed that these are atypical. More recently, complex chromatin structures have been noted genome wide in Caenorhabditis elegans (27). The present study demonstrates that complex chromatin structures are the rule in yeast chromatin, not the exception.
We define a position cluster as a set of overlapping positions, usually including a dominant position (Figure 7A). These must be alternative positions, because canonical nucleosomes cannot physically occupy the same DNA. In this context, it is worth noting that in vitro, a nucleosome can invade the territory of a neighbouring nucleosome, resulting in the loss of one H2A–H2B dimer and forming a particle that protects ~250bp from MNase digestion (42). If such coalesced nucleosomes are present in yeast, they would not appear in our maps because they protect much more than 147bp.
In a particular cell at a given moment, the nucleosome represented by a position cluster occupies one of the positions within the cluster. Thus, in some cells, the nucleosome will occupy the dominant position; in other cells, it will be at one of the alternative positions. This observation has important biological implications. For example, many models proposed for the regulation of specific genes depend on precise positions adopted by nucleosomes at the promoter, with critical transcription factor-binding sites located in the linker DNA, or just inside the nucleosome core, rather than in the inaccessible centre. Our data imply that factor-binding sites at nucleosomal promoters (e.g. PHO5) might be accessible in some cells, but not in others. It seems likely that remodelling machines will play critical roles here, because they are able to move nucleosomes along the DNA, perhaps from one position in a cluster to another, perhaps exposing or obscuring specific factor-binding sites. Furthermore, there is potential for stochastic effects, given that apparently identical cells can have different chromatin structures.
Although the existence of position clusters indicates that chromatin structure is more complex than has been generally acknowledged, a significant simplifying factor is that nucleosomes in yeast are regularly spaced with an average linker length of 15–20bp (a 160- to 165-bp repeat). Consequently, we propose that each position cluster corresponds to positions belonging to alternative arrays with the same spacing (17). An array of perfectly positioned nucleosomes predicts a ‘square-wave’ occupancy profile (Figure 7B), which is not generally observed. The only obvious example of a square nucleosome occupancy peak that we have found in our data is the single nucleosome located over each centromere, which is therefore perfectly positioned, but this nucleosome is unusual in that it contains CenH3 (Cse4), a variant of H3 (33).
A set of five overlapping arrays with the same spacing (165bp) predicts a profile similar to the more regular profiles and position clusters observed experimentally (Figure 7C); this example is just one of many possibilities. Most arrays must have the same spacing to yield the observed bulk chromatin repeat of 165bp, but quantitatively rare arrays could have quite different spacing. An interesting example is the square wave interference pattern generated in the case where half the cells have an array of five nucleosomes on gene X (165-bp spacing) and the other half have an array of four nucleosomes (220-bp spacing), beginning and ending with the same nucleosome (Figure 7D): both outermost nucleosomes give rise to a clear nucleosome peak in the occupancy profile, but the inner nucleosomes contribute an irregular pattern, including sharp spikes. Counter-intuitively, a well-positioned nucleosome is located below the central trough in the occupancy profile (Figure 7D). Thus, both regular and irregular occupancy profiles could be accounted for by overlapping regular arrays.
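The interference argument can be checked numerically. The sketch below is a simplified illustration, not the code used to generate Figure 7. It assumes each nucleosome contributes a flat 147-bp footprint, weights each array by the fraction of cells carrying it, and places the first nucleosome at coordinate 0. The function names `array_starts` and `occupancy` are hypothetical.

```python
NUC_LEN = 147  # bp protected by a canonical nucleosome (value from the text)

def array_starts(first_start, spacing, n):
    """Start coordinates of n regularly spaced nucleosomes."""
    return [first_start + i * spacing for i in range(n)]

def occupancy(arrays, length):
    """Per-bp occupancy for a mixture of nucleosomal arrays.

    arrays: list of (fraction_of_cells, starts) pairs.
    Each nucleosome adds its array's fraction over a flat 147-bp footprint.
    """
    profile = [0.0] * length
    for fraction, starts in arrays:
        for s in starts:
            for bp in range(s, min(s + NUC_LEN, length)):
                profile[bp] += fraction
    return profile

# Scenario described for Figure 7D: half the cells carry five nucleosomes
# at 165-bp spacing, half carry four at 220-bp spacing, sharing both ends.
five = array_starts(0, 165, 5)   # 0, 165, 330, 495, 660
four = array_starts(0, 220, 4)   # 0, 220, 440, 660
length = 660 + NUC_LEN

square = occupancy([(1.0, five)], length)             # single perfect array
mixed = occupancy([(0.5, five), (0.5, four)], length)  # two interfering arrays

print(sorted(set(square)))          # only 0.0 and 1.0: a square wave
print(mixed[50], mixed[400], mixed[350])
```

In this toy model the two outermost nucleosomes reach full occupancy. The centre of the region (around coordinate 400) drops to 0.5 because it falls in a linker of the four-nucleosome array, even though the middle nucleosome of the five-nucleosome array lies directly beneath it.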
If a spacing factor begins at one nucleosome-depleted region and terminates at the next, then nucleosomes on a gene might be subjected to spacing from both ends, resulting in at least two alternative arrays. Little is known about how nucleosomes are spaced in yeast in vivo. In vitro, the yeast ISW1 and INO80 complexes can create arrays with ~175-bp spacing, and ISW2 can assemble arrays with ~200-bp spacing (43,44). In Drosophila, there are two well-characterized nucleosome spacing factors, ACF and CHD1 (45,46). How the activities of spacing factors interact in terms of array formation in vivo is an important question.
The effect of 3AT on the chromatin structures of some induced genes is dramatic: nucleosome occupancy is heavily reduced over the entire coding region. Moreover, some 3AT-repressed genes show the opposite trend: occupancy increases to normal levels in 3AT-treated cells. Thus, reduced occupancy correlates with transcriptional activation. In addition, nucleosome spacing is altered after induction on at least two genes (HIS3 and LYS1). Altered nucleosome spacing might reflect an intermediate chromatin state corresponding to a level of disruption in between the resting state and a major loss of canonical nucleosomes. Only a subset of 3AT-induced genes show extreme loss of canonical nucleosomes. These are probably the most transcriptionally active genes, since single-gene and microarray studies indicate that the extent of histone loss from coding regions correlates with heavy transcription (47–51).
The mono-nucleosome sequencing approach identifies only canonical nucleosomes (i.e. those which protect ~147bp of DNA). Consequently, reduced occupancy on coding regions and at nucleosome-depleted regions could reflect actual loss of nucleosomes (resulting in free DNA), or the presence of ‘non-canonical’ nucleosomes which have been remodelled such that they no longer adequately protect their DNA from MNase. Thus, reduced occupancy on coding regions might reflect loss of the entire histone octamer, or of just one or both H2A–H2B dimers (48). Histone hexamers and H3–H4 tetramers protect less DNA than the octamer, and DNA in this size range (~80 to ~120bp) is not present in our preparations of core particle DNA. Alternatively, the histones might still be bound to the DNA, but present in remodelled nucleosomes, as we have suggested previously (16,52).
In the cases of the genes most strongly affected by 3AT, loss of canonical nucleosomes occurs not just over the coding region, but extends into neighbouring genes (Figure 5). It seems unlikely that this is a direct effect of transcription, involving RNA polymerase II ploughing on into the chromatin of the downstream gene after release of the mRNA, because in some cases upstream genes are affected. Previously, we have observed disruption of nucleosome positioning on the flanking TRP1 gene in CUP1 or HIS3 plasmid chromatin after induction (11,16). However, since CUP1 and HIS3 were not in their native chromosomal contexts and the TRP1 gene was also active, the biological significance of the disruption of flanking chromatin structure is unclear. More recently, in Drosophila, a single-gene nucleosome scanning study has shown that heat shock induces nucleosome loss over a pair of divergently transcribed Hsp70 genes, extending in both directions into the flanking sequences as far as the scs and scs′ insulating elements (50). This effect does not depend on transcription, but on poly(ADP-ribose) polymerase (50). Since there is no evidence for this enzyme in yeast, the mechanism in yeast must be different, perhaps involving remodelling by SWI/SNF, as we have observed previously for HIS3 (16,52). We are currently investigating this possibility.
In summary, our genome-wide nucleosome sequence data show not only that there is a major loss of canonical nucleosomes from the coding regions of some 3AT-induced genes, but also that the positioning of the remaining nucleosomes is heavily disrupted. Thus, the chromatin structure of the coding region undergoes major remodelling on activation, with disruption of the dominant nucleosomal array and loss of canonical nucleosomes. This disruptive effect can be communicated to flanking genes through nucleosome-depleted promoters and 3′-regions that are seemingly unaffected, indicating that they do not act as strict boundaries. The factors that direct the formation of these domains of altered chromatin structure and determine their boundaries are currently under investigation.
We have submitted four sets of paired-end sequencing data to the GEO database; these correspond to two independent experiments: control cells and cells treated with 3AT. The accession number is GSE26493. These data are available to the public as of 20 July 2011.
Supplementary Data are available at NAR Online.
Funding for open access charge: Intramural Research Program of the National Institutes of Health (National Institute for Child Health and Human Development).
Conflict of interest statement. None declared.
The authors thank Gary Felsenfeld, Alan Hinnebusch, Rohinton Kamakaka and Victor Zhurkin for helpful comments on the manuscript. The authors thank Kip Bodi and Michael Berne at the Tufts University Core Facility for paired-end sequencing.