Precise global mapping of histone variants is indispensable for understanding the interaction between chromatin structure and gene expression. Genome-wide surveys of the individual histone variants H2A.Z or H3.3 have revealed that they are widely distributed in the genome1-11. However, no one has attempted a genome-wide study of the distribution of individual nucleosome core particles (NCPs) that contain both variants; it has therefore not been possible to distinguish NCPs carrying only one of these variants from those carrying both. This is particularly important because we have shown in a previous study12 that H3.3/H2A.Z NCPs are unusually unstable under conditions normally used in preparations for such studies; it therefore seemed possible that preferential loss could account for earlier reports3,4,7,8,11,13 that H2A.Z-containing nucleosomes are absent from ‘nucleosome-free regions’13-22 at transcription start sites (TSSs) of active genes. With isolation procedures that preserve the double-variant NCPs, it becomes possible to carry out a genome-wide survey of each of these kinds of variant NCPs.
To determine the distribution of different combinations of histone variants, we prepared monomer NCPs from a HeLa cell line expressing FLAG-tagged human histone H3.3 (ref. 23), and carried out individual or sequential immunopurification followed by high-throughput Solexa sequencing4 to obtain genome-wide high-resolution profiles for total H3.3 or double (H3.3/H2A.Z) NCPs (Supplementary Fig. 1 online). From these libraries, it was also possible, using computational analysis, to deduce the relative profiles of NCPs carrying H2A.Z only (in combination with H3.1 or H3.2 but not H3.3) and NCPs containing H3.3 only (not in combination with H2A.Z) (see Methods). H3.3/H2A.Z NCPs are disrupted by exposure to moderate salt concentrations12; we therefore carried out all of the purifications in low ionic strength solvents except as indicated. The mononucleosomes that we used as input in our study reflect the bulk of the genome (Supplementary Fig. 2).
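The Methods are not reproduced in this section, so the following is only a minimal sketch of one way a single-variant profile could be deduced from two libraries, not the procedure actually used. It assumes that each profile is first scaled to unit total, and that the share of double-variant NCPs within the total library is known; the function names and the `double_fraction` parameter are hypothetical.

```python
def normalize(counts):
    """Scale raw tag counts so they sum to 1 (a simple depth normalization)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("profile contains no tags")
    return [c / total for c in counts]


def single_variant_profile(total_counts, double_counts, double_fraction):
    """Estimate an 'only' profile by subtracting the double-variant share.

    total_counts    : binned tag counts for all NCPs carrying one variant
    double_counts   : binned tag counts for double-variant NCPs (same bins)
    double_fraction : ASSUMED fraction of the total library contributed by
                      double-variant NCPs (hypothetical parameter, 0..1)
    """
    if len(total_counts) != len(double_counts):
        raise ValueError("profiles must use the same bins")
    if not 0.0 <= double_fraction <= 1.0:
        raise ValueError("double_fraction must lie between 0 and 1")
    total = normalize(total_counts)
    double = normalize(double_counts)
    # Negative values can only arise from noise, so clip them to zero.
    return [max(t - double_fraction * d, 0.0) for t, d in zip(total, double)]
```

With this kind of estimate, a bin in which the total and double profiles are both high yields a small residual, which is the pattern one would expect where most variant NCPs carry both variants.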
We first investigated the distribution of histone variants around genomic TSSs. To correlate the distribution with gene expression, we created separate profiles containing 1000 genes each for highly expressed, intermediately expressed and silent genes. The data show that H3.3, H2A.Z, and H3.3/H2A.Z NCPs are selectively enriched at TSSs of active genes (). Only a small fraction of H2A.Z-only and almost none of the H3.3-only NCPs are detected at such sites (). The results for H3.3 and H2A.Z separately are apparently at variance with high-resolution (mononucleosome level) studies, which have indicated that sites immediately upstream of the TSS of active genes tend to be generally depleted of H2A.Z NCPs and, to a lesser extent, of H3.3 NCPs1,3,4,7,8,13. Since H3.3/H2A.Z NCPs are easily disrupted12, and these comprise a large fraction of total H2A.Z NCPs at TSSs (compare ), it seemed possible that they would be under-represented when isolated at higher salt concentrations. As we anticipated, a second genome-wide screen, using NCPs prepared under conditions that exposed them to 150 mM NaCl, showed a relative minimum of H2A.Z abundance at the TSS, reproducing the earlier findings (). We conclude that under-representation of H2A.Z-containing NCPs at TSSs can arise from preferential disruption of H3.3/H2A.Z NCPs.
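The TSS-centered profiles described above can be illustrated with a short sketch. It is not the authors' pipeline: the window size, bin size, and function name are assumptions chosen for illustration, and tags are taken to be coordinates on a single chromosome.

```python
from bisect import bisect_left, bisect_right


def tss_profile(tag_positions, tss_list, flank=1000, bin_size=100):
    """Average tag count per gene, binned by position relative to the TSS.

    tag_positions : tag coordinates on one chromosome
    tss_list      : list of (tss_coordinate, strand), strand is '+' or '-'
    flank         : ASSUMED half-width of the window around each TSS (bp)
    bin_size      : ASSUMED bin width (bp); should divide 2 * flank
    """
    if not tss_list:
        raise ValueError("tss_list is empty")
    tags = sorted(tag_positions)
    n_bins = (2 * flank) // bin_size
    profile = [0.0] * n_bins
    for tss, strand in tss_list:
        lo = bisect_left(tags, tss - flank)
        hi = bisect_right(tags, tss + flank)
        for pos in tags[lo:hi]:
            # Orient coordinates so that positive values lie downstream.
            rel = pos - tss if strand == '+' else tss - pos
            if -flank <= rel < flank:
                profile[(rel + flank) // bin_size] += 1
    return [count / len(tss_list) for count in profile]
```

Running the same function on three gene lists (highly expressed, intermediately expressed, and silent) would give one curve per expression class, which is the form of comparison described in the text.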
H3.3/H2A.Z NCPs mark ‘nucleosome-free regions’ of active promoters.
We further carried out an analysis of positioning for all NCPs containing H2A.Z, making use of tags on both strands to determine accurately the boundaries of each NCP13. Consistent with published data, NCPs prepared in 150 mM NaCl show a 200 bp region depleted of H2A.Z NCPs immediately upstream of the TSS (−1 nucleosome), whereas in the surrounding region four phased nucleosomes are detected (from −2 to +3) ( and Supplementary Fig. 3 online). In contrast, the low salt preparation clearly reveals the enrichment of H2A.Z NCPs at the −1 position; the peaks in the region corresponding to the −1 and −2 nucleosomes are not well ordered (). The observed irregular patterns are entirely consistent with a population of sites in which one or two NCPs can occupy any of several positions in this ~400 bp region (Supplementary Fig. 4 online). Individual active genes also displayed similar changes at the TSS (). It should be noted that these previously undetected NCPs carry both H3.3 and H2A.Z.
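As a rough illustration of how tags on both strands can delimit a nucleosome, the sketch below pairs forward-strand tag starts (taken as the left boundary) with reverse-strand tag starts (taken as the right boundary). The 147 bp particle length, the pairing tolerance, and the function names are assumptions made for this example; the text does not specify the procedure.

```python
from bisect import bisect_left, bisect_right

NCP_LENGTH = 147  # ASSUMED length of DNA protected by one core particle (bp)


def dyad_positions(forward_starts, reverse_starts, ncp_length=NCP_LENGTH):
    """Estimate nucleosome centers by shifting each tag half a particle inward."""
    half = ncp_length // 2
    shifted = [p + half for p in forward_starts]
    shifted += [p - half for p in reverse_starts]
    return sorted(shifted)


def paired_boundaries(forward_starts, reverse_starts,
                      ncp_length=NCP_LENGTH, tolerance=10):
    """Pair each forward tag with the reverse tag closest to the expected
    right boundary, if one lies within `tolerance` bp (hypothetical cutoff)."""
    reverse = sorted(reverse_starts)
    pairs = []
    for left in sorted(forward_starts):
        expected = left + ncp_length - 1
        lo = bisect_left(reverse, expected - tolerance)
        hi = bisect_right(reverse, expected + tolerance)
        if lo < hi:
            right = min(reverse[lo:hi], key=lambda r: abs(r - expected))
            pairs.append((left, right))
    return pairs
```

Under these assumptions, a well-positioned nucleosome produces forward and reverse tags whose shifted positions coincide, whereas a population occupying several alternative positions produces the broader, less ordered peaks described for the low salt preparation.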
Next, we examined the distribution over other regulatory elements, including CTCF-binding sites, which typically represent regions with insulator activity24, and DNase I hypersensitive sites, typically associated with the centers of regulatory activity25.
H2A.Z is enriched at the center of the intergenic CTCF-binding sites26 (). A small number of H2A.Z-only NCPs (less than 20% of total) contribute to this enrichment. Interestingly, total H3.3 also had its highest peak at the sites, but again only one fifth of these NCPs are H3.3-only NCPs (), suggesting that the majority of NCPs at the center of the binding sites are the H3.3/H2A.Z double variant. This is confirmed by the profile for double (H3.3/H2A.Z) NCPs (). We next examined H2A.Z nucleosome positioning around CTCF-binding sites. Under the low salt conditions, the two highest peaks for both 5′ tags and 3′ tags are observed at the center of the binding sites ( and Supplementary Fig. 5 online). However, these two peaks are nearly missing () under higher salt conditions, and the pattern is now quite similar in many respects to the one reported earlier27, which showed a nucleosome-free gap at the binding sites surrounded by an ordered array of H2A.Z NCPs. These results reveal the presence of H2A.Z nucleosomes, largely H3.3/H2A.Z NCPs, at this “nucleosome-free” region. The distribution of nucleosome levels around CTCF-binding sites (Supplementary Fig. 6 online) under low salt conditions indicates that a single H2A.Z NCP can bind in several different positions within the CTCF-binding region, a pattern resembling that seen at TSSs. A survey of intergenic ENCODE DNase I hypersensitive sites28,29 reveals high concentrations of total H3.3 and total H2A.Z (), whereas there is only a small enrichment of NCPs containing H3.3 or H2A.Z alone; the double-variant NCP predominates. H3.3/H2A.Z NCPs are not detectable in HeLa cells at sites that are DNase I hypersensitive in CD4+ T cells but not in HeLa (). This shows that the presence of the unstable NCPs reflects the activity of the hypersensitive sites, which also carry histone modifications correlated with enhancer activity (K.C., C.Z., D.S., W.P. and K.Z., unpublished data). Taken together, these results show that H3.3/H2A.Z NCPs mark ‘nucleosome-free regions’ of active promoters as well as enhancers and insulator regions.
H3.3/H2A.Z NCPs enriched at other regulatory elements.
We then examined the distribution patterns of histone variants at the transcription termination sites (TTSs). The abundance of total H2A.Z near the TTS is low, nearly uniform and almost independent of gene activity (). In accordance with previous observations in Drosophila, over the most active genes H3.3 abundance reaches a broad peak around the TTS and then decreases on either side (). Double-variant NCPs rise slightly in abundance 3′ of the TTS of the more active genes (). These may function in transcriptional termination, antisense transcription or antisilencing. There is a narrow local minimum at the TTS in the H3.3, H2A.Z and H3.3/H2A.Z distributions ( and Supplementary Fig. 7a-f online). Similar patterns are seen with the input sample of NCPs (before immunoprecipitation) and total genomic DNA (Supplementary Fig. 7g,h), suggesting that these very low level signals are an artifact associated with TTS sequences and should be taken into account in analyses of this kind.
Histone variants near transcription termination sites (TTSs). Method was the same as used for .
To characterize the distributions of histone variants across entire genes, we displayed our data on a normalized distance scale with the TSS set at 0 and the TTS at 1, and with a compressed scale for the regions around the TSS and TTS. Of all the H2A.Z-containing NCPs near the TSS of active genes, the majority carry both H3.3 and H2A.Z (). There is a slight but consistent elevation of H2A.Z-only particles over the gene bodies and downstream of the TTS of the silent gene population (Supplementary Fig. 8).
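The normalized distance scale can be written as x = (p − TSS) / (TTS − TSS), which maps the TSS to 0 and the TTS to 1 on either strand. The sketch below applies this to gene bodies only; the bin count and function names are assumptions, and the compressed flanking regions mentioned above are not modeled.

```python
def scaled_position(pos, tss, tts):
    """Map a genomic coordinate onto a scale with TSS = 0 and TTS = 1.

    Works for both strands: on the minus strand tts < tss, so the
    denominator is negative and the orientation flips automatically.
    """
    if tss == tts:
        raise ValueError("gene has zero length")
    return (pos - tss) / (tts - tss)


def gene_body_profile(tag_positions, genes, n_bins=10):
    """Average tag count per gene in equal fractions of the gene body.

    genes  : list of (tss, tts) coordinate pairs on one chromosome
    n_bins : ASSUMED number of bins between TSS and TTS
    """
    if not genes:
        raise ValueError("genes is empty")
    profile = [0.0] * n_bins
    for tss, tts in genes:
        for pos in tag_positions:
            x = scaled_position(pos, tss, tts)
            if 0.0 <= x < 1.0:  # keep only tags inside the gene body
                profile[int(x * n_bins)] += 1
    return [count / len(genes) for count in profile]
```

Scaling each gene to unit length lets genes of very different sizes contribute to the same averaged curve, at the cost of mixing bins that span different numbers of base pairs.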
H3.3 NCPs show a gradient of increasing abundance from 5′ to 3′ over the entire transcribed regions of active genes (), reminiscent of the distribution of histone H3 lysine 36 trimethylation in active genes4,19. Interestingly, NCPs containing only H3.3 are almost completely absent from the TSS (), showing that in this region H3.3, when present, is almost always partnered with H2A.Z. In contrast, the pattern and the density of NCPs containing only H3.3 within the gene bodies are very close to those of total H3.3, indicating that the majority of H3.3 NCPs over transcribed regions carry the single variant H3.3 but not H2A.Z. We note that NCPs containing the single variant H3.3 are still relatively unstable compared to canonical NCPs12 and might accommodate the passage of RNA polymerase. Double-variant NCPs are enriched over the TSS and present at relatively low abundance near the TTS, and both levels are correlated with transcriptional level (). Some of these particles are also present in gene bodies, but at quite low concentrations (see Supplementary Fig. 9 online). The presence of these NCPs over transcribed regions might facilitate chain elongation of Pol II and/or the rapid loss of nucleosomes over some gene bodies, perhaps of immediately inducible genes, a phenomenon seen at Hsp70.
Different combinations of histone variants have distinctive distribution patterns across genes.
As we show here, the distribution of the unstable double-variant NCPs is distinct and quite different from the distributions of NCPs carrying either H3.3 or H2A.Z alone. The ‘nucleosome-free region’ of active promoters is likely to be occupied to a considerable extent by the labile H3.3/H2A.Z NCPs (Supplementary Note online). These unstable NCPs could serve as ‘place holders’ to prevent the region from being covered by adjacent, more stable (canonical) NCPs and/or nonspecific factors, as might occur if the region were completely free of nucleosomes. At the same time, because of their relative instability, the H3.3/H2A.Z NCPs could more easily be displaced by transcription factors. Our results suggest a new model for the chromatin structure at vertebrate promoters and other regulatory sites, in which the site cycles dynamically between occupancy by these unstable nucleosomes, by transcription factors, or perhaps by some canonical nucleosomes if the site is temporarily silent or its histones have not yet been replaced by variant histones after replication. For some small fraction of the time the site may also be vacant, during the period in which these components are exchanging places (). Which of these states is detected will depend on the measurement method, but they are all part of the promoter structure, in which the double-variant H3.3/H2A.Z NCP appears to play an important role.
Schematic representation of the dynamic exchange of factors at a transcriptionally active TSS or other regulatory elements.
Each combination of histone variants gives rise to a distinct and characteristic nucleosome stability31. It is not yet clear which of these differences in stability arise from differences in amino acid composition, and which are caused by a combination of histone modifications unique to each variant. Our present results clearly show, however, that each variant or combination of variants has a highly specific pattern of distribution in vivo, suggesting that these differences in stability are elaborately exploited in the regulation of gene expression.