Nucleosomes are an essential component of eukaryotic chromosomes. The impact of nucleosomes is seen not just on processes that directly access the genome such as transcription, but also on an evolutionary timescale. Recent studies in a number of organisms have provided high-resolution maps of nucleosomes throughout the genome. Computational analysis, in conjunction with many other kinds of data, has shed light on several aspects of nucleosome biology. Nucleosomes are positioned by several means, including intrinsic sequence biases, stacking against a fixed barrier, DNA-binding proteins and chromatin remodelers. These studies underscore the critical organizational role of nucleosomes in all eukaryotic genomes. Here, I review recent genomic studies that shed light on the determinants of nucleosome positioning and their impact on the genome.
Nucleosomal chromatin is a hallmark of all eukaryotic genomes. The linear chromosome comprises DNA periodically wrapped nearly twice in left-handed turns around histone octamers to form nucleosomes containing about 147 bp DNA, separated by shorter linker DNA. Visualized by electron microscopy or atomic force microscopy, this primary level of chromosome structure appears similar to a string of beads. Additional levels of compaction are possible, and the chromosome may associate with other proteins and nuclear structures to demarcate topological and regulatory domains, but the primary nucleosomal structure is nonetheless retained.
Nucleosomes serve three primary functions as components of chromosomes. First, they provide some measure of packaging and stabilize the negative supercoiling of genomic DNA in vivo [2, 3]. Without the periodic toroidal coiling of the DNA in nucleosomes, eukaryotic chromosomes risk becoming a mess of plectonemic helices. Second, the histones in the nucleosome can be posttranslationally modified or replaced by histone variants, providing for an additional, epigenetic layer of information to be associated with the genome. Epigenetic modifications can affect the packaging of chromatin and guide the interactions of trans acting proteins with the genome [4, 5]. Third, nucleosomes can directly regulate the access of trans acting factors to functional elements on chromosomes by virtue of how they are positioned relative to these elements.
Studies stretching back more than two decades have shaped the idea that nucleosomes, by impeding access of the transcription machinery to genomic DNA, are a general inhibitory influence on transcription [6–10]. A cornerstone of this idea is that a sequence occupied by a nucleosome is refractory to binding by other factors. In this view, knowing the positions of nucleosomes relative to gene features and understanding the mechanisms that govern nucleosome positioning and remodeling are of paramount importance for understanding transcriptional regulation in the context of chromatin. The advent of high-resolution tiling microarrays and deep sequencing has spurred interest in examining nucleosome positions and their determinants. This article reviews recent work using genomic approaches that casts new light on the long-standing question of nucleosome positioning and its biological consequences.
Genomic analysis of nucleosome positions involves digesting chromatin with micrococcal nuclease (MNase), an enzyme that preferentially cuts linker DNA, isolating the mononucleosomal fraction by size fractionation or salt extraction, and identifying the recovered DNA by hybridization to microarrays or sequencing. If identification of nucleosomes containing a histone variant or specific modification is desired, an immunoprecipitation step can be introduced. Computational analysis of the raw data (intensities in the case of microarrays, or normalized read counts from deep sequencing) yields nucleosome positions with an associated numerical score.
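The computational step described above can be illustrated with a minimal sketch. It assumes paired-end reads have already been aligned and reduced to fragment coordinates, and it uses a simple greedy peak caller; the function names, the smoothing window and the score threshold are illustrative assumptions, not the method of any study cited here.

```python
from typing import List, Tuple

NUCLEOSOME_BP = 147  # approximate length of DNA in one nucleosome core


def dyad_counts(fragments: List[Tuple[int, int]], chrom_len: int) -> List[int]:
    """Count inferred dyads (fragment midpoints) at each base.

    Fragments are (start, end) pairs, 0-based and half-open.
    """
    counts = [0] * chrom_len
    for start, end in fragments:
        mid = (start + end) // 2
        if 0 <= mid < chrom_len:
            counts[mid] += 1
    return counts


def occupancy(dyads: List[int], footprint: int = NUCLEOSOME_BP) -> List[int]:
    """Per-base occupancy: number of inferred nucleosomes covering each base."""
    n = len(dyads)
    half = footprint // 2
    diff = [0] * (n + 1)  # difference array, turned into coverage below
    for pos, count in enumerate(dyads):
        if count:
            diff[max(0, pos - half)] += count
            diff[min(n, pos + half + 1)] -= count
    occ, running = [], 0
    for i in range(n):
        running += diff[i]
        occ.append(running)
    return occ


def call_nucleosomes(dyads: List[int], smooth: int = 20, min_score: int = 3,
                     footprint: int = NUCLEOSOME_BP) -> List[Tuple[int, int]]:
    """Greedy calling: take the best-supported dyad, mask overlapping
    positions, and repeat. Returns sorted (position, score) pairs, where
    score is the number of dyads within +/- `smooth` bp of the position.
    """
    n = len(dyads)
    prefix = [0]
    for count in dyads:
        prefix.append(prefix[-1] + count)
    scores = [prefix[min(n, i + smooth + 1)] - prefix[max(0, i - smooth)]
              for i in range(n)]
    # Ties in the smoothed score are broken by the raw dyad count.
    order = sorted(range(n), key=lambda i: (-scores[i], -dyads[i], i))
    blocked = [False] * n
    calls = []
    for i in order:
        if scores[i] < min_score:
            break
        if blocked[i]:
            continue
        calls.append((i, scores[i]))
        for j in range(max(0, i - footprint + 1), min(n, i + footprint)):
            blocked[j] = True
    return sorted(calls)
```

The score returned here is only a local read count; published pipelines typically normalize for sequencing depth and may model MNase sequence bias, which this sketch does not attempt.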
The trend of mapping nucleosome locations comprehensively began with a tiling microarray analysis of chromosome III of S. cerevisiae at 20 bp resolution. A whole-genome nucleosome map at 4 bp resolution, also using S. cerevisiae and tiling arrays, was followed by a spate of studies using emerging deep sequencing technologies, which allowed nucleosome maps to be made at nominally single-nucleotide resolution. These studies cover many model organisms and human (see Table 1 in ), and address some basic questions regarding nucleosome organization.
First, what is the nucleosome structure of promoters? The average nucleosome profile shows that the promoter is relatively devoid of nucleosomes compared to the surrounding regions. Assays with varying MNase treatments indicate that promoters could contain particularly fragile or unstable nucleosomes, but are nonetheless functionally depleted relative to the surrounding region of the chromosome [15, 16]. This nucleosome-depleted region (NDR) in the promoter is a characteristic in all organisms studied so far (in humans, seen primarily in active promoters) and corresponds to the site of assembly of the transcription initiation complex [17, 18]. Second, how are nucleosomes positioned relative to one another? All in vivo studies reveal arrays of nucleosomes spaced approximately 180–200 bp apart, confirming on a genomic scale the string-of-beads view obtained with individual molecules. Third, do nucleosomes occur at the same absolute positions in the chromosomes in all cells? In yeast, most nucleosomes appear to be stereotypically positioned to within approximately 20–30 bp in most cells in the population. This holds true across studies from many labs, experimental platforms and even across growth conditions that lead to changes in gene expression [19–21]. In metazoa and Arabidopsis, most nucleosome positions are more variable in different cells in the population, with the exception of the +1 nucleosome downstream of the promoter NDR [20, 22–24]. Fourth, what governs the absolute positions of nucleosomes? The genome-wide nucleosome maps allow us to examine different factors governing the location of nucleosomes on the chromosome, as considered further below.
Well before the genomic era, it was recognized that the sequence of DNA affects its ability to form nucleosomes and could influence where nucleosomes occur in chromosomes [25, 26]. Sequences that favor the formation of nucleosomes tend to contain the dinucleotide WW (W is A or T) repeated every 10 bp, offset by 5 bp from a similarly repeating dinucleotide SS (S is C or G) [27, 28]. A model based on this repeating dinucleotide bias, combined with the preferences of nucleosomes for different 5-mers, can predict the in vivo occupancy of nucleosomes fairly accurately in yeast (74%) and less so in C. elegans (60%). A simpler model based on GC content performs similarly in predicting nucleosome positions in yeast. In Drosophila nucleosome maps, the DNA sequence preference is modest and a model based on the WW dinucleotide preference does not predict most nucleosome locations, but a model based on GC dinucleotides predicts the +1 nucleosome location.
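The two sequence features mentioned above, GC content and the roughly 10 bp periodicity of WW dinucleotides, are simple to compute. The sketch below is only an illustration of those features and is not the predictive model of any cited study; the function names and the use of a single Fourier component as the periodicity score are assumptions made for this example.

```python
import math
from typing import List


def gc_content(seq: str) -> float:
    """Fraction of G or C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def ww_positions(seq: str) -> List[int]:
    """Start positions of WW dinucleotides (W is A or T)."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1)
            if seq[i] in "AT" and seq[i + 1] in "AT"]


def ww_periodicity(seq: str, period: float = 10.0) -> float:
    """Strength of the `period`-bp periodic component of WW occurrences.

    Each WW position is mapped to a phase on the unit circle; the score is
    the length of the mean vector. It ranges from 0 (no preferred phase)
    to 1 (every WW dinucleotide at the same phase).
    """
    positions = ww_positions(seq)
    if not positions:
        return 0.0
    real = sum(math.cos(2 * math.pi * p / period) for p in positions)
    imag = sum(math.sin(2 * math.pi * p / period) for p in positions)
    return math.hypot(real, imag) / len(positions)
```

For example, a sequence built by repeating `AAGCGCGCGC` has a WW dinucleotide exactly every 10 bp and scores close to 1, whereas a poly(dA) tract of 100 bases, which has WW at every position, scores close to 0.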
The existence of sequence preferences of nucleosomes led to the idea that the in vivo nucleosome map is intrinsically encoded in the DNA sequence of the genome [19, 30]. The first studies comparing maps of in vitro assembled nucleosomes (with positions depending primarily on DNA sequence) with the corresponding in vivo maps were done simultaneously by independent groups, both using yeast, and they generated similar data but arrived at different conclusions. Based on the correlation between the in vitro and in vivo maps, Kaplan et al. favored the idea that genomic nucleosome locations in vivo are intrinsically encoded in the DNA sequence, but Zhang et al. maintained that nucleosome positions in vivo were not significantly determined by intrinsic DNA-nucleosome interactions. The ensuing debate [26, 32–34] can fortunately be resolved without too much difficulty. As suggested in subsequent analyses, a key point is the difference between nucleosome occupancy, which measures the extent to which part of a nucleosome occurs over a given position, and translational nucleosome positioning, which measures the extent to which, when a nucleosome is present, it is aligned relative to a given position. Correlation between the in vitro and in vivo nucleosome maps was high (on the order of 70%) when occupancy was measured, but far lower (15–20%) when positioning was specifically considered [20, 26, 27]. However, since the alternating WW and SS dinucleotides and the GC content strongly influence the bendability of DNA, it is likely that this type of DNA sequence preference captured in the models governs the rotational positioning of nucleosomes, namely which points of the double helix face towards the histone core at a given translational position.
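The distinction between occupancy and translational positioning can be made concrete with a toy calculation. The definitions below are one possible formalization chosen for illustration (the tolerance of 10 bp and the function names are assumptions), not the exact metrics used in the cited analyses.

```python
from typing import List

NUCLEOSOME_BP = 147  # approximate length of DNA in one nucleosome core


def occupancy_at(dyads: List[int], pos: int,
                 footprint: int = NUCLEOSOME_BP) -> int:
    """Number of nucleosomes covering `pos`: dyads within half a footprint."""
    half = footprint // 2
    return sum(dyads[max(0, pos - half): min(len(dyads), pos + half + 1)])


def positioning_at(dyads: List[int], pos: int, tol: int = 10,
                   footprint: int = NUCLEOSOME_BP) -> float:
    """Of the nucleosomes covering `pos`, the fraction whose dyad lies
    within +/- `tol` bp of `pos`. Values near 1 indicate a well-positioned
    nucleosome; low values indicate fuzzy placement.
    """
    covering = occupancy_at(dyads, pos, footprint)
    if covering == 0:
        return 0.0
    near = sum(dyads[max(0, pos - tol): min(len(dyads), pos + tol + 1)])
    return near / covering
```

With ten dyads all at position 100, both occupancy and positioning at 100 are maximal. With ten dyads spread evenly between positions 64 and 136, occupancy at 100 is unchanged but positioning falls to 0.2. Two maps can therefore correlate well in occupancy while agreeing poorly in positioning.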
Consistently phased nucleosomal arrays, as seen most strikingly in yeast, can be explained by the so-called barrier model or statistical positioning. Here, a strong positioning signal anchors one nucleosome at some absolute position, and other nucleosomes are stacked against it to form the characteristic arrays. The intervals between nucleosomes in the array could be determined by steric considerations and chromatin remodelers, but the array would become more irregular with increasing distance from the fixed barrier. Experimental and computational analysis in yeast supports this view [36, 37].
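The loss of regularity with distance from the barrier can be seen in a small simulation. This is a sketch under strong simplifying assumptions: nucleosomes are treated as non-overlapping 147 bp rods laid down one after another from a fixed wall, and linker lengths are drawn independently from an exponential distribution with a mean of 30 bp. The linker distribution, its mean and all names are assumptions for illustration and are not taken from the cited studies.

```python
import random
import statistics
from typing import List


def simulate_array(n_nucleosomes: int, rng: random.Random,
                   footprint: int = 147, mean_linker: float = 30.0,
                   barrier: float = 0.0) -> List[float]:
    """Dyad positions for one simulated fibre, stacked against a barrier."""
    dyads = []
    edge = barrier  # downstream edge of the barrier or previous nucleosome
    for _ in range(n_nucleosomes):
        linker = rng.expovariate(1.0 / mean_linker)  # random linker length
        start = edge + linker
        dyads.append(start + footprint / 2)
        edge = start + footprint
    return dyads


def positional_spread(n_cells: int = 2000, n_nucleosomes: int = 6,
                      seed: int = 0) -> List[float]:
    """Standard deviation of each nucleosome's dyad position across cells.

    Index 0 is the nucleosome next to the barrier.
    """
    rng = random.Random(seed)
    cells = [simulate_array(n_nucleosomes, rng) for _ in range(n_cells)]
    return [statistics.pstdev([cell[k] for cell in cells])
            for k in range(n_nucleosomes)]
```

Because each position is the sum of independent linker lengths, the spread across cells grows with the index of the nucleosome in the array, which is the qualitative behaviour the barrier model predicts: nucleosomes near the barrier appear well positioned in a population average, and the pattern blurs further away.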
What determines the position of the barrier? Intrinsic sequence features, distinct from the alternating dinucleotide preference underlying nucleosome occupancy, could still play a role. In S. cerevisiae, one component of the barrier is constituted by the NDR at the promoter. Yeast promoters are AT rich and such sequences, in particular poly(dA:dT), which is prevalent in promoters, are refractory to nucleosome formation. Nucleosomes can form adjacent to the promoter NDR in similar relative positions. The barrier is thus initiated in part by omission of nucleosomes. The NDR is evident in in vitro assembled nucleosomes [31, 38], confirming that this is an intrinsic factor, but it is not sufficient to generate the consistently positioned arrays observed in vivo. The other component of the barrier appears to be the strongly positioned +1 nucleosome downstream of the NDR. This nucleosome, which tends to contain the histone H2A.Z variant, appears to be positioned in part by intrinsic sequence preferences but primarily by the binding of the transcription machinery [15, 20, 26]. Thus in S. cerevisiae, the combination of the intrinsically defined NDR and the +1 nucleosome together could constitute the anchor against which the downstream nucleosomal array is stacked (Fig. 1A). In S. pombe, the NDR appears to be less intrinsic, and nucleosome-depleted promoters as well as the positioned +1 nucleosome both seem to be dependent on transcription. Since human promoters and regulatory elements appear to contain sequences conducive to nucleosome occupancy, it is possible that extrinsic factors are more important in determining nucleosome positions in other organisms (Fig. 1B).
Another intrinsic factor in nucleosome barrier positioning in humans is suggested by a recent study which also compared nucleosome positioning in vivo and in vitro. Nucleosomes that were strongly positioned in vitro tend to contain G/C nucleotides whose frequency peaks at the center of the nucleosome dyad. With increasing stringency of positioning, the flanking regions also contain an increasing amount of AA/TT dinucleotides. Thus a nucleosome-favoring sequence flanked by nucleosome-deterring sequences forms a 'container' site, which traps nucleosomes at specific locations more strongly than either sequence alone (Fig. 1B). In granulocytes and T-cells, such intrinsically defined 'container' sites function as barriers, with regular arrays of nucleosomes on either side, distinct from the NDR-anchored arrays in yeast which occur primarily in the direction of transcription.
Binding of sequence-specific transcription factors can also form barriers. In yeast, the ubiquitous transcription factors Abf1, Rap1, and Reb1, which have binding sites near promoters [27, 42], as well as a condition-specific factor, Gal4, at the GAL1-10 promoter, can contribute to the formation of the NDR and to barrier function. In human cells, the insulator binding protein CTCF [24, 43, 44] and the transcription factor NRSF can initiate phased nucleosome arrays, suggesting that strong binding of sequence-specific factors might be a common means of generating a nucleosome barrier and thereby phased arrays (Fig. 1B).
A striking difference between in vitro and in vivo nucleosome maps is the absence of periodically positioned and phased nucleosomal arrays in the former [19, 24, 31]. Clearly the presence of a barrier element like the NDR in yeast, or container sites in human, while required by the barrier/statistical positioning model, is not sufficient for nucleosome spacing as observed in vivo. ATP-dependent chromatin remodelers, in particular those belonging to the ISW family of remodelers, which are known to establish regular nucleosome spacing [45, 46], are good candidates for the missing activity. Two recent studies address this possibility.
The first study sought to determine what biochemical factors could convert in vitro assembled nucleosomal structures to something resembling the in vivo map. The authors assembled purified Drosophila histones into nucleosomes using a library of large yeast genomic DNA fragments in vitro. While the NDR was observed, there were no regularly phased nucleosomal arrays. Addition of yeast whole cell extract did not change the picture, indicating that simple binding of a yeast protein is not sufficient either. However, addition of ATP to the in vitro nucleosomes plus whole cell extract generated a remarkable likeness of the in vivo map, with positioned nucleosomes emanating from the promoter NDR and proceeding downstream. While ATP-dependent processes such as phosphorylation could not be ruled out, this result is most consistent with an ATP-dependent chromatin remodeler being responsible for the well-spaced nucleosome array. The data also called into question a role for RNA polymerase or transcription in defining the promoter barrier, but it is not clear why such nucleosome arrays are largely unidirectional and transcriptionally downstream of the promoter.
A second study adopted a genetic approach to the same question. Deletion of the chromatin remodelers ISW1, ISW2 and CHD1 together was known to result in synthetic phenotypes, suggesting some redundancy. Deletion of the individual remodeler genes did not dramatically affect overall nucleosome positioning, but deletion of all three or even just ISW1 and CHD1 together resulted in a significant reduction in the regular array of nucleosomes downstream of the promoter. Interestingly, the +1 nucleosome was still maintained to some extent in the absence of the three chromatin remodelers, as was the NDR. Possibly, other chromatin remodelers such as the RSC or SWI/SNF complexes could be involved. Indeed, RSC is required for normal positioning of the nucleosomes flanking the NDR. These results show that the activities of ATP-dependent chromatin remodelers in vivo are important for maintaining the regular arrays of well-positioned nucleosomes, likely overcoming sequence-dependent effects (Fig. 1A).
A rigorous evaluation of the idea that nucleosomes generally inhibit transcriptional processes in vivo is somewhat complicated. An early test in yeast showed that only 15% of all genes showed higher RNA levels following nucleosome depletion, but surprisingly, 10% of genes showed significantly lower RNA levels. A more recent study in yeast and mammalian cells yielded similar results. Normal nucleosome levels depend on high-mobility-group proteins, including HMGB1 in mammals and Nhp6 in yeast. HeLa cells containing reduced HMGB1 showed a 30% increase in overall RNA levels, but only 13% of genes were significantly affected, of which about half were downregulated due to reduced nucleosomes. Similar results in the yeast nhp6 mutant indicate that nucleosomes are not universally inhibitory and can even promote transcription, so their effects are gene-specific.
In vivo, a substantial fraction of a transcription factor's motifs are not occupied by the factor, and this is attributed in part to impedance by nucleosomes (Fig. 2A). Yet many transcription factors can successfully compete with nucleosomes for occupancy of their binding sites in vivo. For example, at the yeast CLN2 promoter, binding of Mcm1, Reb1 and Rsc3 precludes the formation of nucleosomes at a sequence that is highly conducive to nucleosome formation, and allows SBF to bind. In the GAL1-10 promoter, the binding of RSC partially unwraps a nucleosome and allows Gal4 to activate transcription. In mouse liver, Foxa2 binds equally well to nucleosomal and non-nucleosomal sites, and in fact occupies the former simultaneously with nucleosomes. Indeed, factors such as FoxA, GATA, PU.1, glucocorticoid receptor, TFE3 and many others have been proposed to be 'pioneer factors' by virtue of their ability to bind nucleosomal DNA, underscoring the fact that inhibition of transcription factor binding by nucleosomes is by no means absolute.
Nucleosomes also form roadblocks in the way of RNA polymerase as it proceeds along the template and could affect transcription elongation. Single-molecule experiments reveal that nucleosomes indeed cause slowing down and pausing of RNA polymerase, but even without accessory factors, elongation through nucleosomes evidently occurs, with nucleosomes being passed back behind the advancing polymerase through the formation of DNA loops. In vivo measurements of nucleosome dynamics are consistent with this action. The best evidence for nucleosomes presenting a barrier to elongation in vivo comes from deep sequencing of the 3' ends of nascent RNA associated with the elongating polymerase. Relative to the known positions of nucleosomes, the highest density of polymerase pausing occurs just before the nucleosome dyad axis at early nucleosomes in the template. These peaks of pausing were most obvious in a strain deleted for a component of the TFIIS elongation factor. But while pausing, particularly at nucleosomes, is evident, it is harder to establish whether RNA levels would in fact be higher in the absence of nucleosomes, and whether there is any regulatory mechanism that overcomes the nucleosomal barrier to promote transcription.
How important is the positioning of nucleosomes as seen in vivo? Yeast cells lacking many of the major chromatin remodelers that regulate nucleosome positioning are perfectly viable in a variety of growth conditions, and even when the regular arrays of nucleosomes are grossly disrupted in a triple mutant, cells are viable and transcription is remarkably unaffected. However, transcription initiation from cryptic sites occurs in the absence of ISW2, and in the absence of CHD1 and ISW1, suggesting that rather than inhibiting normal transcription, one benefit of the ordered arrays of nucleosomes might be to inhibit such cryptic initiation from internal sites (Fig. 2B).
Nucleosomes have been part of the genome since early in the evolutionary history of eukaryotic chromosomes. In fermentative yeasts such as S. cerevisiae, which underwent a whole-genome duplication about 100 million years ago, the promoters of genes involved in aerobic growth intrinsically encode a relatively closed nucleosomal organization, but in aerobic yeasts such as C. albicans, these promoters encode a relatively open configuration [57, 58]. Correspondingly, differences in nucleosome maps between related yeast species were found to be largely due to nucleosome-disfavoring AT-rich or homopolymeric sequences in promoters [59, 60]. Searching for sequences that are refractory to nucleosome occupancy in vivo but not in vitro has allowed the identification of trans-acting factors that can presumably evict and/or position nucleosomes. Interestingly, such factors are different in different yeast species, for example Reb1 and Rsc3 in S. cerevisiae, Cbf1 in C. albicans and Sap1 in S. pombe. Differences in nucleosome maps between yeast species can be related to differences in transcription factor binding and differences in gene expression (such as binding of MBF and cell-cycle gene regulation), but only a subset of patterning differences between species correspond to changes in regulation [61, 62].
In polymorphic medaka (Japanese killifish) strains, the rate of single nucleotide polymorphisms (SNPs) is high in nucleosome core regions and low in the linker DNA, but insertions/deletions (indels) > 1 bp are conversely enriched in the linkers and depleted in the nucleosome core, mirroring earlier studies in yeast. Further analysis in human showed that while SNPs are enriched over the core region of bulk nucleosomes, they are not enriched over nucleosomes containing the H2A.Z variant or histone H3 trimethylated at lysine 4 (H3K4me3), likely related to the fact that these epigenetically modified nucleosomes tend to occur in active promoters and the sequences underlying them are under selective pressure. Fine-grained directional analysis of interspecies substitutions in the human genome (relative to chimpanzee) revealed that A/T to G/C changes, which would favor nucleosome occupancy, are in fact enriched in the core, whereas the reverse is somewhat disfavored. While the exact mechanisms underlying these differences are not completely understood, nucleosome positions clearly affect genetic variation, with likely implications for the evolution of functional elements in the genome.
The ability of deep sequencing to easily generate high-resolution datasets has driven much of the recent interest in nucleosome positioning. The majority of these studies, however, use yeast and related organisms, and the coverage of other organisms is relatively sparse. Yeast has a compact genome and contains nucleosome-disfavoring sequences in promoters, which strongly influence the nucleosomal map. Some caution is therefore warranted in extrapolating inferences made from yeast to other organisms. The relative contributions of sequence biases, transcription factors and chromatin remodelers are worth examining in other organisms, and in different cell types in these organisms.
The role of histone modifications and variants is an area of intense investigation [4, 5]. Much has been learnt through the use of ChIP, but with some exceptions [66–68], it is typically not coupled to a nucleosomal assay based on MNase and deep enough sequencing to map these epigenetic marks and their combinations at truly single-nucleosome resolution. Rigorous experimental tests of cause and effect relationships between nucleosome positions, epigenetic modifications, the binding of sequence-specific proteins, and processes like transcription, replication, repair and recombination still need to be performed in many systems.
One area that can be relatively opaque to standard nucleosome analysis is heterochromatin, especially in higher eukaryotes. These regions can be compact and inactive, underrepresented in the typical mononucleosomal fraction, and also contain repeat sequences which impair mapping of short sequence reads. It is important to examine how heterochromatin differs from euchromatin and how it varies across cell types and developmental stages. How nucleosomal structure relates to higher-order chromatin structure is still largely unknown. Many technical challenges remain, but the lowered costs of deep sequencing, improved means of genetic manipulation in many organisms, and the increasing accessibility of these methods and computational tools bode well for further studies of nucleosome positioning in all areas of chromatin biology.
Work in the Iyer lab is supported by grants from the National Institutes of Health (CA095548, CA130075, HG004563) and the Cancer Prevention and Research Institute of Texas (CPRIT RP120194). I apologize to investigators whose primary reports could not be cited due to space considerations.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.