By comparing actual in vivo nucleosome positions to our predicted or experimentally measured intrinsically encoded positions, we can test whether in vivo positions are dictated by the genomic sequence. To this end, we used five different approaches. First, we measured the distance between our predicted stable nucleosome positions (stability probability ≥0.2; see Methods) and 99 experimentally mapped nucleosome positions at 11 loci²¹⁻²⁸ (Supplementary Fig. 11). There is some disagreement between different experimental measurements of nucleosome positions (Supplementary Fig. 12), hence discrepancies between our predictions and literature reports are attributable to inaccuracies both in our model and in the literature. Even so, six loci showed substantial correspondence (Supplementary Figs 13-22). Overall, 54% of our predicted stable nucleosomes were within 35 bp of the literature positions, significantly more than the 39 ± 1% expected by chance (P < 10⁻¹⁶).
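The distance comparison described above can be illustrated with a minimal Python sketch. It assumes that positions are integer nucleosome-centre coordinates on a single locus, and it estimates the chance expectation by placing the same number of predictions uniformly at random; the null model actually used is described in Methods and may differ. The function names, the `locus_length` parameter and the toy coordinates are hypothetical.

```python
import bisect
import random


def fraction_within(predicted, reference, max_dist=35):
    """Fraction of predicted positions within max_dist bp of any reference position."""
    ref = sorted(reference)
    if not predicted or not ref:
        return 0.0
    hits = 0
    for pos in predicted:
        i = bisect.bisect_left(ref, pos)
        # The nearest reference position is either ref[i] or ref[i - 1].
        neighbours = ref[max(i - 1, 0):i + 1]
        if any(abs(pos - r) <= max_dist for r in neighbours):
            hits += 1
    return hits / len(predicted)


def chance_fraction(n_predicted, reference, locus_length,
                    max_dist=35, trials=1000, seed=0):
    """Mean and s.d. of the matched fraction for uniformly random placements.

    Assumption: uniform placement is only an illustrative null model.
    """
    rng = random.Random(seed)
    fractions = []
    for _ in range(trials):
        randomized = [rng.randrange(locus_length) for _ in range(n_predicted)]
        fractions.append(fraction_within(randomized, reference, max_dist))
    mean = sum(fractions) / trials
    variance = sum((f - mean) ** 2 for f in fractions) / trials
    return mean, variance ** 0.5


# Toy coordinates, not data from this study.
predicted = [100, 310, 480, 900]
reference = [120, 300, 700]
observed = fraction_within(predicted, reference)
expected, sd = chance_fraction(len(predicted), reference, locus_length=1000)
print(f"observed {observed:.2f}, chance {expected:.2f} ± {sd:.2f}")
```

Sorting the reference positions once and using a binary search keeps each lookup logarithmic, which matters when the same comparison is repeated over many randomized placements.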
Second, we compared our predictions to three genome-wide measurements of nucleosome positions at low²⁹,³⁰ or higher³¹ resolution. Our model showed significant correspondence to these experiments, predicting lower occupancy at nucleosome-depleted (low nucleosome abundance) coding or intergenic regions²⁹,³⁰ (Supplementary Figs 23-25; 68% of 57 depleted coding regions and 76% of 294 depleted intergenic regions had predicted low occupancy compared with 30% (P < 10⁻⁶) and 56% (P < 10⁻⁹), respectively, expected by chance). The model also showed strong correspondence with the higher resolution nucleosome map³¹: 45% of our predicted stable nucleosomes were within 35 bp of experimentally determined nucleosome positions³¹ compared with 32 ± 1% expected by chance, P < 10⁻¹⁵ (Supplementary Figs 26 and 27). Notably, our predictions also match closely the stereotyped chromatin organization at Pol II promoters as revealed by the higher resolution nucleosome map³¹, and the most stable nucleosome predicted by our model at promoters is located precisely (within 8 bp) where stable nucleosomes containing the histone variant H2A.Z are located in vivo³².
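One simple way to attach a P value to an enrichment of this kind is an exact one-sided binomial tail, sketched below in Python. This is an illustration under stated assumptions, not necessarily the test used here: the section does not name the test, and the counts (39 of 57 and 223 of 294) are rounded from the reported percentages. The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly term by term."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed counts, rounded from the reported percentages:
# 68% of 57 coding regions -> 39; 76% of 294 intergenic regions -> 223.
p_coding = binomial_upper_tail(39, 57, 0.30)
p_intergenic = binomial_upper_tail(223, 294, 0.56)
print(f"coding: {p_coding:.2e}, intergenic: {p_intergenic:.2e}")
```

The binomial null treats each depleted region as an independent draw with the chance rate as its success probability; a permutation-based null would relax that assumption at higher computational cost.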
Third, we compared the yeast model predictions to those of a model constructed independently using only nucleosome-bound sequences from chicken. The predictions of the chicken model when applied to the yeast genome correlated strongly with those of the yeast model (Supplementary Fig. 28) and with the genome-wide experimental measurements of nucleosome occupancy at yeast coding and intergenic regions²⁹⁻³¹: 35% of 57 depleted coding regions and 72% of 294 depleted intergenic regions had predicted low occupancy compared with 4% (P < 10⁻⁴) and 53% (P < 10⁻⁸), respectively, expected by chance.
Fourth, we carried out a new selection for nucleosome formation on yeast genomic DNA in vitro. This experiment directly reveals intrinsically encoded, individual high-affinity nucleosome positions. These in vitro nucleosome locations overlap significantly with our in vivo yeast nucleosome collection: 32% of 339 selected in vitro nucleosomes overlap the in vivo bound sequences, compared with 5% expected by chance (P < 10⁻⁵). The in vitro selected nucleosomes are particularly enriched in intergenic regions that have a high predicted nucleosome occupancy, compared with random genomic locations and with locations immediately upstream or downstream of the selected nucleosomes (P < 10⁻³; Supplementary Figs 29 and 30).
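The overlap statistic quoted above can be sketched in Python as the fraction of query intervals that share at least one base pair with any reference interval. The sketch assumes half-open `(start, end)` coordinates on a single sequence and counts any shared base as an overlap; the overlap criterion actually applied is not stated in this section. The function names and toy intervals are hypothetical.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]


def fraction_overlapping(query, reference):
    """Fraction of query intervals overlapping at least one reference interval."""
    if not query:
        return 0.0
    hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return hits / len(query)


# Toy 147-bp intervals, not data from this study.
in_vitro = [(0, 147), (500, 647), (1000, 1147)]
in_vivo = [(100, 247), (2000, 2147)]
print(fraction_overlapping(in_vitro, in_vivo))
```

The all-against-all comparison is quadratic, which is adequate for a few hundred intervals; larger collections would call for sorted intervals or an interval tree.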
Finally, we experimentally tested whether our highest occupancy predictions are highly occupied by nucleosomes in vivo, by measuring their in vivo nucleosome occupancies and comparing them to the occupancies at three nucleosome sites flanking the GAL1-10 and PHO5 promoters for which the nucleosome positions are known. Five of the eight predictions tested yielded in vivo occupancies comparable to or greater than those of the known nucleosome positions, indicating that ~60% of the intrinsically high-occupancy nucleosome sites on the DNA sequence are strongly occupied in vivo. In 10 out of 11 cases, these predicted nucleosome positions also had higher occupancy than regions 73 bp (one-half the length of a nucleosome) upstream or downstream from the predicted position (Supplementary Figs 31 and 32).
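The flank comparison can be illustrated with a short Python sketch that checks whether occupancy at a predicted position exceeds the occupancy 73 bp to either side. It assumes a per-base-pair occupancy profile stored as a list and requires the centre to exceed both flanks, which is one reading of "upstream or downstream"; the experimental measurements themselves were made at discrete sites, so this is only a schematic. The function name and the synthetic profile are hypothetical.

```python
def exceeds_flanks(occupancy, center, offset=73):
    """True if occupancy at center is higher than at both center - offset and center + offset."""
    left, right = center - offset, center + offset
    if left < 0 or right >= len(occupancy):
        raise ValueError("flanking positions fall outside the profile")
    return occupancy[center] > occupancy[left] and occupancy[center] > occupancy[right]


# Synthetic triangular profile peaking at position 150, not measured data.
profile = [max(0.0, 1 - abs(i - 150) / 100) for i in range(300)]
print(exceeds_flanks(profile, 150))
print(exceeds_flanks(profile, 77))
```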
Taken together, these results show that ~50% of the in vivo nucleosome organization can be explained solely by the sequence preferences of nucleosomes. Moreover, these results indicate that the nucleosome depletions observed at coding and intergenic regions²⁹⁻³¹ are attributable in part to unstable nucleosomes (that is, positions on the DNA sequence that nucleosomes have a low probability of occupying) encoded in these regions.