Dynamic expression of the genome requires coordinated binding of chromatin factors and enzymes that carry out genome-templated processes. Until recently, the molecular mechanisms governing how these factors and enzymes recognize and act on the fundamental unit of chromatin, the nucleosome core particle, have remained a mystery. A small, yet growing set of structures of the nucleosome in complex with chromatin factors and enzymes highlights the importance of multivalency in defining nucleosome binding and specificity. Many such interactions include an arginine anchor motif, which targets a unique acidic patch on the nucleosome surface. These emerging paradigms for chromatin recognition will be discussed, focusing on several recent structural breakthroughs.
The eukaryotic genome is organized in a polymeric complex of protein and nucleic acid called chromatin. The repeating unit of chromatin is the nucleosome core particle separated from each adjacent nucleosome core particle by intervening extranucleosomal linker DNA. The 200 kD nucleosome core particle complex is equally distributed in molecular weight between protein and DNA: the two copies each of histones H2A, H2B, H3 and H4 constitute about 100 kD and the 145-147 bp of DNA the other 100 kD. The nucleosome is a disk shaped complex approximately 100 Å in diameter, but it is far from a featureless disk [1•]. Instead, the nucleosome offers a rich diversity of surfaces for interactions with chromatin proteins and enzymes.
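The roughly equal split of mass between protein and DNA can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only: the figure of about 650 Da per base pair is a common rule of thumb and is an assumption here, not a value taken from the structures discussed, and the function name `dna_mass_kd` is hypothetical.

```python
# Rough mass estimate for nucleosomal DNA.
# ASSUMPTION: ~650 Da per base pair is a widely used approximation for
# double-stranded DNA; it is not a value reported in this review.
AVG_DA_PER_BP = 650.0


def dna_mass_kd(n_bp: int, da_per_bp: float = AVG_DA_PER_BP) -> float:
    """Return the approximate mass, in kD, of n_bp base pairs of duplex DNA."""
    if n_bp < 0:
        raise ValueError("n_bp must be non-negative")
    return n_bp * da_per_bp / 1000.0


if __name__ == "__main__":
    # The nucleosome core particle wraps 145-147 bp of DNA.
    for n_bp in (145, 147):
        print(f"{n_bp} bp is roughly {dna_mass_kd(n_bp):.1f} kD")
```

Under this assumption, 145-147 bp of DNA comes to a little under 100 kD, consistent with the approximate 100 kD quoted for the DNA half of the particle.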
The nucleosome is often depicted in colors to highlight the individual histones and the nucleosomal DNA (Fig. 1a). It can be convenient to consider the nucleosome core particle as three distinct binding surfaces: a central disk of core histones with a diameter of about 65 Å, an outer shell of DNA, and histone tails that splay out from the central histone core (Fig. 1b). Each of these binding surfaces offers possibilities for specific recognition by chromatin proteins and enzymes. The central histone core displays nearly 27,000 Å² of accessible surface area, replete with nooks, crannies and bulges that provide complementary surfaces for interacting proteins. For example, the central histone disk is only 25 Å thick at the dyad where two histone H3 molecules come together, but 60 Å thick where the H2B αC helices protrude. This surface geometry is further enhanced by a varied charge distribution on the histone surface. The most prominent charge feature is an acidic patch created by the histone dimer of H2A and H2B (Fig. 1d), the target of the arginine anchor motif described below.
The nucleosome was long considered a repressive structure where histone proteins occluded transcription factor binding to DNA. However, much of the DNA on the outside of the nucleosome is solvent accessible and available for binding to chromatin factors. In what is unlikely to be a coincidence, the DNA grooves align across the DNA gyres (Fig. 2 middle panel) [1•, see also Fig. 1.5c in ref. 2], and it seems probable that at least some transcription factors or chromatin enzymes will bind across the gyres to interact with aligned major or minor grooves. This would increase a transcription factor’s DNA specificity since it would support high affinity binding only when the binding sites on the two gyres were spaced appropriately, and it would also promote binding to the nucleosome over naked DNA.
The histone tails that extend from the histone core comprise roughly 25% of the mass of the histones and are abundant in post-translational modifications that contribute to epigenetic regulation of gene activity [3,4]. Much has been discovered regarding how chromatin modules can specifically recognize particular post-translationally modified histone tail peptides [5,6].
It is also useful to view the nucleosome without coloring to distinguish between the individual components or parts. When so viewed, as in Fig. 1c, the nucleosome is a single entity, where histone and DNA atoms cannot be distinguished from each other. This reminds us that chromatin factors and enzymes are unlikely to interact with or recognize individual histone or DNA components of the nucleosome. Instead, the architecture of the entire nucleosome is likely to be recognized through binding of a combination of nucleosomal elements. Here we discuss themes in nucleosome recognition by chromatin factors and enzymes and highlight several recent structural breakthroughs.
Despite a wealth of structures detailing the molecular organization of the nucleosome core particle and the structural consequences of histone variants and DNA sequence, structures illustrating the mechanisms underlying nucleosome recognition by chromatin factors and enzymes have been more elusive. In fact, nearly a decade passed between the 2.8 Å structure of the nucleosome core particle in 1997 [1•] and the crystal structure of a peptide from the Kaposi’s sarcoma latency-associated nuclear antigen (LANA) anchored to the nucleosome [8•]. Rather than co-crystallization, this first structure of a macromolecule bound to the nucleosome was achieved through soaking of the LANA peptide into preformed nucleosome core particle crystals. The technical challenges associated with co-crystallization of a chromatin protein with the nucleosome core particle were overcome in 2010 with the RCC1 (Regulator of Chromosome Condensation)-nucleosome structure [9•,10]. This was followed by crystal structures of the BAH (Bromo-Associated Homology) domain of the yeast silencing protein Sir3 (Silent information regulator) in complex with the nucleosome [11•] and a peptide from the centromeric protein CENP-C bound to a centromeric nucleosome containing the CENP-A histone variant [12•]. More recently, two more crystal structures have been reported: that of the first chromatin enzyme, the Polycomb repressive complex 1 (PRC1) E2-E3 ubiquitylation module, bound to the nucleosome core particle [13••], and that of the chromatosome, containing the nucleosome core, linker DNA, and the globular domain of the linker histone H5 [14••].
Complementary structural techniques have also emerged to add to the current understanding of nucleosome recognition. Methyl-TROSY NMR has allowed the characterization and modeling of the nucleosome in complex with a peptide from HMGN2 (High Mobility Group Nucleosome-Binding Domain-Containing), the globular domain of linker histone H1 [16•], and the PWWP domain from PSIP1/LEDGF (PC4 and SFRS1-interacting protein/lens epithelium-derived growth factor) [17,18]. Such efforts built on the tour de force assignment of methyl resonances from histone Val, Leu, and Ile side chains. More recently, advances in single particle cryo-EM technology have been applied to generate a 7.8 Å structure of the prototype foamy virus intasome bound to the nucleosome core particle [19••].
A central theme emerging from the growing list of structures of the nucleosome bound to chromatin proteins and enzymes is multivalency. Rather than binding to a single component or surface of the nucleosome, chromatin proteins and enzymes often recognize multiple nucleosomal components, frequently across multiple and even non-contiguous nucleosomal surfaces. This allows structural and enzymatic specificities to be distributed to unique nucleosomal locations and emphasizes the importance of studying the nucleosome as an entity, and not simply as the sum of its histone and DNA parts. For example, RCC1 uses its β-propeller loops to bind to both histone and DNA components of the nucleosome [9•,20]. RCC1 anchors to the H2A/H2B acidic patch with its switchback loop and also binds across the major groove of nucleosomal DNA at SHL 6 (superhelical location) with two additional loops. Further contacts are likely made with the nucleosomal DNA at SHL 6.5 by the N-terminal tail of RCC1. [DNA superhelical locations (SHL) identify DNA major groove positions which face the histone octamer, extending from SHL0 at the dyad to SHL±7 near the sites of DNA entry/exit.]
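The SHL numbering defined above can be related to base-pair offsets from the dyad with a simple conversion. The following is a minimal sketch, assuming a uniform average of about 10.4 bp per helical turn (roughly 73 bp on each side of the dyad spread over seven turns); real nucleosomal DNA varies locally in twist, and the function names are hypothetical.

```python
# Convert between superhelical location (SHL) and base-pair offset from the dyad.
# ASSUMPTION: a uniform ~10.4 bp per helical turn; this is an illustrative
# average, not a value reported in this review.
BP_PER_TURN = 10.4
MAX_SHL = 7.0  # SHL runs from 0 at the dyad to +/-7 near DNA entry/exit


def shl_from_offset(offset_bp: float) -> float:
    """Approximate SHL for a signed base-pair offset from the dyad."""
    return offset_bp / BP_PER_TURN


def offset_from_shl(shl: float) -> int:
    """Approximate signed base-pair offset from the dyad for a given SHL."""
    if abs(shl) > MAX_SHL:
        raise ValueError("SHL must lie between -7 and +7 for the core particle")
    return round(shl * BP_PER_TURN)


if __name__ == "__main__":
    # Positions mentioned in the text: the dyad, and the RCC1 contacts at SHL 6 and 6.5.
    for shl in (0.0, 6.0, 6.5):
        print(f"SHL {shl:+.1f} is about {offset_from_shl(shl):+d} bp from the dyad")
```

Under these assumptions, a site described as a given number of helical turns from the dyad maps to roughly ten times that many base pairs, which is a convenient way to compare positions quoted in SHL with those quoted in base pairs.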
The Sir3 BAH domain binds to all four histones across a large surface of the nucleosome core disk, including both histone tail and histone-fold regions [11•]. Sir3 BAH engages the H4 N-terminal tail, organizing an otherwise unstructured region that includes the H4 basic patch. Additional interactions are made with surfaces of the nucleosome disk contributed by H3, H4 and H2B in the functionally defined LRS (loss of rDNA silencing) region as well as the H2A/H2B acidic patch. Of note, nucleosome affinity is enhanced by N-α acetylation of Sir3, not by direct nucleosome interaction, but indirectly through stabilization of nucleosome binding loops of the Sir3 BAH domain [21,22]. Similar to Sir3, the centromeric protein CENP-C makes multivalent interactions with multiple histone surfaces. A region of CENP-C binds to the H2A/H2B acidic patch while another region interacts with a CENP-A H3 variant-specific hydrophobic surface near the nucleosomal dyad [12•].
PRC1 is a member of the Polycomb group of protein complexes. Similar to other Polycomb complexes, it is a transcriptional repressor of developmentally regulated genes and is commonly misregulated in human cancer [24-26]. Its main molecular functions include ubiquitylation of H2A Lys119 and an intrinsic ability to compact chromatin arrays [27-29]. Rather than being a specific entity, mammalian PRC1 exists as a family of canonical and variant complexes with unique subunit composition. However, despite this complexity, all PRC1 family members share a common core, an E3 ligase composed of Ring1B (or Ring1A) and Bmi1 (or one of five other Polycomb group RING finger proteins). This heterodimeric RING-type ubiquitin E3 ligase can pair with one of several E2 ubiquitin-conjugating enzymes including UbcH5c to ubiquitylate H2A, but only in a nucleosomal context [31-33]. While relevant surfaces of Ring1B, Bmi1 and the nucleosome had been identified, the mechanism governing nucleosome specificity remained elusive [32,34].
We have determined the crystal structure of the Ring1B-Bmi1-UbcH5c E3-E2 ubiquitylation module bound to its nucleosome core particle substrate at 3.3 Å resolution [13••]. Unlike other histone modifying enzymes structurally characterized with histone peptides that bind to the targeted amino acid and neighboring residues in the primary sequence, PRC1 binds multiple surfaces of the nucleosome that are distinct from the site of catalysis (Fig. 2a). The Ring1B-Bmi1 heterodimer forms a saddle over the αC helix of H2B (the outermost margin on the nucleosome) anchored on each side by RING domain-histone interactions, the most critical between Ring1B and the H2A/H2B acidic patch. This allows the PRC1 E3 ligase to position its E2 with the E2 active site cleft facing the nucleosome surface near H2A Lys119. Instead of the E2 UbcH5c binding directly to the targeted H2A sequence, it binds to nucleosomal DNA in two places, near the nucleosomal dyad and near the point of DNA entry and exit, with further interactions expected with extranucleosomal linker DNA. By recognizing unique nucleosomal surfaces, PRC1 can efficiently and specifically ubiquitylate H2A without sequence specific recognition of the targeted H2A sequence. A nucleosome substrate is required because the PRC1 ubiquitylation module binds all components of the nucleosome – all four histones and nucleosomal DNA. Moreover, in the context of the nucleosome, the H2A C-terminal tail, which is unstructured in an H2A/H2B dimer, adopts a rigid conformation that presents the targeted lysine to the E2.
Decades of research support the role of linker histones (denoted H1 or H5) binding to the nucleosome core particle and linker DNA to promote 30 nm fiber formation and resultant chromatin compaction. Linker histones contain a central globular domain flanked by N- and C-terminal extensions, with the globular domain and C-terminal extension contributing to nucleosome binding and function [36-38]. The nucleosome core particle with ~20 bp of extranucleosomal linker DNA bound to a single linker histone is called the chromatosome. Two major models have been proposed for the chromatosome: 1) the symmetric, on-dyad model in which the linker histone globular domain binds the nucleosome on the dyad and interacts with linker DNA on both sides of the nucleosome core, and 2) the asymmetric, off-dyad model in which the linker histone globular domain binds nucleosomal DNA next to the dyad and interacts with linker DNA on one side or both sides of the nucleosome core. A recent crystal structure of the chromatosome assembled with chicken H5 [14••], paired with NMR and cryo-EM investigations using Drosophila H1 and human H1.4 [16•,41•], respectively, suggests that both structures may exist in a linker histone subtype-dependent manner and moreover may differentially influence higher order chromatin structure.
Zhou et al. determined the 3.5 Å crystal structure of the chromatosome containing the globular domain of chicken linker histone H5 (most similar to human H1.0) [14••]. The structure shows symmetric, on-dyad binding with the globular domain of H5 interacting with the DNA minor groove at the nucleosomal dyad and minor grooves of extranucleosomal linker DNA on both sides of the nucleosome approximately 1/2 to 1 helical turns from the core particle (Fig. 2b). Nearly all interactions occur with the DNA phosphodiester backbone. Interestingly, asymmetric, off-dyad binding was observed for Drosophila H1 based on methyl-TROSY and spin-labeling NMR characterization [16•]. A similar conclusion was reached using human H1.4 visualized in chromatin arrays by single particle cryo-EM [41•]. Zhou et al. suggest that sequence specific changes in linker histone subtypes dictate binding modes and demonstrate corresponding changes in compaction of chromatin arrays in vitro. Altogether this raises the possibility that deposition of specific linker histone subtypes may differentially influence local chromatin structure in vivo.
Integration of viral DNA into host genomes is performed by a tetrameric complex of integrase bound to viral DNA ends, called an intasome. The intasome must deform its target DNA to allow phosphodiester bond cleavage and subsequent integration. Surprisingly, many retroviruses integrate into loci within the host genome that are frequently assembled into nucleosomes [44-46]. Moreover, a preferred integration site relative to the nucleosome position is commonly observed. For example, the prototype foamy virus (PFV) intasome preferentially integrates into DNA wrapped into a nucleosome over naked DNA, and specifically at ±3.5 helical turns from the dyad. However, the mechanism governing this nucleosome preference and the resultant site-specificity were unclear until recently.
Maskell et al. used single particle cryo-EM to generate a 7.8 Å map of the intasome-nucleosome structure, allowing docking of the intasome and nucleosome atomic structures previously determined by X-ray crystallography [19••] (Fig. 2c). Unlike previous chromatin protein-nucleosome EM structures at resolutions above 20 Å (reviewed in ), the resolution of this model allowed direct mechanistic insight into intasome function. The intasome uses three of its four subunits to bind to both histone and DNA components of the nucleosome, binding across both gyres of DNA as well as one adjacent αC helix of H2B. Nucleosome binding results in a 7 Å deformation of one underlying DNA double helix, leading to a DNA structure similar to that of the intasome on naked DNA. This may explain the decreased nucleosome binding and integration observed with nucleosomes assembled with the Widom 601 nucleosome positioning sequence frequently used in biochemical and structural characterization of chromatin processes in vitro, as this optimized sequence may resist deformation at the target DNA site and therefore integration. By pairing histone interactions with DNA integration, the intasome defines specific sites within the pseudosymmetric nucleosome for integration. The preference for sites occupied by nucleosomes may favor the observed bias toward integration in stable heterochromatic rather than actively transcribed regions of the host genome that are less stably assembled into nucleosomes.
Since limited structural and biochemical data are available detailing nucleosomal recognition by chromatin factors and enzymes, it is remarkable that a clear paradigm for nucleosome binding is emerging. Nearly all of the chromatin factor-nucleosome structures published to date, including the PRC1 ubiquitylation module, RCC1, Sir3, CENP-C and LANA, bind the H2A/H2B acidic patch on the nucleosome using a critical arginine side chain, termed the arginine anchor motif [8•,9•,11•,12•,13••]. The H2A/H2B acidic patch is generated by H2A (Glu56, Glu61, Glu64, Asp90, Glu91, Glu92) and H2B (Glu105, Glu113) residues and lies in a groove on the nucleosome disk surface lined on one side by the H2B αC helix and on the other side by the distal ends of the H2A α2 and H2B α1 helices (Fig. 1d). A small ridge separates two sections of the acidic patch groove. The chromatin factor arginine anchor residue projects into the deeper acidic pocket on one side of this central ridge, allowing the arginine side chain guanidinium group to make charged interactions with the H2A side chain carboxylates of the Glu61, Asp90, Glu92 acidic triad (Fig. 3). Additional van der Waals interactions exist between the aliphatic region of the arginine anchor side chain and the adjacent H2B αC helix. While the arginine side chain follows a similar trajectory into the acidic patch in each of the aforementioned structures, the main chains surrounding the arginine residue are divergent, suggesting considerable variability in the types of structures that can present an arginine anchor to the nucleosome surface [48,49]. There does not appear to be any sequence conservation beyond the arginine residue itself.
Each chromatin factor makes additional contacts in the acidic patch and in many cases elsewhere on the nucleosome surface, but for the PRC1 ubiquitylation module, RCC1, CENP-C and LANA the arginine anchor-acidic patch interaction is among the most essential contacts for nucleosome binding and, in the case of PRC1, enzymatic activity. A recently solved unpublished crystal structure of the SAGA (Spt-Ada-Gcn5-Acetyltransferase) complex deubiquitylation module bound to the nucleosome provides yet another example of an arginine anchor interacting with the H2A/H2B acidic patch (Michael Morgan and Cynthia Wolberger, personal communication).
Biochemical data suggest additional roles for the acidic patch in other chromatin processes. The acidic patch has been implicated in higher order chromatin structure and is integral to formation of the 30 nm fiber through interaction with a basic patch in the H4 N-terminal tail [1•,50-53]. An NMR-based model of a fragment of the high mobility group protein HMGN2 illustrates another example of an arginine anchor-acidic patch interaction. Similar to the PRC1 subunit Ring1B, BRCA1 likely employs a RING domain arginine anchor to bind to and ubiquitylate nucleosomal H2A [13••]. Another E3 ligase, RNF168, may not bind directly to the H2A/H2B acidic patch, but nonetheless requires an intact H2A/H2B acidic patch for nucleosome and H2A/H2B dimer ubiquitylation [33,34]. In S. cerevisiae, acidic patch mutations disrupt histone modification patterns including H2B K123 ubiquitylation and downstream H3 K4 and K79 methylation [54,55]. It remains to be determined whether arginine anchors are used by these histone modifying enzymes in establishing these marks.
While the arginine anchor-acidic patch interaction is emerging as a paradigm for nucleosome recognition, it is by no means the rule, as seen with recent chromatosome and intasome-nucleosome structures. As the sample size of characterized nucleosome binding proteins is still small, it remains to be discovered how often the acidic patch is used by chromatin factors and what other paradigms exist for nucleosome binding.
While much progress has been made over the last five years to reveal molecular details underlying nucleosome recognition, we have only scratched the surface of this complex and fundamental system. The PRC1 ubiquitylation module-nucleosome complex structure presents the first high-resolution depiction of a histone modifying enzyme’s nucleosome recognition. This paves the way for structural characterization of other classes of histone modifying enzymes bound to the nucleosome. Further strides are also needed to understand the molecular details of chromatin binding by other important types of chromatin factors, including chromatin remodelers and readers of epigenetic modifications. The intasome-nucleosome docking model illustrates the power of emerging cryo-EM technology for structural characterization of large nucleosome assemblies to complement existing crystallography and NMR-based methods. Future work in this area will undoubtedly reveal the extent of arginine anchor usage and uncover new paradigms for nucleosome recognition. Ultimately, the combined structural knowledge may inform the design of small molecule or peptidomimetic modulators of chromatin structure and function.
We would like to thank Peter Cherepanov and Alessandro Costa for providing coordinates for their intasome-nucleosome core particle model, and Michael Morgan and Cynthia Wolberger for permission to cite their unpublished work. This work was supported by the National Institutes of Health grants GM088236 and GM11165 to S.T. and Damon Runyon Cancer Research Foundation grants DR-2107-12 and DFS-14-15 to R.K.M.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.