Double chromodomains occur in CHD proteins, which are ATP-dependent chromatin remodeling factors implicated in RNA polymerase II transcription regulation. Biochemical studies suggest important differences in the histone H3 tail binding of different CHD chromodomains. In human and Drosophila, CHD1 double chromodomains bind the lysine 4-methylated histone H3 tail, which is a hallmark of transcriptionally active chromatin in all eukaryotes. Here, we present the crystal structure of the yeast CHD1 double chromodomains, and pinpoint its differences from that of the human CHD1 double chromodomains. The most conserved residues in these double chromodomains are the two chromoboxes that orient adjacently. Only a subset of CHD chromoboxes can form an aromatic cage for methyllysine binding, and methyllysine binding requires correctly oriented inserts. These factors preclude yeast CHD1 double chromodomains from interacting with the histone H3 tail. Despite great sequence similarity between the human CHD1 and CHD2 chromodomains, variation within an insert likely prevents CHD2 double chromodomains from binding the lysine 4-methylated histone H3 tail as efficiently as CHD1 does. By using the available structural and biochemical data we highlight the evolutionary specialization of CHD double chromodomains, and provide insights into their targeting capacities.
CHD (chromo-ATPase/helicase-DNA-binding) proteins are uniquely structured to encode double chromodomains, and these domains always occur N-terminal to a conserved SNF2 domain (Figure 1) 1. A recent review discusses how multiple subfamilies of SNF2-containing proteins, found in eukaryotes, may perform ATP-dependent protein translocation on DNA, and promote chromatin remodeling 2. In vitro studies have shown purified CHD1 proteins from Saccharomyces and Drosophila to exhibit ATP-dependent nucleosome repositioning to organize oligonucleosomes in a structure that is compatible with gene expression during transcription 3; 4. All SNF2 containing proteins and their complexes contain other conserved domains that are expected to influence their repertoire of chromatin remodeling functions by forming specific protein-protein and protein-DNA contacts. In vertebrates, nine CHD variants contribute to the regulation of transcription, whereas Saccharomyces yeast encodes one CHD protein. Yet, molecular differences between distinct CHD proteins and their chromatin targeting potentials are poorly understood 5. It is necessary to understand the structure and sequence differences among CHD proteins to establish a better understanding of their interaction and chromatin remodeling potentials.
A simplistic classification of CHD proteins based on their conserved domain organization suggests these proteins can be clustered into three classes as shown in Figure 1. For example, all CHD genes from unicellular organisms are similar to human CHD1, and distinct from two other classes that are encoded in multi-cellular organisms; these are represented by human CHD4 and human CHD7 genes in Figure 1. Human CHD3 and CHD4 integrate within multi-subunit complexes termed NuRD (nucleosome remodeling histone deacetylase; for review see; 6; 7; 8), and human CHD7 regulates differentiation and development of several organs 9; 10. Genetic studies have shown that Saccharomyces CHD1 is present at transcribed genes, and its chromodomains are essential for this function 11. Although regions of transcribed genes in all eukaryotes are rich in lysine 4 methylation on the histone H3 tail (H3K4me), neither the chromodomains nor the intact Saccharomyces CHD1 protein interacts with the H3K4me peptide 12; 13; 14; 15.
This is in stark contrast with the specific mode of interaction we previously identified and characterized for binding of the human CHD1 to the H3K4me peptide. To better understand the molecular features of the CHD1 double chromodomains, we have determined the atomic structure of this region in Saccharomyces CHD1 to delineate its differences with the human CHD1 structure. We have also generated a phylogenetic tree of CHD proteins based on their chromodomain segments, which suggests distinct functional specialization during evolution. We show 21-residue chromoboxes form the conserved core in each chromodomain of all CHD proteins, and propose diversity is generated by distinct insert regions outside the chromobox sequences. As the targets of the majority of CHD double chromodomains, including Saccharomyces CHD1, remain unidentified, our analysis offers a practical guide to further investigate the targeting and contribution of the chromodomain region to the function of the CHD proteins.
To explain the observed functional differences between human and yeast CHD1 chromodomains, we solved the crystal structure of the Saccharomyces CHD1 double chromodomains at 2.2 Å resolution (Table 1). Figure 2 shows this structure in comparison with that of the human CHD1 in complex with the methylated histone H3 tail. In both structures, each set of the secondary structure elements corresponding to a chromodomain assembles similar to the prototypical HP1 chromodomain 13. However, sequence insertions in between the conserved regions in chromodomain 1 lead to a substantially larger chromodomain. A universal insert in CHD chromodomain 1 is insert 2, which serves to block peptide binding as seen in HP1 and Polycomb chromodomains 13. The CHD double chromodomains arrange perpendicular to each other using a helical linker. Interestingly, the inter-chromodomain linker segments of Saccharomyces CHD1 and human CHD1 exhibit only 18% sequence identity, but both use 33 residues to fold similarly (Figure 2C). The structure of Saccharomyces CHD1 suggests chromodomains 1 and 2 may also cooperate to bind a partner.
Human and Saccharomyces CHD1 exhibit a major difference in the folding of their insert 1 region within chromodomain 1 (Figure 2). In Saccharomyces, insert 1 folds adjacent to the α2 region. Interestingly, two unconserved cysteine residues in insert 1 and α2 face each other at a distance compatible with a disulfide bond, and this could further stabilize the close positioning of insert 1 to α2 only in Saccharomyces CHD1. These features in chromodomain 1 of Saccharomyces CHD1 have also been noted in the recently reported solution structure of this domain 15. Unlike in Saccharomyces, human insert 1 forms the inter-chromodomain surface, and directly contributes to the histone H3 tail binding 13. Additionally, in human CHD1, two tryptophans (322 and 325) in chromodomain 1 form an aromatic cage around methyllysine 4 of the H3 tail (Figure 2B) 13. In Saccharomyces CHD1, there is a substitution of the second tryptophan with a Glu. To determine if Glu replacement with a Trp would impart methyllysine binding to Saccharomyces CHD1, we performed mutagenesis in conjunction with fluorescence anisotropy binding studies.
Figure 3c shows that the mutant Saccharomyces CHD1 E220W does not interact with the lysine 4 methylated histone H3 tail, suggesting that differences in other regions, particularly the folding of insert 1, dramatically affect the binding capacities of CHD1 proteins. To date, a specific binding partner for Saccharomyces CHD1 double chromodomains has not been reported, but hints about such partners are revealed from interactome studies. Saccharomyces CHD1 has been identified in complexes with casein kinase II as well as a group of proteins that facilitate elongation for transcription by RNA polymerase II 16; 17. It is likely that the double chromodomains of Saccharomyces CHD1 cooperate to accommodate binding of a protein partner at the juncture of the chromodomains.
An alignment of the sequences of all CHD variants with non-CHD chromodomains reveals that their most conserved region is limited to 21 residues that form a conserved core in each chromodomain (Figure 3A,B and supplementary figure S1). We reintroduce the term chromobox to refer to this homology region. Originally the term chromobox was used to denote the nucleotide sequence that encodes 37 homologous amino acids in the chromodomains of HP1 and Polycomb genes 18. The 21-residue chromobox in CHD1 proteins is highly related in structure to that region in HP1 and Polycomb proteins 13. When aromatic residues are present at both positions 5 and 8 of the chromobox, formation of a two-residue aromatic cage for binding a methyllysine is predicted (Figures 2, 3b, and supplementary figure S1). We note that human CHD1 chromobox 1 meets this requirement, whereas neither human CHD1 chromobox 2 nor the Saccharomyces CHD1 chromoboxes meet it. The CHD chromoboxes form highly related structures due to significant sequence identity; there is 33% sequence identity between chromoboxes 1 and 2 in Saccharomyces CHD1. Also, the human and Saccharomyces CHD1 proteins have 62% sequence identity in chromobox 1 and 43% sequence identity in chromobox 2 regions. Figure 3B shows that the absolutely conserved Tyr 10 in chromobox 2 could not cooperate with Trp 5 to form a novel aromatic cage as was predicted in a study that showed chromodomain 2 in Saccharomyces CHD1 binds the lysine-4 methylated H3 tail 19. Therefore, Saccharomyces CHD1, and very likely any other CHD protein, could not use the chromodomain 2 region to assemble an aromatic cage for interaction with a lysine-methylated peptide.
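The sequence-level rule described above (aromatic residues at chromobox positions 5 and 8) and the percent-identity comparisons can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the set of "aromatic" residues is assumed to be Phe, Trp and Tyr, and the two 21-residue strings are invented placeholders rather than real CHD chromobox sequences.

```python
# Minimal sketch of the chromobox aromatic-cage rule and of percent identity.
# Assumptions (not taken from the paper): function names, the F/W/Y definition
# of "aromatic", and the toy sequences used at the bottom.

AROMATIC = frozenset("FWY")   # assumed cage-forming residues: Phe, Trp, Tyr
CHROMOBOX_LENGTH = 21         # chromobox length stated in the text


def predicts_aromatic_cage(chromobox, positions=(5, 8)):
    """Return True if every listed 1-based chromobox position is aromatic."""
    if len(chromobox) != CHROMOBOX_LENGTH:
        raise ValueError("a chromobox is expected to have 21 residues")
    return all(chromobox[p - 1].upper() in AROMATIC for p in positions)


def percent_identity(seq_a, seq_b):
    """Percent of identical residues between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Invented placeholder sequences (NOT real chromoboxes): Trp at positions 5 and 8,
# versus Trp at position 5 and Glu at position 8.
toy_with_cage = "AAAAWAAW" + "A" * 13
toy_without_cage = "AAAAWAAE" + "A" * 13

if __name__ == "__main__":
    print(predicts_aromatic_cage(toy_with_cage))
    print(predicts_aromatic_cage(toy_without_cage))
    print(round(percent_identity(toy_with_cage, toy_without_cage), 1))
```

Such a check captures only a necessary condition. The E220W mutant of Saccharomyces CHD1 satisfies the rule yet does not bind the methylated H3 tail, so a positive result should be read as "a cage is possible", not as a prediction of binding.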
To understand the relationship among the diverse CHD proteins, we prepared a phylogenetic tree on the basis of their aligned chromodomain sequences (Figure 4). A total of 34 CHD proteins were included in our sequence alignment. We observed distinct patterns of similarities that suggest the presence of major evolutionary divisions in CHD proteins. This led to a clear split among CHD1 proteins where we found those from unicellular organisms to form a separate group (class B) from the CHD1 proteins of multi-cellular organisms (class A). Therefore, H3K4me binding is widely used in class A. Interestingly, in vertebrates, a CHD2 gene that exhibits high sequence identity with human CHD1 also belongs to class A (Figures 2C and 4). We noted a difference in the insert 3 region in chromodomain 2, as this region is 15 residues longer in CHD2 and likely interferes with the peptide binding surface. To determine whether changes in insert 3 influence target selectivity, we prepared the double chromodomain fragment of human CHD2, and performed fluorescence anisotropy binding assays with methylated histone peptides. Figure 3c shows that human CHD2 interacts with the H3K4me peptide 30-fold more weakly than human CHD1 does. Furthermore, the presence of lysine 9 acetylation, arginine 2 methylation or lysine 36 methylation does not improve the binding affinity of human CHD2 for the H3K4me peptide. While the biochemical function of CHD2 is unknown, the mouse CHD2 gene has been shown to have a role in development and survival 20. Therefore, we suggest that an important role of the insert 3 extension is to regulate H3K4me binding by the CHD2 protein in a manner coordinated with complex stages of transcription. One possible mechanism may involve the action of a temporally regulated kinase that phosphorylates CHD2 to relieve insert 3 interference with histone tail binding.
The organisms listed in class B encode one CHD protein, except in Schizosaccharomyces where there are two CHD proteins called HRP1 and HRP3 (Figure 4). The double chromodomains of HRP1 and HRP3 are highly related to each other, yet HRP3 chromodomain 1 has aromatic residues at both positions 5 and 8 of the chromobox. Interestingly, the insert 1 region in both HRP proteins is shorter than that in Saccharomyces CHD1, suggesting that, although HRP3 may bind a lysine-methylated peptide, the contribution of insert 1 to binding may be distinct from that used by human CHD1 (Figure 1B, and supplementary Figure S1).
Distinct from the two CHD1 classes are the class C CHDs, which include CHD3 through CHD5 and are sometimes referred to as Mi-2 proteins (Figure 4 and supplementary Figure S1). These CHDs typically reside in multisubunit NuRD complexes, and contribute to histone deacetylation (for review see; 6). Chromodomain 1 in this class always contains a hydrophobic residue at position 8 of the chromobox, precluding a methyllysine binding function via an aromatic cage. The inter-chromodomain linker is 10 residues longer than that in CHD1 proteins (supplementary Figure S1), suggesting an altered organization of the tandem chromodomains. The chromodomains of Drosophila Mi-2 have been shown not to interact with the methylated H3 tail, and instead are implicated in DNA binding that is required for nucleosome recognition 21. The structure or affinity of such DNA binding is not yet determined, and DNA binding does not appear to be the function of CHD1 chromodomains, as we did not detect it for Saccharomyces CHD1 and human CHD1 (data not shown). Furthermore, in contrast to the inter-chromodomain linker in CHD1, class C linkers contain substantially more lysine and arginine residues, indicating a potential for using a unique surface of interaction for DNA binding. Interestingly, the majority of class C CHDs contain two PHD fingers that always occur N-terminal to the double chromodomains (Figure 1). The PHD finger has also been implicated in lysine-methylated histone tail binding involving an aromatic cage 22; 23; 24. A CHD3 PHD finger was recently shown to bind the H3K36me peptide using a pull down assay 23. It remains to be shown whether PHD fingers and class C chromodomains can cooperate to form a unique interaction with nucleosomes.
Another subfamily is class D, which includes CHD7 (Figures 1 and 4), a protein implicated in the CHARGE syndrome. CHARGE syndrome is a common cause of congenital anomalies affecting several tissues in humans. Mutations in the CHD7 gene of individuals with the CHARGE syndrome are believed to account for the disease 25. More recently the phenotypic spectrum of human mutations along the CHD7 gene, including those in the chromodomain region, has been reported for patients with CHARGE syndrome 26. Other members of class D also appear to contribute to tissue and developmentally specific chromatin regulation. For example, CHD8 (or Duplin) is associated with beta catenin-mediated gene expression 27; 28, whereas CHD9 is associated with ligand-dependent transcription by nuclear receptors 29. Class D CHDs contain chromodomains with different aromatic residues at positions 5 and 8 of chromobox 1 (Tyr at position 5 and a Phe/Tyr at position 8; see Figure 3a). They also exhibit very long insert 1 regions, suggesting class D may form an aromatic cage to bind methyllysine at the juncture of the two chromodomains (supplementary Figure S1A). Interestingly, this class exhibits an inter-chromodomain linker that is 13 residues shorter than the one in A and B classes (supplementary Figure S1B). Additional studies are necessary to determine whether reducing the linker impacts the tandem chromodomain arrangement and leads to new surfaces of inter-molecular interaction.
Finally, we found four CHD proteins that exhibit dramatic divergence in their double chromodomain sequences. As such, their evolutionary order of appearance could not be judged by our analysis. These are represented by two uncharacterized members from Arabidopsis, one from Giardia lamblia and one from Plasmodium falciparum (Figure 4). Giardia lamblia causes gastroenteritis in mammals, manifesting itself with severe diarrhea and abdominal cramps in humans. Plasmodium falciparum causes malaria, which is considered a major threat to human populations. The analysis of single nucleotide polymorphisms in Plasmodium falciparum has previously revealed the divergence time for this organism to coincide with the start of human population expansion, consistent with a genetically complex organism able to evade host immunity 30. Despite dramatic sequence differences in the CHDs of Plasmodium and Giardia, their two chromoboxes are highly conserved as shown in Figure 3a. Moreover, the organization of conserved domains in both of these CHD proteins resembles that of CHD1 (Figure 1), suggesting they have a fundamental role in transcription regulation. As with Saccharomyces CHD1, Plasmodium and Giardia CHD proteins do not have the necessary combination of aromatic residues to bind a lysine-methylated peptide (Figure 3A).
In vitro studies have demonstrated that Drosophila CHD1 can act alone to organize nucleosome arrays in a form that is compatible with chromatin during transcription (i.e., a larger inter-nucleosome spacing and lack of incorporation of the linker histone H1) 4. During transcription by RNA polymerase II, nucleosomes do not fall apart, but instead become displaced in a carefully regulated manner to comply with initiation, elongation and termination stages (for review see; 31). We suggest CHD proteins manage nucleosome repositioning at various stages of transcription by responding to important epigenetic cues. We suspect the double chromodomains of CHD proteins assist this process and help to organize distinct CHD proteins at various sites of gene activity. While some unicellular organisms have only one class of CHD proteins, in worms, flies, plants and vertebrates, three distinct classes exist and all are implicated in transcription regulation. For class A CHDs we indicated their localization involves binding of their double chromodomains to the lysine 4 methylated H3 tail (for review see; 32). This modification typically occurs at transcription start sites as well as throughout the coding regions. Class C CHDs may use PHD fingers to recognize the lysine 36 methylated H3 tail, and use double chromodomains for DNA binding 21; 23. The H3K36me modification is also associated with transcription. As class C CHDs are components of histone deacetylating complexes, they may be related to the Rpd3S histone deacetylating complex that has been shown to localize at transcription termination sites in Saccharomyces. Within Rpd3S, a chromodomain protein called Eaf3 recognizes the H3K36me modification 33; 34.
Class D CHDs have double chromodomains that appear to recognize a protein partner bearing a methyllysine. In Drosophila, both CHD1 and Kismet-L facilitate transcription by RNA polymerase II 35; 36. Localizations of CHD1 and Kismet-L are distinct during transcription stages, and Kismet-L populates transcription sites prior to CHD1 arrival 36. Mammalian CHD6 is also shown to co-localize with RNA polymerase II during both preinitiation and elongation stages of transcription 37. Additional studies are required to understand the molecular mechanism(s) that various CHD proteins exert to prepare chromatin for RNA polymerase II action. This is likely to be very important for the understanding of the basic transcription apparatus as well as the more complex tissue and cell specific gene transcription in multi-cellular organisms.
The appropriate mode of CHD localization to transcription sites appears to have dramatic consequences as mutations in CHD7 protein, that are implicated in the CHARGE syndrome, often lead to premature stop codons 26. Furthermore, malaria and Giardia parasites encode substantially different CHD proteins. For Plasmodium falciparum the complete genome is characterized and annotated 38. Moreover, the protein interaction network within malaria has been extensively studied, and resulted in identifying potential partners for the malaria CHD protein 39. Interestingly, a recent report indicated that dual infection with HIV and malaria fuels the spread of both diseases, and these diseases together cause the deaths of over 4 million people per year in Africa 40. The chromatin machinery of malaria likely contributes to this factor as malaria, which is not sexually transmitted, has dramatically impacted the process of spreading HIV infection.
All expression constructs contain an N-terminal 6xHis-tag, were cloned into the BamHI/NdeI sites of the pET11a vector, expressed in BL21(DE3) E. coli (Novagen), and purified by Ni2+-affinity chromatography (Qiagen). The double chromodomain constructs express residues 174-339 (for Saccharomyces CHD1) and residues 257-447 (for human CHD2) of the corresponding CHD genes. The E220W point mutation in Saccharomyces CHD1 was prepared using the QuikChange mutagenesis kit (Stratagene).
For fluorescence polarization, 100 nM of fluorescein-labelled peptide (prepared and used as previously described 41) was used in 50 mM sodium phosphate pH 8.0, 25 mM NaCl, 5 mM TCEP (the binding assays involving human CHD2 contained an additional 25 mM NaCl). The peptides used in binding assays were H3K4me3 [ARTK(me3)QTARKSTGGKAY], H3K36me3 [APSTGGVK(me3)KPHRY], H3K4me3K9acK14acK18ac [ARTK(me3)QTARK(ac)STGGK(ac)APRK(ac)QLAY], H3K4me3R2me2a [AR(me2a)TK(me3)QTARKSTGGKAY], and H3K4me3K36me3 [ARTK(me3)QTARKSTGGKAPRKQLATKAARKSAPATGGVK(me3)KPHRY].
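The binding model used to analyse these titrations is not stated here. As one hedged illustration of how a dissociation constant could be extracted from such data, the sketch below fits a generic 1:1 binding model that accounts for depletion of the 100 nM labelled peptide. The model choice, the function names, the grid-search fitting strategy and all numerical values in the demonstration are assumptions for illustration, not the authors' procedure or results.

```python
import math

PEPTIDE_NM = 100.0  # labelled peptide concentration given in the methods (nM)


def fraction_bound(protein_nm, kd_nm, peptide_nm=PEPTIDE_NM):
    """Fraction of peptide bound for 1:1 binding (quadratic form, ligand depletion)."""
    b = protein_nm + peptide_nm + kd_nm
    return (b - math.sqrt(b * b - 4.0 * protein_nm * peptide_nm)) / (2.0 * peptide_nm)


def anisotropy(protein_nm, kd_nm, r_free, r_bound):
    """Observed anisotropy as a bound-fraction-weighted mix of free and bound signals."""
    return r_free + (r_bound - r_free) * fraction_bound(protein_nm, kd_nm)


def _linear_fit(x, y):
    """Least squares for y = a + b*x; returns (a, b, sum of squared residuals)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, ssr


def fit_kd(protein_nm, r_obs, kd_min=1.0, kd_max=1e7, steps=2000):
    """Scan Kd on a log grid; for each Kd solve r_free and r_bound linearly."""
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        bound = [fraction_bound(p, kd) for p in protein_nm]
        r_free, delta, ssr = _linear_fit(bound, r_obs)
        if best is None or ssr < best[0]:
            best = (ssr, kd, r_free, r_free + delta)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Synthetic, noise-free titration with made-up parameters (illustration only).
    conc = [0.0, 100.0, 300.0, 1e3, 3e3, 1e4, 3e4, 1e5, 3e5]
    obs = [anisotropy(p, 5000.0, 0.05, 0.20) for p in conc]
    print(fit_kd(conc, obs))
```

Because the anisotropy is linear in the free and bound plateau values once Kd is fixed, only Kd needs to be searched, which keeps the sketch dependency-free. A fold difference in affinity between two proteins, such as the one reported for human CHD2 relative to human CHD1, would correspond to the ratio of two dissociation constants fitted in this way.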
Protein crystals were grown by hanging drop vapor diffusion at 10 °C with 6 mg/ml protein in 10 mM Bis-Tris Propane (BTP) at pH 8.0, 12 mM NaCl, 5 mM TCEP, 0.9 M ammonium sulfate and 3.5% isopropanol. The mother liquor was 1.8 M ammonium sulfate and 7% isopropanol, and the crystals were cryoprotected in 5% isopropanol, 1 M ammonium sulfate and 35% ethylene glycol. These crystals diffracted to 2.2 Å in space group I41. Native diffraction data were collected at the APS SER-CAT 22-ID beamline and indexed with HKL2000 42. The structure was phased by molecular replacement with MOLREP 43, using the human CHD1 tandem chromodomains as the search model. The structure was built with ARP/wARP 44 followed by manual model building in Coot 45. The final structure was refined using REFMAC5 restrained TLS refinement 46. All ribbon diagrams were drawn in PyMOL 47, and the structure superpositions were generated by Coot.
Supplementary Figure S1. Structure-based sequence alignment of the chromodomain regions of CHD proteins, grouped according to class A (in green), class B (in red), class C (in brown) and class D (in blue). Panel A shows the aligned chromodomain 1 regions. Panel B shows the aligned linker regions, and panel C shows the aligned chromodomain 2 regions. The residue numbering is arbitrary, and secondary structure diagrams are derived from both human and yeast CHD1 crystal structures. We refer to the chromobox homology motif as the 21 residues that are highlighted within each chromodomain. A subset of CHD proteins contain aromatic residues at positions 5 and 8 (marked with black stars) of chromobox 1 (panel A), which interact with methyllysine-containing peptides (as expected in class A and class C CHDs). Inserts 1 and 2 occur in chromodomain 1 (panel A), whereas insert 3 occurs in chromodomain 2 (panel C). In panel C and the class A subgroup, the extension of insert 3 in human CHD2 is highlighted in yellow.
We thank William R. Pearson for advice on phylogenetic analysis. This work was supported by NIH grant GM070558 to S.K.
Accession Number: The coordinates have been deposited in the Protein Data Bank under ID code 2H1E.