We report a sensitive peptide pull-down approach in combination with protein identification by LC-MS/MS and qualitative abundance measurements by spectrum counting to identify proteins binding to histone H3 tail containing dimethyl lysine 4 (H3K4me2), dimethyl lysine 9 (H3K9me2), or acetyl lysine 9 (H3K9ac). Our study identified 86 nuclear proteins that associate with the histone H3 tail peptides examined, including seven known direct binders and 16 putative direct binders with conserved PHD finger, bromodomain, and WD40 domains. The reliability of our proteomic screen is supported by the fact that more than one-third of the proteins identified were previously described to associate with histone H3 tail directly or indirectly. To our knowledge, the results presented here are the most comprehensive analysis of H3K4me2, H3K9me2, and H3K9ac associated proteins and will provide a useful resource for researchers studying the mechanisms of histone code effector proteins.
Histone proteins are extensively modified by an array of PTMs that include phosphorylation, acetylation, methylation, ADP-ribosylation, proline isomerization, ubiquitinylation, sumoylation, citrullination, butyrylation, and propionylation [1–4]. Many of these PTMs have been associated with distinct biological processes, such as gene activation, silencing, replication, mitosis, and DNA damage response. Recent progress in MS has allowed for unbiased identification of many novel histone PTMs [5–13] to reveal important insights into the complexity of histone regulation.
The functions of histone PTMs have been the subject of extensive investigations over the past decade. For example, acetylation of the histone H3 tail has been generally correlated with transcriptional activation. In contrast, distinct H3 methylation states are associated with different regulatory outcomes: methylation of H3K4 is associated with gene activation, whereas methylation of H3K9 and H3K27 is associated with gene repression [15,16]. Although specific modifications have been correlated with distinct biological processes, the precise mechanisms by which histone modifications transmit their biological signals into meaningful biological readouts are poorly understood. There is mounting evidence that histone modifications transmit biological signals through the recruitment of effector proteins, or “readers”. Association of the downstream effectors may lead to changes in the accessibility of the DNA template to the transcriptional machinery, recruitment of enzymatic activities such as ATP-dependent chromatin remodeling complexes, or changes in the higher order structure of chromatin, all of which would lead to specific regulatory outcomes [2, 17].
The binding of effector proteins to histone PTMs is mediated by conserved binding modules, such as the canonical bromodomain, which binds acetylated lysines. The binding modules for methylated lysines include chromodomains, Tudor domains, MBT-repeats, WD40-repeats, and PHD fingers. It has also become apparent that multiple effector molecules can recognize a single modification, but the actual in vivo binding is dependent on a broader biological context. These observations underscore the need for a more comprehensive analysis of epigenetic readers and highlight the importance of understanding how histone PTMs regulate biological processes.
Here, we report a comprehensive, unbiased screen for proteins associated with three H3 PTMs: dimethylated K4 (H3K4me2), dimethylated K9 (H3K9me2), and acetylated K9 (H3K9ac). Our studies led to the identification of 86 proteins that bind, directly or indirectly, to the amino terminus of histone H3. Among the identified proteins, one-third represent previously described direct effectors of H3, H3K4me2, or H3K9me2, as well as their known associated proteins. Importantly, we have identified many novel modification-specific binders, including PHD finger-, WD40-, and bromodomain-containing proteins. Results presented here offer a rich source of candidate effector molecules for further downstream mechanistic analysis. To our knowledge, this analysis is the most comprehensive study to identify novel histone PTM binders in an unbiased manner.
The H3 amino terminal peptides containing amino acids 1–21 coupled to a biotin linker, ARTKQTARKSTGGKAPRKQLA-GGK-biotin (H3 peptide), ART(dimethyl-K)QTARKSTGGKAPRKQLA-GGK-biotin (H3K4me2), ARTKQTAR(dimethyl-K)STGGKAPRKQLA-GGK-biotin (H3K9me2), and ARTKQTAR(acetyl-K)STGGKAPRKQLA-GGK-biotin (H3K9ac), were purchased from Upstate-Millipore (Billerica, MA).
PHF2 antiserum was generated by immunizing rabbits with a purified recombinant GST-PHF2 C-terminal fragment corresponding to amino acids 830–1098 of PHF2.
HeLa S3 cells were purchased from National Cell Culture Center (Minneapolis, MN). The cytoplasmic (S100) and nuclear extracts (NE) were prepared as previously described [20, 21]. Briefly, HeLa cells were washed in cold PBS and resuspended in five times the packed cell volume (PCV) of hypotonic buffer (10 mM Tris, pH 7.3, 1.5 mM MgCl2, 10 mM KCl, 10 mM β-mercaptoethanol, and 0.5 mM PMSF). The cells were incubated on ice for 10 min and then pelleted by centrifugation at 1900×g for 10 min. The swollen cell pellet was then resuspended in half the PCV of hypotonic buffer and homogenized with 15 strokes in a glass dounce homogenizer. The lysate was centrifuged at 5000×g for 20 min at 4°C to separate the cytoplasmic proteins from the nuclear pellet. The nuclear pellet volume (NPV) was determined and the pellet was resuspended in 0.5 mL of extraction buffer (20 mM Tris, pH 7.3, 1.5 mM MgCl2, 0.2 mM EDTA, 25% glycerol, 10 mM β-mercaptoethanol, and 0.5 mM PMSF) per milliliter of NPV. The nuclear pellet was then homogenized by ten strokes in a glass dounce homogenizer. While stirring, 0.5 mL of extraction buffer containing 1.2 M KCl was added dropwise per milliliter of NPV to the homogenized nuclear extract. The extract was further stirred for 30 min at 4°C. The sample was then centrifuged for 30 min at 20 000×g. The NE supernatant was dialyzed against dialysis buffer (20 mM Tris, pH 7.3, 100 mM KCl, 0.2 mM EDTA, 20% glycerol, 10 mM β-mercaptoethanol, and 0.5 mM PMSF) and then centrifuged at 20 000×g for 30 min. The NE supernatant was aliquoted, snap frozen in liquid nitrogen, and stored at −80°C.
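The volume rules in this protocol reduce to simple arithmetic on the PCV and NPV. The following is a minimal Python sketch of that arithmetic, not part of the original protocol: the function names and the example volumes are hypothetical, and the KCl estimates assume ideal mixing and ignore the small amount of salt carried over in the pellet.

```python
def extraction_volumes(pcv_ml, npv_ml):
    """Buffer volumes (mL) implied by the protocol for a given PCV and NPV."""
    return {
        "hypotonic_resuspension": 5.0 * pcv_ml,  # five times the PCV
        "hypotonic_dounce": 0.5 * pcv_ml,        # half the PCV
        "extraction_no_kcl": 0.5 * npv_ml,       # 0.5 mL per mL of NPV
        "extraction_1p2M_kcl": 0.5 * npv_ml,     # 0.5 mL per mL of NPV
    }


def mixed_molarity(parts):
    """Molarity after mixing; parts is a list of (volume_ml, molar) pairs."""
    total_volume = sum(volume for volume, _ in parts)
    return sum(volume * molar for volume, molar in parts) / total_volume


# Hypothetical example volumes, not taken from the paper.
npv = 4.0
volumes = extraction_volumes(pcv_ml=10.0, npv_ml=npv)

# KCl concentration of the two added extraction buffers combined.
kcl_buffers_only = mixed_molarity([
    (volumes["extraction_no_kcl"], 0.0),
    (volumes["extraction_1p2M_kcl"], 1.2),
])

# Lower estimate: count the nuclear pellet as additional salt-free volume.
kcl_with_pellet = mixed_molarity([
    (npv, 0.0),
    (volumes["extraction_no_kcl"], 0.0),
    (volumes["extraction_1p2M_kcl"], 1.2),
])
```

Under these assumptions the two added buffers average 0.6 M KCl, and the estimate drops to about 0.3 M if the pellet volume is counted as salt-free.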
The peptides were prebound to streptavidin agarose beads (Invitrogen, Carlsbad, CA) in 50 μL of NETN buffer (20 mM Tris, pH 8, 1 mM EDTA, and 0.5% NP-40) containing 100 mM NaCl. Five milligrams of HeLa nuclear extracts was added to each of the peptide-bound agarose beads and rotated for 5 h at 4°C. The beads were then washed five times with 1 mL of NETN buffer containing 200 mM NaCl. The washed beads were boiled with 60 μL of Laemmli buffer, analyzed by SDS-PAGE on a 4–20% gel (Invitrogen), and stained with colloidal CBB.
Sample preparation and MS analysis on a Finnigan LTQ mass spectrometer (Thermo Finnigan, San Jose, CA) were as previously described. Briefly, each lane of the gel was divided into eight equal size gel slices. The gel slices were destained with 50 mM ammonium bicarbonate in 50% methanol and washed overnight with HPLC grade water. The gel slices were then crushed and digested with 200 ng of trypsin in 50 mM ammonium bicarbonate at 37°C for 4 h. Digested peptides were extracted with 200 μL of ACN and dried by speed vac. Each sample was dissolved in 20 μL of 5% methanol/95% water/0.1% formic acid and analyzed on a C18 column (100 mm×75 μm, 300 Å pore diameter, PicoFrit; New Objective, Woburn, MA). Mobile phases A (0.1% formic acid in water) and B (0.1% formic acid in methanol) were used with a gradient of 10–80% mobile phase B over 10 min followed by 80% B for 10 min at a flow rate of 200 nL/min. Peptides were directly electrosprayed into the mass spectrometer using a nano-spray source. The LTQ was operated in a data-dependent mode, acquiring fragmentation spectra of the 20 strongest ions.
Raw MS data were searched using BioWorks 3.2 Sequest software (Thermo Electron, Waltham, MA) against the NCBI human protein RefSeq database. Results from the eight bands for each pull-down/MS were loaded into the multi-consensus files with the following filter threshold settings: XCorr of 1.50 (1+), 1.80 (2+), and 3.00 (3+) and probability of 1 × 10⁻² for peptides, and XCorr of 10.0 and probability of 1 × 10⁻³ for proteins. All protein identifications were verified by manual inspection of MS/MS assignments. The number of spectra acquired from the peptide mixture was used as a measure of protein abundance within the protein pull-down, as described previously.
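The filter thresholds above can be expressed as a small predicate. This is a minimal Python sketch rather than the BioWorks implementation: the function and constant names are hypothetical, and it assumes that the thresholds are inclusive, that a lower probability value indicates a better match, and that charge states other than 1+ to 3+ are rejected.

```python
# Minimum peptide XCorr per precursor charge state, as stated in the text.
PEPTIDE_XCORR_MIN = {1: 1.50, 2: 1.80, 3: 3.00}
PEPTIDE_PROBABILITY_MAX = 1e-2
PROTEIN_XCORR_MIN = 10.0
PROTEIN_PROBABILITY_MAX = 1e-3


def peptide_passes(charge, xcorr, probability):
    """True if a peptide match meets the charge-specific XCorr and probability cutoffs."""
    threshold = PEPTIDE_XCORR_MIN.get(charge)
    if threshold is None:
        # Assumption: charge states without a stated threshold are rejected.
        return False
    return xcorr >= threshold and probability <= PEPTIDE_PROBABILITY_MAX


def protein_passes(xcorr, probability):
    """True if a protein-level score meets the protein cutoffs."""
    return xcorr >= PROTEIN_XCORR_MIN and probability <= PROTEIN_PROBABILITY_MAX
```

For example, a doubly charged peptide with an XCorr of 2.1 and a probability of 1 × 10⁻⁴ would pass, whereas a triply charged peptide with the same scores would not, because its XCorr falls below 3.00.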
Protein abundance data derived by spectrum counting were subjected to hierarchical cluster analyses in Spotfire Decision Site v.7.0 with correlations determined by magnitude and shape (Euclidean distance).
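To illustrate what hierarchical clustering on Euclidean distance does with spectral count profiles, the following is a minimal pure-Python sketch. It is not the Spotfire implementation: the linkage rule (average linkage) is an assumption, since the text specifies only the distance measure, and the protein names and counts are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length count profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles):
    """Agglomerative clustering of {name: counts}.

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of names.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pair_distances = [
                    euclidean(profiles[a], profiles[b])
                    for a in clusters[i]
                    for b in clusters[j]
                ]
                distance = sum(pair_distances) / len(pair_distances)
                if best is None or distance < best[0]:
                    best = (distance, i, j)
        distance, i, j = best
        merges.append((clusters[i], clusters[j], distance))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical spectral counts: control, H3, H3K4me2, H3K9me2, H3K9ac.
example_profiles = {
    "protein_A": [0, 1, 10, 1, 0],
    "protein_B": [0, 2, 9, 1, 0],
    "protein_C": [0, 1, 0, 12, 1],
}
example_merges = average_linkage(example_profiles)
```

In this example the two proteins with H3K4me2-biased profiles are joined first, and the H3K9me2-biased protein joins last, which is the grouping a heat map ordered by such a tree would show.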
To identify the effector proteins that associate with H3 modifications at K4 and K9, we used C-terminally biotinylated peptides corresponding to the N-terminal H3 tail (aa 1–21) either unmodified (H3), dimethylated at K4 (H3K4me2), dimethylated at K9 (H3K9me2), or acetylated at K9 (H3K9ac). The peptides were immobilized on streptavidin agarose beads and incubated with HeLa nuclear extracts (Fig. 1). Bound proteins were eluted and separated by SDS-PAGE. CBB staining revealed both distinct and similar protein bands eluting from the four H3 peptide pull-downs. As expected, there were far fewer bands in the streptavidin beads control lane, indicating that nonspecific binding to the agarose beads was relatively low (Fig. 2A). While protein stains provide a good visualization of proteins present, low abundance proteins may not be readily visible. For unbiased and sensitive proteomic analysis, we divided each lane into eight equal size slices and digested each in situ with trypsin for protein identification and quantification.
The 40 tryptic peptide samples were analyzed individually by HPLC-MS/MS on an LTQ mass spectrometer. To assess the extent of nonspecific binding, we also analyzed the streptavidin beads control pull-down. The majority of proteins that bound to the control agarose beads were common and abundant nonspecific proteins that we often identify in immunoprecipitation assays (see Table 1 of Supporting Information). All proteins identified in the four H3 peptide pull-downs are listed in Table 1.
It has been demonstrated that stable isotope labeling with amino acids in cell culture (SILAC) in combination with high-resolution MS can provide an accurate method for quantifying the relative amount of each protein identified in the pull-downs [25–27]. Indeed, such an approach has been used to identify histone modification-specific binding proteins and protein interaction partners [28, 29]. However, a major limitation of the isotopic amino acid labeling method is the limited number of samples that can be used for relative quantification, which requires pairwise comparisons. Therefore, it is not convenient to use such methods to assess the binding of proteins in the five pull-downs conducted in this study. Instead, we opted for spectral counting as described by Yates and coworkers, which serves as a simple and convenient method to estimate the abundance of each protein. We combined the results of the eight MS analyses for each pull-down and tallied the total number of spectra (Table 1).
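The tallying step amounts to summing, for each protein, the spectra assigned to it across the eight gel slices of one pull-down. A minimal Python sketch follows; the function name, the input layout (one list of protein identifiers per slice, one entry per assigned spectrum), and the example data are hypothetical.

```python
from collections import Counter


def tally_spectral_counts(slices):
    """Sum spectra per protein across the gel slices of one pull-down.

    slices: iterable of lists, each holding one protein ID per assigned spectrum.
    """
    totals = Counter()
    for spectra in slices:
        totals.update(spectra)
    return totals


# Hypothetical pull-down with eight slices, most of them empty for brevity.
example_slices = [
    ["HP1a", "HP1a", "WDR5"],
    ["HP1a"],
    ["WDR5", "CHD1"],
    [], [], [], [], [],
]
example_totals = tally_spectral_counts(example_slices)
```

Summing over slices in this way treats a protein that migrates across two adjacent slices as a single entry, which is the behavior the combined table implies.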
To compare and contrast the composition of proteins pulled down by the various modified histone H3 peptides, we used hierarchical clustering to analyze and heat maps to visualize the spectral counts as simple estimates of the relative abundance of each protein (Fig. 3) [24, 30]. As expected, the streptavidin beads pull-down contained the fewest proteins overall. In a few instances, a significant number of peptides for a protein were detected in the control pull-down, such as for CHD3, CHD4, and TRIM28; however, the marked increase in the peptide spectral counts in the H3 pull-downs suggests that these proteins likely also bound to H3 or to the modified H3 sites (Table 1). In all, we identified 86 distinct proteins that preferentially bound to at least one of the four H3 peptides examined relative to the control beads. Of these 86 proteins, 65 were not recovered in the control beads pull-down; the 21 proteins that were detected in the control pull-down were enriched, with a greater than 70% increase in the number of peptides identified in at least one of the H3 pull-downs (Table 1).
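The selection rule described here can be written as a simple predicate. This is a minimal Python sketch under stated assumptions: "greater than 70% increase" is interpreted as (best pull-down count − control count) / control count > 0.70, a protein absent from the control passes if it was detected in any H3 pull-down, and the function name and example counts are hypothetical.

```python
def is_enriched(control_count, pulldown_counts, min_increase=0.70):
    """True if a protein is preferentially bound relative to the control beads.

    control_count: spectral count in the streptavidin beads control.
    pulldown_counts: spectral counts in the four H3 peptide pull-downs.
    """
    best = max(pulldown_counts)
    if best == 0:
        return False  # not detected in any H3 pull-down
    if control_count == 0:
        return True   # detected only in the H3 pull-downs
    return (best - control_count) / control_count > min_increase
```

For instance, a protein with 10 control spectra and 18 spectra in its best H3 pull-down (an 80% increase) would be retained, whereas one with 10 control spectra and at most 15 pull-down spectra (a 50% increase) would not.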
If our approach captured bona fide interactions, we would expect to identify known binders to the H3 peptides used in this study. Indeed, our screen identified seven proteins previously shown to recognize the H3 tail directly, including known readers of H3K4me2, H3K9me2, and the unmodified H3 tail (Tables 2 and 3). The major protein complex isolated with the unmodified H3 peptide is the NuRD/MeCP1 protein complex (Fig. 2B and Table 2), as previously reported [31–37]. In addition, we also recovered known interacting proteins of these direct effectors, suggesting that a subset of the detected interactions is indirect. These data are discussed below in more detail (Section 3.4). Remarkably, 33 proteins (more than one-third of the proteins identified in our screen) were previously reported to bind directly or associate with direct binders to the H3 tails examined. These results support the validity of our approach and strongly suggest that the remaining 53 proteins are relevant, novel binding partners that directly or indirectly associate with the H3 peptides examined in this study.
The three isoforms of HP1 (α, β, and γ), which bind specifically to H3K9me2, were identified in our screen. The chromodomain of HP1 mediates the interaction with H3 di- and trimethylated at K9 for gene silencing and maintenance of heterochromatin. Two of the HP1 isoforms (α and β) displayed strong binding preference for H3K9me2 as previously reported, whereas the HP1γ homolog (LOC653972) also bound to unmodified H3 and H3K9ac to a lesser degree. Accordingly, our screen identified proteins known to associate with HP1, including a component of the CAF1 complex, CHAF1B, and the corepressor protein TRIM28/KAP-1. CHAF1B and TRIM28/KAP-1 exhibited preferential binding to the H3K9me2 peptide, suggesting that the association is HP1-dependent. It is also noteworthy that TRIM28/KAP-1 contains a PHD finger that may contribute to its specificity of binding to H3K9me2.
WDR5 was previously identified as a preferential binder of H3K4me2 using a similar peptide pull-down assay. In agreement with this result, WDR5 was significantly enriched in the H3K4me2 pull-down in comparison to the unmodified H3 tail. WDR5 contains a WD40 domain and is a common component of the MLL family of H3K4 methyltransferase complexes. WDR5 associates with RbBP5, ASH2, and a SET domain protein to form the “core complex” that is sufficient for methylation on nucleosomal substrates. RbBP5, which is known to tightly associate with WDR5, was also recovered in our study, and although less abundant in terms of peptide recovery, it exhibited binding specificity closely mimicking that of WDR5. Thus, it is likely that RbBP5 binds to the H3K4me2 peptide via interactions with WDR5.
CHD1 is an ATP-dependent chromatin remodeling protein that binds to H3 tails containing di- or trimethylated K4 via its tandem chromodomains. We confirmed CHD1 interaction with H3K4me2 in our peptide pull-downs (Tables 1 and 3). We also detected weak binding of CHD1 to the H3K9ac peptide, although the biological significance of this interaction is not clear.
Interestingly, we also identified multiple subunits of the PAF1 elongation complex (PAF1, LEO1, CTR9, and CDC73) that preferentially associated with the H3K4me2 tail, although binding to other H3 peptides was also observed. PAF1 promotes transcriptional elongation and regulates methylation of H3 at K4. The preference for H3K4me2 may be explained by the previously reported association of the PAF1 complex with CHD1 and the H3K4 methyltransferase core complex [43, 44]. Alternatively, one or more PAF1 components may bind directly to the histone tail.
Finally, we wish to comment on the lack of recovery of BPTF, ING2, and TAF3, three proteins that have been reported to recognize methylated H3K4 via a PHD finger motif. The PHD fingers of these proteins show a higher preference for binding to trimethyllysines over dimethyllysines in pull-down assays, which likely accounts for their absence in our H3K4me2 pull-down [45, 46]. In future studies, we will perform similar proteomic screens to search for novel proteins associated with specific mono- and trimethyllysines, which will expand our knowledge of how the different methylation states may be distinguished by the downstream effectors.
Histone modifications can exert their function not only via recruitment of specific readers, but also through exclusion of certain effector molecules from the histone tails. One example is the NuRD complex, whose binding to the H3 tail is perturbed by methylation at K4 [47, 48]. In agreement with these reports, our results indicate that the binding of multiple components of the NuRD complex, and of the related MeCP1 complex, is dramatically reduced in the H3K4me2 pull-down; H3K9me2 also reduced binding of this complex, although to a much lesser degree (Fig. 2A and Table 1).
Our study also identified two proteins previously reported to directly bind H3, albeit without clearly defined modification-dependent specificity. The histone chaperone SET/TAF-1β was shown to directly recognize unmodified and methylated H3 tails. However, in our assay, SET/TAF-1β does not significantly bind to the unmodified H3 tail, but instead strongly associates with H3K4me2, H3K9me2 and, to a lesser extent, H3K9ac.
The second protein, ERCC6, is an ATP-dependent chromatin remodeler involved in DNA excision repair and was previously reported to bind histone tails. In our assay, ERCC6 exhibited the highest spectral counts with the unmodified H3 and H3K9me2 peptides; binding to H3K4me2 was not detectable, and binding to the H3K9ac peptide was very weak (Table 1).
In sum, our screen identified the NuRD/MeCP1 complex and seven proteins (HP1α, HP1β, HP1γ, CHD1, WDR5, PAF1, SET/TAF-1β, and ERCC6) which were previously shown to directly associate with modified or unmodified H3 tail. These direct binders and their associated proteins constituted 33 of the 86 proteins identified in these four peptide pull-down assays. Importantly, the protein partners that associate with the direct binders generally displayed binding preferences similar to those of the direct binders. Notably, however, not all protein partners of known binders had analogous association patterns. For example, ANP32A and ANP32B, which associate with SET/TAF-1β in the INHAT complex, appear to bind specifically to H3K9me2, whereas SET/TAF-1β recognizes both methylated peptides equally well. These data indicate that different complexes formed by a given protein may exhibit different histone-binding properties.
Taken together, the results discussed above (Sections 3.3–3.6) not only demonstrate the feasibility of the approach undertaken in this study, but also suggest that the majority of the remaining 53 identified proteins represent novel direct and indirect H3 tail binders.
Consistent with the role of K4 and K9 methylation in gene activation and silencing, respectively, many of the identified K4 binders are involved in transcriptional activation, whereas K9 binders are involved in transcriptional repression. It is likely that many of the detected proteins associate with H3 indirectly. However, it is striking that 13 of the identified novel binders contain a PHD finger domain, a structural fold implicated in specific methyllysine recognition on the H3 tail and, more recently, in recognizing the H3 tail with an unmethylated K4. Thus, many of the detected PHD finger proteins may in fact represent direct H3 binders. Five novel PHD finger binders associated specifically with the H3K4me2 peptide (PHF2, PHF8, PHF12, PHF16, and PHF23) and two with the H3K9me2 peptide (DPF2 and UHRF2) (Table 3). Another PHD finger protein, TRIM28/KAP-1, also preferentially bound to H3K9me2; however, as discussed above (Section 3.4), this interaction may be mediated by its association with HP1. Additionally, three identified PHD finger proteins discriminated against methylation at K4 (PHF14, RAI1, and TRIM33) in a manner similar to the PHD finger of BHC80, and one recognized only the unmodified tail (BAZ2A). Although the identification of these PHD finger proteins was based on a low number of peptides identified for each of these proteins (Table 1), we were able to confirm the binding of PHF2 to H3K4me2 by western blotting analysis of H3 peptide pull-downs containing various methylation states at K4 and K9 (Fig. 4).
Two of the identified H3K4me2 binding proteins (PHF2 and PHF8) contain a PHD finger and a JmjC domain, a conserved histone demethylase domain (Table 3). If association of PHF2 and PHF8 with H3K4me2 indeed occurs directly via the PHD finger, it would provide a first example of a JmjC domain protein that specifically recognizes this site.
PHD fingers and bromodomains are often present adjacent to each other in many chromatin-associated proteins. These binding modules have been shown to functionally cooperate, possibly through the combinatorial reading of methyl–acetyl modification patterns. Four of the identified PHD finger proteins also contain a bromodomain (TRIM28/KAP-1, BAZ1B, BAZ2A, and TRIM33), but only BAZ1B preferentially bound to the H3K9ac peptide. However, given the very limited set of modifications we used in this study, our results cannot preclude the possibility of combinatorial methyl–acetyl reading by these factors.
WD40 is another domain implicated in the recognition of methylated proteins. WDR61 was prominently detected in both the H3K4me2 and H3K9me2 pull-downs (Tables 1 and 3), with weaker association with the unmodified and acetylated peptides. This binding pattern is similar to that of WDR5, although the numbers of WDR61 peptides detected in the H3K4me2 and the H3K9me2 pull-downs were comparable, suggesting that the WD40 domain of this protein may not have the ability to discriminate between these two methylated lysines. We are currently investigating a possible mechanism of methylated H3 recognition by WDR61.
In this study, we have identified 86 proteins that bind to histone H3 amino-terminal peptides, including H3 tail peptides with dimethylated K4 and K9, and acetylated K9. Given the importance of H3K4 and H3K9 methylation in gene expression regulation, the identification and cataloging of effector protein complexes recognizing these marks is an important first step in understanding the molecular mechanisms of epigenetic regulation. To our knowledge, this report is the most comprehensive study to date on the identification of proteins bound to these commonly modified residues on histone H3.
We used the number of tryptic peptides identified for each of the 86 proteins in the pull-down as a rough estimate of the relative amount of proteins identified. While many variables affect the number of peptides that can be identified for each protein, such as protein abundance, efficiency of protein digestion and peptide extraction, and how well the peptides “fly” in the mass analyzer, there is a general correlation between protein abundance and the number of spectra identified. It is important to emphasize that spectral counting is not truly quantitative, though it does provide a useful approximation of the relative amounts of each protein identified in the four pull-downs. For proteins that are identified based on a large number of spectra, the relative difference in protein abundance more closely correlates with the difference in the number of spectra identified in each pull-down. For proteins that are identified with a very low number of peptides, it is more difficult to draw meaningful conclusions about how well the difference in spectral counts reflects the relative difference in protein abundance between pull-downs. We caution that for all the new binders identified in this study, further biochemical validations are warranted. Despite these limitations, for PHF2, a protein identified from a very low number of peptides, we were able to confirm its specific interaction with H3K4me2 by western blotting analysis (Fig. 4); thus, at least for PHF2, spectral counting as a means of semi-quantification is valid.
Two previous studies that used a similar peptide pull-down approach to identify novel H3K4me2 and H3K4me3 readers in 293 nuclear extracts led to the identification of WDR5 and BPTF, respectively [40, 46]. In this study, in addition to WDR5, many more H3K4me2-associated proteins were identified, possibly because we further optimized the experimental procedure, used less stringent wash conditions, and performed a more detailed mass spectrometric analysis of the recovered proteins (see Section 2 for details). A significant advantage of this approach over the candidate approach with recombinant candidate proteins is that novel effectors are discovered in an unbiased manner. However, a resulting caveat is that we are unable to distinguish direct binding to the modified lysine from indirect interactions. Confirmation of direct binding requires further downstream biochemical characterization, such as in vitro binding assays and in vivo co-localization studies. Nevertheless, the over-representation of proteins containing methyllysine binding modules among the newly identified binders, as well as the high recovery of the previously characterized H3K4me2 and H3K9me2 methylation effectors in our screen, suggests that this study will significantly expand the spectrum of histone PTM effector proteins.
The diversity of proteins identified in our screen, many of which bind in a modification-dependent manner, will provide a rich source of biological leads and a stepping stone for future studies.
We thank Joanna Wysocka and Eric Chan for critical comments on the manuscript. J.W. is partially supported by the Research Platform of Cell Signaling Networks from the Science and Technology Commission of Shanghai Municipality. This work was supported by National Institutes of Health Grant 1R43CA132680-01 to D.W.C.
The authors have declared no conflict of interest.