We previously reported a novel affinity purification (AP) method termed modified chromatin immunopurification (mChIP), which permits selective enrichment of DNA-bound proteins along with their associated protein network. Here, we report a large-scale study of the protein network of 102 chromatin-related proteins from budding yeast that were analyzed by mChIP coupled to mass spectrometry. This effort resulted in the detection of 2966 high confidence protein associations with 724 distinct preys. mChIP resulted in significantly improved interaction coverage as compared with classical AP methodology for ~75% of the baits tested. Furthermore, mChIP successfully identified novel binding partners for many lower abundance transcription factors that had previously failed using conventional AP methodologies. mChIP was also used to perform targeted studies, particularly of Asf1 and its associated proteins, to gain an understanding of the physical interplay between Asf1 and two other histone chaperones, Rtt106 and the HIR complex.
Progress in the chromatin field has been closely intertwined with technical improvements in both genomic and proteomic technologies. For instance, the chromatin immunoprecipitation (ChIP) protocol has been used for many years to define the binding sites of a protein on DNA (Kuo and Allis, 1999). While early uses of the ChIP protocol were coupled to standard PCR and restricted to the study of a few genomic loci at a time, the development of better detection platforms, such as ChIP-chip (Ren et al, 2000) and ChIP-seq (Barski et al, 2007), now allows genome-wide studies. The analysis of proteins associated with chromatin has also benefited from technical advances. For instance, detailed analyses of histone isomers and their post-translational modifications (PTM) by mass spectrometry (MS) have been conducted in numerous organisms (Masumoto et al, 2005; Bonenfant et al, 2006; Thomas et al, 2006). These analyses enabled researchers to identify novel modifications (Garcia et al, 2007) and to uncover cooperative actions among multiple histone modifications (Jiang et al, 2007; Taverna et al, 2007), adding an extra level of complexity that was previously undetected.
One area of chromatin research that still requires technical improvement is the identification and characterization of protein complexes associated with chromatin (Lambert et al, 2010). Affinity purification and mass spectrometry (AP-MS) has emerged as a powerful tool for characterizing protein–protein interactions and biological systems in general (Gingras et al, 2007; Gstaiger and Aebersold, 2009). To date, AP-MS has been successfully applied to multiple model organisms, including budding yeast (Rigaut et al, 1999), fission yeast (Cipak et al, 2009; Kim et al, 2009a), Drosophila melanogaster (Veraksa et al, 2005), Caenorhabditis elegans (Ooi et al, 2010), mouse (Bienvenu et al, 2010), mouse stem cells (Kim et al, 2009b) and human cells (Glatter et al, 2009). Furthermore, numerous large-scale studies have been performed both in budding yeast (Ho et al, 2002; Gavin et al, 2006; Krogan et al, 2006) and in human cells (Ewing et al, 2007), resulting in an improved characterization of protein–protein interaction for thousands of gene products. As well, our understanding of many chromatin-related processes, such as transcription, has greatly benefited from AP-MS studies. For example, an exhaustive analysis of protein complexes associated with human RNA polymerase II (RNAPII) by tandem affinity purification (TAP) and analyzed by MS (Jeronimo et al, 2007; Cloutier et al, 2009) revealed many new proteins relevant to RNAPII biology. However, these and most other studies (Sardiu et al, 2008) only focused on protein complexes that were extracted in the soluble fraction of the nucleus or the entire cell. No study systematically investigated protein interactions of proteins bound to chromatin.
Two techniques have been reported to enable purification of protein complexes associated with a particular genomic locus. The first approach relies on specific nucleic acid probes, which are affixed to a solid support (i.e., beads). These nucleic acid sequences act as affinity probes and replace antibodies. The proteins associated with the nucleic acid probes can then be selectively purified and subsequently identified by MS (Rubio et al, 2008; Schultz-Norton et al, 2008; Burckstummer et al, 2009; Dejardin and Kingston, 2009). The second approach uses mini-chromosomes that contain sequences of interest flanked with repetitive Lac operator sequences. These mini-chromosomes can be selectively purified from the bulk of chromatin using an immobilized Lac repressor (Akiyoshi et al, 2009; Unnikrishnan et al, 2010). These two approaches are well suited for studying specific genomic loci and their associated protein complexes. Unfortunately, these methods are limited in their applicability because they require many specialized tools (affinity probes), they focus only on distinct genomic loci and they require a large amount of materials. Thus, another approach is required for performing large-scale studies involving multiple baits.
To gain additional insight into the role of chromatin binding proteins, we previously reported the development of an AP method coupled to MS termed modified chromatin immunopurification (mChIP; Lambert et al, 2009). mChIP efficiently purifies protein–DNA macromolecular complexes and enables their subsequent analysis by MS. The mChIP method consists of a single AP step, whereby chromatin-associated proteins are isolated from mildly sonicated and gently clarified cellular extracts using magnetic beads coated with antibodies (Lambert et al, 2009). As such, the mChIP approach maintains chromatin fragments in solution, enabling their specific purification, something not previously possible in classical AP-MS methods (Lambert et al, 2009). mChIP was successfully applied to the study of both histone (Lambert et al, 2009) and non-histone (Fillingham et al, 2009; Lambert et al, 2009) chromatin-associated proteins. Furthermore, the mChIP method was shown to drastically increase the coverage of the interactome for chromatin-associated proteins that are difficult to purify, such as Lge1 and Yta7 (Lambert et al, 2009). Finally, contrary to classical AP-MS techniques, mChIP can sensitively identify direct and indirect (through chromatin) protein associations present only at a few genomic loci (Fillingham et al, 2009).
In this study, we report the first large-scale mChIP characterization of the chromatin interactome in budding yeast. As part of this study, 102 baits known to bind DNA or with functional links to chromatin were successfully purified by mChIP. MS was used to identify the chromatin proteins associated with these baits. This mChIP study of the chromatin interactome resulted in the detection of 2966 high confidence protein associations with 724 distinct preys. To our knowledge, this is the first large-scale effort to map the chromatin-associated protein–protein interaction network.
We were particularly interested in better defining the interactome of chromatin-associated proteins for which little information was available. An analysis of the manually curated complement of the Saccharomyces Genome Database (SGD; www.yeastgenome.org) identified 64 proteins binding to DNA, including 32 transcription factors or transcriptional activators/repressors (Supplementary Table S1). Interestingly, 70% of these transcription factors and transcriptional activators/repressors possess five or fewer known interaction partners previously observed by AP-MS in the BioGRID database (Stark et al, 2006; www.thebiogrid.com). By contrast, 15 out of the 64 DNA binding proteins had more than 20 protein–protein interactions reported by AP-MS (Supplementary Table S1). This group of proteins was composed mostly of histones or members of large chromatin remodeling protein complexes, which are present at high levels in the cell. As such, current AP-MS methods appear viable for studying some types of DNA binding proteins (e.g., histones), but they provide little information about other classes (e.g., transcription factors).
In this study, we used the mChIP procedure to characterize the 32 known DNA binding proteins that have fewer than 6 reported interactions (Supplementary Table S1) and 98 other proteins with molecular functions relevant to chromatin biology. For instance, 10 histone chaperones, 10 lysine acetyl transferases (KAT), 6 lysine methyl transferases and 7 nuclear proline isomerases were also used as baits for mChIP (see Supplementary Table S2 for complete list). The protein expression of endogenous C-terminally TAP-tagged baits (Howson et al, 2005) was first assessed by western blot. From the 130 yeast strains tested, 110 showed the expression of a TAP-tagged bait protein at the correct molecular weight. These 110 strains were subsequently subjected to large-scale mChIP purifications (Figure 1). The purified proteins from each mChIP were resolved on 4–12% NuPAGE gels and the gels were silver stained (Supplementary Figure S1). Each lane corresponding to one mChIP experiment was sliced into 12 sections. The proteins present in the sections were in-gel digested with trypsin and were subsequently analyzed by MS (see Materials and methods section for details). In total, 110 different TAP-tagged proteins were subjected to mChIP, and 102 of them were successfully analyzed by MS (Figure 1B and C, Supplementary Table S2).
By design, the mChIP technique attempts to preserve protein–protein interactions by keeping the salt concentration in buffers and the sample centrifugation to a minimum (Lambert et al, 2009). Consequently, the mChIP analysis of proteins globally associated with chromatin (such as histones (Barski et al, 2007) or members of the RSC complex (Floer et al, 2010)) identified large numbers of associated proteins. Efficient data analysis is thus critical to fully appreciate the data generated by mChIP-MS. To refine the mChIP data set, we first applied a step designed to remove common contaminants (Supplementary Table S3). The list of common contaminants was compiled from control mChIP purifications (Lambert et al, 2009) and from a list of ribosomal proteins (common contaminants in AP-MS experiments (Gingras et al, 2007)) in SGD (www.yeastgenome.org). This first curation step resulted in a data set containing 5723 protein associations among 102 unique baits and 896 distinct preys (Figure 1B, Supplementary Table S4). The results were visualized as a heat map generated by hierarchical clustering of the data set (Supplementary Figure S3). Upon further examination of the heat map, it became clear that numerous prey proteins were detected at high frequencies in the mChIP results (vertical lines in Supplementary Figure S3). While these high-frequency preys were never observed in our negative controls, they did not appear relevant to chromatin biology and were also removed from the final mChIP-MS data set. To more systematically identify these non-specific mChIP preys, an mChIP abundance factor (the number of times a prey was identified in our mChIP screen) was determined for each prey (Supplementary Table S4). Examples of high-abundance preys include Yra1 (54), Prp43 (50) and Vps1 (48), which have housekeeping roles not related to chromatin biology.
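As an illustration only, the abundance factor described above can be sketched in a few lines of Python. The sketch assumes that each identification is recorded as a (bait, prey) pair and that a prey is counted once per bait in which it was detected; the function name `abundance_factors` and the toy pairs are hypothetical and are not taken from the supplementary tables.

```python
from collections import Counter


def abundance_factors(associations):
    """Return, for each prey, the number of distinct baits it was detected with.

    `associations` is an iterable of (bait, prey) pairs. Repeated pairs
    (for example, from replicate purifications) are counted only once.
    """
    unique_pairs = set(associations)
    return Counter(prey for _bait, prey in unique_pairs)


# Hypothetical toy data: generic names, not real mChIP results.
associations = [
    ("baitA", "prey1"), ("baitA", "prey2"), ("baitA", "preyX"),
    ("baitB", "preyX"), ("baitC", "preyX"), ("baitB", "prey3"),
    ("baitC", "preyX"),  # duplicate pair, ignored
]

factors = abundance_factors(associations)

# Preys ranked from most to least frequently detected.
ranked = sorted(factors.items(), key=lambda item: (-item[1], item[0]))
```

Ranking preys by this count gives a simple list of candidates for closer inspection; it does not by itself decide whether a prey is a non-specific binder.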
Other scoring algorithms for the removal of non-specific binders have been reported (Ewing et al, 2007), but our data set is not suitable for these algorithms. First, mChIP does not identify only direct protein–protein interactions but also indirect protein associations mediated by chromatin. No previous scoring algorithm has been designed to take this into account. Second, the baits studied by mChIP are functionally linked, and thus they often associate with the same preys. As such, some preys have a high mChIP abundance factor but, nonetheless, they need to be retained in the final mChIP data set (e.g., histone chaperones co-purifying with histones). To circumvent these issues, a manual examination of the data set was performed based on the prey's mChIP abundance factor, molecular function and cellular localization (see Materials and methods section for complete details). This led to the removal of 170 non-specific binders, resulting in a higher confidence mChIP data set containing 724 prey proteins (Supplementary Table S4). This refined data set was used to generate a second heat map based on hierarchical clustering using the Pearson's correlation (Figure 2A; Supplementary Figure S3).
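To make the clustering metric concrete, the following Python sketch computes the Pearson correlation between the prey profiles of two baits and converts it to a distance (1 − r), which is one common input for hierarchical clustering. It is a minimal sketch under stated assumptions: profiles are taken to be binary presence/absence vectors, although the values actually clustered in Figure 2A may differ, and all names and toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # correlation is undefined for a constant profile
    return cov / (sd_x * sd_y)


def prey_profile(bait, associations, preys):
    """Binary vector marking which of `preys` were detected with `bait`."""
    hits = {prey for b, prey in associations if b == bait}
    return [1.0 if prey in hits else 0.0 for prey in preys]


# Hypothetical toy data: generic names, not real mChIP results.
associations = [
    ("baitA", "prey3"), ("baitA", "prey4"),
    ("baitB", "prey3"), ("baitB", "prey4"),
    ("baitC", "prey1"), ("baitC", "prey2"),
]
preys = ["prey1", "prey2", "prey3", "prey4"]

profile_a = prey_profile("baitA", associations, preys)
profile_b = prey_profile("baitB", associations, preys)
profile_c = prey_profile("baitC", associations, preys)

# Distance used for clustering: 0 for identical profiles, 2 for opposite ones.
dist_ab = 1.0 - pearson(profile_a, profile_b)
dist_ac = 1.0 - pearson(profile_a, profile_c)
```

In this toy case the two baits that share all of their preys end up at distance 0, whereas baits with complementary prey sets end up at the maximum distance, which is the behavior that groups functionally linked baits in a heat map.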
In total, curation of the mChIP-MS data set removed 67% of all protein–protein interactions while maintaining 85% of the protein–protein interactions, previously detected by TAP-MS of the same bait proteins (Supplementary Figure S4A). Interestingly, the majority of the 49 protein–protein interactions, previously detected by TAP-MS but removed by our curation method, have been annotated as background in a subsequent reanalysis of the TAP-MS data sets (Collins et al, 2007) or in a recent large-scale AP-MS study (Breitkreutz et al, 2010) (Supplementary Figure S4B). Furthermore, comparison of the preys identified as background in this study with two other large-scale AP-MS studies (Krogan et al, 2006; Breitkreutz et al, 2010) revealed a large overlap (Supplementary Figure S5). In addition, proteins defined as background only by our curation method are enriched for RNA processes and location in the nucleolus in agreement with these preys being contaminants (Supplementary Figure S6A). On the other hand, most proteins classified as contaminants in other AP-MS studies but not included in the mChIP-MS background have not been detected (78 out of 127) or have been detected only in one or two mChIP (32 out of 127) (Supplementary Figure S6A), consistent with them not being contaminants in our data set. The remaining 17 preys identified by others as contaminants all possess functions related to chromatin biology, such as histones, which explain their identification in multiple mChIP-MS (Supplementary Figure S6A).
While our curation approach appears efficient, the lack of an appropriate gold standard data set for benchmarking of the mChIP curation method prevents easy assessment of its value. Thus, we defined global trends within the mChIP-MS data as a means to better evaluate its quality. For instance, a comparison of the list of non-specific binders to our curated data set revealed that the non-specific binders are biased towards mid to high expression levels, whereas the mChIP preys are biased towards low to mid expression levels (Figure 2B). The higher expression levels of non-specific binders are consistent with the literature on AP-MS contaminants (Chen and Gingras, 2007). Furthermore, over 80% of the preys in the final data set are each associated with fewer than five baits (Figure 2C). In addition, the mChIP data were enriched for chromatin-related functions (such as chromosome segregation/division or transcriptional control), while the non-specific binders were not (Supplementary Figure S7). We also observed that the preys retained in the final data set were detected by mChIP-MS in a reproducible manner across multiple biological replicates (Supplementary Figure S8). Taken together, these metrics indicate that our manual removal of non-specific binders improved the overall quality of the mChIP data set.
Next, mChIP data were compared with previously reported genome-wide TAP-MS data (Gavin et al, 2006; Krogan et al, 2006). For over 75% of the baits studied by mChIP-MS, more prey proteins were detected compared with TAP-MS. Furthermore, 18% of the baits that were successfully analyzed by mChIP-MS had previously failed by TAP-MS (Gavin et al, 2006; Krogan et al, 2006; Figure 2D). Interestingly, there was no correlation between the increase in the number of associated proteins detected by mChIP and the bait expression level (Figure 2E). This finding suggests that the increase in the number of protein associations detected by mChIP-MS compared with those detected by TAP-MS is not mainly due to a more sensitive mass spectrometer, but rather to the purification technique itself. Overall, the budding yeast chromatin-associated interactome that is now accessible by mChIP-MS is an environment not previously investigated and worth further study.
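The per-bait comparison described above can be sketched as follows. This is a hedged illustration rather than the analysis pipeline used for Figure 2D: it assumes that prey counts per bait are available for both methods, treats a bait with no reported TAP-MS preys as having failed by TAP-MS, and uses hypothetical names and toy counts throughout.

```python
def compare_methods(mchip_counts, tap_counts):
    """Compare prey counts per bait between mChIP-MS and TAP-MS.

    Both arguments map bait name -> number of preys detected. Baits absent
    from `tap_counts` are treated as having zero TAP-MS preys (an assumption).
    Returns (fraction of baits with more preys by mChIP,
             fraction of baits detected by mChIP but with no TAP-MS preys).
    """
    baits = list(mchip_counts)
    improved = [b for b in baits if mchip_counts[b] > tap_counts.get(b, 0)]
    rescued = [
        b for b in baits
        if mchip_counts[b] > 0 and tap_counts.get(b, 0) == 0
    ]
    return len(improved) / len(baits), len(rescued) / len(baits)


# Hypothetical toy counts: generic names, not real data.
mchip_counts = {"baitA": 12, "baitB": 5, "baitC": 8, "baitD": 3}
tap_counts = {"baitA": 4, "baitB": 9, "baitC": 0}

frac_improved, frac_rescued = compare_methods(mchip_counts, tap_counts)
```

Relating the per-bait gain (mChIP count minus TAP-MS count) to bait expression level, as in Figure 2E, would require expression data that are not part of this sketch.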
High-abundance chromatin-associated proteins, such as histones and their chromatin-associated protein networks, were successfully characterized by mChIP-MS. The results are consistent with the wealth of protein interaction data currently available in the literature for high-abundance baits (Supplementary Table S1; Fillingham et al, 2009; Lambert et al, 2009). In our current study, emphasis was also placed on lower abundance targets, such as transcription factors. For instance, the results from the mChIP-MS analysis of the Hap2 transcription factor (a member of the CCAAT-binding complex) were compared with traditional TAP-MS. mChIP-MS of Hap2-TAP revealed over 80 associated proteins, including Hap3 and Hap5, which are known to form a heterotrimer with Hap2, and were previously identified by conventional TAP-MS (Gavin et al, 2006; Krogan et al, 2006) (Figure 3A). Interestingly, the overexpression of Hap2-FLAG using a galactose-inducible construct followed by a one-step AP-MS analysis also produced an extensive interactome (Ho et al, 2002). However, a significant fraction of these associated proteins (~57%) did not possess functions related to chromatin or were localized outside the nucleus (Figure 3B, Supplementary Table S5). By contrast, Hap2-TAP mChIP largely uncovered chromatin-related associations (80 out of 82), including six transcription factors (Ste12, Dal81, Gln3, Stp1, Stp2 and Yap5) and chromatin remodeling complexes (RSC, SAGA, etc). The association of Hap2 (a global regulator of carbohydrate metabolism) with Dal81 and Gln3 (two transcription regulators of nitrogen utilization pathways) suggests a broader role for Hap2 than previously reported. Our mChIP data suggest that these transcription factors may mediate crosstalk between the nitrogen utilization and non-fermentable sugar utilization pathways.
Another example of transcription factors successfully studied by mChIP is the highly homologous and functionally redundant Msn2 and Msn4 proteins, which are implicated in stress response. We recently showed using conventional AP-MS that the transcription factor Msn4 is associated with the NuA4 lysine acetyltransferase complex (Mitchell et al, 2008). This interaction was further characterized by mChIP. First, mChIP-MS of Esa1-TAP (the catalytic subunit of the NuA4 complex) was performed and, as expected, resulted in the co-purification of Msn4 (Figure 3C). Second, reciprocal mChIP of both Msn4-TAP and the related Msn2-TAP resulted in the co-purification of NuA4 subunits (Figure 3C). Moreover, members of both the SAGA and TFIID complexes were also associated with Msn2 and Msn4, which suggests that numerous transcriptional co-activators participate in Msn2 and Msn4 functions (Figure 3C). Based on the spectral count data (Liu et al, 2004), it appears that Msn2 preferentially associates with protein complexes that contain the Gcn5 rather than the Esa1 KAT (Figure 3C). Conversely, Msn4 does not show this bias in its association with these transcriptional co-activators (Figure 3C). mChIP-MS analyses of Msn2 and Msn4 also identified proteins uniquely associated with each of these transcription factors. For instance, Ste23 was the top MS hit in the Msn2 mChIP, but was not detected with Msn4. Ste23 is a metalloprotease and an ortholog of the mammalian insulin-degrading enzyme (Alper et al, 2009). Ste23 was also shown to catalyze the cleavage of a peptide sequence corresponding to pro-α-factor in vitro (Alper et al, 2009). Furthermore, an additional link between Ste23 and Msn2 lies in the presence of a stress response element (STRE) upstream from the STE23 gene (Treger et al, 1998b). STRE are often bound by the Msn2 and Msn4 transcription factors, and STRE-controlled genes are induced following heat shock (Treger et al, 1998a, 1998b).
Heat-shock proteins, many of which possess STRE, are required for proper α-factor processing (Meacham et al, 1999). Based on our data, we postulate that Ste23 has a role in proper stress responses in budding yeast.
Coordinated gene expression is essential for maintaining cellular fitness (Zhou et al, 2009). In budding yeast, numerous transcription factors are critically involved in regulating the expression of multiple genes at distinct phases of the cell cycle (Wittenberg and Reed, 2005). In S. cerevisiae, the cell cycle transition from G1 to S begins with START, a coordinated transcriptional program resulting in the timed expression of hundreds of genes. Two protein complexes essential for this process are the MBF and SBF transcription factors, composed of Swi6–Mbp1 and Swi4–Swi6, respectively (Moll et al, 1992). Previous AP-MS studies of MBF and SBF revealed interaction partners, such as Whi5, Nrm1, and Msa1, with known roles in cell cycle regulation. mChIP-MS analyses of Swi4-TAP, Swi6-TAP and Mbp1-TAP successfully identified known interaction partners (such as Stb1) for both MBF and SBF, which had not been previously identified by AP-MS methods (Figure 4).
Interestingly, the networks for the transcription factors Azf1 and Mcm1 showed an interconnection with the Swi4, Swi6 and Mbp1 networks (Figure 4). In fact, associations between Azf1-TAP and Swi6, as well as associations between Mcm1-TAP and Mbp1, Swi4, and Swi6, were detected by mChIP-MS (Figure 4). Mcm1 is a transcription factor that participates in the regulation of multiple genes depending on its associated proteins (Ferrezuelo et al, 2009). For instance, when Mcm1 interacts with Ste12, it participates in regulating the mating-specific genes (Errede and Ammerer, 1989), whereas association with Yox1 or Yhp1 leads to the regulation of genes expressed in the M to G1 transition (Pramila et al, 2002). The mChIP data for Mcm1-TAP shows a wide array of associated proteins involved in properly regulating the cell cycle (e.g., Sum1) and transcriptional activators, such as Gzf3 and Pog1 (Figure 4). Furthermore, Mcm1-TAP was found to associate with Bck2 and Ste12 by mChIP-MS. Bck2, which is known to activate numerous cell cycle-regulated genes (Ferrezuelo et al, 2009), was previously shown to be affected in strains lacking ste12 or mcm1, thus indicating a common function (Ferrezuelo et al, 2009). The fact that Mcm1-TAP co-purified with both Ste12 and Bck2 by mChIP-MS supports a direct interplay between these transcription factors at specific promoters. Overall, we successfully purified several networks of transcription factors involved in cell cycle regulation using the novel mChIP approach.
As part of our proteomic screen, many nuclear peptidyl proline isomerases, enzymes that catalyze conformational changes of proline residues (Lu et al, 2007), were studied. Seven nuclear peptidyl proline isomerases, including Cpr1 (Figure 5), a known member of the Set3 complex (Pijnappel et al, 2001), were successfully analyzed by mChIP-MS. In particular, mChIP-MS of Cpr1-TAP revealed a large number of associated proteins, including members of the Set3 complex as expected (Figure 5A). In addition, all members of the TORC1 complex and some members of TORC2 were found suggesting a role in nutrient sensing. Moreover, numerous components of the spindle pole body, as well as proteins with spindle-related functions, were found with Cpr1. These findings suggest that Cpr1 possesses wider functions than previously thought, especially with regard to regulating cellular growth (Figure 5A). Surprisingly, the E3 ubiquitin ligase Bre1 and its interaction partner Lge1 were found to be associated with Cpr1 (Figure 5A). This association raises the possibility that Cpr1 is ubiquitinated by Bre1, which is supported by the presence of higher molecular weight bands in a western blot for the TAP tag of the mChIP material, albeit at a low level (Figure 5B). To further explore this possibility, Cpr1-TAP strains containing a plasmid encoding myc-tagged ubiquitin under the control of the copper-inducible CUP1 promoter were prepared. Following induction with CuSO4, myc-tagged ubiquitin was expressed at high levels to facilitate the detection of ubiquitinated proteins. Using this strategy, Cpr1-TAP was observed to be ubiquitinated at mid-log phase culture (Figure 5C). Furthermore, the extent of Cpr1 ubiquitination was increased following treatment with rapamycin (a TORC1 inhibitor) or benomyl (a microtubule-destabilizing agent; Figure 5C), whereas global ubiquitination levels were not increased (Supplementary Figure S9). 
These higher molecular weight bands were abolished when a mutant ubiquitin K48R G76A protein, which is incapable of forming polyubiquitin chains, was expressed (Supplementary Figure S9A). Therefore, these higher molecular weight bands were confirmed to be polyubiquitinated forms of Cpr1. Moreover, in strains lacking lge1, bre1 or rad6, polyubiquitination of Cpr1 was significantly reduced (Figure 5D, Supplementary Figure S9B), further supporting a role for Bre1 in mediating ubiquitination of Cpr1. Cpr1 ubiquitination appears to be modulated in response to the two drug treatments, which suggests roles for Cpr1 in nutrient sensing and cell cycle regulation through the action of Lge1, Bre1 and Rad6.
We previously used mChIP to show that the histone H3/H4 chaperone Rtt106 associates with two other histone chaperone complexes, HIR and CAF-1 (Fillingham et al, 2009). Because HIR and CAF-1 are both known to interact with Asf1 (Sharp et al, 2001; Sutton et al, 2001), mChIP was used to further characterize the chromatin-associated protein networks of Hir1-TAP, Rtt106-TAP, Asf1-TAP and Cac1-TAP (Figure 6A). MS analysis of these four baits revealed that HIR, Rtt106 and Asf1 associate with each other, whereas Rtt106 and CAF-1 compose another well-characterized complex (Huang et al, 2005, 2007; Li et al, 2008). The association between Rtt106, HIR and Asf1 was further dissected by testing whether Asf1 was required for Rtt106 association with HIR. Rtt106-TAP mChIP followed by western blotting (mChIP-WB) for Hir1-myc showed a strong association, which was abolished in the absence of asf1 (Figure 6B). We previously demonstrated that Hir1 binding to the HTA1-HTB1 promoter is not affected by deleting asf1 or rtt106, whereas Rtt106 binding to the same promoter requires both Hir1 and Asf1 (Fillingham et al, 2009). Taken together, these findings suggest a central role for Asf1 in the association among Rtt106, HIR and Asf1. We thus focused on Asf1 to further unravel the physical associations among these histone chaperones.
To directly probe the association between Rtt106 and Asf1, Rtt106-TAP mChIP-WB experiments were performed from strains containing a myc-tagged version of wild-type Asf1 or the Asf1 V94R mutant (Figure 6C). The V94R mutation was previously shown to cause a greatly reduced affinity for histone H3/H4 (Mousson et al, 2005) and, therefore, it is a good tool for defining the role of histones H3/H4 in these associations. The Rtt106–Asf1 association was found to be significantly reduced in the V94R mutant compared with the wild-type Asf1 (Figure 6C). This suggests that the ability of Asf1 to bind histone H3/H4 is critical for efficient interaction with Rtt106. An alternative explanation is that the association between Rtt106 and Asf1 is dependent on the presence of chromatin and thus is reduced in the V94R mutant. To directly test this alternative, mChIP-WB experiments were performed in the presence of benzonase, a promiscuous endonuclease that digests both DNA and RNA (Figure 6C). In the absence of DNA, wild-type Asf1 was co-purified with Rtt106-TAP, but the V94R mutant was not detected (Figure 6C). This suggests that Asf1 and Rtt106 interact through histone H3/H4. This indirect association between Asf1 and Rtt106 is consistent with the lack of interaction observed between recombinant Asf1 and Rtt106 in in vitro binding assays (Huang et al, 2005). On the other hand, the well-documented interaction between the HIR complex and Asf1 (Sharp et al, 2001; Sutton et al, 2001) is not affected in the V94R point mutant or in the absence of DNA (Figure 6D).
The nucleosome assembly factor Asf1 has been extensively studied by AP-MS and possesses well-defined interaction partners such as the HIR complex (Green et al, 2005), Rad53 (Emili et al, 2001; Hu et al, 2001) and the histones H3/H4 (Munakata et al, 2000). Further, mChIP-MS experiments of Asf1-TAP successfully identified these known interaction partners and also revealed an extended network of proteins associated with Asf1 such as transcription factors (Pdr1 and Pho2), proteins involved in DNA replication (Sld3, Fob1) and Mnr2, a putative magnesium transporter (Supplementary Table S6). We next tested how this network of associated proteins was affected by the absence of genes previously linked to Asf1 (Figure 7A). Lack of hir1 resulted in a drastic reduction of Asf1's network of associated proteins, including the loss of the HIR complex, Rtt106, and the transcription factors Pdr1 and Pho2 (Supplementary Table S6). On the other hand, deletion of rtt106 appeared to have only a marginal impact on the proteins associated with Asf1-TAP by mChIP-MS (Supplementary Table S6). These findings are consistent with our view that HIR functions upstream of Asf1 and Rtt106, whereas Rtt106 functions downstream of both HIR and Asf1 (Fillingham et al, 2009).
Asf1 was previously shown to be required for the acetylation of lysine 56 of histone H3 (H3K56Ac) (Recht et al, 2006). Tests were performed to define how this histone mark affects the network of proteins associated with Asf1. To do so, mChIP-MS purifications of Asf1-TAP in strains lacking RTT109 (the sole KAT responsible for H3K56Ac) were performed (Figure 7A). In this background, a slight reduction in the Asf1-associated protein network was observed (Supplementary Table S6). Interestingly, the number of Rtt106 peptides sequenced by MS (an indication of protein abundance) was significantly reduced in the rtt109Δ background. This reduction was not observed in a strain lacking vps75 (Figure 7B), a chaperone previously shown to stabilize Rtt109, but not known to affect the levels of H3K56Ac (Fillingham et al, 2008). This observation points to an important role for H3K56Ac in the interaction between Rtt106 and Asf1. The mChIP-MS of Asf1-TAP in a strain where all histone H3 proteins contained the K56R mutation also exhibited a lower number of Rtt106 peptides (Figure 7B). This supports the notion of a reduced association between Asf1 and Rtt106 in the absence of H3K56Ac. Previous work has shown that H3K56Ac (catalyzed by Rtt109) greatly increases the affinity of Rtt106 for H3-H4 and promotes Rtt106-based replication-coupled nucleosome assembly (Li et al, 2008). In addition, we have demonstrated that Rtt106 binds to the HTA1-HTB1 divergent promoter and enables proper replication-independent nucleosome assembly (Fillingham et al, 2009). Using ChIP, we tested whether H3K56Ac affected Rtt106 binding to the HTA1-HTB1 promoter (Figure 7C). Consistent with our mChIP results, conventional ChIP revealed that Rtt106 binding to this promoter is reduced in the absence of rtt109, the enzyme responsible for the H3K56Ac mark (Figure 7D). This reduced binding of Rtt106 to the HTA1-HTB1 promoter is also observed in an H3K56R strain background (Figure 7E).
Taken together, these data support a model where Rtt106 interacts with chromatin via Asf1/HIR in most cases. Moreover, its association with chromatin is more prominent when histone H3/H4 has already been acetylated at K56, which results in the proper assembly of nucleosomes at the HTA1-HTB1 promoter.
The solubility and stability of protein complexes can be a problem in AP-MS experiments. The spindle pole body is a very large macromolecule between 300 and 500 MDa and acts as the only microtubule organizing center in budding yeast (reviewed in Jaspersen and Winey, 2004). Not surprisingly, such a macromolecule is refractory to common AP-MS protocols, forcing traditional biochemical approaches to be employed. For instance, classical purification of the spindle pole body by gradient centrifugation followed by MS analysis revealed most of the components of this large organelle. Unfortunately, this technique is not suitable for large-scale studies because it requires extensive manipulations and 40 l of yeast culture (Wigge et al, 1998). As part of our study, several proteins associated with the spindle pole body were used as baits for mChIP-MS. They were successfully analyzed without the need for further optimization of the method (Figure 3D). Interestingly, some proteins not previously linked to the spindle pole body (e.g., the putative lysine methyltransferase Set5 or the poorly characterized peptidyl proline isomerase Fpr2) were found to co-purify with spindle pole body components by mChIP (Supplementary Table S4). Numerous spindle pole body components are phosphorylated and those PTMs are essential for proper spindle pole body function (Donaldson and Kilmartin, 1996; Stirling and Stark, 1996). The mChIP data obtained for the spindle pole body raise the possibility that other PTMs such as lysine methylation might also be critical for this organelle's function.
We aimed to characterize chromatin-associated proteins that were poorly studied by AP-MS. In doing so, however, we were also successful in studying proteins that possess other functions. One such example is Crp1, a poorly characterized nuclear protein (Huh et al, 2003) reported to bind cruciform DNA (Rass and Kemper, 2002). The known interaction partners of Crp1 include Pep4 and Prc1, which are two proteins involved in vacuolar degradation (Van Den Hazel et al, 1996). mChIP-MS of Crp1-TAP successfully detected Pep4 and Prc1. In addition, mChIP-MS identified another vacuolar proteinase Prb1, the glycogen synthases Gsy1 and Gsy2, as well as the phosphatase Glc7 and its targeting subunit Pig2 (Figure 3E). Surprisingly, six proteins associated with Crp1-TAP (Glc7, Pep4, Gsy2, Pig2, Htd2 and Prb1) are required for proper glycogen accumulation (Francois and Parrou, 2001), which suggests that Crp1 may have a critical role in this process. Interestingly, Crp1 is an ortholog of the mammalian AMP-activated protein kinase β-2 subunit, which is known to directly bind glycogen and coordinate cellular metabolism in response to energy demands (Polekhina et al, 2003). Mutations in the AMPK genes in humans have been reported to result in improper glycogen accumulation and numerous diseases (Arad et al, 2002). Although the exact role of Crp1 in the glycogen synthesis pathway is still undefined, our results clearly reinforce the need for further study of this gene.
In this study, we report the characterization of the protein interactomes of 102 chromatin-associated proteins. This was performed using the mChIP-MS procedure, which we developed to facilitate the purification of chromatin-bound protein networks (Lambert et al, 2009). The application of mChIP-MS to these baits resulted in a substantial increase in the number of nodes in the network, as compared with conventional approaches (Figure 2). Many transcription factors notoriously difficult to study by conventional AP-MS methods were successfully analyzed by mChIP-MS (Supplementary Table S1). An example of this success is demonstrated in our study using mChIP of cell cycle regulators involved in START (Figure 4). In this study, we were able to recapitulate the majority of the protein–protein interactions discovered over the past 10 years for the cell cycle transcription factors SBF and MBF, as well as to considerably expand the network. For instance, the previously hypothesized (Ferrezuelo et al, 2009) association between Mcm1 and Bck2 was re-affirmed with the detection of four unique peptides for Bck2 after Mcm1 mChIP-MS analysis (Supplementary Table S4). Physical interactions between transcription factors are recognized as critical components of their regulation (Walhout, 2006). The ability of the mChIP-MS approach to identify these lower abundance interactions can be attributed to a reduction in sample loss as a consequence of maintaining chromatin in solution, a reduction in the number of processing steps as a consequence of using an efficient single-step AP, and, finally, a fourfold reduction in the mass of cells required per purification. Therefore, the mChIP procedure has proven to be an efficient high-throughput method for studying numerous types of baits associated with chromatin.
By design, our mChIP approach enables the identification of pure protein–protein and chromatin-mediated protein–protein interactions (Lambert et al, 2009). Our final data set contains both direct and indirect protein associations, which produces a more holistic view of these bait interactomes. For instance, extensive literature links the process of histone H2B ubiquitination (requiring the action of Rad6, Bre1 and Lge1; Hwang et al, 2003) to the trimethylation of histone H3 on lysine 4 (H3K4) (performed by the Set1-containing COMPASS complex; Wood et al, 2003). In particular, ubiquitination of histone H2B on lysine 123 was observed only when the E2 ubiquitin ligase Rad6, the E3 ubiquitin ligase Bre1 and their interaction partner, Lge1 (Hwang et al, 2003), were present. Deleting one of these factors resulted in the abrogation of both histone H2B ubiquitination and H3K4 trimethylation (Hwang et al, 2003; Wood et al, 2003). Recent work reported that Swd2, a subunit of the COMPASS complex, is recruited to chromatin in a manner that requires histone H2B ubiquitination (Lu et al, 2007), which suggests a direct physical link between Rad6/Bre1/Lge1, histone H2B ubiquitination and the COMPASS complex. Our mChIP-MS analysis of Bre1 and Lge1 identified COMPASS components, whereas the reciprocal mChIP-MS of Set1 and Swd3 (two COMPASS components) identified Bre1 and Lge1 (Supplementary Table S4). These physical associations are in accordance with the known links between histone H2B ubiquitination and H3K4 trimethylation, and show that the study of large macromolecular complexes containing both direct and indirect associations can be very informative. This is especially relevant in light of recent work that challenged the classical linear view of chromatin architecture in favor of three-dimensional models containing numerous intra- and inter-chromosomal interactions (Fraser, 2006; Schoenfelder et al, 2010a). 
For instance, the estrogen receptor has been recently shown to cause extensive chromatin looping to bring together gene enhancers and their transcription start sites (Fullwood et al, 2009). More generally, co-regulated genes were also shown to physically interact and to associate with ‘transcription factories', which are regions enriched for highly phosphorylated (i.e., active) RNAPII (Schoenfelder et al, 2010b). It is now clear that chromatin architecture is not random, but rather adopts preferred three-dimensional conformations, which are now being discovered (Duan et al, 2010). Thus, our ability to study protein complexes associated with DNA in their native environment should prove invaluable for the study of chromatin.
mChIP-MS analyses of well-characterized proteins, such as the nucleosome assembly factor Asf1, also revealed numerous novel protein associations. For instance, the association between Asf1 and transcription factors (e.g., Pho2, Pdr1) is likely indirect and is lost in the absence of hir1 (Supplementary Table S6). Interestingly, Asf1 and Pho2 have been previously localized to the PHO5 promoter, and both proteins are essential for proper PHO5 activation (Adkins et al, 2007). Moreover, nucleosome assembly at PHO5 was found to be delayed in the absence of Hir1 (Schermer et al, 2005), which raises the possibility of a direct action of HIR in the association between Asf1 and Pho2. It is unknown whether Pho2, a transcription factor, can directly recruit Asf1 via Hir1 to PHO5 in order to properly evict nucleosomes and thus promote PHO5 expression. More generally, we found that the HIR requirement for mediating Asf1 interactions was reflected in the HIR requirement for Asf1 to recruit the H3/H4 chaperone Rtt106.
Even though most open reading frames in Saccharomyces cerevisiae have been analyzed by AP-MS, our study detected numerous novel protein–protein interactions for many baits associated with chromatin. These discoveries reinforce the need to further analyze protein–protein interactions in model organisms, such as budding yeast, using novel techniques designed for a specific class of baits. For example, proteins associated with membranes would greatly benefit from improved protocols. Going forward, we foresee that the development of such protocols, technical improvements in affinity reagents and improved sensitivity of mass spectrometers will contribute to the detection of many more protein–protein interactions.
All yeast strains and plasmids used in this study are listed in Supplementary Table S7. Growth media and strains were prepared following standard practices. Strains from the TAP collection were obtained from Open Biosystems (Huntsville, AL). Genomic deletions and epitope-tag integrations that were made for this study were designed with PCR-amplified cassettes, as described previously (Longtine et al, 1998; Puig et al, 2001) and confirmed by either PCR analysis or immunoblotting for tag expression.
Modified chromatin immunopurification was performed as described previously (Lambert et al, 2009). Briefly, one-step affinity immunopurification was performed using TAP-tagged proteins and M-270 epoxy Dynabeads (Invitrogen) coated with rabbit IgG (Sigma-Aldrich), according to the manufacturer's instructions. Yeast cells (700 ml of culture) grown in yeast, peptone, dextrose (YPD) medium to an OD600 of ~1 were pelleted and washed with water. Cells were resuspended in a lysis buffer (100 mM HEPES, pH 8.0, 20 mM magnesium acetate, 10% glycerol (v/v), 10 mM EGTA, 0.1 mM EDTA with fresh protease inhibitors mixture (Roche) and phosphatase inhibitors mixture (Roche)), frozen in liquid nitrogen in small droplets, and lysed using a coffee grinder half-filled with dry ice for 1 min. The dry ice from the ground cells was allowed to evaporate, and the resulting whole-cell extract was sonicated three times for 30 s with at least 1 min on ice between each pulse. Nonidet P-40 was added to a final concentration of 0.4%, and the sample was mixed by hand for 30 s. The extract was gently clarified by centrifugation at 1800 g for 10 min (4°C), and the supernatant was transferred into a fresh tube. In some cases, 75 units of Benzonase (Sigma-Aldrich) were added per ml of protein extract to completely remove DNA. Freshly prepared rabbit IgG-coated Dynabeads were added (200 μl per sample), and the samples were incubated with end-over-end rotation for 3 h at 4°C. Using a Dynal MPC-S magnet (Invitrogen), the beads were collected on the side of the sample tubes, and the supernatant was discarded. The beads were washed three times in fresh tubes by resuspension and transfer in 1 ml of ice-cold wash buffer (100 mM HEPES, pH 7.4, 20 mM magnesium acetate, 10% glycerol (v/v), 10 mM EGTA, 0.1 mM EDTA, 0.5% Nonidet P-40). Finally, the beads were resuspended in 1 ml of elution buffer (0.5 M NH4OH pH >11, 0.5 mM EDTA) and incubated with end-over-end rotation for 20 min at room temperature.
The protein eluates were transferred into fresh tubes and were evaporated to dryness using a SpeedVac with no heat. The protein sample was resuspended in 1 × loading buffer (50 mM Tris–HCl, pH 8, 2% SDS, 100 mM DTT, 10% glycerol) and resolved on a NuPAGE 4–12% SDS–PAGE gel, unless mentioned otherwise. For protein visualization, the gels were silver stained or stained with Coomassie blue. For western blot analysis, the proteins were transferred onto a nitrocellulose membrane, blocked in 5% non-fat milk in TBST (20 mM Tris-base, 150 mM NaCl, 0.1% Tween 20), and then probed with anti-TAP (Open Biosystems), anti-H3 (Abcam), anti-H3K56Ac (Upstate), anti-H3K4Me3 (Cell Signalling Technologies), anti-actin (Abcam) or anti-myc antibodies (Roche).
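As a worked illustration of the reagent additions in the purification protocol above, the following Python sketch computes the volume of detergent stock needed to reach a 0.4% final concentration and the Benzonase units for a given extract volume. The stock concentration (10%) and the extract volume (10 ml) are assumptions chosen for the example; neither is stated in the protocol, and the function names are hypothetical.

```python
def detergent_volume_ml(extract_ml, stock_percent, final_percent):
    """Volume of stock to add so the mixture reaches final_percent.

    Derived from stock_percent * v = final_percent * (extract_ml + v),
    which accounts for the volume added by the stock itself.
    """
    if not 0 < final_percent < stock_percent:
        raise ValueError("final concentration must be below the stock concentration")
    return final_percent * extract_ml / (stock_percent - final_percent)


def benzonase_units(extract_ml, units_per_ml=75.0):
    """Benzonase units for an extract, at 75 units per ml as in the protocol."""
    return units_per_ml * extract_ml


# Assumed values for illustration only: 10 ml of extract, 10% Nonidet P-40 stock.
np40_ml = detergent_volume_ml(extract_ml=10.0, stock_percent=10.0, final_percent=0.4)
units = benzonase_units(extract_ml=10.0)
print(f"Add {np40_ml:.3f} ml of 10% stock; add {units:.0f} units of Benzonase")
```

The dilution formula includes the added stock volume in the final volume, which matters little at 0.4% but avoids a systematic underestimate with more dilute stocks.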
Gel bands were excised, reduced, alkylated, and digested as described previously (Lambert et al, 2009). Briefly, peptide solutions were dried in a SpeedVac and stored at −20°C until the mass spectrometric analysis. LC-MS/MS was performed by dissolving the peptide samples in 5% formic acid and loading them into a 200 μm × 5-cm precolumn packed in-house with 5 μm ReproSil-Pur C18-AQ beads (Dr Maisch HPLC GmbH) using a micro Agilent 1100 HPLC system (Agilent Technologies). The peptides were desalted on line with 95% water, 5% acetonitrile, 0.1% formic acid (v/v) for 10 min at 10 μl/min. The flow rate was then split before the precolumn to produce a flow rate of ~200 nl/min at the column. Following their elution from the precolumn, the peptides were directed to a 75 μm × 5 cm analytical column packed with 5 μm ReproSil-Pur C18-AQ beads. The peptides were eluted using a 1-h gradient (5–80% acetonitrile with 0.1% formic acid) into an LTQ linear ion trap mass spectrometer (Thermo-Electron). MS/MS spectra were acquired in a data-dependent acquisition mode that automatically selected and fragmented the five most intense peaks from each MS spectrum generated. Peak lists were generated from the MS/MS .raw file using Mascot Distiller 188.8.131.52 (Matrix Science) to produce a .mgf file with default parameters, except that for each MS/MS spectrum, individual peak lists were generated assuming a +2 and a +3 charge. All .mgf files from one sample were merged into a single file and then analyzed and matched to the 6298 S. cerevisiae protein sequences in the SGD (released April 2007), using the Mascot 2.1.04 database search engine (Matrix Science) with trypsin as the digestion enzyme, carbamidomethylation of cysteine as a fixed modification and methionine oxidation as a variable modification. Peptide and MS/MS mass tolerances were set at ±3 and ±0.8 Da, respectively, with one missed cleavage allowed and the significance threshold set to 0.01 (P<0.01).
Finally, an ion score cutoff of 30 was chosen to produce a false-positive rate of <1% in the MS data (Elias et al, 2005). A protein hit required at least two ‘bold red peptides,' i.e., the most logical assignment of the peptide in the database selected. Furthermore, when peptides matched to more than one database entry, only the highest scoring protein was considered.
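For illustration only, the acceptance criteria described above (ion score cutoff, at least two qualifying peptides per protein, and assignment of shared peptides to the highest scoring protein) can be sketched in Python as follows. This is not the Mascot implementation: the function name and input layout are hypothetical, the cutoff is assumed to be inclusive, the protein score is approximated as the sum of its qualifying peptide ion scores, and the 'bold red' status is not modeled.

```python
from collections import defaultdict

ION_SCORE_CUTOFF = 30  # ion score cutoff stated in the text (assumed inclusive)
MIN_PEPTIDES = 2       # minimum number of qualifying peptides per protein hit


def filter_protein_hits(peptide_matches):
    """Return {protein: sorted peptides} for proteins passing the criteria.

    peptide_matches: iterable of (peptide, protein, ion_score) tuples.
    """
    # Best ion score per (protein, peptide), keeping only matches above the cutoff.
    best = defaultdict(dict)
    for peptide, protein, ion_score in peptide_matches:
        if ion_score >= ION_SCORE_CUTOFF:
            previous = best[protein].get(peptide, 0)
            best[protein][peptide] = max(previous, ion_score)

    # Simplified protein score: sum of the qualifying peptide ion scores.
    protein_score = {prot: sum(peps.values()) for prot, peps in best.items()}

    # List every protein that each peptide matches.
    candidates = defaultdict(list)
    for prot, peps in best.items():
        for pep in peps:
            candidates[pep].append(prot)

    # Assign each peptide to its highest scoring protein (ties broken by name).
    assigned = defaultdict(set)
    for pep, prots in candidates.items():
        winner = max(prots, key=lambda p: (protein_score[p], p))
        assigned[winner].add(pep)

    return {
        prot: sorted(peps)
        for prot, peps in assigned.items()
        if len(peps) >= MIN_PEPTIDES
    }


# Toy input with placeholder identifiers (not data from this study).
matches = [
    ("AAK", "P1", 45), ("BBK", "P1", 50),
    ("BBK", "P2", 50), ("CCK", "P2", 25),
    ("DDK", "P3", 60),
]
print(filter_protein_hits(matches))  # {'P1': ['AAK', 'BBK']}
```

In the toy input, P2 loses its shared peptide to the higher scoring P1 and its other peptide falls below the cutoff, whereas P3 is rejected because it has a single peptide.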
Manual curation of raw protein association data generated by mChIP-MS was performed using a two-step process. First, a list of common background contaminants was generated from multiple mChIP-MS experiments from untagged yeast cells. These background proteins are highly abundant and involved in housekeeping roles, such as metabolic processes and ribosomal biogenesis. This list was further supplemented by an exhaustive list of ribosomal proteins curated from the SGD (www.yeastgenome.org) annotated as ‘structural constituent of ribosome' (GO:0003735) (Supplementary Table S3). All ribosomal proteins were added to the list of background contaminants because ribosomes are large macromolecules and, as such, not all of their subunits were observed in the mChIP-MS untagged controls. These background proteins were removed from all raw mChIP-MS association data.
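The first curation step amounts to a set subtraction, which can be sketched in Python as shown below. This is a minimal illustration rather than the procedure actually used: the function name, the data layout (a mapping from bait to prey list) and all protein identifiers in the example are placeholders.

```python
def remove_background(raw_associations, untagged_control_preys, ribosomal_proteins):
    """Remove background contaminants from raw bait-to-prey associations.

    The background list is the union of preys seen in untagged controls
    and the curated list of ribosomal proteins.
    """
    background = set(untagged_control_preys) | set(ribosomal_proteins)
    return {
        bait: sorted(set(preys) - background)
        for bait, preys in raw_associations.items()
    }


# Placeholder identifiers for illustration only (not data from this study).
raw = {
    "BAIT_A": ["PREY_1", "PREY_2", "ABUNDANT_1", "RIBO_1"],
    "BAIT_B": ["PREY_3", "RIBO_2"],
}
untagged = ["ABUNDANT_1"]
ribosomal = ["RIBO_1", "RIBO_2"]  # RIBO_2 never appeared in the untagged controls

print(remove_background(raw, untagged, ribosomal))
# {'BAIT_A': ['PREY_1', 'PREY_2'], 'BAIT_B': ['PREY_3']}
```

Taking the union with the curated ribosomal list covers subunits such as the placeholder RIBO_2, which would otherwise survive because they were not seen in the untagged controls.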
Next, we applied another curation step designed to remove preys present at high frequency in the mChIP-MS association data but without relevance to chromatin biology. To do so, the number of times that a given prey was detected by mChIP-MS in the complete data set, referred to as ‘mChIP abundance factor', was computed. Then each prey that was observed in three or more mChIP-MS experiments was manually curated based on two additional criteria: molecular function and localization. The targeted molecular functions included protein folding, mRNA export, fatty acid biosynthesis, ribosome biogenesis and RNA processing, and the targeted localizations were the mitochondria and the preribosome. The SGD was used to determine the molecular functions and localization of mChIP preys. In this way, 170 proteins were identified as not relevant to chromatin biology, labeled as non-specific mChIP binders and removed from the final mChIP-MS association data (Supplementary Table S4, bottom table). The resulting curated mChIP-MS data set has been submitted to the IMEx (http://imex.sf.net) consortium through IntAct (Kerrien et al, 2007) and assigned the identifier IM-14085.
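The ‘mChIP abundance factor' described above is a per-prey count of experiments, and the selection of candidates for manual curation can be sketched in Python as follows. The sketch assumes one mChIP-MS experiment per bait entry and uses placeholder identifiers; the function names are hypothetical, and the subsequent review of molecular function and localization remains a manual step that is not modeled here.

```python
from collections import Counter


def mchip_abundance_factor(associations):
    """Count the number of experiments in which each prey was detected.

    associations: mapping of experiment (here, one per bait) to its preys.
    """
    counts = Counter()
    for preys in associations.values():
        counts.update(set(preys))  # count each prey once per experiment
    return counts


def flag_for_curation(associations, min_experiments=3):
    """Preys seen in at least min_experiments experiments, for manual review."""
    counts = mchip_abundance_factor(associations)
    return sorted(prey for prey, n in counts.items() if n >= min_experiments)


# Placeholder identifiers for illustration only (not data from this study).
data = {
    "BAIT_A": ["PREY_1", "PREY_2", "PREY_9"],
    "BAIT_B": ["PREY_2", "PREY_9"],
    "BAIT_C": ["PREY_3", "PREY_9", "PREY_9"],
}
print(mchip_abundance_factor(data)["PREY_9"])  # 3
print(flag_for_curation(data))                 # ['PREY_9']
```

Flagged preys are only candidates: under the procedure described, a frequently detected prey is removed only if its annotated function or localization is unrelated to chromatin biology.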
Supplementary information, Supplementary Figures S1–9, Supplementary Tables 1, 2, 5 and 7
Supplementary Tables 3, 4 and 6
We thank Anne-Claude Gingras and Wade Dunham for their critical comments. WT, Asf1-myc and Asf1-V94R-myc plasmids were gifts from Jessica Tyler. This work was supported by operating grants (to DF) from the Canada Research Chair program, Canadian Foundation for Innovation and the University of Ottawa and by operating grants (to KB) from the Canadian Cancer Society Research Institute and an early researcher award from the Ontario Government and by a start-up grant (to JF) from Ryerson University. DF is a Canada Research Chair in Proteomics and Systems Biology. KB is a Canada Research Chair in Chemical and Functional Genomics. J-PL was supported by an Ontario graduate scholarship.
Author contributions: J-PL performed the mChIP purifications and MS analyses, analyzed the data and wrote the manuscript. JF performed the ChIP experiments, generated reagents and edited the manuscript in part under the supervision of JG. MS generated reagents under the supervision of KB. KB edited the manuscript and provided reagents. DF directed the project and writing of the manuscript.
The authors declare that they have no conflict of interest.