Reprogramming of somatic cells into iPSCs involves a dramatic reorganization of chromatin. To identify posttranslational histone modifications that change in global abundance during this process, we have applied a quantitative mass-spectrometry-based approach. We found that iPSCs, compared to both the starting fibroblasts and a late reprogramming intermediate (pre-iPSCs), are enriched for histone modifications associated with active chromatin, and depleted for marks of transcriptional elongation and a subset of repressive modifications including H3K9me2/me3. Dissecting the contribution of H3K9 methylation to reprogramming, we show that the H3K9 methyltransferases Ehmt1, Ehmt2, and Setdb1 regulate global H3K9me2/me3 levels and that their depletion increases iPSC formation from both fibroblasts and pre-iPSCs. Similarly, inhibition of heterochromatin-protein-1γ (Cbx3), a protein known to recognize H3K9 methylation, enhances reprogramming. Genome-wide location analysis revealed that Cbx3 predominantly binds active genes in both pre-iPSCs and pluripotent cells but with a strikingly different distribution: in pre-iPSCs, but not in ESCs, Cbx3 associates with active transcriptional start sites, suggesting a developmentally-regulated role for Cbx3 in transcriptional activation. Despite largely non-overlapping functions and the association of Cbx3 with active transcription, the H3K9 methyltransferases and Cbx3 both inhibit reprogramming by repressing the pluripotency factor Nanog. Together, our findings demonstrate that Cbx3 and H3K9 methylation restrict late reprogramming events, and suggest that a dramatic change in global chromatin character is an epigenetic roadblock for reprogramming.
Reprogramming of somatic cells into iPSCs by overexpression of the transcription factors Oct4, Sox2, Klf4 and cMyc is a fascinating, but inefficient process, with only a small subset of starting cells converting to a pluripotent state after 1–2 weeks1,2. Mechanistic insights into how chromatin regulators and chromatin states control reprogramming are only now beginning to be explored3–11. To gain insight into global chromatin changes that occur during reprogramming to iPSCs, we were interested in quantifying the post-translational modifications (PTMs) of histones. We reasoned that histone PTMs with dramatic changes in global levels during reprogramming could be important for the suppression or promotion of the process.
Investigations of the role of histone PTMs during reprogramming have typically relied on the use of site-specific histone antibodies in immunostaining and chromatin immunoprecipitation (ChIP) experiments3,12–14. While very insightful, these methods are reliant on the availability of cognate antibodies, and epitope recognition can be affected by modifications on neighboring residues or interacting factors. To circumvent these issues, we used label-free mass spectrometry (qMS)-based proteomics15,16 as an alternative approach to quantify alterations in histone PTMs during reprogramming, which is independent of antibodies (Fig S1A).
To begin with, we determined the abundance of acetylation or methylation modifications at lysine (K) residues of histone H3 and H4, at the start and endpoint of reprogramming, i.e. in mouse embryonic fibroblasts (MEFs) and iPSCs derived from these cells. Two iPSC and two MEF lines were subjected to six independent qMS reactions per sample (see Experimental Methods), generating a highly reproducible quantification of histone PTMs in these cell types, which is summarized in Table S1.
We found a wide variation in the abundance of histone acetylation across lysine residues in both histone H3 and H4. Within histone H3, acetylation was most abundant on residues K14 and K23 irrespective of the cell type analyzed (Fig 1A). Acetylation of H3K9, a mark associated with transcriptionally active or bivalent promoters and enhancers17,18, of H3K27, a mark characteristic of active enhancers19, or of H3K56, which overlaps with the binding of OCT4, SOX2, and NANOG in human ESCs20, was present on less than 5% of histone H3 molecules (Fig 1A). These results potentially reflect the preferential association of these acetylation events with regulatory genomic elements compared to a broader chromatin role for H3K14ac and H3K23ac. For example, H3K14 acetylation has already been implicated in DNA damage responses21. Similar variations in the extent of acetylation were found for the lysine residues of the N-terminal tail of histone H4 (Fig 1A). Importantly, almost all acetylated lysines on histone H3 and H4 were more prevalent in iPSCs than MEFs (Fig 1A). Since acetylation of lysines has been associated with active chromatin states and transcription17, our findings extend the conclusion that pluripotent cells are more euchromatic than differentiated cells22.
Histone methylation patterns are more complex due to the presence of mono-, di-, and tri-methylation states. Similar to acetylation, there is a dramatic variation in the abundance of methylation across lysine residues in MEFs and iPSCs (Fig 1B, S1B). For methylated lysines related to transcriptional repression, such as H3K9 and H3K27, between 60–80% of the respective lysines are methylated in both MEFs and iPSCs, revealing an unexpected coverage of the genome by histones carrying these methylated residues. Methylation at lysine residues known to be associated with enhancers and promoters, for example at H3K4, is much less abundant in both cell types. Surprisingly, methylation associated with transcriptional elongation, particularly H3K36 methylation, is relatively abundant in the genome.
Analyzing differences in global histone methylation profiles between iPSCs and MEFs (Fig 1C), we found that H3K79me2 and H3K36me3, two methylation marks associated with transcriptional elongation23,24, and the relatively uncharacterized mark H3K18me125, were the top three marks that are more abundant in MEFs than iPSCs (Fig 1C). Notably, the reduction of global levels of H3K79me2 and H3K36me3 during reprogramming may be important for the generation of iPSCs since the inhibition of Dot1L, the enzyme responsible for H3K79 methylation and overexpression of H3K36me2/me3 demethylases enhance iPSC formation7–9. To better understand the function of H3K18me1, we performed chromatin immunoprecipitation with an antibody specific for H3K18me1 (Fig S1C) in combination with promoter microarrays. We found that H3K18me1 is enriched in coding regions with a pattern similar to that of H3K79me2 (Fig 1D). These findings identify the association of a previously uncharacterized histone modification with transcriptional elongation. They also indicate that the most downregulated histone modifications during reprogramming are all linked to transcriptional elongation which is surprising given that pluripotent cells have been argued to be transcriptionally more permissive compared to differentiated cells26. Therefore, these modifications may have different functions in pluripotent and differentiated cells, which will be interesting to study in the future.
Among methylation marks associated with transcriptional silencing, H3K27 methylation states were not very different between iPSCs and MEFs; H3K9me2 and H3K9me3 levels were higher in MEFs than iPSCs; and H4K20me3 and H4K20me1 were more abundant in iPSCs than MEFs (Fig 1B–C, S1B). We also noted a strong increase in unmethylated H3K9 and H4K20 residues from MEFs to iPSCs that was higher than that of any methylation mark (Fig 1C), suggesting that the unmethylated state of these lysine residues is an important feature of the pluripotent state. Together, these data indicate that not all repressive methylation histone marks are depleted in iPSCs compared to MEFs, although pluripotent cells have a more euchromatic character compared to MEFs. We conclude that H3K9 and H4K20 methylation, in contrast to H3K27 methylation, partly exert their effects on reprogramming by modulation of their global levels.
Histones can be modified simultaneously on multiple amino acids. qMS offers arguably the only approach to exactly quantify combinations of PTMs that occur on the same peptide. We therefore examined the combination of acetylation and methylation that occurred within each tryptic histone peptide (i.e. in the peptides of H3 containing either K9/K14, K18/K23, or K27/K36, or in the H4 peptide carrying K5/K8/K12/K16) (Fig 1E, S1D–F). All examined histone peptides accommodate different modifications at neighboring amino acids, highlighting the complex control of histone modifications and functional output. For the H3-K9/K14 peptide, we found that repressive K9 methylation marks and the activating K14ac were often present on the same histone molecule (Fig 1E). Furthermore, although total levels of H3K14ac were similar between iPSCs and MEFs (Fig 1A), H3K14ac was significantly higher in iPSCs than MEFs when H3K9 was unmodified or acetylated on the same histone molecule (Fig 1E), indicating that modifications on the K9 residue affect the acetylation status of K14 in a cell type-specific manner. In addition, the unmodified form of the H3K9/K14 peptide (K9un/K14un) was the most prevalent isoform of this peptide in iPSCs, perhaps enabling the rapid acquisition of various modifications in response to differentiation cues (Fig 1E).
The differences in global levels of histone PTMs between MEFs and iPSCs prompted us to determine when they occur during reprogramming, and whether iPSCs are similar to ESCs in their global histone PTM profile. Although reprogramming is inefficient, intermediate stages of the process have been described1,3,11,27–29. To examine the global chromatin state in a late intermediate of reprogramming, we took advantage of pre-iPSCs. pre-iPSCs can be isolated from reprogramming cultures as a clonal population of cells with an ESC-like morphology that have efficiently repressed the somatic gene expression program but lack the expression of most pluripotency factors3,27,30. These cells are commonly obtained when reprogramming is induced with retrovirally expressed Oct4, Sox2, Klf4, and cMyc3,27,30. We reasoned that the analysis of the histone PTM profile in pre-iPSC lines would allow us to determine when global chromatin character changes occur in the reprogramming process relative to known transcriptional changes. Hence, we performed label-free qMS analysis for histone PTMs on one male ESC line and on one male and one female pre-iPSC line, with at least five replicate qMS data sets per cell line, and compared them to the iPSC and MEF data. Quantitative differences in histone PTMs between two cell types were confirmed directly by using chemical stable isotope labeling and subsequent mixing of the histone samples from the two cell types before MS analysis16 (Fig S2).
Unsupervised hierarchical clustering of histone PTM levels for all replicate datasets, based on combinations of modifications per tryptic histone peptide (summarized in Table S1), demonstrated that the global chromatin character of pre-iPSCs is similar to that of MEFs (Fig 2A). Furthermore, pre-iPSCs and MEFs cluster away from both ESCs and iPSCs, which in turn are more related to each other in chromatin state (Fig 2A). In pre-iPSCs a few histone PTMs were present at an intermediate level between iPSCs and MEFs, as for instance H3K18ac/K23ac (Fig 2B), or less abundant than in any other cell lines, as for instance H4K5acK8acK12acK16un (Fig 2C). Irrespective of these differences, a shift toward the pluripotency profile is not evident for the majority of histone PTMs in pre-iPSCs. Together these findings demonstrate that ESCs and iPSCs share a similar global histone modification profile revealing a pluripotency-specific global chromatin structure. Furthermore, the transition from the MEF-like to the pluripotency-specific global chromatin character occurs late in reprogramming, after the state represented by pre-iPSCs, rather than gradually throughout the entire reprogramming process. Based on these data, we propose that the establishment of the global pluripotency-specific chromatin state constitutes an epigenetic barrier that contributes to the reprogramming block encountered by pre-iPSCs and the low overall efficiency of reprogramming to iPSCs.
Our qMS approach demonstrated that pre-iPSCs and MEFs are more enriched for repressive H3K9 methylation marks than iPSCs. We therefore directed our efforts on further deconvoluting reprogramming barriers associated with the H3K9 site. Specifically, we asked whether the depletion of the writers, the histone methyltransferases (HMTases) Ehmt1, Ehmt2, and Setdb1, or the readers, the heterochromatin protein 1 (HP1) family members Cbx1, Cbx3, and Cbx5, or overexpression of the H3K9 demethylase Jmjd2c, could modulate the efficiency of reprogramming. HP1 proteins are small proteins that have been shown to bind specifically to methylated histone H3K9 via their chromodomain in biochemical assays31,32. However, although initially identified as evolutionarily conserved regulators of heterochromatin formation, recent progress suggests additional roles for HP1 proteins in the regulation of active gene expression in euchromatic regions 33,34.
Since the knockout of some of these enzymes, like Setdb1, is known to lead to loss of pluripotency35,36, we only transiently depleted them during reprogramming by means of siRNA-mediated knockdown. Reprogramming was induced by overexpression of Oct4, Sox2, and Klf4, i.e. in the absence of ectopic cMyc, in MEFs containing GFP reporters linked to pluripotency promoters (Oct4 or Nanog). Efficient knockdown for all target genes was confirmed in each reprogramming experiment (Fig S3A). The depletion of any of the three H3K9 HMTases consistently increased the number of Oct4-GFP-positive colonies at least two-fold, and simultaneous knockdown of all three HMTases (henceforth called 3XHMT) enhanced reprogramming even more efficiently (Fig 3Ai, S3Ai). GFP-positive colonies also expressed the endogenous pluripotency marker Esrrb, indicating that the increase in colony number did not simply represent GFP-reporter activation (Fig S3B). Depletion of Cbx3 consistently increased the number of Oct4-GFP-positive colonies, while the effects of Cbx1 and Cbx5 interference were positive but milder and not always reproducible (Fig 3Aii, S3Aii). 3XHMT or Cbx3 knockdown also had a positive effect on reprogramming when c-Myc was included as a reprogramming factor, and reprogrammed colonies appeared at least a day earlier than under control conditions, indicating an improvement of the kinetics of the process (Fig 3B). In addition, overexpression of Jmjd2c, a H3K9me2/me3 demethylase38 important for the maintenance of the pluripotent state39, enhanced reprogramming about two-fold (Fig 3C, S3Aiii). Together, these results extend recent findings that the H3K9 methylation machinery impairs reprogramming to iPSCs and cell fusion-mediated gain of pluripotency9–11,37, and indicate that not only the H3K9 methylation enzymes but also members of the HP1 family act as critical barriers of reprogramming.
To more directly address whether the reprogramming phenotypes observed above are linked to late reprogramming steps, we next tested whether pluripotency could be induced in pre-iPSCs by modulating HP1 or H3K9-HMTase levels. Knockdown of any of the three H3K9-HMTases or of any of the HP1 proteins in pre-iPSCs led to an increase in the number of Nanog-GFP-positive cells (Fig 4Ai). Among the HP1 family members the promoting effect was most pronounced for Cbx3 depletion (Fig 4Ai). Cbx3 knockdown enhanced the appearance of GFP-positive colonies with an ESC-like morphology to a similar extent as the knockdown of all three H3K9 HMTases together (Fig 4Aii, S3C). However, combined knockdown of Cbx3 and 3XHMT did not enhance colony formation further (Fig 4Aii), suggesting that at least some of the events regulated by Cbx3 and the HMTases during reprogramming are overlapping. GFP-positive colonies isolated from the Cbx3 siRNA-treated pre-iPSC cultures were expanded and displayed high expression levels of pluripotency markers such as Esrrb and Nanog (Fig 4B) and silencing of the retrovirally-encoded Oct4 and Sox2 transgenes (Fig S3D), satisfying hallmarks of pluripotency. The finding that decreasing levels of Cbx3 or H3K9 HMTases enhances iPSC formation from pre-iPSCs indicates that these proteins constitute a barrier to late reprogramming events.
Next, we performed qMS analysis of histone PTMs in pre-iPSCs three days after initiation of 3XHMT or Cbx3 knockdown to gain insight into the molecular mechanisms of how these regulators promote late reprogramming events (MS data are summarized in Table S2). We reasoned that the analysis of histone PTMs shortly after initiation of knockdown but before Nanog-GFP expression was detectable (which first appeared seven days after the introduction of siRNAs (Fig S3C)), would reveal direct effects on global chromatin character due to depletion of the factors.
Upon 3XHMT knockdown, H3K9me2 levels decreased strongly, irrespective of the presence or absence of H3K14 acetylation on the same histone molecule, while H3K9me3 decreased only when K14 was acetylated as well (Fig 5A), indicating that the 3XHMT knockdown has specific effects in the context of combinatorial histone modifications. The decrease in H3K9me2/me3 was accompanied by a significant gain in the four unmethylated isoforms of the H3K9/K14 peptide: K9un/K14un, K9un/K14ac, K9ac/K14un, and K9ac/K14ac (Fig 5A). These changes in global chromatin state on the H3K9/K14 peptide trended towards the pattern seen in iPSCs (compare Fig 5A with Fig 1E). With the exception of a few low abundance histone PTMs, we detected no other dramatic changes in the histone PTM profile upon 3XHMT knockdown (Table S2), indicating that the PTM state of the H3K9 site does not immediately affect the majority of other histone PTMs. Upon Cbx3 knockdown most isoforms of the H3K9/K14 peptide did not change significantly in abundance, except those containing H3K9ac with and without K14ac (Fig 5B). We conclude that Ehmt1, Ehmt2, and Setdb1, but not Cbx3, directly contribute to the regulation of global H3K9me2/me3 levels in pre-iPSCs, and that a change in global H3K9me levels itself is not sufficient for the induction of pluripotency as additional time in culture is required for the efficient activation of the pluripotency network.
We also performed genome-wide transcriptional profiling on pre-iPSCs three days after transfection of the siRNAs, to further understand the role of the H3K9-HMTases and Cbx3 in reprogramming. Relatively few genes were differentially expressed in pre-iPSCs depleted for 3XHMT or Cbx3 (3XHMT siRNA: 222 genes 1.5-fold up and 261 genes 1.5-fold down; Cbx3 siRNA: 352 genes 1.5-fold up and 368 genes 1.5-fold down), and about a fifth of the up- and downregulated genes, respectively, changed their expression in the same direction between Cbx3 or 3XHMT knockdown (Fig 6A, Table S3). Further analysis demonstrated that the 3XHMT knockdown drives the gene expression program of pre-iPSCs more strongly towards the iPSC expression pattern than Cbx3 depletion (Fig 6B). Accordingly, genes upregulated upon 3XHMT knockdown are more highly expressed in pluripotent cells than pre-iPSCs and downregulated genes are significantly lower expressed in pluripotent cells than pre-iPSCs (Fig 6C). For the Cbx3 knockdown this trend was only seen for the downregulated genes (Fig 6C). Interestingly, the 56 genes downregulated both in the 3XHMT and Cbx3 knockdown included Tgfβ2 and 49 of these genes were also expressed at lower levels in iPSCs than pre-iPSCs (Fig 6A), suggesting that the suppression of these genes may be important for the reprogramming enhancement observed upon these knockdowns. Consistent with this, TGFβ signaling is already known to inhibit reprogramming40,41. Inspecting the differentially expressed genes for other critical regulators of reprogramming, we found the pluripotency factor Nanog to be among the most upregulated genes in 3XHMT and Cbx3 depleted pre-iPSCs (Fig 6A–B; 7- and 10-fold up, respectively). Nanog has previously been shown to be essential for the final reprogramming stage and to enhance reprogramming when overexpressed42,43.
Additional pluripotency factors including Gdf3, Zfp42, Dppa4, and Lin28 were among the upregulated genes in pre-iPSCs specifically upon the 3XHMT knockdown (Fig 6A). We conclude that depletion of Cbx3 or 3XHMT yields partially overlapping gene expression changes in pre-iPSCs that converge on the induction of Nanog and the downregulation of genes that become reduced during the transition to the pluripotent state. Our finding that the Cbx3 and 3XHMT knockdowns are not additive in their reprogramming enhancement (Fig 4B) is consistent with the idea that these knockdowns may enhance iPSC formation via overlapping transcriptional responses.
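The 1.5-fold calling and the overlap between the two knockdowns described above can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study: the function name `call_changes`, the dictionary layout, and all expression values are hypothetical and chosen only for illustration (the 7- and 10-fold Nanog ratios mirror the fold changes reported above).

```python
def call_changes(control, knockdown, threshold=1.5):
    """Return (up, down) gene sets whose knockdown/control ratio
    passes the fold-change threshold in either direction."""
    up, down = set(), set()
    for gene, base in control.items():
        kd = knockdown.get(gene)
        # Skip genes without a usable measurement in both conditions.
        if kd is None or base <= 0 or kd <= 0:
            continue
        ratio = kd / base
        if ratio >= threshold:
            up.add(gene)
        elif ratio <= 1.0 / threshold:
            down.add(gene)
    return up, down


# Hypothetical expression values (arbitrary units), for illustration only.
control = {"Nanog": 1.0, "Tgfb2": 8.0, "Gdf3": 2.0, "Actb": 100.0}
hmt_kd = {"Nanog": 7.0, "Tgfb2": 4.0, "Gdf3": 5.0, "Actb": 101.0}
cbx3_kd = {"Nanog": 10.0, "Tgfb2": 3.0, "Gdf3": 2.1, "Actb": 99.0}

hmt_up, hmt_down = call_changes(control, hmt_kd)
cbx3_up, cbx3_down = call_changes(control, cbx3_kd)

# Genes changing in the same direction in both knockdowns.
shared_up = hmt_up & cbx3_up
shared_down = hmt_down & cbx3_down
```

In this toy input, the shared sets contain only the genes that pass the threshold in both knockdowns, which is the kind of intersection summarized in Fig 6A.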
We therefore explored the contribution of Nanog upregulation to the reprogramming enhancement upon 3XHMT or Cbx3 knockdown. Since loss-of-function of Nanog prevents the establishment of iPSCs42, we did not test the consequence of 3XHMT or Cbx3 knockdown in pre-iPSCs lacking Nanog, but instead combined the knockdowns with Nanog overexpression. pre-iPSCs carrying a doxycycline-inducible Nanog transgene were transfected with siRNAs targeting 3XHMT or Cbx3 (Fig 6Di). Immunostaining indicated that over 90% of the infected cells expressed Nanog upon addition of doxycycline, and detected no expression in the absence of doxycycline (Fig 6Dii). By itself, overexpression of Nanog resulted in a strong induction of reprogrammed colonies similar to that seen upon 3XHMT or Cbx3 knockdown, indicating that high Nanog levels can efficiently convert our pre-iPSCs to iPSCs (Fig 6Diii). Importantly, the 3XHMT or Cbx3 knockdowns only conferred a further two-fold enhancement in iPSC colony formation to Nanog-expressing pre-iPSCs (Fig 6Diii), consistent with the interpretation that Nanog upregulation is a key downstream event in the enhancement of reprogramming upon 3XHMT or Cbx3 depletion.
Our data predicted that Nanog is a target of Cbx3 and H3K9 methylation during reprogramming. To this end, we determined the genomic Cbx3 binding sites in pre-iPSCs and ESCs using ChIP-seq (Table S4). For each cell type, data from two biological replicates were merged for further analysis because they correlated well. We found that Cbx3 occupies regions upstream of the transcriptional start site (TSS) at the repressed Nanog locus in pre-iPSCs, overlapping with known upstream regulatory sites. In ESCs, where Nanog is strongly expressed, Cbx3 binding is absent from these regulatory regions (Fig 7A). Given that Nanog is the most upregulated gene upon Cbx3 knockdown in pre-iPSCs, these data indicate that Cbx3 directly represses Nanog in pre-iPSCs. Notably, the genomic region upstream of the TSS of Nanog is also enriched for H3K9me3 in pre-iPSCs, partially overlapping with Cbx3 occupancy (Fig S4A), in agreement with findings that showed H3K9 methylation at the Nanog promoter in differentiating ESCs39. Together, these findings indicate that Cbx3 functions together with H3K9me3 to repress Nanog in the reprogramming process, strengthening our conclusion that Nanog is an important downstream target on which Cbx3 and the regulation of H3K9 methylation converge during reprogramming.
Although the Nanog locus is bound by Cbx3 at upstream regulatory regions in pre-iPSCs, genome-wide Cbx3 predominantly occupies genic regions in both pre-iPSCs and ESCs (Fig 7B). The number of target genes with significant Cbx3 enrichment was dramatically lower in ESCs compared to pre-iPSCs (Fig S4B), and, within genes, Cbx3 displayed a distinct binding pattern between these two cell types (Fig 7C–E). Specifically, in ESCs, Cbx3 binding within genes increases from the TSS throughout the gene to the 3′ end, a pattern consistent with recent reports on Cbx3 binding in cancer cell lines34. Most of the target genes of Cbx3 in ESCs are also bound in pre-iPSCs (Fig S4B), typically associate with Cbx3 across the gene body (Fig 7E), and function in ribosome biogenesis, gene expression, and nucleosome assembly based on GO analysis. However, in pre-iPSCs Cbx3 additionally occupies the TSS regions of a large number of genes. Despite the different binding pattern, Cbx3-bound genes are on average significantly higher expressed than their unbound counterparts in both cell types (Fig 7F). Grouping Cbx3-bound genes in ESCs and pre-iPSCs into expression tiers also demonstrated that strong gene body and TSS occupancy of Cbx3 favors highly expressed genes (Fig S4C–D). Based on these findings and published reports34, we conclude that Cbx3 typically associates with gene bodies of highly transcribed genes. In addition our data uncover an unexpectedly strong association of Cbx3 with the TSS of expressed genes specifically in pre-iPSCs, which we also observed in early reprogramming intermediates.
Despite the widespread binding of Cbx3 to actively transcribed genes in pre-iPSCs only a relatively small number of genes were differentially expressed upon Cbx3 depletion (Fig 6A). A greater proportion of downregulated than upregulated genes was directly bound by Cbx3 (256 and 196 genes, respectively). Therefore, Cbx3 can both positively and negatively affect its target genes, likely depending on the exact context of its binding. Consistent with this, upregulated genes have slightly more pronounced binding of Cbx3 at the TSS and upstream promoter region than downregulated genes (Fig 7G).
Exploring the nature of the TSS enrichment in pre-iPSCs further, we found that the Mediator co-activator complex, which is part of the RNA polymerase II preinitiation complex (PIC)44, mimics the Cbx3 binding pattern at the TSS (Fig 8A). The association of Cbx3 with the TSS in pre-iPSCs but not ESCs suggested that Cbx3 and Mediator might interact in differentiated cells but not in pluripotent cells. To test this hypothesis, we immunoprecipitated the Mediator complex from ESCs and ESC-derived differentiated cells, taking advantage of a tetracycline-inducible transgene encoding a Flag-tagged subunit of Mediator (Flag-Med29), and determined whether Cbx3 could be detected in the immunoprecipitate (Fig 8B). The results showed that Med29 can co-precipitate Cbx3 much more strongly in differentiated cells than ESCs, indicating that the interaction of Mediator and Cbx3 is regulated in a cell type-dependent fashion. We also studied the recruitment of Cbx3 to the PIC in vitro, employing nuclear extract, the model transcriptional activator GAL4-VP16, and an immobilized GAL4-responsive DNA template. The results show that Cbx3 is recruited from nuclear extracts to the chromatin template, but only upon activator addition, similar to Mediator (Fig 8C). These data suggest that Cbx3 is recruited to the TSS of transcribed genes in an activator-dependent manner by associating with the PIC, and that this recruitment occurs specifically in differentiated cells. It is conceivable that differential posttranslational modifications, the presence of different splice forms of Mediator subunits, or a different chromatin composition at the TSS in ESCs versus differentiated cells45 could explain the cell type specificity of the Cbx3 association with the PIC at the TSS. Since the association of Cbx3 with the TSS in pre-iPSCs appears to occur independently of H3K9me3 (Fig S4E), the interaction with PIC components is likely responsible for the Cbx3 recruitment to the TSS.
Taken together, our findings reveal a function of Cbx3 in the context of transcriptional regulation at the TSS that appears to be specific for differentiated cells and reprogramming intermediates, as Cbx3 occupancy at the TSS is lacking in pluripotent cells.
In summary, our qMS approach has yielded a quantitative and comprehensive analysis of global histone PTMs in differentiated and pluripotent cells, and during reprogramming. Our data indicate that the global histone PTM profile changes late in the reprogramming process, which may be associated with the efficient activation of the pluripotency network, major replication timing changes within early embryonic genes46, and the reactivation of the inactive X chromosome47. We speculate that global chromatin and replication-timing reorganization are key aspects of the final reprogramming stage, required for establishment of the self-sustaining pluripotency network.
The quantitative repository of global histone modification changes generated here can be used as a starting point for further dissection of reprogramming roadblocks and epigenetic differences between pluripotent and differentiated cells. Based on this idea, we determined the role of H3K9 methylation during reprogramming by analyzing proteins involved in the regulation of this histone modification. Despite the fact that our work reveals various distinct functions of Cbx3 and the methyltransferases Ehmt1, Ehmt2, and Setdb1 during reprogramming, reflected in the differential control of global H3K9 methylation levels or association with the basic transcriptional machinery, the removal of the three H3K9-HMTases or Cbx3 elicited partially overlapping transcriptional responses including the reactivation of the silent Nanog locus. Our data suggest that these common expression changes cause reprogramming enhancement.
By examining the role of Cbx3, we found a remarkable switch in the location of Cbx3 between pluripotent and non-pluripotent cells, and a physical association of Cbx3 with the Mediator complex that could be responsible for targeting Cbx3 to the TSS in non-pluripotent cells. Since only a small subset of its target genes become differentially expressed in pre-iPSCs upon knockdown of Cbx3, we speculate that Cbx3 binding at the TSS mediates more subtle functions and acts perhaps to maintain or restore nucleosome density near the promoter during transcription to protect the transcribed DNA. Alternatively, different HP1 family members may also act at the TSS and mask the effects of Cbx3. Consistent with this, another HP1 family member, HP1α, has been shown to localize to the TSS of transcribed genes in Drosophila48, suggesting an evolutionarily conserved role of HP1 family members in the PIC. Understanding the mechanistic basis for the switch in Cbx3 localization between pluripotent and non-pluripotent cells will reveal further insight into the nature of the pluripotent state.
Cell pellets were lysed, nuclei isolated, and histones extracted as previously described16. For each sample, approximately 100 μg of extracted histones were re-suspended in 30 μL of 100 mM ammonium bicarbonate, pH 8.0. Chemical propionylation derivatization, digestion, and desalting of histones were performed as described25, except that histones were digested for 6 hours. We performed both label-free and isotopically labeled relative quantification of peptides. For isotopically labeled peptide comparative MS analysis, d0- and d10-propionic anhydride were used as previously described15. All proteomics data are available at the Stem Cell Omics Repository at http://scor.chem.wisc.edu/.
Samples were analyzed by LC-MS and MS/MS as described15. In brief, digested samples were loaded by an Eksigent AS2 autosampler onto 75 μm ID fused silica capillary columns packed with 12 cm of C18-reversed phase resin (Magic C18, 5 μm particles, Michrom BioResources), constructed with an electrospray ionization tip. Peptides were separated by nanoflowLC and introduced into a hybrid linear quadrupole ion trap-Orbitrap mass spectrometer (ThermoElectron, San Jose, CA), and resolved with a gradient from 5 to 35% Buffer B in a 110-min gradient (Buffer A: 0.1 M acetic acid, Buffer B: 70% acetonitrile in 0.1 M acetic acid) with a flow rate of 150 nl/min on an Agilent 1200 binary HPLC system. The Orbitrap was operated in data-dependent mode essentially as previously described15. Relative abundances of peptide species were calculated by chromatographic peak integration of full MS scans using an in-house developed computer program. Peptide identity and modifications were verified by manual inspection of MS/MS spectra. Cluster 3.0 was used to create hierarchical clustering of ratio data and Java Treeview for visualization of the output.
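As an illustration of how integrated peak areas can be turned into relative abundances, the following Python sketch expresses each modified form of a peptide as a fraction of the summed areas of all observed forms of that peptide. The in-house program itself is not described here, so this normalization, the function name `relative_abundance`, and the example peak areas are assumptions made purely for illustration.

```python
from collections import defaultdict


def relative_abundance(peak_areas):
    """Map (peptide, isoform) -> fraction of that peptide's total area.

    peak_areas maps (peptide, isoform) -> integrated peak area.
    Assumption: each isoform is normalized to the sum over all
    observed isoforms of the same peptide.
    """
    totals = defaultdict(float)
    for (peptide, _isoform), area in peak_areas.items():
        totals[peptide] += area
    return {
        key: (area / totals[key[0]] if totals[key[0]] > 0 else 0.0)
        for key, area in peak_areas.items()
    }


# Hypothetical integrated areas (arbitrary units) for the H3 K9/K14 peptide.
example_areas = {
    ("H3_K9K14", "K9un/K14un"): 40.0,
    ("H3_K9K14", "K9me2/K14un"): 30.0,
    ("H3_K9K14", "K9me3/K14un"): 20.0,
    ("H3_K9K14", "K9un/K14ac"): 10.0,
}

fractions = relative_abundance(example_areas)
```

Under this assumption, the fractions for all forms of one peptide sum to one, so values from different runs can be compared without reference to absolute signal intensity.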
The following cell lines were used for histone PTM qMS analysis in Figures 1 and 2: a female iPSC line (2D4) generated by retroviral expression of Oct4, Sox2, c-Myc, and Klf413; a male iPSC line (C3) obtained upon retroviral expression of Oct4, Sox2, and Klf4 (i.e. in the absence of cMyc); a female pre-iPSC line (1A2)13 and a male pre-iPSC line (12-1), both obtained upon retroviral expression of Oct4, Sox2, Klf4, and cMyc in Nanog-GFP reporter MEFs. In addition, we used the male ESC line V6.5, and male and female wild-type MEFs from d13.5 embryos. ESCs, iPSCs, and pre-iPSCs were grown in standard mouse ESC media and MEFs in the same media lacking LIF.
Reprogramming experiments were carried out from Oct4-GFP49 or Nanog-GFP13 reporter MEFs using pMX retroviruses encoding Oct4, Sox2, and Klf4 as described previously27 and conducted in media containing 15% serum (FBS). MEFs containing a single polycistronic, tet-inducible cassette carrying the four reprogramming factors Oct4, Sox2, Klf4, and cMyc in the Col1A locus, the tet-transactivator M2rtTA in the R26 locus, and the Oct4-GFP reporter were generated as described50 and induced to reprogram with 2 μg/ml doxycycline. Reprogramming was scored by counting the number of GFP-positive ESC-like colonies at the indicated days. All reprogramming experiments from fibroblasts and pre-iPSCs were done in biological triplicates, and for each figure, error bars represent the standard deviation from two technical replicates of a representative experiment. For pre-iPSC experiments, reprogramming to iPSCs upon siRNA knockdown was assessed by counting Nanog-GFP-positive colonies or quantifying GFP-positive cells by FACS at the indicated days. For FACS analysis, 12-1 pre-iPSCs were harvested with trypsin, passed through a 40 μm cell strainer to obtain single-cell suspensions, and analyzed on an LSR cytometer (BD Biosciences). Data were analyzed using the FlowJo software (TreeStar). Fuw-tetO-loxp-mNANOG was created by ligation-independent cloning (Infusion, Clontech, Mountain View, CA) by digesting the vector Fuw-tetO-loxp-hKLF4 (Addgene 20727) with EcoRI and PCR-amplifying mouse Nanog from pMX-mNANOG (Addgene 13354). The vector was cotransfected with pCMV-delta8.9 and pCAGS-VSVg (a generous gift from Dr. Donald Kohn, UCLA) into 293T cells, and viral conditioned media were harvested 48 hours post-transfection in serum-free media (Ultraculture, Lonza). Expression was confirmed by immunostaining.
For overexpression, Jmjd2c was cloned from a PCR product using gene-specific primers (Forward: ATGCGAATTCATGGAGGTGGTGGAGGTG, Reverse: ATGCGCGGCCGCCTACTGTCTCTTCTGACA) into the EcoRI and NotI sites of the pMX vector. Expression was confirmed three days after transduction by qRT-PCR on RNA using gene-specific primers (For-GGCCATGGAAGTAACCTTGA, Rev-GAGGCTTACCAAGTGGATGG).
Sets of four different siRNAs were purchased from Dharmacon and transfected using Lipofectamine RNAiMAX (Life Technologies) according to the manufacturer's instructions. Of each set of four siRNAs, the one producing the most efficient knockdown was used in reprogramming experiments at a final concentration of 20 μM: Cbx1 - MU-060281-01 #2, Cbx3 - MU-044218-01 #2, Cbx5 - MU-040799-01 #2, Setdb1 - MU-040815-01 #4, Ehmt1 - LU-059041-01 #3, Ehmt2 - MU053728-00 #3. For control siRNA treatments, we used the non-targeting Luciferase control D-001210-02. The timing of siRNA transfections is indicated in each figure. For pre-iPSC reprogramming experiments, reverse transfection of siRNAs was performed once, on 200,000 cells of the 12-1 pre-iPSC line, plated on gelatin. To test knockdown efficiency during reprogramming, RNA was harvested three days after the first transfection and on day 22, i.e. 3 days after the last transfection, and qRT-PCR was performed with gene-specific primers listed below.
To determine the transcriptional changes upon 3XHMT and Cbx3 knockdown in pre-iPSCs, total RNA was extracted from the pre-iPSC line 12-1 three days after the cells were subjected to transfection with control siRNAs (in biological triplicates), si-Cbx3 (in biological triplicates), or a pool of si-Ehmt1, si-Ehmt2, and si-Setdb1 (in biological duplicates), and analyzed on an Affymetrix GeneChip Mouse Genome 430 2.0 array at the UCLA Clinical Microarray core facility. Quantile normalization was performed using the Affymetrix package (affy) from Bioconductor. To convert probe data into gene expression data, probes ending in “_at” and “_a_at” were averaged for each gene. Data at the probe level were normalized with previously published data for ESCs, iPSCs, and MEFs27. Data are deposited in the GEO database under GSE44084.
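The two processing steps named above, quantile normalization and averaging of probes per gene, can be sketched as follows. This is a plain-Python illustration of the general technique, not the affy implementation (ties, for example, are not averaged here). The regular expression is one possible reading of "probes ending in _at and _a_at", assumed to exclude suffixes such as "_s_at" and "_x_at"; all probe IDs, gene names, and function names are hypothetical.

```python
import re
from collections import defaultdict


def quantile_normalize(samples):
    """Give every sample the same distribution of values.

    samples: list of equal-length lists, one list per array.
    Each value is replaced by the mean, across samples, of the values
    that share its rank. Ties are broken by position (a simplification).
    """
    n = len(samples[0])
    sorted_cols = [sorted(col) for col in samples]
    rank_means = [sum(col[i] for col in sorted_cols) / len(samples) for i in range(n)]
    normalized = []
    for col in samples:
        order = sorted(range(n), key=lambda i: col[i])
        new_col = [0.0] * n
        for rank, idx in enumerate(order):
            new_col[idx] = rank_means[rank]
        normalized.append(new_col)
    return normalized


# Assumed interpretation: keep IDs like "100_at" and "101_a_at" only.
PROBE_PATTERN = re.compile(r"^\d+(_a)?_at$")


def average_probes_per_gene(probe_values, probe_to_gene):
    """Average the retained probes that map to the same gene."""
    grouped = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None and PROBE_PATTERN.match(probe):
            grouped[gene].append(value)
    return {gene: sum(vals) / len(vals) for gene, vals in grouped.items()}


# Hypothetical data: two arrays with three probes each.
normalized = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])

# Hypothetical probe IDs; "102_s_at" is dropped by the assumed filter.
gene_values = average_probes_per_gene(
    {"100_at": 8.0, "101_a_at": 10.0, "102_s_at": 100.0},
    {"100_at": "GeneA", "101_a_at": "GeneA", "102_s_at": "GeneA"},
)
```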
RNA-seq was performed using 4 μg of mRNA as starting material from ESCs and pre-iPSCs, using standard Illumina RNA-seq library construction protocols. Briefly, polyadenylated RNA was purified by two rounds of oligo-dT bead selection followed by divalent cation fragmentation under elevated temperature. Following cDNA synthesis with random hexamers, the double-stranded products were end repaired, a single “A” base was added, and Illumina adaptors were ligated onto the cDNA products. Ligation products with an average size of 300 bp were purified by means of agarose gel electrophoresis. The adaptor-ligated single-stranded cDNA was then amplified with 10 cycles of PCR. RNA-seq libraries were sequenced on an Illumina HiSeq 2000. The RPKM (reads per kilobase of exon per million mapped reads) was then computed for each gene.
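For reference, the standard RPKM definition is RPKM = 10^9 × C / (N × L), where C is the number of reads mapped to a gene's exons, L is the total exon length in base pairs, and N is the total number of mapped reads in the library. The sketch below implements that formula; the function name and the example numbers are hypothetical and are not taken from the data set.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads, 2,000 bp of exon, 10 million mapped reads.
example = rpkm(500, 2000, 10_000_000)
```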
ESCs expressing FLAG-Med29 were obtained by targeting the 3xFlag-Med29 under control of a tet-inducible promoter into the ColA1 locus51 in V6.5 ESCs carrying the M2rtTA in the R26 locus. Targeting was confirmed by Southern blotting. Neural precursors were differentiated from these cells after suspension culture of embryoid bodies for four days, and selection in ITSF media for six days52. Nuclear extracts of ESCs and neural precursors were prepared and the purification of protein complexes containing Med29 was performed as previously described53. PIC assembly was performed from HeLa nuclear extract using the immobilized G5E4T and analyzed by Western blotting as described54. All primary antibodies were used at 1:1000 dilution and secondary antibodies at 1:10000 dilution: anti-FLAG from Sigma (F-1804), anti-Med6 (sc-9434), anti-Cbx3 from Millipore (05-690), anti-Med1 from Santa Cruz (sc-8998), anti-RBBP5 (Bethyl, A300-109), and anti-Cdk8 (sc-1521).
12-1 pre-iPSCs or V6.5 ESCs were chemically cross-linked by the addition of formaldehyde to 1% final concentration for 10 minutes at room temperature, and quenched with 0.125 M final concentration glycine. Cells were washed twice in PBS, re-suspended in sonication buffer (50mM Hepes, 140mM NaCl, 1mM EDTA, 1% TritonX-100, 0.1% Na-deoxycholate, 0.1% SDS), and sonicated with a Diagenode Bioruptor. Cell extracts were incubated with an antibody against Cbx3 (Millipore, 05-690; clone 42s2) or Med1 (sc-8998) overnight at 4°C and immunoprecipitates collected with magnetic beads.
Beads were washed twice with RIPA buffer, low salt buffer (20mM Tris pH 8.1, 150mM NaCl, 2mM EDTA, 1% Triton X-100, 0.1% SDS), high salt buffer (20mM Tris pH 8.1, 500mM NaCl, 2mM EDTA, 1% Triton X-100, 0.1% SDS), LiCl buffer (10mM Tris pH 8.1, 250mM LiCl, 1mM EDTA, 1% deoxycholate, 1% NP-40), and with 1xTE. Reverse cross-linking occurred overnight at 65°C with 1% SDS and proteinase K. Illumina/Solexa sequence preparation, sequencing, and quality control were performed according to Illumina protocols, with the minor modification of limiting the PCR amplification step to 10 cycles.
Reads were mapped to the mm9 genome using the Bowtie software, and only those reads that aligned to a unique position with no more than two sequence mismatches were retained for further analysis. Significant binding events were called as peaks with MACS2.0 using an FDR of 0.05 and the –broadpeaks setting, which allows calling of broader domains. Location analysis of called peaks was performed using the Sole-search tool. The ChIP-seq signal around the TSS was visualized as heatmaps generated using Java Treeview. Briefly, enrichment is displayed after normalization to 1 million reads and subtraction of normalized input values per 100 bp window. Data are deposited in the GEO database under GSE44242.
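The heatmap normalization described above can be sketched as follows, assuming that read positions are binned into consecutive 100 bp windows, that counts are scaled to a library size of 1 million reads, and that the scaled input is subtracted window by window. Function names and coordinates are hypothetical, and whether negative values were clipped is not stated in the text, so they are left as they are here.

```python
def window_counts(read_positions, start, end, window=100):
    """Count reads falling in consecutive windows covering [start, end)."""
    n_windows = (end - start) // window
    counts = [0] * n_windows
    for pos in read_positions:
        if start <= pos < start + n_windows * window:
            counts[(pos - start) // window] += 1
    return counts


def input_subtracted_signal(chip_counts, chip_total, input_counts, input_total):
    """Scale both tracks to reads per million, then subtract input per window."""
    chip_scale = 1e6 / chip_total
    input_scale = 1e6 / input_total
    return [
        c * chip_scale - i * input_scale
        for c, i in zip(chip_counts, input_counts)
    ]


# Hypothetical region of 300 bp with a handful of read positions.
chip = window_counts([1000, 1050, 1150], start=1000, end=1300)
control = window_counts([1010, 1250], start=1000, end=1300)
signal = input_subtracted_signal(chip, 2_000_000, control, 1_000_000)
```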
For ChIP-chip of H3K18me1 and H3K79me2, chromatin fragments (500 μg) from V6.5 ESCs were enriched with specific antibodies, labeled and hybridized, along with corresponding input fragments, to an Agilent promoter microarray (Agilent-G4490) that contains the promoter regions of 18,300 annotated mouse genes, encompassing regions 5.5 kb upstream to 2.5 kb downstream of the respective transcription start sites (TSS), as described27. The H3K18me1 antibody was generated by N. Mishra, and the H3K79me2 antibody was kindly provided by Michael Grunstein at UCLA. The ChIP-chip data sets for H3K4me3 and RNA PolII have been previously published5,27. Hybridization onto the arrays, washing, and scanning were done according to the manufacturer’s protocols. Average probe signals were extracted in a 500 bp window-step-wise manner as described previously27.
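One way to read the window-wise extraction is sketched below. It assumes non-overlapping 500 bp windows tiling the region from 5.5 kb upstream to 2.5 kb downstream of the TSS (16 windows), with each window reporting the mean signal of the probes it contains. The published procedure may differ (for example, it may use overlapping windows), and the function name and probe values are hypothetical.

```python
def windowed_average(probes, upstream=5500, downstream=2500, window=500):
    """Average probe signals per window along a promoter.

    probes: list of (offset_from_tss_bp, signal) pairs, with negative
    offsets upstream of the TSS. Windows without probes yield None.
    """
    n_windows = (upstream + downstream) // window
    sums = [0.0] * n_windows
    counts = [0] * n_windows
    for offset, value in probes:
        if -upstream <= offset < downstream:
            idx = (offset + upstream) // window
            sums[idx] += value
            counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical probes for one gene; the last one lies outside the tiled region.
profile = windowed_average([(-5400, 1.0), (-5100, 3.0), (100, 2.0), (3000, 9.0)])
```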
We thank Vincent Pasque for critical reading of the manuscript and Dr. Michael Grunstein (UCLA) for providing antibodies. KP is supported by the Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research at UCLA, NIH (DP2OD001686 and P01 GM099134) and CIRM (RN1-00564); RS was supported by the Jonsson Comprehensive Cancer Center, CC by a Leukaemia and Lymphoma Research Grant (10040), GB by the Whitcome Pre-doctoral Training Program, BAG by a National Science Foundation Early Faculty CAREER award and an NIH Innovator award (DP2OD007447), and MC by the NIH (GM074701).
Author contributions: R.S., K.P., and B.A.G. planned the project. R.S. and K.P. wrote the manuscript. The following performed experiments, analyzed and interpreted data: R.S., C.C., G.B., R.M., S.P. under K.P.’s supervision, M.G-C. under B.A.G.’s supervision, C.H. under M.C.’s supervision, and D.L. under M.P.’s supervision. N.M. generated the H3K18me1 antibody.
Database accession numbers