Mouse embryonic stem (ES) cells grown in serum exhibit greater heterogeneity in morphology and expression of pluripotency factors than ES cells cultured in defined medium with inhibitors of two kinases (Mek and GSK3), a condition known as “2i” postulated to establish a naive ground state. We show that the transcriptome and epigenome profiles of serum- and 2i-grown ES cells are distinct. 2i-treated cells exhibit lower expression of lineage-affiliated genes, reduced prevalence at promoters of the repressive histone modification H3K27me3, and fewer bivalent domains, which are thought to mark genes poised for either up- or downregulation. Nonetheless, serum- and 2i-grown ES cells have similar differentiation potential. Precocious transcription of developmental genes in 2i is restrained by RNA polymerase II promoter-proximal pausing. These findings suggest that transcriptional potentiation and a permissive chromatin context characterize the ground state and that exit from it may not require a metastable intermediate or multilineage priming.
► High-resolution genome-wide transcriptome and epigenome of naive pluripotency ► Reduced H3K27me3 at promoters and fewer bivalent domains in naive ES cells ► Reduced lineage priming and increased RNA polymerase II pausing in the naive state ► Naive ES cells show no delay in differentiation
Mouse embryonic stem (ES) cells are characterized by the potency to generate all somatic and germline lineages in vitro and in chimaeric embryos (Smith, 2001). The nature of the transcriptional and epigenetic machinery that maintains this potential throughout massive in vitro expansion has been the subject of intense investigation (Young, 2011). Interest is further heightened by appreciation that knowledge of the molecular underpinning of mouse ES cells may enable derivation of equivalent human naive pluripotent stem cells (Hanna et al., 2010).
ES cells are described as transcriptionally hyperactive (Efroni et al., 2008). Promiscuous transcription has been suggested to constitute a platform for lineage specification (Loh and Lim, 2011). When taken together with the observation that several pluripotency factors are expressed heterogeneously (Chambers et al., 2007; Niwa et al., 2009; Toyooka et al., 2008), the notion has arisen that pluripotency may inherently be a metastable condition (Graf and Stadtfeld, 2008; Hanna et al., 2009; Hayashi et al., 2008). Attention has also been drawn to colocalization at many promoters of histone 3 lysine 4 trimethylation (H3K4me3), associated with transcriptional activation, and histone 3 lysine 27 trimethylation (H3K27me3), linked with repression (Azuara et al., 2006; Bernstein et al., 2006; Stock et al., 2007). These bivalent domains are posited to be poised for either up- or downregulation and to provide an epigenetic blueprint for lineage determination.
The preceding views are based on analyses of ES cells cultured in serum and therefore subject to uncontrolled multifactorial perturbations. It is now possible to derive and maintain pluripotent mouse ES cells without serum factors by using two small molecule kinase inhibitors (2i) in combination with the cytokine leukemia inhibitory factor (LIF) (Ying et al., 2008). The 2i inhibitors, PD0325901 and CHIR99021, selectively target mitogen-activated protein kinase kinase (Mek) and glycogen synthase kinase-3 (Gsk3), respectively. The inhibitors shield pluripotent cells from differentiation triggers: fibroblast growth factor-4 stimulation of the Mek-Erk pathway and endogenous repressor activity of Tcf3 (Kunath et al., 2007; Wray et al., 2011). Use of 2i has enabled derivation of germline-competent ES cells from all mouse strains tested and for the first time from rats (Buehr et al., 2008; Hanna et al., 2009; Kiyonari et al., 2010; Li et al., 2008; Nichols et al., 2009). 2i thus provides a better-tuned environment for rodent ES cells. Indeed, the mosaic expression of pluripotency factors observed in serum is effectively eliminated in 2i (Wray et al., 2011). Furthermore, culture in 2i may mimic the environment in the mature mouse inner cell mass (ICM) where the fibroblast growth factor receptor is downregulated in the epiblast (Guo et al., 2010).
Here we applied massively parallel sequencing technology to characterize the global transcriptome and to map selected histone modifications in naive mouse ES cells maintained in 2i compared with heterogeneous cultures in serum.
Three ES cell lines derived and maintained in 2i plus LIF (“2i” ES cells) were compared with three ES cell lines established and cultured in serum plus LIF (“serum” ES cells) (Tables S1 and S2). Each cell line is functionally pluripotent as demonstrated by competence to generate high-contribution chimaeras with germline transmission. Expression values from RNA-seq data were calculated by quantifying the number of sequence reads for each gene with standardized RPKM values (reads per kilobase of exon model per million mapped sequence reads). This comparison showed that 1,489 genes have more than 2-fold higher transcript abundance in 2i (p value < 0.2), whereas 1,947 genes exhibit more than 2-fold higher expression in serum (Figure 1A). Moreover, 160 genes expressed in 2i (RPKM > 0.5) were silent in serum (RPKM < 0.2) and 461 genes were expressed only in serum (Figure S1A and Table S3 available online). The majority of categorized stem cell maintenance genes ([SCM] GO:0019827), including validated core pluripotency factors, Pou5f1, Nanog, Sox2, Esrrb, Klf2, Klf4, and Tbx3, are transcribed to similar levels in 2i and serum (Figure 1B and Figure S1B). Nine SCM genes are more highly expressed in 2i (Figure 1B and Figure S1C). Of these, only Tcl1 has been implicated as a regulator of self-renewal (Ivanova et al., 2006), and although this transcript is more abundant in 2i, it is also well-expressed in serum. In serum, 16 SCM genes showed higher expression. Factors in this group, such as c-Myc and the Id proteins, are known to be induced by Erk signaling and by serum. They may confer additional robustness to the pluripotent state to counter differentiation stimuli (Ying et al., 2003a). Interestingly, several of these genes are induced when ICM cells are explanted in medium containing serum (Figure S1D; Tang et al., 2010).
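The RPKM normalization and the expression cutoffs quoted above (more than 2-fold difference, RPKM > 0.5 for expressed, RPKM < 0.2 for silent) can be illustrated with a minimal sketch. The function names and category labels are hypothetical and are not taken from the analysis pipeline used in the study; the p value filter is deliberately omitted.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and library size must be positive")
    kilobases = exon_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / kilobases / millions


def classify_gene(rpkm_2i, rpkm_serum, expressed=0.5, silent=0.2, fold=2.0):
    """Assign a gene to an illustrative category (labels are hypothetical).

    Condition-specific genes are checked first, then the fold-change cutoff.
    Cross-multiplication avoids dividing by zero for unexpressed genes.
    """
    if rpkm_2i > expressed and rpkm_serum < silent:
        return "2i-only"
    if rpkm_serum > expressed and rpkm_2i < silent:
        return "serum-only"
    if rpkm_2i > fold * rpkm_serum:
        return "higher in 2i"
    if rpkm_serum > fold * rpkm_2i:
        return "higher in serum"
    return "similar"


# Toy example: 500 reads on a 2 kb exon model in a library of 10 million reads.
print(rpkm(500, 2_000, 10_000_000))
print(classify_gene(1.0, 0.1))
```

In practice the fold-change call would be combined with a statistical test across replicate cell lines; the sketch only shows how the thresholds partition genes.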
Functional annotation clustering of differentially expressed genes by Gene Ontology ([GO] PANTHER [protein analysis through evolutionary relationships]) and Pathway (KEGG [Kyoto encyclopedia of genes and genomes]) analysis (Figure 1C) revealed that genes upregulated in 2i are highly enriched for terms associated with metabolic processes. This is probably in part a direct consequence of inhibition of Mek and Gsk3 and the absence of serum constituents. Reduced c-Myc may further affect metabolic networks. Major differences are also apparent for genes involved in cell-cycle regulation. Proliferation is nonetheless similar in the two conditions (Ying et al., 2008), reflecting the absence of G1 restriction in ES cells.
Upregulated genes in serum are significantly enriched for GO terms linked to developmental processes, particularly ectoderm and mesoderm germ layer specification (Figure 1C). Genes such as Pax6, T (Brachyury), and Runx1 show very low or undetectable expression in 2i but appreciable transcription in serum (Figure 1D). Other ectoderm and mesoderm marker genes such as Runx3, Sox18, Cdx4, and Tal1 are also activated in serum, although only to low levels. In contrast, several genes associated with the germline or with endoderm are expressed at similar levels in both conditions.
ES cells maintained in 2i are morphologically uniform and rather homogeneous in expression of pluripotency regulators (Wray et al., 2010). In contrast, serum ES cells are heterogeneous in morphology (Figure S2A) and expression of factors such as Nanog, Rex1, Stella, and Klf4 (Chambers et al., 2007; Hayashi et al., 2008; Toyooka et al., 2008; Figure S2B). In serum, ES cells with a mono-allelic green fluorescent protein (GFP) knockin at the Rex1 (Zfp42) locus comprise GFP-positive and -negative populations that can be separated by flow cytometry (Figure S2C; Wray et al., 2011). These populations are functionally distinct. Rex1GFP-positive cells plated in serum generate colonies of undifferentiated cells, whereas Rex1GFP-negative cells produce predominantly small, differentiated colonies (Figure S2D). On plating in 2i, the yield of ES cell colonies in Rex1-positive cells is more than 10 times that in Rex1-negative cells. Rex1-negative cells have therefore largely lost clonogenic self-renewal capacity. Consistent with this, although they express Oct4, they lack Nanog and Klf4 (Figure S2E). All Nanog-positive and almost all Klf4-positive cells are within the Rex1GFP-positive fraction. Expression is still mosaic within this population, reflecting transcriptional fluctuation (Chambers et al., 2007; Kalmar et al., 2009; Figure S2E). In contrast, ES cells in 2i are almost uniformly positive for Rex1, Klf4, and Nanog (Wray et al., 2010; Figure S2B). We examined the transcriptome of the Rex1-positive fraction in serum and found a similar increased expression of a subset of SCM genes as in bulk ES cells (Figure 1B). Some ectoderm- and mesoderm-associated genes were also upregulated compared with expression in 2i ES cells but others showed little induction (Figure 1D). In general, differentiation genes were more highly expressed in Rex1GFP-low cells (Figures S2F–S2H). 
Nonetheless, transcriptomes of the Rex1-positive compartment in serum show many differences from 2i ES cells (Figure S2I). Notably, many genes that show higher expression in 2i are not upregulated in Rex1-positive serum ES cells (Figure 1B). Therefore it is unlikely that a subpopulation of serum cells persists in a state equivalent to 2i.
To assess whether the distinct gene expression patterns in 2i and serum represent “fixed” transcriptional states, we transferred 2i cells to serum and vice versa. Within two to three passages, cultures adopted morphological characteristics of the new condition. We carried out RNA-seq analyses after eight passages. Most of the SCM genes that showed lower expression in 2i were upregulated in serum, whereas transcripts elevated in 2i were downregulated (Figure 2A red dots and black squares, respectively). Reciprocal behavior was observed when serum cells were transferred to 2i (Figure 2B). Furthermore, genes involved in ectoderm and mesoderm germ layer specification were broadly upregulated after transfer of 2i cells to serum and the reverse transcriptional changes were observed when serum cells were passaged in 2i (Figures 2A and 2B, blue stars). Typical examples are shown in Figure 2C. Irrespective of the direction of exchange between 2i and serum, 818 genes were expressed more highly (>2-fold) in 2i and 1,209 showed higher levels in serum (Figure 2D). GO classification identified developmental genes and cell-cycle control genes as highly enriched upon transfer to serum, whereas genes upregulated in 2i were mainly associated with metabolic categories (Figure 2E). The reciprocity in transcriptome changes demonstrates that the transcriptional profiles are interconvertible.
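One way to read "irrespective of the direction of exchange" is as a set intersection: a gene counts only if it exceeds the 2-fold cutoff in the same condition in both transfer experiments. The sketch below illustrates that logic under this assumption. The data layout, function names, and gene labels are hypothetical, and the toy values are invented for illustration only.

```python
def higher_in(a, b, fold=2.0):
    """Genes whose value in `a` exceeds `fold` times their value in `b`."""
    return {g for g in a.keys() & b.keys() if a[g] > fold * b[g]}


def consistent_across_switches(experiments, condition, other, fold=2.0):
    """Genes higher in `condition` than in `other` in every experiment."""
    per_experiment = [
        higher_in(exp[condition], exp[other], fold) for exp in experiments
    ]
    return set.intersection(*per_experiment) if per_experiment else set()


# Toy RPKM values (made up). Each experiment pairs cells in their original
# condition with the same cells after transfer to the other condition.
two_i_to_serum = {
    "2i": {"geneA": 8.0, "geneB": 0.1, "geneC": 3.0},
    "serum": {"geneA": 2.0, "geneB": 1.5, "geneC": 2.5},
}
serum_to_two_i = {
    "2i": {"geneA": 9.0, "geneB": 0.2, "geneC": 7.0},
    "serum": {"geneA": 3.0, "geneB": 2.0, "geneC": 3.0},
}
experiments = [two_i_to_serum, serum_to_two_i]

print(consistent_across_switches(experiments, "2i", "serum"))
print(consistent_across_switches(experiments, "serum", "2i"))
```

Here geneC exceeds the cutoff in only one experiment and is therefore excluded from both lists, which is the behavior that distinguishes a reciprocal signature from a line-specific difference.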
We performed chromatin immunoprecipitation and deep sequencing (ChIP-seq) to analyze posttranslational histone modifications: H3K4me3 and H3K36me3 associated with active promoters and transcribed genes, respectively; H3K27me3 linked to silencing, and H3K9me3 associated with constitutive heterochromatin and imprinted genes (Table S1). We also analyzed the polycomb repressor complex 2 (PRC2) component Ezh2 that methylates H3K27 (Cao et al., 2002).
Determination of average profiles over 2,000 genes that are most highly expressed in both conditions (Figure 3A and Figure S3A) revealed conventional distribution of H3K4me3 on active promoters and of H3K36me3 extending over the coding body. The H3K9me3 ChIP-seq state maps of 2i and serum cells were nearly identical. In both conditions deposits were most prominent at satellites and imprinted genes (Figures S3B–S3E). As expected, the H3K27me3 mark is absent from actively transcribed loci (Figure 3B). It appears to be mutually exclusive with H3K36me3 (Figures 3A–3C and Figure S3F), in line with recent biochemical data showing that PRC2 activity is inhibited by active marks including H3K36me3 (Schmitges et al., 2011). However, H3K27me3 is widely deposited over intergenic regions and inactive genes at levels appreciably higher than random distribution. This lawn of H3K27me3 is qualitatively and quantitatively similar in 2i and serum ES cells (Figure 3C). A pronounced difference is apparent only at promoters of lowly expressed genes (Figure 3B). The averaged profile of these promoters showed markedly less H3K27me3 in 2i than in serum. This was not reflected in any overall increase in expression (Figure S3H). Three independent 2i ES cell lines exhibited a significantly reduced level of H3K27me3 at the promoters of poorly or nonexpressed genes compared to the level in serum cultures (Figure 3D). To investigate whether differences in H3K27me3 deposition reflected heterogeneity in serum, we performed H3K27me3 ChIP-seq on Rex1-positive and Rex1-negative serum subpopulations. Intriguingly, the H3K27me3 signals were very similar, each resembling the total serum ES cell population (Figure 3E).
Intensity plots covering a region of 5 kb up- and downstream of all promoters that are decorated with H3K27me3 in serum demonstrate the major reduction in H3K27me3 in 2i cells (Figure 3F). Ezh2 levels were also diminished at these locations in 2i. In either condition, the H3K27me3 pattern follows a camelback profile with a depleted region around the transcriptional start site. Ezh2 appears as a single peak centered on this trough. Representative examples of differential H3K27me3 profiles are shown in Figure 3G (Figure S3I shows the PCR validation). The Gata6, Pax9, and Lhx1 genes are barely expressed in either 2i or serum, but in all cases the H3K27me3 signal around the promoter is selectively and greatly reduced in 2i. For the Lhx1 locus, adjacent Aatf provides a contrasting example of a gene that is productively transcribed in both 2i and serum and remains devoid of H3K27me3 in either condition.
Given the interchangeable transcriptome profiles between 2i and serum, we examined the epigenomic landscape in ES cells transferred between the two conditions. Cells taken from 2i into serum acquired substantially elevated H3K27me3 at H3K27me3-associated promoters (Figure 3H), the Hox clusters (Figure S4A), and many other loci (Figure S4B). Conversely, serum ES cells transferred into 2i displayed diminished H3K27me3 at these loci. Ezh2 and Suz12 localization similarly switched between culture conditions (Figure 3H and Figures S4A and S4B). Therefore these epigenomic states are interconvertible.
H3K27me3 was reduced by 63%–75% over all Hox clusters in 2i (Figure 3G and Figure S4A). The Hoxc locus follows this pattern but with a distinctive variation; in the Hoxc13-c12 region H3K27me3 deposition is lost entirely. This region (boxed in Figure 3G) is transcribed only in 2i. Strand-specific RNA-seq profiling after rRNA depletion revealed two nonoverlapping transcripts on the reverse and forward strand (both boxed in Figure S4C, left). These ncRNAs are distinct from the HOTAIR ncRNA located between HOXC11-12 in human (Rinn et al., 2007). Consistent with recent findings (Guttman et al., 2010), we detected known as well as multiple previously unidentified ncRNAs. Many of these, such as H19, showed differential expression between 2i and serum (Figure S4C and Table S4).
We computed the number of H3K27me3 reads over nonrepetitive regions and plotted the frequency of occurrence and the genomic location. In 2i, high H3K27me3 deposition is scarce with very little enrichment at promoters. In contrast, in serum H3K27me3 is elevated at many genomic locations, 60%–65% of which are promoters (Figures 4A and 4B). In 2i, H3K27me3 is somewhat reduced over long interspersed nuclear element (LINE) repeats (Figure 4C). This is more than offset, however, by much higher levels of H3K27me3 present at satellites. Immunoblotting showed that the total cellular level of H3K27me3 is comparable in 2i and serum (Figure 4D), confirming that the differences at promoters are not secondary to a general reduction in H3K27me3 deposition in 2i.
H3K27me3 is deposited by PRC2 and facilitates recruitment of the PRC1 complex. Transcripts of PRC2 and PRC1 subunits were present at similar levels in 2i and serum (Figure S5A). Transcripts for the H3K27me3 demethylases Kdm6a and Kdm6b (also known as Utx and Jmjd3) were also comparable. Ezh2 immunoblotting indicates slightly lower protein in 2i than in serum (Figure S5B). However, phosphorylation of Ezh2 at Thr345, reported to be important for PRC2 recruitment (Kaneko et al., 2010), is similar (Figure S5C). Collectively, these data suggest that the difference in H3K27me3 occupancy at silent promoters in 2i is not primarily attributable to reduced expression of polycomb nor to altered demethylase expression.
Promoters that are marked by H3K27me3 may also display H3K4me3. Such bivalent genes are thought to be poised for activation (Azuara et al., 2006; Bernstein et al., 2006; Mikkelsen et al., 2007). We binned and ranked promoters according to the read density for H3K27me3 (Figure 5A) measured in serum and assessed whether bivalency is preserved in naive ES cells. Applying filters and thresholds similar to those used by Mikkelsen et al. (2007), we classified almost 3,000 genes as bivalent in serum (Figure 5B, upper, and Table S5). In 2i, due to the reduced deposition of H3K27me3, many of these genes fall below the threshold, resulting in fewer than 1,000 genes that qualify as bivalent (Figure 5B, lower). Intensity plots show the general and pronounced diminution in H3K27me3 deposition, whereas H3K4me3 is only slightly altered (Figure 5C). Figure S6A documents the levels of mRNA, H3K27me3, and H3K4me3 in 2i versus serum. Notably, the profiles interconvert upon switching cells between serum and 2i (Figures S6B and S6C).
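The threshold-based call described above can be sketched as follows: a promoter is bivalent when both marks reach their cutoffs, so a drop in H3K27me3 alone is enough to remove a gene from the bivalent class. The cutoff values, promoter labels, and read densities below are hypothetical placeholders, not the filters of Mikkelsen et al. (2007) or data from this study.

```python
def is_bivalent(k4me3, k27me3, k4_min, k27_min):
    """A promoter is called bivalent when both marks reach their cutoffs."""
    return k4me3 >= k4_min and k27me3 >= k27_min


def count_bivalent(promoters, k4_min, k27_min):
    """Count bivalent promoters; `promoters` maps name -> (H3K4me3, H3K27me3)."""
    return sum(
        1 for k4, k27 in promoters.values()
        if is_bivalent(k4, k27, k4_min, k27_min)
    )


# Hypothetical read-density cutoffs and toy promoter values.
K4_MIN, K27_MIN = 10.0, 5.0
serum = {"p1": (40.0, 12.0), "p2": (35.0, 9.0), "p3": (50.0, 0.5)}
# Same promoters with H3K4me3 nearly unchanged and H3K27me3 reduced.
two_i = {"p1": (42.0, 6.0), "p2": (33.0, 2.0), "p3": (55.0, 0.3)}

print(count_bivalent(serum, K4_MIN, K27_MIN))
print(count_bivalent(two_i, K4_MIN, K27_MIN))
```

Because the classification is a hard threshold, the reported gene counts depend on the chosen cutoffs; the intensity plots give the threshold-independent view of the same change.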
In both serum and 2i, the bivalent genes are enriched for involvement in developmental processes. Representative examples are the mesoderm specification marker Hey2 and ectodermal Metrnl (Figure 5D). Transcripts are barely detectable in 2i, although H3K4me3 is present and H3K27me3 is low. In serum, transcription is slightly upregulated even though the promoters show a broad gain of H3K27me3. A significant proportion of genes with bivalent promoter marking (31%) exhibit only background transcription in either condition (RPKM < 0.2). However, 14% of the bivalent genes are serum specific (RPKM > 0.5 in serum; RPKM < 0.2 in 2i), whereas a minor fraction (4%) are expressed only in 2i.
In either serum or 2i, H3K27me3 does not colocalize precisely with H3K4me3 but accumulates on either side of the H3K4me3 peak at the transcription start site (Figure 5C). This is consistent with observations that targets of TrxG proteins, which methylate H3K4, show reduced H3K27 methylation (Papp and Müller, 2006; Srinivasan et al., 2008) and that PRC2 activity is inhibited by active marks, including H3K4me3 (Schmitges et al., 2011). Strikingly, RNA polymerase II (Pol II) is evident over transcription start sites at higher levels in 2i than in serum (Figure 5C), suggestive of promoter-proximal pausing.
c-Myc is implicated in Pol II pause release (Rahl et al., 2010). We previously noted a very low level of c-Myc protein in 2i (Ying et al., 2008). The RNA-seq data show that c-myc mRNA is 40- to 50-fold lower in 2i than in serum and n-myc and l-myc are also reduced (Figure 1B). We analyzed c-Myc targets that are upregulated in serum (Figure 6A and Figure S7A). Averaged profiles show that promoters of these genes are loaded with H3K4me3 and Pol II in 2i. In serum, Pol II is reduced at the promoters and increased over coding bodies. The Pol II traveling ratio is accordingly increased (Figure 6B), consistent with c-Myc acting as a pause release factor in serum. Typical examples are Npm1 and Ncl (Figure 6C).
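Pausing is commonly summarized by comparing Pol II density near the transcription start site with density over the gene body. The sketch below computes such a ratio from a per-base coverage vector that starts at the start site. The orientation chosen here (promoter density divided by body density, so larger values mean more pausing), the 300 bp promoter window, and the function name are assumptions for illustration; the traveling ratio reported in the text may use a different window or the inverse orientation. The coverage values are invented.

```python
def pausing_index(coverage, promoter_len=300):
    """Mean Pol II density in the promoter window divided by the gene-body mean.

    `coverage` is a per-base list of read counts whose first element sits at
    the transcription start site. Returns infinity when the body has no signal.
    """
    if promoter_len <= 0 or len(coverage) <= promoter_len:
        raise ValueError("coverage must extend beyond the promoter window")
    promoter = coverage[:promoter_len]
    body = coverage[promoter_len:]
    promoter_density = sum(promoter) / len(promoter)
    body_density = sum(body) / len(body)
    if body_density == 0:
        return float("inf")
    return promoter_density / body_density


# Toy profiles: strong promoter signal with a sparse body (paused-like)
# versus reduced promoter signal with a denser body (released-like).
paused_like = [10] * 300 + [1] * 700
released_like = [4] * 300 + [2] * 700

print(pausing_index(paused_like))
print(pausing_index(released_like))
```

A shift of signal from the start site into the coding body, as described for c-Myc targets in serum, lowers this index even if total Pol II occupancy on the gene is unchanged.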
We assessed to what extent global differences in transcriptome between 2i and serum might be related to c-Myc. Several differentially expressed genes are c-Myc targets, notably cell-cycle regulators (Figures S7B and S7C). These include cdk/cyclinD components that are positively regulated by Myc and are increased in serum, and conversely cell-cycle inhibitors, p16(Ink4A), p19(Arf), and p21 that are repressed by Myc and upregulated in 2i (Figure S7D). Overall, however, direct c-Myc targets as determined by Chen et al. (2008) represent less than 15% of genes differentially expressed between 2i and serum (Figure 6D). Furthermore, gene ontology classification of c-Myc targets upregulated in serum did not identify categories associated with developmental processes. Therefore Myc is unlikely to be a major determinant of differential expression and metastability.
We evaluated the average histone modification profile of all genes that change expression more than 2-fold in 2i versus serum. Genes upregulated in 2i show the expected increased H3K4me3 deposition at the promoter and higher levels of H3K36me3 in the coding body than in serum (Figure 6E and Figure S7E). The repressive mark H3K27me3 is correspondingly reduced. Upregulated genes in serum also show an increase in H3K36me3 over the coding body but in general do not exhibit a significant change in H3K4me3 deposition. More remarkably, on average they show increased H3K27me3.
We then examined Pol II occupancy at these two groups of genes. This showed that upregulation in 2i is reflected in elevated Pol II over the transcriptional start site as well as the coding body (Figure 6F, left). In contrast, genes upregulated in serum showed increased Pol II over the coding body but also on average a reduced signal at the start site (Figure 6F, right). These features indicate that transcriptional elongation at genes already loaded with Pol II is a widespread mechanism of upregulation in serum. RNA-seq data reveal no overt differences in expression of pTEFb components or known regulators of pausing between 2i and serum (Figure S7F). Regulation at the protein level may therefore control differential promoter proximal pause release in naive and metastable ES cells.
ES cells maintained in either 2i or serum can colonize the mouse embryo, demonstrating that they are functionally pluripotent. However, they differ markedly in transcription of ectodermal and mesodermal specification genes. The precocious transcription of lineage-associated genes, often termed lineage priming (Hu et al., 1997), is posited to poise stem cells for differentiation. We therefore compared the differentiation behavior of ES cells maintained in 2i or serum. We first used a monolayer neural induction protocol with Sox1GFP reporter ES cells to quantify differentiation (Ying et al., 2003b). Although ES cells maintained in serum express several neuroectodermal genes, they were less efficient in generating Sox1GFP-positive neural precursors than ES cells taken from 2i (Figure 7A). This could be due to the presence of cells already biased toward a mesodermal fate in serum. Clearly pre-expression of neural genes in serum does not predispose to this fate. We then used Rex1GFP fractionation to compare ES cell subpopulations in serum with 2i ES cells in embryoid body (EB) differentiation. Rex1GFP-positive cells from serum showed similar behavior to 2i cells (Figure 7B). Downregulation of Nanog and Rex1 was followed by upregulation of the postimplantation epiblast marker Fgf5. From 3 days onward, Fgf5 was downregulated and first T (Brachyury; although very minor as compared to the Rex1-negative cells), then mesoderm and endoderm lineage markers Tbx6, Cxcr4, Sox17, and Gata4 appeared. This order is consistent with the developmental progression from blastocyst to gastrulation. In contrast, Rex1GFP-negative cells exhibit accelerated upregulation of T and Tbx6 consistent with their partial differentiation and the loss of self-renewal (Figure S2D).
High-resolution genome-wide analyses have revealed that culture environments impose distinctive transcriptional and epigenomic properties on mouse ES cells. In total some 13,000 genes are transcribed at above background levels (>0.2 RPKM). The corollary of this is that around half of genes are effectively inactive. Therefore, undifferentiated ES cells do not show promiscuous gene expression or global transcriptional hyperactivity (Efroni et al., 2008). Nonetheless, the pluripotent transcriptome displays a broad bandwidth; more than 25% of active genes show 2-fold or greater differences between 2i and serum. Around 1,400 genes, predominantly associated with metabolic processes, are upregulated in 2i. In contrast, KEGG analysis points to decreased expression in 2i of components that might drive differentiation, such as cell communication, mitogen-activated protein kinase (MAPK), transforming growth factor β (TGFβ), and Wnt pathways. Most strikingly, many ectodermal and mesodermal specification genes that exhibit significant expression in serum are repressed in 2i. Low to absent lineage-affiliated gene expression indicates that multilineage priming is not an intrinsic feature of self-renewing ES cells. Upregulation of such genes in serum suggests that metastability may be an induced condition rather than an inherent property of pluripotent cells.
Some endodermal genes such as Hex retain low-level expression in 2i. This may reflect the potential to generate extraembryonic endoderm (Canham et al., 2010). High levels of Prdm14, which has been reported to repress extraembryonic endoderm transcription factors (Ma et al., 2011), may prevent full activation of this program.
Importantly, ES cells transferred between 2i and serum switch their transcriptional profile. Thus a significant component of previously described ES cell signatures reflects an induced serum response. However, critical pluripotency factors are transcribed at similar or only slightly higher levels in 2i. The pluripotency repressors Tcf3 (Wray et al., 2011) and components of the NuRD complex (Kaji et al., 2006) are also expressed at comparable levels. A subset of SCM factors are specifically upregulated in serum, including the Id genes that are induced by BMPs or fibronectin and are thought to directly counter the effects of Erk activation (Ying et al., 2003a). Increased Eras, shown to be important for ES cell propagation (Takahashi et al., 2003), and factors such as Sall4, Lin28, and Utf1, may also contribute to reinforcing self-renewal in the face of differentiation stimuli.
The conflict between pluripotency factors and lineage specifiers results in metastability and incipient differentiation in serum. It is suggested that this “precarious balance” (Loh and Lim, 2011) may reflect the circumstance in egg cylinder epiblast cells. However, serum stimulation is an artifactual scenario that may be far from representative of the spatiotemporal precision of inductive stimuli in the embryo. To access postimplantation definitive lineages, ES cells should pass through a phase equivalent to egg cylinder epiblast (Rossant, 2008). Consistent with this, Fgf5 is upregulated in EBs prior to definitive germ layer markers. From 2i ES cells and the Rex1GFP-positive fraction of serum ES cells, this process follows similar kinetics. Therefore although serum induces transcriptional and epigenetic changes and associated metastability, developmental potential within the Rex1-positive compartment is not fundamentally altered. This is substantiated by the capacity of ES cells from either condition to contribute extensively to chimaeras. However, a significant proportion of cells in serum lose expression of Rex1 and of core pluripotency factors such as Nanog and Klf4. They are developmentally more advanced and should be considered distinct from ES cells even though they retain Oct4 (Smith, 2010).
Expression of many genes associated with metabolic and biosynthetic processes is enriched in 2i. This is likely in large part a response to absence of serum constituents, loss of MAPK signaling, and inhibition of GSK3 and indicates that ES cells have adaptable metabolomic capacity. Probably as a consequence of low c-Myc, the cell-cycle inhibitors p16, p19, and p21 are upregulated in 2i, even at the protein level (Figure S7D). Nonetheless, ES cells continue to proliferate rapidly, reflecting their freedom from cyclin checkpoint control (Burdon et al., 2002; Stead et al., 2002). These features can explain the robust expansion of ES cells independent of serum factors and likely underlie their latent tumorigenicity (Chambers and Smith, 2004).
In 2i and serum H3K4me3 peaks are globally similar in number and intensity. In contrast, there is a striking difference in the pattern of H3K27me3 deposition. This mark is present as a lawn across intergenic regions and inactive genes (Figure 3C). However, elevated deposits at promoters of repressed genes are greatly diminished in 2i. This promoter-specific diminution in H3K27me3 is common to multiple ES cell lines. The majority of these genes show reduced rather than increased transcription in 2i. Ezh2 is localized less intensely at promoters in 2i, which may underlie the selective reduction in H3K27me3. Global levels of H3K27me3 are similar in 2i and serum. Indeed, H3K27me3 is increased at satellites in 2i, indicating that these may serve as a sink. Notably, there is no change in H3K9me3 over satellites (Figure S3C).
In 2i only around 1,000 genes have both H3K4me3 and H3K27me3 marks, which argues against bivalency as a master epigenetic blueprint. Nonetheless, most of the remaining bivalent genes can be classified as developmental. In serum, more genes are bivalent due to acquisition of H3K27me3. Surprisingly, this is accompanied by a slight overall increase in expression, although the majority remain silent or transcribed at very low levels (Figure S6A). It is conceivable that although the local levels of PRC2 and H3K27me3 are reduced in 2i, they remain sufficient to repress transcription. It should be noted, however, that ES cells lacking PRC2, PRC1, or both are viable and show derepression of lineage-specific markers to only a low level (Leeb et al., 2010). Our findings are thus in line with genetic evidence that polycomb is not a central mechanism for silencing gene expression in the naive state and only becomes critical during differentiation.
RNA polymerase pausing has been identified by GRO-seq analysis (Min et al., 2011) at variable extents at many genes in ES cells cultured in serum. Our findings indicate that pausing is more prevalent in 2i than serum. Induction of c-Myc in serum may facilitate pause release at some loci. This is consistent with recent evidence that Myc function is unnecessary in naive ES cells but required in serum (Hishida et al., 2011). However, many of the genes whose expression is most markedly upregulated in serum, including germ layer specification factors, are not reported Myc targets. Therefore additional mechanisms are likely to control pause release in ES cells.
In mammals pluripotent cells harbor the germline and most pluripotency factors are also key players in germ cell specification and differentiation. It is interesting therefore that in Caenorhabditis elegans and Drosophila, germline development is dependent on transcriptional pausing mediated at the level of pTEFb antagonism by Pie-1 and Pgc, respectively (Nakamura and Seydoux, 2008). This raises the question of whether naive ES cells might contain an analogous factor that interferes with pTEFb to suppress transcriptional elongation. It will also be revealing to determine whether Erk signaling may cause activation of pTEFb (Fujita et al., 2008; Lee et al., 2010).
Recruitment and pausing of RNA polymerase II with lack of consolidated H3K27me3 silencing may constitute a potentiated template for induction of lineage-specific transcription programs. Pausing may serve to minimize the effects of noise and ensure rapid, coordinated, and synchronous gene induction in response to developmental cues or extrinsic stimuli (Boettiger and Levine, 2009; Nechaev and Adelman, 2011). Recent studies also indicate that Pol II pausing inhibits nucleosome assembly (Gilchrist et al., 2010) and could thereby influence the histone modification profile. Interestingly, in Xenopus embryos H3K27me3 is not deposited during zygotic gene activation but is acquired later and associated with spatial restriction of gene expression (Akkers et al., 2009). In the mouse ICM, H3K27me3 localization has not been determined, but various epigenetic silencing components appear to be expressed at low levels (Tang et al., 2010).
Collectively the observations reported here yield insights into the molecular underpinning of naive pluripotency and revise previous assumptions derived from analysis of heterogeneous and metastable serum-treated cultures. The findings provoke questions about the regulation of gene expression in pluripotent cells and the process of lineage specification. Transcriptional potentiation through promoter proximal pausing may play a major role in the establishment and stable maintenance of naive pluripotency. Currently there is great interest in isolating human pluripotent stem cells in a naive state (Hanna et al., 2010; Wang et al., 2011). The distinctive transcriptome and epigenome characteristics of ground state mouse ES cells may provide a valuable criterion against which to measure such claims. In addition, these data sets provide a benchmark resource for analysis and modeling of gene expression control during self-renewal and in the transition from naive pluripotency to lineage commitment.
ES cells were cultured without feeders in the presence of leukemia inhibitory factor (LIF) either in Glasgow modification of Eagle's medium (GMEM) containing 10% fetal calf serum or in serum-free N2B27 supplemented with MEK inhibitor PD0325901 (1 μM) and GSK3 inhibitor CHIR99021 (3 μM), together known as 2i (Ying et al., 2008). E14Tg2a (E14), Rex1GFPd2 (RGD2, Rex1GFP), and HM1 are male ES cells of 129 background established and maintained in serum without feeders. The serum-derived female XT67E1 line is from a mixed 129 and PGK/C3H background (Penny et al., 1996). TNGA female ES cells were derived and maintained in 2i from embryos on a mixed strain 129 and C57BL/6 background heterozygous for eGFP knock-in at the Nanog gene (Chambers et al., 2007). Female and male ES cells from the nonobese diabetic (NOD) strain were derived and maintained in 2i (Nichols et al., 2009). Chromatin immunoprecipitations (ChIP) were performed as described (Marks et al., 2009). Cell sorting (fluorescence-activated cell sorting [FACS]), immunoblotting, RNA isolation, and cDNA synthesis were performed according to standard protocols described in the Extended Experimental Procedures, which also lists antibodies used.
Total RNA was isolated with Trizol (Invitrogen) according to the manufacturer's recommendations. 100 μg total RNA was subjected to two rounds of poly(A) selection (Oligotex mRNA Mini Kit; QIAGEN), followed by DNaseI treatment (QIAGEN). 100–200 ng mRNA was fragmented by hydrolysis (5× fragmentation buffer: 200 mM Tris acetate, pH 8.2, 500 mM potassium acetate, and 150 mM magnesium acetate) at 94°C for 90 s and purified (RNeasy MinElute Kit; QIAGEN). cDNA was synthesized with 5 μg random hexamers by Superscript III Reverse Transcriptase (Invitrogen). Double-stranded (ds) cDNA synthesis was performed in second strand buffer (Invitrogen) according to the manufacturer's recommendations and purified (MinElute Reaction Cleanup Kit; QIAGEN). Strand-specific, rRNA-depleted ds cDNA profiling was performed with the ScriptSeq kit (cat. no. SS10924) from Illumina, according to the instructions of the manufacturer. rRNA depletion was performed with the Ribo-Zero rRNA Removal Kit (Human/Mouse/Rat; cat. no. RZH110424). Validation experiments were performed by RT-qPCR with primers as shown in Table S6.
Experiments were performed with 3.3 × 10⁶ cells and 3 μg antibody per ChIP as described (Marks et al., 2009) with two minor modifications. Crosslinking was performed on the culture plates for 20 min and ChIP'ed DNA was purified by Qiaquick PCR purification Kit (QIAGEN). ChIP enrichment levels were analyzed by qPCR for quality control. Antibodies used for ChIP are described in the supplemental information. Validation experiments were performed by qPCR with primers as shown in Table S6.
The following polyclonal antibodies were used for ChIP: H3K4me3 (Diagenode pAb-MEHAHS-024, A1-010); H3K27me3 (Millipore 07-449, DAM-1588246); H3K36me3 (Diagenode CS-058-100, A114-001); H3K9me3 (Abcam ab8898-100, lot 733953); Ezh2 (Active Motif 39639, 23809001); RNA Polymerase II (Diagenode AC-055-100, 001). The RNA Polymerase II antibody is also known as 8wg16; its main target is the nonphosphorylated Ser2 of the RNA Polymerase II carboxyl-terminal domain (CTD) consensus sequence repeat YSPTSPS. This antibody recognizes unphosphorylated (initiating) and Ser5-only phosphorylated RNA Polymerase II (Morris et al., 2005).
For immunoblot analysis, 3 μg of histone extracts (for histones; Abcam protocol) or 10 μg of nuclear extracts (for nonhistone proteins; prepared according to Ambrosino et al., 2010) were resolved by SDS-PAGE and blotted onto nitrocellulose membranes. Membranes were blocked for 1 h in TBS-Tween containing 5% milk and incubated overnight at 4°C with the indicated antibodies diluted in TBS-Tween containing 3% milk. After washes, the membranes were incubated with secondary antibodies diluted in TBS-Tween containing 3% milk for 1 h at room temperature. HRP conjugates were detected with enhanced chemiluminescence (ECL Plus, Amersham Biosciences). The following primary antibodies were used for immunoblotting: H3K4me3 (Diagenode pAb-MEHAHS-024, A1-010, 1:1000); H3K27me3 (Millipore 07-449, DAM-1588246, 1:1000); H4 (Abcam, ab7311, 826236, 1:1000); Ezh2 (Active Motif 39639, 23809001, 1:1000); Suz12 (Abcam ab12073-100, 418328, 1:500); β-actin (Abcam ab16039, 104192, 1:500); p16 (M-156, sc-1207, H1810); p19ARF (5-C3-1, sc-32748, A0411); p21 (F-5, sc-6246, G0210). Secondary antibodies used: HRP-conjugated polyclonal swine anti-rabbit IgG (Dako, P0399, 00042894, 1:4000); HRP-conjugated polyclonal rabbit anti-mouse IgG (Dako, P0161, 00046035, 1:3000); HRP-conjugated polyclonal rabbit anti-rat IgG (Dako, P0450, 00017777, 1:1000).
Cells were fixed in 4% formaldehyde, permeabilized with 0.1% Triton X-100, and blocked with 3% donkey serum. Overnight incubation was performed with Oct3/4 (Santa Cruz sc-5279, c-20), Klf4 (R&D Systems AF3158), or Nanog (eBioscience 14-5761-80) antibody at 4°C. Alexa Fluor 647 donkey anti-goat or anti-mouse IgG was used as secondary antibody.
After sorting, ES cells were plated in serum or 2i + LIF at clonal density in duplicate wells (800 cells per well of a 6-well dish). Colonies were grown for 5 days in serum or for 7 days in 2i + LIF. Alkaline phosphatase staining was performed to score for colonies consisting of largely undifferentiated cells (undiff), mixed, and largely differentiated (diff) cells.
Cell sorting was performed according to Wray et al. (2011).
Quality control of DNA libraries prepared for sequencing was made by qPCR and by running the products on a Bioanalyzer (BioRad). Samples were sequenced to a depth of approximately 20 million uniquely mapped tags per sample. Sequences were aligned to the mouse MM9 reference genome with the Illumina Analysis Pipeline allowing one mismatch. Only the tags aligning to one position on the genome were considered for further analysis. For RNA-seq, further analysis was performed with the 36 bp aligned sequence reads. For ChIP-seq, identical sequence tags were discarded to obtain a nonredundant set, and the 36 bp sequence reads were directionally extended to 300 bp, corresponding to the length of the original fragments used for sequencing. The output data were converted to Browser Extensible Data (BED) files for downstream analysis and Wiggle (WIG) files for viewing.
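The ChIP-seq tag processing described above (removal of identical tags, then directional extension of each 36 bp read to 300 bp) can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: the function name `extend_tags`, the tuple layout, and the choice to keep one copy of each duplicated tag are assumptions.

```python
from typing import Iterable, List, Tuple

# Assumed input layout: (chromosome, 0-based start of the aligned read, strand)
Tag = Tuple[str, int, str]
# Output resembles a BED record: (chromosome, start, end, strand), end exclusive
Interval = Tuple[str, int, int, str]


def extend_tags(tags: Iterable[Tag],
                read_len: int = 36,
                fragment_len: int = 300) -> List[Interval]:
    """Collapse identical tags and extend each read to the fragment length."""
    seen = set()
    intervals: List[Interval] = []
    for chrom, start, strand in tags:
        key = (chrom, start, strand)
        if key in seen:
            # Identical tag already recorded: keep a single copy (assumption).
            continue
        seen.add(key)
        if strand == "+":
            # Plus-strand reads extend downstream from their 5' end.
            begin, end = start, start + fragment_len
        else:
            # Minus-strand reads extend toward lower coordinates from their
            # 5' end, which lies at the right edge of the aligned read.
            end = start + read_len
            begin = max(0, end - fragment_len)
        intervals.append((chrom, begin, end, strand))
    return intervals
```

For example, a minus-strand read aligned at position 1,000 covers 1,000 to 1,036 and is extended to 736 to 1,036, so that the interval represents the presumed original fragment.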
For de novo identification of ncRNAs from the strand-specific RNA-seq data, signals on the minus and plus strands were analyzed separately. Signals were quantified in 10 kb bins at a genome-wide scale. Bins that overlapped with known coding or noncoding RNAs (present in RefSeq, GenBank, or ENSEMBL) were excluded from further analysis. Subsequently, we selected bins with signals above background. Signals were averaged for TNGA 2i ES cells and E14 cells adapted to 2i (8 passages), as well as for E14 serum ES cells and TNGA cells adapted to serum (8 passages). Known ncRNAs were selected from the RefSeq database. RNAs were considered to be differential if there was at least a 2-fold difference between the 2i and serum conditions, with an extra constraint set by a p value < 0.2 (Student's t test).
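The binning step of this ncRNA search can be illustrated with a short sketch. It is not the code used for the analysis. The function names are hypothetical, the `min_reads` cutoff stands in for the unspecified "above background" criterion, and excluding annotated bins only on the matching strand is an assumption.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

BIN_SIZE = 10_000  # 10 kb bins, as in the text

BinKey = Tuple[str, str, int]  # (chromosome, strand, bin index)


def bin_counts(reads: Iterable[Tuple[str, int, str]],
               bin_size: int = BIN_SIZE) -> Counter:
    """Count reads per strand-specific bin; reads are (chrom, position, strand)."""
    counts: Counter = Counter()
    for chrom, pos, strand in reads:
        counts[(chrom, strand, pos // bin_size)] += 1
    return counts


def annotated_bins(transcripts: Iterable[Tuple[str, int, int, str]],
                   bin_size: int = BIN_SIZE) -> Set[BinKey]:
    """Bins touched by known transcripts given as (chrom, start, end, strand),
    with half-open coordinates."""
    covered: Set[BinKey] = set()
    for chrom, start, end, strand in transcripts:
        for idx in range(start // bin_size, (end - 1) // bin_size + 1):
            covered.add((chrom, strand, idx))
    return covered


def candidate_bins(reads, transcripts, min_reads: int,
                   bin_size: int = BIN_SIZE) -> Dict[BinKey, int]:
    """Unannotated bins whose read count reaches the (hypothetical) cutoff."""
    counts = bin_counts(reads, bin_size)
    known = annotated_bins(transcripts, bin_size)
    return {key: n for key, n in counts.items()
            if key not in known and n >= min_reads}
```

Treating each strand separately means that an unannotated antisense signal opposite a known gene would still be retained under this assumption.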
Identification of the H3K27me3 binding sites (peak calling) was performed via FindPeaks (Fejes et al., 2008) with a loose FDR cut-off of < 1 × 10⁻², subpeaks 0.9, triangles distribution and duplicate filter. The number of tags per peak was normalized for the genomic length of the peak (expressed as reads/kb), by which the peaks were categorized (Figure 4A). Overlaps with genomic features were determined with Galaxy (main.g2.bx.psu.edu).
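The length normalization of peaks amounts to a simple conversion to reads per kilobase. A minimal sketch follows, with a hypothetical function name and half-open peak coordinates assumed; the category boundaries used in Figure 4A are not stated here and are therefore not reproduced.

```python
def peak_density(tag_count: int, start: int, end: int) -> float:
    """Tags per kilobase for a peak spanning [start, end)."""
    length = end - start
    if length <= 0:
        raise ValueError("peak must have positive length")
    return tag_count * 1000.0 / length
```

A 3 kb peak containing 45 tags therefore has a density of 15 reads/kb.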
For the repeat analysis of ChIP-seq profiles, the mappings were performed with the maq (mapping and assembly with qualities) aligner version 0.7.1 (Li et al., 2008). The major advantage of maq over the ELAND pipeline for repeat analysis is that, if a sequenced read aligns at multiple places in the genome, maq assigns it randomly to one of these positions. This is useful when studying repeat classes, as the reads representing these classes will by definition map to multiple genomic locations. However, these locations will almost exclusively belong to the same class of repeat. All reads mapped by maq were included in downstream analyses. To enable direct comparisons, the samples were ratio normalized for the total number of tags mapped by maq. Sequence coordinates of various repeat classes were downloaded from the UCSC Table Browser (RepMask 3.2.7; rmskRM327). ChIP-seq tags were considered to represent a repeat class in case of any overlap of the 36 nt sequenced fragment with the repeat class. Subsequently, the number of ChIP-seq tags representing a repeat class was counted.
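The counting rule (any overlap of the 36 nt tag with a repeat of a given class) and the ratio normalization can be sketched as below. The function names and input layouts are hypothetical, the linear scan over repeats is chosen for clarity rather than speed, and scaling to a common reference total is one assumed reading of "ratio normalized".

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def count_repeat_tags(tags: Iterable[Tuple[str, int]],
                      repeats: Iterable[Tuple[str, int, int, str]],
                      read_len: int = 36) -> Dict[str, int]:
    """Count tags per repeat class.

    tags:    (chromosome, 0-based start) of each mapped read
    repeats: (chromosome, start, end, repeat class), half-open coordinates
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, rep_class in repeats:
        by_chrom[chrom].append((start, end, rep_class))

    counts: Dict[str, int] = defaultdict(int)
    for chrom, tag_start in tags:
        tag_end = tag_start + read_len
        # A tag counts once per class, however many repeats of that class it hits.
        classes_hit = {rep_class
                       for start, end, rep_class in by_chrom.get(chrom, ())
                       if start < tag_end and tag_start < end}
        for rep_class in classes_hit:
            counts[rep_class] += 1
    return dict(counts)


def scale_to_reference(counts: Dict[str, int],
                       total_mapped: int,
                       reference_total: int) -> Dict[str, float]:
    """Rescale counts as if the sample had reference_total mapped tags."""
    factor = reference_total / total_mapped
    return {rep_class: n * factor for rep_class, n in counts.items()}
```

After scaling, counts for the same repeat class can be compared directly between samples sequenced to different depths.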
DNA samples were prepared for sequencing by end repair of 20 ng DNA as measured by Qubit (Invitrogen). Adaptors were ligated to DNA fragments, followed by size selection (~300 bp) and 14 cycles of PCR amplification. Cluster generation and sequencing (36 bp) were performed with the Illumina Genome Analyzer IIx (GAIIx) platform according to standard Illumina protocols. The standard pipeline to generate the sequencing output files is described in the Supplemental Information. All sequencing analyses were conducted based on the Mus musculus NCBI m37 genome assembly (MM9; assembly July 2007). Table S1 summarizes the sequencing output. All RNA-seq and ChIP-seq data (FASTQ, BED, and WIG files) are available in the NCBI GEO SuperSeries GSE23943.
To obtain RNA-Seq gene expression values (RPKM), we used Genomatix (www.genomatix.de). Differential genes were called at a 2-fold difference and p < 0.2 in a Student t test (among three biological replicates for both 2i and serum). The identification of ncRNAs is described in the Extended Experimental Procedures. GO and KEGG analysis was performed with DAVID (http://david.abcc.ncifcrf.gov/).
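For orientation, the standard definition of RPKM (reads per kilobase of transcript per million mapped reads) and the 2-fold criterion can be written out as below. This is a sketch under stated assumptions: the exact computation performed by Genomatix is not described here and may differ, the function names are hypothetical, and the accompanying t test filter (p < 0.2) is deliberately not implemented.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def passes_fold_change(mean_2i: float, mean_serum: float,
                       min_fold: float = 2.0) -> bool:
    """True if the larger mean is at least min_fold times the smaller one.

    Written as a product rather than a ratio so that a gene silent in one
    condition (mean of zero) does not cause a division by zero.
    """
    high, low = max(mean_2i, mean_serum), min(mean_2i, mean_serum)
    return high > 0 and high >= min_fold * low
```

For example, 500 reads on a 2 kb transcript in a library of 20 million mapped reads gives an RPKM of 12.5.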
To compensate for differences in sequencing depth and mapping efficiency, the total number of unique reads of each sample was uniformly equalized, allowing quantitative comparisons. Tag densities on the average profiles were determined by calculating tag density over each base pair (using a 40 bp window size) per 10 million total mapped reads. The genes used for the average epigenetic profiles were based on the 2,000 most active/inactive genes in TNGA-2i. The 2,000 lowest/not expressed genes were selected by the additional requirement of H3K27me3 promoter enrichments of > 3-fold over background in either TNGA-2i or E14-serum. Random distribution values were determined by calculating average read densities of the genomic DNA profile (4.962 reads/kb at 10 million sequenced reads, equivalent to an average density of 1.489 per bp). Genes were considered bivalent if both H3K4me3 and H3K27me3 were > 3-fold over random distribution, similar to criteria applied by Mikkelsen et al. (2007). The RNA Polymerase II traveling ratio was calculated as described by Rahl et al. (2010). The ChIP-seq repeat analysis procedure is described in the Supplemental Information.
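The two background figures quoted above appear to be related through the 300 bp fragment extension: 4.962 reads/kb × 300 bp per read ÷ 1,000 bp ≈ 1.489 reads covering each base pair. The sketch below works through that arithmetic together with the depth scaling and the bivalency rule. The function names are hypothetical, and applying the 3-fold threshold to a single promoter density value per mark is a simplifying assumption.

```python
def background_per_bp(reads_per_kb: float, fragment_len: int = 300) -> float:
    """Average per-base coverage when each read is extended to fragment_len."""
    return reads_per_kb * fragment_len / 1000.0


def normalize_density(raw_density: float, total_mapped_reads: int,
                      scale: int = 10_000_000) -> float:
    """Express a tag density per 10 million total mapped reads."""
    return raw_density * scale / total_mapped_reads


def is_bivalent(h3k4me3: float, h3k27me3: float,
                background: float, min_fold: float = 3.0) -> bool:
    """Both marks must exceed min_fold times the random-distribution density."""
    threshold = min_fold * background
    return h3k4me3 > threshold and h3k27me3 > threshold
```

With a background of 1.489 per bp, the 3-fold threshold is 4.467, so a promoter with normalized densities of 5.0 (H3K4me3) and 4.6 (H3K27me3) would be called bivalent, whereas one with 5.0 and 4.0 would not.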
For monolayer neural differentiation, we used Sox1-GFP (46C) ES cells, which contain a GFP knock-in at the endogenous Sox1 locus (Ying et al., 2003b). Cells cultured in 2i or serum were plated at a density of 5,000 cells per cm². Sixteen hours after plating, media were switched to N2B27 to induce neural differentiation. Percentage of GFP-positive cells was determined by flow cytometry. For EB differentiation, single EBs were formed by sorting 1,500 cells into each well of PrimeSurface96U plates containing 15% serum and no LIF. Sixteen EBs were pooled each day and analyzed by RT-qPCR with TaqMan probes (Applied Biosystems).
We thank Eva Janssen-Megens, Anita Kaan, and Yan Tan for sequencing; Kees-Jan Françoijs, Hinri Kerstens, Simon van Heeringen, and Arjen Brinkman for bioinformatic assistance; and Rachel Walker for help with flow cytometry. We thank Paul Bertone for comments on the manuscript. Tatyana Nesterova and Neil Brockdorff provided the XT67E1 cells and Olga Ujhelly and Andras Dinnyes the HM1 cells. The pThr345-Ezh2 antibody was a gift from Danny Reinberg. The research leading to these results has received funding from the European Union grants HEROIC (18883; FP6/2005-2010), PluriSys (223485; FP7/2009), EuroSyStem (200720; FP7/2008), and ATLAS (221952; FP7/2009), the Deutsche Forschungsgemeinschaft SPP1356 Pluripotency and Reprogramming, and The Wellcome Trust. A.S. is a Medical Research Council Professor.
The GEO accession number for the SuperSeries reported in this paper is GSE23943.