Recent studies have shown that Tet family proteins can catalyze the conversion of 5-methylcytosine (5-mC) to 5-hydroxymethylcytosine (5-hmC) and play important roles in self-renewal and cell lineage specification in embryonic stem (ES) cells. These findings suggest a potential role for 5-hmC-mediated epigenetic regulation in modulating the pluripotency of ES cells. To explore this regulatory paradigm in human ES cells, we used a selective 5-hmC chemical labeling approach that we developed previously, coupled with affinity purification and deep sequencing, to establish the genome-wide distribution of 5-hmC in human ES cells. Integration of 5-hmC distributions with genome-wide histone profiles led us to identify the pluripotency-linked chromatin contexts associated with 5-hmC. Through association with genomic features defined on the basis of chromatin signatures, we find 5-hmC-mediated marking of not only specific promoters and gene bodies, but also distinct enhancer subtypes, including those marked with H3K4me1 and H3K27ac. Lastly, we find that 5-hmC is associated with the binding sites of specific core pluripotency transcription factors and is lacking at others. Our results suggest that 5-hmC is an important epigenetic modification associated with the pluripotent state that could play roles at a subset of promoters and enhancers with defined chromatin signatures in ES cells.
By correlating genome-wide distributions of 5-hmC with those of 11 diverse histone marks, we found that 5-hmC displayed relatively strong correlations with H3K4me1 and H3K4me2 versus H3K4me3, which, as expected, is consistent with previous correlations between DNA methylation detected by Methyl-Seq and histone modifications. 5-hmC also exhibited a strong correlation with H3K18ac, a mark regulated by CBP/p300 at enhancers that is associated with transcriptional activation. We also found more modest correlations with H3K27ac, H3K27me3, and H4K5ac, and very low correlations with H3K9ac and H3K9me3. However, our data suggested that 5-hmC was not strongly correlated with H3K36me3, a histone modification previously linked to DNA methylation detected by Methyl-Seq. This intriguing difference suggested differential marking of gene bodies by 5-hmC and H3K36me3 in pluripotent cells. Direct comparisons of genic 5-hmC and H3K36me3 indeed revealed that genes with the highest levels of TSS and gene body 5-hmC tend to exhibit intermediate levels of expression and harbor less intragenic H3K36me3, compared to genes with the highest levels of expression. Although a number of intriguing explanations might account for these observations, one possibility is that 5-hmC may function to temper transcription at both the TSS and gene body of intermediately expressed genes, while maintaining their potential to be more fully expressed when needed. Upon full activation, 5-hmC may be at least partially removed as the transcriptional unit acquires H3K36me3 and commits to a more fully active state. Restriction of 5-hmC to the TSS of repressed genes and its presence at both TSSs and gene bodies of intermediately expressed genes may also indicate distinct regulation of 5-hmC at these locations. At TSSs of genes that are repressed or expressed at low levels, the Polycomb group complex PRC2 may interact with 5-hmC to repress targeted genes while maintaining their potential for expression, as has been previously suggested. However, such distributions are distinct from those observed in mouse cerebellum, where 5-hmC is significantly enriched compared to ES cells, largely absent from TSSs, and high within gene bodies, positively correlating with gene expression. Thus, distinguishing the mechanisms that differentially influence the state and regulation of 5-hmC within gene bodies in the context of gene expression outcomes will be important for understanding the role of 5-hmC in both brain and ES cells.
Our genome-wide analyses of 5-hmC also revealed a general promoter-proximal bias of 5-hmC around RefSeq transcripts in human ES cells, which is consistent with recently published work on mapping 5-hmC in mouse ES cells. This TSS-associated bias was also dependent on gene expression levels, with 5-hmC transitioning from a position directly over the TSS at repressed genes to a bimodal distribution at more highly expressed genes, likely reflecting the observed dual function of 5-hmC in mouse ES cells, although this correlation was not strictly linear. Interestingly, we find that the bimodal distribution of 5-hmC is also strongly correlated with the distributions of H3K4me1 and H3K4me2, but inversely correlated with H3K4me3. The bimodal distribution of 5-hmC, H3K4me1, and H3K4me2 around TSSs might reflect the establishment of divergent paused RNAPII, which is known to play a critical regulatory role at developmentally regulated transcripts in ES cells. This could thereby point to an influence of 5-hmC on transcriptional pausing at such promoters in hES cells. We also noted that such a promoter-proximal bias of 5-hmC in ES cells is distinct from that observed in mouse brain, where 5-hmC is largely depleted from TSSs and enriched within gene bodies (Szulwach and Jin, unpublished observations), where it also correlates well with gene expression. This could suggest that such a bias reflects a stem cell-specific role for 5-hmC-mediated gene regulation at and around certain TSSs. Such differences may be accounted for by the enrichment of Tet1, or yet-to-be-identified co-factors of Tet1, in ES cells relative to more differentiated cell types.
Analyses of 5-hmC-enriched peaks and their correlation with specific enhancer-associated histone modifications, such as H3K4me1, H3K18ac, and H3K27ac, suggested that, in addition to being present at promoters, 5-hmC could also mark other diverse regulatory elements in the genome, such as enhancers. Interestingly, assessment of 5-hmC distributions at the predicted enhancers in H1 hES cells demonstrated enrichment of this epigenetic mark at specific enhancer subtypes, including those enriched for H3K4me1, H3K27ac, H3K18ac, and H4K5ac. Despite a good correlation between 5-hmC and histone marks demarcating enhancers, we found that only a small fraction of regions bound by p300 were also enriched for 5-hmC.
Finally, we examined the correlation of 5-hmC distributions with the genome-wide binding sites of six transcription factors that have been linked to maintaining the pluripotency of ES cells. We find that 5-hmC can also mark NANOG binding sites, while being depleted at TAF1 sites. These results further suggest diverse roles for 5-hmC in regulating the accessibility of transcription factors in defined chromatin contexts, including those regulating pluripotency in ES cells.
In summary, here we present the genome-wide distribution of 5-hmC and its correlation with 11 diverse histone modifications and six transcription factors in human ES cells. By integrating genomic 5-hmC signals with maps of different histone marks, we link particular pluripotency-associated chromatin contexts with 5-hmC. Our study suggests that 5-hmC could play diverse roles in regulating specific promoters, gene bodies, and enhancers in ES cells, thereby providing a detailed epigenomic map of 5-hmC from which to study its contribution to pluripotency.