To investigate a possible function of hmC, we generated an affinity-purified polyclonal antibody to hmC that binds this mark with high specificity and sensitivity, as shown by enzyme-linked immunosorbent (ELISA) and DNA immunoprecipitation (DIP) assays (Supplementary Fig. 6). Genome-wide DIP-seq assays were performed using anti-hmC, anti-mC and IgG on genomic DNA purified from control or TET1-depleted ES cells, as well as from Dnmt triple knockout (TKO) mouse ES cells, which lack Dnmt1, Dnmt3a and Dnmt3b (ref. 14). We confirmed by ChIP-qPCR that TET1 localizes to its target genes in the Dnmt TKO cells (Supplementary Fig. 7a). The analyses showed that hmC occurs as discrete peaks throughout the genome. Furthermore, the majority of signals obtained with the hmC antibody were absent in Dnmt TKO mouse ES cells, confirming that generation of hmC requires the pre-existence of mC. The hmC modification in mouse ES cells is particularly enriched within gene bodies, as also observed for the mC mark (ref. 15) and recently reported for hmC in mouse cerebellum (ref. 16). Strikingly, in contrast to mC, hmC is also significantly enriched at the TSS, coinciding with TET1, indicating that a significant fraction of mC is converted to hmC at the TSS. In addition, the hmC modification is generally not detectable by DIP-qPCR at repetitive elements such as intracisternal A particle (IAP) elements and minor satellite repeats (Supplementary Fig. 7b), further demonstrating that hmC and mC show distinct genomic distributions.
Figure caption: Hydroxymethylcytosine localizes to TSS and gene body.
Gene annotation of hmC-positive regions around the TSS (−0.7 kb to +0.3 kb) showed that 2,424 regions are hmC-positive in wild-type ES cells compared to Dnmt TKO ES cells. Approximately 28% of these regions showed a more than twofold reduction in hmC signal in the DIP-seq analyses upon downregulation of TET1, and in validation experiments the knockdown of Tet1 led to a significant decrease in hmC levels on tested genes (and data not shown). Depending on the false discovery rate (FDR) cut-off used for TET1, between 35% (FDR < 0.01) and 50% (FDR < 0.1) of hmC-positive genes are bound by TET1. These results agree with reports showing that Tet1 knockdown causes only a partial decrease in global hmC levels in mouse ES cells (ref. 9), and imply that, although TET1 is important for the generation of hmC, other enzymes such as TET2 are also likely to contribute to hmC levels in mouse ES cells.
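The window annotation, the twofold-reduction criterion and the FDR-dependent overlap described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names, the strand-aware reading of the −0.7 kb to +0.3 kb window, the pseudocount, and the rule that genes without a TET1 entry count as unbound are all assumptions made for the example.

```python
from typing import Dict, Iterable, Sequence, Tuple

Interval = Tuple[int, int]  # (start, end), both ends inclusive


def tss_window(tss: int, strand: str,
               upstream: int = 700, downstream: int = 300) -> Interval:
    """Window from -upstream to +downstream of a TSS.

    Assumption: "upstream" follows the direction of transcription,
    so the window is mirrored for minus-strand genes.
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        return tss - downstream, tss + upstream
    raise ValueError("strand must be '+' or '-'")


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def fraction_reduced(control: Sequence[float], knockdown: Sequence[float],
                     fold: float = 2.0, pseudocount: float = 1.0) -> float:
    """Fraction of regions whose signal drops by more than `fold`.

    The pseudocount (an assumption) avoids division by zero for
    regions with no signal after knockdown.
    """
    if len(control) != len(knockdown) or len(control) == 0:
        raise ValueError("need two non-empty signal lists of equal length")
    reduced = sum(
        1 for c, k in zip(control, knockdown)
        if (c + pseudocount) / (k + pseudocount) > fold
    )
    return reduced / len(control)


def fraction_bound(hmc_genes: Iterable[str], tet1_fdr: Dict[str, float],
                   cutoff: float) -> float:
    """Fraction of hmC-positive genes with a TET1 FDR below `cutoff`.

    Genes missing from `tet1_fdr` are treated as unbound (assumption).
    """
    genes = set(hmc_genes)
    if not genes:
        raise ValueError("no hmC-positive genes given")
    bound = sum(1 for g in genes if tet1_fdr.get(g, 1.0) < cutoff)
    return bound / len(genes)
```

With toy inputs, relaxing the cut-off from 0.01 to 0.1 in `fraction_bound` can only keep or increase the bound fraction, which mirrors the direction of the 35% to 50% range reported above.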
As for TET1, Gene Ontology analysis of the hmC-positive genes showed enrichment for genes involved in basic cellular processes, but also in the regulation of development and differentiation (Supplementary Fig. 7c). Moreover, hmC positivity does not correlate with transcriptional activation and, surprisingly, most hmC-positive genes seem not to be expressed in mouse ES cells.
A significant proportion of the TSSs classified as positive for hmC has intermediate or high CpG content (Supplementary Fig. 4). Genome-wide analyses of the hmC distribution relative to CpG content showed that the hmC mark is enriched in regions with relatively high CpG content compared to mC. Whereas only 15% of hmC-positive TSSs also contain a high mC signal, we find that several hmC-positive regions have low levels of mC, implying that the two marks often co-exist. Upon Tet1 knockdown, only a minor global increase in mC was observed, as evaluated by genome-wide anti-mC DIP (Me-DIP) (Supplementary Fig. 8a). However, a few hundred genes show modest TSS-specific increases in mC levels after Tet1 knockdown (Supplementary Fig. 8b). Gene Ontology analyses for these genes showed enrichment for specialized developmental processes (Supplementary Fig. 8c). Interestingly, we found that approximately a third of the genes reported to acquire DNA methylation during ES cell differentiation (ref. 2) are marked by hmC in the ES cell state (Supplementary Table 2). Taken together, these results show that hmC colocalizes with mC in gene bodies, and that hmC, in contrast to mC, is enriched at TSSs with intermediate to high CpG density, where it may contribute to the regulation of DNA methylation patterns.
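To make the notion of CpG content concrete, the sketch below computes two widely used sequence measures, the GC fraction and the observed/expected CpG ratio (number of CpG dinucleotides multiplied by sequence length, divided by the product of the C and G counts). Whether the study used these exact measures, and which thresholds separate low, intermediate and high CpG content, is not stated in this section, so the sketch deliberately stops short of assigning classes. The function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    s = seq.upper()
    if not s:
        return 0.0
    return (s.count("G") + s.count("C")) / len(s)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G).

    Returns 0.0 when the sequence has no C or no G, since the
    expected count is then zero and the ratio is undefined.
    """
    s = seq.upper()
    n_c = s.count("C")
    n_g = s.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    n_cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return n_cpg * len(s) / (n_c * n_g)
```

Applied to the sequence under each TSS window, these two values could then be binned into CpG-content classes once the thresholds are fixed.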