DNA methylation at the 5-position of cytosine (5mC) occurs predominantly at CpG dinucleotides in the mammalian genome and is one of the most important epigenetic marks, playing critical roles in host defense, genomic imprinting, and X chromosome inactivation (Suzuki and Bird, 2008). It is well established that individual CpGs located in different genomic regions are differentially methylated depending on cell or tissue type and developmental stage. Furthermore, GC density and gene transcriptional status also influence DNA methylation status. For example, the majority of CpG islands (CGIs), which have a dense CpG content, are hypomethylated, while the rest of the genome, including CpG-rich repetitive heterochromatin regions and dispersed CpGs in gene coding regions, is usually hypermethylated. Yet it is still poorly understood how genome-wide DNA methylation is differentially regulated at discrete loci and dynamically processed in different cell types and during development.
Increasing evidence suggests that DNA methylation is intimately linked to histone methylation. For instance, it is well known that high levels of DNA methylation at GC-rich repetitive genomic elements are protected first by methyl-binding proteins such as MBDs, which in turn recruit both histone deacetylases and H3K9 methyltransferases. This epigenetic signature can subsequently recruit HP1 protein and thus establish a condensed chromatin structure, which recruits more DNMTs to maintain this methylation pattern. On the other hand, unmethylated CpGs in CGIs recruit factors such as MLL1 and CFP1/SETD1, which bind only to unmethylated CpGs, to establish a unique chromatin environment with high H3K4me3 that deters DNA methyltransferases from binding. Thus, the underlying chromatin structure at CGIs, in terms of modifications and recruited binding partners, likely represents one mechanism that modulates DNMT-mediated DNA methylation.
A longstanding and fascinating question in the epigenetics field is whether there are enzymes capable of directly removing the methyl group. While such an enzyme has been elusive, human TET1 was recently identified as a 5mC hydroxylase that catalyzes the conversion of 5mC to 5-hydroxymethylcytosine (5hmC) (Tahiliani et al., 2009). The mammalian TET family contains three members, Tet1, Tet2 and Tet3, which share significant sequence homology at their C-terminal catalytic domains (Ito et al., 2010; Tahiliani et al., 2009). Similar enzymatic activities for mouse Tet family members have also been described (Ito et al., 2010; Ko et al., 2010). The discovery of this family of enzymes has provided a new potential mechanism for altering DNA methylation status. However, little is known about the extent to which individual family members regulate genome-wide 5mC/5hmC patterns and contribute to genome functions.
Tet1 is highly expressed in embryonic stem cells (ESCs) and its depletion leads to a reduction in global 5hmC levels (Koh et al., 2011). In addition to the 5mC hydroxylase domain, TET1 also contains a conserved CXXC domain (Tahiliani et al., 2009; Zhang et al., 2010), a domain that other proteins use to bind unmethylated CpG DNA and that enables them to modify histone or DNA methylation. The family of CXXC domain-containing proteins includes factors involved in DNA methylation (DNMT1, MBD1) and histone methylation/demethylation (MLL, CFP1, KDM2A), all of which play important roles in gene regulation and contribute to embryonic development. Significantly, our recent study shows that human TET1 is a CpG DNA binding protein that promotes DNA demethylation when it is over-expressed in 293T cells and positively regulates transcription of a reporter gene in a 5mC hydroxylase activity-dependent manner (Zhang et al., 2010). These findings suggest that Tet1 regulates DNA methylation and gene expression through its ability to convert 5mC to 5hmC.
5hmC was first identified in T-even bacteriophage (Wyatt and Cohen, 1953), and later found in the vertebrate brain (Kriaucionis and Heintz, 2009; Penn et al., 1972) and several other tissues (Globisch et al., 2010). Interestingly, while 5hmC exists at high levels in mESCs, its level significantly decreases after mESC differentiation (Szwagierczak et al., 2010; Tahiliani et al., 2009), and rises again in terminally differentiated cells such as Purkinje neurons (Kriaucionis and Heintz, 2009). Despite these recent advances, the molecular basis for Tet1 and 5hmC functions in the ESC genome and epigenome is unknown, although a controversial role for Tet1 in maintaining ESC pluripotency and determining ESC differentiation has been proposed (Ito et al., 2010; Ko et al., 2010; Koh et al., 2011).
Here, we show that Tet1 is capable of binding to unmethylated as well as methylated and hydroxymethylated CpG DNA via its CXXC domain. Further, we report a complete genome-wide mapping of Tet1 binding and 5hmC in mESCs. Complemented with Tet1 depletion studies, this allows us to establish specific correlations among Tet1 occupancy, 5mC and 5hmC levels, histone modification and gene expression in mESCs and reveal an intricate role of Tet1 in its associated gene network. Thus, this study provides a foundation for understanding not only possible functions of Tet proteins and 5hmC but also molecular mechanisms by which Tet proteins, via dynamic regulation of DNA methylation, influence gene transcription and related biological functions in mESCs.