DNA methylation is an epigenetic modification that is implicated in transcriptional silencing. It is becoming increasingly clear that both correct levels and proper interpretation of DNA methylation are important for normal development and function of many organisms, including humans. In this review we focus on recent advances in understanding of how proteins that bind to methylated DNA recognize their binding sites and translate the DNA methylation signal into functional states of chromatin. Although the function of methyl-CpG binding proteins in transcriptional repression has been attributed to their cooperation with corepressor complexes, additional roles for these proteins in chromatin compaction and spatial organization of nuclear domains have also been proposed. Finally, we provide a brief overview of how methyl-CpG proteins contribute to human disease processes such as Rett Syndrome and cancer.
Post-synthetic modification of DNA by methylation is found in most living organisms from bacteria to mammals. In prokaryotes, methylation occurs on both adenine and cytosine nucleotides and is involved in regulation of DNA replication timing, DNA repair and defence against invasion by foreign DNA. The genomes of eukaryotes are modified exclusively at cytosine and in vertebrates only in the context of CpG dinucleotides. Fungi, such as Neurospora crassa, and plants also contain methylated cytosine in non-CpG context [3,4]. Not all eukaryotes have methylated DNA. Species such as yeast and many invertebrates, including the nematode C. elegans and the fly D. melanogaster, contain either no or barely detectable amounts of methylated cytosine in their genomes [2,5].
DNA methylation is introduced into DNA by enzymes of the DNA cytosine methyltransferase family. In vertebrates, these are represented by DNMT1, DNMT3A and DNMT3B. DNMT1, the maintenance DNA methyltransferase, works most efficiently on hemi-methylated DNA and functions to restore symmetrically methylated cytosine on daughter DNA strands generated during replication [7,8]. DNMT3 enzymes, also referred to as de novo methyltransferases, can work on fully unmodified DNA and are essential for establishment of DNA methylation patterns during embryogenesis.
Most genomes with high levels of DNA methylation are depleted of CpGs due to the frequent deamination of methyl-cytosine into thymine. This generates mCpG:TpG mismatches which, if unrepaired, are further stabilized by DNA replication [2,10]. The remaining CpGs are unevenly distributed throughout the genome. Most gene promoters (~70% in mammals) are embedded in unmethylated stretches of DNA with high CpG density, also known as CpG islands [2,11,12]. How the CpG islands are maintained in an unmethylated state and what protects them from the action of DNA methyltransferases is currently unclear. Nevertheless, these sequences are not intrinsically unmethylatable since some of them acquire DNA methylation in differentiated cells and can be found aberrantly methylated in cancers [13,14]. Together, the unmethylated CpG islands and the heavily methylated coding and intergenic regions generate patterns of DNA methylation that are heritably maintained in somatic cell lineages.
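The notions of CpG depletion and CpG density used above can be made concrete with a small calculation. The following Python sketch is an illustration only and is not taken from the studies cited here: it computes the GC fraction and the ratio of observed to expected CpG dinucleotides for a sequence, and applies the commonly used Gardiner-Garden and Frommer thresholds (length of at least 200 bp, GC fraction of at least 0.5, observed/expected CpG of at least 0.6) as default values. The function names and the choice of thresholds are assumptions made for this example.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC fraction and CpG observed/expected ratio of a DNA sequence."""
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so a plain count is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected number of CpGs if C and G occurred independently: (C * G) / N
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return {"length": n, "gc_fraction": gc_fraction, "cpg_obs_exp": obs_exp}


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply illustrative thresholds (assumed defaults, not from the source text)."""
    st = cpg_stats(seq)
    return (st["length"] >= min_len
            and st["gc_fraction"] >= min_gc
            and st["cpg_obs_exp"] >= min_obs_exp)
```

In a CpG-depleted bulk genome the observed/expected ratio falls well below 1, whereas a CpG island retains a ratio close to that expected from its base composition. A real analysis would slide such a test along the genome in windows rather than score a single sequence.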
Biological functions of DNA methylation have been intensively studied over several decades. It is now well established that DNA methylation generally associates with silent chromatin which is inhibitory to transcriptional initiation [11,15]. Important processes such as monoallelic expression of imprinted genes in plants and placental animals, X-chromosome inactivation in mammals and suppression of transposable elements in complex genomes include DNA methylation as part of more complex regulatory functions [16,17]. Given that most gene promoters are methylation-free, the question of whether DNA methylation is essential for regulation of gene expression on a global scale has been a subject of debate [2,16,18]. Nevertheless, mice with disrupted alleles of Dnmt1 or double null for Dnmt3a and Dnmt3b die early in embryogenesis and show inappropriate expression of a large number of genes [9,19,20]. Xenopus laevis embryos depleted of DNMT1 initiate zygotic transcription 2-3 cell cycles earlier than the midblastula transition, when it normally begins, and display aberrant expression of developmentally decisive genes [21,22]. Therefore, at least in vertebrates, DNA methylation plays a conserved role in maintaining stable patterns of gene expression. Additional evidence from plants and the primitive chordate C. intestinalis shows that methylated cytosines are enriched within the coding regions of genes and may regulate transcriptional elongation [23,24]. It is worth mentioning that a role for DNA methylation in suppressing “transcriptional noise” in large genomes, i.e. initiation of spurious transcription from cryptic promoters within coding regions of genes or non-coding DNA, has been proposed and still requires more detailed investigation [25,26].
There are two models of how DNA methylation exerts its repressive effect on transcription. In the first model, CpG methylation alters binding sites of transcription factors and directly interferes with gene activation. Examples validating this model include E2F, CREB and c-myc [27-29]. In the second model, methylated cytosines serve as docking sites for proteins that specifically recognize and bind to methylated CpGs and repress transcription indirectly via recruitment of corepressors that modify chromatin [11,15]. In this review we focus on the methyl-CpG binding proteins and how they translate DNA methylation patterns into functional states of chromatin.
Currently two major families of methyl-CpG binding proteins are known in vertebrates: MBDs and Kaiso-like proteins. In addition, recent studies indicate that SRA domain proteins, characterized in some detail in plants, have the ability to bind methylated DNA in non-CpG context.
Ironically, the first methyl-CpG binding protein, MeCP2, was discovered by accident by Adrian Bird and co-workers, who at the time were attempting to identify factors that bind to unmethylated DNA and would function to protect CpG islands from DNA methylation. Instead, protein factors that bind specifically to methylated DNA, initially named MeCP1 and MeCP2, were detected [30,31]. MeCP2 was purified first and is a 53 kDa protein containing an N-terminal methyl-CpG binding domain (MBD) and a C-terminal transcriptional repression domain (TRD) [32,33]. Mammalian EST database homology searches for sequences encoding a conserved MBD domain led to the identification of four additional proteins currently known as MBD1, MBD2, MBD3 and MBD4 (Figure 1). Of those, MBD2 and MBD3 are closely related to each other outside the MBD domain (77% identity) and are likely to represent the ancestral MBD family founders since a homologous MBD2/3-like protein is present in invertebrates, including Drosophila where low levels of DNA methylation are detectable only in early development [5,35]. All MBD proteins, except MBD3, specifically recognize and bind to methylated DNA in vitro and in vivo. Mammalian MBD3, unlike its amphibian homologue, harbours a critical mutation in the MBD domain and does not bind to methylated DNA [15,34]. The MBD family proteins, including MeCP2, are highly conserved in all vertebrates. Interestingly, at least 12 MBD proteins have been identified in the plant Arabidopsis thaliana. Of these, AtMBD5, AtMBD6 and AtMBD7 have been shown to bind methylated DNA in vitro.
Kaiso, the founder protein of the Kaiso-like family, was independently discovered in two different laboratories as a DNA binding factor involved in non-canonical Wnt signalling and as a protein that binds to methylated DNA [38,39]. Unlike the MBD family members, Kaiso and the two recently identified Kaiso-like proteins ZBTB4 and ZBTB38 contain a conserved POZ domain involved in protein-protein interactions and three C2H2 zinc finger motifs, two of which are essential for binding to methylated DNA (Figure 1). Unmethylated sequences recognized by Kaiso with high affinity have also been reported, raising the possibility that Kaiso and Kaiso-like proteins may bind to unmethylated DNA in some circumstances and to methylated DNA in others.
Recent reports suggest that another protein fold, the SRA (or YNG) domain, could interpret DNA methylation [42-44]. SRA domain-containing proteins fall into two distinct families. The first is characterised by association of the SRA domain with PHD and RING finger domains. At least 5 members of this family exist in A. thaliana, including the product of the recently cloned vim1 gene. The mammalian homologues described so far include the Np95 protein, the closely related Np97 (or NIRF) and ICBP90 [42,44]. Recent studies suggest that Np95 plays a critical role in epigenetic inheritance of DNA methylation [45,46]. The second family of SRA domain proteins is plant-specific, without obvious mammalian counterparts, and includes members of the SUVH family of SET domain histone methyltransferases in A. thaliana. Interestingly, the SRA domain seems more versatile than the MBD domain in recognizing methylated DNA, as in vitro it binds to methyl-cytosine at CpG, CpNpG and even at the asymmetric CpNpN sites, with a marked preference for mCpNpG [42,43].
In the rest of this review we will focus in more detail on the MBD and in part the Kaiso-like proteins in mammalian cells.
The molecular functions of methyl-CpG binding proteins rely on their ability to recognize and bind methylated DNA. As this property is central to understanding their roles in vivo, we will review the current progress in this area in more detail.
Deletion analyses have identified the minimal region of MeCP2 responsible for the interaction with methylated CpGs. Further comparison with other MBD proteins defined the MBD domain as a protein motif of about 75 amino acids [34,47]. Since the “classical” MBD was described, proteins containing an MBD-like domain, including ESET/SETDB1 and TIP5, have also been identified in different species. However, as in the case of the TIP5 protein, the MBD-like domains are predicted not to form specific interactions with methylated DNA and therefore may serve other functions. The name TAM domain (TIP5, ARBP, MBD) is now used to cover both canonical MBD and MBD-like domains. Proteins with MBD-like domains are not the subject of this review.
Sequence comparison of all human MBD family proteins shows the presence of 16 strictly conserved amino acids within the MBD domain. MBD3, which does not bind to methylated DNA, lacks four of these conserved residues. Pairwise comparison reveals the presence of two subclasses, with the MBD domains of MBD4 and MeCP2 being more closely related to each other, while those of MBD1, MBD2 and even MBD3 form a separate subgroup. Solution structures of the MBD domains of human MeCP2 and MBD1 have been determined by NMR, revealing a similar α/β sandwich fold composed of four β-strands and an α-helix [48,49] (Figure 2A and B). Detailed information on how the MBD fold binds symmetrically methylated CpG was derived from the NMR structure of the MBD domain of MBD1 in complex with methylated DNA (Figure 2C and D).
MBD proteins interact with methylated DNA in the major groove, where the two methyl groups from the mCpG point towards the exterior of the double helix (Figure 2C). Several residues from the L1 loop, which connects the β2 and β3 strands, and from the α helix make contacts with the sugar/phosphate backbone on each strand of the DNA molecule (Figure 2D). Four conserved residues (R22, Y34, R44, S45) in MBD1 are involved in recognizing the methyl-CpGs via a complex set of interactions. It appears that each side chain interacts with DNA in a somewhat bivalent way, where the polar moiety of each of these residues contacts a C or G base, while their hydrophobic regions stack around the methyl groups. Such bivalent contacts from each important amino acid side chain may explain why both the CpG dinucleotide and the two methyl groups are strictly required for efficient recognition by the MBD. Subtle variations in this network might abolish binding. MBD3, for example, has only three of the four conserved residues, with a Tyrosine (Y) to Phenylalanine (F) substitution at the position equivalent to Y34. The loss of a single hydroxyl group renders MBD3 incapable of binding to methylated DNA [34,51]. This particular arrangement of critical amino acids is likely to explain the high selectivity observed in vitro towards methylated DNA versus either hemimethylated or unmethylated CpGs. The structural data also confirm that one MBD domain can only accommodate one symmetrically methylated CpG, as the MBD domain binds DNA as a monomer [32,48]. However, this does not exclude the presence of potential homo- or heterodimerisation interfaces on MBD proteins, even if MeCP2 appears to be mostly monomeric in solution. The only case that complicates this picture is MBD4, whose MBD domain seems capable of interacting preferentially with mCpG:TpG mismatches arising from deamination of methyl-cytosine and even with hemimethylated DNA [53,54].
The structural information available so far does not explain why MBD4, whose MBD domain is related to MeCP2 more than any other, would display an altered DNA binding specificity. However, it seems that cytosine methylation and even the MBD itself can be dispensable for MBD4 G:T mismatch-specific thymine glycosylase activity.
About 70 to 80% of CpGs are methylated in mammalian genomes, creating a relatively high number of potential binding sites for MBD proteins. What, then, determines their pattern of occupancy at these sites? One possible model would be that each MBD protein randomly occupies any available methylated CpG (Figure 3A). In this scenario, the relative abundance of each MBD protein within a cell, together with the methylation density, will dictate the occupancy of individual methylated sites. This random behaviour would imply high redundancy and is the principal argument to explain the relatively mild phenotypes of MBD1, MBD2 and Kaiso null mice [56-58]. In another model, one can envisage that other factors may influence the distribution of MBD proteins within a cell nucleus, making it non-uniform and non-random, with each MBD protein occupying unique sites in the genome (Figure 3B). This model would predict that a subset of genes would be affected by the loss of one MBD protein but not by the loss of another. Examples of genes misexpressed in the absence of specific MBD proteins are becoming more abundant and the phenotypes of MBD-deficient mice, although subtle, are markedly different [56,58-61].
In support of the second model, a recent study demonstrates that in primary human fibroblasts, MBD1, MBD2 and MeCP2 do not share binding sites in vivo, at least among the genomic sequences examined. Morpholino-mediated depletion of MeCP2 and MBD2 suggested the existence of a mechanism dictating preference of MeCP2 but not MBD2 for a subset of methylated sites in vivo. Whether this selective binding is retained in cancer cells, which tend to accumulate aberrant DNA methylation patterns, is unclear. Specific targeting of MBD proteins, observed in primary cells, may be achieved via interactions with binding partners (see below) including other DNA binding activities which may facilitate targeting of MBDs to chromatin or DNA at specific loci. Experimental evidence for recruitment via partner proteins is currently missing. Another possibility, which has been validated to some extent, is that the various members of the MBD family display different DNA binding specificity, meaning that they recognize and bind to more complex sequences than a single methylated CpG.
A recent in vitro experiment showed that, unlike MBD2, MeCP2 requires a run of four or more A/T base pairs adjacent to the methylated CpG for high affinity binding. Furthermore, [A/T]≥4 runs are present at MeCP2 target sequences identified in vivo. These findings constitute the first example where enhanced binding specificity towards a particular set of methylated sequences allows discriminative binding site occupancy by an MBD protein. Whether this is the case for other MBD proteins remains to be determined. However, as MeCP2 and MBD1 contain additional DNA binding domains, it is possible that a single methylated CpG is not sufficient to support high affinity binding of an MBD protein to DNA. Early studies on MeCP2 detected a potential second DNA binding activity independent of the MBD domain, and sequence analyses identified the presence of two AT-hooks [31,62,64] (Figure 1). The AT-hook motif is capable of interacting with the minor groove of AT-rich DNA and has been characterized in high mobility group proteins such as HMGA1. However, AT-hooks are frequently present in conjunction with other functional DNA or chromatin binding domains, and in the case of MeCP2 their functionality remains to be determined. Surprisingly, the AT-hooks are not required for selective binding of MeCP2 to a CpG followed by an [A/T]≥4 run. However, these motifs may interact with other stretches of A/T-rich DNA in cis or trans. Additionally, a role of the C-terminus of MeCP2 in assisting binding to DNA, matrix attachment regions and nucleosomes has also been reported [67-70]. Whether multiple DNA and chromatin binding interfaces play a role in MeCP2 function requires further study.
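The sequence requirement described above, a CpG flanked by a run of four or more A/T base pairs, lends itself to a simple scan for candidate sites. The Python sketch below is a hedged illustration rather than the procedure used in the cited work. The function name `candidate_sites` and the parameter `max_gap` (the number of arbitrary bases tolerated between the CpG and the A/T run, set to 0 by default for strict adjacency) are assumptions introduced for this example. The scan sees only the primary sequence, so the methylation status of each CpG would have to come from separate data.

```python
import re


def has_at_run(window: str, min_run: int) -> bool:
    """True if the window contains a run of at least min_run A/T bases."""
    return re.search(rf"[AT]{{{min_run},}}", window) is not None


def candidate_sites(seq: str, min_run: int = 4, max_gap: int = 0) -> list:
    """Return 0-based positions of CpGs flanked on either side by an A/T run.

    max_gap is a hypothetical parameter: the run must begin (downstream) or
    end (upstream) within max_gap bases of the CpG.
    """
    s = seq.upper()
    hits = []
    for m in re.finditer("CG", s):
        i = m.start()
        # Upstream window: bases immediately 5' of the CpG
        left = s[max(0, i - max_gap - min_run): i]
        # Downstream window: bases immediately 3' of the CpG
        right = s[i + 2: i + 2 + max_gap + min_run]
        if has_at_run(left, min_run) or has_at_run(right, min_run):
            hits.append(i)
    return hits
```

For example, `candidate_sites("GGCGAATTGG")` reports the CpG at position 2, whereas the same CpG in `"GGCGGAATTGG"` is reported only when `max_gap` is raised to 1. Because a CpG is its own reverse complement and an A/T run remains an A/T run on the opposite strand, checking both flanks on one strand covers both orientations.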
On the other hand, the MBD1 protein carries a second functional DNA binding motif separate from the MBD. Depending on the isoform, MBD1 can have two or three zinc finger motifs defined by 8 conserved cysteines, the CxxC zinc finger [71,72]. However, the copies are not strictly equivalent to one another, as they display primary sequence differences that would alter their biochemical properties. The most C-terminal zinc finger (usually referred to as CxxC3), which we will consider as a canonical CxxC motif, is also present in various other proteins, including DNMT1, CpG binding protein CGBP, H3K4 histone methylase MLL and H3K36 histone demethylases of the Jumonji family JHDM1A and JHDM1B [71,73,74]. This canonical version of the CxxC zinc finger has been shown to bind non-methylated CpGs in vitro in the case of MBD1, MLL, CGBP and JHDM1B [71,74-76]. The two other CxxC motifs of MBD1 lack a conserved glutamine residue and a KFGG motif, characteristic of all DNA binding CxxC zinc fingers, and as a consequence are unable to bind DNA. The role of these divergent CxxC zinc fingers is unclear but they might be involved in protein-protein interactions.
In reporter gene assays, MBD1 represses transcription from CpG-rich unmethylated promoters in a CxxC3 domain-dependent manner [71,72]. This suggests that this domain could be as efficient as the MBD for targeting MBD1 to DNA in vivo and therefore that MBD1 may play a role in silencing certain unmethylated CpG island promoters. However, the CxxC3 domain by itself does not provide enough sequence specificity to discriminate between different CpG islands, which are defined by their high CpG content. The mechanism or mechanisms that would account for the targeting of MBD1, and other CxxC-containing proteins, to specific DNA loci remain to be uncovered. An attractive hypothesis would be that MBD1 requires each of its two DNA binding domains for efficient binding at specific loci in vivo. As each domain interacts with a very short sequence (2 nucleotides) compared to classical DNA binding transcription factors, one can speculate that the use of two separate DNA binding domains might enhance the specificity for particular sequences in vivo. Another possibility is that, similar to MeCP2, which requires a methylated CpG followed by an [A/T]≥4 run, each DNA binding domain of MBD1 recognizes a more complex sequence than currently known. Whether MBD1 binds DNA through independent use of its two DNA binding domains, or whether they collaborate with each other to target MBD1 efficiently to specific loci, will be an intriguing question to answer.
It might appear surprising that a methyl-CpG binding protein carries a domain that allows it to bind unmethylated CpGs. However, this ability to interact with methylated and unmethylated DNA is not a unique feature of MBD1. The identification of Kaiso showed that the MBD domain is not the only protein fold able to recognize DNA methylation, as Kaiso and the Kaiso-like proteins ZBTB4 and ZBTB38 use a set of C2H2 zinc fingers to bind methylated DNA [39,40]. Early studies suggested that Kaiso requires at least two mCpGs for efficient binding, while ZBTB4 and ZBTB38 seem to interact with a single mCpG [39,40]. In vitro studies also show that Kaiso interacts specifically with an unmethylated consensus sequence, the Kaiso Binding Site (KBS: TCCTGCNA), which is present at promoters of Wnt target genes [77,78]. Interestingly, zinc fingers 2 and 3 of Kaiso are necessary and sufficient for binding to either type of sequence in vitro. The ability to bind unmethylated DNA is shared by ZBTB4, but surprisingly not by ZBTB38. High resolution structural information may help to explain how the zinc fingers of Kaiso and Kaiso-like proteins interact with methylated and unmethylated DNA. Such structural studies may facilitate the design of specific point mutations which would allow uncoupling of mCpG and KBS binding activities and clear-cut discrimination between the functions of Kaiso and ZBTB4 that rely on their interaction with either methylated or unmethylated DNA.
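As an illustration of how the KBS consensus quoted above (TCCTGCNA, where N stands for any base) could be located in a sequence, the following Python sketch scans both strands for exact matches. It is a minimal example under stated assumptions and not a method from the cited studies: the function names are hypothetical, and exact consensus matching is a simplification, since matching the consensus does not by itself establish binding in vivo.

```python
import re

KBS = "TCCTGCNA"  # consensus as given in the text; N = any base
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string (N is kept as N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def motif_to_regex(motif: str) -> str:
    """Translate a consensus containing N into a regular expression."""
    return motif.replace("N", "[ACGT]")


def find_kbs(seq: str) -> list:
    """Return sorted (position, strand, matched_sequence) tuples.

    Positions are 0-based on the given strand. A lookahead is used so that
    overlapping matches are also reported.
    """
    s = seq.upper()
    hits = []
    for strand, motif in (("+", KBS), ("-", reverse_complement(KBS))):
        pattern = re.compile(f"(?=({motif_to_regex(motif)}))")
        for m in pattern.finditer(s):
            hits.append((m.start(), strand, m.group(1)))
    return sorted(hits)
```

Calling `find_kbs("GGTCCTGCCAGG")` returns a single plus-strand hit at position 2. A minus-strand hit is reported at the position where the reverse-complemented consensus (TNGCAGGA) starts on the given strand.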
In summary, several lines of evidence suggest that methyl-CpG binding proteins recognize more complex sequences than a single methylated CpG, thus favouring a gene or locus specific role for each member of the MBD and Kaiso-like families. As MBD proteins are widely expressed in different tissues and constitute relatively abundant chromosomal proteins, it has been suggested that they may also exert functions unrelated to recognition of methylated DNA. Although binding of MBD proteins to other nucleic acids such as RNA and cruciform DNA structures in vitro has been reported [79,80], evidence in vivo for the most part firmly supports the function of MBD proteins in reading DNA methylation patterns at specific loci. One likely explanation for the existence of two (and maybe more) families of divergent methyl-CpG binding proteins, with members that display different sequence specificity towards methylated DNA, could be the evolutionary adaptation of pre-existing nucleic acid binding motifs for binding to methylated DNA.
Early studies have demonstrated that DNA methylation has no major impact on gene expression until the DNA template is assembled into chromatin [81,82]. Once the MBD proteins were discovered and shown to function as methylation-dependent transcriptional repressors, work in several laboratories focused on identification of co-repressor complexes associated with these proteins. Currently, it is established that MBD proteins cooperate with histone deacetylase and histone methylase activities that modify chromatin and prevent productive initiation of transcription (Figure 4).
Protein complexes containing MeCP2 and MBD2 have been purified by biochemical fractionation and found to contain the class I histone deacetylases HDAC1 and HDAC2 [33,83,84]. MeCP2 was co-purified with the Sin3A/HDAC2 complex from Xenopus oocyte extract and a number of studies have shown that the interactions of MeCP2 with Sin3A and HDAC2 are conserved in mammalian cells and are essential for MeCP2-mediated repression [33,83]. A number of other proteins directly or indirectly interacting with MeCP2 have been found, including DNMT1, CoREST, NCoR/SMRT, c-SKI, histone H3 lysine 9 methylase activity, RNA splicing factors and chromatin remodelling activities such as ATRX and the Brahma (Brm1)-related SWI/SNF complex [85-92]. Nevertheless, the composition of a well defined MeCP2 co-repressor complex that would be present in all cell types or tissues remains elusive. Purification of MeCP2 from mammalian sources, including brain where this protein is most abundant, has produced conflicting results, ranging from complete lack of stable association of MeCP2 with either Sin3A or any other proteins in nuclear extracts [52,93], to the identification of high molecular weight MeCP2-containing complexes [89,91,94]. Some of these discrepancies could be explained by the use of different biochemical methods that may vary in sensitivity of detection of MeCP2-associated proteins. It is also possible that most of the interactions of MeCP2 with partner proteins are either relatively unstable or cell type- and/or locus-specific. In addition, these findings raise the possibility that DNA-bound MeCP2 may interact with partner proteins differently compared to the unbound form of MeCP2, which behaves as an unusually elongated monomeric molecule in solution. Affinity tagging of the endogenous MeCP2 protein and tandem purification of the MeCP2-containing complexes from a variety of tissues may help to resolve these discrepancies.
MBD2 and MBD3 co-purify with a large protein complex known as NuRD (Nucleosome Remodelling and Histone Deacetylation) which contains the chromatin remodelling ATPase Mi-2, the HDAC1 and HDAC2 histone deacetylases as well as other proteins [84,95]. The NuRD complex exists in several forms and may or may not contain MBD proteins. Initially, it was suggested that MBD2 and MBD3 together associate with NuRD [84,96,97], but recent affinity tag purifications of MBD2 and MBD3 complexes from mammalian cells showed that NuRD associates either with MBD2 or MBD3 but never with both proteins. These different MBD complexes probably have little or no functional overlap since MBD3 null mice die early during embryogenesis while MBD2-deficient animals are viable and fertile. As MBD3 does not bind to methylated DNA, these findings indicate that only a proportion of NuRD complexes would be recruited to methylated DNA and participate in methylation-dependent transcriptional repression. It appears that association with ATP-dependent chromatin remodelling activities is a common feature of MeCP2- and MBD2-associated protein complexes. Although MeCP2 and MBD2 are likely to be responsible for the initial recruitment of these complexes to chromatin assembled on methylated DNA, studies in vitro and in vivo suggest that chromatin remodelling activities further facilitate binding of MBD proteins to methylated sites that are not readily accessible on nucleosomal templates and by doing so stimulate MBD-mediated gene repression [89,99].
In most assays the MBD1 protein behaves as a histone deacetylation-independent transcriptional repressor. At least two histone H3 lysine 9 (H3K9) methylase activities, SETDB1 and SUV39H, have been found associated with MBD1, as well as the heterochromatin protein HP1 [61,101]. In addition, the C-terminus of MBD1 binds the SETDB1 co-factor AM/MCAF, which stimulates SETDB1 activity to allow more efficient di- and trimethylation of H3K9 [102,103]. Furthermore, it was shown that during S-phase of the cell cycle the MBD1/SETDB1 complex can be displaced from methylated DNA by progressing replication forks to allow the formation of a transient complex with the p150 subunit of chromatin assembly factor CAF-1. As a result of this interaction, the MBD1-bound SETDB1 methylates H3K9 of the H3/H4 dimers associated with CAF-1. Thus the S-phase specific MBD1 complex facilitates post-replicative maintenance of the repressive H3K9 chromatin modification on methylated daughter DNA strands. This mechanism provides a plausible explanation of how silenced chromatin can be heritably transmitted through DNA replication and cell division in synchrony with DNA methylation. The function of MBD1 in transcriptional repression and maintenance of H3K9 methylation is negatively regulated by conjugation of SUMO1. Two E3 SUMO-ligases, PIAS1 and PIAS3, sumoylate MBD1 in human cells and compete against SETDB1 for interaction with MBD1. SETDB1 can bind MBD1-SUMO1 in vitro but not in vivo, suggesting that there could be one or more specific binding partners for sumoylated MBD1 which disrupt the formation of the MBD1/SETDB1 complex. Identification of factors that bind MBD1-SUMO1 but not MBD1 will be essential for the mechanistic understanding of how the function of MBD1 might be regulated in response to physiological stimuli. Intriguingly, conjugation of SUMO2/3 to MBD1 has also been reported and, unlike SUMO1, seems to stimulate transcriptional repression by MBD1.
Therefore it is possible that these two modifications recruit different binding partners to MBD1. Given that a number of proteins were found to associate with MBD1, biochemical purification of MBD1 complex(es) may help to determine whether MBD1 stably associates with a set of co-repressor proteins or, like MeCP2, could cooperate with many different nuclear factors.
Similar to MBDs, Kaiso-like proteins function as HDAC-dependent transcriptional repressors. In HeLa cell nuclear extracts, Kaiso co-purifies with the NCoR (Nuclear receptor Co-Repressor) complex containing histone deacetylase HDAC3, and this association is required for silencing of the methylated MTA2 promoter. Depletion of Kaiso in Xenopus embryos results in derepression of methylated genes before the midblastula transition. However, Xenopus Kaiso is also involved in a non-canonical Wnt pathway where its function is controlled through an association with p120-catenin and is essential for regulation of target genes in a methylation-independent manner [77,78]. Complexes containing Kaiso-like proteins have not yet been purified, but the ZBTB38 protein was shown to interact with several histone deacetylase activities and the co-repressor CtBP.
Taken together, methyl-CpG binding proteins represent an important class of chromosomal proteins which associate with multiple protein partners to modify surrounding chromatin and silence transcription, providing a functional link between DNA methylation and chromatin remodelling and modification.
Independently from the establishment of transcriptionally inactive chromatin via cooperation with corepressor proteins, a more direct role of MBD proteins in organization of higher order chromatin structure has also been proposed. Earlier studies have shown that MeCP2 forms discrete complexes with nucleosomes assembled on methylated DNA, can displace histone H1 from pre-assembled chromatin and in addition is able to interact with the nucleosome core via its C-terminus [67,109]. More recent studies have reported that purified recombinant MeCP2 when added to nucleosomal arrays in vitro causes chromatin compaction, which has been attributed to additional interactions between MeCP2 and DNA or chromatin in cis or trans via a domain(s) different from the MBD [68,69].
In vivo, in mouse cells, MeCP2 as well as other MBD and Kaiso-like proteins localize to condensed pericentric heterochromatin regions known as chromocenters (Figure 5A-D). These chromatin domains are also enriched in histone H3 trimethylated at lysine 9 and heterochromatin proteins HP1α and HP1β [110,111]. During myogenic differentiation of mouse C2C12 cells the pericentric heterochromatin domains undergo reorganization and cluster into a smaller number of larger chromocenters. These events are accompanied by an increase in methylation of major satellite DNA and accumulation of MeCP2 and MBD2 proteins in the nuclei of terminally differentiated muscle cells (myotubes). Interestingly, overexpression of MeCP2 and MBD2 in C2C12 myoblasts in the absence of differentiation also induces aggregation of chromocenters, indicating that these proteins may be directly involved in reorganization of heterochromatin architecture. As overexpression of the MBD domain of MeCP2 is sufficient to cause fusion of chromocenters, these events are unlikely to involve co-repressor HDAC complexes or other proteins interacting with the C-terminal region of MeCP2. The mechanism by which MBD proteins induce aggregation of chromocenters is unclear, although it has been suggested that it may be caused either by oligomerization of MBD proteins and formation of DNA-MBD-MBD-DNA structures or by multiple interactions between MBD protein and DNA as mentioned above. Given that neither MeCP2 nor MBD2 nor their MBD domains form dimers or oligomers in vitro, the second interpretation seems more plausible. However, the chromocenters in mouse cells do not undergo decompaction or visible reorganization in the absence of MeCP2 or MBD2, and the differentiation of muscle tissues in MeCP2 and MBD2 null animals seems to proceed normally [56,113]. Therefore the functional significance of MBD proteins localizing to pericentric heterochromatin domains is yet to be established.
Intriguingly, a recent study reports aggregation of chromocenters in mouse ES cells null for Dnmt3a/3b, where major satellite DNA is mostly unmethylated. It is possible that mouse cells respond to any drastic change in the components of constitutive heterochromatin by dynamic rearrangement of chromocenters.
In human cells, where the pericentric heterochromatin does not cluster into specific domains that can be visualized by DNA-staining dyes, each MBD protein displays a distinct localization (Figure 5E-H). In most human cell types MeCP2 is diffusely distributed throughout the nucleus, while MBD2 and MBD1 form several bright foci on a weaker background of diffuse distribution throughout the nucleus [101,104]. The nature and functional significance of MBD1 and MBD2 foci are currently unclear. In the case of MBD1, these structures seem to be enriched in heterochromatin-specific histone modifications and proteins such as the H3K9 methylases SUV39H and SETDB1, and HP1. It would be of interest to determine whether genes silenced by DNA methylation are recruited to MBD foci. Additional studies using chromosome conformation capture (3C) analyses may provide essential information regarding these structures.
It is becoming clear that mutations in proteins involved in the establishment of DNA methylation patterns, as well as in DNA methylation effectors such as MBD proteins, lead to complex human disease phenotypes. For example, mutations in the de novo DNA methyltransferase DNMT3B result in immunodeficiency/centromeric instability/facial anomalies (ICF) syndrome, and mutations in the MeCP2 gene, located on the X chromosome, cause Rett syndrome, one of the most common forms of mental retardation in females [115,116]. In addition, genome-wide loss of DNA methylation and aberrant methylation of CpG island promoters of genes that restrict cellular growth are considered important epigenetic hallmarks of cancer. Despite the functional importance of MBD proteins in recognizing DNA methylation, the means by which MBD proteins mediate the physiological functions of DNA methylation in normal tissues remain for the most part unclear. However, genetic analyses of mice null for specific methyl-CpG binding proteins have allowed their role in disease processes to be investigated in detail.
Rett syndrome (RTT) is a late-onset (6 months after birth) severe autism spectrum disorder that affects 1 in 10,000 girls and is caused almost exclusively by mutations in the MeCP2 gene. Owing to random X chromosome inactivation, RTT patients are usually mosaic for expression of the wild type and the mutant copy of the gene, and show abnormal neuronal morphology but not neuronal death. Conditional deletions and neuron-specific expression of MeCP2 in mice have shown that the Rett phenotype is caused by MeCP2 deficiency in postmitotic mature neurons [118,119]. Mice null for MeCP2 protein have been generated and shown to recapitulate the most essential features of human Rett syndrome [113,119]. MeCP2-deficient male mice, unlike their human counterparts, survive postnatally, develop the symptoms of the disease at ~6 weeks and die after ~11 weeks. MeCP2-deficient females have a milder phenotype and therefore survive longer and are fertile [113,119]. Recent reports have challenged the previous view that Rett syndrome is a neurodevelopmental disorder: re-expression of the MeCP2 gene in Mecp2lox-Stop/y mice with a progressing disease phenotype is sufficient to reverse the neurological symptoms of RTT. These experiments suggest that MeCP2-deficient neurons develop normally and are not irreversibly damaged by the absence of MeCP2, and that the important molecular cues that allow MeCP2 to function in mature neurons, presumably DNA methylation, are established appropriately in its absence. They demonstrate the principle of reversibility of Rett syndrome and are consistent with the hypothesis that MeCP2 is required to stabilize and maintain the state of mature neurons. Nevertheless, it is still unclear whether MeCP2 function in the brain involves maintenance of a specific chromatin conformation or regulation of a few key genes.
Few target genes for MeCP2-mediated repression are known, and microarray studies have not detected major changes in gene expression in the brains of MeCP2-null animals [121-123]. Detailed studies of MeCP2 function specifically in differentiated neurons may help to determine why this protein is so crucial for their integrity.
Substantial evidence in vivo suggests that MBD proteins contribute to transcriptional repression at methylated gene promoters, especially in tumours, where many promoter-associated CpG islands are aberrantly methylated [13,36,63]. In mice that carry a heterozygous mutant allele of the Apc gene (Apc+/min), spontaneous inactivation of the wild type allele often occurs in the intestine and leads to the development of adenomas and the subsequent death of these animals at about 180 days of age. However, Apc+/min mice lacking Mbd2 or Kaiso proteins show a reduced incidence of intestinal tumours and significantly improved survival compared to Apc+/min littermates with wild type Mbd2 or Kaiso alleles [36,57]. Antisense oligonucleotide knockdown of MBD2 in xenografts has also been shown to suppress tumour growth, validating MBD2 as a potential target for anti-cancer therapy. It will be interesting to determine whether mice null for MBD1 protein are resistant to the development of tumours when crossed onto the Apc+/min background. On the other hand, Mbd4-deficient Apc+/min mice display accelerated tumour formation, consistent with the proposed role of MBD4 in suppression of CpG mutability and tumorigenesis in vivo. Taken together, these genetic studies indicate that methyl-CpG binding proteins contribute to the development of cancer phenotypes in mouse models.
Studies in human cancers have identified a number of aberrantly methylated promoters of tumour suppressor genes that are bound by MBD proteins. Candidate-gene approaches and genome-wide studies using CpG-island microarrays showed that a significant proportion of methylated promoters are bound by a single MBD protein, most often MBD2 [63,127]. However, unlike in primary human cells, about half of all methylated promoters seem to be occupied by more than one MBD protein. Whether cooperation between MBD proteins is required for stable silencing of densely methylated CpG islands in cancer cells is yet to be established.
A considerable amount of work over recent years has contributed to our understanding of the role of DNA methylation in mammalian development and human disease. Nevertheless, many questions remain unresolved, including the most fundamental ones of how the patterns of DNA methylation are established in vertebrate genomes and what causes dramatic changes in these patterns in pathological states such as cancer. Studies of proteins that bind methylated DNA have demonstrated that these proteins act as important effectors of DNA methylation and are involved in the establishment and maintenance of transcriptionally silenced chromatin and, perhaps, of higher order chromatin structure. Recent data have provided evidence that methyl-CpG binding proteins do not share binding sites in vivo and can recognize methylated DNA within specific sequence contexts. In some cases the binding selectivity could perhaps be attributed to structural properties of the domain that interacts with methylated DNA. In others, cooperation between the methyl-CpG binding domain and other DNA binding motifs could be essential for discriminating between different methylated loci and targeting distinct subsets of genes. Further work in vitro and in vivo, structural studies of these proteins in complex with appropriate DNA sequences, and genome-wide analyses of their binding profiles will provide additional essential information and will facilitate a more detailed mechanistic understanding of their complex functions.
We apologize to those researchers whose work could not be cited or was cited indirectly due to space limitations. Work from our laboratory is supported by Cancer Research UK, the Wellcome Trust and the EMBO Young Investigators programme. T.C. is an EMBO long-term fellow.