DNA methylation is an essential epigenetic mark. Three classes of mammalian proteins recognize methylated DNA: MBD proteins, SRA proteins and the zinc-finger proteins Kaiso, ZBTB4 and ZBTB38. The last three proteins can bind either methylated DNA or unmethylated consensus sequences; how this is achieved is largely unclear. Here, we report that the human zinc-finger proteins Kaiso, ZBTB4 and ZBTB38 can bind methylated DNA in a sequence-specific manner, and that they may use a mode of binding common to other zinc-finger proteins. This suggests that many other sequence-specific methyl binding proteins may exist.
DNA methylation is an essential epigenetic mark in mammals. It is associated with transcriptional repression, in part because methylated DNA recruits specific proteins, which themselves act on chromatin and create a repressive environment (1).
Three families of proteins bind methylated DNA in mammals (2). The first of these families contains a domain called methyl-CpG binding domain (MBD) and comprises MBD1, MBD2, MBD4 and MeCP2. The second family contains a SET- and Ring finger-associated (SRA) domain and comprises UHRF1 and UHRF2. The third family is comprised of three zinc-finger proteins: Kaiso, ZBTB4 and ZBTB38 (3,4). Kaiso and ZBTB4 are deregulated in cancer (5,6).
The structural mechanism by which the MBD and SRA domains interact with methylated DNA has been elucidated (7,8). In contrast, how the zinc-finger proteins recognize methylated DNA is unknown. In vitro, Kaiso binds two types of sequences: not only methylated DNA (4), but also the consensus sequence CTGCNA, named the Kaiso binding sequence (KBS) (9). The KBS does not contain a CG and cannot be methylated. It is unclear whether KBS binding and methyl-DNA binding are related or separate activities of Kaiso. Similarly, ZBTB4 can bind either methylated DNA or the KBS in vitro (3).
Here, we used ZBTB4 as a model to investigate how human zinc-finger proteins recognize methylated DNA. We report that ZBTB4 and related proteins bind methylated DNA in a sequence-specific manner: the nucleotides surrounding the methylated CpG directly contribute to the binding affinity. ZBTB4 also binds a related sequence in unmethylated DNA, but with lower affinity. The mode of binding could resemble the canonical mechanism used by C2H2 zinc-finger proteins. These findings have implications for the possible targets and roles of the Kaiso-related proteins. In addition, they suggest that other zinc-finger proteins might bind methylated DNA in mammals.
All the plasmids used are listed in Supplementary Table S1.
GST-Kaiso (ZnF), GST-ZBTB4 (ZnF) and their derivatives were expressed and purified as described (9), except that 0.5 mM IPTG was used for induction. GST-ZBTB38 (ZnF) was induced for 16 h at 18°C with 0.5 mM IPTG in LB medium containing 2% glucose.
We used a 49-mer oligonucleotide containing a stretch of 15 random positions (GTTTTCCCAGTCACTAC(N15)GTCATAGCTGTTTCCTG).
The initial random oligonucleotide was made double-stranded by annealing with the oligo SELEX-R (CAGGAAACAGCTATGAC), and polymerization by the Klenow fragment of DNA Polymerase I (New England Biolabs).
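The library design can be checked computationally. The following is a minimal sketch and not part of the original protocol; the function names are hypothetical and chosen for illustration. It builds one random library member from the constant regions given above and confirms that SELEX-R is the reverse complement of the 3′ constant region, which is what allows it to prime second-strand synthesis.

```python
import random

FLANK_5 = "GTTTTCCCAGTCACTAC"   # 5' constant region of the library oligo
FLANK_3 = "GTCATAGCTGTTTCCTG"   # 3' constant region of the library oligo
SELEX_R = "CAGGAAACAGCTATGAC"   # primer annealed before Klenow extension

def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def random_library_member(n_random=15, rng=random):
    """Build one library oligo: 5' flank, n_random random bases, 3' flank."""
    core = "".join(rng.choice("ACGT") for _ in range(n_random))
    return FLANK_5 + core + FLANK_3

oligo = random_library_member()
print(len(oligo))                              # 17 + 15 + 17 = 49
print(reverse_complement(FLANK_3) == SELEX_R)  # True
```

Under these assumptions the computed length agrees with the stated 49-mer.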
For selections, ~100 ng of GST-ZBTB4 immobilized on Glutathione beads was incubated for 30 min at room temperature with 1 µg of double-stranded random oligonucleotide in 100 µl binding buffer (25 mM HEPES pH 7.5, 50 mM KCl, 2.5 mM MgCl2, 0.1% NP-40, 1 µM ZnSO4, 5% Glycerol) containing 5 µg poly(dI-dC) and 5 µg BSA. After five washes with 1 ml of binding buffer, the bound DNA fragments were extracted with phenol/chloroform/isoamyl alcohol and ethanol-precipitated. These products were then PCR-amplified with the primers SELEX-F (GTTTTCCCAGTCACTAC) and SELEX-R. The PCR program consisted of 95°C for 3 min; 10, 15 or 20 cycles of (95°C for 30 s, 60°C for 1 min, 72°C for 30 s); and a final 10 min extension at 72°C. The PCR products were subjected to the next round of binding.
The products selected after 10 cycles of binding and amplification were cloned and sequenced. The sequences were aligned and the motifs analyzed with Weblogo (http://weblogo.berkeley.edu/).
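To illustrate what a sequence logo summarizes, the sketch below computes the per-position information content (in bits) of a set of aligned DNA sequences. It is a simplified illustration rather than the WebLogo implementation: it omits the small-sample correction, and the four input sequences are a toy alignment built from the two Z4BS variants, not the actual sequenced clones. The function name is hypothetical.

```python
import math
from collections import Counter

def information_content(aligned):
    """Per-position information content in bits (2 - Shannon entropy) for DNA."""
    length = len(aligned[0])
    assert all(len(s) == length for s in aligned), "sequences must be aligned"
    bits = []
    for i in range(length):
        counts = Counter(s[i] for s in aligned)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        bits.append(2.0 - entropy)
    return bits

# Toy alignment for illustration only (not data from this study).
toy_alignment = ["CCGCCATC", "CTGCCATC", "CCGCCATC", "CTGCCATC"]
bits = information_content(toy_alignment)
print([round(b, 2) for b in bits])
```

In this toy input, invariant positions score 2 bits, while the position split evenly between C and T scores 1 bit.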
We used a 48-mer oligonucleotide with a fixed central CG, flanked by random stretches (GTTTTCCCAGTCACTAC(N6)CG(N6)GTCATAGCTGTTTCCTG).
Before selections, the double-stranded DNA was methylated with the CpG methylase SssI (NEB) for 6 h. Binding, washing and PCR were performed as in the SELEX protocol described above. After 10 cycles of selection, the enriched PCR products were cloned and sequenced.
In trial selections, we saw a rapid enrichment of sequences containing tracts of repeated CpGs. As ZBTB4 can bind a single methyl-CpG (3), we sought to eliminate sequences with repeated CpGs. For this, at each round, the oligonucleotides were digested for 1 h with the enzyme BstUI, which cleaves at CGCG. They were then remethylated by SssI and re-selected.
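The effect of the BstUI counter-selection can be sketched in silico. This is an illustrative model only, with hypothetical function names and made-up example inserts; it treats any library member containing CGCG as cleaved and therefore lost from the pool. Because CGCG is palindromic, scanning one strand is sufficient.

```python
FLANK_5 = "GTTTTCCCAGTCACTAC"   # 5' constant region
FLANK_3 = "GTCATAGCTGTTTCCTG"   # 3' constant region
BSTUI_SITE = "CGCG"             # BstUI recognition sequence

def methyl_selex_member(left6, right6):
    """Assemble a library member: flank, N6, fixed central CG, N6, flank."""
    assert len(left6) == 6 and len(right6) == 6
    return FLANK_5 + left6 + "CG" + right6 + FLANK_3

def survives_bstui(seq):
    """True if the sequence has no BstUI site and would escape digestion."""
    return BSTUI_SITE not in seq

# Hypothetical inserts for illustration (not sequences from this study).
single_cpg = methyl_selex_member("AAAAAC", "CCATAA")
repeated_cpg = methyl_selex_member("AAAACG", "CGAAAA")

print(len(single_cpg))               # 17 + 6 + 2 + 6 + 17 = 48
print(survives_bstui(single_cpg))    # True
print(survives_bstui(repeated_cpg))  # False
```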
All the oligonucleotides used in electrophoretic mobility shift assay (EMSA) are listed in Supplementary Table S2.
Double-stranded oligonucleotides were end-labeled with γ-32P-ATP using T4 polynucleotide kinase (NEB) and purified on Sephadex-G25 columns (Roche). Approximately 10 ng of GST-fused proteins were incubated for 15 min at room temperature with the labeled probe in 20 µl binding buffer (25 mM HEPES pH 7.5, 50 mM KCl, 2.5 mM MgCl2, 0.1% NP-40, 10 µM ZnCl2, 5% Glycerol) containing 1 µg poly(dI-dC) (Sigma) and 1 µg BSA. For competition experiments, proteins were pre-incubated with unlabeled competitor DNA for 20 min on ice and then labeled probes were added to the reaction mixture. The binding mixtures were electrophoresed in 0.5× TBE buffer for 1.5 h at 50 V in a gel containing 0.5× TBE, 2.5% Glycerol and 5% acrylamide.
For the chelation experiment, GST-ZBTB4 was pre-incubated with 3 mM 1,10-O-phenanthroline (OPA) or the corresponding solvent, methanol, for 20 min at room temperature, before adding back ZnCl2 to a final concentration of 0.2 mM or 0.5 mM. The mixtures were then incubated for 5 min at room temperature, followed by incubation with the labeled probes as described above.
In order to rule out major artifacts when using mismatched probes, we verified that all mismatched probes formed duplexes to the same extent (Supplementary Figure S7).
Bands were quantified with the ImageQuant TL software (GE Healthcare). The competition experiments were performed at least twice, and representative pictures and accompanying graphs are shown.
The cells were transfected with Lipofectamine 2000 (Invitrogen).
We used the Weblogo software (10).
First, we sought to identify the preferred binding sites for ZBTB4 on unmethylated DNA. A recombinant fragment of ZBTB4 containing the three DNA-binding zinc fingers was used in a site-selection assay (SELEX) (Figure 1A). Most sequences recovered after 10 selection cycles contained an 8-nt motif, CC/TGCCATC, which we named the ZBTB4 binding sequence (Z4BS) (Supplementary Figure S1 and Figure 1B). Under these experimental conditions EMSAs revealed that ZBTB4 binds both CCGCCATC and CTGCCATC, but not a random unselected oligonucleotide (Figure 1C).
To assess the importance of each position in the Z4BS, we used an EMSA competition assay. A 200-fold molar excess of unlabeled Z4BS was sufficient to fully displace the binding to a labeled Z4BS (Figure 1D). We then tested different mutant probes present in the same molar excess. Mutating positions 3 or 4 of the Z4BS, for instance, severely decreased the competing effect of the oligonucleotide (Figure 1D). This experiment showed that nucleotides 1 through 7 of the Z4BS contribute to binding, with larger contributions of positions 3, 4, 6 and 2, in decreasing order of importance.
The optimal Kaiso binding site, or KBS, is TCCTGCNA (9). The Z4BS is similar to the KBS (Figure 1B), and we found that Kaiso can bind both Z4BS sequences CCGCCATC and CTGCCATC (Supplementary Figures S2 and S6). These data show that the optimal binding site for ZBTB4 on unmethylated DNA is an 8-nt sequence that is related to the KBS.
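The relationship between the Z4BS and the KBS can be made concrete by matching the consensus sequences against the two Z4BS variants. This is a minimal sketch with a hypothetical helper function; it writes the C/T position of the Z4BS with the IUPAC code Y and uses the KBS core CTGCNA given in the Introduction.

```python
import re

# Subset of IUPAC nucleotide codes needed here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "N": "[ACGT]"}

def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))

Z4BS = consensus_to_regex("CYGCCATC")   # CC/TGCCATC
KBS_CORE = consensus_to_regex("CTGCNA")

results = {}
for site in ("CCGCCATC", "CTGCCATC"):
    results[site] = (bool(Z4BS.search(site)), bool(KBS_CORE.search(site)))
    print(site, results[site])
```

Both variants match the Z4BS consensus by construction, while only CTGCCATC also contains the KBS core, which is one way to see the overlap between the two motifs.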
We next asked whether ZBTB4 has a preferred target on methylated DNA. For this, we modified the methyl-SELEX procedure (Figure 2A; 11).
Twenty-two independent clones were sequenced after 10 selection cycles. Sixteen contained the consensus C/AMGCC/TAT, M being methylated cytosine, which we named the methylation-dependent Z4BS (meZ4BS) (Figure 2B and Supplementary Figure S3). ZBTB4 bound the methylated consensus sequence CCGCTAT in EMSA. It also bound this sequence when it was unmethylated, albeit with lower affinity (Figure 2C). The remaining six selected clones did not contain a meZ4BS, and were bound only when methylated (Supplementary Figure S4). Z4BS and meZ4BS are highly similar (Figure 2B).
We then tested the effect of mutations in the meZ4BS (Figure 2D). All the oligonucleotides were 100% methylated at synthesis. The WT sequence CMGCCAT, when present at 400-fold molar excess, fully outcompeted the labeled probe. When position 4, just adjacent to the meCpG, was mutated (m1 oligo), competition was almost abolished. Mutating positions 5 or 6 (oligos m2 and m3, respectively), also affected the competing ability, but to a smaller extent. This establishes that ZBTB4 recognizes methylated DNA in a sequence-specific manner: nucleotides outside the methylated CpG increase the binding affinity. We found that this was also true for ZBTB38 (Figure 2E).
Finally, we determined the binding preference of ZBTB4 for the different methylated and unmethylated targets we had identified. In a competition experiment, we saw the following order of preference: CMGCCAT > CTGCCAT > CCGCCAT (Figure 3A). Therefore, the methylated sequence meZ4BS is the preferred binding target of ZBTB4 in vitro.
Next, we tested the ability of ZBTB4 to bind hemi-methylated DNA, using a competition assay (Figure 3B). A probe methylated only on the upper strand competed as efficiently as a symmetrically methylated probe, showing that ZBTB4 can discriminate cytosine methylation on the upper strand, and that cytosine methylation on the other strand contributes little. In accordance with this model, a probe methylated only on the lower strand competed better than unmethylated DNA, but less well than a probe methylated on the upper strand. This indicates that methylated DNA binding by ZBTB4 is not symmetrical.
Next, we asked if ZBTB4 binds DNA in a zinc-dependent fashion: we chelated Zn2+ ions with OPA, then performed EMSA (Figure 4A). Pre-incubation with OPA abolished the binding of ZBTB4 to both Z4BS and meZ4BS, and addition of Zn2+ restored both. Therefore, zinc is required for ZBTB4 to bind both methylated and unmethylated DNA.
The canonical C2H2 protein Zif268 makes the majority of its contacts to DNA from four positions in each recognition helix: positions −1, +2, +3, and +6 (12). ZBTB4 has three zinc fingers, and therefore 12 such amino acids, of which six are highly conserved (Figure 4B and Supplementary Figure S5). We mutated these to alanine, along with a control, non-conserved residue in zinc finger 3. The R326A mutation abolished recognition of the Z4BS and of the meZ4BS in EMSA (Figure 4C). The mutation of K354 or T376 to alanine did not have a discernible consequence on binding to the Z4BS or the meZ4BS. Mutating S323, L348 or Y351 modestly reduced binding to the Z4BS in each case, but had little effect on binding to the meZ4BS. The E350A mutation did not affect binding to the Z4BS, but strongly reduced binding to the meZ4BS. The corresponding Kaiso mutant (E535A) likewise still bound the Z4BS but lost binding to the meZ4BS (Supplementary Figure S6).
There are two possible causes for the behavior of the E350A mutation of ZBTB4: this mutation may affect ZBTB4's ability to directly discriminate methylated C from C or T; alternatively, it may affect ZBTB4's ability to recognize the G base-paired to the C in position 2 of the primary strand. To discriminate between these possibilities, we used mismatched probes. Mutation E350A did not prevent recognition of a TG/GC mismatched probe (Supplementary Figure S7). It did, however, strongly inhibit recognition of an MG/AC mismatched probe, from which the guanine is absent.
Although we cannot exclude the possibility that some of the effect observed with the mismatched probes is due to mismatch-induced distortions in the DNA backbone, these data imply that the E350A mutation affects ZBTB4's ability to directly discriminate methylated C from C or T, rather than affecting its ability to recognize the G on the opposite strand of DNA.
Mutant E350A of ZBTB4 retained a weak capacity to bind the meZ4BS (Figure 4C), but we found that mutating the nucleotides flanking the methylated CpG completely abolished binding (Figure 4D). Collectively, these data argue that residue E350 of ZBTB4 is critical for recognizing methylated cytosine when bound to its high affinity binding site, and that ZBTB4 simultaneously recognizes flanking sequences via several residues in zinc fingers (Figure 5).
We next tested whether the ZBTB4 E350A mutant binds to methylated DNA in vivo. In mouse cells, ZBTB4 localizes to chromocenters, the DAPI-dense regions of the nucleus that contain highly methylated pericentric DNA (3; Figure 4E, left). In contrast, the E350A mutant of ZBTB4 had a diffuse nuclear distribution (Figure 4E, middle). In a minority of cells, the mutant ZBTB4 formed nuclear speckles, but these never co-localized with the chromocenters (Figure 4E, right). These results indicate that E350 is essential for heterochromatin localization in vivo.
One of our main conclusions is that the human zinc-finger protein ZBTB4 binds methylated DNA in a sequence-specific manner. This discovery should be contrasted to two earlier findings.
RFX1, a protein purified from human placenta, was shown to bind some methylated sequences, but not others (13,14). Further experiments showed that the preferred binding sites of RFX1 are non-methylated consensus sequences containing TG; methylated sequences are second-tier targets, bound because of the structural resemblance of methylated cytosine and thymine (Supplementary Figure S9B). In other words, certain methylated sequences resemble the optimal RFX1 binding site and are bound with suboptimal affinity. In contrast, ZBTB4 binds with highest affinity to methylated DNA. Replacing methyl-cytosine by a thymine in the ZBTB4 target yields a sequence that is still bound by ZBTB4, but with lower affinity.
The other case of binding methylated DNA with sequence discrimination is MeCP2, which prefers methylated sites flanked by A/T tracts (11). The effect is partly indirect: A/T tracts increase MeCP2 binding in part by tightening the minor groove (15). The situation of ZBTB4 is different, as the recognition very likely involves direct contact between the nucleotides that constitute the recognition site, and the exposed amino acids of the zinc fingers.
We report that the human proteins ZBTB38 (Supplementary Figure S8 and Figure 2E) and Kaiso, which are related to ZBTB4, also bind methylated DNA in a sequence-specific fashion. Kaiso strongly binds to a single methylated site if the optimum flanking sequence is present (Supplementary Figure S6). In light of this finding, the original observation that Kaiso needs two consecutive meCpGs for high-affinity binding (4) could be due to the absence of an appropriate consensus around the meCpG. However, our data do not rule out the possibility that consecutive meCpGs are indeed biologically relevant high-affinity sites. In fact, our results show that ZBTB4 recognizes DNA methylation mostly on one strand; this is compatible with a mechanism in which double meCpG sites are bound by two molecules of Kaiso, each on one DNA strand.
In contrast to ZBTB4 and ZBTB38, some targets of Kaiso in vivo are known. Some of these are unmethylated and bound via the KBS (16,17), whereas some others are methylated (18,19). Experiments in Xenopus (20), in human cancer cell lines (5) and in mice (21,22) argue that at least some roles of Kaiso are linked to DNA methylation.
The 50 million methylated CpGs in the human genome (23) probably vastly outnumber the methyl-binding proteins in cells. This number is not known for Kaiso, ZBTB4, or ZBTB38 but, as a comparison, there are about 50 000 molecules of TATA-binding protein in a mammalian cell (24). We hypothesize that the sequence specificity of Kaiso, ZBTB4 and ZBTB38 helps recruit them to a subset of methylated sites. This will have to be tested in cells, keeping in mind that transcription factor targets bound in vitro and in vivo usually present an overlap that is general but not total (25,26).
Some of the DNA methylation dynamics during differentiation and cancer occur not on CpG islands, but at flanking CpG ‘shores’ where the CpG density is low (27,28). These regions are interesting candidates for the recruitment of Kaiso-related proteins.
An interesting discovery made in the course of this work is that, unlike MBD proteins, the zinc-finger proteins we have studied have a significant affinity for hemimethylated DNA. This behavior is reminiscent of the SRA protein UHRF1 (29,30) and, by analogy, suggests that the Kaiso-related proteins might play a role in the maintenance of DNA methylation.
The binding of ZBTB4 to methylated DNA is strand-specific, requires zinc, and involves amino acids that are predicted to be important by the standard model of C2H2 zinc finger/DNA interactions. In other words, it is possible that ZBTB4 uses a canonical mechanism to recognize methylated DNA; a putative model of this zinc finger/DNA interaction is provided in Supplementary Figure S9A. This prediction awaits experimental confirmation by structural analysis.
An artificial C2H2 zinc-finger protein has been engineered to discriminate methylated and unmethylated DNA (31). It has very little sequence similarity to the Kaiso-related proteins. In particular, it lacks an equivalent of E350, the glutamic acid we found critical for Kaiso and ZBTB4 to bind methylated DNA. This could be explained by the fact that the methyl-cytosine recognized by the artificial protein is in a different position (in the middle of a triplet), and therefore not contacted by a residue in the same position as E350 of ZBTB4.
To conclude, we note that there are approximately 700 proteins with C2H2 zinc fingers in the human genome (32). If, as our results suggest, a canonical binding mode permits the recognition of methylated DNA, then it is possible that other methyl-binding proteins with zinc fingers exist.
Supplementary Data are available at NAR Online.
Institut National du Cancer (to N.S.); Defossez lab from Centre National de la Recherche Scientifique; Institut National du Cancer (Programme ATIP Plus); Association pour la Recherche contre le Cancer (grant n°4859); Ligue contre le Cancer (Comité de Paris). Funding for open access charge: Centre National de la Recherche Scientifique.
Conflict of interest statement. None declared.