Previous studies identified epigenetic changes in ICF patients harboring DNMT3B mutations in terms of locus-specific DNA methylation and histone modifications.
13,15 Here we focus on DNA methylation, the direct effect of altered DNMT3B function. To gain unbiased insight into the changes in DNA methylation occurring genome-wide, we performed whole-genome bisulphite sequencing (WGBS) of lymphoblastoid cell lines (LCLs) obtained from a female ICF patient (Caucasian, 1 y old) harboring mutated DNMT3B (A603T/STP807ins;
16,17 ICF) and one healthy gender-matched control sample (CTRL; Caucasian, 4 y old). All data are freely available at the Gene Expression Omnibus (GSE37578,
www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE37578). Overall, we generated 667 million reads for the CTRL and ICF samples, of which 88.8% and 85.9% mapped uniquely to the reference genome (hg19), respectively. Genome-wide, we obtained DNA methylation information for 93% and 90% of Cs and 94% and 93% of CGs, with an average coverage of 9.9- and 9.5-fold for the CTRL and ICF samples, respectively. Comparing global CpG methylation levels between the two samples revealed profoundly decreased levels in the ICF patient: the average methylation level of CpG sites dropped from 59.7% to 34.4%, and in total we identified 8,955,695 (32%) significantly differentially methylated CpG sites (χ² test, p < 0.05). The full set of WGBS data from CTRL and ICF is illustrated in (), using Circos.
18 In line with previously published reports using single-locus approaches, we detected a decrease in methylation in centromeric satellite repeats of chromosomes 1, 9 and 16. However, screening for differentially methylated sites in an ICF patient at base-pair resolution for the first time revealed hypomethylated centromeric satellite regions on all other chromosomes, to even larger extents (). In addition to the centromeres, a global loss of 41% (27–52%) of methylation was detected for all autosomes (). Most importantly, as previously suggested,
19 chromosome X showed more profound losses (63%), indicating a general failure in heterochromatic regions. Plotting the average DNA methylation level along the chromosomes revealed additional hypomethylated regions outside of the centromeric repeat-rich regions, spanning the entire chromosome, as displayed for chromosomes 1 and X (). Overall, loss of methylation was found across all genomic features, with losses detected in promoters, exons, introns and intragenic regions (). However, the loss of DNA methylation was more intense in non-genic regions.
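The per-site comparison described above (χ² test, p < 0.05) can be sketched as a test on a 2x2 table of methylated and unmethylated read counts in the two samples. This is a minimal illustration and not the authors' pipeline: the function name, the example read counts, and the choice of an uncorrected Pearson statistic without multiple-testing adjustment are assumptions made here.

```python
import math


def chi2_2x2(meth_a, unmeth_a, meth_b, unmeth_b):
    """Pearson chi-squared test (1 d.f., no continuity correction).

    Arguments are read counts at one CpG site: methylated and
    unmethylated calls in sample A (e.g. CTRL) and sample B (e.g. ICF).
    Returns (statistic, p_value).
    """
    n = meth_a + unmeth_a + meth_b + unmeth_b
    row_a, row_b = meth_a + unmeth_a, meth_b + unmeth_b
    col_m, col_u = meth_a + meth_b, unmeth_a + unmeth_b
    # A zero margin means the table carries no information about a difference.
    if 0 in (row_a, row_b, col_m, col_u):
        return 0.0, 1.0
    stat = n * (meth_a * unmeth_b - unmeth_a * meth_b) ** 2 / (
        row_a * row_b * col_m * col_u
    )
    # Survival function of the chi-squared distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical site covered 10x in each sample: 9/10 vs 3/10 methylated reads.
stat, p = chi2_2x2(9, 1, 3, 7)
print(f"chi2 = {stat:.2f}, p = {p:.4f}, significant = {p < 0.05}")
```

At roughly 10-fold coverage the expected cell counts are small, so the χ² approximation is coarse; an exact test would be a reasonable alternative in such a setting.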
From a regulatory standpoint, DNA methylation decreased in both CpG rich and CpG poor promoters, although to a greater extent in the latter (). The same effect was observed for CpG islands, showing conserved hypomethylation at the CpG rich islands and an increased loss of DNA methylation at the island shores in the ICF patient. Interestingly, the shape of DNA methylation was conserved around the transcription start site (TSS) in CpG rich promoters, with changes preferentially occurring at the edges of the hypomethylated regions (). While a genome-wide drop of methylation from 60% to 34% was observed, the proximate flanking regions of hypomethylated loci maintained a methylation level of around 50%, suggesting that preserving the surrounding sites and the shape of the hypomethylated region is important. In cancer cells, CpG island shores were identified as crucial elements for gene regulation, with highly variable methylation levels between normal and diseased tissue.
20,21 In ICF patients, aberrant methylation in CpG island shores might be responsible for previously reported changes in expression.
13 In addition to the conserved DNA methylation structure at promoters, CpG rich promoter containing genes were described as strongly expressed with a low evolutionary rate,
22 supporting a positive selection at the epigenetic and genetic level. For CpG poor promoters, we observed a profound loss of DNA methylation at the TSS and the surrounding regions, although the shape of the overall structure was conserved (). Comparing protein-coding and non-coding genes revealed similar tendencies, with non-coding RNA gene promoters showing generally higher methylation levels (
Fig. S1).
Regional enrichment of differential methylation
Plotting the methylation profile of the healthy control and the ICF patient along the chromosomes revealed a more profound drop of DNA methylation in the Giemsa-positive chromosomal regions, which represent loci of dense chromatin compaction (). Accordingly, the average methylation level in the ICF patient decreased significantly with increasing Giemsa staining intensity (; Student’s t test; p < 0.01). As DNMT3B functions are closely related to chromatin structure, especially heterochromatin formation, we analyzed the DNA methylation profile of regions presenting distinct histone marks in more detail. Here, we took advantage of chromatin immunoprecipitation sequencing (ChIP-seq) experiments performed within the ENCODE project on a lymphoblastoid cell line of a healthy donor (GM12878), which enabled us to determine the positions occupied by distinct histone marks. Analyzing the average DNA methylation level at the positions occupied by nine different histone marks, CTCF and H2A.Z revealed a significant loss of methylation at loci carrying the repressive Polycomb mark H3K27me3, the repressive heterochromatin mark H3K9me3
23,24 and H4K20me1, which was previously associated with repressed
25 and active
23,26 genes (; χ² test; p < 0.01). All of these marks were previously reported to be enriched on the inactive X chromosome.
27 Furthermore, genes harboring the repressive histone mark H3K27me3 were previously associated with DNMT3B activity.
28 The CTCF and H2A.Z bound regions also revealed a significant loss of DNA methylation (χ² test; p < 0.05). While a clear suppressive function has been described for CTCF, H2A.Z binding is rather associated with active gene promoters. However, in mouse models an association with pericentric heterochromatin has been observed.
29 Most importantly, when we analyzed the autosomes separately, all aforementioned marks remained significantly altered, suggesting that these regions are affected genome-wide and that the effect is not driven by the highly hypomethylated X chromosome. In contrast, histone marks clearly associated with transcriptional activity lost DNA methylation to a much lesser extent.
23 In particular, the DNA methylation level of H3K79me2 and H3K36me3 enriched regions remained almost unchanged, showing reductions of only 6% and running strongly counter to the genome-wide trend.
Furthermore, we aimed to analyze in more detail the alterations occurring in repetitive elements in the ICF patient. To this end, we extracted all sequencing reads that mapped to multiple positions in the reference genome and analyzed them separately to determine DNA methylation changes in transposons and repeats. Strikingly, different repeat families were altered to varying extents, with satellite repeats showing the highest reduction (-76%; ). Transposons (RC, -66%; LINE, -58%; LTR, -57%) also showed a very high reduction, whereas RNA repeats (scRNA, -22%; rRNA, -27%; tRNA, -34%) showed the lowest reduction of DNA methylation, suggesting that DNMT3B activity is crucial for centromere stability as well as transposon repression ().
When we specifically analyzed the methylation status of neighboring CpG sites, we unexpectedly observed, given the massive changes detected genome-wide, that the DNMT3B mutant patient did not show significantly less correlation between neighboring sites. This suggests that the global hypomethylation is due to reduced activation rather than to losses introduced by errors or failure of DNMT3B at random sites (). Therefore, we suggest that the mutations analyzed here cause a reduction of DNMT3B activation while maintaining the enzyme’s processivity. This might be due to an impaired interaction with DNMT3L, as previously shown for DNMT3B mutations in ICF patients.
30 Indeed, the A603T variant, known to be SAM binding deficient, also loses the ability to form homo-hetero-oligomers with DNMT3B and DNMT3L proteins in vitro and in vivo.
31,32 As interaction of DNMT3B with DNMT3L highly increases the activity of the methyl-transferase,
33 we suggest that an impaired binding of DNMT3B to its activator is responsible for the global hypomethylation, while DNA methylation structures are conserved in parallel. Detecting a homogeneously methylated genome in both samples suggests that the overall epigenomic landscape is maintained, consistent with the aforementioned maintenance of CpG rich promoter structures ().
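The reasoning about neighboring CpG sites can be illustrated with a small sketch: if methylation is reduced by a roughly uniform factor, the correlation between adjacent sites is unchanged, whereas losses at random sites would be expected to lower it. The text does not state which correlation measure was used; the Pearson coefficient, the function name and the toy methylation values below are assumptions for illustration only.

```python
import math


def neighbor_correlation(levels):
    """Pearson correlation between each CpG's methylation level (0..1)
    and the level of the next CpG along the chromosome."""
    x, y = levels[:-1], levels[1:]
    n = len(x)
    if n < 2:
        raise ValueError("need at least three CpG sites")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # constant signal: correlation is undefined
    return sxy / math.sqrt(sxx * syy)


# Toy profile: methylated stretch, hypomethylated stretch, methylated stretch.
ctrl = [0.90, 0.80, 0.85, 0.10, 0.05, 0.10, 0.90, 0.95]
# Uniformly reduced activity modeled as scaling every site by the same factor.
icf_like = [0.6 * v for v in ctrl]

print(neighbor_correlation(ctrl), neighbor_correlation(icf_like))
```

Because the Pearson coefficient is invariant to uniform scaling, both calls return the same value up to floating-point rounding, which mirrors the argument that a globally less active but still processive enzyme preserves local structure.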
Hypomethylated regions (HMRs)
As the ICF patient displayed a global loss of DNA methylation at high magnitude, we wondered if this affects the genome-wide structure of DNA methylation fingerprints such as hypomethylated regions (HMRs).
34,35 HMRs are of special interest as they present loci of regulatory potential and were previously shown to alter their methylation level and shape following differentiation from hematopoietic stem/progenitor cells into myeloid or lymphoid lineages.
34 As an impaired maturation of B-cells is observed in ICF patients, deregulation of HMRs might contribute to the disease. Therefore, we assessed and characterized HMRs throughout the genome and compared them between both samples. We observed that the number of HMRs more than tripled in the ICF patient. While healthy cells harbored around 48,000 distinct regions of hypomethylation, the ICF patient revealed 164,000 HMRs. The changes are even more striking considering the size and CpG content of the HMRs in the patient sample, as they cover 38.2% of the genome and harbor 39.3% of all CpGs. In contrast, HMRs in the healthy control sample represent only 3.1% of the genome and 9.9% of CpG sites. This is also reflected by the geometric mean sizes of 1.4 kb (95% CI: 0.2–8.4 kb) vs. 5.0 kb (95% CI: 0.8–30.9 kb) for the control and ICF sample, respectively.
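The intervals quoted for the HMR sizes are roughly symmetric around the geometric mean on a log scale (for example, 1.4/0.2 is close to 8.4/1.4), which is what a mean ± 1.96 SD of log-transformed sizes would produce. Under that reading, which is an assumption and not stated in the text, the summary could be computed as in the following sketch; the function name and the example sizes are hypothetical.

```python
import math


def geometric_summary(sizes_bp, z=1.96):
    """Geometric mean of region sizes plus a multiplicative interval
    exp(mean(log) +/- z * sd(log)). Sizes must be positive."""
    logs = [math.log(s) for s in sizes_bp]
    n = len(logs)
    mu = sum(logs) / n
    # Sample standard deviation of the log sizes (0 for a single region).
    sd = math.sqrt(sum((v - mu) ** 2 for v in logs) / (n - 1)) if n > 1 else 0.0
    return math.exp(mu), math.exp(mu - z * sd), math.exp(mu + z * sd)


# Hypothetical HMR sizes in base pairs.
gm, low, high = geometric_summary([400, 900, 1500, 2500, 6000])
print(f"geometric mean = {gm:.0f} bp, interval = {low:.0f}-{high:.0f} bp")
```

The interval is symmetric in log space, so the product of its bounds equals the squared geometric mean.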
To further characterize the distinct HMR populations in the healthy and ICF samples, we assessed HMR size, CpG content and score
34,35 (the score counts the CpG sites in an HMR, weighted inversely by their methylation status: more methylation, lower value). The Uscore is a normalized score (Uscore = score/CpG) that allows comparisons independent of HMR length. While the healthy cells revealed mainly CpG dense, highly hypomethylated loci and some CpG poorer regions with moderate hypomethylation, the ICF patient presented CpG poor regions with low to intermediate DNA methylation levels (). Although we observed a global hypomethylation of the genome, displayed by large blocks of reduced DNA methylation, a subpopulation of HMRs kept their size and composition, probably enabling ubiquitous expression by structural conservation of gene promoters crucial for cell survival. These conserved regions displayed a high CpG content and a low level of DNA methylation, features frequently observed at CpG rich promoters, which were identified above as maintaining their DNA methylation structure.
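The relation between score and Uscore can be made concrete with a short sketch. The text only states that CpG sites are weighted inversely by their methylation; the specific weight used here (1 minus the methylation level) and the function names are assumptions chosen for illustration, not the published definition.

```python
def hmr_score(meth_levels):
    """Score of one HMR: each CpG contributes (1 - methylation level),
    so unmethylated sites count fully and methylated sites count little.
    The weighting is an assumed stand-in for the published one."""
    return sum(1.0 - m for m in meth_levels)


def hmr_uscore(meth_levels):
    """Length-normalized score: Uscore = score / number of CpGs."""
    if not meth_levels:
        raise ValueError("HMR must contain at least one CpG")
    return hmr_score(meth_levels) / len(meth_levels)


# A short, strongly hypomethylated HMR vs. a long, moderately methylated one.
dense = [0.02, 0.05, 0.03, 0.04]
broad = [0.40, 0.35, 0.45, 0.50, 0.40, 0.38, 0.42, 0.44, 0.36, 0.41]
print(hmr_score(dense), hmr_uscore(dense))
print(hmr_score(broad), hmr_uscore(broad))
```

In this toy case the longer region has the higher raw score simply because it contains more CpGs, while the Uscore ranks the short, nearly unmethylated region higher, which is the length-independent comparison the normalization is meant to allow.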
Differentially methylated regions (DMRs)
In order to identify specific loci differentially methylated between the control and ICF sample, we screened the methylomes for regions consistently changing between both samples. DMRs were defined as regions of at least five consistently differentially methylated CpG sites between the control and ICF sample. In total, we assigned 315,813 DMRs, present on all chromosomes. Consistent with the genome-wide changes reported above, the DMRs almost exclusively represented hypomethylation (296,964 out of 315,813) in the ICF patient (). Interestingly, despite the global loss of methylation, distinct regions displayed a gain of methylation in the ICF patient. In particular, we found 18,849 (6.0%) DMRs representing a gain of methylation in the ICF patient, displaying a distinct genomic distribution compared with their hypomethylated counterparts. The majority of hypo-DMRs (57.8%) mapped to intergenic regions, 37.7% overlapped gene bodies (intragenic) and 4.5% were located in gene promoters. In contrast, the majority of hyper-DMRs were associated with genic regions, with 66.6% and 11.8% mapping to gene bodies and promoter regions, respectively. Analyzing the chromosomal distribution of DMRs revealed a higher abundance on the X chromosome than on the autosomes, with 55% and 41% of the chromosomes covered by DMRs, respectively.
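The DMR definition above (at least five consistently differentially methylated CpG sites) can be sketched as a scan for runs of adjacent significant sites that change in the same direction. Interpreting "consistently" as "consecutive and same direction" is an assumption, as are the input format and the function name; the sketch handles a single chromosome with position-sorted sites.

```python
from itertools import groupby


def call_dmrs(sites, min_sites=5):
    """Group position-sorted CpG sites into DMRs.

    sites: list of (position, direction) tuples, where direction is
           -1 (significant loss in ICF), +1 (significant gain) or
           0 (no significant difference).
    Returns a list of (start, end, kind, n_sites) tuples.
    """
    dmrs = []
    # groupby splits the list into maximal runs sharing the same direction.
    for direction, run in groupby(sites, key=lambda site: site[1]):
        run = list(run)
        if direction != 0 and len(run) >= min_sites:
            kind = "hypo" if direction < 0 else "hyper"
            dmrs.append((run[0][0], run[-1][0], kind, len(run)))
    return dmrs


# Hypothetical sites: five consecutive losses form one hypo-DMR; the two
# gains and the isolated loss are too short to qualify.
example = [(100, -1), (110, -1), (120, -1), (130, -1), (140, -1),
           (150, 0), (160, 1), (170, 1), (180, -1)]
print(call_dmrs(example))
```

A production implementation would typically also cap the distance between neighboring CpGs within a region; no such parameter is given in the text, so none is applied here.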
Although hypomethylation appeared throughout the genome, the loss of methylation at CpG rich gene promoters did not present a diffuse pattern but appeared rather organized, ensuring expression of genes crucial for survival. In line with this, we detected tissue-specific genes that were highly methylated in the healthy control and revealed a de novo established HMR in their promoters. In particular, testis-cancer specific genes of the TUDOR family (TDRD1 and TDRD9) presented sharply structured de novo HMRs in their promoters (), accompanied by re-expression of the testis-specific genes in B-cells of ICF patients, as previously detected.
13 In the disease context analyzed here, hypermethylated regions represent a special event, as they run against the global loss of DNA methylation and also against expectations, since ICF patients harbor mutations in DNMT3B that repress its activity. DNMT3B binding to H3K4 was previously described to be impaired by DNMT3L upon mono-, di- or tri-methylation of the H3K4 residue. In the ICF patient, we detected an enrichment of hypermethylated DMRs in particular at methylated H3K4 marked loci, suggesting that the residual activity of DNMT3B is misguided to those regions ().
36 Here, mislocated DNMT3B activity might be mediated by an impaired interaction with DNMT3L as previously determined for different DNMT3B mutants in ICF patients.
30 DNMT3L interacts with unmethylated H3K4 tails, whereas H3K4 methylation blocks the binding and subsequent DNA methylation of the marked regions by DNMT3B.
33,37 Consistently, we detected an enrichment of hypermethylated DMRs at sites previously inhibited by DNMT3L binding such as methylated H3K4 (). As especially H3K4me1 marks the boundaries flanking the TSS
23 ( and Fig. S2), a mislocated activity of DNMT3B in the flanking region could be suspected. Indeed, displaying promoter sites harboring hypermethylated DMRs revealed an increase of methylation in the sharply dropping boundaries, resulting in narrower HMRs at CpG rich and CpG poor promoters (). In addition, hyper-DMRs revealed a promoter distribution highly similar to the positioning of H3K4me1 (). Here, chromatin immunoprecipitation bisulphite sequencing (ChIP-BS-seq
38) of H3K4me1 in healthy and ICF samples could clarify whether these regions gain methylation in the patient.
Analyzing genes harboring hyper-DMRs in their respective promoter regions by Gene Ontology (GO) analysis (Biological process, level 5) revealed an enrichment of basic cellular mechanisms, such as RNA metabolism, regulation of transcription, cellular biosynthesis, nucleobase, nucleoside, nucleotide and nucleic acid metabolism, macromolecule biosynthesis and transferase activity (FDR, p < 0.05).
Therefore we propose a model in which for certain constitutively active housekeeping genes the promoter structure is maintained to ensure transcription. However, these genes present slightly narrowed gene promoter HMRs in ICF patients, suggesting that the responsible mechanism is failing to entirely restore the original status of the healthy donor (). We suggest that mislocated DNMT3B activity is hypermethylating sharp promoter boundaries, which are marked and protected by H3K4me1 in the original state (). In wild-type cells binding of DNMT3B to methylated H3K4 is impaired by DNMT3L resulting in hypomethylated H3K4 marked regions and wider promoter HMRs.
From a disease point of view, genes gaining methylation are associated with B-cell maturation, a function impaired in ICF patients, causing agammaglobulinemia and severe immunological defects. Although based on a different genetic alteration, ICF patients share phenotypical overlap with patients suffering from congenital agammaglobulinemia,
39 who also present with an early onset of recurrent infections. This X chromosome-linked disease is caused by mutation of the Bruton’s tyrosine kinase (
BTK), resulting in impaired B-cell maturation and immune system failure. BTK is involved in the signal transduction activated by the B-cell receptor (BCR), which involves additional molecules such as SYK and the B-cell linker (BLNK), a scaffold protein that binds BTK, PLCγ2, GRB2, VAV1 and NCK1.
39,40 Considering similarities in the disease phenotype of congenital agammaglobulinemia and ICF, it is tempting to speculate that the BCR pathway is also altered in ICF, although not by genetic but epigenetic alterations. Consistently, we found hypermethylated DMRs in the promoter region of
GRB2,
VAV1 and
NCK1 creating smaller HMRs overlapping the TSS (). In line, we found further genes of the BCR pathway epigenetically altered. In particular, the BTK activator
SYK revealed a hypermethylated DMR and the BTK repressor
SH3BP5
41 a hypomethylated DMR in their promoter regions (). It is of note that the hypomethylation of the
SH3BP5 promoter was not diffuse, but rather created a sharply formed HMR likely to favor gene expression through its transcription-promoting structure. Most importantly, the expression of
SYK and
SH3BP5 was shown to be down- and upregulated in ICF patients, respectively, consistent with their promoter methylation profiles.
13 Interestingly,
ZBTB24, the second gene genetically altered in ICF (in Type 2 patients), harbored a hypermethylated DMR in its promoter. Although the gene is not mutated here, inactivating hypermethylation of
ZBTB24 might contribute to the Type 1 disease phenotype as apparent in ICF Type 2 patients.