In the mammalian genome, DNA methylation occurs by covalent modification of the fifth carbon (C5) of the cytosine base, and the majority of these modifications are present at CpG dinucleotides. However, in mouse embryonic stem cells, the genomic DNA contains methylated CpA, CpT and CpG sequences [14] rather than the exclusive CpG methylation that is predominantly found in somatic cells. Overall, 5-methylcytosine (Me5C) accounts for about 1% of total DNA bases, which is estimated to correspond to methylation of 70–80% of all CpG dinucleotides in the genome [15]. CpG dinucleotides are distributed unevenly across the human genome and are concentrated in dense pockets called CpG islands (CGIs). The methylation pattern in any given cell is the outcome of independent but dynamic processes of methylation and demethylation. In the mammalian genome, methylation patterns in differentiated somatic cells are generally stable and heritable. However, reprogramming (demethylation/remethylation) of the methylation pattern takes place during two developmental stages: in germ cells and in preimplantation embryos. In contrast to the genome-wide demethylation that occurs in primordial germ cells, the genomes of mature sperm and eggs in mammals are highly methylated compared with somatic cells [8, 16].
Although the majority of CpGs in the genome are methylated, CpG dinucleotides within CGI promoters are typically unmethylated during development and in normal (non-neoplastic/non-senescent) tissue types. CGIs are genomic DNA regions with a high frequency of CpG dinucleotides. Typically, a CGI is defined as a region of at least 200 bp with a GC content greater than 50% and an observed/expected CpG ratio greater than 0.6 [17]. A comprehensive analysis of CGIs in human chromosomes 21 and 22 by Takai and Jones [18] revealed that regions of DNA greater than 500 bp with a G+C content equal to or greater than 55% and an observed CpG/expected CpG ratio of 0.65 were more likely to be associated with the 5′ regions of genes. With this definition, most of the
Alu-repetitive elements were excluded. These islands overlap with the promoter regions of 50–60% of human genes [19]. However, a subset of promoter CGIs is methylated in a tissue-specific manner during development, an exception to the general rule that CGI methylation in normal tissue is limited to X-inactivation and imprinted genes [8, 20]. This observation was further supported by recent genome-wide profiling of DNA methylation, which demonstrated that non-X-linked promoter CGIs are methylated in normal tissues and escape methylation in germ line cells [21]. Another study estimated that 6–8% of CGIs are methylated in the genomic DNA of human brain, blood, muscle and spleen [22]. Interestingly, in the same study, CGIs displayed tissue-specific methylation at genes essential for development, suggesting a programmed mechanism of DNA methylation. Another means of DNA methylation propagation is
via methylation spreading, which follows the genome-wide demethylation that starts shortly after fertilization. Remethylation of most of the genome occurs after the blastocyst stage [23] and continues at a slower pace during the rest of the developmental period. Although the phenomenon of spreading is not fully understood, it has been proposed to result from a self-perpetuating interaction between chromatin-modifying proteins and DNA methylation [24]. Indeed, many of the chromatin-modifying enzymes responsible for gene silencing are found associated with each other in mammalian cells. Examples of DNMT1-associated proteins include HDAC1 [25], the histone methyltransferase G9a [26], the ATP-dependent chromatin remodeling enzyme SNF2H [27], and the Polycomb protein EZH2 [28]. Therefore, the hypothesis that initial DNA or histone methylation attracts repressive complexes and creates a transcriptionally unfavorable chromatin conformation is very plausible. This alteration in chromatin structure, in turn, influences the nearby chromatin and makes it more prone to methylation spreading. This phenomenon is well documented in
Arabidopsis, where tandem repeats upstream of the endogene SDC recruit non-CG DNA methylation directed by histone methylation and siRNA, and display spreading of siRNAs and methylation beyond the repetitive DNA [29]. Existing evidence in mammalian cells shows that there are certain
cis-acting elements dispersed throughout the genome that can act either as a methylation signal element or as a methylation boundary during methylation spreading. For example, in the mouse
Aprt gene, two upstream B1 repetitive DNA elements were identified as providing a de novo methylation signal for spreading [30]. These elements reside in large stretches of DNA dubbed methylation centers. Other retrotransposon elements such as B2, Alu (the human equivalent of B1), and LINE-1 (long interspersed nuclear element-1) are also considered to possess
de novo signaling activity for methylation spreading [24]. In contrast, the Sp1 binding sites within the Aprt promoter provide a counteracting force against spreading. Indeed, site-directed mutation of one or more Sp1 sites eliminates the binding of transcription factors and allows methylation to spread into the
Aprt promoter [31]. Similarly, (ATAA)n repeat sequences in the human GSTP1 gene and Sp1 and CTCF elements in the BRCA1 gene act as boundary elements that prevent methylation from spreading onto CGIs [32, 33]. Recent genome-wide DNA methylation analysis discovered an overrepresentation of putative zinc finger binding sites at the boundaries of methylation-resistant CGIs. This observation suggests that these sites may reinforce transcription factor binding and thereby block methylation spreading and promote transcription [34]. A dynamic equilibrium between methylation spreading and its suspension is likely to be responsible for establishing and maintaining stable DNA methylation patterns in human somatic cells. Furthermore, a study combining bioinformatic approaches with methylation data from chromosome 21 demonstrated that DNA sequence, repeat frequencies, and predicted DNA structures correlate with the methylation status of CGIs [35].
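The sequence-based CGI criteria quoted above (minimum length, GC content, and observed/expected CpG ratio) can be illustrated with a minimal Python sketch. Several points are assumptions not stated in the text: the ratio is computed with the commonly used formula (number of CpG × sequence length) / (number of C × number of G); all thresholds are applied with "greater than or equal to" for simplicity; and the names `CgiCriteria`, `CLASSIC`, `TAKAI_JONES`, and `meets_criteria` are hypothetical helpers introduced only for illustration. Real CGI finders scan the genome with sliding windows and merge hits, which this sketch does not attempt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CgiCriteria:
    """Thresholds for calling a sequence a CpG island (hypothetical helper)."""
    min_length: int
    min_gc_fraction: float
    min_obs_exp: float


# Threshold values as quoted in the text; ">=" is used throughout as a simplification.
CLASSIC = CgiCriteria(min_length=200, min_gc_fraction=0.50, min_obs_exp=0.60)
TAKAI_JONES = CgiCriteria(min_length=500, min_gc_fraction=0.55, min_obs_exp=0.65)


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G).

    Assumed formula; returns 0.0 when the sequence has no C or no G.
    """
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def meets_criteria(seq: str, criteria: CgiCriteria) -> bool:
    """True if the whole sequence satisfies all three thresholds."""
    return (
        len(seq) >= criteria.min_length
        and gc_fraction(seq) >= criteria.min_gc_fraction
        and cpg_obs_exp(seq) >= criteria.min_obs_exp
    )
```

Under these assumptions, a CpG-dense 200 bp sequence would pass the first set of thresholds but fail the second on length alone, which illustrates how the stricter definition filters out short GC-rich elements.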
Aberrant gene expression is one of the key features associated with complex diseases such as cancer, type II diabetes, schizophrenia and autoimmune disease. These diseases are known to be heritable, although they do not follow clear Mendelian inheritance patterns. Several lines of evidence suggest that epigenetic abnormalities, together with genetic alterations, are responsible for the deregulation of key regulatory genes that results in these diseases. Epigenetic mechanisms provide an alternative explanation for some of the features of complex diseases, including late onset, gender effects, parent-of-origin effects, and fluctuation of symptoms [36]. For example, in cancer cells, normally unmethylated CGIs are often hypermethylated during neoplasia, silencing the flanking tumor suppressor genes [37, 38]. On the other hand, demethylation (hypomethylation) of normally methylated CGIs can lead to unscheduled activation of genes, as was first shown at
the MAGE-1 locus, which is normally expressed only in germ line cells but is activated in human tumors [39]. The pattern of cancer-associated CGI methylation also depends on other factors, such as cell lineage and environmental stimuli. Apart from cancer, the rare genetic disease ICF (immunodeficiency, centromeric instability and facial anomalies) syndrome has been linked to the methyltransferase machinery. ICF patients carry mutations in the DNMT3B gene, which lead to hypomethylation of satellite DNA and specific chromosomal decondensation [40]. Thus, DNA methylation and its enzymatic apparatus play a significant role in normal embryonic development and in disease.