DNA methylation plays an important role in regulating normal development and carcinogenesis. Current understanding of the biological roles of DNA methylation is limited to its role in the regulation of gene transcription, genomic imprinting, genomic stability, and X chromosome inactivation. In the past two decades, a large number of changes have been identified in cancer epigenomes when compared with those of normal cells. These alterations fall into two main categories, namely, hypermethylation of tumor suppressor genes and hypomethylation of oncogenes or heterochromatin. Aberrant methylation of genes controlling the cell cycle, proliferation, apoptosis, metastasis, drug resistance, and intracellular signaling has been identified in multiple cancer types. Recent advancements in whole-genome analysis of the methylome have yielded numerous differentially methylated regions, the functions of which are largely unknown. With the development of high resolution tiling microarrays and high throughput DNA sequencing, more cancer methylomes will be profiled, facilitating the identification of new candidate genes or ncRNAs that are related to oncogenesis, new prognostic markers, and the discovery of new target genes for cancer therapy.
When normal cells are transformed to cancer cells, a series of genetic lesions and/or epigenetic disruptions that favor the uncontrolled growth of cells occur. Mutation of tumor suppressor genes, such as p53, leads to loss of function of the protein that is normally required for non-transformed cells. Epigenetic changes, including global DNA hypomethylation and hypermethylation of tumor suppressor genes, are frequently observed in cancer cells. Such changes cause genomic instability that increases mitotic recombination or silencing of tumor suppressor genes that play critical roles in the control of cell proliferation and transformation. In this review, we discuss the role of DNA methylation in cancer cells and summarize recent advancements of techniques that facilitate genome-wide study of the cancer epigenome.
Methylation is the only known epigenetic modification of DNA. Other epigenetic marks of chromatin include different types of post-translational modifications of histones, which are highly diverse and some of which are closely correlated with DNA methylation (see Kouzarides, 2007 for a review of histone modifications and their functions). DNA methylation is a crucial regulator of different biological processes, such as embryonic development, transcription, chromatin structure, X chromosome inactivation, genomic imprinting, genomic stability, and carcinogenesis. Methylation of DNA occurs exclusively at the C5 position of cytosine. In mammals, the majority of cytosine methylation is observed in CpG dinucleotides. Non-CpG methylation is rare, and likely to be restricted to embryonic stem cells (Ramsahoye et al., 2000). Since transcriptionally active regions of the genome are usually CpG rich, methylation of CpG sites is one of the critical factors that affect gene transcription. Many regions of the genome contain large clusters of CpG dinucleotides. These regions are called CpG islands and they are present in ~70% of human promoters (Saxonov et al., 2006). In normal somatic cells, most of the CpG islands are unmethylated. Aberrant hypermethylation of the CpG islands linked to some tumor suppressor genes is acquired during tumorigenesis. The reason for aberrant methylation is largely unknown. It might be caused by dysregulation of the DNA methyltransferases or other chromatin binding proteins.
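The notion of a CpG island as a cluster of CpG dinucleotides can be made concrete with a short calculation. The sketch below is illustrative only: the thresholds (length of at least 200 bp, GC content above 50%, and an observed/expected CpG ratio above 0.6) follow the commonly used Gardiner-Garden and Frommer criteria, which are an assumption added here and are not specified in this review. The function names are hypothetical.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5,
                          min_obs_exp: float = 0.6) -> bool:
    """Apply the assumed length, GC-content, and obs/exp thresholds."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp
```

In practice such a test would be applied in a sliding window along a promoter sequence; the whole-sequence version is kept here for brevity.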
The pattern of DNA methylation is dynamic during development but becomes static in differentiated cells. This unique epigenetic code is heritable and thus, a mechanism for regulation of methylome is required. Currently, three DNA methyltransferases have been identified, namely, DNMT1, DNMT3A, and DNMT3B. These developmentally regulated genes play critical roles in the establishment and maintenance of DNA methylation.
DNMT1 is responsible for the maintenance of cytosine methylation. The epigenetic “code” is heritable. Methylation of cytosine is passed from parental cells to daughter cells if epigenetic marks have been stably established. As DNA replicates, DNMT1 methylates the newly synthesized, hemimethylated DNA in cooperation with MECP2. MECP2 is a methyl-CpG-binding protein that recognizes methylated CpG sites and, when associated with DNMT1, forms a complex to copy the parental DNA methylation to the daughter DNA strands during cell division (Kimura and Shiota, 2003). The function of DNMT1 is far more complicated than just methylation maintenance. DNMT1 interacts with a variety of proteins, such as transcription factors (p53, STAT3, and HP1), histone modifiers (HDAC1, HDAC2), and ligands (DAXX), to specifically repress targeted genes (Robertson et al., 2000; Rountree et al., 2000; Muromoto et al., 2004; Esteve et al., 2005; Zhang et al., 2005a; Smallwood et al., 2007). Furthermore, DNA methyltransferases (DNMT1, DNMT3A, and DNMT3B) interact with the polycomb group (PcG) protein EZH2 to methylate EZH2-binding promoters, suggesting that the two major epigenetic repression systems are closely connected (Vire et al., 2006). Mutation of Dnmt1 in murine embryonic stem (ES) cells causes a reduction of two thirds of the cytosine methylation in the genome and demethylation of endogenous retroviral DNA. Germ-line mutation of Dnmt1 causes abnormal development and embryonic lethality (Li et al., 1992).
DNA methyltransferase-3 proteins are implicated in de novo methylation of CpG islands. DNMT3A is involved in parental imprinting. Imprinted genes are methylated exclusively on one parental allele, and are, therefore, monoallelically expressed. Knockout of Dnmt3a or Dnmt3b in mice blocks de novo methylation and leads to lethality (Okano et al., 1999). However, conditional knockout of Dnmt3a in male germ cells causes impaired spermatogenesis and loss of paternal imprinting. Offspring of Dnmt3a conditional knockout females die in utero due to the lack of maternal imprinting on Peg3 and Snrpn. In contrast, Dnmt3b conditional mutants and their offspring show no apparent phenotype (Kaneda et al., 2004).
Unlike DNMT3A and DNMT3B, DNMT3L does not show any methyltransferase activity. It is a cofactor that enhances the de novo methylation activity of DNMT3A (Chedin et al., 2002). Disruption of Dnmt3L in mice results in failure to establish maternal methylation imprints, indicating that this cofactor is as important as Dnmt3a and Dnmt3b in the acquisition of imprinting (Bourc’his et al., 2001).
DNA methylation is an important regulator in many biological processes. In mammals, DNA methylation is essential for normal development, and defects in methylation cause disease.
The mechanism of gene regulation in eukaryotic cells is more complicated than that in prokaryotic cells. Histone proteins provide an additional layer of gene regulation through epigenetic marks on histones or DNA. Double-stranded DNA wraps around histone proteins to form chromatin. The state of chromatin can be either “active” or “silent,” depending on the interaction between transcription factors and the cis-acting elements (promoters or enhancers) of the genes. It is well known that hypermethylated promoters are usually associated with gene repression. Inhibition of de novo methylation with methyltransferase inhibitors such as 5-azacytidine and 5-aza-2′-deoxycytidine can restore the expression of methylation silenced genes (del Senno et al., 1986). The mechanism by which the gain of methyl groups in CpG sites shuts down gene expression is not clear. The first proposed model for this mechanism is that methyl groups in promoters provide a physical barrier to accessibility by transcription factors. Many transcription factors, such as AP-2, c-myc, CREB/ATF, E2F, MLTF/USF, and NF-kB, are known to bind promoters with unmethylated CpG dinucleotides, but fail to bind methylated CpG sequences. However, transcription factors such as CTF and Sp1 are insensitive to methyl-CpG, suggesting that DNA methylation only affects the transcription of a subset of methylated genes (Tate and Bird, 1993). The second model for methylation mediated gene repression involves a family of methyl-binding proteins. For instance, complexes of methyl-CpG-binding protein-1 (MECP1) and protein-2 (MECP2) preferentially bind methylated CpG sites and inhibit transcription (Boyes and Bird, 1991; Nan et al., 1993). These complexes contain several methyl-CpG-binding domain (MBD) proteins (MBD1, MBD2, MBD3, MBD4, and Kaiso) that bind to methylated CpG sites to suppress transcription initiation. 
Binding of MECP complexes to methylated promoters either prohibits the access of transcription factors, or recruits histone deacetylase, another repressive epigenetic modification enzyme, to achieve gene silencing (Ng et al., 1999).
In addition, another mode of transcription regulation involves the binding of the CTCF protein to Imprint Control Regions (ICR) of imprinted genes. The role of CTCF protein in the regulation of monoallelic expression of H19/Igf2 locus has been well studied. In this model, the ICR is located between the Igf2 and H19 genes. The paternally methylated ICR prevents binding of CTCF protein to the insulator sequence, and therefore, permits the downstream enhancer to activate Igf2 expression but suppress the expression of H19 (Bell and Felsenfeld, 2000; Hark et al., 2000). The binding of CTCF protein is controlled by the methylation of the ICR. This is another illustration of DNA methylation-mediated gene regulation.
The pattern of DNA methylation in somatic cells changes during embryonic development until they fully differentiate and gain tissue-specific methylation. In germ cells, differential methylation between the male and female genomes occurs at different stages of development.
In mammals, there are two waves of global demethylation during development. Soon after fertilization, the highly methylated gametic genomes are demethylated, a process called reprogramming. However, demethylation is not synchronized between the male and female genomes. In the zygote, the highly methylated male genome is rapidly demethylated only hours after fertilization, before the first round of DNA replication commences (Mayer et al., 2000; Oswald et al., 2000). Reprogramming of the male genome is believed to be an active process that involves the demethylation of DNA and remodeling of sperm chromatin where the sperm-specific protamines are replaced by acetylated histones. Demethylation of the maternal genome is thought to be a passive process in which DNA replication dilutes the methylome in the absence of nuclear Dnmt1. Both parental genomes gain methylation during implantation, possibly with the participation of Dnmt3a and Dnmt3b. It should be noted that imprinted genes are protected from the first wave of global demethylation. The protection of imprinted genes from demethylation in the zygote ensures proper monoallelic expression of imprinted genes, many of which are important in early embryogenesis. The second wave of global demethylation occurs in primordial germ cells (PGC) prior to gametogenesis. Between 10.5 and 11.5 days post coitum (dpc), murine PGCs migrate to the genital ridge where they differentiate into gonocytes. A rapid and active erasure of DNA methylation of regions within imprinted loci commences between 10.5 and 13.5 dpc in both male and female embryos (Hajkova et al., 2002). During this period, imprinted genes such as H19 are demethylated in their differentially methylated region (DMR) (Hajkova et al., 2002; Sato et al., 2003). Acquisition of methylation in imprinted regions begins before birth, at 13.5 dpc, and continues after birth. The timing of re-establishment of different imprinted genes in the two sexes is different.
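The passive, replication-dependent dilution described for the maternal genome can be illustrated with a toy calculation. The sketch below assumes a deliberately simple model that is not taken from the cited studies: at each round of replication every parental strand is paired with a new strand, and the new strand is methylated with probability `maintenance_efficiency` (a hypothetical parameter standing in for Dnmt1 activity) when its template is methylated. Under that assumption the methylated fraction of strands is multiplied by (1 + e)/2 per round.

```python
def methylated_strand_fraction(rounds: int,
                               maintenance_efficiency: float = 0.0,
                               initial_fraction: float = 1.0) -> float:
    """Fraction of DNA strands methylated at a site after `rounds` replications.

    maintenance_efficiency = 1.0 models perfect maintenance (no loss);
    maintenance_efficiency = 0.0 models pure passive dilution (halving per round).
    """
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    per_round = (1.0 + maintenance_efficiency) / 2.0
    return initial_fraction * per_round ** rounds


# Pure passive dilution over the first three cleavage divisions.
for n in range(4):
    print(n, methylated_strand_fraction(n))
```

With no maintenance activity the methylated fraction halves at each division, whereas full maintenance keeps it constant; real embryos need not follow either extreme.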
Although several methyltransferases have been found to be responsible for maintenance (Dnmt1) and establishment (Dnmt3a, Dnmt3b, and Dnmt3L) of methylation, rapid demethylation of the zygote after fertilization and erasure of the methylated imprinted regions in PGCs suggest that there exists a temporally controlled demethylase for this active process. However, the existence of DNA demethylases is still controversial, although MBD2 is proposed to be a demethylase, in addition to functioning as a methyl-CpG-binding protein (Ng et al., 1999; Detich et al., 2002).
The genome is subject to a series of genetic and/or epigenetic alterations when normal cells undergo neoplastic transformation. This can be caused by prolonged exposure to carcinogens, viral infection, imbalance of hormones, spontaneous mutation of tumor suppressor genes, or any disruption in the epigenome that favors the growth of tumor cells. Tumor cells gain a survival advantage as their proliferation rate overcomes apoptosis. These cells become malignant if they acquire the capability to invade adjacent tissues or further migrate to distant organs. Studies of the cancer genome reveal different molecular mechanisms that lead to tumorigenesis. These include the gain or loss of genetic material (copy number variation), mutation of genes, or disruption of the epigenome that alters gene activity without changing the DNA sequence. Usually, cancers are formed as a consequence of multiple effects. Many cancers are found to be associated with changes in the epigenome that dysregulate the normal transcriptome. Aberrant DNA methylation is frequently observed and considered to be a hallmark of cancers. Disruption of methylation can be global or localized. Global hypomethylation in repetitive DNA sequences destabilizes the chromosomes and increases the rate of genomic rearrangement. Alternatively, hypermethylation in CpG islands of tumor suppressor genes prevents these genes from inhibiting tumorigenesis.
Hypermethylation is more frequently reported than hypomethylation in cancers. Promoter-associated CpG islands play an important role in the regulation of gene transcription. In normal somatic cells, most CpG islands are unmethylated. However, acquisition of methylation in some CpG islands is observed in almost all types of primary tumors as compared to their normal counterparts. The mechanism of cancer hypermethylation is not fully understood. Several studies have shown that this might involve the interaction of the maintenance methyltransferase DNMT1 with other DNA binding proteins. For example, DNMT1 forms a complex with Rb, E2F1, and HDAC1 to repress transcription from promoters containing E2F-binding sites in cancer cells (Robertson et al., 2000). Moreover, DNMT1 interacts with p53 to repress the p53 responsive genes Survivin and Cdc25C (Esteve et al., 2005). Since DNMT1 shows low sequence specificity, targeted methylation is possibly achieved through interaction between DNA binding proteins (which bind to DNA with a particular consensus sequence) and DNMT1, and probably other histone modifiers, such as HDAC.
Numerous reports show that DNA hypermethylation can occur in many genes involved in different biochemical pathways that are related to tumor development or progression. Table 1 summarizes the most frequently reported genes that are silenced by DNA methylation; many of them demonstrate hypermethylation in CpG islands. These genes regulate a number of cellular processes including cell cycle (CDKN2A/p16-INK4, CDKN2B/p15-INK4B, CCND2, RB1), DNA repair (MGMT, BRCA1, MLH1), apoptosis (DAPK, TMS1, TP73), metastasis (CDH1, CDH13, PCDH10), detoxification (GSTP1), hormone response (ESR1, ESR2), Ras signaling (RASSF1), and Wnt signaling (APC, DKK1). Hypermethylation of some genes, such as CDKN2A/p16-INK4, RASSF1, and MGMT, is frequently observed in multiple cancer types whereas hypermethylation of others appears to be limited to a particular cancer type. These genes include BEX1 and BEX2 in glioma (Foltz et al., 2006), PPP1R13B in acute leukemia (Roman-Gomez et al., 2005b), and PRSS21 in testicular germ cell tumors (Kempkensteffen et al., 2006). Certain cancer types appear to be more vulnerable to epigenetic disruptions. According to the cancer methylation database PubMeth, the most often reported cancers associated with DNA hypermethylation are lung, gastric, colorectal, leukemia, brain, liver, breast, and prostate (Ongenaert et al., 2008). However, the prevalence of reports on hypermethylation in these major cancers does not indicate the infrequency of methylation disruption in other cancer types. Like other major tumors, rare malignant tumors, such as testicular germ cell tumors, have been known to be epigenetically changed, although many of the disrupted genes reflect the origin of the tumors (Lind et al., 2007).
Defect of cell cycle control is one of the characteristics of cancer cells. This explains why suppression of genes involved in cell cycle control is common to many types of tumor. RASSF1 is a tumor suppressor gene known to inhibit cell proliferation by negatively regulating cell cycle progression at G1/S phase transition through inhibiting accumulation of cyclin D1 (Shivakumar et al., 2002). Hypermethylation of RASSF1 is prevalent in a wide variety of cancers, probably reflecting the intrinsic factors common to tumorigenesis (Yu et al., 2003). Aberrant methylation is also found in genes of signaling pathways. Hypermethylation of SOCS-1, for example, leads to the activation of the STAT3 pathways in head and neck squamous cell carcinomas (Lee et al., 2006b).
Cancer cells usually acquire aberrant methylation of multiple tumor-related genes that cooperate to confer survival advantage to neoplastic cells (Leung et al., 2001; Lee et al., 2002). Clinical studies must include a statistically significant sample size to reveal the frequency of aberrant methylation. A considerable variation of the frequency for a certain tumor suppressor gene is observed in different types of cancers, probably due to the different grades of cancers and different sample sizes.
The human cancer genome was first found to be hypomethylated in 1983 (Feinberg and Vogelstein, 1983). Global hypomethylation and the resulting genomic instability are regarded as hallmarks of cancers today. It is generally thought that global hypomethylation occurs early in tumorigenesis and predisposes cells to genomic instability and further genetic changes. Gene specific demethylation appears at a later stage. This allows tumor cells to adapt to their local environment and promote metastasis (Robertson, 2005). Hypomethylation has also been found to be correlated with tumor progression and cancer metastasis (Widschwendter et al., 2004a).
In contrast to hypermethylation that leads to gene silencing, hypomethylation of genes is usually accompanied with reactivation of transcription. In cancers, hypomethylation is often associated with oncogenes. c-Myc, a transcription factor that acts as an oncogene, is one of the widely reported hypomethylated genes in cancers. Hypomethylation of c-Myc was first found in cultured cell lines in 1984 (Cheah et al., 1984), and subsequently identified in other cancers, such as hepatocellular carcinoma (Kaneko et al., 1985; Nambu et al., 1987), leukemia (Tsukamoto et al., 1992), and gastric carcinoma (Fang et al., 1996). Its methylation is also known to be associated with bladder and colorectal cancer progression (Del Senno et al., 1989; Sharrard et al., 1992). The cancer-testis gene MAGE (melanoma antigen) is normally expressed in germ cells only, but reactivated in various tumor types. Reactivation by demethylation was observed during gastric cancer progression (Honda et al., 2004). Promoter hypomethylation and reactivation of MAGE-A1 and MAGE-A3 was also observed in colorectal cancer cell lines and cancer tissues (Kim et al., 2006). Moreover, hypomethylation of P-cadherin (CDH3) was found in colorectal carcinogenesis (Milicic et al., 2008), as well as in invasive breast carcinomas (Paredes et al., 2005). c-Ha-Ras is another hypomethylated oncogene involved in signal transduction by activating several cascades of kinases which lead to growth, differentiation, apoptosis, or senescence. Hypomethylation of c-Ha-Ras was reported in gastric carcinoma (Fang et al., 1996). DNA hypomethylation of the oncogene synuclein γ (SNCG) causes it to be over-expressed in breast and ovarian cancers (Gupta et al., 2003), gastric cancer (Yanagawa et al., 2004), and liver cancer (Zhao et al., 2006).
In addition, many other genes were found to be hypomethylated and reactivated in cancers, although their role in oncogenesis needs to be confirmed. These include PSG in testicular germ cell cancer (Cheung et al., unpublished observations), WNT5A, CRIP1, and S100P in prostate cancer (Wang et al., 2007), L1 cell adhesion molecule (L1CAM) in colorectal cancer (Kato et al., 2009), and the cancer/testis antigen gene, XAGE-1, in gastric cancers (Lim et al., 2005).
Although global hypomethylation was found in a wide variety of tumors, the role of hypomethylation is not fully understood. It is unclear whether hypomethylation is the consequence of tumor transformation or the cause of tumorigenesis. This question could possibly be answered by genetic deletion of Dnmt1, the only known methyltransferase for methylation maintenance. However, since homozygous Dnmt1 knockout mice die during gestation (Lei et al., 1996), a modified animal model is needed for studying hypomethylation in vivo. In one study, a hypomorphic allele of Dnmt1 was combined with a null allele to generate compound heterozygous mice in which the endogenous Dnmt1 level was reduced to 10%. Cells of the heterozygotes displayed genome-wide hypomethylation in all tissues. The mice developed T cell lymphomas and had a high frequency of chromosome 15 trisomy (Gaudet et al., 2003). These experiments suggest that DNA hypomethylation plays a crucial role in tumor development by promoting chromosomal instability.
Pericentromeric heterochromatin contains tightly packed repetitive DNA sequences (LINE, SINE, IAP, and Alu elements). In normal cells, heterochromatin is highly methylated and epigenetically silenced to reduce transcriptional noise. In cancers, global demethylation is commonly observed. Methylation of LINE-1 (long interspersed nuclear element-1) helps to maintain genomic stability and integrity. Loss of methylation increases genomic instability and results in a higher chance of mitotic recombination, both of which are frequently observed in tumor development.
Global hypomethylation of LINE-1 is widely reported in different cancer types, including colorectal cancer (Estecio et al., 2007; Ogino et al., 2008), urothelial carcinoma (Jurgens et al., 1996), malignant germ cell tumors (Alves et al., 1996), ovarian cancer (Pattamadilok et al., 2008), cervical cancer (Shuangshoti et al., 2007), neuroendocrine tumors (Choi et al., 2007), prostate cancer (Cho et al., 2007), and chronic myeloid leukemia (Roman-Gomez et al., 2005a). In a study using pyrosequencing to determine the methylation status of LINE-1 and Alu sequences in 48 primary non-small cell carcinomas, hypomethylation of the retrotransposable elements was found to correlate with genomic instability (Daskalos et al., 2009). It was, therefore, proposed as a surrogate marker for cancer-linked genome demethylation (Ogino et al., 2008).
Our genome-wide DNA methylation analysis in cancer cells (Cheung et al., in press) revealed several thousand DMRs. However, fewer than 3% of the DMRs map to gene promoters. The majority of DMRs are located in intergenic regions or introns. It is still unclear why the cancer genome displays differential methylation in these non-regulatory regions. One possible function of intergenic and intronic DMRs is to regulate the expression of non-coding RNAs (ncRNA). Many ncRNAs, such as miRNAs and snoRNAs, are located in intergenic or intronic regions. Some are expressed through the action of independent promoters whereas others might be the splicing products of the host mRNAs (for intronic ncRNAs). It is estimated that half of the miRNAs are associated with CpG islands (Weber et al., 2007a). Several studies have attempted to reveal the role of DNA methylation in the regulation of miRNAs (Saito et al., 2006; Datta et al., 2008; Lujambio et al., 2008). Demethylation of cancer cell lines by 5-aza-2′-deoxycytidine restored expression of these miRNAs, indicating that, like many tumor suppressor genes, miRNAs are a class of ncRNAs that is epigenetically disrupted. In our study (Cheung et al., in press), miR-199a and miR-184 were reactivated by 5-aza-2′-deoxycytidine treatment of embryonal carcinoma cells. These miRNAs are hypermethylated in intronic and intergenic regions, respectively. In another study, miR-148a, miR-34b/c, and miR-9 were found to be silenced by DNA methylation. These epigenetically regulated miRNAs act as tumor suppressors that contribute to suppression of cancer development and metastasis (Lujambio et al., 2008). Other hypermethylated miRNAs in cancers include miR-127 as a negative regulator of the proto-oncogene BCL6 (Saito et al., 2006), miR-124 as a negative regulator of CDK6 (Lujambio et al., 2007), and miR-1 in hepatocellular carcinogenesis (Datta et al., 2008). 
It is anticipated that more DNA methylation-regulated miRNAs will be identified by genome-wide analysis of cancer methylomes.
The majority of current evidence linking DNA methylation, transcriptional regulation, and disease is derived from cancer research. Significant changes in global DNA methylation have been observed in cultured cancer cells and primary human tumor tissues. These changes include global DNA hypomethylation of centromeric repeats and repetitive sequences, and gene-specific hypermethylation of CpG islands (Lister and Ecker, 2009). Over the last decade, the number of studies on the role of DNA methylation in cancer development has grown dramatically and “cancer epigenetics” is now the focus of many exciting and significant advances in cancer research. Diagnosis, prognosis, and therapeutic regimes based on DNA methylation are on the horizon. However, the understanding of the biological significance of aberrant DNA methylation in the cancer genome remains limited. This is largely due to the lack of high-throughput technologies and relevant genome information. In the past, DNA methylation analysis was usually gene-based using qualitative or quantitative polymerase chain reaction (PCR)-based methods. Examples include methylation specific PCR (MSP) (Licchesi and Herman, 2009), Combined Bisulfite Restriction Analysis (COBRA) (Xiong and Laird, 1997), Methylation Sensitive Single Nucleotide Primer Extension (Ms-SNuPE) (Gonzalgo and Jones, 2002), small scale bisulfite sequencing (Frommer et al., 1992), and Quantitative Methylation-Specific PCR (QMSP, also known as MethyLight) (Jeronimo et al., 2001). Each method has its advantages over the others (Table 2). Surveying whole-genome DNA methylation by these methods was costly and inefficient. In fact, only about 0.1% of the studies reported examined detailed DNA methylation in the genome (Schumacher et al., 2006).
With the completion of various genome projects and recent developments in high-throughput and whole-genome profiling techniques, large scale DNA methylation analysis has become feasible. Unlike whole genome transcriptome assays that are based on unified RNA sequence annotation, the design of whole genome methylome assays is more complicated due to the elusive and dynamic pattern of 5-methylcytosine in the genome. Such DNA modification, usually referred to as the “fifth base” (Bird, 1986), was not included in the original genome projects. Thus, there is no universal reference available for designing probes or assays to differentiate the fifth base from the unmethylated cytosine. Therefore, despite the wide availability of whole genome expression assays, identification of sites of DNA methylation throughout a genome was not possible until recently. The full extent of the effect of global DNA methylation on gene expression and chromatin structure remains largely unknown. The challenge has been overcome by the recent availability of highly specific antibodies, high density microarrays, and massively parallel sequencing technologies. These technologies enable global mapping of this epigenetic modification at a very high or even single base resolution, providing new insights into the regulation and dynamics of DNA methylation in genomes. A number of global methylation methods are available; they differ in resolution, in the features of DNA surveyed, and in whether they are qualitative or quantitative.
The procedure of whole genome DNA methylation profiling can be divided into two steps: the first step is to identify and enrich methylcytosines in the DNA sample (Fig. 1). Common methods include (1) restriction enzyme-based method; (2) chromatin immunoprecipitation (ChIP); and (3) bisulfite conversion. The second step involves capturing the enriched or chemically modified DNA by high-throughput and high resolution whole genome assays that use (1) high density tiling microarrays; or (2) massive parallel sequencing.
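Of the three enrichment approaches listed above, bisulfite conversion is the easiest to mimic in silico. The sketch below rests on the standard chemistry, which is not spelled out in this section: bisulfite treatment deaminates unmethylated cytosine to uracil, which is read as thymine after amplification, whereas 5-methylcytosine is protected and still reads as cytosine. The function names and the use of 0-based positions are illustrative assumptions, and only a single strand with complete conversion is modeled.

```python
def bisulfite_convert(seq: str, methylated_positions: set[int]) -> str:
    """Simulate complete bisulfite conversion of one DNA strand.

    Unmethylated C is read as T after conversion and amplification;
    C at a position listed in `methylated_positions` (0-based) stays C.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def infer_methylated_positions(reference: str, converted_read: str) -> set[int]:
    """Positions where a reference C is still read as C after conversion."""
    return {
        i
        for i, (ref, obs) in enumerate(zip(reference.upper(), converted_read.upper()))
        if ref == "C" and obs == "C"
    }


reference = "ACGTCG"
read = bisulfite_convert(reference, methylated_positions={1})
print(read)                                         # ACGTTG
print(infer_methylated_positions(reference, read))  # {1}
```

Comparing the converted read against the unconverted reference is what allows methylation to be called at single-base resolution once the converted DNA is sequenced or hybridized.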
Digestion with methylation-sensitive restriction enzyme followed by Southern blot analysis was used to examine the overall methylation status of CpG islands (Reilly et al., 1982). However, this approach does not provide information of methylcytosine in a specific sequence context. This approach is further hampered by the efficiency of restriction enzyme digestion and the amount of input DNA (>5 μg) required. Replacing Southern blot analysis with PCR in the subsequent modification (e.g., COBRA) allows the application in small scale DNA methylation analysis. The restriction enzyme-based method can also be combined with other experimental approaches to gain global methylation information, including Restriction Landmark Genomic Scanning (RLGS) (Akama et al., 1997), array-based Differential Methylation Hybridization (DMH)/Array-PRIMES (Huang et al., 1999), and HpaII tiny fragment Enrichment by Ligation-mediated PCR (HELP) (Khulan et al., 2006).
Restriction landmark genomic scanning is a two-dimensional gel electrophoresis approach based on the use of methylation-sensitive restriction enzymes (e.g., NotI). Up to 2000 end-labeled landmark sites can be displayed in a single RLGS experiment. The labeling of the sites is based on the incorporation of radionucleotides into the restriction site by DNA polymerase. Methylated sites are not digested and are not labeled, and thus, do not contribute to the two-dimensional pattern of RLGS fragments. Spots present in a normal profile but absent in a tumor profile represent methylation of the landmark site. It allows quantitative global DNA methylation analysis in the context of CpG islands. This approach provides a platform for the simultaneous assessment of over 2000 CpG islands (Hatada et al., 1991; Okazaki et al., 1995).
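The comparison of normal and tumor RLGS profiles described above amounts to looking for landmark spots that are lost or markedly reduced in the tumor. The following sketch assumes, purely for illustration, that each profile has already been reduced to a mapping from a spot identifier to a normalized intensity; the `loss_ratio` cutoff of 0.5 is a hypothetical value, not one taken from the RLGS literature.

```python
def candidate_methylated_spots(normal: dict[str, float],
                               tumor: dict[str, float],
                               loss_ratio: float = 0.5) -> list[str]:
    """Return spots present in the normal profile but lost or reduced in the tumor.

    A spot missing from `tumor` is treated as intensity 0. A spot is reported
    when its tumor intensity is at most `loss_ratio` times its normal intensity.
    """
    hits = []
    for spot, normal_intensity in normal.items():
        if normal_intensity <= 0:
            continue  # nothing to lose if the spot is absent in normal tissue
        tumor_intensity = tumor.get(spot, 0.0)
        if tumor_intensity <= loss_ratio * normal_intensity:
            hits.append(spot)
    return sorted(hits)


normal_profile = {"spot_a": 1.0, "spot_b": 2.0, "spot_c": 1.0}
tumor_profile = {"spot_a": 1.0, "spot_b": 0.4}
print(candidate_methylated_spots(normal_profile, tumor_profile))
# ['spot_b', 'spot_c']
```

Spots flagged this way are only candidates for methylation, because loss of a spot can also result from a sequence polymorphism or a genomic deletion at the landmark site.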
The main strength of RLGS resides in its unbiased approach towards the analysis of CpG islands irrespective of their association with known genes, thus providing a unique tool for the discovery of novel hypermethylated sequences in mammalian genomes. In addition, it can be applied to any genome without prior knowledge of DNA sequence. RLGS has been used in the identification of novel imprinted genes and genes frequently hypermethylated (Kuromitsu et al., 1995; Costello et al., 2000; Fruhwald et al., 2001; Blanchard et al., 2003; Dai et al., 2003; Motiwala et al., 2003; Smiraglia et al., 2003; Song et al., 2005; Wang et al., 2008; Yamagata et al., 2009), and genomic hypomethylation (Konishi et al., 1996; Nagai et al., 1999; Morey et al., 2006) and methylation of 3′ untranslated regions (Smith et al., 2007) in several types of cancers.
Despite its power in the systematic detection of epigenetic alterations due to DNA methylation, the identification of polymorphic sites is difficult with RLGS because the resulting spots contain very little target DNA and many unlabeled DNA fragments. Another major limitation of RLGS is that methylation can only be assessed in CpG islands, which contain the sequence for the methylation-sensitive enzyme used in the assay. Sequence polymorphisms in any of the enzyme recognition sequences needed for RLGS or genomic deletions result in the effective loss of signal, which could be incorrectly interpreted as DNA methylation. Finally, the assay requires relatively large amounts of high molecular weight genomic DNA (>1 μg), which makes this approach unsuitable for the analysis of samples when the amount of DNA recovered is low or when the DNA is highly fragmented.
To overcome the limitations of RLGS, imaging and simulation software was developed for better spot identification. For example, virtual image restriction landmark genomic scanning (Vi-RLGS) was developed to compare actual RLGS patterns with computer-simulated RLGS patterns (Koike et al., 2008). The new vRLGS system is highly robust for the identification of novel RLGS spots. Assignment of specific genomic sequences to RLGS spots is also improved with second-generation virtual RLGS (Smiraglia et al., 2007).
Global changes in DNA methylation at the CpG island level can also be studied by combining restriction enzyme digestion with CpG island microarrays. Differential methylation hybridization (DMH) was the first successful attempt to build an array-based DNA methylation assay. It uses a methylation-insensitive restriction enzyme (MseI) to digest genomic DNA, followed by ligation with DNA linkers. The ligation product is then digested with the methylation-sensitive restriction enzymes HpaII and BstUI. The product of the second round of enzyme digestion is amplified by PCR using primers complementary to the linker sequence. The PCR products are then labeled with fluorescent dyes (Cy3 or Cy5) and hybridized to a CpG island microarray. As with other restriction enzyme-based methods, the specificity of DMH depends on the efficient digestion of genomic DNA by methylation-sensitive restriction enzymes. Incomplete digestion could lead to false-positive results. The technique has been used successfully to identify epigenetic alterations in cancers, including breast (Huang et al., 1999; Yan et al., 2000; Fan et al., 2006; Yan et al., 2006), ovary (Balch et al., 2005), colon (Paz et al., 2003), and brain (Felsberg et al., 2006; Waha et al., 2007; Vladimirova et al., 2009).
DMH has also been adopted commercially. Epigenomics (http://www.epigenomics.com) has partnered with Affymetrix to develop a new DMH microarray for highly efficient methylation profiling of human samples.
The HELP (HpaII tiny fragment Enrichment by Ligation-mediated PCR) assay interrogates cytosine methylation status on a genomic scale (Khulan et al., 2006; Oda and Greally, 2009). In this assay, two restriction enzymes (HpaII and MspI) are used. HpaII only cleaves sites where the cytosine in the CpG is not methylated, whereas its isoschizomer MspI cleaves the same sites regardless of methylation at the CpG and therefore serves as a reference. The DNA fragments resulting from digestion with each of these enzymes are separately amplified by PCR and labeled with different fluorescent dyes. The particular PCR process used in the HELP assay produces DNA fragments of 200 bp to 2000 bp, known as HTFs (HpaII Tiny Fragments). Comparing the quantity of HTFs derived from MspI and HpaII treatment reveals the methylation state of the different genomic sites. The relative amounts of MspI and HpaII fragments are compared by hybridization to a tiling microarray. Besides CpG island methylation, the assay also provides insights into the distribution of cytosine methylation in other genomic regions.
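The comparison of HpaII and MspI signals can be illustrated with a simple log ratio. The sketch below is an assumption-laden illustration rather than the published HELP analysis pipeline: the function names, the pseudocount, the cutoff of -1.0, and the example intensities are all hypothetical. It relies only on the principle stated above, namely that a locus yielding much less HpaII signal than MspI signal is likely to be methylated.

```python
import math


def help_log_ratio(hpaii_signal, mspi_signal, pseudocount=1.0):
    """Return log2(HpaII / MspI); the pseudocount avoids division by zero."""
    return math.log2((hpaii_signal + pseudocount) / (mspi_signal + pseudocount))


def classify_loci(signals, threshold=-1.0):
    """Label each locus from its HpaII/MspI log ratio.

    `signals` maps a locus name to a (HpaII, MspI) intensity pair.
    The threshold is an illustrative assumption, not a validated cutoff.
    """
    calls = {}
    for locus, (hpaii, mspi) in signals.items():
        ratio = help_log_ratio(hpaii, mspi)
        calls[locus] = "methylated" if ratio < threshold else "hypomethylated"
    return calls


# Hypothetical intensities: locus_b gives little HpaII signal relative to MspI.
example = {"locus_a": (1000.0, 1000.0), "locus_b": (50.0, 1000.0)}
print(classify_loci(example))
```

A real analysis would also need to normalize the two channels and account for fragment-size effects before any cutoff is applied.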
Chromatin immunoprecipitation (ChIP) is an approach that allows one to investigate interactions between proteins and DNA. It was first applied in studying the regulation of Hsp70 genes in Drosophila (Solomon et al., 1988). The technique has also been applied extensively in cancer research (Ren and Dynlacht, 2004; Wang, 2005; Neff and Armstrong, 2009). The procedure involves cross-linking of chromatin protein-DNA complexes with formaldehyde and generation of short random chromatin fragments by sonication. Using antibodies directed against the protein of interest, cross-linked chromatin fragments are immunoprecipitated. The isolated antibody-chromatin complexes and the input (non-immunoprecipitated) material are treated to reverse the cross-links, and the DNA is purified. Both control and immunoprecipitated samples are amplified by quantitative PCR using primers specific for the genomic region of interest. With different antibody combinations, ChIP allows profiling of chromatin-associated factors, histone modifications, and histone variants, as well as local nucleosome density. When ChIP is combined with DNA microarray technology (ChIP-chip), it can be applied to the identification of DNA binding sites for transcription factors (Rodriguez and Huang, 2005; Wu et al., 2006; Jiang and Pugh, 2009). Combining ChIP with genomic tiling array hybridization or massively parallel sequencing (ChIP-seq) allows whole-genome studies, including global methylome analysis.
Although RLGS has proven useful in identifying differentially methylated regions in a variety of tumors, it is limited to detecting methyl groups at defined restriction sites, and the data obtained are limited by the frequency of the restriction enzyme recognition sequence (Smiraglia and Plass, 2002). ChIP-chip provides an alternative to RLGS. Methylated DNA immunoprecipitation (MeDIP/mDIP) (Weber et al., 2005; Keshet et al., 2006; Mohn et al., 2009; Sorensen and Collas, 2009; Thu et al., 2009) is a ChIP-chip-based method that uses an antibody against 5-methylcytosine to capture methylated DNA fragments. Enriched fragments are then detected by hybridization to genomic tiling microarrays. The method is suitable for unbiased interrogation of whole-genome methylation and can uncover methylated regions outside CpG islands. Using the mDIP approach, Weber et al. (2005) showed that only a small set of promoters is methylated differentially, suggesting that aberrant methylation of CpG island promoters in malignancy may be less frequent than previously speculated. A follow-up study also demonstrated that CG-depleted regions are strikingly hypomethylated, manifesting a degree of change greater than that at the CpG islands tested in the same experiment (Weber et al., 2007b).
The amount of starting material is critical for successful microarray hybridization in MeDIP/mDIP experiments. The number is highly variable depending on the quality and specificity of the antibody, binding frequency of protein to DNA, and sonication control. The removal of repetitive DNA elements with high methylation content (e.g., ALU, satellites) in the microarrays also helps to reduce methylation signal noise.
ChIP-seq is an alternative method for reading ChIP results by using high-throughput sequencing technologies (Barski and Zhao, 2009; Hoffman and Jones, 2009; Neff and Armstrong, 2009). As in the MeDIP/mDIP procedure, the methylated DNA is immunoprecipitated with an antibody against 5-methylcytosine. The 5′ ends of the enriched DNA fragments are sequenced in parallel. Depending on the technology, the sequences are read in short or long fragments known as tags. The tags are assembled and mapped to the reference genome using alignment algorithms (Pettersson et al., 2009). ChIP-seq data provide single base resolution information on methylation, and the digital nature of sequencing data allows direct comparison between different ChIP-seq experiments. The drawbacks of the ChIP-seq approach include high cost, long experiment time, and extensive sequencing. Significant amounts of non-relevant methylation signal from repetitive DNA elements will also be included in the dataset.
Genomic DNA is treated with bisulfite to convert unmethylated cytosine to uracil. Methylated cytosine is not affected by this treatment. This procedure is sensitive and is independent of the presence or absence of a restriction enzyme recognition sequence. As with ChIP, the chemically modified DNA can be detected by microarrays containing bisulfite-modified targets (Zhou et al., 2006) or by direct sequencing (Cokus et al., 2008; Lister et al., 2008; Meissner et al., 2008). Unlike in classic whole-genome sequencing, the Watson and Crick strands of bisulfite-treated sequences are no longer complementary to each other, because bisulfite conversion occurs on cytosine only. As a result, there are four distinct strands after PCR amplification: BSW (bisulfite Watson), BSWR (reverse complement of BSW), BSC (bisulfite Crick), and BSCR (reverse complement of BSC). This increases the amount of work in the alignment step and requires an effective method for asymmetric C/T matching. Mapping millions of bisulfite reads to the reference genome remains a computational challenge.
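The origin of the four strands can be made concrete with a small in-silico conversion. The following is a minimal sketch, not the method of any particular alignment tool: the function names and the toy sequence are assumptions, uracil is written as T (as it is read after PCR), and methylated positions are given as 0-based indices on each strand written 5′ to 3′.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Convert every unmethylated C to T; methylated C is left unchanged."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def four_strands(watson, methylated_watson=frozenset(), methylated_crick=frozenset()):
    """Return the BSW, BSWR, BSC, and BSCR sequences for one genomic fragment."""
    crick = reverse_complement(watson)
    bsw = bisulfite_convert(watson, methylated_watson)
    bsc = bisulfite_convert(crick, methylated_crick)
    return {
        "BSW": bsw,
        "BSWR": reverse_complement(bsw),
        "BSC": bsc,
        "BSCR": reverse_complement(bsc),
    }


# Toy fragment with no methylation: the converted strands are no longer complementary.
strands = four_strands("ACGTCG")
for name, seq in strands.items():
    print(name, seq)
```

Because BSW and BSC are converted independently, the reverse complement of BSC no longer equals BSW, which is why an aligner has to consider all four sequences, or an equivalent C/T-tolerant matching scheme.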
A microarray is a solid support on which DNA of known sequence is deposited. The DNA may take the form of oligonucleotides, cDNA, or clones, and acts as probes to detect sequences present in the sample through hybridization. Depending on resolution, a whole-genome human microarray chip can contain more than 2 million probes. DNA microarrays were originally developed for high-throughput gene expression analysis. Their fast, comprehensive, and flexible nature makes them an indispensable tool in the post-genomic era.
Tiling microarrays are high-resolution microarrays made of probes ranging from 5 to 60 bp. In contrast to classical microarray design, where probes are biased towards annotated gene regions, the probe sequences in tiling microarrays tile along the genome without considering sequence features. This design allows unbiased interrogation of the whole genome. Transcriptome analysis by tiling arrays has revealed that a large portion of the unannotated genome is actively transcribed (Johnson et al., 2005; Willingham and Gingeras, 2006). Tiling arrays are also useful in splice variant analysis and the detailed examination of gene structure (Finocchiaro et al., 2007); this research has challenged our notion of what defines a gene. Commercial tiling microarray platforms are available through Affymetrix and NimbleGen, as the tiling microarray 1.0/2.0 series or the HD1/2 series, respectively.
The capillary sequencer was the main workhorse of the Human Genome Project. Unlike the original method invented by Frederick Sanger in the 1970s, it does not require radioactive labeling or polyacrylamide gel electrophoresis (Sanger et al., 1977; Sanger et al., 1992). However, it is still cumbersome and slow, with a relatively high running cost ($0.10/1000 bases). This situation changed in 2005 with the introduction of the 454 sequencer, followed by other new players such as Illumina and SOLiD (Sequencing by Oligonucleotide Ligation and Detection). These sequencing technologies are referred to as “next-gen” sequencing (Table 3) (Morozova and Marra, 2008).
Founded by Jonathan Rothberg, the technology of 454 sequencing (http://www.454.com) was developed by 454 Life Sciences, a Roche company. The method relies on tiny resin beads to anchor the DNA fragments, which are amplified and denatured to single stranded form. The beads are then put into wells on a plate along with enzyme beads. The polymerase and primer attach to the DNA fragment to initiate the sequencing reaction. As the nucleotides are incorporated into the DNA strand, light is given off. Light intensity is proportional to the number of A’s, T’s, C’s, or G’s incorporated. The latest 454 machine is able to read one gigabase of DNA sequence within days, at a cost of $0.02/1000 bases.
In 2006, Solexa debuted a new sequencing technology. Instead of using beads for DNA fragment capture, DNA fragments are amplified in dense clusters on a slide to provide stronger fluorescence signals. Fluorescence signals specific to A, T, C, and G are read as the bases are incorporated into the DNA fragment template in each cluster. The platform made its mark delivering the first African, Asian, and cancer patient genomes. It was acquired by Illumina (http://www.illumina.com) in 2006.
Applied Biosystems rolled out the SOLiD sequencing technology in 2007. Unlike the 454 and Illumina platforms, which rely on DNA polymerase to replicate new DNA strands one base at a time (sequencing by synthesis), SOLiD sequences by ligation, hybridizing a range of probes to the DNA template. The advantage of this sequencing method is that each base is read twice, which increases the confidence level in genome-wide SNP analysis.
Compared to 454, both SOLiD and Illumina sequence DNA around 20 times more cheaply, at about $0.001/1000 bases, and take just half a day to read one gigabase. They also have the advantage of being able to handle more samples simultaneously.
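The cost figures quoted in this section can be put on a common scale with a little arithmetic. The snippet below is only a back-of-the-envelope sketch using the per-1000-base prices given in the text; the dictionary and function names are illustrative, and the prices are approximate figures from the time of writing.

```python
# Running cost in US dollars per 1000 bases, as quoted in the text.
COST_PER_KB_USD = {
    "capillary": 0.10,
    "454": 0.02,
    "Illumina/SOLiD": 0.001,
}


def cost_per_gigabase(platform):
    """Scale the per-1000-base price to one gigabase (10**6 kilobases)."""
    return COST_PER_KB_USD[platform] * 1_000_000


def fold_cheaper(reference, platform):
    """Return how many times cheaper `platform` is than `reference`."""
    return COST_PER_KB_USD[reference] / COST_PER_KB_USD[platform]


for name in COST_PER_KB_USD:
    print(f"{name}: about ${cost_per_gigabase(name):,.0f} per gigabase")
print(f"Illumina/SOLiD vs 454: about {fold_cheaper('454', 'Illumina/SOLiD'):.0f}-fold cheaper")
```

On these figures, a gigabase costs roughly $100,000 by capillary sequencing, $20,000 on 454, and $1,000 on Illumina or SOLiD.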
Although the extent to which next-generation sequencing will be widely adopted remains uncertain because of expensive start-up costs, third-generation sequencers are approaching. Companies like Pacific Biosciences, Oxford Nanopore Technologies, and Complete Genomics are working towards the “$1000 genome” goal. As sequencing becomes more affordable, the question “The beginning of the end for microarrays?” has emerged (Shendure, 2008), because the digital nature of sequencing data allows direct comparison across different platforms without the normalization and detection bias associated with microarray platforms.
Epigenetic changes have been recognized as one of the most important molecular signatures of human tumors in recent years. Aberrant promoter hypermethylation is now considered to be a bona fide mechanism for transcriptional inactivation. Promoter hypermethylation at the CpG islands of certain tumor suppressor genes could lead to the disruption of multiple pathways. Increasing numbers of hypermethylated genes are reported to correlate with malignant potential and prognosis in cancer.
The development of DNA methylation markers for early cancer detection holds the promise of being accurate, sensitive, and cost-effective for risk assessment, early diagnosis, and prognosis. DNAs from body fluids, blood, serum, or tissue samples can be readily obtained by noninvasive or minimally invasive techniques (Chan et al., 2002; Lee et al., 2002; Cairns, 2007). A panel of markers can be applied to increase the sensitivity and provide a potentially powerful system of biomarkers for developing molecular detection strategies for virtually every form of human cancer. This non-invasive approach will promote epigenetics into one of the most exciting areas in cancer management and translational cancer research.
What makes DNA methylation even more exciting than traditional genetics is that this heritable change is reversible, unlike genetic alterations, which are almost impossible to revert. The epigenetic effect of DNA hypermethylation can be reversed by using demethylating agents such as the DNA methyltransferase (DNMT) inhibitors 5-azacitidine and 5-aza-2′-deoxycytidine. DNA demethylating agents could potentially be developed into standard regimens for cancer therapy. Drugs such as decitabine have shown promising results in clinical trials in solid and liquid tumors (Jabbour et al., 2008). 5-Azacitidine (5AC) and 5-aza-2′-deoxycytidine (DAC) have recently been approved for clinical use in the treatment of myelodysplastic syndrome of all types and chronic myelomonocytic leukemia, with response rates between 20 and 40% in patients for whom no previous standard of care was available (Griffiths and Gore, 2008). In addition, over-expression of both histone deacetylase (HDAC) and DNMT has been shown to be associated with epigenetic inactivation of tumor suppressor genes, as well as of cell cycle and apoptosis regulators. HDAC and DNMT inhibitors possess direct cytotoxic properties and can sensitize tumor cells to conventional radiotherapy and chemotherapy (Miremadi et al., 2007; Fandy, 2009). Preliminary clinical studies have found that the combination of DNMT and HDAC inhibitors led to complete or partial responses in patients with hematological malignancies (Schneider-Stock and Ocker, 2007; Fabre et al., 2008; Griffiths and Gore, 2008). However, because of the non-specific nature of nucleotide analogs, it is critical to monitor their effects in both tumor and normal tissues to ensure that no long-term damage is inflicted. Nevertheless, the use of these inhibitors will open up new and promising possibilities for cancer patient management and treatment.
Although increasing numbers of candidate genes affected by DNA methylation in cancer are being identified, numerous targets are still waiting to be discovered. Our understanding of the peculiarities of DNA methylation and its biological effects in the human cancer genome is still very limited. With the completion of the human genome sequence and the application of high-throughput techniques, various cancer methylomes can be expected to be unmasked in the near future. Emerging evidence from various methylome studies is striking. It suggests that the majority of differentially methylated regions (DMRs) are located either outside CpG islands or in genomic regions without annotations and gene evidence (Weber et al., 2005; Keshet et al., 2006; Ordway et al., 2007; Smith et al., 2007; Weber et al., 2007b). These observations imply that non-promoter, non-CpG island methylation could play an active role in epigenetic alteration. It is not clear whether DNA methylation changes in these intergenic regions have functional consequences in terms of gene expression. Nevertheless, the data will provide further clues for elucidating the molecular mechanisms of DNA methylation during neoplastic transformation.
With the success of the Human Genome Project (HGP), decoding the epigenome is on the horizon. The need to define the reference epigenomes has been recognized by various public and private organizations, including the Roadmap for Epigenomics (http://nihroadmap.nih.gov/epi-genomics/) by the National Institutes of Health (NIH) (Henikoff et al., 2008), the Alliance for the Human Epigenome And Disease (AHEAD), and the Human Epigenome Project (HEP) (http://www.epigenome.org/, a joint collaboration of The Wellcome Trust Sanger Institute, Epigenomics AG, and The Centre National de Génotypage) (Eckhardt et al., 2004). These projects will establish a basis for recognizing abnormal epigenetic patterns and impact the prevention and cure of human diseases. The collection of DNA methylation signature for each form of human cancer will be a reality in the near future.
Like other genome projects, a comprehensive epigenome project requires a strong bioinformatics foundation, as the volume and complexity of data defining multiple genome-wide epigenetic regulators in multiple cell types are overwhelming. To allow efficient handling and processing of “big data” (Doctorow, 2008; Goldston, 2008; Waldrop, 2008), it will require collaborative efforts to standardize the formats, programming language, and protocols in various databases (Lee, 2008). This interoperability effort will allow accommodating the data generated by these projects and other external biomedical resources, and will facilitate the subsequent integrative analysis of the epigenome as a system.
In conclusion, epigenetics opens a new avenue to answer vitally important basic science questions that will benefit the study of major human diseases. It is clear that epigenetics is rapidly moving to the forefront of biomedical research. If the $1000 personal genome becomes a reality in the near future, a personal epigenome will be an option. Such information will enable enhanced diagnosis and allow tailor-made treatment on an individual basis.
Grant sponsors: National Institutes of Health (NIH; Intramural Research Program), Eunice Kennedy Shriver National Institute of Child Health and Human Development
†This article is a US Government work and, as such, is in the public domain in the United States of America.
Hoi-Hung Cheung, Section on Developmental Genomics, Laboratory of Clinical Genomics, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland and from the School of Biomedical Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR, China.
Tin-Lap Lee, Section on Developmental Genomics, Laboratory of Clinical Genomics, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland.
Owen M. Rennert, Section on Developmental Genomics, Laboratory of Clinical Genomics, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland.
Wai-Yee Chan, Section on Developmental Genomics, Laboratory of Clinical Genomics, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland and from the School of Biomedical Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR, China.