DNA methylation is one of the most important epigenetic alterations involved in the control of gene expression. Bisulfite sequencing of genomic DNA is currently the only method to study DNA methylation patterns at single-nucleotide resolution. Hence, next-generation sequencing of bisulfite-converted DNA is the method of choice to investigate DNA methylation profiles at the genome-wide scale. Nevertheless, whole genome sequencing for analysis of human methylomes is expensive, and a method for targeted gene analysis would provide a good alternative in many cases where the primary interest is restricted to a set of genes.
Here, we report the successful use of a custom Agilent SureSelect Target Enrichment system for the hybrid capture of bisulfite-converted DNA. We prepared bisulfite-converted next-generation sequencing libraries, which are enriched for the coding and regulatory regions of 174 ADME genes (i.e. genes involved in the metabolism and distribution of drugs). Sequencing of these libraries on Illumina’s HiSeq2000 revealed that the method allows a reliable quantification of methylation levels of CpG sites in the selected genes, and validation of the method using pyrosequencing and the Illumina 450K methylation BeadChips revealed good concordance.
DNA methylation plays a key role in epigenetic regulation of eukaryotic genomes. Hence the genome-wide distribution of 5-methylcytosine, or the methylome, has been attracting intense attention. In recent years, whole-genome bisulfite sequencing (WGBS) has enabled methylome analysis at single-base resolution. However, WGBS typically requires microgram quantities of DNA as well as global PCR amplification, thereby precluding its application to samples of limited amounts. This is presumably because bisulfite treatment of adaptor-tagged templates, which is inherent to current WGBS methods, leads to substantial DNA fragmentation. To circumvent the bisulfite-induced loss of intact sequencing templates, we conceived an alternative method termed Post-Bisulfite Adaptor Tagging (PBAT) wherein bisulfite treatment precedes adaptor tagging by two rounds of random primer extension. The PBAT method can generate a substantial number of unamplified reads from as little as subnanogram quantities of DNA. It requires only 100 ng of DNA for amplification-free WGBS of mammalian genomes. Thus, the PBAT method will enable various novel applications that would not otherwise be possible, thereby contributing to the rapidly growing field of epigenomics.
DNA methylation is one of the most studied epigenetic marks in the human genome, and the desire to map the human methylome has driven the development of several methods to map DNA methylation on a genomic scale. Our study presents the first comparison of two of these techniques - the targeted approach of the Infinium HumanMethylation450 BeadChip® with the immunoprecipitation and sequencing-based method, MeDIP-seq. Both methods were initially validated with respect to bisulfite sequencing as the gold standard and then assessed in terms of coverage, resolution and accuracy. The regions of the methylome that can be assayed by both methods and those that can only be assayed by one method were determined and the discovery of differentially methylated regions (DMRs) by both techniques was examined. Our results indicate that the Infinium HumanMethylation450 BeadChip® and MeDIP-seq show a good positive correlation (Spearman correlation of 0.68) on a genome-wide scale and can both be used successfully to determine differentially methylated loci in RefSeq genes, CpG islands, shores and shelves. MeDIP-seq, however, allows a wider interrogation of methylated regions of the human genome, including thousands of non-RefSeq genes and repetitive elements, all of which may be of importance in disease. In our study MeDIP-seq allowed the detection of 15,709 differentially methylated regions, nearly twice as many as the array-based method (8070), which may result in a more comprehensive study of the methylome.
Recent progress in high-throughput technologies has greatly contributed to the development of DNA methylation profiling. Although several reports describe methylome detection by whole-genome bisulfite sequencing, the high cost and heavy demand on bioinformatics analysis prevent its extensive application. Thus, current strategies for the study of mammalian DNA methylomes are still based primarily on genome-wide methylated DNA enrichment combined with DNA microarray detection or sequencing. Methylated DNA enrichment is a key step in microarray-based genome-wide methylation profiling studies, and even in future high-throughput sequencing-based methylome analysis.
In order to evaluate the sensitivity and accuracy of methylated DNA enrichment, we investigated and optimized a number of important parameters to improve the performance of several enrichment assays, including differential methylation hybridization (DMH), microarray-based methylation assessment of single samples (MMASS), and methylated DNA immunoprecipitation (MeDIP). With advantages and disadvantages unique to each approach, we found that assays based on methylation-sensitive enzyme digestion and those based on immunoprecipitation detected different methylated DNA fragments, indicating that they are complementary in their relative ability to detect methylation differences.
Our study provides the first comprehensive evaluation of widely used methodologies for methylated DNA enrichment, and could be helpful for developing a cost-effective approach for DNA methylation profiling.
Genome-wide dynamic changes in DNA methylation are indispensable for germline development and genomic imprinting in mammals. Here, we report single-base resolution DNA methylome and transcriptome maps of mouse germ cells, generated using whole-genome shotgun bisulfite sequencing and cDNA sequencing (mRNA-seq). Oocyte genomes showed a significant positive correlation between mRNA transcript levels and methylation of the transcribed region. Sperm genomes had nearly complete coverage of methylation, except in the CpG-rich regions, and showed a significant negative correlation between gene expression and promoter methylation. Thus, these methylome maps revealed that oocytes and sperm differ widely in the extent and distribution of DNA methylation. Furthermore, a comparison of oocyte and sperm methylomes identified more than 1,600 CpG islands differentially methylated in oocytes and sperm (germline differentially methylated regions, gDMRs), in addition to the known imprinting control regions (ICRs). About half of these differentially methylated DNA sequences appear to be at least partially resistant to the global DNA demethylation that occurs during preimplantation development. In the absence of Dnmt3L, neither methylation of most oocyte-methylated gDMRs nor intragenic methylation was observed. There was also genome-wide hypomethylation, and partial methylation at particular retrotransposons, while maintaining global gene expression, in oocytes. Along with the identification of the many Dnmt3L-dependent gDMRs at intragenic regions, the present results suggest that oocyte methylation can be divided into 2 types: Dnmt3L-dependent methylation, which is required for maternal methylation imprinting, and Dnmt3L-independent methylation, which might be essential for endogenous retroviral DNA silencing. The present data provide entirely new perspectives on the evaluation of epigenetic markers in germline cells.
In mammals, germ-cell–specific methylation patterns and genomic imprints are established through large-scale de novo DNA methylation in oogenesis and spermatogenesis. These steps are required for normal germline differentiation and embryonic development; however, current DNA methylation analyses provide only a partial picture of the germ cell methylome. To the best of our knowledge, this is the first study to generate comprehensive maps of DNA methylomes and transcriptomes at single-base resolution for mouse germ cells. These methylome maps revealed genome-wide opposing DNA methylation patterns and differential correlation between methylation and gene expression levels in oocyte and sperm genomes. In addition, our results indicate the presence of 2 types of methylation patterns in the oocytes: (i) methylation across the transcribed regions, which might be required for the establishment of maternal methylation imprints and normal embryogenesis, and (ii) retroviral methylation, which might be essential for silencing of retrotransposons and normal oogenesis. We believe that an extension of this work would lead to a better understanding of epigenetic reprogramming in germline cells and of its role in gene regulation.
Alterations in DNA methylation have been reported to occur during development and aging; however, much remains to be learned regarding post-natal and age-associated epigenome dynamics, and few if any investigations have compared human methylome patterns on a whole genome basis in cells from newborns and adults. The aim of this study was to reveal genomic regions with distinct structure and sequence characteristics that render them subject to dynamic post-natal developmental remodeling or age-related dysregulation of epigenome structure. DNA samples derived from peripheral blood monocytes and in vitro differentiated dendritic cells were analyzed by methylated DNA Immunoprecipitation (MeDIP) or, for selected loci, bisulfite modification, followed by next generation sequencing. Regions of interest that emerged from the analysis included tandem or interspersed-tandem gene sequence repeats (PCDHG, FAM90A, HRNR, ECEL1P2), and genes with strong homology to other family members elsewhere in the genome (FZD1, FZD7 and FGF17). Our results raise the possibility that selected gene sequences with highly homologous copies may serve to facilitate, perhaps even provide a clock-like function for, developmental and age-related epigenome remodeling. If so, this would represent a fundamental feature of genome architecture in higher eukaryotic organisms.
DNA methylation is an important epigenetic mechanism for regulating the activity of the genome. Inter-individual differences in the epigenome, including the DNA methylome, are thought to account for the missing variance in disease susceptibility that has not been identified in Genome-Wide Association Studies (GWAS). Large-scale profiling of DNA methylation in population cohorts with sample sizes of thousands to tens of thousands is necessary to characterize the epigenetic component of disease susceptibility. Although whole genome bisulfite sequencing has been demonstrated in mammalian-size genomes, it is still too costly for a large sample size. In addition, only a very small fraction of CpG sites in the human genome is variable and carries information related to the epigenetic state of the cells, whereas the majority of CpG sites are static. For large-scale methylation sequencing projects, ideally the sequencing cost should be spent only on the informative sites in the genome.
We have previously developed a targeted bisulfite sequencing method based on bisulfite padlock probes (BSPP), which can quantify the absolute CpG methylation levels on an arbitrary set of genomic regions. In this talk I will present a second-generation BSPP method, which has a highly optimized protocol for production-scale methylation sequencing at a batch size of 96 samples. We also designed and optimized a set of ∼300,000 padlock probes targeting a set of carefully selected genomic regions (DMRs, enhancers, insulators, promoters, DNase I hypersensitive sites) throughout the entire human genome. A computational pipeline for probe design and efficient processing of methylation sequencing data was also developed. This method allows us to obtain highly accurate measurements of CpG and non-CpG methylation on >500,000 highly informative sites at the cost of <$250 per sample. Preliminary results on clinical samples processed with our second-generation BSPP method will be discussed.
The field of epigenetics is now capitalizing on the vast number of emerging technologies, largely based on second-generation sequencing, which interrogate DNA methylation status and histone modifications genome-wide. However, getting an exhaustive and unbiased view of a methylome at a reasonable cost is proving to be a significant challenge. In this article, we take a closer look at the impact of the DNA sequence and bias effects introduced to datasets by genome-wide DNA methylation technologies and, where possible, explore the bioinformatics tools that deconvolve them. There remains much to be learned about the performance of genome-wide technologies, the data we mine from these assays and how they reflect the actual biology. While there are several methods to interrogate the DNA methylation status genome-wide, our opinion is that no single technique suitably covers the minimum criteria of high coverage and high resolution at a reasonable cost. In fact, the fraction of the methylome that is studied currently depends entirely on the inherent biases of the protocol employed. There is promise for this to change, as the third generation of sequencing technologies is expected to again ‘revolutionize’ the way that we study genomes and epigenomes.
DNA methylation; epigenetics; high-throughput sequencing; tiling arrays
Deep sequencing after bisulfite conversion (BS-Seq) is the method of choice to generate whole genome maps of cytosine methylation at single base-pair resolution. Its application to genomic DNA of Arabidopsis flower bud tissue resulted in the first complete methylome, determining a methylation rate of 6.7% in this tissue. BS-Seq reads were mapped onto an in silico converted reference genome, applying the so-called 3-letter genome method. Here, we present BiSS (Bisulfite Sequencing Scorer), a new method applying Smith-Waterman alignment to map bisulfite-converted reads to a reference genome. In addition, we introduce a comprehensive adaptive error estimate that accounts for sequencing errors, erroneous bisulfite conversion and also wrongly mapped reads. The re-analysis of the Arabidopsis methylome data with BiSS mapped substantially more reads to the genome. As a result, it determines the methylation status of an extra 10% of cytosines and estimates the methylation rate to be 7.7%. We validated the results by individual traditional bisulfite sequencing for selected genomic regions. In addition to predicting the methylation status of each cytosine, BiSS also provides an estimate of the methylation degree at each genomic site. Thus, BiSS explores BS-Seq data more extensively and provides more information for downstream analysis.
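The 3-letter genome idea referred to above can be illustrated with a minimal sketch. It is not BiSS (which relies on Smith-Waterman alignment and an adaptive error estimate) and not the pipeline of any published tool: the function names, the toy sequences, the exact-match search and the restriction to the forward strand are all assumptions made for illustration only.

```python
def to_three_letter(seq):
    """Collapse C into T so the sequence uses only A, G and T."""
    return seq.upper().replace("C", "T")


def map_read(reference, read):
    """Exact-match mapping of a forward-strand read in 3-letter space.

    Returns every 0-based reference offset at which the converted read
    matches the converted reference (hypothetical helper, no mismatches).
    """
    ref3 = to_three_letter(reference)
    read3 = to_three_letter(read)
    hits = []
    start = ref3.find(read3)
    while start != -1:
        hits.append(start)
        start = ref3.find(read3, start + 1)
    return hits


def methylation_rate(reference, read, offset):
    """Fraction of reference cytosines under the read that still read as C."""
    window = reference[offset:offset + len(read)]
    c_sites = [i for i, base in enumerate(window) if base == "C"]
    if not c_sites:
        return None
    methylated = sum(1 for i in c_sites if read[i] == "C")
    return methylated / len(c_sites)


# Toy data: the read covers three reference cytosines, one of which
# escaped conversion (i.e. is called methylated).
reference = "GGACGTTCAGCATCGGA"
read = "TGTTTAGCAT"
hits = map_read(reference, read)
# Only uniquely mapped reads are used for the rate in this sketch.
rate = methylation_rate(reference, read, hits[0]) if len(hits) == 1 else None
```

Mapping in the reduced alphabet makes the search independent of methylation state; the original read is consulted only afterwards, to decide for each reference cytosine whether it was converted.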
With the availability of complete genome sequences for a growing number of organisms, high-throughput methods for gene annotation and analysis of genome dynamics are needed. The application of whole-genome tiling microarrays for studies of global gene expression is providing a more unbiased view of the transcriptional activity within genomes. For example, this approach has led to the identification and isolation of many novel non-protein-coding RNAs (ncRNAs), which have been suggested to comprise a major component of the transcriptome and to have novel functions in epigenetic regulation of the genome. Additionally, tiling arrays have been recently applied to the study of histone modifications and methylation of cytosine bases (DNA methylation). Surprisingly, recent studies combining the analysis of gene expression (transcriptome) and DNA methylation (methylome) using whole-genome tiling arrays revealed that DNA methylation regulates the expression levels of many ncRNAs. Further capture and integration of additional types of genome-wide data sets will help to illuminate additional hidden features of the dynamic genomic landscape that are regulated by both genetic and epigenetic pathways in plants.
DNA methylation is a critical epigenetic mark that is essential for mammalian development and aberrant in many diseases including cancer. Over the past decade multiple methods have been developed and applied to characterize its genome-wide distribution. Of these, Reduced Representation Bisulfite Sequencing (RRBS) generates nucleotide resolution Illumina-based libraries that enrich for CpG-dense regions by methylation-insensitive restriction digestion. Here we provide an extensive, optimized protocol for generating RRBS libraries and discuss the power of this strategy for methylome profiling. We include information on sequence analysis and the relative coverage over genomic regions of interest for a representative mouse MspI generated RRBS library. Contemporary sequencing and array-based technologies are compared against sample throughput and coverage, highlighting the variety of options available to investigate methylation on the genome-scale.
The potential importance of DNA methylation in the etiology of complex diseases has led to interest in the development of methylome-wide association studies (MWAS) aimed at interrogating all methylation sites in the human genome. When using blood as biomaterial for a MWAS, the DNA is typically extracted directly from fresh or frozen whole blood that was collected via venous puncture. However, DNA extracted from dry blood spots may also be an alternative starting material. In the present study, we apply a methyl-CpG binding domain (MBD) protein enrichment-based technique in combination with next generation sequencing (MBD-seq) to assess the methylation status of the ~27 million CpGs in the human autosomal reference genome. We investigate eight methylomes using DNA from blood spots. These data are compared with 1,500 methylomes previously assayed with the same MBD-seq approach using DNA from whole blood. When investigating the sequence quality and the enrichment profile across biological features, we find that DNA extracted from blood spots gives results comparable with those from DNA extracted from whole blood. Only if the amount of starting material is ≤ 0.5 µg DNA do we observe a slight decrease in assay performance. In conclusion, we show that high quality methylome-wide investigations using MBD-seq can be conducted in DNA extracted from archived dry blood spots without sacrificing quality and without bias in enrichment profile as long as the amount of starting material is sufficient. In general, the amount of DNA extracted from a single blood spot is sufficient for methylome-wide investigations with the MBD-seq approach.
archived blood spots; methylation; next-generation sequencing; DNA extraction; MBD-seq
In the bacterial world, methylation is most commonly associated with restriction-modification systems that provide a defense mechanism against invading foreign genomes. In addition, it is known that methylation plays functionally important roles, including timing of DNA replication, chromosome partitioning, DNA repair, and regulation of gene expression. However, full DNA methylome analyses are scarce due to a lack of a simple methodology for rapid and sensitive detection of common epigenetic marks (i.e. N6-methyladenine (6mA) and N4-methylcytosine (4mC)) in these organisms. Here, we use Single-Molecule Real-Time (SMRT) sequencing to determine the methylomes of two related human pathogen species, Mycoplasma genitalium G-37 and Mycoplasma pneumoniae M129, with single-base resolution. Our analysis identified two new methylation motifs not previously described in bacteria: a widespread 6mA methylation motif common to both bacteria (5′-CTAT-3′), as well as a more complex Type I m6A sequence motif in M. pneumoniae (5′-GAN7TAY-3′/3′-CTN7ATR-5′). We identify the methyltransferase responsible for the common motif and suggest the one involved in M. pneumoniae only. Analysis of the distribution of methylation sites across the genome of M. pneumoniae suggests a potential role for methylation in regulating the cell cycle, as well as in regulation of gene expression. To our knowledge, this is one of the first direct methylome profiling studies with single-base resolution from a bacterial organism.
DNA methylation in bacteria plays important roles in cell division, DNA repair, regulation of gene expression, and pathogenesis. Here, we use a novel sequencing technique, Single-Molecule Real-Time (SMRT) sequencing, to determine the methylomes of two related human pathogen species, Mycoplasma genitalium G-37 and Mycoplasma pneumoniae M129. Our analysis identified two novel methylation motifs, one of them present uniquely in M. pneumoniae and the other common to both bacteria. We also identify the methyltransferase responsible for the common methylation motif and suggest the one associated with the M. pneumoniae unique motif. Functional analysis of the data suggests a potential role for methylation in regulating the cell cycle of M. pneumoniae, as well as in regulation of gene expression. To our knowledge, this is one of the first genome-wide approaches to study the biological role of methylation in a bacterial organism.
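To make the motif notation used above concrete (5′-CTAT-3′ and 5′-GAN7TAY-3′, where N7 stands for seven arbitrary bases and Y for a pyrimidine), the following sketch scans both strands of a sequence for an IUPAC motif. It is an illustration only and not the analysis used in the study; the helper names and the toy sequences are hypothetical, and only the IUPAC symbols needed for these two motifs are supported.

```python
import re

# Subset of IUPAC nucleotide codes needed for the two motifs.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expand_motif(motif):
    """Expand run-length shorthand, e.g. 'GAN7TAY' -> 'GANNNNNNNTAY'."""
    return re.sub(r"([A-Z])(\d+)", lambda m: m.group(1) * int(m.group(2)), motif)


def motif_to_regex(motif):
    """Compile an IUPAC motif; the lookahead reports overlapping sites."""
    body = "".join(IUPAC[symbol] for symbol in motif)
    return re.compile("(?=(" + body + "))")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (position, strand) pairs for every motif occurrence.

    Positions are 0-based, forward-strand coordinates of the leftmost base
    of the site, for both '+' and '-' strand hits.
    """
    pattern = motif_to_regex(motif)
    n, m = len(seq), len(motif)
    hits = [(match.start(), "+") for match in pattern.finditer(seq)]
    rc = reverse_complement(seq)
    hits += [(n - match.start() - m, "-") for match in pattern.finditer(rc)]
    return sorted(hits)


toy = "TTCTATGGGAACGTACGTTACAATAGCC"
ctat_sites = find_motif(toy, "CTAT")
type1_sites = find_motif("GAACGTACGTAC", expand_motif("GAN7TAY"))
```

Scanning the reverse complement with the same pattern covers the bottom-strand form of a motif (for example 3′-CTN7ATR-5′), so no second pattern has to be written.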
Methylation, the addition of methyl groups to cytosine (C), plays an important role in the regulation of gene expression in both normal and dysfunctional cells. During bisulfite conversion and subsequent PCR amplification, unmethylated Cs are converted into thymine (T), while methylated Cs will not be converted. Sequencing of this bisulfite-treated DNA permits the detection of methylation at specific sites. Through the introduction of next-generation sequencing technologies (NGS) simultaneous analysis of methylation motifs in multiple regions provides the opportunity for hypothesis-free study of the entire methylome. Here we present a whole methylome sequencing study that compares two different bisulfite conversion methods (in solution versus in gel), utilizing the high throughput of the SOLiD™ System. Advantages and disadvantages of the two different bisulfite conversion methods for constructing sequencing libraries are discussed. Furthermore, the application of the SOLiD™ bisulfite sequencing to larger and more complex genomes is shown with preliminary in silico created bisulfite converted reads.
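The conversion principle described above (unmethylated C read as T, methylated C read as C) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the SOLiD™ workflow: it handles the forward strand only, assumes the read is already aligned to the reference without gaps, and uses hypothetical function names and a toy sequence.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence as read after bisulfite treatment and PCR.

    Unmethylated C becomes T; C at a position listed in
    methylated_positions (0-based) is protected and stays C.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, read):
    """Compare a converted read with the reference, position by position.

    Returns a dict mapping each reference C position to True (read shows C,
    called methylated) or False (read shows T, called unmethylated).
    Positions where the read shows any other base are skipped.
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True
        elif read_base == "T":
            calls[i] = False
    return calls


reference = "ACGTCCGATCGA"
converted = bisulfite_convert(reference, methylated_positions={1, 9})
calls = call_methylation(reference, converted)
```

In real data each cytosine is covered by many reads, so the per-site methylation level is usually estimated as the fraction of covering reads that show C rather than from a single read as here.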
DNA methylation is an epigenetic modification that plays a crucial role in normal mammalian development, retrotransposon silencing, and cellular reprogramming. Although methylation mainly occurs on the cytosine in a CG site, non-CG methylation is prevalent in pluripotent stem cells, brain, and oocytes. We previously identified non-CG methylation in several CG-rich regions in mouse germinal vesicle oocytes (GVOs), but the overall distribution of non-CG methylation and the enzymes responsible for this modification are unknown. Using amplification-free whole-genome bisulfite sequencing, which can be used with minute amounts of DNA, we constructed the base-resolution methylome maps of GVOs, non-growing oocytes (NGOs), and mutant GVOs lacking the DNA methyltransferase Dnmt1, Dnmt3a, Dnmt3b, or Dnmt3L. We found that nearly two-thirds of all methylcytosines occur in a non-CG context in GVOs. The distribution of non-CG methylation closely resembled that of CG methylation throughout the genome and showed clear enrichment in gene bodies. Compared to NGOs, GVOs were over four times more methylated at non-CG sites, indicating that non-CG methylation accumulates during oocyte growth. Lack of Dnmt3a or Dnmt3L resulted in a global reduction in both CG and non-CG methylation, showing that non-CG methylation depends on the Dnmt3a-Dnmt3L complex. Dnmt3b was dispensable. Of note, lack of Dnmt1 resulted in a slight decrease in CG methylation, suggesting that this maintenance enzyme plays a role in non-dividing oocytes. Dnmt1 may act on CG sites that remain hemimethylated in the de novo methylation process. Our results provide a basis for understanding the mechanisms and significance of non-CG methylation in mammalian oocytes.
Methylation of cytosine bases in DNA is an epigenetic modification crucial for normal development, retrotransposon silencing, and cellular reprogramming. In mammals, the vast majority of 5-methylcytosine occurs at CG dinucleotides, and thus most studies to date have focused on this dinucleotide. However, recent studies have shown that 5-methylcytosine is abundant at non-CG (CA, CT, and CC) sites in certain tissues and certain cell types in human and mouse. We previously identified non-CG methylation in CG-rich sequences, including the imprint control regions in mouse germinal vesicle oocytes, but its global distribution and the enzymes responsible are unknown. Using advanced high-throughput sequencing technology applicable to minute amounts of DNA, we obtained high-resolution methylation maps of newborn non-growing oocytes, adult germinal vesicle oocytes, and mutant germinal vesicle oocytes lacking any of the four DNA methyltransferase family proteins. Our results revealed that non-CG methylation accumulates genome-wide in close proximity to highly methylated CG sites during the oocyte growth stage. We also found that the de novo DNA methyltransferase proteins Dnmt3a and Dnmt3L are responsible for non-CG methylation in oocytes. Unexpectedly, we found that the maintenance methyltransferase Dnmt1 has a role in de novo CG methylation. Our study provides a basis for understanding the mechanisms and significance of non-CG methylation in mammalian oocytes.
DNA methylation plays important biological roles in plants and animals. To examine the rice genomic methylation landscape and assess its functional significance, we generated single-base resolution DNA methylome maps for Asian cultivated rice Oryza sativa ssp. japonica, indica and their wild relatives, Oryza rufipogon and Oryza nivara.
The overall methylation level of rice genomes is four times higher than that of Arabidopsis. Consistent with the results reported for Arabidopsis, methylation in promoters represses gene expression while gene-body methylation generally appears to be positively associated with gene expression. Interestingly, we discovered that methylation in gene transcriptional termination regions (TTRs) can significantly repress gene expression, and the effect is even stronger than that of promoter methylation. Through integrated analysis of genomic, DNA methylomic and transcriptomic differences between cultivated and wild rice, we found that primary DNA sequence divergence is the major determinant of methylational differences at the whole genome level, but DNA methylational difference alone can only account for limited gene expression variation between the cultivated and wild rice. Furthermore, we identified a number of genes with significant difference in methylation level between the wild and cultivated rice.
The single-base resolution methylomes of rice obtained in this study have not only broadened our understanding of the mechanism and function of DNA methylation in plant genomes, but also provided valuable data for future studies of rice epigenetics and the epigenetic differentiation between wild and cultivated rice.
Cultivated and wild rice; Methylomes; Transcriptional termination regions (TTRs); Gene expression
DNA methylation is an epigenetic mark linking DNA sequence and transcription regulation, and therefore plays an important role in phenotypic plasticity. The ideal whole genome methylation (methylome) assay should be accurate, affordable, high-throughput and agnostic with respect to genomic features. To this end, the methylated DNA immunoprecipitation (MeDIP) assay provides a good balance of these criteria. In this Methods paper, we present AutoMeDIP-seq, a technique that combines an automated MeDIP protocol with library preparation steps for subsequent second-generation sequencing. We assessed recovery of DNA sequences covering a range of CpG densities using in vitro methylated λ-DNA fragments (and their unmethylated counterparts) spiked-in against a background of human genomic DNA. We show that AutoMeDIP is more reliable than manual protocols, shows a linear recovery profile of fragments related to CpG density (R² = 0.86), and that it is highly specific (>99%). AutoMeDIP-seq offers a competitive approach to high-throughput methylome analysis of medium to large cohorts.
DNA methylation; Automation; Whole genome; High-throughput sequencing; MeDIP
Sequencing-based DNA methylation profiling methods are comprehensive and, as accuracy and affordability improve, will increasingly supplant microarrays for genome-scale analyses. Here, four sequencing-based methodologies were applied to biological replicates of human embryonic stem cells to compare their CpG coverage genome-wide and in transposons, resolution, cost, concordance and its relationship with CpG density and genomic context. The two bisulfite methods reached concordance of 82% for CpG methylation levels and 99% for non-CpG cytosine methylation levels. Using binary methylation calls, two enrichment methods were 99% concordant, while regions assessed by all four methods were 97% concordant. To achieve comprehensive methylome coverage while reducing cost, an approach integrating two complementary methods was examined. The integrative methylome profile along with histone methylation, RNA, and SNP profiles derived from the sequence reads allowed genome-wide assessment of allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression.
DNA methylation; Sequencing; Bisulfite
The ability to assay genome-scale methylation patterns using high-throughput sequencing makes it possible to carry out association studies to determine the relationship between epigenetic variation and phenotype. While bisulfite sequencing can determine a methylome at high resolution, cost inhibits its use in comparative and population studies. MethylSeq, based on sequencing of fragment ends produced by a methylation-sensitive restriction enzyme, is a method for methyltyping (survey of methylation states) and is a site-specific and cost-effective alternative to whole-genome bisulfite sequencing. Despite its advantages, the use of MethylSeq has been restricted by biases in MethylSeq data that complicate the determination of methyltypes. Here we introduce a statistical method, MetMap, that produces corrected site-specific methylation states from MethylSeq experiments and annotates unmethylated islands across the genome. MetMap integrates genome sequence information with experimental data, in a statistically sound and cohesive Bayesian Network. It infers the extent of methylation at individual CGs and across regions, and serves as a framework for comparative methylation analysis within and among species. We validated MetMap's inferences with direct bisulfite sequencing, showing that the methylation status of sites and islands is accurately inferred. We used MetMap to analyze MethylSeq data from four human neutrophil samples, identifying novel, highly unmethylated islands that are invisible to sequence-based annotation strategies. The combination of MethylSeq and MetMap is a powerful and cost-effective tool for determining genome-scale methyltypes suitable for comparative and association studies.
In the vertebrates, methylation of cytosine residues in DNA regulates gene activity in concert with proteins that associate with DNA. Large-scale genomewide comparative studies that seek to link specific methylation patterns to disease will require hundreds or thousands of samples, and thus economical methods that assay genomewide methylation. One such method is MethylSeq, which samples cytosine methylation at site-specific resolution by high-throughput sequencing of the ends of DNA fragments generated by methylation-sensitive restriction enzymes. MethylSeq's low cost and simplicity of implementation enable its use in large-scale comparative studies, but biases inherent to the method inhibit interpretation of the data it produces. Here we present MetMap, a statistical framework that first accounts for the biases in MethylSeq data and then generates an analysis of the data that is suitable for use in comparative studies. We show that MethylSeq and MetMap can be used together to determine methylation profiles across the genome, and to identify novel unmethylated regions that are likely to be involved in gene regulation. The ability to conduct comparative studies of sufficient scale at a reasonable cost promises to reveal new insights into the relationship between cytosine methylation and phenotype.
DNA methylation is one of the most important heritable epigenetic modifications of the genome and is involved in the regulation of many cellular processes. Aberrant DNA methylation has been frequently reported to influence gene expression and subsequently cause various human diseases, including cancer. Recent rapid advances in next-generation sequencing technologies have enabled investigators to profile genome methylation patterns at single-base resolution. Remarkably, more than 20 eukaryotic methylomes have been generated thus far, with a majority published since November 2009. Analysis of this vast amount of data has dramatically enriched our knowledge of the biological function, conservation and divergence of DNA methylation in eukaryotes. Even so, many specific functions of DNA methylation and their underlying regulatory systems remain unknown. Here, we briefly introduce current approaches for DNA methylation profiling and then systematically review the features of whole-genome DNA methylation patterns in eight animals, six plants and five fungi. Our systematic comparison provides new insights into the conservation and divergence of DNA methylation in eukaryotes and its regulation of gene expression. This work aims to summarize the current state of available methylome data and their features.
DNA methylation; methylome; single-base resolution; CpG; gene body; broadness; deepness; promoter
Changes in genomic DNA methylation patterns are generally assumed to play an important role in the etiology of human cancers. The Dnmt3a enzyme is required for the establishment of normal methylation patterns, and mutations in Dnmt3a have been described in leukemias. Deletion of Dnmt3a in a K-ras–dependent mouse lung cancer model has been shown to promote tumor progression, which suggested that the enzyme might suppress tumor development by stabilizing DNA methylation patterns. We have used whole-genome bisulfite sequencing to comprehensively characterize the methylomes from Dnmt3a wild-type and Dnmt3a-deficient mouse lung tumors. Our results show that profound global methylation changes can occur in K-ras–induced lung cancer. Dnmt3a wild-type tumors were characterized by large hypomethylated domains that correspond to nuclear lamina-associated domains. In contrast, Dnmt3a-deficient tumors showed a uniformly hypomethylated genome. Further data analysis revealed that Dnmt3a is required for efficient maintenance methylation of active chromosome domains and that Dnmt3a-deficient tumors show moderate levels of gene deregulation in these domains. In summary, our results uncover conserved features of cancer methylomes and define the role of Dnmt3a in maintaining DNA methylation patterns in cancer.
Dnmt3a is generally assumed to be a de novo DNA methyltransferase that plays an important role in establishing DNA methylation patterns during embryogenesis. However, mutations in the human DNMT3A gene have been detected in various cancers, suggesting that the enzyme might also be relevant for DNA methylation in adult tissues and in tumors. We have established genome-wide methylation profiles at single base pair resolution to define Dnmt3a-dependent methylation changes in a mouse tumor model. Our results show that mouse tumors with a functional Dnmt3a enzyme are characterized by regional hypomethylation, while Dnmt3a-deficient tumors showed a uniformly hypomethylated genome. Further data analysis revealed that Dnmt3a is required for maintaining normal DNA methylation patterns specifically in gene bodies and in active chromosome domains. Our study thus defines the role of Dnmt3a in maintaining DNA methylation patterns and provides a paradigm for understanding the effects of DNMT3A mutations on human cancer methylomes.
Cancer cells undergo massive alterations to their DNA methylation patterns that result in aberrant gene expression and malignant phenotypes. However, the mechanisms that underlie methylome changes are not well understood, nor is the genomic distribution of DNA methylation changes well characterized.
Here, we performed methylated DNA immunoprecipitation combined with high-throughput sequencing (MeDIP-seq) to obtain whole-genome DNA methylation profiles for eight human breast cancer cell (BCC) lines and for normal human mammary epithelial cells (HMEC). The MeDIP-seq analysis generated non-biased DNA methylation maps by covering almost the entire genome with sufficient depth and resolution. The most prominent feature of the BCC lines compared to HMEC was a massively reduced methylation level particularly in CpG-poor regions. While hypomethylation did not appear to be associated with particular genomic features, hypermethylation preferentially occurred at CpG-rich gene-related regions independently of the distance from transcription start sites. We also investigated methylome alterations during epithelial-to-mesenchymal transition (EMT) in MCF7 cells. EMT induction was associated with specific alterations to the methylation patterns of gene-related CpG-rich regions, although overall methylation levels were not significantly altered. Moreover, approximately 40% of the epithelial cell-specific methylation patterns in gene-related regions were altered to those typical of mesenchymal cells, suggesting a cell-type specific regulation of DNA methylation.
This study provides the most comprehensive analysis to date of the methylome of human mammary cell lines and has produced novel insights into the mechanisms of methylome alteration during tumorigenesis and the interdependence between DNA methylome alterations and morphological changes.
DNA methylation has traditionally been viewed as a highly stable epigenetic mark in post-mitotic cells; however, postnatal brains appear to exhibit stimulus-induced methylation changes, at least in a few identified CpG dinucleotides. How extensively the neuronal DNA methylome is regulated by neuronal activity is unknown. Using a next-generation sequencing-based method for genome-wide analysis at single-nucleotide resolution, we quantitatively compared the CpG methylation landscape of adult mouse dentate granule neurons in vivo before and after synchronous neuronal activation. About 1.4% of 219,991 CpGs measured show rapid active demethylation or de novo methylation. Some modifications remain stable for at least 24 hours. These activity-modified CpGs exhibit a broad genomic distribution with significant enrichment in low-CpG density regions, and are associated with brain-specific genes related to neuronal plasticity. Our study implicates modification of the neuronal DNA methylome as a previously under-appreciated mechanism for activity-dependent epigenetic regulation in the adult nervous system.
DNA methylation is a biochemical process where a DNA base, usually cytosine, is enzymatically methylated at the 5-carbon position. An epigenetic modification associated with gene regulation, DNA methylation is of paramount importance to biological health and disease. Recently, the quest to unravel the Human Epigenome commenced, calling for a modernization of previous DNA methylation profiling techniques. Here, we describe the major developments in the methodologies used over the past three decades to examine the elusive epigenome (or methylome). The earliest techniques were based on the separation of methylated and unmethylated cytosines via chromatography. The following years would see molecular techniques being employed to indirectly examine DNA methylation levels in both genome-wide and locus-specific contexts, notably immunoprecipitation via anti-5-methylcytosine antibodies and selective digestion with methylation-sensitive restriction endonucleases. With the advent of sodium bisulfite treatment of DNA, a deamination reaction that converts cytosine to uracil only when unmethylated, the epigenetic modification can now be identified in the same manner as a DNA base-pair change. More recently, these three techniques have been applied to more technically advanced systems such as DNA microarrays and next-generation sequencing platforms, bringing us closer to unveiling a complete human epigenetic profile.
DNA; methylation; bisulfite; sequencing; methods
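The bisulfite principle summarized in the abstract above (unmethylated cytosine is deaminated to uracil, which is read as thymine after amplification, while methylated cytosine is protected) can be illustrated with a minimal in-silico sketch. The function names `bisulfite_convert` and `call_methylation` and the toy sequence are hypothetical and are not taken from any of the cited studies; the sketch assumes a single strand, complete conversion, and no sequencing errors.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite treatment followed by amplification on one strand.

    Unmethylated C is read as T; methylated C (0-based positions listed in
    `methylated_positions`) is protected and stays C. Other bases are unchanged.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, converted_read):
    """Compare a converted read with its reference, position by position.

    Returns {position: True/False} for every C in the reference, where True
    means the C survived conversion and is therefore inferred to be methylated.
    Assumes the read is already aligned and has the same length as the reference.
    """
    reference = reference.upper()
    return {
        i: converted_read[i] == "C"
        for i, base in enumerate(reference)
        if base == "C"
    }


reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions={1})
calls = call_methylation(reference, read)
```

In this toy case only the cytosine at position 1 is protected, so it is the only reference C that still reads as C; this is why the modification can be scored in the same way as a base change.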
DNA methylation can control some CpG-poor genes but unbiased studies have not found a consistent genome-wide association with gene activity outside of CpG islands or shores possibly due to use of cell lines or limited bioinformatics analyses. We performed reduced representation bisulfite sequencing (RRBS) of rat dorsal root ganglia encompassing postmitotic primary sensory neurons (n = 5, r > 0.99; orthogonal validation p < 10⁻¹⁹). The rat genome suggested a dichotomy of genes previously reported in other mammals: low CpG content (< 3.2%) promoter (LCP) genes and high CpG content (≥ 3.2%) promoter (HCP) genes. A genome-wide integrated methylome-transcriptome analysis showed that LCP genes were markedly hypermethylated when repressed, and hypomethylated when active with a 40% difference in a broad region at the 5′ of the transcription start site (p < 10⁻⁸⁷ for -6000 bp to -2000 bp, p < 10⁻⁷³ for -2000 bp to +2000 bp, no difference in gene body p = 0.42). HCP genes had minimal TSS-associated methylation regardless of transcription status, but gene body methylation appeared to be lost in repressed HCP genes. Therefore, diametrically opposite methylome-transcriptome associations characterize LCP and HCP genes in postmitotic neural tissue in vivo.
bisulfite sequencing; dorsal root ganglion; HCP promoter; high CpG content promoter genes; integrated methylome-transcriptome analysis; LCP promoter; low CpG content promoter genes; peripheral nervous system; rat
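The LCP/HCP dichotomy in the abstract above rests on a 3.2% CpG-content threshold. A minimal sketch of such a classification follows. The abstract does not state how CpG content was computed or which promoter window was used, so the definition here (CG dinucleotides divided by the number of dinucleotide positions) is an assumption made for illustration, and the names `cpg_content` and `classify_promoter` are hypothetical.

```python
def cpg_content(seq):
    """Fraction of dinucleotide positions occupied by CG.

    Assumed definition for illustration only: the source abstract gives the
    3.2% threshold but not the exact formula or promoter window.
    """
    seq = seq.upper()
    if len(seq) < 2:
        return 0.0
    cg_count = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    return cg_count / (len(seq) - 1)


def classify_promoter(seq, threshold=0.032):
    """Label a promoter sequence HCP (>= threshold) or LCP (< threshold)."""
    return "HCP" if cpg_content(seq) >= threshold else "LCP"


# Toy 100-bp sequences: three CG dinucleotides fall below the threshold,
# four fall above it.
low = "CG" * 3 + "A" * 94
high = "CG" * 4 + "A" * 92
labels = (classify_promoter(low), classify_promoter(high))
```

With real data the input would be the promoter sequence of each gene, and the resulting label could then be compared against methylation and expression levels as described in the abstract.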