DNA methylation is the only known covalent modification to the eukaryotic genome and serves multiple functions in plants and higher animal phyla. In mammals it is primarily observed at cytosine residues within the symmetrical CpG dinucleotide, which provides a mechanism for stable marker inheritance through enzymatic recognition of newly synthesized, hemi-methylated DNA [1]. Moreover, this heritability is maintained independently of any underlying nucleotide sequence, making cytosine methylation a truly "epigenetic" mark [2].
In mammals, CpG methylation is essential for development and is generally regarded as a terminal silencer of expression, though it also plays roles in more nuanced programs such as maintenance of parental allele-specific imprinting and dosage compensation by X inactivation in females [2]. Moreover, the paternal genome within a fertilized zygote undergoes active global demethylation, which provides strong evidence for CpG methylation’s dynamic potential as an epigenetic modifier, not exclusively as a terminal silencer [4]. Substantial alterations of DNA methylation have been observed in multiple cancers, characterized by localized hypermethylation at target gene promoters and a global loss of methylation [7].
Experimental analysis of DNA methylation states is complicated by the fact that the modification does not alter base pairing and is lost during PCR amplification. Traditionally, measurement of methylated cytosine has instead relied on a chemical reaction using high temperature, low pH, and treatment with sodium bisulfite, a protocol that specifically deaminates unmethylated cytosines and converts them into uracils, while leaving methylated cytosines unchanged [9]. Subsequent PCR amplification of bisulfite-converted DNA replaces uracil with thymine, giving rise to a methylation-specific single nucleotide polymorphism that is detectable by conventional sequencing and alignment against the reference sequence.
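The conversion logic described above can be illustrated in silico. The following is a minimal sketch, not part of the protocol itself: the function names `bisulfite_convert` and `call_methylation`, the toy reference sequence, and the choice of methylated position are all hypothetical, and the sketch models only the top strand with complete conversion and no sequencing errors.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite treatment followed by PCR on the top strand.

    Unmethylated C is read as T (C -> U -> T); methylated C stays C.
    Assumes 100% conversion efficiency, which real experiments do not reach.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylation(reference, read):
    """Infer methylation at every reference cytosine from an aligned read.

    The read is assumed to be gap-free and aligned at position 0.
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = "methylated"
        elif read_base == "T":
            calls[i] = "unmethylated"
        else:
            calls[i] = "ambiguous"  # sequencing error or true variant
    return calls


# Toy example: only the cytosine at index 1 (within a CpG) is methylated.
reference = "ACGTCCGATCG"
read = bisulfite_convert(reference, methylated_positions=[1])
calls = call_methylation(reference, read)
```

A retained C in the read marks a methylated position, while a C-to-T mismatch against the reference marks an unmethylated one, which is the "methylation-specific polymorphism" the text refers to.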
Recent technical advances such as tiling microarrays and high-throughput sequencing have dramatically increased the scale at which DNA methylation can be analyzed [3]. Some array-based strategies utilize bisulfite-converted DNA to probe defined regions by bimodal hybridization of either unconverted or converted sequences. Others enrich directly for methylated DNA by co-precipitation using either methyl-binding proteins or antibodies that target the 5-methyl-CpG hapten. Methylated DNA immunoprecipitation (MeDIP) has provided preliminary methylome profiles of mammalian promoters and the entire Arabidopsis genome [11], but even the most advanced array technologies are only capable of screening pre-selected genomic regions. High-throughput sequencing strategies address this latter issue and achieve greater coverage (MeDIP-seq) [14]. While MeDIP-seq can identify immunoprecipitable 5-methyl-CpG-containing fragments, it cannot determine the methylation status of individual CpGs within the fragment. Recently, a bisulfite conversion protocol compatible with ultra-high-throughput sequencing (BS-seq) was used to assess the Arabidopsis methylome at single-CpG resolution with approximately 20-fold sequencing coverage [15]. However, the larger genome size (3 Gb in human vs 120 Mb in Arabidopsis) and a more uneven CpG distribution render a comparable study extremely costly in mammalian models.
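The scale difference can be made concrete with a back-of-the-envelope calculation. This is a rough sketch using only the genome sizes and the approximately 20-fold coverage quoted above; the helper name `bases_required` is hypothetical, and the estimate deliberately ignores mapping efficiency, strand-specific bisulfite libraries, and the uneven CpG distribution, all of which would raise the real requirement.

```python
def bases_required(genome_size_bp, fold_coverage):
    """Raw sequenced bases needed for a given average fold coverage."""
    return genome_size_bp * fold_coverage


ARABIDOPSIS_BP = 120_000_000   # ~120 Mb, as quoted in the text
HUMAN_BP = 3_000_000_000       # ~3 Gb, as quoted in the text
COVERAGE = 20                  # approximate fold coverage of the BS-seq study

arabidopsis_bases = bases_required(ARABIDOPSIS_BP, COVERAGE)
human_bases = bases_required(HUMAN_BP, COVERAGE)
scale_factor = human_bases / arabidopsis_bases

print(f"Arabidopsis: {arabidopsis_bases / 1e9:.1f} Gb of sequence")
print(f"Human:       {human_bases / 1e9:.1f} Gb of sequence")
print(f"Human requires about {scale_factor:.0f}x more sequencing")
```

Under these simplifying assumptions, matching the Arabidopsis coverage in human takes roughly 25 times more raw sequence, before accounting for any of the ignored factors.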
Reduced Representation Bisulfite Sequencing (RRBS) utilizes the same high-throughput sequencing strategy as BS-seq, but enriches its libraries by digesting genomic DNA with restriction endonucleases that are specific for CpG-containing motifs [17]. RRBS therefore provides enhanced coverage of the CpG dinucleotide and yields single-base-pair resolution data within multiple regions of interest, including CpG islands, promoters and enhancer elements. CpG fragment enrichment substantially reduces the sequencing depth required for whole-genome coverage by under-representing CpG-poor, constitutively methylated, intergenic regions. Instead, a small but reproducible subset of CpG-rich restriction fragments in the genome is sequenced at sufficient depth. Because RRBS fragments DNA at specific restriction sites and sequencing coverage relative to the reduced-representation genome is high, the vast majority of fragments are sequenced in all RRBS analyses for a given species, increasing the method’s utility for comparative DNA methylation profiling [10].
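The enrichment principle can be sketched as an in silico digest. This section does not name the enzyme or any size window, so the following rests on stated assumptions: a CCGG recognition site cut after the first C (the pattern of MspI, which is commonly used for RRBS), and an arbitrary toy size-selection window. The function names, the toy sequence, and the window bounds are hypothetical and chosen only for illustration.

```python
def digest(seq, site="CCGG", cut_offset=1):
    """Cut a sequence at every occurrence of `site`.

    `cut_offset` is the cut position within the site (1 means C^CGG).
    Both defaults are assumptions, not values taken from the text.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments, min_len, max_len):
    """Keep fragments whose length falls inside the selection window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


def cpg_density(seq):
    """CpG dinucleotides per base."""
    return seq.count("CG") / len(seq) if seq else 0.0


# Toy "genome": a CpG-rich stretch flanked by two sites, then a CpG-poor tail.
genome = "ATATATCCGGACGCGTCCGGTTTTTTTTTTTTAAAAAAAA"
fragments = digest(genome)
selected = size_select(fragments, min_len=8, max_len=15)  # toy window

print(fragments)
print(selected, cpg_density(selected[0]), cpg_density(genome))
```

Because cut sites are fixed by sequence, the same fragments are recovered every time the same genome is digested, which is what makes the reduced representation reproducible across samples. In the toy case the selected fragment has a higher CpG density than the sequence as a whole, while the long CpG-poor tail falls outside the window.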
Reduced Representation Bisulfite Sequencing
Here we provide an extended protocol for generating RRBS libraries, emphasizing critical steps, checkpoints for quality control, and portions that can be customized to meet the needs of more specialized studies. To date, the majority of RRBS libraries generated within our lab have been from Mus musculus, and the protocol is largely adapted to provide robust coverage of this genome, but all of the general principles and most of the specific steps extend to other organisms such as human. This review concludes with a discussion of the fundamental distinctions and attributes of other screening protocols, as well as of future directions for genome-scale DNA methylation mapping.