Genomic DNA purification
Genomic DNA was isolated as previously reported [15]. Purified DNA was quantified using a Quant-iT DNA Broad Range assay kit (Invitrogen, Grand Island, NY, USA, catalogue number Q-33130) and subsequently diluted to 20 ng/μl in low TE (10 mM Tris-HCl, 0.1 mM EDTA, pH 8.0). Equal amounts of DNA samples (100 ng) were added to distinct wells in a 96-well PCR plate (Axygen, Union City, CA, USA, catalogue number PCR-96M2-HS-C). For the chimera experiment, CpG Methylated NIH 3T3 mouse genomic DNA was purchased from New England Biolabs (Ipswich, MA, USA).
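The dilution to 20 ng/μl follows the usual C1·V1 = C2·V2 relation. The sketch below is illustrative only: the function name, the 80 ng/μl stock concentration and the 50 μl final volume are assumptions and are not values from the protocol. The last line checks that a 5 μl transfer at 20 ng/μl delivers the 100 ng input stated above.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock_ul, diluent_ul) for a C1*V1 = C2*V2 dilution."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical sample measured at 80 ng/ul, diluted to 20 ng/ul in 50 ul total.
stock_ul, low_te_ul = dilution_volumes(80, 20, 50)
print(f"stock: {stock_ul} ul, low TE: {low_te_ul} ul")

# DNA mass carried into the digestion by a 5 ul transfer at 20 ng/ul.
input_ng = 20 * 5
print(f"input DNA: {input_ng} ng")
```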
Samples of 5 μl genomic DNA were transferred to a new 96-well PCR plate with a 12-channel pipette. The MspI (New England Biolabs, catalogue number R0106L) digestion was conducted in a 30 μl reaction containing 3 μl of 10× NEB buffer 2, 1 μl of MspI (20 U/μl) and 21 μl H2O. To facilitate pipetting, a master mixture for 110 reactions, which compensates for reagent loss, was set up as follows: 330 μl of 10× NEB buffer 2, 110 μl of MspI and 2,310 μl of H2O. Next, 220 μl of the master mixture was added to each of the 12 wells in a row of a 96-well plate. From each of these wells, 25 μl was then pipetted to the sample/DNA plate using a 12-channel pipette. After carefully sealing the plate with one piece of adhesive tape sheet (Qiagen, Valencia, CA, USA, catalogue number 19570), the plate was spun down briefly, vortexed to mix, and spun again for 30 s at 2,000 rpm in a PCR plate centrifuge. The plate was then incubated overnight at 37°C in an incubator. A diagnostic gel can be run on select samples at this point to determine MspI digestion efficiency, although this is usually not necessary (Figure S2a in Additional file 1).
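The master-mixture arithmetic for the digestion can be checked with a short calculation. This is a minimal sketch, not part of the published protocol: the function and variable names are illustrative, and the assumption that each of the 12 reservoir wells serves the 8 samples of one plate column is inferred from the 96-well layout.

```python
def scale_master_mix(per_reaction_ul, n_reactions):
    """Scale each per-reaction volume (ul) by the number of reactions."""
    return {name: vol * n_reactions for name, vol in per_reaction_ul.items()}


# Per-reaction MspI digestion volumes from the text (the 5 ul of DNA is excluded).
mspi_mix = {"10x NEB buffer 2": 3, "MspI (20 U/ul)": 1, "H2O": 21}

batch = scale_master_mix(mspi_mix, 110)
total_ul = sum(batch.values())

# 220 ul goes into each of 12 reservoir wells; each well then feeds
# 8 samples (assumed: one plate column) with 25 ul apiece.
dispensed_ul = 12 * 220
needed_per_well_ul = 8 * 25

print(batch)
print(f"total {total_ul} ul, dispensed {dispensed_ul} ul, "
      f"spare per reservoir well {220 - needed_per_well_ul} ul")
```

Preparing 110 reactions for 96 samples leaves a margin both in the bulk mixture and in each reservoir well, which is what makes multichannel pipetting from a shallow well practical.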
Gap filling and A-tailing
Without deactivating MspI or cleaning up the digestion reactions, DNA end repair and A-tailing were conducted by adding Klenow fragment (3'→5' exo-) (New England Biolabs, catalogue number M0212L) and a dNTP mixture containing 10 mM dATP, 1 mM dCTP and 1 mM dGTP (New England Biolabs, catalogue number N0446S) directly into each well of the digestion plate. To simplify pipetting, an excess of master mixture (110×) containing 110 μl of the Klenow fragment (3'→5' exo-) and 110 μl of the dNTP mix was made, and an aliquot of 18 μl was pipetted to each of the 12 wells in a clean row of a 96-well plate; 2 μl of that mix was added to each sample using a 12-channel pipette. Next, the sample plate was sealed and spun briefly to bring down any liquid accumulated on the plate walls. The plate was vortexed to mix and spun for 30 s at room temperature using the plate centrifuge. The reaction was performed in a thermocycler (Eppendorf, Mastercycler EP Gradient S) without the heated lid. The program was set to 30°C for 20 minutes, 37°C for 20 minutes, then 4°C indefinitely. The two temperatures are needed to facilitate both reactions, the gap filling and the A-tailing.
A 2× volume of SPRI AMPure XP beads (Beckman Coulter, Brea, CA, USA, catalogue number A63881; 64 μl beads for 32 μl sample) was added to each well using an 8-channel pipette. Beads and samples were mixed by pipetting up and down at least five times. Then, the mixtures were incubated at room temperature for 30 minutes. After DNA binding, the 96-well plate was placed onto a DynaMag™-96 Side magnet (Invitrogen, catalogue number 123-31D) for 5 minutes. The supernatant was carefully removed from the side opposite the accumulated beads, and the beads were then washed twice with 100 μl of 70% ethanol. Five minutes after the second wash, the ethanol was removed, and the plate, still on the DynaMag™-96 Side magnet, was put into a fume hood to dry the beads for 10 minutes. After the beads had dried, 20 μl of EB buffer (New England Biolabs, catalogue number B1561) was added to each well using an 8-channel pipette. The plate was then covered with a new tape sheet, vortexed to resuspend the DNA, and spun down as described previously.
Multiplexed adapter ligation
A 110× ligation master mix was made for 96 reactions as follows: 330 μl of 10× T4 ligation buffer, 110 μl of T4 ligase (New England Biolabs, catalogue number M0202M), and 440 μl of H2O (1× volume: 3 μl of 10× T4 ligation buffer, 1 μl of T4 ligase, 4 μl of H2O). Master mix (72 μl) was added to each of the 12 wells in a clean row of a 96-well plate. Next, 18 μl of each Illumina TruSeq adapter (Illumina, Dedham, MA, USA, catalogue number PE-940-2001; from a 1:20 diluted 9 μM stock) was added to the corresponding well in the row (Illumina TruSeq adapters contain 5mC instead of C and can therefore be used for RRBS). After mixing the adapter-ligase mixtures, 10 μl of each was distributed to the corresponding samples using a 12-channel pipette. This brought the ligation reaction volume of each sample to 30 μl. The plate was placed into a thermocycler and incubated at 16°C overnight without the heated lid, because the heated lid could potentially destroy the ligase.
Library pooling and bisulfite conversion
After ligation, the plate was removed from the thermocycler and the beads were resuspended. Next, the plate was placed back into the thermocycler, and the enzyme was deactivated at 65°C for 20 minutes. It is important to note that the beads need to be resuspended prior to enzyme deactivation because resuspension is difficult after heating to 65°C. Samples were then pooled into eight 1.5 ml microfuge tubes. To bind the DNA back to the beads, a 2× solution (720 μl) of 20% polyethylene glycol (8,000 g/mol), 2.5 M NaCl was added to each tube. The samples were mixed and incubated at room temperature for 30 minutes to ensure maximum binding. After incubation, the samples were put onto a DynaMag™-2 magnet (Invitrogen, catalogue number 123-21D) and incubated for 5 minutes to allow bead attraction to the magnet. The liquid was removed, and the beads were washed with 1.0 ml of 70% ethanol. After removing the ethanol, the tubes were placed in the fume hood to dry the beads until cracks were observed (taking about 30 to 50 minutes). For eluting DNA from the beads, 25 μl of EB buffer was added to each tube; the tubes were vortexed for 20 s and were then centrifuged briefly. The tubes were placed back onto the magnet and the eluent (about 23 μl) was transferred to a new 1.5 ml microfuge tube. About 2 μl is lost due to adherence to the beads, and 3 μl of each sample was set aside for the ligation efficiency test by PCR as described previously [15], except that 0.3 μM of TruSeq primers (forward primer, 5'-AATGATACGGCGACCACCGAGAT-3'; reverse primer, 5'-CAAGCAGAAGACGGCATACGA-3'; Integrated DNA Technologies, Coralville, IA, USA) were utilized.
The remaining 20 μl samples were put through two consecutive bisulfite conversions, and bisulfite converted DNA was cleaned up as described in [15]. After determining the optimized PCR cycle number for each sample, a large-scale PCR reaction (200 μl) for each sample was performed as recommended [15].
Final SPRI bead clean-up
After the PCR was completed, the wells of each reaction were pooled into a 1.5 ml tube. A 1.2× SPRI bead clean-up (240 μl SPRI beads into a 200 μl library pool) was conducted as described above to remove PCR primers and adapter dimers. The DNA was eluted in 40 μl of EB buffer. To minimize adapter dimers, a second round of SPRI bead clean-up was performed at 1.5× (60 μl SPRI beads into a 40 μl library pool). The final library DNA samples were eluted with 40 μl EB buffer. The pooled libraries were quantified using a Qubit fluorometer (Invitrogen, catalogue number Q32857) and a Quant-iT dsDNA HS assay kit (Invitrogen, catalogue number Q-33120), and their quality was determined by running a 4 to 20% Criterion precast polyacrylamide TBE gel (Bio-Rad, Waltham, MA, USA, catalogue number 345-0061). An equal quantity of starting genomic DNA prevents a bias toward more concentrated libraries, so accuracy in these measurements is imperative for sequencing success. The samples were sequenced on an Illumina HiSeq 2000 machine at the Broad Institute Sequencing Platform.
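The bead volumes quoted for each SPRI clean-up follow directly from the bead-to-sample ratio. The sketch below recomputes them as a consistency check; the function name and the table layout are illustrative assumptions, while the sample volumes and ratios are those given in the text.

```python
def bead_volume_ul(sample_ul, ratio):
    """Volume of SPRI bead suspension (ul) for a given bead-to-sample ratio."""
    # Rounded to 0.1 ul to avoid floating-point noise in the product.
    return round(sample_ul * ratio, 1)


# (step, sample volume in ul, bead-to-sample ratio) as stated in the protocol.
cleanups = [
    ("after gap filling and A-tailing", 32, 2.0),
    ("first final clean-up", 200, 1.2),
    ("second final clean-up", 40, 1.5),
]

volumes = {step: bead_volume_ul(sample, ratio) for step, sample, ratio in cleanups}
for step, beads in volumes.items():
    print(f"{step}: {beads} ul beads")
```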
The MspI recognition cut site (C^CGG) creates fragments that will make the first three bases of every read non-random. This would result in high apparent cluster density, poor DNA cluster localization, and significant data loss during sequencing on the Illumina HiSeq 2000. To improve performance of these samples and increase coverage obtained, we used a method referred to as 'dark sequencing' in which imaging and cluster localization were delayed until the fourth cycle of sequencing chemistry, beyond the extent of bias from the MspI cut site (Figure S3 in Additional file 1).
To do this, we loaded a HiSeq 2000 with a custom recipe file co-developed with Illumina plus extra reagents to support primer re-hybridization. The custom recipe created a new initial 'template read' in which the first three biased bases were incorporated without imaging, followed by four cycles that were incorporated, imaged, and used by the sequencer for cluster localization. Next, the recipe removed the newly synthesized strand using NaOH and a buffer wash, re-hybridized fresh sequencing primer to the sample, and began read 1 data collection as usual from the first base but using the pre-existing cluster map or 'template' generated by the template read. HiSeq Control Software (HCS) provided by Illumina prevented cluster intensity files from the template read from entering downstream analysis.
As all custom chemistry steps were defined by the recipe, this workflow required very little additional hands-on time compared to a standard HiSeq run setup. The template read took approximately 6 h and consumed seven cycles of sequencing reagents prior to the start of data collection. Additional reagents to support re-hybridization after the template read were loaded at the beginning of the run alongside other read 1 and index read sequencing reagents. The following positions differed from the standard setup for an indexed single read run: Pos 16, 3 ml Read 1 Sequencing primer; Pos 18, 5 ml 0.1 N NaOH; Pos 19, 6 ml Illumina wash buffer.
After the removal of adapters and barcodes, 29 bp reads were aligned to the hg19 genome using MAQ. CpG methylation calling was performed by comparing each read with the genome sequence to detect bisulfite conversion.
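The principle behind read-versus-genome methylation calling can be illustrated with a toy example. This is a simplified sketch under stated assumptions, not the pipeline used in the study: it handles only a forward-strand, ungapped alignment, ignores base qualities and mismatches, and does not involve MAQ. The function names and the sequences are hypothetical. At a reference CpG, a retained C in the read is taken as methylated (protected from conversion) and a T as unmethylated (converted).

```python
def call_cpg_methylation(reference, read):
    """Map each covered reference CpG position to True (methylated) or False.

    `read` is assumed to be aligned without gaps to the start of `reference`
    on the forward strand.
    """
    calls = {}
    for i in range(min(len(read), len(reference))):
        if reference[i:i + 2] != "CG":
            continue
        if read[i] == "C":
            calls[i] = True    # C retained: protected from bisulfite conversion
        elif read[i] == "T":
            calls[i] = False   # C read as T: converted, hence unmethylated
        # Any other base is treated as uninformative and skipped.
    return calls


def methylation_level(calls):
    """Fraction of called CpGs that are methylated, or None if none were called."""
    return sum(calls.values()) / len(calls) if calls else None


# Hypothetical MspI-style fragment starting with CGG, with three CpGs.
reference = "CGGATCGTTACGA"
read = "CGGATTGTTACGA"

calls = call_cpg_methylation(reference, read)
print(calls)
print(methylation_level(calls))
```

A full implementation would also need to handle reads from the reverse strand, where conversion appears as G to A relative to the reference.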
RRBS data have been deposited at the Gene Expression Omnibus (GEO) under accession [GSE40429].