Genomic representations by LM-PCR are used by a number of different applications, including representational oligonucleotide microarray analysis (ROMA) (23), high-density SNP microarrays (25) and other epigenomic assays testing cytosine methylation (26–28). The HELP assay uses representations from HpaII to distinguish methylated from unmethylated loci in the genome, with a concurrent MspI representation defining the full range of potential HpaII-amplifiable fragments. The more fragments that can be represented, the greater the level of detail we can achieve in ROMA, SNP or epigenomic analyses. By increasing the representation of shorter fragments using dual adapters and modified PCR conditions, we achieve greater coverage of CG-dense regions in particular. By computational analysis, we show that the proportional coverage for CpG islands or our updated definition of CG clusters (18) approaches the maximum possible (98.5% and 98.6%, respectively). In addition, the number of fragments at each CG-dense locus and RefSeq promoter is increased 2- to 3-fold compared with the previous representation, which enables a more detailed analysis of DNA methylation in these promoter regions. We demonstrate that these shorter fragments (50–200 bp) can be labeled with the random priming technique and hybridized to microarrays with signal ranges comparable to those of larger fragments. Our practical lower limit for microarray analysis is constrained by the size of the oligonucleotides we use (≥50 nt), making it difficult to represent fragments smaller than the oligonucleotides themselves. For massively parallel sequencing applications, the sequences generated are generally shorter (~35 bp) and need only be long enough to allow accurate mapping to the reference genome, increasing the number of testable loci compared with microarrays.
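The computational analysis of representable fragments can be illustrated with a minimal in silico digestion. The sketch below is not the pipeline used in this study: it simply finds CCGG sites (cut by both HpaII and MspI between the first and second C) and keeps the site-to-site fragments that fall inside a size window. The function name `ccgg_fragments` and the 2000-bp upper bound are assumptions made for illustration; only the 50-bp lower bound is taken from the text.

```python
import re

def ccgg_fragments(sequence, min_len=50, max_len=2000):
    """Return (start, end) cut coordinates of CCGG-to-CCGG fragments.

    min_len reflects the ~50-bp practical lower limit discussed above;
    max_len is an assumed, illustrative upper bound on amplifiable size.
    """
    seq = sequence.upper()
    # 0-based start of every CCGG site on the top strand.
    sites = [m.start() for m in re.finditer("CCGG", seq)]
    fragments = []
    for left, right in zip(sites, sites[1:]):
        # The enzymes cut C^CGG, so each cut lies one base into the site
        # and the fragment length equals the distance between site starts.
        length = right - left
        if min_len <= length <= max_len:
            fragments.append((left + 1, right + 1))
    return fragments

# Toy sequence: three CCGG sites whose starts are 60 bp and 14 bp apart.
demo = "AAAA" + "CCGG" + "A" * 56 + "CCGG" + "A" * 10 + "CCGG" + "TT"
print(ccgg_fragments(demo))              # only the 60-bp fragment passes
print(ccgg_fragments(demo, min_len=10))  # a lower limit admits the 14-bp one
```

Lowering `min_len` mimics the effect described for sequencing-based readouts, where fragments shorter than the microarray oligonucleotides become testable.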
This new representation allowed us to design high-resolution HELP microarrays for the human, mouse, rat and cow genomes, each representing over 1 million loci throughout these genomes. By using a high-density platform, we can represent these loci on a single microarray, allowing analysis of the entire genome in a single hybridization experiment. We performed a high-resolution HELP assay on human ES cells to test how the cytosine methylation patterns observed correlate with genomic annotations. We found relative hypomethylation of CpG islands and CG clusters compared with other sequences, and less methylation at promoters compared with gene bodies or intergenic regions, patterns consistent with prior observations about the distribution of cytosine methylation (22
When moving epigenomic assays to the clinical setting, a critical issue to address is the potential to use the limited sample amounts that can be acquired from biopsies. As the generation of genomic representations in the HELP assay involves PCR, we tested whether the amplification step could allow us to generate adequate amounts of material from limited starting amounts of DNA. We used 10 ng of DNA in our studies as an amount representing approximately 2,000 cells, and found the profiles generated to be highly comparable with those generated using our usual microgram quantities of starting material. We conclude that the HELP assay can be used on relatively limited numbers of cells. Our ongoing studies are testing whether we can decrease the starting amounts of DNA still further.
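The relationship between DNA mass and cell number can be checked with a back-of-the-envelope conversion. The sketch below assumes roughly 6.6 pg of DNA per diploid human cell, a commonly quoted approximation that is not taken from this study; the function name is hypothetical.

```python
# Assumed approximate DNA content of one diploid human cell, in picograms.
PG_PER_DIPLOID_HUMAN_CELL = 6.6

def cells_from_dna_ng(dna_ng, pg_per_cell=PG_PER_DIPLOID_HUMAN_CELL):
    """Estimate how many diploid cells yield a given DNA mass (in ng)."""
    return dna_ng * 1000.0 / pg_per_cell  # 1 ng = 1000 pg

for mass_ng in (10, 1000):  # 10 ng versus 1 microgram of input DNA
    print(f"{mass_ng} ng ~ {cells_from_dna_ng(mass_ng):,.0f} cells")
```

Under this assumption, 10 ng corresponds to a cell number on the order of a few thousand, about 100-fold fewer than microgram inputs.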
We also found that the MspI representation on its own allows an accurate and detailed analysis of copy number. This approach is similar to the published ROMA technique (23) in that it uses genomic representations by methylation-insensitive restriction enzymes, but offers substantially higher resolution, with >1.32 million loci represented. With MspI/HpaII sites more frequent in (C+G) mononucleotide-rich regions, which also tend to be more gene-rich (9), the MspI representation offers selectively higher resolution in gene-rich regions, which is potentially advantageous. The HELP-CNV application allows DNA to be tested for cytosine methylation and copy-number variation simultaneously, a valuable tool especially in cancer research, in which epigenomic alterations (29) and copy-number changes (30) are frequent.
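One simple way to read copy number from MspI signal is a per-locus log2 ratio between a test and a reference sample, centered on the median so that the bulk of loci sit at zero. The sketch below illustrates that general idea only; it is not the HELP-CNV procedure described in this study, and the function name and toy intensities are hypothetical.

```python
import math
from statistics import median

def log2_copy_ratios(test_msp, ref_msp):
    """Median-centered log2(test/reference) of MspI intensities per locus.

    Both inputs are equal-length sequences of positive intensities,
    one value per MspI fragment, in the same locus order.
    """
    raw = [math.log2(t / r) for t, r in zip(test_msp, ref_msp)]
    center = median(raw)  # assume most loci are copy-number neutral
    return [value - center for value in raw]

# Toy data: the third locus has doubled signal in the test sample.
test_sample = [100.0, 200.0, 400.0, 100.0, 100.0]
reference = [100.0, 200.0, 200.0, 100.0, 100.0]
print(log2_copy_ratios(test_sample, reference))
```

In practice, adjacent loci would be smoothed or segmented before calling gains and losses; that step is omitted here.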
A major advantage of HELP-seq compared with microarray-based HELP appears to be its greater sensitivity for detecting hypomethylated loci. We believe that this is due to the lower background noise for sequencing compared with microarrays, which always generate a fluorescence intensity reading whether or not genuine signal is present. There are other potential advantages of HELP-seq, including the capacity to detect allelic differences in methylation, information about repetitive sequences and the ability to detect events in HpaII fragments smaller than the 50-bp lower limit for microarray studies. We also note that we identified a substantial number of HpaII sites (~3%) that are not present in the reference human genomic sequence, variability that would either not be captured by microarray-based approaches or would cause errors in them.
The ideal assay to test cytosine methylation would test every CG dinucleotide individually and quantitatively throughout the genome, preserving information about cis-relationships of methylation states between CGs, and allowing high sample throughput. No such assay exists at present. While massively parallel sequencing-based approaches promise to make nucleotide-resolution studies possible (2), at present their sample quantity demands and costs remain daunting. The HELP assay falls into a category of assays that screen the genome at lower resolution, the ‘discovery’ step that defines loci for more detailed, nucleotide-resolution studies. Other assays in this category include methylation-dependent restriction enzyme assays (32) and affinity-based assays using antibodies (1) or other natural methyl-binding proteins (3). The comparative advantages of HELP include its capability of using a single array and the easy technical validation of results, as the focus is solely on methylation at the HpaII/MspI sites (CCGG) generating the representations. While HpaII/MspI sites constitute only ~8% of the CG dinucleotides in the genome (9), the presence of methylation ‘states’ in cis that may extend over as much as 1 kb allows discovery assays to flag interesting regions by testing a subset of CGs. When we measure the proportion of CGs in the human genome residing in proximity to the HpaII sites on the HELP microarray we used, we find that approximately two-thirds are within 1 kb of these sites (Supplementary Figure 2). While epigenomic discovery approaches directly test only a minority of CGs, they have the potential of ‘flagging’ the majority of CGs in the genome. We conclude that the high-resolution HELP assay is technically simple and robust, offers the capacity to test both the epigenome and copy-number variability, and has the potential for use with limited numbers of cells and adaptation to massively parallel sequencing platforms.
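The proximity measurement described above (the proportion of CGs lying within a given distance of an assayed HpaII site) can be sketched with a sorted-list lookup. This is an illustrative reimplementation, not the code used for Supplementary Figure 2; it assumes all coordinates come from a single chromosome, and the function name and toy positions are hypothetical.

```python
from bisect import bisect_left

def fraction_within(cg_positions, site_positions, window=1000):
    """Fraction of CG positions within `window` bp of the nearest site."""
    sites = sorted(site_positions)
    if not sites or not cg_positions:
        return 0.0
    hits = 0
    for cg in cg_positions:
        i = bisect_left(sites, cg)
        # The nearest site is either just left (i - 1) or just right (i).
        nearest = min(
            abs(cg - sites[j]) for j in (i - 1, i) if 0 <= j < len(sites)
        )
        if nearest <= window:
            hits += 1
    return hits / len(cg_positions)

# Toy example: two assayed sites and five CG dinucleotides.
assayed_sites = [1000, 10000]
cg_sites = [100, 1500, 2500, 9500, 20000]
print(fraction_within(cg_sites, assayed_sites))  # 3 of 5 CGs are within 1 kb
```

Running the same calculation per chromosome and pooling the counts would give a genome-wide proportion.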