Genomic measurements of chromatin structure consist of two phases—isolation/separation of DNA associated with a particular type of chromatin, and characterization of the isolated nucleic acid pool. Fractionation techniques used in genomics experiments are often the same as those used for single-gene studies, but the measurement technology used is an “omics” technology rather than PCR or blotting. The two major types of fractionation used to study chromatin structure are nuclease digestion to enrich for protected genomic regions, and affinity techniques such as chromatin immunoprecipitation.
As an example of the first, DNase I has long been known to preferentially cleave regulatory regions of metazoan genes due to the relative absence of histones at these genomic loci. Similarly, micrococcal nuclease is typically used to determine nucleosome positions, since this nuclease exhibits a strong preference for linker DNA over nucleosomal DNA. These characteristics have allowed researchers to infer aspects of chromatin structure from broad genomic surveys of nuclease sensitivity. For example, a number of genome-scale studies have measured the locations of DNase I hypersensitive sites in human cell lines (4). Here, isolated nuclei are treated with a titration of DNase I, and cleavage sites are recovered and analyzed by microarray or sequencing for the identity of hypersensitive genomic sequences. A similar, but nuclease-independent, technique called FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) is an alternative method that enriches regulatory regions based on differential solubility caused by the differing amounts of protein associated with regulatory vs coding regions (7).
In terms of measurement technology, the DNA microarray was the dominant genomic measurement technology for a decade, but the incredible power of high-depth sequencing has recently spread from dedicated genome centers into wider circulation. As both DNA microarrays and DNA sequencing are relatively well understood, we touch only on advantages and disadvantages of these technologies for chromatin studies.
In a typical DNA microarray study, an isolated nucleic acid population is labeled with a fluorescent dye and hybridized to a microarray. Microarray resolution is limited by probe spacing (and even ultradense tiling does not necessarily achieve single-bp resolution, due to hybridization of sequences with extensive, but incomplete, overlap), and coverage is limited by probe number. Microarrays are relatively cheap, however (250-bp resolution whole-genome yeast microarrays cost roughly $200 each), and two-color hybridization schemes allow relative changes to be sensitively detected for both high- and low-abundance features.
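The sensitivity of two-color hybridization to relative changes can be illustrated with a short calculation. The sketch below is only an illustration, not a described analysis pipeline: the function name `log2_ratios` and the intensity values are hypothetical, and the inputs are assumed to be background-corrected intensities with one value per probe. It shows that a twofold enrichment gives the same log ratio at a bright probe as at a dim one.

```python
import math


def log2_ratios(sample, reference):
    """Per-probe log2(sample / reference) for two-color intensities.

    Both inputs are hypothetical background-corrected intensities,
    one value per probe, and must be strictly positive.
    """
    if len(sample) != len(reference):
        raise ValueError("channels must have one intensity per probe")
    ratios = []
    for s, r in zip(sample, reference):
        if s <= 0 or r <= 0:
            raise ValueError("intensities must be positive")
        ratios.append(math.log2(s / r))
    return ratios


# Illustrative (made-up) intensities: a twofold enrichment at a bright
# probe, the same twofold enrichment at a dim probe, and no change.
sample = [4000.0, 20.0, 400.0]
reference = [2000.0, 10.0, 400.0]
print(log2_ratios(sample, reference))  # [1.0, 1.0, 0.0]
```

Because the ratio is taken probe by probe, differences in absolute signal between probes cancel, which is why high- and low-abundance features can be compared on the same scale.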
So-called deep sequencing is increasingly used now (particularly in mammalian systems) and offers excellent spatial resolution (single base pair, in principle), and complete genomic coverage. Furthermore, sequencing provides allele-specific information in diploid organisms, whereas single-nucleotide discrimination is nontrivial in microarray studies. The two major sequencing methodologies used to date (more are already available but have not been widely published) have been 454 sequencing, which provides ~100,000 sequences several hundred base pairs in length, and Illumina 1G “Solexa” sequencing, which provides several million shorter (~30–70 bp) sequencing reads. Disadvantages of sequencing are higher cost (~$1000 per run), and the double-edged sword of complete coverage—sequencing mRNA from a mammalian cell will generate huge numbers of reads from housekeeping genes such as actin and GAPDH, meaning that less-abundant genes will yield much lower numbers of reads and higher experimental variability.
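The link between read number and experimental variability can be made concrete with a small worked example. The sketch below assumes that read counts are subject only to Poisson counting noise, which is a simplifying assumption and ignores biological and library-preparation variability. The run size of four million reads and the transcript abundances (in parts per million of the sequenced pool) are hypothetical values chosen for illustration.

```python
import math


def expected_reads(total_reads, abundance_ppm):
    """Expected reads for a feature that makes up `abundance_ppm`
    parts per million of the sequenced pool (hypothetical units)."""
    return total_reads * abundance_ppm / 1_000_000


def poisson_cv(mean_reads):
    """Coefficient of variation under pure Poisson counting noise:
    standard deviation sqrt(n) divided by mean n gives 1 / sqrt(n)."""
    return 1.0 / math.sqrt(mean_reads)


total = 4_000_000  # illustrative run of several million short reads
for name, ppm in [("abundant transcript", 10_000), ("rare transcript", 1)]:
    n = expected_reads(total, ppm)
    print(f"{name}: {n:.0f} reads, CV = {poisson_cv(n):.3f}")
```

Under these assumptions the abundant transcript receives about 40,000 reads with a counting error near 0.5%, while the rare transcript receives about 4 reads with a counting error near 50%, so most of the sequencing capacity is spent on features that are already well measured.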