DNase I footprinting has long been used in an in vitro context to interrogate protein-DNA interactions. However, application of this approach to the study of in vivo interactions has proven difficult, and only a handful of studies have been reported for highly targeted loci such as individual cis-regulatory elements. By coupling DNase I digestion of intact nuclei with massively parallel sequencing and computational analysis of nucleotide-level patterns, the digital genomic footprinting approach we describe now enables genome-scale detection of the in vivo occupancy of genomic sites by DNA-binding proteins. Although detection of individual binding events depends on the depth of sequence coverage at a given position, the approach takes advantage of the concentration of cleavages within DNase I hypersensitive regions. In mammalian genomes, DNase I cleavage is highly targeted to DNase I hypersensitive sites, which comprise only 1-2% of the genome in each cell type. As such, although the human genome is ~250-fold larger than the yeast genome, the collective span of human DNase I hypersensitive sites is only a few-fold larger than the entire yeast genome, and is therefore potentially addressable with only modest scale-up.
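As a rough illustration of the scale-up argument, the arithmetic can be sketched as follows. The sketch uses only the approximate figures quoted in the text (a ~250-fold genome size ratio and a 1-2% hypersensitive fraction); the function and variable names are illustrative assumptions, and the result is an order-of-magnitude estimate rather than a measured value.

```python
def hypersensitive_span_fold(genome_fold: float, dhs_fraction: float) -> float:
    """Collective span of hypersensitive sites, in units of the yeast genome.

    genome_fold:  size of the larger genome relative to yeast (e.g. ~250 for human)
    dhs_fraction: fraction of that genome lying in DNase I hypersensitive sites
    """
    return genome_fold * dhs_fraction


# Approximate figures taken from the text above.
HUMAN_VS_YEAST_FOLD = 250.0      # human genome is ~250-fold larger than yeast
DHS_FRACTIONS = (0.01, 0.02)     # hypersensitive sites span 1-2% of the genome

low, high = (hypersensitive_span_fold(HUMAN_VS_YEAST_FOLD, f) for f in DHS_FRACTIONS)
print(f"Human hypersensitive sites span ~{low:.1f}- to {high:.1f}-fold the yeast genome")
```

Under these assumptions, the sequence space to be covered at footprinting depth grows by a small multiple of the yeast genome rather than by the full ~250-fold difference in genome size.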
To date, genome-scale localization of regulatory factor binding sites has largely relied on a top-down approach centered on chromatin immunoprecipitation (ChIP). Digital genomic footprinting addresses several limitations of this approach. Whereas ChIP requires prior knowledge of each DNA-binding protein to be interrogated by genome-wide location analysis, and can be carried out on only one protein at a time, DNase I footprinting addresses all factors simultaneously in their native state, and detects regions of direct binding at nucleotide precision rather than inferring them from motif enrichment analysis. However, many regulatory factors share common binding sequences, so a footprint alone cannot always be assigned to a specific factor, whereas ChIP offers definitive identification of the protein of interest. The joint application of digital genomic footprinting with ChIP should therefore provide particularly rich information concerning the fine-scale architecture of cis-regulatory circuitry.
Digital genomic footprinting also provides a powerful tool for annotating the genomes of diverse organisms about which little is known beyond the genome sequence itself. In these contexts, top-down approaches to regulatory factor binding site localization are limited. By contrast, digital genomic footprinting can be applied to rapidly develop both a gene-by-gene map and a lexicon of major regulatory motifs.
Cis-regulatory alterations accompanying different growth conditions, cell differentiation, or cell cycling impact multiple regulators simultaneously and are difficult to study. The approach described herein is readily extensible to the analysis of such changes across the genome by sampling sequential time points to visualize cis-regulatory dynamics. Digital genomic footprinting therefore has the potential to expose and probe the cis-regulatory framework of virtually any sequenced organism in a single experiment, regardless of its prior level of functional characterization.