Type II restriction endonucleases cleave double-stranded DNA at a constant position with respect to a short (3–8 bp) recognition sequence (1). Their exquisite specificity has made them among the most useful tools in molecular biology (1). However, variables such as organic solvent, ion, small molecule and enzyme concentrations strongly affect the specificity of restriction endonucleases, often leading to cleavage at non-cognate sites (termed star activity) (3–7). Many commonly used restriction endonucleases show some star activity even under standard reaction conditions (3). The DNA substrate itself can also modulate cleavage: nucleotides flanking the recognition site contribute substantially to the energetics of cleavage (8–12). Quantitative analysis of star activity and flanking effects will help to elucidate the structure–function rules for restriction enzymes, define the window of optimal restriction endonuclease specificity and tailor reaction conditions toward novel target sequences.
Despite the conserved functionality among the restriction endonuclease family, these enzymes show great divergence in both sequence and mechanism (1). Apart from isoschizomers, most members show little sequence homology to each other or to other known proteins (1). Additionally, the variable distribution of base-contacting residues among the restriction endonucleases has confounded recognition sequence prediction (9). Consequently, restriction endonuclease characterization must be carried out empirically for each enzyme. Star activity (4) and flanking preference (8–12) have been investigated for several enzymes, but these experiments have been performed on homogeneous substrates: a series of oligonucleotides containing different star or flanking sequences is synthesized, annealed, cleaved and analyzed one by one, making exhaustive studies difficult. Recognition site determination is typically carried out by digestion of a homogeneous plasmid or virus DNA substrate followed by agarose gel visualization of cleavage products (6). This technique lacks both substrate complexity and sensitivity. A given cognate or star site may occur only a few times in these substrates, or not at all. The resulting lack of diversity in flanking nucleotides limits the ability to accurately quantify activity at different cleavage sites. Star activity is often several orders of magnitude lower than cleavage at the cognate site (3). Consequently, a large component of star activity will remain cryptic when cleavage products must be of sufficient abundance to be visualized on an agarose gel.
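The substrate-complexity limitation can be made concrete by counting how often a cognate site and its single-mismatch (candidate star) variants occur in a given substrate. The following is only a minimal sketch, not the method used in this work: the function names, the one-mismatch definition of a candidate star site and the toy sequence are all assumptions made for illustration, with the EcoRI recognition sequence GAATTC as the default.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def count_sites(sequence, cognate="GAATTC", max_mismatch=1):
    """Count windows within max_mismatch of the cognate site.

    Assumption for illustration: a candidate star site is any window
    differing from the cognate sequence at no more than max_mismatch
    positions. Only the given strand is scanned.
    """
    sequence = sequence.upper()
    k = len(cognate)
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if hamming(window, cognate) <= max_mismatch:
            counts[window] += 1
    return counts


# Toy substrate (hypothetical): one cognate site and one star candidate.
sites = count_sites("TTGAATTCAAGAATTGCC")
```

On a small plasmid or viral substrate, many of the possible single-mismatch variants would be expected to have a count of zero or one in such a tally, which is the sampling problem described above.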
Both the characterization of the growing amount of prokaryotic genomic sequence putatively coding for uncharacterized restriction endonucleases (26) and ongoing efforts to engineer altered specificities (22–25) will be aided by high-throughput methods to quantify restriction endonuclease activity in place of the methods currently available. For example, to characterize the genome-wide digestion patterns of the methylation-specific restriction endonuclease AbaSDFI (31), rat brain genomic DNA was digested with AbaSDFI to map 5-hydroxymethylcytosines; the digestion products were cloned into plasmids and Sanger sequenced one by one to map 122 cleavage sites to the rat genome. A similar strategy was used to demonstrate the relaxed specificity of the restriction enzyme TspGWI in the presence of sinefungin by Sanger sequencing 218 clones (5).
High-throughput sequencing has become a valuable tool for analyzing DNA–protein interactions. The ability to experimentally pair a DNA–protein interaction to a sequencing event has enabled techniques such as ChIP-seq (32) to provide sensitive statistics on transcription factor–DNA binding. We use derivatives of the RAD-seq (33) method to quantitatively measure restriction endonuclease activity across the sequenced Drosophila melanogaster and Escherichia coli genomes. This method specifically prepares DNA adjacent to restriction sites for Illumina sequencing, allowing the relative sequence counts of sites with different flanking nucleotides to be determined. The RAD-seq protocol was carried out with serial enzyme dilutions to identify flanking motif enrichment in enzyme-limiting reactions. Modifications were made to the protocol to sequence all cleavage events regardless of overhang, generating a complex profile of relative activities at cognate and star sites in a single experiment. We apply these methods to quantify the cleavage patterns of EcoRI and MfeI, to compare their star activity with that of their engineered high-fidelity counterparts and to quantify the effect of flanking nucleotides on MfeI activity.
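The idea of comparing relative sequence counts across flanking contexts can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis pipeline of this work: the function name, the `read_counts` input (a hypothetical mapping from site start position to the number of reads assigned to that site), the single-nucleotide flank and the toy data are all invented for the example. MfeI recognizes CAATTG.

```python
from collections import defaultdict


def flank_enrichment(genome, site, read_counts, flank=1):
    """Mean read count per site, grouped by (left, right) flanking bases.

    read_counts is a hypothetical input: a dict mapping the start
    position of each site occurrence to its read count. Sites without
    an entry are treated as having zero reads.
    """
    genome = genome.upper()
    k = len(site)
    totals = defaultdict(lambda: [0, 0])  # flank pair -> [reads, sites]
    # Start the search at `flank` so every hit has a full left flank.
    start = genome.find(site, flank)
    # Stop once a hit lacks a full right flank.
    while start != -1 and start + k + flank <= len(genome):
        left = genome[start - flank:start]
        right = genome[start + k:start + k + flank]
        totals[(left, right)][0] += read_counts.get(start, 0)
        totals[(left, right)][1] += 1
        start = genome.find(site, start + 1)
    return {key: reads / n for key, (reads, n) in totals.items()}


# Toy data (hypothetical): three MfeI sites in two flanking contexts.
toy_genome = "ACAATTGTGCAATTGAACAATTGT"
toy_reads = {1: 10, 9: 3, 17: 20}
enrichment = flank_enrichment(toy_genome, "CAATTG", toy_reads)
```

Normalizing by the number of sites in each flanking context, rather than comparing raw read totals, keeps contexts that are simply more common in the genome from appearing preferentially cleaved.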