The C2H2 zinc finger is the most commonly utilized framework for engineering DNA-binding domains with novel specificities. Many different selection strategies have been developed to identify individual fingers that possess a particular DNA-binding specificity from a randomized library. In these experiments, each finger is selected in the context of a constant finger framework that ensures the identification of clones with a desired specificity by properly positioning the randomized finger on the DNA template. Following a successful selection, multiple zinc-finger clones are typically recovered that share similarities in the sequences of their DNA-recognition helices. In principle, each of the clones isolated from a selection is a candidate for assembly into a larger multi-finger protein, but to date a high-throughput method for identifying the most specific candidates for incorporation into a final multi-finger protein has not been available. Here we describe the development of a specificity profiling system that facilitates rapid and inexpensive characterization of engineered zinc-finger modules. Moreover, we demonstrate that specificity data collected using this system can be employed to rationally design zinc fingers with improved DNA-binding specificities.
The Cys2His2 zinc finger is the most abundant class of DNA-binding domain found in human transcription factors (1). The considerable expansion of this domain in higher eukaryotes is likely related to its modularity and to its ability to specify a diverse range of different DNA sequences. Multiple research groups have shown that individual fingers can be re-engineered to recognize novel target sequences and that synthetic zinc-finger proteins (ZFPs) composed of multi-finger arrays fused to effector domains can be used to modify gene expression or genomic structure in human cells (2–6). The ultimate utility of engineered zinc fingers will depend upon developing the capability to rapidly generate synthetic ZFPs that are highly specific for their intended in vivo target sequence.
Although various methods for characterizing the DNA-binding specificity of a ZFP have been previously described, each of these methods has certain limitations that make it less than optimal for higher-throughput applications. In vitro approaches for determining the DNA-binding specificity of a transcription factor, such as Systematic Evolution of Ligands through EXponential enrichment (SELEX) or protein-binding microarrays (7,8), require the purification of each protein of interest and either multiple rounds of selection or specialized reagents and equipment; these requirements create barriers to the utilization of these systems for high-throughput applications. Other approaches that seek to comprehensively assess all possible variants of a given binding site, such as ELISA or yeast two-hybrid reporter assays, are useful for the analysis of short binding sites (9–11), but they become cumbersome when the number of positions varied in a target sequence increases to more than 4 bases (i.e. more than 4^4 = 256 combinations). Recently, we described a bacterial one-hybrid (B1H) system that provides a simple method for determining the DNA-binding specificity of a transcription factor (12,13). This methodology requires only basic molecular biology expertise to perform and it does not require protein purification. However, like SELEX, this approach requires the sequencing of many binding site clones to generate a composite recognition motif for a transcription factor, thereby decreasing its utility for high-throughput applications. The number of sequences required to generate an informative motif is inversely dependent on the length of the binding site recognized by the factor, but is also dictated by the nuances in specificity that need to be detected, e.g. 20 sequences will provide a rough approximation of the 10 base pair DNA-binding specificity of the Zif268 zinc-finger domain.
To facilitate its use in high-throughput applications, we modified the B1H system so that base pair preferences present in a pool of selected binding sites can be assessed in a single sequencing reaction. Analysis of binding sites in this manner is possible if all of the selected sequences in a population are in register with one another. With this restriction, the summation of the sequencing outputs from each clone within the population will be coherent. The register of the sequences can be fixed by using ‘constant’ zinc fingers of defined specificity to anchor the zinc finger to be characterized over the randomized portion of the experimental binding site (Figure 1). This pre-positioning ensures that the selected portion of each binding site is in an identical position in each DNA molecule. Gogos and colleagues (14) demonstrated the feasibility of this type of approach for naturally occurring ZFPs when using SELEX to characterize the DNA-binding specificities of individual fingers present in different splice variants of the transcription factor CF2.
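The register argument can be illustrated with a minimal Python sketch. It is not part of the published method: the function names and the example pools are assumptions made purely for illustration, with the sites chosen as variants of the Zif268 consensus 5′-GCG(T/G)GGGCGG-3′ and the offset flanking base invented. When every site is read in the same register, the per-position base fractions (a stand-in for summed chromatogram peaks) stay sharp; when some sites are offset by one base, the pooled signal blurs.

```python
from collections import Counter

def pooled_profile(sites):
    """Per-position base fractions for equal-length sites read in one register."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    profile = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        profile.append({base: counts[base] / len(sites) for base in "ACGT"})
    return profile

def consensus(profile):
    """Most frequent base at each position (ties go to the first of A, C, G, T)."""
    return "".join(max("ACGT", key=lambda b: column[b]) for column in profile)

# Illustrative pool: every clone is in the same register.
in_register = ["GCGTGGGCGG", "GCGGGGGCGG", "GCGTGGGCGG", "GCGTGGGCGG"]
# Same kind of sites, but two clones are offset by one base
# (the leading 'A' is a made-up flanking base).
out_of_register = ["GCGTGGGCGG", "AGCGGGGGCG", "GCGTGGGCGG", "AGCGTGGGCG"]

aligned = pooled_profile(in_register)
shifted = pooled_profile(out_of_register)

print(consensus(aligned))   # consensus of the in-register pool
print(aligned[0]["G"])      # fraction of G at the first position, in register
print(shifted[0]["G"])      # same position once two clones are offset
```

In this toy pool the first position is entirely G when the sites are in register but only half G after the offset, which is the loss of coherence that the anchoring fingers are meant to prevent.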
In this report, we describe the implementation and validation of a fixed binding site library approach in the context of the B1H site-selection system. We use this modified B1H system to profile the specificities of 34 engineered zinc-finger domains obtained using a previously described ‘bi-partite’ zinc finger selection strategy (15). We also demonstrate how specificity information generated from this method can be used to further optimize the binding specificities of engineered multi-finger domains.
The ‘Zif23*-binding site library’ (Figure 1A) was constructed using a specially synthesized oligonucleotide that minimizes bias within its randomized regions (Integrated DNA Technologies, Coralville, IA, USA). NotI and AscI restriction sites are present at each end of the library oligonucleotide (5′-CCGGGCGGCCGCTNNNNAAAAATNNNNNGGCGGCCGGGTAGGCGCGCCGAATTC-3′) for cloning into the pH3U3 reporter vector (12,13). The complementary strand was generated by primer extension using a complementary primer (5′-GAATTCGGCGCGCCTACCC-3′) in a 50μl reaction volume containing 1 × Taq DNA polymerase buffer (NEB), 0.3mM dNTP, 10μM library oligonucleotides, 30μM primer, and 2units of Taq DNA polymerase (NEB) in a single reaction cycle (95°C for 2min, 60°C for 3min, and 72°C for 10min). Following extension, the duplex DNA product was purified on a 3% TAE low-melting temperature agarose gel, and this DNA was recovered by electroelution (16). The duplex library was ethanol-precipitated to remove excess salts and digested with NotI and AscI sequentially. The digested product was purified by the same procedure described above for the double-stranded library DNA. The pH3U3 reporter vector was similarly digested with NotI and AscI. This digested backbone was purified on a 1% TAE agarose gel and isolated using a QIAquick Gel Extraction Kit (Qiagen). Test ligation reactions of different ratios of reporter plasmid and library insert were performed to optimize the ligation conditions. The optimized plasmid/insert ratio (1:6) was used to ligate the digested library into the pH3U3 reporter vector in 20μl containing 1 × T4 DNA ligase buffer (NEB) and 0.5μl T4 DNA ligase (400units/μl NEB) at room temperature for 30min. This ligation mixture was ethanol-precipitated, washed with 70% ethanol, air-dried, resuspended in 10μl of ddH2O, and then used to electroporate 70μl of electrocompetent XL1-Blue cells (Stratagene).
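As a sanity check on the oligonucleotide design described above, the following Python sketch confirms that the extension primer is the reverse complement of the 3′ end of the library oligonucleotide and locates the NotI (GCGGCCGC) and AscI (GGCGCGCC) recognition sequences. The two sequences are copied from the text; the helper names are assumptions, and the exact-match search deliberately ignores any sites that particular combinations of randomized (N) bases might create.

```python
LIBRARY_OLIGO = "CCGGGCGGCCGCTNNNNAAAAATNNNNNGGCGGCCGGGTAGGCGCGCCGAATTC"
EXTENSION_PRIMER = "GAATTCGGCGCGCCTACCC"
NOTI_SITE = "GCGGCCGC"
ASCI_SITE = "GGCGCGCC"

# N pairs with N, so randomized positions stay randomized on the other strand.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def site_positions(seq, site):
    """0-based start positions of every exact occurrence of `site` in `seq`."""
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]

# The primer can prime extension only if its reverse complement
# matches the 3' end of the library oligonucleotide.
primer_anneals_at_3prime_end = LIBRARY_OLIGO.endswith(
    reverse_complement(EXTENSION_PRIMER)
)
noti = site_positions(LIBRARY_OLIGO, NOTI_SITE)
asci = site_positions(LIBRARY_OLIGO, ASCI_SITE)
randomized = LIBRARY_OLIGO.count("N")

print(primer_anneals_at_3prime_end, noti, asci, randomized)
```

Each recognition sequence occurs once among the fixed bases, NotI upstream and AscI downstream of the randomized regions, and the oligonucleotide carries nine randomized positions (four in the Key and five in the binding site).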
Electroporated cells were immediately recovered for 1h in 10ml SOC at 37°C with shaking at 120r.p.m. Following recovery, 20μl of the culture was removed and 5μl drops of a series of 10-fold serial dilutions were placed on a 2 × YT plate containing 30μg/ml kanamycin to determine the number of independent transformants in the library. Kanamycin was added to the remaining cell culture to a final concentration of 30μg/ml and growth of the culture was continued at 37°C with shaking at 200r.p.m. After 4h of amplification, the cells were collected by centrifugation at 5000g for 10min and a plasmid miniprep protocol was used to isolate pooled binding site library DNA from half of the cells (QIAprep Miniprep Kit, Qiagen). A fraction of the recovered library was transformed into the bacterial selection strain (US0ΔpyrFΔhisB) and counter-selection was performed on ten 150mm × 15mm round plates containing 2.5mM 5-FOA YM agar medium to remove self-activating clones from the library (12). The number of transformants subjected to counter-selection (~6 × 10^6) exceeds the theoretical size of the library by more than 20-fold. Surviving colonies were washed off each plate using 10ml 2 × YT medium, 15–20 sterile 3mm glass beads and gentle agitation. The recovered cells were pelleted by centrifugation (5000g) at 4°C for 15min. Half of these cells were resuspended in 3.75ml P1 buffer from the QIAprep Miniprep kit and then aliquoted into fifteen 1.5ml Eppendorf tubes. Plasmid DNA was isolated using the standard Qiagen miniprep protocol except that the cell lysates were loaded onto five miniprep columns in the purification step and the optional PB wash step was performed.
We used a previously described bacterial two-hybrid (B2H) selection system to identify artificial C2H2 zinc fingers from a randomized library which was built using a strategy described by Isalan and colleagues (15). In this approach, randomized libraries are constructed by partially randomizing the recognition helix residues of two zinc fingers from the three-finger Zif268 DNA-binding domain. In these libraries, the unaltered portions of the Zif268 domain anchor the randomized finger recognition helices over a target DNA sequence. We constructed a C2H2 ZF library analogous to the ‘Zif23 library’ previously described by Isalan and colleagues (15) in which three residues within finger 2 and six residues within finger 3 of Zif268 are partially randomized (Figure 1B and C). Our library differs from the original Zif23 library in that it possesses greater variability in the amino acid residues present at each randomized position (Supplementary Figure 1). We termed this improved library the ‘Zif23* finger library’. In contrast to Isalan and colleagues (15), we then used the B2H system previously described by Joung and colleagues (17,18) (instead of phage display) to isolate artificial C2H2 ZFs with novel specificities from this Zif23* finger library. The Zif23* finger library was built using cassette mutagenesis in a plasmid which expresses each library member as a Gal11P-fusion, thereby permitting the DNA-binding activity of each library member to be interrogated using the B2H selection system (Figure 1D). Our new Zif23* finger library was constructed from ~1.3 × 10^8 independent transformants and the library contains reasonable distributions of amino acids within the randomized recognition helix positions as determined by inspection of 10 random candidates from our Zif23* library (data not shown).
Zinc finger proteins within the Zif23* library recognize a target DNA sequence of the form 5′NNNNNGGCG(g/t)3′ in which the constant recognition helix residues from Zif268 in finger 1 and in part of finger 2 bind to the constant bases of this sequence. Seventeen complementary selection strains that each harbor a different binding site of the form 5′XXXXXGGCGT3′ (Table 1) were constructed and sequence-verified (data not shown). Briefly, each target DNA sequence was cloned upstream of a weak promoter controlling the expression of a co-cistronic HIS3/aadA gene cassette and then transferred to a single copy F′ episome by homologous recombination using modified and miniaturized versions of previously described methods (17,19).
The B1H selection system was used to rapidly characterize the DNA-binding specificities of ZFPs identified from selection experiments performed with the Zif23* finger library (13). Briefly, each ZFP to be characterized was excised from its B2H expression vector by simultaneous digestion with NotI (NEB) and XbaI (NEB). This DNA fragment, which contains the engineered and anchor fingers, was purified on a 1% agarose gel, isolated using a QIAquick Gel Extraction Kit (Qiagen) and sub-cloned into an α-fusion expression vector (pB1H1) using the unique NotI and AvrII restriction sites (12,13). Electrocompetent US0ΔpyrFΔhisB cells containing each α-ZFP clone in pB1H1 were prepared and transformed with ~100ng of the Zif23* binding site library. Upon further investigation we found that the selection procedure can be streamlined by cotransforming the selection strain US0ΔpyrFΔhisB with the Zif23* binding site library (100ng) and the α-ZFP expression vector (1μg) using electroporation. The transformants were recovered in 1ml SOC for 1h at 37°C on a roller wheel at 80r.p.m. Following recovery, the cells were pelleted by centrifugation at 4000g for 10min at 25°C and resuspended in 1ml NM medium (a modified minimal medium (16)) supplemented with 0.1% histidine, 30μg/ml kanamycin and 30μg/ml chloramphenicol. The cells were grown for an additional hour at 37°C on a roller wheel. Following the NM media recovery step, the cells were transferred into a sterile 1.5ml Eppendorf tube and pelleted at 18000g for 1min at 25°C. The cell pellet was washed by resuspension in 1ml sterile water and pelleted again at 18000g for 1min at 25°C. This wash step was repeated two additional times to remove all traces of histidine and then the cells were resuspended in 0.5ml NM medium lacking histidine. The cells were plated on NM medium plates (150 × 15mm) containing 30μg/ml kanamycin, 30μg/ml chloramphenicol, 10μM IPTG, and either 1 or 2mM 3-AT at a density of ~5 × 10^6 transformants per plate (16).
These plates were incubated at 37°C for 1–2 days until well-defined colonies appeared. Surviving colonies were washed off the plate in 10ml of 2 × YT medium using 15–20 sterile 3mm glass beads and gentle agitation by hand. Harvested cells were pelleted by centrifugation at 5000g for 10min at 4°C. Plasmid DNA was isolated from the collected cells, or only a fraction of the cells if the cell pellet size exceeded the amount of cells recovered from a typical 5ml saturated culture, using a QIAprep Miniprep Kit treating the cell pellet as a single plasmid miniprep. The sequence preference of each ZFP was determined by sequencing the harvested pH3U3 reporter plasmids as a pool using a primer of sequence 5′-GAAATATGTATCCGCTCATGAC-3′. DNA prepared from the surviving colonies contains a mixture of both the pH3U3 reporter plasmids and the pB1H1 α-ZFP expression plasmid, and therefore approximately four times the amount of DNA typically required for a standard sequencing reaction was used. The pH3U3 reporter vector is a low copy number plasmid and occasionally the sequencing results from the plasmid mixture were unsatisfactory due to a high level of baseline noise. When this occurred the binding site region from the pool of selected pH3U3 plasmids was PCR-amplified by primers that flank this region (forward primer 5′-GAAATATGTATCCGCTCATGAC-3′; reverse primer 5′-CCAGAGCATGTATCATATGGTCCAGAAACCC-3′) in a 50μl reaction with 1 × Taq DNA polymerase buffer (NEB), 0.3mM dNTP, 0.5μM of each primer, 2units of Taq DNA polymerase (NEB) using 30 amplification cycles (94°C for 20s, 55°C for 30s and 72°C for 30s) followed by a 6min final extension at 72°C. The PCR product from this amplification was isolated by running a 1% TAE agarose gel and purified using a QIAquick Gel Extraction Kit (Qiagen). This purified PCR product was sequenced using the forward primer.
In retrospect, sequencing the PCR products generated from the pool of selected binding sites provided a more robust method for the analysis of the pooled binding sites.
Two chimeric ZFP modules (B2* and D1*) were generated by combining the recognition helices from two different pairs of proteins (B1/B2 and D1/D2). Single amino acid substitutions were introduced into clone B2 and D1 using a two-step PCR procedure with the mutant codon integrated into the overlapping primers (primer information is available upon request). Each newly assembled ZFP module was digested with NotI and XbaI (NEB) and inserted into pB1H1 between the NotI and AvrII sites for characterization using the B1H-binding site selection system as described above.
We tested the feasibility of using a fixed binding site library for the characterization of zinc-finger modules by using it to determine the binding specificity of two of the three zinc fingers from the transcription factor Zif268. To accomplish this, we constructed a ‘Zif23*-binding site library’ that contained two elements (Figure 1A): (i) a 10 base pair Zif268 binding site in which the five base pairs contacted by fingers 2 and 3 are randomized and the five base pairs recognized by finger 1 and a portion of finger 2 are fixed; and (ii) a clone discrimination ‘Key’, an additional four base pair randomized segment upstream of the Zif268 binding site that serves as an internal control to detect any sequence bias in the selected clones that is not due to interaction with the ZFP. The Key region also allows one to determine whether identical binding sites obtained more than once from a given selection are independent isolates should the sequencing of individual clones be desired. The sequences flanking each randomized region were chosen to avoid the creation of alternate binding sites for fingers 1 and 2 of Zif268 that might complicate interpretation of the selection results.
Synthetic oligonucleotides encoding the Zif23*-binding site library were introduced into the reporter plasmid pH3U3 upstream of a weak promoter that drives expression of the co-cistronic yeast HIS3 and URA3 reporter genes (Figure 1A) (13). The constructed library contains ~2 × 10^6 unique clones, ensuring significant oversampling of the 262144 (= 4^9) potential sequences encoded within the library. Although control experiments demonstrated that the frequency of auto-activating sequences (i.e. binding sites that activate transcription of the weak promoter in the absence of a test ZFP) in our library is only ~0.005%, we nonetheless used 5-FOA counterselection to reduce the number of auto-activating sequences as previously described. To do this, the library was transformed into our selection strain (US0ΔpyrFΔhisB) and plated on YM minimal media containing 2.5mM 5-FOA, thereby eliminating self-activating sequences due to toxicity associated with expression of the URA3 reporter gene (12,13,16). DNA from the ~6 × 10^6 colonies surviving on these plates was recovered as a pool to generate a purified binding site library for our binding site selection experiments. The purified library was sequenced as a pool and the resulting chromatogram reveals no significant bias within the randomized positions in either the Key or ZFP-binding site regions (Figure 2).
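The library-size arithmetic can be reproduced with a short Python sketch. The clone counts and the number of randomized positions come from the text; the function names are illustrative, and the estimate of how much of the sequence space is represented rests on an added assumption (uniform, independent sampling of sequences, i.e. a Poisson model) that the synthesis and cloning steps only approximate.

```python
import math

def theoretical_complexity(n_randomized, alphabet_size=4):
    """Number of distinct sequences encoded by n fully randomized positions."""
    return alphabet_size ** n_randomized

def fold_coverage(n_clones, n_variants):
    """Average number of clones per possible sequence."""
    return n_clones / n_variants

def expected_fraction_represented(coverage):
    """Fraction of sequences expected at least once (Poisson assumption)."""
    return 1.0 - math.exp(-coverage)

KEY_POSITIONS = 4    # randomized 'Key' segment
SITE_POSITIONS = 5   # randomized part of the Zif268 binding site

variants = theoretical_complexity(KEY_POSITIONS + SITE_POSITIONS)
library_coverage = fold_coverage(2e6, variants)          # ~2 x 10^6 unique clones
counterselected_coverage = fold_coverage(6e6, variants)  # ~6 x 10^6 colonies

print(variants)
print(round(library_coverage, 1), round(counterselected_coverage, 1))
print(expected_fraction_represented(library_coverage))
```

Under these assumptions the constructed library covers the sequence space roughly 7.6-fold and the counter-selected pool roughly 23-fold, so nearly every possible sequence is expected to be present at least once.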
The DNA-binding specificity of fingers 2 and 3 of Zif268 was determined by transforming a selection strain (already harboring a plasmid which expresses an α-Zif268 hybrid protein) with the Zif23*-binding site library (~2.5 × 10^6 transformants) and then selecting for HIS3 reporter gene expression by plating the transformants on histidine-deficient NM medium containing 2mM 3-AT (a competitive inhibitor of the HIS3 enzyme). This selection yielded ~3000 colonies after 36h of growth; these colonies were pooled and the mixture of reporter vectors from this population was isolated and sequenced. The previously defined consensus binding site (5′-GCG(T/G)GGGCGG-3′) of Zif268 (20,21) is readily apparent in the sequencing chromatogram from the selected pool of clones (Figure 2; randomized base positions highlighted in bold). This result demonstrates that the B1H-based binding site profiling method can be used to rapidly determine the DNA-binding specificity of a zinc-finger domain using only a single DNA sequencing reaction.
To further test the utility of the B1H profiling method, we wished to determine the specificities of a series of engineered zinc-finger domains using this approach. To obtain a large number of multi-finger domains with different DNA-binding specificities, we constructed a library of Zif268 variants in which three of the recognition helix residues in finger 2 (positions 3, 5 and 6 numbered relative to the beginning of the recognition helix) and six of the recognition helix residues in finger 3 (positions −1, 1, 2, 3, 5 and 6) were randomized (Figure 1B). As previously described by Isalan and colleagues (15), this randomized library can be used to select novel three-finger domains capable of binding to potential target DNA sequences of the form 5′-NNNNNGGCG(t/g)-3′ (where N can be any of the four possible nucleotides; Figure 1B). Using a previously described B2H selection system (17,18), we performed selections with this library against 17 different binding sites to identify novel three-finger domains capable of binding to each of these sequences. A small number (~10) of the hundreds of surviving clones from each of the 17 different selections were sequenced to identify the amino acid residues present at each randomized position and to estimate the diversity that remained in the selected pool of clones. As anticipated, for each target DNA sequence, the sequences of multi-finger domains identified by selection closely resemble one another; in addition, comparison of fingers from different selections reveals significant differences in the amino acid sequences obtained for each binding site, suggesting that the selection protocol identified members of the library with different DNA-binding activities (Supplementary Figure 2).
We used the Zif23*-binding site library and the B1H profiling method to characterize two multi-finger proteins isolated from each of the 17 selections described above (34 proteins in total; Table 1). The Zif23*-binding site library is compatible with these engineered zinc-finger domains because the unmodified portions of fingers 1 and 2 from the Zif268-framework serve as anchors to position the re-engineered portions of fingers 2 and 3 over the randomized base positions within the binding site library. To perform these binding site selections, a DNA fragment encoding each engineered three-finger ZFP was introduced into the pB1H1 expression vector to permit its expression as an RNA polymerase α-subunit fusion (13). To determine the DNA-binding specificity of each engineered domain, ~5 × 10^6 US0ΔpyrFΔhisB cells containing both the Zif23* binding site library and one of the 34 different α-ZFP expression vectors were plated on NM minimal medium containing 1 or 2mM 3-AT. Typically 10^3–10^4 colonies appeared after ~36h of growth at 37°C; the plasmid DNAs isolated from pools of these clones were sequenced. The vast majority of the pools displayed nucleotide bias within the randomized portion of the Zif268 binding site region that is partially or completely consistent with the expected specificity of each re-engineered Zif268 variant (Figure 2 and Supplementary Figure 3). Of the 34 Zif268 variants examined using this approach, >80% displayed good specificity for their target sequence (defined as a match between the expected target sequence and the tallest peak in the sequence chromatogram at three or more of the five positions; Table 2).
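The ‘good specificity’ criterion defined above (tallest peak matching the expected base at three or more of the five positions) can be expressed as a short Python sketch. The peak heights below are invented for illustration and are not measurements from this study; the function names and the dictionary representation of a chromatogram position are likewise assumptions.

```python
def tallest_bases(peaks):
    """Base with the tallest peak at each position (ties go to A, C, G, T order)."""
    return "".join(max("ACGT", key=lambda b: column[b]) for column in peaks)

def count_matches(expected, peaks):
    """Number of positions where the tallest peak equals the expected base."""
    return sum(e == c for e, c in zip(expected, tallest_bases(peaks)))

def has_good_specificity(expected, peaks, min_matches=3):
    """Apply the 'three or more of five positions' criterion."""
    return count_matches(expected, peaks) >= min_matches

# Hypothetical peak heights (arbitrary units) for the five randomized positions.
example_target = "GACTG"
example_peaks = [
    {"A": 5,  "C": 3,  "G": 90, "T": 2},
    {"A": 80, "C": 5,  "G": 10, "T": 5},
    {"A": 10, "C": 70, "G": 10, "T": 10},
    {"A": 30, "C": 20, "G": 25, "T": 25},  # tallest peak is A, not the expected T
    {"A": 5,  "C": 5,  "G": 85, "T": 5},
]

print(tallest_bases(example_peaks))
print(count_matches(example_target, example_peaks))
print(has_good_specificity(example_target, example_peaks))
```

Because the criterion considers only the tallest peak, it does not distinguish a strongly specified position from one that is barely preferred; that distinction still requires inspection of the chromatogram itself.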
Although some of the engineered variants show partially degenerate specificities, this is not unexpected since many naturally occurring C2H2 ZFs possess partially degenerate specificities (e.g. the known degeneracy in the binding site signature of Zif268 seen in Figure 2). In addition, the B2H selections used to isolate the engineered zinc-finger domains were not performed at the highest possible stringency and only a small number of variants (~10) were sequenced for each selection. Thus, the selections have a very low probability of identifying zinc-finger domains with fully optimized DNA-binding affinities and specificities.
Because the DNA-binding specificities of some of the 34 engineered zinc-finger domains were not completely optimal, we hypothesized that we might be able to use the specificity data generated using the B1H profiling system to rationally introduce mutations that improve the specificities of these modules. For example, for target binding sites B and D (Figure 2), the two different Zif268 variants we characterized for each site specified different subsets of bases within the intended five base pair target site. If we assume that the engineered three-finger domains utilize a canonical, Zif268-like recognition pattern (22,23), we can infer the amino acid residue in each finger that is responsible for defining the specificity of each base position in the target DNA binding site (Figure 1B). Based on this assessment, we designed new fingers that would be expected to possess improved specificities for target sites B or D. For example, the B1 and B2 three-finger domains were isolated from a selection with the intended target site 5′-GACTG-3′. Our binding site selection results indicate that the B1 module displays good specificity for four bases in its target site but only a weak preference for the T at the fourth position (5′-GACtG-3′, Figure 3). The B2 module also displays good specificity but for a different subset of four of the five bases in its target site: it weakly specifies A instead of G in the fifth position (5′-GACTa-3′). If the key recognition residues based on the structure of Zif268 are acting independently at these positions then substituting the histidine at position 3 of finger 2 from the B1 module for the asparagine at the same position in the B2 module should result in a hybrid protein (B2*) with the desired specificity. Information from deterministic and probabilistic recognition codes also predicts that histidine at position 3 should recognize a guanine in the middle position of a triplet subsite (24–26).
Moreover, histidine is the most commonly occurring amino acid at this position in the pool of fingers sequenced to recognize this site (Supplementary Figure 1). As shown in Figure 3, the B1H profiling system shows that the rationally designed B2* module fully specifies all five bases in the ‘B’ target site 5′-GACTG-3′ as expected. A similar result was observed for the D1 and D2 modules isolated from a selection with the target sequence 5′-GAATT-3′. The D1 module specified all of the positions within its binding site except the 5′ G (5′-mAATT-3′). The D2 module effectively specified the 5′ G in its binding site but poorly specified two of the central positions (5′-GAxxT-3′). Arginine occupies position 6 of the recognition helix in finger 3 of the D2 module, which should specify G, whereas valine occurs at position 6 of finger 3 in the D1 module, which is not expected to display a strong preference for any particular DNA base based on deterministic and probabilistic recognition codes (24–26). Substitution of arginine for valine at position 6 in the D1 module results in a module, D1*, that displays the desired DNA-binding specificity for the entire binding site (5′GAAtt3′). These experiments demonstrate that the DNA-binding specificity data generated by the B1H profiling system can be used to improve the DNA-binding specificities of engineered ZFPs.
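The reasoning used to pick candidate pairs for chimera construction can be sketched in Python. The target sites and observed profiles are those quoted above, where an uppercase letter matching the target marks a well-specified position and any other character marks a weakly or incorrectly specified one. The function names are assumptions, and the check itself presumes that recognition residues act independently, which does not always hold.

```python
def well_specified_positions(observed, target):
    """1-based positions where the observed profile shows the target base in uppercase."""
    return {i for i, (o, t) in enumerate(zip(observed, target), start=1) if o == t}

def chimera_could_cover(target, profile_a, profile_b):
    """True if, between them, the two modules specify every target position well."""
    covered = (well_specified_positions(profile_a, target)
               | well_specified_positions(profile_b, target))
    return covered == set(range(1, len(target) + 1))

# Profiles as written in the text (lowercase or 'x' = weakly specified).
site_b = ("GACTG", "GACtG", "GACTa")   # target, B1, B2
site_d = ("GAATT", "mAATT", "GAxxT")   # target, D1, D2

for target, first, second in (site_b, site_d):
    weak_first = set(range(1, 6)) - well_specified_positions(first, target)
    weak_second = set(range(1, 6)) - well_specified_positions(second, target)
    print(target, sorted(weak_first), sorted(weak_second),
          chimera_could_cover(target, first, second))
```

For both sites the weakly specified positions of the two modules do not overlap, which is what makes a single-residue swap between them a plausible route to a fully specified chimera.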
We have shown in this report that combining the B1H binding site selection system together with a fixed register binding site library permits a rapid and inexpensive qualitative assessment of the specificity of an engineered ZFP module. Once an appropriate binding site library is constructed and the nucleotide diversity within the randomized region is verified by sequencing to ensure that there is no significant bias that could adversely affect the interpretation of results, the specificity of an individual ZFP module can be determined in a single selection step: a single sequencing reaction performed on the pool of selected binding sites provides the relative distribution of bases present at each position. Since most artificial ZFPs are selected in a context where one or more fingers are randomized while others are held constant (due to inherent limitations on library complexity) (15,18,27,28), the B1H profiling system is complementary to a typical zinc-finger selection system. In principle, the binding specificities of the most promising modules from a zinc-finger selection can be rapidly determined using this profiling method and then combined with complementary selected modules to generate ZFPs with entirely novel specificities. Following assembly of the engineered modules, the specificity of the full protein can be interrogated using the standard B1H binding site selection system (13,16).
This type of high-throughput analysis should prove particularly powerful for the characterization of large databases of fingers or modules that are selected from a common finger library framework to specify different sequences. Although the experiments described herein have focused on the characterization of one-and-a-half fingers from a three-finger protein, the profiling system should also be amenable to the analysis of individual fingers (re-engineered by either selection and/or design) provided that they are placed in an appropriate framework with constant ‘anchoring’ fingers (24–26,29). Moreover, because the B1H system also works with other non-zinc-finger DNA-binding domains, it should be possible to employ this method to evaluate other classes of engineered DNA-binding domains, such as homeodomains (30) or homing endonucleases (31), where information about the specificity of only a portion of the domain is desired. It is important to note that the information obtained by sampling a pool of selected sequences using this method should not be considered quantitative, as each surviving clone may contribute a different amount of DNA to the pool due to differences in colony size. Furthermore, the relative peak heights in a chromatogram are susceptible to potential bias due to variability in the sequencing chemistry, such as the sequence-dependent efficiency of termination (32). However, recent improvements in sequencing chemistry and reaction conditions have resulted in much more consistent peak heights throughout a sequencing chromatogram (33,34). Concerns about potential chromatogram bias due to the sequencing chemistry can be alleviated by sequencing the complementary strand of the selected region to verify the observed base ratios (35). It is also worth noting that other selectable markers (GFP and CAT) have been developed and validated for the B1H system and that they may provide useful alternatives for performing these selections (36).
For instance, cell sorting using a GFP reporter could be used to isolate all clones above some desired total cell fluorescence threshold to create a pooled population of cells for analysis. Finally, this method will not provide any information about the co-variation of bases within the binding sites for a pool of selected clones, but this information can be obtained by sequencing individual clones from the selection (7,37).
The ability to use the specificity data produced from this technique to engineer zinc-finger chimeras with improved DNA-binding specificities highlights an additional application of this method. A similar type of specificity optimization of individual finger modules has been employed by the Barbas laboratory using an enzyme-linked immunosorbent assay (ELISA) to assess the specificity of fingers for a set of individual target sequences (27,38). However, this ELISA-based assay typically interrogates the specificity on a sequence-by-sequence basis. In contrast, analyzing a single pool from a B1H selection, instead of interrogating 4^N different sequences individually, has significant advantages with regard to throughput, although this sacrifices a direct comparison between individual sites. We would also note that the simple single residue substitutions that were successfully employed herein to generate chimeras with improved specificity may only work in certain instances because recognition helix residues in ZFPs do not always act independently to specify bases within their binding site. In some cases, there may be context-dependent effects at the level of both the DNA and protein sequence that influence base preferences within the binding site (21). Ultimately, the ability to rapidly and inexpensively characterize the specificity of a large number of zinc-finger clones should provide an important avenue for generating more comprehensive data sets on the DNA-binding specificities of not only zinc fingers but also other families of DNA-binding domains. Such data sets will provide an important basis for creating a more accurate description of the complex determinants for each family that are responsible for sequence-specific DNA-recognition. This technology should also have practical immediate uses in efforts to construct more specific artificial ZFPs for targeted gene regulation (39,40) or modification through the use of tethered nucleases (4,41–43).
Supplementary Data are available at NAR Online.
We thank Gary Stormo and Michael Brodsky for their helpful suggestions regarding these experiments. Work by S.A.W. and X.M. was supported by the NIH (NIGMS) grant 1R01GM068110 (S.A.W.). Work by S.T.-B. and J.K.J. was supported by start-up funds from the MGH Department of Pathology. J.K.J. is supported by the NIH (R01 GM069906 and R01 GM072621). Funding to pay the Open Access publication charges for this article was provided by the NIH (NIGMS) grant 1R01GM068110 (S.A.W.).
Conflict of interest statement. None declared.