ZiFiT (Zinc Finger Targeter) is a simple and intuitive web-based tool that provides an interface to identify potential binding sites for engineered zinc finger proteins (ZFPs) in user-supplied DNA sequences. In this updated version, ZiFiT identifies potential sites for ZFPs made by both the modular assembly and OPEN engineering methods. In addition, ZiFiT now integrates additional tools and resources including scoring schemes for modular assembly, an interface with the Zinc Finger Database (ZiFDB) of engineered ZFPs, and direct querying of NCBI BLAST servers for identifying potential off-target sites within a host genome. Taken together, these features facilitate design of ZFPs using reagents made available to the academic research community by the Zinc Finger Consortium. ZiFiT is freely available on the web without registration at http://bindr.gdcb.iastate.edu/ZiFiT/.
Engineered zinc finger proteins (ZFPs) are important tools for gene regulation and genome modification because they can be used to target functional domains to virtually any desired location in a complex genome (1,2). Zinc finger nucleases (ZFNs) consist of an engineered ZFP fused to a non-specific nuclease domain and can be used to create double-stranded breaks (DSBs) in specific endogenous genes (3). These DSBs can be exploited to induce highly efficient insertion or alteration of DNA sequences via homologous recombination at the targeted locus (4–11). Alternatively, imperfect repair of a ZFN-induced DSB by non-homologous end joining can lead to highly efficient generation of gene-specific knockouts (7,12–17).
Engineered C2H2 ZFPs comprise multiple (usually three to six) ZF domains joined together by fixed amino-acid linker sequences, typically TGEKP. Each individual domain conforms to the zinc-finger motif X(2)–C–X(2–4)–C–X(12)–H–X(3–5)–H (where X is any amino acid and the numbers in parentheses give the number of residues) which, when chelated with a zinc ion, forms a ββα fold. This structure presents a stabilized α-helix (the recognition helix) capable of making base-specific contacts with approximately three bases in the major groove of double-stranded DNA. Adjacent ZF domains in a ZFP typically specify adjacent DNA triplets, establishing specificity for an extended target site (18).
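As an illustration only, the C2H2 motif (read here as X(2)-C-X(2-4)-C-X(12)-H-X(3-5)-H, with X standing for any residue) can be written as a regular expression and used to locate candidate domains in a protein sequence. The function name and the toy peptide below are hypothetical; the peptide is synthetic and does not correspond to any real ZFP.

```python
import re

# X(2)-C-X(2-4)-C-X(12)-H-X(3-5)-H as a regular expression;
# "." plays the role of X (any residue).
C2H2_MOTIF = re.compile(r"..C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_domains(protein):
    """Return (start, end, matched_text) for each non-overlapping motif match."""
    return [(m.start(), m.end(), m.group()) for m in C2H2_MOTIF.finditer(protein)]


# Synthetic toy peptide (not a real protein): two motif instances
# joined by the TGEKP linker mentioned in the text.
finger = "AAC" + "AAA" + "C" + "A" * 12 + "H" + "AAAA" + "H"
toy = finger + "TGEKP" + finger

for start, end, text in find_c2h2_domains(toy):
    print(start, end, text)
```

A pattern match of this kind only flags sequences with the right spacing of Cys and His residues; it says nothing about folding or DNA binding.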
ZFPs can be engineered to recognize new DNA sequences by altering as many as six important residues in the recognition helix. Because ZFP libraries covering this sequence space (20^18 variants for a three-finger protein) cannot be built or adequately sampled using existing molecular biology techniques, considerable effort has gone into developing alternative methods for engineering multi-finger proteins (7,19–29). These methods typically involve identifying individual zinc-finger domains that recognize specific DNA triplet ‘subsites’ and then joining these domains together to create multi-finger proteins. The two most common ZFP engineering methods, modular assembly and oligomerized pool engineering (OPEN), both use variations on this general approach (2). Modular assembly usually assumes that a single domain (module) can recognize a specific DNA triplet regardless of the position of the triplet within the target site or the identities of neighboring fingers (i.e. it assumes binding of the ZF module is context-independent). Appropriate modules are simply joined together to create a ZFP that should recognize the target sequence. Modular assembly is relatively simple to accomplish; however, ZFPs generated using this method have been shown to have a high failure rate in vivo (30–32). In contrast, OPEN uses customized ‘pools’ of ZF modules selected to recognize triplets in a specific sequence context. These pools can be assembled to create combinatorial libraries (with up to one million unique solutions for a three-finger protein) from which the ZFPs best able to bind the chosen target DNA site are identified. Although OPEN is somewhat more labor intensive, it is more robust, with a higher success rate than modular assembly (7).
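The size of the fully randomized sequence space can be checked with a short calculation. The sketch below assumes 20 possible amino acids at each of six varied recognition-helix positions in each of three fingers (18 positions in total), and compares the result with the upper bound of one million members quoted for an OPEN library; the variable names are illustrative.

```python
AMINO_ACIDS = 20                 # assumed alphabet size at each varied position
VARIED_RESIDUES_PER_FINGER = 6   # "as many as six" residues per recognition helix
FINGERS = 3                      # three-finger protein

# Full randomization: 20^(6 * 3) = 20^18 possible proteins.
full_space = AMINO_ACIDS ** (VARIED_RESIDUES_PER_FINGER * FINGERS)

# Upper bound on OPEN library size stated in the text.
open_library = 10 ** 6

print(f"full randomization: {full_space:.2e} variants")
print(f"fraction covered by an OPEN-sized library: {open_library / full_space:.1e}")
```

The full space is roughly 2.6 × 10^23 variants, which is why pre-selected modules or pools are used instead of exhaustive libraries.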
ZiFiT provides a simple interface for scanning a DNA sequence to identify potential ZFP and ZFN binding sites. The updated version (3.2) identifies target sites for proteins engineered using either OPEN or modular assembly. ZiFiT 3.2 also provides several new tools to help researchers evaluate ZFP targets, including validated scoring schemes for ranking potential target sites, a tool for querying NCBI BLAST servers for potential off-target sites, and a seamless interface with the Zinc Finger Database [ZiFDB, a database of engineered ZFPs (33)].
The modular assembly engineering approach employs individual zinc-finger domains (modules) that have been pre-characterized (in the middle position of a three-finger array) to bind a specific DNA triplet subsite. Several (three to six) of these modules can be arranged and linked together to generate a ZFP that recognizes an extended DNA sequence corresponding to the desired target site. ZiFiT provides support for the three most commonly used module sets developed by three independent research groups (21,23,27).
OPEN utilizes pools of zinc-finger domains pre-characterized to bind a specific DNA triplet subsite in the first, second or third position of the target site for a three-finger ZFP (7). Appropriate pools are combined to generate hundreds of thousands of distinct solutions for a given 9-bp target DNA sequence. The best solutions are subsequently identified using a bacterial two-hybrid (B2H) assay (34). Although this method requires considerable effort, it reliably generates ZFPs that bind with high affinity and specificity to their intended target site (7,9,11,16).
The GNN score is an empirical estimate of the probability that a modularly assembled three-finger ZFP will provide >1.6-fold activation in the B2H assay [proteins that fail to meet this cutoff have been shown to fail to function in mammalian cells (30)]. Using probabilities based on evaluation of a set of 168 three-finger ZFPs generated by modular assembly, ZFPs designed to bind target sites containing 3, 2, 1 and 0 triplets of the form GNN (where N is any nucleotide) are predicted to have success rates of 59, 29, 12 and 0%, respectively (30).
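A minimal sketch of the GNN score lookup follows. It assumes the 9-bp target site is given 5′ to 3′ on the strand recognized by the ZFP and is split into three consecutive triplets; the function names are hypothetical and this is not the ZiFiT source code. The success rates are those quoted above from reference 30.

```python
# Predicted success rate by number of GNN triplets in a three-finger site (30).
GNN_SUCCESS_RATE = {3: 0.59, 2: 0.29, 1: 0.12, 0: 0.0}


def split_triplets(site):
    """Split a 9-bp site into its three triplet subsites."""
    site = site.upper()
    if len(site) != 9:
        raise ValueError("expected a 9-bp site for a three-finger ZFP")
    return [site[i:i + 3] for i in range(0, 9, 3)]


def gnn_score(site):
    """Look up the predicted success rate from the count of GNN triplets."""
    n_gnn = sum(1 for triplet in split_triplets(site) if triplet.startswith("G"))
    return GNN_SUCCESS_RATE[n_gnn]


print(gnn_score("GAAGCTGGG"))  # three GNN triplets
print(gnn_score("GAATCTAGG"))  # one GNN triplet
```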
The affinity score is an energy-based parameter that predicts which modularly assembled three-finger ZFPs are most likely to function by inferring the contributions of the individual modules. Scores are calculated by estimating the relative free energy contributions of individual modules from dissociation constants reported for modules in the middle (F2) position of a three-finger ZFP (27). These affinity scores have been calibrated to B2H fold-activation values for modularly assembled ZFPs tested in vivo. ZFPs with affinity scores less than five are expected to have adequate affinity to function in the B2H assay (35). These scores do not directly address specificity and are available only for ZFPs composed exclusively of Barbas GNN and TGG modules.
The ZiFiT 3.2 interface enables customizable searches for potential ZFP and ZFN-binding sites that can be targeted using either the modular assembly or OPEN engineering methods. After selecting ZiFiT from the menu bar, users select their preferred engineering method (modular assembly or OPEN) and target type (ZFP or ZFN), e.g. OPEN – ZFN. In all interfaces, users enter their DNA query sequence into the Sequence input box near the top of the page (Figure 1). Sequences can be submitted either in FASTA format or raw text. Ideally, sequences should be entered using uppercase characters to denote exons and lowercase characters to denote introns. Entering information in this format can facilitate target selection for certain experimental applications, such as generation of knockout mutations via ZFN-induced DSBs within a protein-encoding region. This function can be disabled by de-selecting the Exon/Intron Case Sensitivity check box immediately below the Sequence input box.
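The input conventions described above (FASTA or raw text, uppercase exons, lowercase introns) can be sketched as follows. This is an assumption-laden illustration rather than ZiFiT's own parser: the function names are hypothetical, and the fallback name ‘Unknown’ mirrors the default used for unnamed submissions.

```python
import re


def parse_query(text):
    """Accept FASTA or raw text; return (name, sequence) with case preserved."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    name = "Unknown"
    if lines and lines[0].startswith(">"):
        name = lines[0][1:].strip() or "Unknown"
        lines = lines[1:]
    return name, "".join(lines)


def exon_intervals(seq):
    """Return 0-based, half-open (start, end) intervals of uppercase runs."""
    return [(m.start(), m.end()) for m in re.finditer(r"[A-Z]+", seq)]


name, seq = parse_query(">ZFN-SAMPLE\nATGGCCgtaagt\nttcagGAGTAA\n")
print(name, exon_intervals(seq))
```

Keeping the original case in the parsed sequence is what allows a later step to decide whether a hit falls in an exon or an intron.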
The engineering method chosen by the user determines which sequences can be targeted. For modular assembly, users can choose one or more of the three available module sets by selecting the Barbas, ToolGen and Sangamo check boxes at the top of the page (21,23,26,27,36). Several studies have generated functional zinc fingers by combining modules from different sets (21). For OPEN, users indicate which pools they wish to use by choosing the corresponding target DNA triplet for each finger position in a three-finger array. All published OPEN pools currently available from the Zinc Finger Consortium are checked by default (7).
Users can specify the number of ZF modules to include in the Left Array and Right Array using dropdown menus below the Sequence input box. ZiFiT restricts the number of modules in accordance with the available reagents. OPEN reagents and Sangamo modules are specific to three-finger ZFPs or ZFNs, whereas practitioners of modular assembly implementing only Barbas and ToolGen modules may scan for individual target sites for ZFPs consisting of three to eight fingers, or for dimeric target sites for ZFNs consisting of three or four fingers for each target ‘half-site’. Users should note that although increasing the number of modules might be expected to confer enhanced specificity, this is not always the case because deformation of the DNA upon ZFP binding may limit the number of ZF domains that bind concurrently (37). In addition, because three domains are often sufficient to bind DNA with high affinity, longer ZFP arrays may require disrupted linkers between domains to prevent binding at unintended sub-sequences within the target site (24).
For ZFN targets, users must also specify the length of the spacer region between the binding sites for the left and right ZF arrays using the Spacer dropdown menu immediately below the Sequence input box. An active dimeric ZFN cleaves within the spacer region and its preferred length is dependent on the sequence and length of the amino-acid linker between the ZF and nuclease domains. Zinc Finger Consortium vectors harbor a linker that works with spacers of 5 or 6 bp of DNA (38–40). An additional linker permitting spacers with 6 or 7 bp of DNA has also been identified (40). By default, ZiFiT scans for ZFN targets with spacers of 5, 6, or 7 bp. Users can choose to limit the scan to only one or two spacer lengths.
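The spacer-constrained scan can be sketched as a sliding window over the query sequence. The sketch below rests on several assumptions that are not taken from the ZiFiT implementation: three-finger arrays (9-bp half-sites), a left array that binds the opposite strand (so the left half-site is reverse-complemented before it is checked), and a stand-in `all_gnn` test for whether a half-site is targetable. In practice the test would depend on the selected modules or pools. All function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse complement, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def all_gnn(half_site):
    """Stand-in targetability test: every triplet begins with G."""
    s = half_site.upper()
    return all(s[i] == "G" for i in range(0, len(s), 3))


def scan_zfn_sites(seq, spacers=(5, 6, 7), half_len=9, targetable=all_gnn):
    """Return (start, spacer_length, left_half, right_half) for each candidate."""
    hits = []
    for sp in spacers:
        total = 2 * half_len + sp
        for start in range(len(seq) - total + 1):
            left = seq[start:start + half_len]
            right = seq[start + half_len + sp:start + total]
            # Left array is assumed to read the opposite strand.
            if targetable(revcomp(left)) and targetable(right):
                hits.append((start, sp, left, right))
    return hits


query = revcomp("GAAGCTGGG") + "ATATAT" + "GTTGACGCA"
print(scan_zfn_sites(query))
```

Restricting `spacers` to a single value, for example `spacers=(6,)`, corresponds to limiting the scan to one spacer length.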
Advanced search options are accessible by selecting the Advanced link in the lower right hand corner (this link then toggles to Basic, which hides the Advanced options). Advanced options allow users to customize their scan by adjusting additional constraints. For example, users can restrict the minimum and maximum number of GNN, ANN, CNN and TNN triplets for reported targets. This feature can help users identify the best sites by restricting searches to more successful GNN-rich sites (13,30). Additional Advanced options available for a subset of the input interfaces include: (i) Ignore Asp overlap: This option is available to users of modular assembly; it refers to ‘target site overlap’ in which an Asp in position +2 of the recognition helix specifies a fourth base (41). This option is useful for troubleshooting why ZiFiT fails to return an expected target site; its use during design should be restricted to advanced users. (ii) Search both strands: Because ZFPs bind directionally in a 5′–3′ manner, they can be engineered to bind either strand of DNA. ZiFiT searches both strands by default. Additional guidance for using advanced options is provided on the ZiFiT Instructions and FAQ pages.
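Two of these options, the per-class triplet limits and the both-strands search, can be illustrated together. The sketch below is a simplified stand-in for the real scan: it treats every window of the requested length as a candidate site and keeps those whose GNN, ANN, CNN and TNN counts fall within user-supplied (minimum, maximum) bounds. The function names and the `limits` dictionary format are assumptions made for this example.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq):
    """Reverse complement, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def triplet_classes(site):
    """Count triplets by first base, e.g. Counter({'GNN': 2, 'ANN': 1})."""
    usable = len(site) - len(site) % 3
    return Counter(site[i].upper() + "NN" for i in range(0, usable, 3))


def within_limits(site, limits):
    """limits maps a class such as 'GNN' to an inclusive (min, max) pair."""
    counts = triplet_classes(site)
    return all(lo <= counts.get(cls, 0) <= hi for cls, (lo, hi) in limits.items())


def scan_both_strands(seq, limits, site_len=9):
    """Return (forward_start, strand, site_5_to_3) for each window passing limits."""
    hits = []
    for start in range(len(seq) - site_len + 1):
        window = seq[start:start + site_len]
        for strand, site in (("+", window), ("-", revcomp(window))):
            if within_limits(site, limits):
                hits.append((start, strand, site))
    return hits


print(scan_both_strands("GAAGCTGGG", {"GNN": (3, 3)}))
```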
ZiFiT scans the user-supplied input DNA sequence for potential ZFP binding or ZFN cleavage sites based on the selected engineering method and any additional user-defined restrictions. For each user submission, ZiFiT displays a graphic map of the submitted sequence with each target site (‘hit’) indicated above the sequence (Figure 2A). (Users may need to ‘enable pop-ups’ in their browser for this feature to function properly.) The submitted sequence is displayed as a red bar at the bottom of the map. When an Exon/Intron Case-Sensitive search is performed (see Program Input), exons are represented by thick red bars and introns by thin red lines. Hits are represented as short colored bars above the sequence track, with overlapping hits overflowing vertically into auxiliary tracks. Each bar is a clickable link to detailed target information on the main output page. For ZFN scans, hits are color-coded according to the length of the spacer.
The main output page opens with a summary of the search parameters, followed by a dropdown Sort By menu that can be used to sort individual hits based on position or score when available (see ‘Materials and Methods’ section). Because ZFN targets consist of two ZFP target sites, both of which must be targeted successfully, score-based sorting considers the score of the inferior scoring array site before the better scoring array target. When users implement an ‘Exon/intron case-sensitive’ scan for ZFN targets, a Filter intronic splice sites checkbox is present immediately to the right of the Sort By menu. Selecting this box will hide ZFN targets whose spacers occur within (or overlap) an intron (Figure 2B). In addition to its web-based output, ZiFiT also provides a text version of the output which can be downloaded as a .csv file from the top of the output page.
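The score-based ordering of ZFN targets described above amounts to sorting on a two-part key: the weaker half-site score first, then the stronger one. The sketch below assumes that higher scores are better (as for the GNN score) and uses hypothetical field names; a score for which lower values are better would need the comparison reversed.

```python
def zfn_sort_key(hit):
    """Key = (inferior half-site score, superior half-site score)."""
    worse, better = sorted((hit["left_score"], hit["right_score"]))
    return (worse, better)


# Illustrative hits only; the scores are GNN-style success rates.
hits = [
    {"name": "a", "left_score": 0.59, "right_score": 0.12},
    {"name": "b", "left_score": 0.29, "right_score": 0.29},
    {"name": "c", "left_score": 0.29, "right_score": 0.59},
]

# Higher is assumed better, so sort in descending order.
ranked = sorted(hits, key=zfn_sort_key, reverse=True)
print([h["name"] for h in ranked])
```

Hit "a" ranks last despite containing the best single array, because its weaker array limits the pair.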
Each hit is named using the description/comment line of the submitted FASTA sequence and an index number. If no sequence name is supplied by the user, this parameter is set to ‘Unknown’. Names for ZFN targets also include the spacer length. For example, for a submission with the FASTA description ‘>ZFN-SAMPLE’, the third ZFN target site with a spacer of 7 bp would be labeled ‘ZFN-SAMPLE-SP-7-3’. Immediately beneath each hit name, the double-stranded DNA target sequence is displayed, along with its position within the submitted sequence. Individual triplets within this sequence are highlighted with distinct colors denoting the targets of individual ZF domains. Each highlighted ZFP is linked to ZiFDB, a database of engineered ZFPs. Clicking these links automatically queries ZiFDB for available information regarding ZFPs tested against the same or similar target DNA sequences (33). If an expected functional activity score is available for a given ZFP, the score is presented on the same line as the target. ZiFiT currently provides two validated functional scoring schemes for modular assembly targets (30,35). Scoring schemes for OPEN targets are under development.
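The naming convention can be expressed as a small helper. The ZFN format follows the example in the text; the format used here for non-ZFN hits (name and index joined by a hyphen) is an assumption, as is the function name.

```python
def hit_name(fasta_description, index, spacer=None):
    """Build a hit label from the FASTA description, an index and, for ZFNs, the spacer."""
    name = fasta_description.lstrip(">").strip() or "Unknown"
    if spacer is None:
        # Assumed format for single ZFP targets; not specified in the text.
        return f"{name}-{index}"
    return f"{name}-SP-{spacer}-{index}"


print(hit_name(">ZFN-SAMPLE", 3, spacer=7))
print(hit_name("", 1, spacer=5))
```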
Individual ZF targets can be expanded using the ‘+’ to the left of the target name. Expanding a target reveals three types of information. (i) A table describing reagents that can be used to generate corresponding ZFPs. Each row in the table describes a reagent (pool or module) corresponding to a Triplet DNA sequence, which is color-coded according to its position in the double-stranded sequence above the table. Entries in the Reference Number column of the table give the names of reagents (either modules or pools) that are currently available to the academic research community through the Zinc Finger Consortium. Modules (and other reagents for performing modular assembly) are available from the non-profit plasmid distribution service Addgene (http://www.addgene.org/zfc) (42). OPEN pools are available by request from the Joung lab (jjoung/at/partners.org) and other reagents for practicing OPEN are also available from Addgene (http://www.addgene.org/zfc). (ii) Sequences of oligonucleotides that must be synthesized to create bacterial two-hybrid selection and/or reporter strains needed to screen or select ZFPs for DNA-binding activity (34,42). (iii) An Organism dropdown menu for selecting a host genome and a BLAST button that can be used to scan the selected host organism genome for exact and similar target matches. The BLAST button submits search parameters to NCBI and, via a popup, directs users to the NCBI website where they can initiate the query by selecting the ‘view report’ button. This is useful because it is generally desirable to avoid targeting sites that occur frequently in a genome (e.g. sites that fall within repeat regions). When using BLAST to search for genomic ZFN targets, the spacer is replaced with Ns to prevent it from positively influencing a scan. Due to the nature of the algorithm (and a fixed spacer size in the case of a nuclease), this query is not guaranteed to identify all similar sites.
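The spacer masking used for ZFN BLAST queries can be sketched in a few lines. The function name is hypothetical and the sketch covers only the construction of the query string, not its submission to NCBI.

```python
def masked_zfn_query(left_half_site, spacer, right_half_site):
    """Replace the spacer with Ns so only the two half-sites drive the match."""
    return left_half_site + "N" * len(spacer) + right_half_site


print(masked_zfn_query("CCCAGCTTC", "ATATAT", "GTTGACGCA"))
```

Because the number of Ns equals the spacer length, a query built this way matches only sites with that spacer size, which is one reason such a search may miss similar sites.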
ZiFiT output may need to complete loading before BLAST queries are accessible. ZiFiT is freely available on the web without registration at http://bindr.gdcb.iastate.edu/ZiFiT/.
National Institutes of Health [T32CA009216] to J.D.S.; National Science Foundation Graduate Research Fellowship to M.L.M.; National Science Foundation [DBI 0923827, DBI 0501678] to D.F.V.; National Institutes of Health [R01 GM069906, R01 GM072621, R01 GM088040], the National Science Foundation [DBI 0923827] and Massachusetts General Hospital Pathology Service to J.K.J.; National Science Foundation [DBI 0923827] to D.D. and D.R. Funding for open access charge: National Science Foundation [DBI 0923827].
Conflict of interest statement. None declared.
The authors respectfully acknowledge those who have shared their results with the zinc finger research community.