Engineered zinc-finger nucleases (ZFNs) can be used to introduce targeted alterations into genomes of model organisms, plants, and human cells.1, 2
Repair of ZFN-induced double-strand breaks (DSBs) by error-prone non-homologous end-joining (NHEJ) leads to efficient introduction of insertion or deletion mutations (indels) at the site of the DSB. Alternatively, repair of a DSB by homology-directed repair with an exogenously introduced donor template can promote efficient introduction of alterations or insertions at or near the break site.
Widespread adoption and large-scale use of ZFN technology have been hindered by the continued lack of a robust, easy-to-use, and publicly available method for engineering zinc-finger arrays. One approach, known as modular assembly, joins together pre-selected zinc-finger modules into arrays,3 a procedure simple enough to be practiced by any researcher. Some recent reports have demonstrated a high failure rate for this method,4, 5 although the consequent need to construct and test large numbers of ZFNs for any given target gene can be mitigated by using a more limited subset of modules.6 We recently described a robust selection-based method known as Oligomerized Pool ENgineering (OPEN),7 but the labor and expertise required to screen combinatorial libraries have limited its broad adoption.3 Sangamo BioSciences, Inc. has also developed a platform for engineering ZFNs, and although some details of this method have been published,8 its practice requires access to a proprietary archive of engineered zinc-finger units.9 Researchers may purchase customized ZFNs made by the Sangamo approach through the Sigma-Aldrich CompoZr® service, but the cost of these proteins9 limits the scale and scope of projects that can be performed.
Here we describe Context-Dependent Assembly (CoDA), a publicly available platform of reagents and software that is simple to practice and shows a success rate comparable to selection-based methods such as OPEN. With the CoDA approach, three-finger arrays are assembled using N- and C-terminal fingers that have been previously identified in other arrays containing a common middle finger. CoDA can be practiced using a large archive consisting of 319 F1 and 344 F3 units (Supplementary Tables 1 and 2) engineered to function well when positioned adjacent to one of 18 fixed F2 units (Methods). Thus, in contrast to modular assembly, CoDA does not treat fingers as independent modules but instead explicitly accounts for context-dependent effects between adjacent fingers,10, 11 thereby increasing the probability that a multi-finger array will function well. CoDA is rapid and requires no specialized expertise; multi-finger arrays can be constructed in one to two weeks or less using standard cloning techniques or commercial DNA synthesis.
Figure: Schematic overview of Context-Dependent Assembly (CoDA)
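The assembly logic described above can be illustrated with a minimal sketch. It assumes a toy archive in which each fixed F2 unit carries the F1 and F3 units that were identified next to it; the unit labels, the triplets, and the names `F2Context`, `TOY_ARCHIVE`, and `assemble_coda_array` are hypothetical placeholders, not entries from the actual CoDA archive or functions from the CoDA software. The sketch relies only on the standard orientation of zinc-finger arrays, in which F1 contacts the 3' triplet of the 9 bp site and F3 contacts the 5' triplet.

```python
from typing import List, NamedTuple, Optional, Tuple


class F2Context(NamedTuple):
    """One fixed middle finger plus the flanking units identified next to it."""
    f2_triplet: str       # middle triplet recognized by the fixed F2 unit
    f2_unit: str          # placeholder label for the F2 unit
    f1_by_triplet: dict   # 3' triplet -> F1 unit identified adjacent to this F2
    f3_by_triplet: dict   # 5' triplet -> F3 unit identified adjacent to this F2


# Hypothetical toy archive; the real units are in Supplementary Tables 1 and 2.
TOY_ARCHIVE: List[F2Context] = [
    F2Context("GCT", "F2-a", {"GAA": "F1-a1", "GTG": "F1-a2"}, {"GGA": "F3-a1"}),
    F2Context("GAC", "F2-b", {"GAA": "F1-b1"}, {"GCC": "F3-b1", "GGA": "F3-b2"}),
]


def assemble_coda_array(
    site: str, archive: List[F2Context] = TOY_ARCHIVE
) -> Optional[Tuple[str, str, str]]:
    """Return (F1, F2, F3) for a 9 bp site written 5'->3', or None if untargetable."""
    site = site.upper()
    if len(site) != 9:
        raise ValueError("expected a 9 bp target site")
    # F3 reads the 5' triplet, F2 the middle triplet, F1 the 3' triplet.
    f3_triplet, f2_triplet, f1_triplet = site[0:3], site[3:6], site[6:9]
    for ctx in archive:
        if ctx.f2_triplet != f2_triplet:
            continue
        # Both flanking units must come from the SAME F2 context.
        f1 = ctx.f1_by_triplet.get(f1_triplet)
        f3 = ctx.f3_by_triplet.get(f3_triplet)
        if f1 is not None and f3 is not None:
            return (f1, ctx.f2_unit, f3)
    return None
```

The key design point is that the F1 and F3 lookups are scoped to a single F2 context rather than drawn from one global table, which is what distinguishes this scheme from treating each finger as an independent module.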
To test the CoDA approach, we assembled 181 three-finger arrays and evaluated each for its ability to bind its cognate DNA target site using a well-established bacterial two-hybrid (B2H) reporter assay.4, 7 Previous work has shown that three-finger arrays that fail to activate transcription by more than 1.57-fold in the B2H reporter assay are likely to be inactive as ZFNs in mammalian cells4 and those that activate by three-fold or more have a high probability of functioning efficiently as ZFNs in zebrafish, plant, and human cells.7, 12–15 Of the 181 CoDA arrays we tested using the B2H reporter assay, <8% (14 arrays) activated transcription by <1.57-fold and >76% (139 arrays) activated transcription by >3.00-fold (Supplementary Fig. 1 and Supplementary Table 3). These failure and success rates (as predicted by the B2H reporter assay) are comparable to what we have previously observed with three-finger arrays made by OPEN (Supplementary Note). Because so few (<25%) of the CoDA arrays we tested gave <3.00-fold activation in the B2H reporter assay, our results suggest that one could potentially skip the B2H reporter assay step and move directly to testing in the desired cell type.
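As a worked check of the thresholds and percentages quoted above, the following sketch bins a B2H fold-activation value and recomputes the reported fractions from the stated counts. The function names and category labels are illustrative assumptions; the cutoffs (1.57-fold and three-fold) and the counts (14 and 139 of 181) are taken from the text, and the intermediate count is simply the remainder.

```python
INACTIVE_MAX_FOLD = 1.57  # "fail to activate by more than 1.57-fold" (from the text)
ACTIVE_MIN_FOLD = 3.0     # "activate by three-fold or more" (from the text)


def classify_b2h(fold_activation: float) -> str:
    """Bin a B2H fold-activation value using the two cutoffs quoted in the text."""
    if fold_activation <= INACTIVE_MAX_FOLD:
        return "likely inactive"
    if fold_activation >= ACTIVE_MIN_FOLD:
        return "likely active"
    return "intermediate"


def percent(count: int, total: int) -> float:
    """Express count/total as a percentage."""
    return 100.0 * count / total


TOTAL_ARRAYS = 181
counts = {"likely inactive": 14, "likely active": 139}
# Arrays in neither reported bin (derived here, not stated in the text).
counts["intermediate"] = TOTAL_ARRAYS - sum(counts.values())

for label, n in counts.items():
    print(f"{label}: {n}/{TOTAL_ARRAYS} = {percent(n, TOTAL_ARRAYS):.1f}%")
```

The arrays below three-fold are the 14 low and the remaining intermediate arrays together, 42 of 181, which is the "<25%" figure cited as the rationale for skipping the B2H step.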
We compared the efficacy of CoDA with that of modular assembly by using both approaches to construct three-finger arrays for 26 different 9 bp sites and testing these proteins for DNA-binding activity in the B2H reporter assay (Supplementary Table 4). We observed that, for these sites that can be targeted by both methods, CoDA outperforms modular assembly (Supplementary Fig. 2 and Supplementary Note). The most likely explanation for the relatively higher success rates of CoDA is its explicit consideration of context-dependent activities between fingers.10, 11 We note that these differences in success rates become potentially more pronounced when one considers that two functional arrays must be engineered to create a ZFN pair.
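The point about pairs can be made concrete with a small calculation. The sketch below assumes that the two arrays in a pair succeed independently, which is a simplification, and uses purely hypothetical per-array success rates (0.8 and 0.4) chosen for illustration; they are not the measured rates for CoDA or modular assembly.

```python
def pair_success(p_left: float, p_right: float) -> float:
    """Probability that both arrays of a ZFN pair are functional.

    Assumes the two arrays succeed or fail independently (a simplification).
    """
    return p_left * p_right


# Hypothetical per-array success rates for two methods, for illustration only.
p_method_a = 0.8
p_method_b = 0.4

per_array_ratio = p_method_a / p_method_b
per_pair_ratio = pair_success(p_method_a, p_method_a) / pair_success(p_method_b, p_method_b)

print(f"per-array advantage: {per_array_ratio:.1f}x")
print(f"per-pair advantage:  {per_pair_ratio:.1f}x")
```

Under these assumptions a two-fold difference per array becomes roughly a four-fold difference per pair, because the per-array rate enters the pair probability twice.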
We applied CoDA to engineer ZFNs for endogenous gene targets in zebrafish and plants. Using CoDA zinc-finger arrays that activated transcription at least three-fold in the B2H reporter assay, we constructed ZFN pairs for 24 gene targets in zebrafish, 13 gene targets in Arabidopsis thaliana, and one target present in two duplicated genes in soybean. CoDA ZFNs induced targeted indel mutations with high efficiencies in 12 out of 24 zebrafish target sites (≤1% to 16.7%; Supplementary Fig. 3), in six out of 13 Arabidopsis gene targets (1.1% to 8.4%; Supplementary Fig. 4), and in a target site present in two duplicated soybean genes in transformed root tissue (18.8% and 10.7%; Supplementary Fig. 4).
Table: Endogenous zebrafish and plant genes targeted by CoDA ZFNs
Our overall per-target success rate for obtaining mutations with CoDA ZFNs is 50% (19 out of 38 target sites) in zebrafish and plants, a frequency comparable to our success rate of ~67% (16 out of 24 target sites) with OPEN ZFNs in zebrafish, plants, and human cells (refs. 7, 12–15 and unpublished data). We note that, for CoDA, the success rate calculated per ZFN pair and per target site is the same, since a single ZFN pair is synthesized per site. Although we do not know why some CoDA and OPEN ZFNs fail to induce mutations, we hypothesize that chromatin state or DNA methylation of the site, or stability or folding of the protein, might be responsible. Regardless of the precise mechanism, we recommend that users of CoDA plan to make ZFNs for at least two target sites per gene of interest to increase the likelihood that at least one pair will successfully introduce mutations.
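The recommendation to target at least two sites per gene follows from simple arithmetic. The sketch below takes the observed per-target rate (19 of 38) as the chance that any one site yields mutations and assumes that sites within a gene succeed independently; that independence is an assumption made here for illustration, and the function name is hypothetical.

```python
def p_at_least_one(p_site: float, n_sites: int) -> float:
    """Chance that at least one of n independently targeted sites yields mutations."""
    return 1.0 - (1.0 - p_site) ** n_sites


p_site = 19 / 38  # per-target success rate reported in the text (50%)

for n in (1, 2, 3):
    print(f"{n} site(s): {p_at_least_one(p_site, n):.3f}")
```

With a 50% per-site rate, two sites give a 75% chance of at least one working pair and three sites give 87.5%, provided the failures are not driven by a gene-wide factor that would affect every site.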
CoDA still possesses some limitations compared to existing methods. Although modular assembly was less efficient than CoDA in our direct comparisons, modular assembly can potentially be used to target sites that CoDA currently cannot,5, 6 and one recent report demonstrated a comparable success rate of 23% for modular assembly using a more limited subset of modules.6 In addition, although CoDA accounts for context-dependence between adjacent fingers, it also has some limitations relative to selection-based methods such as OPEN. For example, CoDA constrains the identity of the middle finger (F2) and does not “balance” the effects of all three fingers on affinity and specificity of the final array. Moreover, CoDA in its current form guides assembly of arrays to 9 bp target sites, ignoring the identities of the adjacent upstream and downstream bases. Thus, for highly demanding therapeutic applications (e.g., introduction of alterations into human pluripotent stem cells13), ZFNs made by OPEN may still be preferable to those made by CoDA, and it may be necessary to engineer zinc-finger arrays with greater specificities. Nonetheless, our overall results demonstrate that CoDA provides a method for assembling zinc-finger arrays that accounts for context-dependent effects, is easier to perform than OPEN selections, and yields ZFNs that function efficiently for gene modification.
With the current archive of CoDA units, a potential ZFN target site can be found approximately once in every 500 bp of random sequence (Supplementary Note). However, actual targeting range can be higher or lower, depending upon genomic sequences. For example, ~81% of 27,305 unique protein coding transcripts in the zebrafish genome (Ensembl Zv8.57 database) contain one or more potentially targetable ZFN sites (mean of 4.37), a frequency equivalent to one potential site every ~400 bp of transcript coding sequence. By contrast, ~63% of 33,200 unique protein coding transcripts in the Arabidopsis genome (TAIR9 release) contain one or more potential ZFN target sites (mean of 2.45), a frequency equal to one potential site every ~790 bp of transcript coding sequence. To enable users to identify potential CoDA target sites in any given gene sequence, we have updated our publicly available web-based Zinc Finger Targeter (ZiFiT) program (http://bindr.gdcb.iastate.edu/ZiFiT/).
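To show what searching a sequence for candidate ZFN sites involves, here is a minimal sketch. It is not ZiFiT's implementation. It assumes a full site consists of two 9 bp half-sites on opposite strands separated by a spacer of 5 to 7 bp (an assumption about ZFN architecture that is not stated in this text), and it takes the set of targetable 9 bp sites as an input; the names `find_zfn_sites` and `targetable` and the example sequence are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_zfn_sites(
    seq: str,
    targetable: Set[str],
    spacers: Iterable[int] = (5, 6, 7),  # assumed spacer lengths, not from the text
) -> List[Tuple[int, int, str, str]]:
    """Scan the top strand for paired half-sites.

    Returns (start, spacer, left_site, right_site) tuples, where each site is
    written 5'->3' on the strand its zinc-finger array would bind.
    """
    seq = seq.upper()
    hits = []
    for spacer in spacers:
        total = 9 + spacer + 9
        for start in range(len(seq) - total + 1):
            # The left array binds the bottom strand, so read its site as the
            # reverse complement of the top-strand 9-mer.
            left_site = reverse_complement(seq[start:start + 9])
            # The right array binds the top strand directly.
            right_site = seq[start + 9 + spacer:start + total]
            if left_site in targetable and right_site in targetable:
                hits.append((start, spacer, left_site, right_site))
    return hits


# Hypothetical example: one targetable 9 bp site, placed on both strands.
targetable = {"GGAGCTGAA"}
example = "TTCAGCTCC" + "ACGTAC" + "GGAGCTGAA"
print(find_zfn_sites(example, targetable))
```

Dividing the length of the scanned sequence by the number of hits gives a sites-per-bp density of the kind quoted above for random sequence and for the zebrafish and Arabidopsis transcript sets.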
In summary, CoDA provides an effective alternative method for using publicly available reagents to engineer ZFNs. With CoDA, dozens of zinc-finger arrays can be rapidly assembled or commercially synthesized in 1 to 2 weeks without the need for labor-intensive selection and moved directly into cells for testing as ZFNs. We note that the rapidity and high success rate of CoDA enabled us to mutate 20 endogenous genes in three different organisms. CoDA will foster broader adoption of ZFN technology and also enable large-scale ZFN projects focused on multi-gene pathways or genome-wide alterations that are difficult to implement using existing methodologies.