The hallmark feature of TAL effectors that makes them such remarkably powerful tools for DNA targeting, their long arrays of 33–35 amino acid repeats that specify nucleotides in the recognition site in a straightforward and modular fashion, also makes them challenging to engineer. Commercial synthesis is effective (10) but expensive. PCR-based methods (11) carry the risk of artifact and recombination. Assembly by sequential ligation of sequence-verified modules (8) is inexpensive and assures array integrity, but is time consuming. The Golden Gate method, using the reagents we describe here, provides a cost-effective, robust and rapid solution. TAL effector constructs with arrays of up to 31 RVDs are assembled in just two cloning steps using a set of sequence-verified modules. Furthermore, the reagents provide great flexibility for cloning arrays in different contexts and expressing them in different organisms, either in our set of backbone plasmids for TALENs, TAL effectors, or TAL effector fusions to additional proteins, or by simple subcloning or Gateway recombination into other vectors.
Zhang et al. recently presented a protocol and set of templates for Golden Gate-like assembly that involves PCR amplification of modules, intermediary arrays and full-length arrays to yield TAL effector DNA binding domains with 13 RVDs fused in a backbone vector to VP64 (see also www.taleffectors.com). This marked a significant advance that enabled the authors to rapidly assemble custom arrays and demonstrate the utility of TAL effector-based proteins as custom transcription factors to activate endogenous genes in human cells. However, the method and plasmids we describe here offer more versatility for broader utility, not only with regard to the available contexts and portability of the arrays, as noted above, but also in array length. The ability to construct arrays ranging from 12 to 31 RVDs with our reagents allows fine-tuning for targeting and will be important for addressing the outstanding question of how array length relates to affinity and specificity. The broad range in array length also offers greater flexibility to systematically address other important questions, including the contributions of individual RVD–nucleotide associations to affinity and specificity, as well as the effect of position on mismatch tolerance (1). This could be accomplished, e.g., by starting with an array of minimal functional length and comparing the effects of adding or interspersing additional RVDs aligned to different nucleotides in the target.
Our method has the technical advantage of involving no PCR. Although the Zhang et al. repeat templates for different RVDs are codon engineered to guard against slippage and inter-repeat recombination during PCR amplification, this strategy does not prevent recombination between repeats carrying the same RVD, particularly if they are present in tandem. Also, in part because our method involves no PCR, it is less labor-intensive and time-consuming from day to day, even though it takes 2 days longer overall.
Though all of the custom arrays made for this study use just the four most common RVDs, our plasmid set includes modules with NK, which users might opt to substitute for NN to specify G, because NN sometimes associates with A. We note, however, based on data presented by Miller et al. (e in ref. 10), that NK also associates substantially with A in some contexts. Modules with yet additional RVDs can be generated readily by mutagenesis of an existing set.
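The RVD choice discussed above can be made concrete with a short sketch. It assumes the commonly used one-to-one code (NI for A, HD for C, NG for T, NN for G) and treats NK as an optional substitute for NN at G. The function name `rvds_for_target` and its `g_rvd` parameter are hypothetical and are not part of our plasmid set or software.

```python
# Illustrative sketch only: maps a target DNA sequence to one RVD per base.
# Assumes the common code NI->A, HD->C, NG->T, NN->G; all names are hypothetical.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}


def rvds_for_target(target, g_rvd="NN"):
    """Return the list of RVDs specifying `target`, read 5' to 3'.

    g_rvd selects the RVD used for G: "NN" (default) or "NK".
    """
    if g_rvd not in ("NN", "NK"):
        raise ValueError("g_rvd must be 'NN' or 'NK'")
    rvds = []
    for base in target.upper():
        if base not in RVD_FOR_BASE:
            raise ValueError(f"unsupported base: {base!r}")
        # Only G has an alternative RVD in this sketch.
        rvds.append(g_rvd if base == "G" else RVD_FOR_BASE[base])
    return rvds


if __name__ == "__main__":
    print(" ".join(rvds_for_target("GATC")))
    print(" ".join(rvds_for_target("GATC", g_rvd="NK")))
```

The sketch covers only the choice of RVD at each position. It does not model the context-dependent association of NN or NK with A noted above.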
Among the genes we selected for targeting with TALENs, we deliberately chose some for which targeting with ZFNs has proven difficult. For example, one of the most common mutations in patients with cystic fibrosis is a deletion of 3 nt (ΔF508) in CFTR; however, best efforts to engineer a ZFN for this position only succeeded in targeting a site >120 bp away, a distance that would likely compromise gene targeting efficiency (18). For our CFTR TALENs, the ΔF508 mutation resides within the spacer sequence at the site of TALEN cleavage. Similarly, we previously created herbicide-resistant tobacco plants by gene targeting with ZFNs that recognize and cleave the acetolactate synthase gene (24). The nearest ZFN that could be engineered to the desired site of modification was 188 bp away, whereas our TALENs cleave within 10 bp of the desired sequence modification. Finally, AT-rich sequences have been difficult to target with ZFNs; we successfully targeted two sites in the AT-rich (75.5%) Plasmepsin V gene of Plasmodium falciparum, which has an overall genome content of 80.6% AT (29). Generally, the high success rate of TALENs designed using our software, which found sites in diverse sequences on average every 35 bp, suggests that targetability of TALENs will prove superior to the public ZFN platforms, which are estimated to be capable of targeting on average every 500 bp. Indeed, we anticipate our estimate of targeting range is conservative, as some TALENs that do not follow our design principles still recognize and cleave DNA efficiently (10; Supplementary Table S2).
Activity varied among the TALENs we tested in the yeast assay. The reason for this is not clear. It could relate to expression levels or variability in the assay itself, but more likely, the data reflect inherent differences in the DNA binding affinity of the arrays, possibly related to their length and composition. The relationship of array length and composition to overall affinity is still an open question that must be addressed. The important conclusion for this study is that all of the TALENs were active, demonstrating that the targeting approach as well as the Golden Gate methods and plasmids for assembly are robust. Our results in Arabidopsis protoplasts and human cells, along with recent results from other groups (10), indicate that TALENs are likely to be broadly effective for genome engineering.
We have deposited all of our plasmids for constructing and expressing TALENs as well as TAL effectors with or without a stop codon in the non-profit clone repository AddGene (www.addgene.org). To complement our method and reagents, we have also made our software for TALEN site selection and design freely accessible as an online tool, the TAL Effector Nucleotide Targeter at http://boglabx.plp.iastate.edu/TALENT/. Although our success rate was high with TALENs designed using the software, we have not shown that it is ‘necessary’ to follow the guidelines on which the software is based. So, even though the guidelines place only relatively minor constraints on targeting, the online tool allows users to exclude them individually to increase candidate target site frequency. Also, because optimal spacing may differ for different TALEN architectures, the software provides the option to specify desired spacer lengths. In making these resources available, we hope to facilitate further characterization of TAL effector DNA targeting properties, broad adoption of TALENs and other TAL effector-based tools and further development of the utility of these unique DNA binding proteins.
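The two user-adjustable options described above, switching individual guidelines off and restricting spacer length, can be illustrated with a minimal sketch. It is not the code behind the TAL Effector Nucleotide Targeter. The function `find_talen_sites`, its parameters and the single example guideline (a T immediately 5′ of each binding site) are assumptions chosen for illustration, and both arrays are given the same length for brevity.

```python
# Illustrative sketch only; not the TAL Effector Nucleotide Targeter code.
# All names, and the single example guideline, are hypothetical.

def find_talen_sites(seq, array_len, spacer_lengths, require_5prime_t=True):
    """Yield (left_start, spacer_len, right_start) for candidate TALEN pairs.

    Coordinates are 0-based on the top strand. The left array reads the top
    strand; the right array reads the bottom strand, so its 5' flanking base
    lies just 3' of the right site on the top strand and appears there as A.
    Setting require_5prime_t=False switches the example guideline off.
    Both flanking positions must exist within seq in either case.
    """
    seq = seq.upper()
    for left_start in range(1, len(seq)):
        for spacer_len in spacer_lengths:
            right_start = left_start + array_len + spacer_len
            right_flank = right_start + array_len  # base 3' of right site
            if right_flank >= len(seq):
                continue
            if require_5prime_t:
                if seq[left_start - 1] != "T" or seq[right_flank] != "A":
                    continue
            yield (left_start, spacer_len, right_start)
```

On the toy input `find_talen_sites("TCCCGGCCCA", 3, [2])`, which uses unrealistically short arrays and spacer, the generator yields the single candidate `(1, 2, 6)`. Changing the first base to A removes that candidate unless `require_5prime_t=False` is passed, which mirrors how relaxing a guideline increases candidate site frequency.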