Designer DNA-binding proteins based on transcriptional activator-like effectors (TALEs) and zinc finger proteins (ZFPs) are easily tailored to recognize specific DNA sequences in a modular manner. They can be engineered to generate tools for targeted genome perturbation. Here, we review recent advances in these versatile technologies with a focus on designer nucleases for highly precise, efficient, and scarless gene modification. By generating double stranded breaks and stimulating cellular DNA repair pathways, TALE and ZF nucleases have the ability to modify the endogenous genome. We also discuss current applications of designer DNA-binding proteins in synthetic biology and disease modeling, novel effector domains for genetic and epigenetic regulation, and finally perspectives on using customizable DNA-binding proteins for interrogating neural function.
Recent advances in DNA sequencing technology offer enormous potential for understanding the genetic and epigenetic basis of nervous system function as well as disease processes. However, the detailed mechanisms of cognitive function and neuropsychiatric diseases remain poorly understood. In order to study cellular programs that orchestrate brain processes, tools that facilitate precise editing or regulation of the genome would enable systematic interrogation of neural function from molecules to behavior. We discuss advances in genome engineering technologies that provide efficient insertion, removal, replacement, or regulation of genomic sequence in a targeted manner.
By tethering different effector domains to engineered DNA-binding proteins of programmable specificity, such as TAL effectors (TALEs) or zinc finger proteins (ZFPs), it is now possible to achieve highly specific gene targeting or modulation.1−3 Designer nucleases, which couple the catalytic domain of restriction endonucleases to a DNA binding domain,4,5 have the potential to introduce specific mutations or desired transgenes almost anywhere within the genome. Targeted nucleases have been used in several gene therapy studies in mice, successfully treating diseases such as hemophilia or HIV in cellular and animal models.6,7 Such experiments suggest the possibility for similar strategies in the brain, from monogenic neurological disease to disorders with more complicated genetic contributions. Other functional variants of TALEs and ZFPs include transcriptional activators and repressors,2,8,9 opening the door to probe the epigenetic dynamics underlying brain function. Here, we review developments in targeted nuclease engineering with a focus on the emerging TALE technology, followed by a brief examination of current applications. Finally, we will discuss perspectives on applications of designer binding proteins for understanding neurological processes and disease mechanisms.
A variety of genome manipulation techniques have been developed over the past decades to reverse engineer the genetic contributions to cellular phenotype, including retrovirus-mediated transgene integration, site-specific recombinases, and gene targeting through homologous recombination (HR). Retrovirus-mediated transgene integration is carried out through random or semirandom insertion into hotspots within the genome, and can cause insertional mutagenesis that may confound experimental conclusions.10 Site-specific recombinases rely on the introduction of exogenous recombinase recognition sites into the targeted locus, which is undesirable.11 Finally, traditional gene targeting relies on HR, which suffers from a low absolute efficiency. Successful transgene insertion typically occurs in 1 out of 10^6 to 10^9 treated cells.12 As a result, low recombination efficiency is a bottleneck for gene targeting.
Despite this, much progress has been made in mammalian systems, especially in mice, due to clever selection strategies and the availability of embryonic stem cells (ES cells). In order to guard against random integrants and yield the desired recombined product, a double positive/negative selection strategy can be used.13 ES-cell based gene targeting, however, is slow and expensive, and for many model organisms germ-line competent ES cell lines are not available and selection/screening strategies do not work well.
More advanced tools that can combine highly specific, efficient, and scarless gene modification would be an ideal technological advance for more effective genome engineering. The development of targeted nucleases is proving to be a versatile way to overcome deficiencies with traditional gene targeting approaches.
Studies employing homing meganucleases (MNs), a class of site-specific endonucleases, were able to stimulate homologous recombination by several orders of magnitude.14,15 MNs are able to introduce double strand breaks (DSBs) at specific target sequences and stimulate homologous recombination by activating the endogenous cellular DNA damage repair machinery (Figure 1, Table 1). However, applications involving MNs for targeted genome editing have remained limited due to the low probability of finding MN-binding sites at the target genomic locus.16 Altering the substrate specificity of MNs has proved challenging, as their DNA binding and nuclease catalytic domains are coupled, making it difficult to reengineer specificity without affecting cleavage activity.
The development of zinc finger nucleases (ZFNs) has proved to be a versatile way to address these problems with MN engineering. Separating the DNA recognition and cleavage domains of the ZFN made it possible to alter substrate specificity without impairing endonuclease cleavage. Zinc fingers (ZFs) are protein structural motifs that direct some transcription factors to their genomic targets. ZF domains can be found in tandem arrays of individual fingers, with each finger within the larger ZF protein typically recognizing a 3 bp DNA sequence in a remarkably modular manner (17) (Figure 2a). As a result, it appeared that a customized array of individual zinc finger domains assembled into a ZFP could be designed to target a larger DNA sequence.
The first synthetic zinc finger nucleases were developed by fusing a ZF protein to the catalytic domain of the Type IIS restriction enzyme FokI.4,5 It was later realized that the FokI domain required dimerization to introduce a DSB,18 leading to the development of a paired heterodimeric nuclease design strategy that increased cleavage specificity while minimizing off-target activity (Figure 2b).19 The current generation of ZFN cleaves genome targets containing two 9–18 bp binding sites separated by a 5–6 bp spacer. This is important for nuclease specificity because cleavage should not occur at single ZFN binding sites. As a result, the overall 18–36 bp sequence specificity of a ZFN pair should be unique within even a complex eukaryotic genome.
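The uniqueness argument above can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not taken from the cited studies. It assumes a genome of random, equiprobable bases, a genome size of 3 × 10^9 bp (an illustrative figure for a large eukaryotic genome), and exact matching with no tolerance for mismatches; it also ignores the spacer. The function name `expected_random_hits` is hypothetical.

```python
def expected_random_hits(site_length_bp, genome_size_bp=3_000_000_000):
    """Expected number of chance occurrences of one specific sequence.

    Simplifying assumptions (illustrative only): bases are independent and
    equiprobable, both DNA strands are scanned, and only exact matches count.
    """
    # Number of start positions on both strands where the site could begin.
    positions = 2 * (genome_size_bp - site_length_bp + 1)
    # Probability that a random stretch of this length matches exactly.
    match_probability = 1 / 4 ** site_length_bp
    return positions * match_probability


# Compare a single 9 bp half-site with combined 18 bp and 36 bp specificities.
for length in (9, 18, 36):
    print(f"{length} bp: {expected_random_hits(length):.3g} expected chance sites")
```

Under these assumptions a single 9 bp half-site is expected to occur by chance many thousands of times, whereas an 18 bp or longer combined site is expected to occur less than once, which is consistent with the requirement that both ZFN monomers bind before cleavage occurs. Real genomes are not random, so this is only an order-of-magnitude guide.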
ZFNs have been used to target many genes in a wide variety of organisms (see reviews, refs (20 and 21)). However, successful ZFN design and application can often be quite difficult. The most basic design strategy, modular assembly, optimizes individual zinc fingers against target triplet DNA sequences and then links them together to target a larger sequence. This assembly method suffers from a high failure rate (22) because zinc-finger domains exhibit context-dependent binding preferences when assembled in an array. Individual ZFs are affected by their neighboring modules, so the selection of functional ZFNs requires an extensive and time-consuming screening process23,24 that also has a limited target sequence space.25 Zinc fingers are also known to exhibit low levels of off-target activity, although this can be minimized through careful selection and design.26,27
Similar to ZFPs, transcriptional activator-like effectors (TALEs) can also be customized to recognize specific DNA sequences. They were first discovered in plant pathogenic bacteria of the genus Xanthomonas and are virulence factors that are translocated into rice crops via a Type III bacterial secretion system.28 Once inside the host cell, TAL effectors then upregulate host genes by binding and activating target promoters in the host genome, leading to disease symptoms such as citrus canker and bacterial blight.
Naturally occurring TALEs contain a DNA-binding domain composed of a tandem array of largely identical modules, each typically 34 amino acids long (Figure 2c). Two amino acids, at positions 12 and 13 of each tandem repeat, are collectively known as a repeat variable diresidue (RVD) and have been identified to specify DNA binding.29,30 The target DNA sequence of a TAL effector can be deciphered from its sequence of RVDs using a simple cipher that maps each RVD to its target base. Four of the most commonly occurring RVDs in natural TALEs (Asn-Ile (NI), Asn-Gly (NG), His-Asp (HD), and Asn-Asn (NN)) preferentially recognize A, T, C, and G/A, respectively.30 The RVD Asn-His (NH) has recently been reported to recognize guanine more specifically.9
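The RVD cipher described above lends itself to a simple lookup table. The following is a minimal sketch that covers only the five RVDs named in the text; the function name `decode_rvds` is hypothetical, and the use of the IUPAC ambiguity code "R" (G or A) to represent the dual preference of NN is an illustrative choice rather than a convention from the cited work.

```python
# RVD-to-base cipher restricted to the RVDs mentioned in the text.
# "R" is the IUPAC ambiguity code for G or A (illustrative choice for NN).
RVD_TO_BASE = {
    "NI": "A",
    "NG": "T",
    "HD": "C",
    "NN": "R",
    "NH": "G",
}


def decode_rvds(rvds):
    """Return the predicted target sequence for an ordered list of RVDs."""
    bases = []
    for rvd in rvds:
        key = rvd.upper()
        if key not in RVD_TO_BASE:
            # Other RVDs exist in nature but are outside this simplified table.
            raise ValueError(f"RVD {rvd!r} is not covered by this simplified cipher")
        bases.append(RVD_TO_BASE[key])
    return "".join(bases)


print(decode_rvds(["NI", "NG", "HD", "NN", "NH"]))  # prints ATCRG
```

Each repeat is decoded independently here, which reflects the modularity described in the text; any context effects between neighboring repeats are not modeled.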
By assembling specific combinations of repeat monomers with different RVDs, synthetic TALE DNA-binding domains can be constructed to target novel DNA sequences. TALEs are increasingly being adopted because their DNA binding domains are more modular and easier to customize than those of zinc finger proteins. In addition, designer TALEs can be synthesized in less than 1 week.2,31−35 TALEs can be fused to the FokI catalytic domain to generate TALE nucleases (TALENs) and facilitate genome editing (Figure 2d).3,31−33,36,37
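The design step, choosing one repeat monomer per base of a desired target, can be sketched as a table lookup. This is a simplified illustration under stated assumptions, not a published design protocol: the function name `design_rvds` and its `guanine_rvd` parameter are hypothetical, only the RVDs NI, NG, HD, NN, and NH are considered, and any additional sequence constraints on real TALE targets are ignored.

```python
# One RVD per base, using the most common natural RVDs.
# NN (which also binds A) is the default choice for G in this sketch.
BASE_TO_RVD = {"A": "NI", "T": "NG", "C": "HD", "G": "NN"}


def design_rvds(target, guanine_rvd="NN"):
    """Return the ordered RVDs for a TALE repeat array matching `target`.

    `guanine_rvd` may be "NN" or "NH"; NH has been reported to be more
    guanine-specific.
    """
    if guanine_rvd not in ("NN", "NH"):
        raise ValueError("guanine_rvd must be 'NN' or 'NH'")
    cipher = dict(BASE_TO_RVD, G=guanine_rvd)
    rvds = []
    for base in target.upper():
        if base not in cipher:
            raise ValueError(f"unsupported base {base!r}; expected A, C, G, or T")
        rvds.append(cipher[base])
    return rvds


print(design_rvds("ATCG"))                    # ['NI', 'NG', 'HD', 'NN']
print(design_rvds("ATCG", guanine_rvd="NH"))  # ['NI', 'NG', 'HD', 'NH']
```

For a TALEN pair, a function like this would be called once for each half-site, with the two arrays fused to separate FokI monomers.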
The ability to modify the endogenous genome at will opens a wealth of opportunities for probing the genetic and epigenetic basis of cellular function and disease. Designer nucleases have already been widely adopted to generate novel animal and cellular models of diseases that more accurately model human disease genetics and mechanisms.38 For instance, ZFNs and TALENs can be used to introduce the exact mutations found in patient populations into animal and cellular models to study the causal role of specific disease mutations. Additionally, targeted nucleases enable gene targeting in a significantly broader spectrum of animal species beyond the traditionally used laboratory mice, thereby giving researchers the ability to choose the most appropriate animal and cellular models for their respective studies.
Designer nucleases can be used to increase the efficiency of genome modification in transgenically accessible animal models, as well as to enable the manipulation of organisms that are difficult to modify using conventional gene targeting methods. For example, embryonic stem cell (ES cell) based knockout generation in rodents is notoriously time-consuming and typically requires at least 1 year of backcrossing. By introducing ZFNs or TALENs into the pronucleus of single cell mouse embryos, knockout animals can be generated in as little as 4 months.39
The rat has traditionally been an intractable genetic model, which has been frustrating because it serves as an excellent model organism for complex human disease traits as well as for neurophysiological studies. Technical advances over the last 4 years have led to the capture of authentic rat ES cells40,41 and the subsequent production of knockout rats from modified rat ES cells.42 However, as in mice, ES cell targeting in rats also requires selection screening and laborious backcrossing to derive the desired genetic background. Both ZFNs and TALENs have been established as tools for robust targeted integration and in vivo gene knockout in rats.43,44 Targeted nucleases can significantly increase the targeting efficiency and reduce the overall time required for achieving successful gene knockout.
Designer nucleases are also promising for model organisms that have not proved to be amenable to traditional methods for genetic manipulation. Customized nucleases can be used to facilitate gene targeting in higher level mammals such as nonhuman primates, which have been intractable model organisms because monkey ES cells have so far failed to incorporate into host blastocysts.45,46 TALENs and ZFNs may be used to directly engineer embryos or sperm progenitor cells,47 thus overcoming the difficulties facing ES cell-based approaches. Particularly for neuroscience and neuropsychiatric disease studies, nonhuman primate animal models may be able to better recapitulate the disease mechanisms and behavioral sophistication manifested in human neuropsychiatric disease conditions.
Genetic manipulation of ES and induced pluripotent stem cells (iPSCs) is a powerful technique for studying the genetic basis of diseases that affect a particular cell type, developing patient-specific therapies, and investigating developmental gene regulation. While classical gene targeting in ES cells has been limited due to low HR rates of less than 1 in 10^6 cells, the advent of targeted nucleases has led to the development of many stem cell-based applications (see ref (48) for review).
Recent advances in the genetic analysis of patient genomes have uncovered large sets of genetic mutations implicated in disease. However, the complexity of human genetic backgrounds renders the study of the genetic basis of disease difficult. In addition to the locus of interest, cells derived from two distinct individuals will likely differ in over 10^6 positions. Targeted nucleases based on TALENs and ZFNs are already enabling researchers to introduce specific mutations of interest into isogenic pluripotent stem cells (PSCs) so that these isogenic cells can be converted into the relevant cell types for downstream studies (Figure 3).38 By recapitulating possible deletions, duplications, or other genetic mutations that are implicated in the disease pathophysiology, targeted nucleases can be used to develop new or more phenotypically accurate models of disease.
Nuclease-mediated editing of PSCs is of particular interest because these cell lines may be differentiated into almost every cell type. To facilitate the application of PSCs in disease modeling, TALENs and ZFNs can be used to generate cell type-specific reporter cell lines by inserting reporter genes such as fluorescent proteins into the loci of specific cell type markers. These reporter cell lines can be used in cDNA or small molecule screening assays to identify a combination of genes that are capable of directing the differentiation of stem cells into a specific cell type. Additionally, these reporter cell lines can be engineered to carry mutations implicated in particular disease states. They can then be differentiated into an appropriate cell type, while the reporter allows for tracking of cell line differentiation. This will guide the functional study of the disease mechanisms that result from specific genetic mutations in the cell type of interest.
Successful in vivo gene therapy has been reported with retroviral-mediated transgene insertion. Adeno-associated virus-based delivery of a functional copy of the retinoid protein RPE65 led to visual restoration in a canine model of childhood blindness,49 while lentiviral-mediated therapy in hematopoietic stem cells halted brain demyelination in human patients.50 However, there are important caveats with these delivery methods. All forms of viral-mediated gene replacement suffer from loss of endogenous regulation because of random or semirandom insertion of the viral vector. In some cases, this can also lead to the activation of proto-oncogenes by retrovirus enhancer activity.51
Designer nucleases targeted to endogenous loci would avoid both of these problems with viral therapy. One solution has been to pursue ex vivo therapeutics by performing targeted gene correction in ES and iPS cells followed by transplantation into patients. This approach to therapeutic cloning was successfully used to knock out the receptor and treat HIV in human patients,6 as well as to reconstitute the hematopoietic system of mouse models of sickle cell anemia.52 In addition to these cellular therapy methods, targeted nucleases also offer a more generalizable solution by enabling direct correction of gene mutations in affected cells in vivo. A recent study demonstrated ZFN-mediated insertion of the functional exons of blood coagulation factor IX into liver cells of a mouse model of hemophilia, a blood clotting disorder.7 This increased circulating levels of functional clotting factor, thereby restoring hemostasis.
Direct gene correction of these or similar disorders by TALENs is an exciting avenue for pursuing gene therapy in the brain. Some neurodegenerative genetic disorders are monogenic or largely caused by a small set of mutations, such as Huntington’s disease, yet many of these diseases have no known cure. Gene therapy strategies may be developed to treat these neurological genetic disorders through stereotactic delivery of therapeutic vectors into specific brain regions or brain-wide gene delivery through ventricular or spinal injection. Table 2 lists several especially attractive candidates for targeted nuclease-mediated gene therapy.
Although the versatility of targeted nucleases in successful strategies for gene therapy is highly encouraging for investigating neurological function, it is not yet clear if targeted nucleases can be applied in the brain. One important consideration for gene editing within the developed brain is the DNA repair capabilities of nondividing cells such as postmitotic neurons. Given that the versatility of nuclease-mediated therapies relies on homology-directed repair, it is important to better understand the DNA break repair capabilities of mature neurons.
Whereas nonhomologous end joining (NHEJ) should in principle occur in any stage of the cell cycle, homologous recombination is thought to largely occur during the S and G2 phases of the cell cycle53 with the sister chromatid acting as the repair template. Progress of the main pathway of DSB repair is cell-cycle dependent, with end resection blockage in the G1 stage. End resection is required to reveal the single-stranded DNA tails necessary for strand invasion, and cyclin-dependent kinases appear to be required for its activation in S and G2 phase. End resection is robust in some specialized neurons, with one recent study in terminally differentiated rod cells demonstrating both NHEJ and HR activity (by its single-stranded annealing subpathway) upon a meganuclease-stimulated double-stranded break.54,55 This is encouraging for the application of targeted nucleases in other types of neurons, possibly in the central nervous system.
Other effector domains on DNA-binding proteins can also be used (Figure 4). TALEs and ZFPs can be engineered into designer transcription factors2,8,9,56,57 by attaching transcription modulation domains to the DNA binding modules. These designer transcription factors can be targeted to specific genes in the genome to modulate their activity and probe their roles in brain function as well as disease states. It is important to consider that TALEs acting as transcription factors require physical binding to their DNA substrate to deliver their effector activity. As a result, the chromatin state of any locus of interest has to be taken into account. Open chromatin is typically defined as a region with DNase I hypersensitivity, so careful study of genome-wide analyses for DNase I hypersensitive tracts in the target cell type could be a way to improve the success of TAL effector targeting. Another open area is the fusion of TALEs to chromatin modifiers and regulators to modify the epigenomic state and affect transcriptional regulation. Other editing domains such as recombinases, transposases, or deaminases are alternative ways to modify genomic sequences. Finally, fluorescent proteins can be fused to DNA binding domains to illuminate chromosomal configuration.
Overall, the development of designer DNA binding proteins based on TALEs and ZFPs establishes a powerful platform for interrogating the function of genes in normal nervous system function as well as for engineering neuropsychiatric disease mechanisms. These technologies will be especially powerful when applied in specific cell types or brain circuits to dissect the role of gene mutations and epigenetic regulation in facilitating neural circuit function and computation. When combined with powerful readout methodologies at the molecular, circuit, and behavioral levels, targeted genome engineering technologies have the potential to significantly advance our understanding of the molecular basis of brain function.
We thank the entire Zhang laboratory for their support.
National Institutes of Health, United States
Both authors contributed equally to this work. P.D.H. and F.Z. performed literature review and wrote the manuscript.
Work described in this review was supported by grants from NIH (1R01NS073124), the McKnight, Gates, Damon Runyon, Keck, Klingenstein, and Simons Foundations, the Searle Scholars Program, Bob Metcalfe, and Mike Boylan. P.D.H. is supported by a James Mills Pierce Fellowship.
The authors declare no competing financial interest. Additional information can be found at the TAL Effector Resources Web site (http://www.taleffectors.com).