Here, we have systematically explored the potential of domain fusion to expand the number of native pseudo-dimeric single-chain LHE scaffolds for genome engineering applications, focusing on the recently described I-OnuI family (26, 43). To establish parameters for extraction of NTDs and CTDs from single-chain LHEs, and for development of a structure-independent method for generation of these domain fusion chimeras, we examined the structure/function relationships of chimeras generated by fusion of NTDs and CTDs extracted from I-OnuI and I-LtrI. Using insights from this work, we systematically generated domain fusion chimeras from I-OnuI, I-LtrI and four other I-OnuI family enzymes, and characterized their biochemical properties using yeast surface display. Our results suggest that simple direct fusion approaches can yield active enzymes in ~50% of cases, and that introduction of even limited variation into the interface residues allows for recovery of active enzymes from ~70% of domain fusion pairs.
A significant result emerging from our studies is that the linker peptide in single-chain LHEs not only forms important, predictable interactions with the NTD, but also functionally impacts the LAGLIDADG interface. Even when using a hybrid ‘1/2-and-1/2’ approach, which was designed to conserve important linker interactions, and which preserved activity in all native enzymes (e.g. E, left panel), we observed a few examples where alteration of linker composition led to a decrease in activity (e.g. incorporation of an ‘SGT’ bridge into Onu-Ltr, E). Therefore, in contrast to the flexible parameters that may be used in designing linkers to create single-chain versions of the homodimeric enzyme I-CreI, it is evident that the linker peptides in single-chain enzymes have evolved to interact in a meaningful manner with the domains, as well as with the interfacial region (39). Linker composition must therefore be taken into account in LHE engineering, not only in the development of a strategy to generate chimeric enzymes, but also potentially in later-stage optimization of a chimeric enzyme and in the optimization of single-chain LHEs whose domains have been engineered separately and later recombined.
Our exploration of C4 cleavage specificity provides a comprehensive data set for the capacity of I-OnuI family enzymes to cleave targets with varying sequences at the four middlemost base pairs, the ‘C4’. These data demonstrate that I-OnuI family enzymes have remarkably tight C4 specificity, exhibiting significant cleavage activity towards only approximately 4–8 of 256 possible sequences in this region. This specificity is retained in domain fusion chimeras. As each domain appears to contribute to the specificity at these central base pairs, domain chimerization will allow for considerable expansion of potential target sites, as the C4 nt are not currently targeted for engineering due to their unpredictable biochemistry. Furthermore, the AT-rich nature of the C4 targets that are typically cleavable by I-OnuI family enzymes suggests that the energetics of DNA unwinding in the C4 region is an important influence on LAGLIDADG cleavage efficiency, and is likely of central importance to the biochemistry of cleavage within this class of enzymes.
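To make the size of the C4 sequence space concrete, the following minimal Python sketch enumerates all 4^4 = 256 possible central four-base sequences and counts how many could be called AT-rich. The function names and the AT-richness cutoff (at least three of four positions being A or T) are illustrative assumptions, not definitions used in this study, and the sketch says nothing about which sequences are actually cleaved.

```python
from itertools import product

BASES = "ACGT"


def all_central_sequences(length=4):
    """Enumerate every possible DNA sequence of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]


def at_fraction(seq):
    """Fraction of positions in seq that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


c4_sequences = all_central_sequences()

# Hypothetical cutoff: call a C4 sequence AT-rich if >= 3 of 4 bases are A/T.
at_rich = [s for s in c4_sequences if at_fraction(s) >= 0.75]

print(len(c4_sequences))  # size of the full C4 sequence space
print(len(at_rich))       # sequences passing the assumed AT-rich cutoff

# Share of the C4 space covered by the ~4-8 cleavable sequences reported above.
for n_cleaved in (4, 8):
    print(n_cleaved, f"{n_cleaved / len(c4_sequences):.1%}")
```

Under these assumptions, the 4–8 cleavable sequences correspond to roughly 1.6–3.1% of the C4 space, which is also far fewer than the number of sequences that pass the assumed AT-rich cutoff.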
Our survey of structure-independent domain fusions of six I-OnuI family LHEs revealed several patterns that may potentially be exploited to increase the chance of a successful domain fusion among domains from any of the I-OnuI family enzymes. One obvious pattern is that certain domains (e.g. NTD of I-LtrI or the CTDs of I-OnuI, I-LtrI and I-PanMI) proved extremely amenable to direct domain fusion, resulting in highly active chimeric enzymes for the majority of pairs, whereas other domains (e.g. CTD of I-SscI) would not form active or even stable enzymes with any other domains. This effect was not related to the level of homology, as even chimeras of I-GzeI and I-PanMI, which share >70% identity, achieved only a 50% success rate (Supplementary Figure S8). Thus, choosing domain fusion pairs so as to include a promiscuous partner, and exclude non-promiscuous partners, is a simple method to increase the likelihood of obtaining an active enzyme from a direct fusion. A second important pattern is that domain fusion success was increased when a ‘common interface’ between partners was introduced which was native to one of the partner domains. For example, domain fusion chimeras were achieved in 7/10 instances when an I-OnuI domain was used with the I-OnuI-derived common interface. This observation may be exploited in a general approach to domain fusion by introducing residue variation, encompassing what is observed throughout the I-OnuI family, into the ‘common interface’ residue set for every fusion pair. With such an approach, our results suggest that small libraries could be screened with relatively minor effort to identify domain fusions with high levels of activity for the vast majority of domain pairs.
From our studies, it is evident that domain fusion using NTDs and CTDs extracted from single-chain I-OnuI family enzymes is an efficient approach to generating highly active chimeric enzymes that specifically cleave hybrid target sites. With a simple domain fusion strategy, we achieved ~50% success in generation of active chimeras, and by introducing limited variation into the interface residues, we were able to attain catalytically active chimeras for ~70% of those attempted with relatively minor effort. Our results further suggest that introducing interface residue variation into each domain, followed by the generation of a small library of enzymes for each domain pair, would lead to recovery of highly active chimeric enzymes from the majority of domain fusion pairings. Significantly, the close correlation we observed between ROSETTA energetics calculations and the observed stability and cleavage properties of chimeric enzymes derived from I-OnuI and I-LtrI supports previous work, in which structural analysis was used to create stable, active domain fusions from disparate LHEs (24, 25). Structural analysis of multiple members of the I-OnuI family could thus facilitate choice of optimal domain partners for direct fusion, further reducing the cost and effort of generating active chimeric enzymes. With the expanding set of characterized LHEs, these methods promise to markedly expand the number of starting scaffolds for engineering, thus enabling broader use of LHEs in genome engineering applications.