Discrete DNA binding is of fundamental biological importance, allowing the bound proteins to target their function to specific locations within the DNA. A more complete understanding of the molecular basis for specific DNA recognition is desirable, both to enable the engineering of proteins to bind at DNA sequences of choice, and to improve prediction of DNA specificity for the numerous uncharacterized DNA-binding proteins increasingly available from next-generation sequencing technologies. The type II restriction endonucleases (REs) and their companion DNA methyltransferases comprise two large, well-characterized families of DNA-binding proteins that exhibit exquisite sequence specificity. REBASE currently reports that 274 unique sequences are recognized among the more than 4000 biochemically characterized type II REs and companion DNA methyltransferases (1). The REs have typically evolved very great discrimination between their recognition sequence and all other DNA sequences (2), though the companion DNA methyltransferases may be somewhat less precise, since there is not the same level of selection against modifying non-cognate sites as there is against cutting non-cognate sites in the host genome (3). Crystal structures are available for 31 of the REs and 13 of the DNA methyltransferases, many of which are co-crystals with the enzyme bound to its DNA target (1). These structural studies reveal that specific DNA recognition is generally quite complex, involving both direct and water-mediated contacts that typically saturate the available hydrogen bonding potential of the functional groups on the DNA bases recognized (4). The type II REs have diverged to the point that they generally share little amino-acid sequence similarity, except in the case of isoschizomers that recognize the same sequence or close homologs that recognize related sequences (5). This lack of sequence conservation and the complexity of specific base recognition have made it difficult to predict which amino-acid residues determine recognition, or to rationally alter the DNA specificity of type II endonucleases.
Although it would be desirable to engineer enzymes to bind at and act upon any DNA sequence of choice, to date attempts to alter DNA specificity in the type II REs have met with only limited success. Early on it was hoped that structural information would guide the rational mutagenesis of key residues to alter particular base recognition, for example in EcoRI (6), or to add contacts to increase the length of the recognition sequence, for example with EcoRV (7). However, efforts to rationally alter specificity in these enzymes were largely unsuccessful. A significant effort was made to alter BamHI recognition. Guided by the structure (9), residues making specific base contacts in BamHI were extensively mutagenized, and an in vivo selection for binding using a catalytically inactive BamHI was employed in an attempt to change the BamHI recognition sequence (11). No mutants recognizing a new sequence were isolated, though a mutant was found that preferentially cut at the same sequence but required that the adenine base be methylated (11). Comparison of the structures of BamHI (GGATCC) and BglII (AGATCT), which share similar recognition sequences, led to the conclusion that type II REs ‘are remarkably resilient to alterations in the binding specificity’ (12). BstYI (RGATCY) recognizes a degenerate sequence that includes BamHI and BglII sites, as well as AGATCC. Attempts were made to alter BstYI to recognize only AGATCT (BglII) by a directed evolution approach, taking advantage of the M.BglII DNA methyltransferase for host DNA protection (13). A variant enzyme that preferred AGATCT 12-fold over AGATCC, and that no longer cut GGATCC, was obtained, but a complete change in specificity was not accomplished. Subsequently the structure of BstYI was determined (14). Analysis of the structures revealed that although BstYI, BglII and BamHI share many similarities and are ideal test cases for changing specificity, the enzymes use surprisingly distinct recognition strategies, leaving open the question of whether it is possible to completely switch specificity, even for the relatively conservative change of reducing BstYI recognition to specificity for only the BglII sequence (15). In another study, an approach that used random mutagenesis coupled with a genetic screen was employed in an attempt to alter the specificity of NotI (GCGGCCGC). This approach successfully isolated variants that recognize the wild-type sequence plus several sequences that differ at one base (16), but was not successful in generating a completely new specificity. Overall, the lack of success in altering specificity in the orthodox type II REs has suggested the general rule that changing a contacting residue results in a drop in catalytic activity but not a change of specificity (17).
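The relationship between the degenerate BstYI site and the BamHI and BglII sites can be made concrete by expanding the degenerate bases. The following is a minimal illustrative sketch and is not part of the work described here; the helper name `expand_site` is hypothetical, and only a subset of the IUPAC nucleotide code (R = A or G, Y = C or T, N = any base) is included.

```python
from itertools import product

# Subset of the IUPAC nucleotide code (an assumption of this sketch):
# R = purine (A or G), Y = pyrimidine (C or T), N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}

def expand_site(site):
    """Return every concrete sequence matched by a degenerate site."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in site))]

bstyi_sites = expand_site("RGATCY")
print(bstyi_sites)  # ['AGATCC', 'AGATCT', 'GGATCC', 'GGATCT']
```

The expansion lists GGATCC (BamHI), AGATCT (BglII) and AGATCC, together with GGATCT, which is the reverse complement of AGATCC and therefore the same site read from the opposite strand.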
Somewhat more success in generating enzymes with new specificity has been achieved with the unorthodox type II REs. One such approach successfully generated new specificities by joining two existing half-site recognition domains into new combinations. It was observed that naturally occurring recombination could create new specificities in type I R-M systems, where each half site was derived from a different parental R-M system (18). This was exploited in vitro to generate type I R-M systems having new, hybrid specificities (19). This approach has recently been extended to type II REs that, like the type I enzymes, recognize split sequences, resulting in hybrid enzymes that recognize a new sequence consisting of one half site from each of two parental enzymes (20).
A different approach was applied to alter specificity for Eco57I [CTGAAG(16/14)] (21). This took advantage of the unorthodox nature of this type IIG enzyme, which is a fused endonuclease-DNA methyltransferase (22). In this approach, endonuclease activity was abolished while DNA methylation activity was retained and the enzyme was randomly mutated. Methylase selection (23) was then performed with a separate endonuclease that recognizes a different sequence, to isolate variants that now recognized and modified this new sequence and so protected it against the selecting endonuclease. This yielded an Eco57I variant that changed recognition at the fourth position from A to R: CTGRAG. This approach successfully produced an enzyme with altered specificity; however, because the method requires an existing endonuclease for the methylase selection step, it is limited to the generation of variants having previously known recognition sequences or subsets thereof.
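The change from CTGAAG to CTGRAG relaxes recognition at one position rather than replacing it, which can be checked position by position. The sketch below is illustrative only and is not taken from the cited work; the function name `is_subset_site` is hypothetical, and it assumes that both sites have the same length and are written on the same strand (reverse complements are not considered).

```python
# Subset of the IUPAC nucleotide code (an assumption of this sketch).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}

def is_subset_site(narrow, broad):
    """True if every sequence matched by `narrow` is also matched by `broad`.

    Compares the allowed bases at each position; assumes equal-length
    sites written on the same strand.
    """
    return len(narrow) == len(broad) and all(
        set(IUPAC[n]) <= set(IUPAC[b]) for n, b in zip(narrow, broad)
    )

print(is_subset_site("CTGAAG", "CTGRAG"))  # wild-type site within the variant site
print(is_subset_site("CTGRAG", "CTGAAG"))  # the reverse does not hold
```

Under these assumptions the first call returns `True` and the second `False`: the variant site still covers the wild-type sequence while also admitting G at the fourth position.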
The homing endonucleases have proved more amenable to rational specificity change than the type II REs, as evidenced by the recent successful alteration of a C:G to a G:C base pair through computational redesign of the specific contacts to this base pair (24). Additionally, significant progress has been made in engineering homing endonuclease DNA specificity, particularly with the enzyme I-CreI, through an approach of semi-rational mutagenesis and efficient screening for desired specificity changes (25–29). However, the type II REs have to date proved recalcitrant to engineering changes in recognition specificity.
MmeI is an unusual type II restriction enzyme that cuts DNA two turns of the helix away from its asymmetric recognition sequence (30–32). MmeI possesses both DNA methyltransferase and endonuclease activities in the same polypeptide. However, unlike previously described restriction-modification systems, MmeI relies on single strand modification for host protection (33). We have recently identified a family of REs highly similar to MmeI, which we proposed form a new subgroup within the type II endonucleases, the type IIL enzymes (34). The members of this family share significant primary amino-acid sequence similarity throughout their entire polypeptide chains. Their endonuclease and methyltransferase functions are highly conserved; however, their recognition sequences can be quite divergent. This overall conservation of function but diversification of substrate recognition indicates that DNA specificity is undergoing rapid evolution. These proteins thus provide an excellent opportunity to study the determinants of DNA specificity. Currently no structure is available for any member of this family.
Here we report the rational engineering of new specificity variants within the MmeI family of type IIL REs. The engineered enzymes exhibit true conversion of specificity, and many exhibit specific activity comparable to the naturally occurring enzymes. The method described holds promise for the ability to engineer new type II REs, and also DNA methyltransferases, that specifically recognize more unique DNA sequences than have been isolated to date from natural sources.