Restriction enzymes (REases) are commercial reagents commonly used in recombinant DNA technologies. They are attractive models for studying protein-DNA interactions and valuable targets for protein engineering. They are, however, extremely divergent: the amino acid sequence of a typical REase usually shows no detectable similarity to any other protein, with the rare exception of other REases that recognize identical or very similar sequences. Structural analyses and bioinformatics studies have shown that some REases belong to at least four unrelated and structurally distinct superfamilies of nucleases: PD-DxK, PLD, HNH, and GIY-YIG. Hence, they are extremely hard targets for structure prediction and homology-based inference of sequence-function relationships, and the great majority of REases remain structurally and evolutionarily unclassified.
SfiI is a REase that recognizes the interrupted palindromic sequence 5'GGCCNNNN^NGGCC3' and generates 3 nt long 3' overhangs upon cleavage. SfiI is an archetypal Type IIF enzyme, which functions as a tetramer and cleaves two copies of the recognition site in a concerted manner. Its sequence shows no similarity to other proteins, and nothing is known about the location of its active site or the residues important for oligomerization. Using the threading approach to protein fold recognition, we identified a remote relationship between SfiI and BglI, a dimeric Type IIP restriction enzyme from the PD-DxK superfamily of nucleases, which recognizes the 5'GCCNNNN^NGGC3' sequence and whose structure in complex with the substrate DNA is available. We constructed a homology model of SfiI in complex with its target sequence and used it to predict residues important for dimerization, tetramerization, DNA binding and catalysis.
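To make the site notation concrete, the following minimal Python sketch (an illustration only, not part of the analysis reported here) locates the SfiI and BglI recognition sites in a DNA string and derives the overhang from the position of the "^" cut marker. The names `SITES`, `parse_site`, `find_sites` and `overhang` are hypothetical, and the sketch assumes that the site is palindromic, so that the bottom-strand cut mirrors the top-strand cut.

```python
import re

# Recognition sites as written in the text: N is any base, ^ marks the
# cut position on the top strand (5'->3').
SITES = {
    "SfiI": "GGCCNNNN^NGGCC",
    "BglI": "GCCNNNN^NGGC",
}


def parse_site(site):
    """Split a site string into its bare pattern and top-strand cut offset."""
    cut = site.index("^")
    return site.replace("^", ""), cut


def find_sites(seq, site):
    """Return (site_start, top_strand_cut) pairs, 0-based, for every match.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern, cut = parse_site(site)
    regex = re.compile("(?=(" + pattern.replace("N", "[ACGT]") + "))")
    return [(m.start(), m.start() + cut) for m in regex.finditer(seq.upper())]


def overhang(site):
    """Return (length, kind) of the overhang left by a palindromic site.

    Assumption: the enzyme cuts both strands symmetrically, so the
    bottom-strand cut, in top-strand coordinates, is len(pattern) - cut.
    """
    pattern, cut = parse_site(site)
    bottom_cut = len(pattern) - cut
    length = abs(cut - bottom_cut)
    if cut > bottom_cut:
        kind = "3'"
    elif cut < bottom_cut:
        kind = "5'"
    else:
        kind = "blunt"
    return length, kind
```

For example, `overhang(SITES["SfiI"])` and `overhang(SITES["BglI"])` both return `(3, "3'")`, in agreement with the 3 nt long 3' overhangs described above, and `find_sites("AAGGCCATGCAGGCCTT", SITES["SfiI"])` returns `[(2, 10)]`.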
The bioinformatics analysis suggests that SfiI, a Type IIF enzyme, is more closely related to BglI, an "orthodox" Type IIP restriction enzyme, than to any other REase, including other Type IIF REases with known structures, such as NgoMIV. NgoMIV and BglI belong to two different, very remotely related branches of the PD-DxK superfamily: the α-class (EcoRI-like) and the β-class (EcoRV-like), respectively. Thus, our analysis provides evidence that the ability to tetramerize and cut two DNA sequences in a concerted manner evolved independently at least twice in the PD-DxK superfamily of REases. The model of SfiI will also serve as a convenient platform for further experimental analyses.