Restriction enzymes (REases) are commercial reagents commonly used in recombinant DNA technologies. They are attractive models for studying protein-DNA interactions and valuable targets for protein engineering. They are, however, extremely divergent: the amino acid sequence of a typical REase usually shows no detectable similarities to any other proteins, with rare exceptions of other REases that recognize identical or very similar sequences. From structural analyses and bioinformatics studies it has been learned that some REases belong to at least four unrelated and structurally distinct superfamilies of nucleases, PD-DxK, PLD, HNH, and GIY-YIG. Hence, they are extremely hard targets for structure prediction and homology-based inference of sequence-function relationships and the great majority of REases remain structurally and evolutionarily unclassified.
SfiI is a REase which recognizes the interrupted palindromic sequence 5'GGCCNNNN^NGGCC3' and generates 3 nt long 3' overhangs upon cleavage. SfiI is an archetypal Type IIF enzyme, which functions as a tetramer and cleaves two copies of the recognition site in a concerted manner. Its sequence shows no similarity to other proteins and nothing is known about the localization of its active site or residues important for oligomerization. Using the threading approach for protein fold-recognition, we identified a remote relationship between SfiI and BglI, a dimeric Type IIP restriction enzyme from the PD-DxK superfamily of nucleases, which recognizes the 5'GCCNNNN^NGGC3' sequence and whose structure in complex with the substrate DNA is available. We constructed a homology model of SfiI in complex with its target sequence and used it to predict residues important for dimerization, tetramerization, DNA binding and catalysis.
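The relationship between the two recognition sites quoted above can be illustrated with a short sketch. It assumes only the site strings given in the text (with `^` marking the top-strand cut); the function names and the test sequence are hypothetical, and the overhang calculation relies on the sites being palindromic.

```python
import re

# Recognition sites as quoted in the text; '^' marks the top-strand cut.
SITES = {
    "SfiI": "GGCCNNNN^NGGCC",
    "BglI": "GCCNNNN^NGGC",
}


def find_cuts(enzyme, dna):
    """Return 0-based top-strand cut positions (index of the first base after the cut)."""
    site = SITES[enzyme]
    cut_offset = site.index("^")
    pattern = site.replace("^", "").replace("N", "[ACGT]")
    # A lookahead is used so that overlapping sites are also reported.
    return [m.start() + cut_offset for m in re.finditer(f"(?={pattern})", dna.upper())]


def overhang_length(enzyme):
    """Overhang length for a palindromic site: distance between top and bottom cuts."""
    site = SITES[enzyme]
    cut_offset = site.index("^")
    site_len = len(site) - 1  # exclude the '^' marker
    # By symmetry, the bottom strand is cut at site_len - cut_offset (top-strand coordinates).
    return abs(cut_offset - (site_len - cut_offset))


if __name__ == "__main__":
    dna = "AAAGGCCAAAATGGCCTTT"  # hypothetical sequence with one SfiI site
    print(find_cuts("SfiI", dna), overhang_length("SfiI"))
    print(find_cuts("BglI", dna), overhang_length("BglI"))
```

Every SfiI site contains a BglI site cut at the same position, which is consistent with the similarity of the two target sequences noted in the text.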
The bioinformatics analysis suggests that SfiI, a Type IIF enzyme, is more closely related to BglI, an "orthodox" Type IIP restriction enzyme, than to any other REase, including other Type IIF REases with known structures, such as NgoMIV. NgoMIV and BglI belong to two different, very remotely related branches of the PD-DxK superfamily: the α-class (EcoRI-like), and the β-class (EcoRV-like), respectively. Thus, our analysis provides evidence that the ability to tetramerize and cut the two DNA sequences in a concerted manner was developed independently at least two times in the evolution of the PD-DxK superfamily of REases. The model of SfiI will also serve as a convenient platform for further experimental analyses.
In this study we attempted to modify the PCR-RFLP method using the restriction enzyme MwoI for the identification of medically important Aspergillus species. Our subjects included nine standard Aspergillus species and 205 Aspergillus isolates from confirmed hospital-acquired infections and hospital indoor sources. First, the Aspergillus isolates were identified to the species level using morphological methods. A 24-hour culture was performed for each isolate to harvest Aspergillus mycelia, and genomic DNA was then extracted using the phenol-chloroform method. PCR-RFLP using the single restriction enzyme MwoI was performed on the ITS regions of the rDNA gene. The electrophoresis data were analyzed and compared with those of the morphological identifications. The 205 Aspergillus isolates comprised 153 (75%) environmental and 52 (25%) clinical isolates. A. flavus was the most frequent isolate in our study (55%), followed by A. niger 65 (31.7%), A. fumigatus 18 (8.7%), and A. nidulans and A. parasiticus 2 (1% each). MwoI enabled us to discriminate eight medically important Aspergillus species, including A. fumigatus, A. niger and A. flavus, the most commonly isolated species. The PCR-RFLP method using the restriction enzyme MwoI is a rapid and reliable test for identification of at least the most medically important Aspergillus species.
Aspergillus; identification; MwoI
Thus far, identification of functionally important residues in Type II restriction endonucleases (REases) has been difficult using conventional methods. Even though known REase structures share a fold and marginally recognizable active site, the overall sequence similarities are statistically insignificant, unless compared among proteins that recognize identical or very similar sequences. Bsp6I is a Type II REase, which recognizes the palindromic DNA sequence 5′GCNGC and cleaves between the cytosine and the unspecified nucleotide in both strands, generating a double-strand break with 5′-protruding single nucleotides. There are no solved structures of REases that recognize similar DNA targets or generate cleavage products with similar characteristics. In straightforward comparisons, the Bsp6I sequence shows no significant similarity to REases with known structures. However, using a fold-recognition approach, we have identified a remote relationship between Bsp6I and the structure of PvuII. Starting from the sequence–structure alignment between Bsp6I and PvuII, we constructed a homology model of Bsp6I and used it to predict functionally significant regions in Bsp6I. The homology model was supported by site-directed mutagenesis of residues predicted to be important for dimerization, DNA binding and catalysis. Completing the picture of sequence–structure–function relationships in protein superfamilies becomes an essential task in the age of structural genomics and our study may serve as a paradigm for future analyses of superfamilies comprising strongly diverged members with little or no sequence similarity.
Most epigenetic studies assess methylation of 5′-CpG-3′ sites but recent evidence indicates that non-CpG cytosine methylation occurs at high levels in humans and other species. This is most prevalent at 5′-CHG-3′, where H = A, C or T, and it preferentially occurs at 5′-CpA-3′ and 5′-CpT-3′ sites. With the goal of facilitating the detection of non-CpG methylation, the restriction endonucleases ApeKI, BbvI, EcoP15I, Fnu4HI, MwoI and TseI were assessed for their sensitivity to 5-methylcytosine at GpCpA, GpCpT, GpCpC or GpCpG sites, where methylation is catalyzed by the DNA 5-cytosine 5′-GpC-3′ methyltransferase M.CviPI. We tested a variety of sequences including various plasmid-based sites, a cloned disease-associated (CAG)83•(CTG)83 repeat and in vitro synthesized tracts of only (CAG)500•(CTG)500 or (CAG)800•(CTG)800. The repeat tracts are enriched for the preferred CpA and CpT motifs. We found that none of the tested enzymes can cleave their recognition sequences when they are 5′-GpC-3′ methylated. A genomic site known to convert its non-CpG methylation levels upon C2C12 differentiation was confirmed through the use of these enzymes. These enzymes can be useful in rapidly and easily determining the most common non-CpG methylation status in various sequence contexts, as well as at expansions of (CAG)n•(CTG)n repeat tracts associated with diseases like myotonic dystrophy and Huntington disease.
non-CpG methylation; CpG methylation; 5-methylcytosine; trinucleotide repeats; ApeKI; BbvI; EcoP15I; Fnu4HI; MwoI; TseI
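The statement that (CAG)n•(CTG)n tracts are enriched for GpCpA and GpCpT contexts can be checked with a small counting sketch. It assumes only that M.CviPI targets the cytosine in 5′-GpC-3′, as stated above; the function name and the example tract length are hypothetical.

```python
from collections import Counter


def gpc_contexts(dna):
    """Count each GpC site by its trinucleotide context (GCA, GCT, GCC or GCG)."""
    dna = dna.upper()
    # Only GpC sites that have a following base are counted.
    return Counter(dna[i:i + 3] for i in range(len(dna) - 2) if dna[i:i + 2] == "GC")


if __name__ == "__main__":
    print(gpc_contexts("CAG" * 5))  # one strand of a short repeat tract
    print(gpc_contexts("CTG" * 5))  # the complementary strand, read 5' to 3'
```

Each repeat junction contributes one GpC site, in a GpCpA context on the CAG strand and a GpCpT context on the CTG strand.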
Multiple sclerosis (MS) is an inflammatory neurodegenerative disease in which the insulating membrane of the central nervous system is damaged. The etiology of MS includes both genetic and environmental causes. A genome-wide association study (GWAS) identified single nucleotide polymorphisms (SNPs) linked with MS predisposition, among which immunologically related genes are considerably overrepresented. The purpose of the present study is to explore the association of the rs1520333 C/T polymorphism in the IL7 gene with the risk of MS in a subset of the Iranian population.
Materials and Methods:
In this case-control study, 110 patients with MS and 110 controls were enrolled. DNA was extracted from blood samples, the fragment containing the rs1520333 SNP was amplified, and the samples were genotyped by the polymerase chain reaction-restriction fragment length polymorphism method with a specific restriction enzyme (MwoI).
SPSS for Windows software (version 18.0; SPSS, Chicago, IL, USA) was used for statistical analysis.
We demonstrated a significant association of the G allele [odds ratio (OR) = 1.6614, confidence interval (CI) = 1.12-2.47, P = 0.0124] and the GG genotype (OR = 7.45, 95% CI = 2.13-25.97, P = 0.0016) of the rs1520333 SNP with susceptibility to MS after adjustment for age and gender. ORs adjusted for age, gender, and body mass index showed similar outcomes.
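For readers unfamiliar with how an odds ratio and its confidence interval are derived from genotype counts, the following is a minimal sketch of the unadjusted calculation (Woolf's logit method). It is not the adjusted model used in the study, and the counts in the example are hypothetical, since the abstract does not report the underlying table.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald (Woolf) confidence interval.

    a: cases carrying the allele      b: cases not carrying it
    c: controls carrying the allele   d: controls not carrying it
    z: normal quantile (1.96 for a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(odds_ratio_ci(20, 10, 10, 20))
```

Because the interval is symmetric on the log scale, the reported OR should equal the geometric mean of its confidence limits, which offers a quick consistency check on published values.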
These results indicate that the rs1520333 SNP is a significant susceptibility gene variant for development of MS in the Iranian population. Nevertheless, functional studies are required to completely elucidate how this SNP contributes to MS pathogenesis.
GWAS; IL7 gene; multiple sclerosis; polymorphism
Catalytic domains of Type II restriction endonucleases (REases) belong to a few unrelated three-dimensional folds. While the PD-(D/E)XK fold is most common among these enzymes, crystal structures have been also determined for single representatives of two other folds: PLD (R.BfiI) and half-pipe (R.PabI). Bioinformatics analyses supported by mutagenesis experiments suggested that some REases belong to the HNH fold (e.g. R.KpnI), and that a small group represented by R.Eco29kI belongs to the GIY-YIG fold. However, for a large fraction of REases with known sequences, the three-dimensional fold and the architecture of the active site remain unknown, mostly due to extreme sequence divergence that hampers detection of homology to enzymes with known folds.
R.Hpy188I is a Type II REase with unknown structure. PSI-BLAST searches of the non-redundant protein sequence database reveal only 1 homolog (R.HpyF17I, with nearly identical amino acid sequence and the same DNA sequence specificity). Standard application of state-of-the-art protein fold-recognition methods failed to predict the relationship of R.Hpy188I to proteins with known structure or to other protein families. In order to increase the amount of evolutionary information in the multiple sequence alignment, we have expanded our sequence database searches to include sequences from metagenomics projects. This search resulted in identification of 23 further members of R.Hpy188I family, both from metagenomics and the non-redundant database. Moreover, fold-recognition analysis of the extended R.Hpy188I family revealed its relationship to the GIY-YIG domain and allowed for computational modeling of the R.Hpy188I structure. Analysis of the R.Hpy188I model in the light of sequence conservation among its homologs revealed an unusual variant of the active site, in which the typical Tyr residue of the YIG half-motif had been substituted by a Lys residue. Moreover, some of its homologs have the otherwise invariant Arg residue in a non-homologous position in sequence that nonetheless allows for spatial conservation of the guanidino group potentially involved in phosphate binding.
The present study eliminates a significant "white spot" on the structural map of REases. It also provides important insight into sequence-structure-function relationships in the GIY-YIG nuclease superfamily. Our results reveal that in the case of proteins with no or few detectable homologs in the standard "non-redundant" database, it is useful to expand this database by adding the metagenomic sequences, which may provide evolutionary linkage to detect more remote homologs.
The majority of experimentally determined crystal structures of Type II restriction endonucleases (REases) exhibit a common PD-(D/E)XK fold. Crystal structures have been also determined for single representatives of two other folds: PLD (R.BfiI) and half-pipe (R.PabI), and bioinformatics analyses supported by mutagenesis suggested that some REases belong to the HNH fold. Our previous bioinformatic analysis suggested that REase R.Eco29kI shares sequence similarities with one more unrelated nuclease superfamily, GIY-YIG, however so far no experimental data were available to support this prediction. The determination of a crystal structure of the GIY-YIG domain of homing endonuclease I-TevI provided a template for modeling of R.Eco29kI and prompted us to validate the model experimentally.
Using protein fold-recognition methods we generated a new alignment between R.Eco29kI and I-TevI, which suggested a reassignment of one of the putative catalytic residues. A theoretical model of R.Eco29kI was constructed to illustrate its predicted three-dimensional fold and organization of the active site, comprising amino acid residues Y49, Y76, R104, H108, E142, and N154. A series of mutants was constructed to generate amino acid substitutions of selected residues (Y49A, R104A, H108F, E142A and N154L) and the mutant proteins were examined for their ability to bind the DNA containing the Eco29kI site 5'-CCGCGG-3' and to catalyze the cleavage reaction. Experimental data reveal that residues Y49, R104, E142, H108, and N154 are important for the nuclease activity of R.Eco29kI, while H108 and N154 are also important for specific DNA binding by this enzyme.
Substitutions of residues Y49, R104, H108, E142 and N154 predicted by the model to be a part of the active site lead to mutant proteins with strong defects in the REase activity. These results are in very good agreement with the structural model presented in this work and with our prediction that R.Eco29kI belongs to the GIY-YIG superfamily of nucleases. Our study provides the first experimental evidence for a Type IIP REase that does not belong to the PD-(D/E)XK or HNH superfamilies of nucleases, and is instead a member of the unrelated GIY-YIG superfamily.
Genomics and metagenomics are currently leading research areas, with DNA sequences accumulating at an exponential rate. Although enormous advances in DNA sequencing technologies are taking place, progress is frequently limited by factors such as genomic contig assembly and generation of representative libraries. A number of DNA fragmentation methods, such as hydrodynamic shearing, sonication or DNase I fragmentation, have various drawbacks, including DNA damage, poor fragmentation control, irreproducibility and non-overlapping DNA segment representation. Improvements in these limited DNA scission methods are consequently needed. An alternative method for obtaining higher quality DNA fragments involves partial digestion with restriction endonucleases (REases).
We have shown previously that class-IIS/IIC/IIG TspGWI REase, the prototype member of the Thermus sp. enzyme family, can be chemically relaxed by a cofactor analogue, allowing it to recognize very short DNA sequences of 3-bp combined frequency. Such frequently cleaving REases are extremely rare, with CviJI/CviJI*, SetI and FaiI the only other ones found in nature. Their unusual features make them very useful molecular tools for the development of representative DNA libraries.
We constructed a horse genomic library and a deletion derivative library of the butyrylcholinesterase cDNA coding region using a novel method based on TaqII, a Thermus sp. family bifunctional enzyme exhibiting cofactor analogue specificity relaxation. We used sinefungin (SIN), an S-adenosylmethionine (SAM) analogue with a reversed charge pattern, and dimethylsulfoxide (DMSO) to convert TaqII, which has a 6-bp recognition site (5′-GACCGA-3′ [11/9]), into a theoretical 2.9-bp REase, with 70 shortened variants of the canonical recognition sequence detected. Because partial DNA cleavage is an inherent feature of the Thermus sp. enzyme family, this modified TaqII is uniquely suited to quasi-random library generation.
In the presence of SIN/DMSO, TaqII REase is transformed from cleaving every 4096 bp on average to cleaving every 58 bp. TaqII SIN/DMSO thus extends the palette of available REase prototype specificities. This phenomenon, employed under partial digestion conditions, was applied to quasi-random DNA fragmentation. Further applications include high sensitivity probe generation and metagenomic DNA amplification.
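The correspondence between recognition-site length and average cleavage spacing quoted above follows from a simple model. The sketch below assumes a random sequence with equal base frequencies and a fully specified site; the function names are illustrative, not from the source.

```python
import math


def mean_spacing(site_length_bp):
    """Expected distance between sites of the given length in random DNA."""
    return 4 ** site_length_bp


def effective_site_length(mean_spacing_bp):
    """Site length (in bp) that would give the observed mean spacing."""
    return math.log(mean_spacing_bp, 4)


if __name__ == "__main__":
    print(mean_spacing(6))                      # canonical 6-bp site
    print(round(effective_site_length(58), 2))  # relaxed specificity
```

Under these assumptions a 6-bp site occurs once every 4096 bp, and a mean spacing of 58 bp corresponds to an effective site length of about 2.9 bp, in line with the figures given for TaqII.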
Restriction endonucleases (REases) are highly specific DNA scissors that have facilitated the development of modern molecular biology. Intensive studies of double strand (ds) cleavage activity of Type IIP REases, which recognize 4–8 bp palindromic sequences, have revealed a variety of mechanisms of molecular recognition and catalysis. Less well-studied are REases which cleave only one of the strands of dsDNA, creating a nick instead of a ds break. Naturally occurring nicking endonucleases (NEases) range from frequent cutters such as Nt.CviPII (^CCD; ^ denotes the cleavage site) to rare-cutting homing endonucleases (HEases) such as I-HmuI. In addition to these bona fide NEases, individual subunits of some heterodimeric Type IIS REases have recently been shown to be natural NEases. The discovery and characterization of more REases that recognize asymmetric sequences, particularly Types IIS and IIA REases, has revealed recognition and cleavage mechanisms drastically different from the canonical Type IIP mechanisms, and has allowed researchers to engineer highly strand-specific NEases. Monomeric LAGLIDADG HEases use two separate catalytic sites for cleavage. Exploitation of this characteristic has also resulted in useful nicking HEases. This review aims at providing an overview of the cleavage mechanisms of Types IIS and IIA REases and LAGLIDADG HEases, the engineering of their nicking variants, and the applications of NEases and nicking HEases.
Type II restriction endonucleases (REases) are one of the basic tools of recombinant DNA technology. They also serve as models for elucidation of mechanisms for both site-specific DNA recognition and cleavage by proteins. However, isolation of catalytically active mutants from their libraries is challenging due to the toxicity of REases in the absence of protecting methylation, and techniques explored so far have had limited success. Here, we present an improved SOS induction-based approach for in vivo screening of active REases, which we used to isolate a set of active variants of the catalytic mutant, Cfr10I E204Q. Detailed characterization of plasmids from 64 colonies screened from the library of ∼200 000 transformants revealed 29 variants of the cfr10IR gene at the level of nucleotide sequence and 15 variants at the level of amino acid sequence, all of which were able to induce the SOS response. Specific activity measurements of affinity-purified mutants revealed >200-fold variance among them, ranging from 100% (wild-type isolates) to 0.5% (S188C mutant), suggesting that the technique is equally suited for screening of mutants possessing high or low activity and confirming that it may be applied for identification of residues playing a role in catalysis.
The molecular basis of the interaction of KpnI restriction endonuclease (REase) and the corresponding methyltransferase (MTase) at their cognate recognition sequence is investigated using a range of footprinting techniques. DNase I protection analysis with the REase reveals the protection of a 14–18 bp region encompassing the hexanucleotide recognition sequence. The MTase, in contrast, protects a larger region. KpnI REase contacts two adjacent guanine residues and the single adenine residue in both strands within the recognition sequence 5′-GGTACC-3′, as inferred from dimethylsulfate (DMS) protection, interference and missing nucleotide interference analysis. In contrast, KpnI MTase does not show elaborate base-specific contacts. Ethylation interference analysis also showed the differential interaction of REase and MTase with phosphate groups of three adjacent bases on both strands within the recognition sequence. The single thymine residue within the sequence is hyperreactive to permanganate oxidation, consistent with MTase-induced base flipping. The REase, on the other hand, does not show any major DNA distortion. The results demonstrate that the differences in the molecular interaction pattern of the two proteins at the same recognition sequence reflect the contrasting chemistry of DNA cleavage and methylation catalyzed by these two dissimilar enzymes, working in combination as constituents of a cellular defense strategy.
In continuing our research into the new family of bifunctional restriction endonucleases (REases), we describe the cloning of the tsoIRM gene. Currently, the family includes six thermostable enzymes: TaqII, Tth111II, TthHB27I, TspGWI, TspDTI, TsoI, isolated from various Thermus sp. and two thermolabile enzymes: RpaI and CchII, isolated from mesophilic bacteria Rhodopseudomonas palustris and Chlorobium chlorochromatii, respectively. The enzymes have several properties in common. They are large proteins (molecular size app. 120 kDa), coded by fused genes, with the REase and methyltransferase (MTase) in a single polypeptide, where both activities are affected by S-adenosylmethionine (SAM). They recognize similar asymmetric cognate sites and cleave at a distance of 11/9 nt from the recognition site. Thus far, we have cloned and characterised TaqII, Tth111II, TthHB27I, TspGWI and TspDTI.
TsoI REase, which originates from thermophilic Thermus scotoductus RFL4 (T. scotoductus), was cloned in Escherichia coli (E. coli) using two rounds of biochemical selection of the T. scotoductus genomic library for the TsoI methylation phenotype. DNA sequencing of restriction-resistant clones revealed a common open reading frame (ORF) of 3348 bp, coding for a large polypeptide of 1116 amino acid (aa) residues, which exhibited a high level of similarity to Tth111II (50% identity, 60% similarity). The ORF was PCR-amplified, subcloned into a pET21 derivative under the control of a T7 promoter and subjected to a third round of biochemical selection in order to isolate error-free clones. Induction experiments resulted in synthesis of an app. 125 kDa protein exhibiting TsoI-specific DNA cleavage. Also, the wild-type (wt) protein was purified and reaction optima were determined.
Previously we identified and cloned the Thermus family RM genes using a specially developed method based on partial proteolysis of thermostable REases. In the case of TsoI the classic biochemical selection method was successful, probably because of the substantially lower optimal reaction temperature of TsoI (app. 10-15°C). That allowed for sufficient MTase activity in vivo in recombinant E. coli. Interestingly, TsoI originates from bacteria with a high optimum growth temperature of 67°C, which indicates that not all bacterial enzymes match an organism’s thermophilic nature, and yet remain functional cell components. Besides basic research advances, the cloning and characterisation of the new prototype REase from the Thermus sp. family enzymes is also of practical importance in gene manipulation technology, as it extends the range of available DNA cleavage specificities.
We reported previously that TspGWI, a prototype enzyme of a new Thermus sp. family of restriction endonucleases-methyltransferases (REases-MTases), undergoes the novel phenomenon of sinefungin (SIN)-caused specificity transition. Here we investigated mutant TspGWI N473A, containing a single amino acid (aa) substitution in the NPPY motif of the MTase. Even though the aa substitution is located within the MTase polypeptide segment, DNA cleavage and modification are almost completely abolished, indicating that the REase and MTase are intertwined. Remarkably, the TspGWI N473A REase functionality can be completely reconstituted by the addition of SIN. We hypothesize that SIN binds specifically to the enzyme and restores the DNA cleavage-competent protein tertiary structure. This indicates the significant role of allosteric effectors in DNA cleavage in Thermus sp. enzymes. This is the first case of REase mutation suppression by an S-adenosylmethionine (SAM) cofactor analogue. Moreover, the TspGWI N473A clone strongly affects E. coli division control, acting as a ‘selfish gene’. The mutant lacks the competing MTase activity and therefore might be useful for applications in DNA manipulation. Here we present a case study of a novel strategy for REase activity/specificity alteration by a single aa substitution, based on the bioinformatic analysis of active motif locations, combining (a) aa sequence engineering (b) the alteration of protein enzymatic properties, and (c) the use of cofactor–analogue cleavage reconstitution and stimulation.
Endonuclease-methyltransferase; Thermus sp. enzyme; Enzymatic reaction cofactor; Cofactor analogue; Sinefungin; S-adenosylmethionine; Mutant activation; Specificity change
Specific cleavage of large DNA molecules at few sites, necessary for the analysis of genomic DNA or for targeting individual genes in complex genomes, requires endonucleases of extremely high specificity. Restriction endonucleases (REases) that recognize DNA sequences of 4–8 bp are not sufficiently specific for this purpose. In principle, the specificity of REases can be extended by fusion to sequence recognition modules, e.g. specific DNA-binding domains or triple-helix forming oligonucleotides (TFO). We have chosen to extend the specificity of REases using TFOs, given the combinatorial flexibility this fusion offers in addressing a short, yet precisely recognized restriction site next to a defined triple-helix forming site (TFS). We demonstrate here that the single chain variant of PvuII (scPvuII) covalently coupled via the bifunctional cross-linker N-(γ-maleimidobutyryloxy) succinimide ester to a TFO (5′-NH2-[CH2]6 or 12-MPMPMPMPMPPPPPPT-3′, with M being 5-methyl-2′-deoxycytidine and P being 5-[1-propynyl]-2′-deoxyuridine), cleaves DNA specifically at the recognition site of PvuII (CAGCTG) if located at a distance of approximately one helical turn from a TFS (underlined) complementary to the TFO (‘addressed’ site: 5′-TTTTTTTCTCTCTCTCN∼10CAGCTG-3′), leaving ‘unaddressed’ PvuII sites intact. The preference for cleavage of an ‘addressed’ compared to an ‘unaddressed’ site is >1000-fold, if the cleavage reaction is initiated by addition of Mg2+ ions after preincubation of scPvuII-TFO and substrate in the absence of Mg2+ ions to allow triple-helix formation before DNA cleavage. Single base pair substitutions in the TFS prevent addressed DNA cleavage by scPvuII-TFO.
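The distinction between 'addressed' and 'unaddressed' PvuII sites can be sketched as a sequence scan. The example assumes that the TFS is the 16-nt pyrimidine stretch preceding N∼10 in the addressed site quoted above, and it interprets "∼10" as a spacer of 8 to 12 nt; that window, the function name and the test sequence are hypothetical choices for illustration. Only the strand orientation written in the text is considered.

```python
import re

TFS = "TTTTTTTCTCTCTCTC"  # assumed triple-helix forming site (from the addressed site)
PVUII = "CAGCTG"          # PvuII recognition site


def classify_pvuii_sites(dna, min_gap=8, max_gap=12):
    """Split PvuII sites into those preceded by a TFS at about one helical turn and the rest.

    min_gap/max_gap bound the spacer between the end of the TFS and the
    start of the PvuII site (hypothetical reading of 'N~10').
    """
    dna = dna.upper()
    tfs_ends = [m.start() + len(TFS) for m in re.finditer(f"(?={TFS})", dna)]
    addressed, unaddressed = [], []
    for m in re.finditer(f"(?={PVUII})", dna):
        site = m.start()
        if any(min_gap <= site - end <= max_gap for end in tfs_ends):
            addressed.append(site)
        else:
            unaddressed.append(site)
    return addressed, unaddressed


if __name__ == "__main__":
    dna = TFS + "ACGTACGTAC" + PVUII + "G" * 20 + PVUII
    print(classify_pvuii_sites(dna))
```

A scan of this kind identifies candidate sites only; whether a given site is actually cleaved preferentially depends on triple-helix formation under the reaction conditions described in the text.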
We previously defined a family of restriction endonucleases (REases) from Thermus sp., which share common biochemical and biophysical features, such as the fusion of both the nuclease and methyltransferase (MTase) activities in a single polypeptide, cleavage at a distance from the recognition site, large molecular size, modulation of activity by S-adenosylmethionine (SAM), and incomplete cleavage of the substrate DNA. Members include related thermophilic REases with five distinct specificities: TspGWI, TaqII, Tth111II/TthHB27I, TspDTI and TsoI.
TspDTI, TsoI and isoschizomers Tth111II/TthHB27I recognize different, but related sequences: 5'-ATGAA-3', 5'-TARCCA-3' and 5'-CAARCA-3' respectively. Their amino acid sequences are similar, which is unusual among REases of different specificity. To gain insight into this group of REases, TspDTI, the prototype member of the Thermus sp. enzyme family, was cloned and characterized using a recently developed method for partially cleaving REases.
TspDTI, TsoI and isoschizomers Tth111II/TthHB27I are closely related bifunctional enzymes. They comprise a tandem arrangement of Type I-like domains, like other Type IIC enzymes (those with a fusion of a REase and MTase domains), e.g. TspGWI, TaqII and MmeI, but their sequences are only remotely similar to these previously characterized enzymes. The characterization of TspDTI, a prototype member of this group, extends our understanding of sequence-function relationships among multifunctional restriction-modification enzymes.
Restriction endonucleases (REases) are DNA-cleaving enzymes that have become indispensable tools in molecular biology. Type II REases are highly divergent in sequence despite their common structural core, function and, in some cases, common specificities towards DNA sequences. This makes it difficult to identify and classify them functionally based on sequence, and has hampered the efforts of specificity-engineering. Here, we define novel REase sequence motifs, which extend beyond the PD-(D/E)XK hallmark, and incorporate secondary structure information. The automated search using these motifs is carried out with a newly developed fast regular expression matching algorithm that accommodates long patterns with optional secondary structure constraints. Using this new tool, named Scan2S, motifs derived from REases with specificity towards GATC- and CCGG-containing DNA sequences successfully identify REases of the same specificity. Notably, some of these sequences are not identified by standard sequence detection tools. The new motifs highlight potential specificity-determining positions that do not fully overlap for the GATC- and the CCGG-recognizing REases and are candidates for specificity re-engineering.
secondary structure; protein motif; physicochemical properties; restriction endonucleases; regular expression; specificity-determining positions
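The idea of a sequence motif with optional secondary structure constraints can be illustrated with a toy matcher. This is not the Scan2S algorithm, whose details are not given here; it is a minimal sketch that interleaves residues with per-residue secondary structure codes (H, E, C) and applies an ordinary regular expression. The motif, spacer range, sequence and structure string are all hypothetical.

```python
import re

ANY_AA = "A-Z"   # any residue
ANY_SS = "HEC"   # any secondary structure state (helix, strand, coil)


def compile_motif(elements):
    """Build a regex over interleaved (residue, structure) character pairs.

    Each element is either (residue_class, structure_class) or ("gap", lo, hi).
    """
    parts = []
    for el in elements:
        if el[0] == "gap":
            _, lo, hi = el
            parts.append(f"(?:..){{{lo},{hi}}}")  # lo..hi unconstrained positions
        else:
            aa, ss = el
            parts.append(f"[{aa}][{ss}]")
    # Lookahead so that every start position is tested, including overlaps.
    return re.compile("(?=" + "".join(parts) + ")")


def scan(motif, seq, ss):
    """Return 0-based residue positions where the motif starts."""
    if len(seq) != len(ss):
        raise ValueError("sequence and structure strings must have equal length")
    interleaved = "".join(a + s for a, s in zip(seq, ss))
    # Keep only matches that begin on a residue character (even offsets).
    return [m.start() // 2 for m in motif.finditer(interleaved) if m.start() % 2 == 0]


if __name__ == "__main__":
    # Toy PD...(D/E)XK pattern requiring the D and the D/E to lie on strands.
    motif = compile_motif([
        ("P", ANY_SS), ("D", "E"), ("gap", 2, 4),
        ("DE", "E"), (ANY_AA, ANY_SS), ("K", ANY_SS),
    ])
    print(scan(motif, "MSPDAAAEIKL", "CCCEEEEEECC"))
```

Interleaving keeps the example within the standard `re` module; a dedicated matcher for long patterns, as described in the text, would presumably be organized differently.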
Background: An imbalance between the proinflammatory cytokine interleukin 1β (IL-1β) and the anti-inflammatory cytokine IL-1 receptor antagonist (IL-1ra) has been postulated as a pathogenic factor in inflammatory bowel disease (IBD).
Aims: To study allelic frequencies of novel polymorphisms in the genes for IL-1β and IL-1ra in patients with IBD and to assess the relation between ex vivo cytokine production and allelic variants of the IL-1β and IL-1ra genes.
Subjects: Two hundred and seventy healthy controls, 74 patients with ulcerative colitis (UC), 72 with Crohn's disease (CD), and 40 with primary sclerosing cholangitis for the allelic frequencies, and 60 healthy individuals for the ex vivo cytokine production.
Methods: Genotyping was performed by polymerase chain reaction and subsequent cleavage with specific endonucleases (MwoI, MspA1I, AluI, TaqI, BsoFI) for five novel restriction fragment length polymorphisms (RFLPs) in the genes for IL-1ra and IL-1β.
Results: No significant differences were found in the allelic frequencies or allele carriage rates of the markers in the IL-1β and IL-1ra genes between CD, UC, and healthy controls. No association between the genetic markers and cytokine production levels was observed. Patients with UC carried the combination of both the infrequent allele of the TaqI RFLP and the MwoI RFLP significantly more frequently (35.2% in UC versus 71.1% in controls).
Conclusions: UC is associated with carriage of both infrequent alleles of the TaqI and MwoI RFLPs. However, it could not be confirmed whether the association reflects a pathogenic mechanism underlying UC.
cytokine gene polymorphisms; interleukin 1 receptor antagonist; interleukin 1β; Crohn's disease; ulcerative colitis
This article continues the series of Surveys and Summaries on restriction endonucleases (REases) begun this year in Nucleic Acids Research. Here we discuss ‘Type II’ REases, the kind used for DNA analysis and cloning. We focus on their biochemistry: what they are, what they do, and how they do it. Type II REases are produced by prokaryotes to combat bacteriophages. With extreme accuracy, each recognizes a particular sequence in double-stranded DNA and cleaves at a fixed position within or nearby. The discoveries of these enzymes in the 1970s, and of the uses to which they could be put, have since impacted every corner of the life sciences. They became the enabling tools of molecular biology, genetics and biotechnology, and made analysis at the most fundamental levels routine. Hundreds of different REases have been discovered and are available commercially. Their genes have been cloned, sequenced and overexpressed. Most have been characterized to some extent, but few have been studied in depth. Here, we describe the original discoveries in this field, and the properties of the first Type II REases investigated. We discuss the mechanisms of sequence recognition and catalysis, and the varied oligomeric modes in which Type II REases act. We describe the surprising heterogeneity revealed by comparisons of their sequences and structures.
For a very long time, Type II restriction enzymes (REases) have been a paradigm of ORFans: proteins with no detectable similarity to each other and to any other protein in the database, despite common cellular and biochemical function. Crystallographic analyses published until January 2008 provided high-resolution structures for only 28 of 1637 Type II REase sequences available in the Restriction Enzyme database (REBASE). Among these structures, all but two possess catalytic domains with the common PD-(D/E)XK nuclease fold. Two structures are unrelated to the others: R.BfiI exhibits the phospholipase D (PLD) fold, while R.PabI has a new fold termed ‘half-pipe’. Thus far, bioinformatic studies supported by site-directed mutagenesis have extended the number of tentatively assigned REase folds to five (now including also GIY-YIG and HNH folds identified earlier in homing endonucleases) and provided structural predictions for dozens of REase sequences without experimentally solved structures. Here, we present a comprehensive study of all Type II REase sequences available in REBASE together with their homologs detectable in the nonredundant and environmental samples databases at the NCBI. We present the summary and critical evaluation of structural assignments and predictions reported earlier, new classification of all REase sequences into families, domain architecture analysis and new predictions of three-dimensional folds. Among 289 experimentally characterized (not putative) Type II REases, whose apparently full-length sequences are available in REBASE, we assign 199 (69%) to contain the PD-(D/E)XK domain. The HNH domain is the second most common, with 24 (8%) members. When putative REases are taken into account, the fraction of PD-(D/E)XK and HNH folds changes to 48% and 30%, respectively. Fifty-six characterized (and 521 predicted) REases remain unassigned to any of the five REase folds identified so far, and may exhibit new architectures. 
These enzymes are proposed as the most interesting targets for structure determination by high-resolution experimental methods. Our analysis provides the first comprehensive map of sequence-structure relationships among Type II REases and will help to focus the efforts of structural and functional genomics of this large and biotechnologically important class of enzymes.
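The fold fractions quoted for the characterized enzymes can be rechecked with a few lines of arithmetic. This is only a sketch using the counts stated above; the variable names are illustrative, and the "other folds" remainder assumes each enzyme is counted in exactly one category, which the text does not state explicitly.

```python
# Counts quoted in the abstract for experimentally characterized Type II REases.
characterized = 289
counts = {"PD-(D/E)XK": 199, "HNH": 24, "unassigned": 56}


def percent(n, total):
    """Share of `total` expressed as a whole-number percentage."""
    return round(100 * n / total)


shares = {fold: percent(n, characterized) for fold, n in counts.items()}

# Assumption: every enzyme is counted once, so whatever is left over would
# belong to the remaining folds mentioned in the text.
other_folds = characterized - sum(counts.values())

for fold, share in shares.items():
    print(f"{fold}: {counts[fold]} of {characterized} ({share}%)")
print(f"remaining enzymes (other folds, by assumption): {other_folds}")
```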
The restriction endonuclease (REase) R.KpnI is an orthodox Type IIP enzyme, which binds to DNA in the absence of metal ions and, in the presence of Mg2+, cleaves the DNA sequence 5′-GGTAC^C-3′ to generate four-base 3′ overhangs. Bioinformatics analysis reveals that R.KpnI contains a ββα-Me-finger fold, which is characteristic of many HNH-superfamily endonucleases, including homing endonuclease I-HmuI, structure-specific T4 endonuclease VII, colicin E9, sequence non-specific Serratia nuclease and sequence-specific homing endonuclease I-PpoI. According to our homology model of R.KpnI, D148, H149 and Q175 correspond to the critical D, H and N or H residues of the HNH nucleases. Substitutions of these three conserved residues lead to the loss of the DNA cleavage activity by R.KpnI, confirming their importance. The mutant Q175E fails to bind DNA under standard conditions, although DNA binding and cleavage can be rescued at pH 6.0, indicating a role for Q175 in DNA binding and cleavage. Our study provides the first experimental evidence for a Type IIP REase that does not belong to the PD…D/EXK superfamily of nucleases but is instead a member of the HNH superfamily.
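The relationship between the cut position in 5′-GGTAC^C-3′ and the resulting overhang can be illustrated with a short sketch. It assumes a palindromic site written with `^` at the top-strand cut, so that the bottom-strand cut is the mirror image; the function names are hypothetical and not taken from any library.

```python
def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def overhang(site_with_cut):
    """Return (end type, overhang read 5'->3' on the top strand).

    `site_with_cut` is a palindromic site with '^' marking the top-strand cut.
    """
    cut = site_with_cut.index("^")
    site = site_with_cut.replace("^", "")
    if revcomp(site) != site:
        raise ValueError("this sketch only handles palindromic sites")
    # By symmetry, the bottom strand is cut at the mirrored position.
    mirror = len(site) - cut
    if cut == mirror:
        return "blunt", ""
    lo, hi = sorted((cut, mirror))
    # Top-strand cut to the right of the bottom-strand cut leaves a 3' overhang.
    kind = "3'" if cut > mirror else "5'"
    return kind, site[lo:hi]


print(overhang("GGTAC^C"))
```

For the KpnI site this reports a 3′ overhang of four bases, in agreement with the description above.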
DNA methyltransferases (MTases), unlike MTases acting on other substrates, exhibit sequence permutation. Based on the sequential order of the cofactor-binding subdomain, the catalytic subdomain, and the target recognition domain (TRD), several classes of permutants have been proposed. The majority of known DNA MTases fall into the α, β, and γ classes. There is only one member of the ζ class known and no members of the δ and ε classes have been identified to date. Two mechanisms of permutation have been proposed: one involving gene duplication and in-frame fusion, and the other involving inter- and intragenic shuffling of gene segments.
Two novel cases of sequence permutation in DNA MTases implicated in restriction-modification systems have been identified, which suggest that members of the δ and ζ classes (M.MwoI and M.TvoORF1413P, respectively) evolved from β-class MTases. This is the first identification of the δ-class MTase and the second known ζ-class MTase (the first ζ-class member among DNA:m4C and m6A-MTases).
Fragmentation of a DNA MTase gene may result from nuclease attack, for instance when the RM system invades a new cell. Its reassembly into a functional form, the order of motifs notwithstanding, may be strongly selected for if the cognate ENase gene remains active and poses a threat to the host's chromosome. The "cut-and-paste" mechanism is proposed for β-δ permutation, which is non-circular and involves relocation of one segment of a gene. The circular β-ζ permutation may be explained either by gene duplication or by shuffling of gene fragments. These two mechanisms are not mutually exclusive and probably both played a role in the evolution of permuted DNA MTases.
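The distinction between a circular and a non-circular permutation of gene segments can be made concrete with a small sketch. The segment orders below are purely illustrative and are not claimed to be the actual definitions of the α to ζ classes; `is_circular_permutation` is a hypothetical helper.

```python
def is_circular_permutation(order_a, order_b):
    """True if order_b is a rotation of order_a (same segments, cyclically shifted)."""
    a, b = list(order_a), list(order_b)
    if len(a) != len(b) or sorted(a) != sorted(b):
        return False
    doubled = a + a  # every rotation of `a` appears as a window of a + a
    return any(doubled[i:i + len(b)] == b for i in range(len(a)))


# Illustrative segment orders only (assumption, not the published class layouts).
reference = ["catalytic", "TRD", "cofactor-binding"]
rotated = ["TRD", "cofactor-binding", "catalytic"]    # circular shift
relocated = ["TRD", "catalytic", "cofactor-binding"]  # one segment moved

print(is_circular_permutation(reference, rotated))    # circular
print(is_circular_permutation(reference, relocated))  # non-circular
```

A rotation is what a duplication followed by trimming of the ends would produce, whereas relocating a single segment generally yields an order that is not a rotation of the original.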
BmrI (ACTGGG N5/N4) is one of the few metal-independent restriction endonucleases (REases) found in bacteria. The BmrI restriction-modification system was cloned by the methylase selection method, inverse PCR, and PCR. BmrI REase shows significant amino acid sequence identity to BfiI and a putative endonuclease MspBNCORF3798 from the sequenced Mesorhizobium sp. BNC1 genome. The EDTA-resistant BmrI REase was successfully over-expressed in a pre-modified E. coli strain from pET21a or pBAC-expIQ vectors. The recombinant BmrI REase shows strong promiscuous activity (star activity) in NEB buffers 1, 4, and an EDTA buffer. Star activity was diminished in buffers with 100–150 mM NaCl and 10 mM MgCl2. His-tagged BmrI192, the N-terminal cleavage domain of BmrI, was expressed in E. coli and purified from inclusion bodies. The refolded BmrI192 protein possesses non-specific endonuclease activity. BmrI192 variants with a single Ser to Cys substitution (S76C or S90C) and BmrI200 (T200C) with a single Cys at the C-terminal end were also constructed and purified. BmrI200 digests both single-strand (ss) and double-strand (ds) DNA and the nuclease activity on ss DNA is at least 5-fold higher than that on ds DNA. The Cys-containing BmrI192 and BmrI200 nuclease variants may be useful for coupling to other DNA binding elements such as synthetic zinc fingers, thio-containing locked nucleic acids (LNA) or peptide nucleic acids (PNA).
BmrI; EDTA-resistant nuclease; cleavage domain; chimeric endonuclease
The ββα-Me restriction endonuclease (REase) Hpy99I recognizes the CGWCG target sequence and cleaves it with unusual stagger (five nucleotide 5′-recessed ends). Here we present the crystal structure of the specific complex of the dimeric enzyme with DNA. The Hpy99I protomer consists of an antiparallel β-barrel and two β4α2 repeats. Each repeat coordinates a structural zinc ion with four cysteine thiolates in two CXXC motifs. The ββα-Me region of the second β4α2 repeat holds the catalytic metal ion (or its sodium surrogate) via Asp148 and Asn165 and activates a water molecule with the general base His149. In the specific complex, Hpy99I forms a ring-like structure around the DNA that contacts DNA bases on the major and minor groove sides via the first and second β4α2 repeats, respectively. Hpy99I interacts with the central base pair of the recognition sequence only on the minor groove side, where A:T resembles T:A and G:C is similar to C:G. The Hpy99I–DNA co-crystal structure provides the first detailed illustration of the ββα-Me site in REases and complements structural information on the use of this active site motif in other groups of endonucleases such as homing endonucleases (e.g. I-PpoI) and Holliday junction resolvases (e.g. T4 endonuclease VII).
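The degenerate central position of CGWCG (W = A or T) can be illustrated by expanding IUPAC codes into a regular expression and scanning a sequence. This is a minimal sketch; the test sequence is invented, and `find_sites` is a hypothetical helper rather than part of any established toolkit.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(site, dna):
    """Return 0-based start positions of `site` (IUPAC allowed) in `dna`."""
    pattern = "".join(IUPAC[base] for base in site.upper())
    # A lookahead is used so that overlapping occurrences are also reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", dna.upper())]


# Invented example sequence containing CGACG, CGTCG and a non-matching CGCCG.
dna = "AACGACGTTCGTCGCGCCG"
print(find_sites("CGWCG", dna))
```

Because CGWCG is its own reverse complement, scanning one strand is enough to find every site in this sketch.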
An industrial approach to protein production demands maximization of cloned gene expression, balanced with the recombinant host’s viability. Expression of toxic genes from thermophiles poses particular difficulties due to high GC content, mRNA secondary structures, rare codon usage and impaired replication of the plasmid that carries the gene in the host.
TaqII belongs to a family of bifunctional enzymes that fuse restriction endonuclease (REase) and methyltransferase (MTase) activities in a single polypeptide. The family contains thermostable REases with distinct specificities (TspGWI, TaqII, Tth111II/TthHB27I, TspDTI and TsoI) and a few enzymes found in mesophiles. Although they are not isoschizomers, the enzymes exhibit amino acid (aa) sequence homologies, have molecular sizes of ~120 kDa, share a common modular architecture, resemble Type-I enzymes and cleave DNA 11/9 nt from the recognition sites; their activity is affected by S-adenosylmethionine (SAM).
We describe the taqIIRM gene design, cloning and expression of the prototype TaqII. The enzyme amount in natural hosts is extremely low. To improve expression of the taqIIRM gene in Escherichia coli (E. coli), we designed and cloned a fully synthetic, codon-optimized taqIIRM gene with low GC content and low mRNA secondary structure under a bacteriophage lambda (λ) P promoter. Codon usage based on a modified ‘one amino acid–one codon’ strategy, weighted towards low GC content codons, resulted in approximately 10-fold higher expression of the synthetic gene. Of the 1105 codons in total, 718 were changed, comprising 65% of the taqIIRM gene. We intentionally chose this less effective strategy, rather than a ‘codon randomization’ strategy that results in high expression yields, to keep in vivo TaqII production sub-optimal and thereby decrease the high ‘toxicity’ of the REase-MTase protein.
Recombinant wt and synthetic taqIIRM genes were cloned and expressed in E. coli. The modified ‘one amino acid–one codon’ method tuned for thermophile-coded genes was applied to obtain overexpression of the ‘toxic’ taqIIRM gene. The method appears suited for industrial production of thermostable ‘toxic’ enzymes in E. coli. This novel variant of the method biased toward increasing a gene’s AT content may provide economic benefits for industrial applications.
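The general idea of a ‘one amino acid–one codon’ recoding weighted towards low GC content can be sketched as follows. This is a simplified illustration, not the modified strategy used for taqIIRM: it assumes the standard genetic code, picks for each amino acid a single synonymous codon with the fewest G/C bases, and ignores host codon usage and mRNA structure. The input sequence is invented.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_count(seq):
    return sum(base in "GC" for base in seq)


def lowest_gc_codons():
    """Map each amino acid (and stop) to one synonymous codon with minimal GC."""
    best = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in best or gc_count(codon) < gc_count(best[aa]):
            best[aa] = codon
    return best


def split_codons(cds):
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    return "".join(CODON_TABLE[c] for c in split_codons(cds))


def recode(cds):
    """Replace every codon by the single low-GC codon chosen for its amino acid."""
    best = lowest_gc_codons()
    return "".join(best[CODON_TABLE[c]] for c in split_codons(cds))


original = "ATGGCCCTGCGCGGGTAA"  # invented GC-rich toy coding sequence
recoded = recode(original)
changed = sum(a != b for a, b in zip(split_codons(original), split_codons(recoded)))
print(recoded, f"GC {gc_count(original)}/{len(original)} -> {gc_count(recoded)}/{len(recoded)}")
print(f"{changed} of {len(split_codons(original))} codons changed")
```

The encoded protein is unchanged while the GC content drops; counting changed codons in this way mirrors the kind of figure reported above for the synthetic gene.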
To explore the possibility of using restriction enzymes in a synthetic biology based on artificially expanded genetic information systems (AEGIS), 24 type-II restriction endonucleases (REases) were challenged to digest DNA duplexes containing recognition sites where individual Cs and Gs were replaced by the AEGIS nucleotides Z and P [respectively, 6-amino-5-nitro-3-(1′-β-d-2′-deoxyribofuranosyl)-2(1H)-pyridone and 2-amino-8-(1′-β-d-2′-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one]. These AEGIS nucleotides implement complementary hydrogen bond donor–donor–acceptor and acceptor–acceptor–donor patterns. Results allowed us to classify type-II REases into five groups based on their performance, and to infer some specifics of their interactions with functional groups in the major and minor grooves of the target DNA. For three enzymes among these 24 where crystal structures are available (BcnI, EcoO109I and NotI), these interactions were modeled. Further, we applied a type-II REase to quantitate the fidelity of polymerases challenged to maintain C:G, T:A and Z:P pairs in a DNA duplex through repetitive PCR cycles. This work thus adds tools that are able to manipulate this expanded genetic alphabet in vitro, provides some structural insights into the working of restriction enzymes, and offers some preliminary data needed to take the next step in synthetic biology to use an artificial genetic system inside of living bacterial cells.