Nucleic Acids Res. 2013 April; 41(7): 4198–4206.
Published online 2013 March 12. doi:  10.1093/nar/gkt102
PMCID: PMC3627594

Characterization of the 5-hydroxymethylcytosine-specific DNA restriction endonucleases


In T4 bacteriophage, 5-hydroxymethylcytosine (5hmC) is incorporated into DNA during replication. In response, bacteria may have developed modification-dependent type IV restriction enzymes to defend the cell from T4-like infection. PvuRts1I was the first identified restriction enzyme to exhibit specificity toward hmC over 5-methylcytosine (5mC) and cytosine. By using PvuRts1I as the original member, we identified and characterized a number of homologous proteins. Most enzymes exhibited similar cutting properties to PvuRts1I, creating a double-stranded cleavage on the 3′ side of the modified cytosine. In addition, for efficient cutting, the enzymes require two cytosines 21–22-nt apart and on opposite strands where one cytosine must be modified. Interestingly, the specificity determination unveiled a new layer of complexity where the enzymes not only have specificity for 5-β-glucosylated hmC (5βghmC) but also 5-α-glucosylated hmC (5αghmC). In some cases, the enzymes are inhibited by 5βghmC, whereas in others they are inhibited by 5αghmC. These observations indicate that the position of the sugar ring relative to the base is a determining factor in the substrate specificity of the PvuRts1I homologues. Lastly, we envision that the unique properties of select PvuRts1I homologues will permit their use as an additive or alternative tool to map the hydroxymethylome.


DNA modifications are present across many forms of life. One of the more commonly identified epigenetic modifications is cytosine methylation [5-methylcytosine (5mC)]. Depending on its location in the DNA, a 5mC modification performs a variety of biological roles, from protection against restriction enzymes to gene regulation. Prokaryotes contain restriction-modification systems where DNA methyltransferases modify the host DNA, and restriction enzymes serve as a protector against non-methylated foreign DNA. However, several bacteriophages have evolved to incorporate modified bases that are resistant to many restriction enzymes into their genomes (1). One well-known example is bacteriophage T4. During replication, all cytosines are replaced with 5-hydroxymethylcytosine (5hmC), which is further modified by α and β glucosylation of the hydroxymethyl group (2). Even though 5hmC is resistant to most restriction enzymes, McrA (3), McrBC (4) and Type IV SauUSI (5) have been shown to specifically restrict infection by 5hmC-containing phage in vivo. Additionally, PvuRts1I (6) and GmrSD UT and CT (7) have been shown to restrict DNA containing 5-glucosylhydroxymethylcytosine (5ghmC). T4 phage DNA consists of 30% β-glucosylated 5hmC and 70% α-glucosylated 5hmC (8).

In eukaryotes, 5mC has been associated with the regulation of transcriptional activity and shown to affect fundamental processes such as development, imprinting and genome stability (9). Recently, 5hmC was discovered in human brain tissue and in mouse embryonic stem cells (10,11) and has subsequently generated much interest within the scientific community. 5hmC was identified as the oxidative product of 5mC, a reaction catalyzed by the ten eleven translocation (TET) family enzymes (12). Furthermore, mutations in human TET2 are associated with myeloid malignancies, further supporting the physiological relevance of 5hmC (13).

Even though the exact role of 5hmC in higher organisms is still unclear, current literature proposes two possible functions. It can serve as an intermediate for cytosine demethylation (14–16) or it may influence chromatin structure by altering the binding of methyl CpG binding proteins (17,18). To fully elucidate the biological function of this new modification, methods to map the hydroxymethylome need to be developed.

There are currently three reported methods for single base-resolution hydroxymethylome mapping. Two of these methods, oxoBS-seq and TAB-seq, use bisulfite sequencing coupled with either chemical or enzymatic oxidation, respectively (19,20). In these methods, 5mC and 5hmC are read as cytosine after bisulfite sequencing, while further oxidized products of 5hmC (5-formylcytosine or 5-carboxylcytosine) are deaminated and subsequently read as thymine (21). The third method, Aba-seq, uses the enzymatic properties of AbaSI (formerly designated AbaSDFI), a member of the PvuRts1I restriction enzyme family shown to exhibit high specificity for 5hmC over 5mC and C, cleaving at a fixed distance away from the modification (22). Even though all three methods can map the 5hmC genome to base resolution, Aba-seq has certain advantages: using a restriction enzyme preserves the quality of the DNA, the method is semi-quantitative and it allows less abundant 5hmC sites to be accurately identified (23).

Owing to the increasing evidence for the importance of 5hmC in mammalian epigenetics and the success of Aba-seq in mapping the 5hmC genome to base resolution, we have sought to determine the in vitro biochemical properties of PvuRts1I homologues identified in REBASE. We thus characterized >25 family members focusing on comparing their substrate selectivity for different forms of cytosine modifications, in addition to their cut sites and recognition site requirements. Interestingly, in addition to observing differential cutting on beta-glucosylated T4 DNA (T4β), we also observed differential specificities for alpha-glucosylated T4 DNA (T4α) among the homologues. For example, AbaSI cutting is greatly inhibited by 5-α-glucosylated hmC (5αghmC) when compared with 5-β-glucosylated hmC (5βghmC), while PvuRts1I cutting is enhanced on 5αghmC when compared with 5βghmC.


Cloning, expression and purification of PvuRts1I homologues

C-terminally intein-tagged PvuRts1I homologous proteins were purified to high homogeneity from Escherichia coli strain T7 Express [New England Biolabs (NEB) #C2566] essentially as described (22). The sequences encoding the gene for a majority of the PvuRts1I enzyme family (Table 1) were optimized using Optimizer (24) and synthesized by Integrated DNA Technologies Inc (San Jose, CA, USA).

Table 1.
PvuRts1I homologue information

A large range of concentrations (0.016–4.5 mg/ml) were used for the characterization experiments due to variation in the expression levels of the proteins. The units of each enzyme used in the experiments could be calculated from their specific activities, resulting in a range of 1–400 units (Table 1). One unit of enzyme is defined as the amount to digest 1 µg of substrate (either T4gt or T4β, depending on the preference of each enzyme) to completion in NEB buffer 4 (50 mM potassium acetate, 20 mM Tris–acetate, 10 mM magnesium acetate, 1 mM dithiothreitol, pH 7.9) at 25°C, 20 min.

Analytical size exclusion chromatography

Analytical size exclusion chromatography was performed on a Superdex 200 10/300GL column (GE # 17-5175-01), pre-equilibrated in 500 mM potassium acetate, 10 mM Tris–acetate, pH 8.0 buffer. The column was calibrated in the equilibration buffer with blue dextran to measure the void volume (Vo) and with thyroglobulin (669 kDa), apoferritin (443 kDa), β-amylase (200 kDa), bovine serum albumin (BSA; 66 kDa) and carbonic anhydrase (29 kDa) as molecular mass standards. A standard curve was generated by plotting the molecular masses on a logarithmic scale versus Ve/Vo, where Ve is the elution volume. After calibration, the column was re-equilibrated in the same buffer, and the homologues, in amounts ranging from 200 µg to 3 mg (depending on the stock concentrations), were applied to the column. Ve/Vo for each protein was calculated and the molecular weights were determined from the standard curve.
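The standard-curve calculation described above can be sketched in Python as a least-squares line through log10(molecular mass) versus Ve/Vo, followed by interpolation for each homologue. This is only an illustration under stated assumptions: the function names are invented here, and the Ve/Vo values listed for the standards are hypothetical placeholders, not measurements from this study.

```python
import math

# Standards named in the text: (name, molecular mass in kDa, Ve/Vo).
# The Ve/Vo numbers are HYPOTHETICAL placeholders, not measured data.
STANDARDS = [
    ("thyroglobulin", 669.0, 1.25),
    ("apoferritin", 443.0, 1.40),
    ("beta-amylase", 200.0, 1.65),
    ("bovine serum albumin", 66.0, 1.95),
    ("carbonic anhydrase", 29.0, 2.20),
]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct Ve/Vo values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(standards):
    """Fit log10(mass in kDa) against Ve/Vo for the calibration standards."""
    xs = [ve_vo for _, _, ve_vo in standards]
    ys = [math.log10(mass) for _, mass, _ in standards]
    return fit_line(xs, ys)


def estimate_mass_kda(ve_vo, slope, intercept):
    """Read an apparent molecular mass off the standard curve."""
    return 10 ** (slope * ve_vo + intercept)


def oligomeric_state(apparent_kda, monomer_kda):
    """Nearest whole number of subunits implied by the apparent mass."""
    return round(apparent_kda / monomer_kda)


if __name__ == "__main__":
    slope, intercept = calibrate(STANDARDS)
    apparent = estimate_mass_kda(1.90, slope, intercept)  # hypothetical Ve/Vo
    print(f"apparent mass: {apparent:.0f} kDa")
```

Dividing the apparent mass by the calculated monomer mass, as in oligomeric_state, is one simple way to arrive at an oligomeric-state assignment of the kind reported in Table 3.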

T4 α-glucosyltransferase

The pAII17-α-glucosyltransferase (AGT) plasmid containing the coding region for AGT was transformed into a dcm E. coli strain T7 Express (NEB # C2566). After selection on solid Luria broth (LB) medium containing ampicillin (100 µg/ml), individual colonies were used to inoculate 1 L of LB medium containing ampicillin (100 µg/ml). The culture was incubated at 37°C until the OD600 reached 1.2, after which protein expression was induced with 0.2 mM isopropylthio-β-galactoside. After incubating at 16°C overnight, cells were harvested by centrifugation, suspended in 25 ml of 20 mM Tris–HCl, 0.1 mM ethylenediaminetetraacetic acid (EDTA), 100 mM NaCl buffer, pH 7.5 (eluent A) and sonicated at 4°C. Cell debris was removed by centrifugation, and the cell-free extract was loaded onto a 5 ml DEAE column, followed by a 5 ml HiTrap Heparin HP column and then a 5 ml HiTrap Q HP column. The DEAE and HiTrap Heparin HP columns were pre-equilibrated in eluent A at pH 7.5, and the HiTrap Q HP column was pre-equilibrated in eluent A at pH 8.0. AGT was eluted from each column with a linear gradient of NaCl, and protein purity was analyzed by sodium dodecyl sulphate–polyacrylamide gel electrophoresis (PAGE).

Glucosylation assay for T4gt DNA

A standard glucosylation assay consisted of a fixed concentration of uridine diphosphate (UDP)-glucose [1-3H] (American Radiolabeled Chemical, Inc; ART 0525), 100 ng of T4gt DNA and varying concentrations from a 2-fold dilution series of AGT in NEB buffer 4 for 2 h at 37°C. The reactions were stopped by flash freezing in an ethanol/dry ice bath. The samples were processed by applying the thawed reaction mixture to a 2.5 cm DE81 membrane (GE Healthcare# 3658-325) under air pressure using a vacuum manifold (Millipore). The reaction was washed three times with 0.2 M ammonium bicarbonate, followed by three times with deionized water and lastly, three times with ethanol. The membranes were dried, and the amount of tritium incorporation was determined by standard scintillation counting for 1 min (Perkin Elmer TriCarb 2900TR).

DNA substrates for specificity determination

Specificities of enzymes were determined on non-methylated lambda DNA (C), XP12 DNA [5mC (25)], phage T4gt DNA (containing non-glucosylated 5hmC), T4β (containing 5βghmC) and T4α (containing 5αghmC). Non-methylated lambda DNA was purchased from Sigma (# D3654). XP12, T4wt and T4gt genomic DNA were purified from phage cultures. DNA containing either 5βghmC or 5αghmC was obtained by further modification of T4gt DNA by the T4 β-glucosyltransferase [(BGT), NEB #M0357] and AGT, respectively. The relative specificities of the PvuRts1I homologues were determined by incubating 100 ng of each DNA substrate with a 2-fold serial dilution of each enzyme in Diluent E (250 mM potassium acetate, 10 mM Tris–acetate buffer, pH 8.0, 0.2 mg/ml BSA) in NEB buffer 4 for 20 min at room temperature. The reaction products were then resolved on a 0.8% agarose gel.

Substrates for cleavage site determination and recognition site requirements

The DNA oligonucleotides containing a top-strand 5hmC modification and 3′ fluorescein amidite (FAM) labels were synthesized by Integrated DNA Technologies. The sequences are as follows: 5′-CCA TAC ATA TCC CTT ACT TCT CCT AA(5hmC) GTG GAT GAT AAA GGT AGT TTA TGT GGA-3′FAM and 5′-TCC ACA TAA ACT ACC TTT ATC ATC CAC GTT AGG AGA AGT AAG GGA TAT GTA TGG-3′FAM. Double-stranded oligonucleotides (10 µM final) were obtained by heating solutions with equal concentrations of top- and bottom-strand oligonucleotide to 95°C followed by gradual cooling to room temperature. The DNA oligonucleotides with both a top- and bottom-strand 5hmC and 5′FAM labels were synthesized by polymerase chain reaction with forward (5′-CCA TAC ATA TCC CTT ACT TCT CCT A-3′) and reverse (5′-TCC ACA TAA ACT ACC TTT ATC ATC CAC G-3′) primers, using 5′-CCA TAC ATA TCC CTT ACT TCT CCT AAC GTG GAT GAT AAA GGT AGT TTA TGT GGA-3′ as a template in the presence of dhmCTP/dATP/dGTP/dTTP and Taq DNA Polymerase (NEB #M0273). Purification yielded a final concentration of 25–30 ng/µl of double-stranded oligonucleotide. Subsequently, both the 5′ and 3′FAM-labeled double-stranded oligonucleotides were glucosylated by an overnight incubation with BGT. The cleavage sites were then determined by incubating 100–150 ng of double-stranded oligonucleotide with each enzyme for 30 min at room temperature. The reaction products were resolved using a 20% polyacrylamide 7 M urea denaturing gel.

The oligonucleotides containing 5hmC used in Figure 4 were synthesized by the NEB Organic Synthesis Division. As in the preparation of the 3′FAM oligonucleotides used for the cut site determination, equal concentrations of top and bottom strands were mixed to yield a 10 µM final concentration of double-stranded substrate. The oligonucleotides were annealed by heating to 95°C followed by gradually cooling the solution to room temperature. To determine the recognition-site requirements for the enzymes, five different synthetic oligonucleotides (A, C/C, C, 5mC and 5hmC) were synthesized. Oligonucleotide A contains a 5hmC modification and a non-cytosine residue 22 nt away on the opposite strand; oligonucleotide C/C contains two cytosines 22 nt apart on opposite strands; oligonucleotide C contains a 5hmC modification and a cytosine 22 nt away on the opposite strand; oligonucleotide 5mC contains a 5hmC modification and a 5mC modification 22 nt away on the opposite strand; and oligonucleotide 5hmC contains two 5hmC modifications 22 nt apart on opposite strands (Figure 4). The sequences of the substrates are listed in Table 2. Each substrate (77 ng) was incubated with enzyme for 30 min at room temperature. The reaction products were then resolved using a 5% agarose gel.

Table 2.
Oligonucleotides for cytosine modification dependence
Figure 4.
Activity of enzymes on different modified oligonucleotides. The sequence of the oligonucleotides can be found in Table 2. (A) Schematic of the five modified oligonucleotides used for the activity determination; A, C/C, C, 5mC and 5hmC. The two indicated ...


A number of PvuRts1I homologues were identified by blasting the PvuRts1I protein sequence against the NR and ENV_NR databases [(26), Table 1]. Seven homologues have identical sequences and are therefore omitted from the comparison.

Of the 28 hits from NCBI/BLAST, five were inactive when tested for activity against T4wt and T4gt in crude lysates and three were not purified using the methods described here. The remaining recombinant proteins were fused with a cleavable intein- and chitin-binding domain and were purified to near homogeneity. Figure 1A shows an example of the purity of the homologues and indicates that they are relatively pure. The remaining homologues also show a similar level of purity to those pictured. Furthermore, size exclusion chromatography (Figure 1B) indicates that all of the homologues are likely dimers (Table 3).

Figure 1.
Purified homologues and gel filtration analysis of the homologues. (A) The following homologues were run on a Tris–glycine 4–20% gel as an example of the relative purity of the proteins. Lane M is the ColorPlus protein ladder (NEB # P7710). ...
Table 3.
Oligomeric state of the homologues as determined by gel filtration

In vitro conversion rate of T4 DNA by the α-glucosyltransferase

BGT can fully glucosylate T4 DNA in vitro (27). To determine the percent of glucosyl incorporation by AGT, the extent of glucosylation by AGT and BGT on a common substrate was compared. Incorporation was comparable, indicating an in vitro conversion rate of 5hmC to 5αghmC by AGT of 100% in T4 DNA.
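The comparison above amounts to expressing the tritium incorporated by AGT as a percentage of that incorporated by BGT on the same substrate. A minimal Python sketch of that arithmetic follows. It assumes that BGT incorporation represents complete glucosylation (27), that both reactions used the same amount of DNA and the same specific activity of UDP-glucose, and that an optional no-enzyme background count is subtracted; the function name and the background correction are illustrative assumptions, not details given in the text.

```python
def percent_conversion(agt_cpm, bgt_cpm, background_cpm=0.0):
    """Return AGT incorporation as a percentage of BGT incorporation.

    agt_cpm, bgt_cpm: scintillation counts for the AGT and BGT reactions.
    background_cpm:   counts from a no-enzyme control (assumed, optional).
    """
    reference = bgt_cpm - background_cpm
    if reference <= 0:
        raise ValueError("BGT counts must exceed the background")
    return 100.0 * (agt_cpm - background_cpm) / reference


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(percent_conversion(agt_cpm=5050.0, bgt_cpm=5100.0, background_cpm=100.0))
```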

Relative selectivity of the PvuRts1I homologues

AbaSI, a PvuRts1I homologue, has previously been characterized as a modification-dependent restriction endonuclease that recognizes 5hmC as well as 5ghmC with little to no activity toward 5mC and C (22). With the hope of discovering an enzyme with even higher selectivity than AbaSI, we sought to characterize a set of 20 PvuRts1I homologues. The substrate selectivity of the homologues was determined with a method similar to that used in the initial substrate selectivity determination of PvuRts1I (22). Each homologue was assayed on non-methylated lambda DNA (containing unmodified cytosines), phage XP12 DNA (containing 5mC), phage T4gt DNA (containing non-glucosylated 5hmC), phage T4β DNA (containing 5βghmC) and phage T4α DNA (containing 5αghmC). The relative selectivity for each enzyme is defined as the ratio of activity on the different modified cytosine substrates. The homologues with the highest selectivity include AbaUI, which exhibits a relative selectivity of 5hmC:5αghmC:5βghmC:5mC:C = 512:16:8192:8:ND (ND: non-detectable, meaning no apparent difference between cut and uncut substrate, Figure 2); AbaAI, which exhibits a relative selectivity of 5hmC:5αghmC:5βghmC:5mC:C = 1024:128:16384:16:ND; and BbiDI, which exhibits a relative selectivity of 5hmC:5αghmC:5βghmC:5mC:C = 4096:4096:256:2:1. Figure 2 illustrates the comparison of the specific activities for all the homologues. In contrast to the β-glucosyl modification, the α-glucosyl modification resulted in varying effects on the relative selectivity of the homologues. For AbaSI, the 5αghmC modification reduced activity to 1/500 of that observed with 5βghmC. In contrast, for PvuRts1I, the 5αghmC modification enhanced selectivity by 32-fold when compared with 5βghmC. These modifications are important because even though they are not known to exist in the human genome, 5hmC can be converted in vitro to 5αghmC and 5βghmC by AGT and BGT, respectively.
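The reported ratios are all powers of two, which is what one would expect when activity is read from a 2-fold enzyme dilution series: each additional dilution step at which cutting is still observed corresponds to a 2-fold higher activity. The Python sketch below illustrates that arithmetic. The function name, the use of an endpoint step index and the normalization to the weakest detected substrate are assumptions made for illustration, not the authors' stated procedure, and the step values in the example are back-calculated from the BbiDI ratio quoted above rather than taken from measured data.

```python
def relative_selectivity(endpoint_steps):
    """Convert 2-fold dilution endpoints into relative activities.

    endpoint_steps maps a substrate name to the index of the last 2-fold
    dilution step at which cutting is still observed, or None when no
    cutting is detectable (ND). The weakest detected substrate is set to 1.
    """
    detected = [step for step in endpoint_steps.values() if step is not None]
    if not detected:
        raise ValueError("no substrate shows detectable cutting")
    baseline = min(detected)
    return {
        substrate: None if step is None else 2 ** (step - baseline)
        for substrate, step in endpoint_steps.items()
    }


if __name__ == "__main__":
    # Endpoint steps back-calculated from the BbiDI ratio (illustrative only).
    bbidi = {"5hmC": 12, "5aghmC": 12, "5bghmC": 8, "5mC": 1, "C": 0}
    print(relative_selectivity(bbidi))
```

Because the sketch normalizes to the weakest detected substrate, an enzyme with no detectable cutting on C would be scaled to its next-weakest substrate, which differs from the common scale used for the ratios in the text.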

Figure 2.
Relative selectivity of PvuRts1I homologues. Selectivity was determined on DNA with different modified cytosines: dcm (unmodified cytosine), XP12 (methylated cytosines), T4wt (hydroxymethylated cytosines), T4α (α-glucosylated ...

Cleavage properties

To determine the cleavage properties of the PvuRts1I homologues, two substrates were designed: a hemi-glucosylhydroxymethylated oligonucleotide with 3′FAM labels (Figure 3A) and a fully glucosylhydroxymethylated oligonucleotide with 5′FAM labels (Figure 3B). Both were subjected to enzyme digestion, which allows detection of each of the cleavage products. The 3′FAM-labeled substrate allows us to determine the cleavage pattern at a single 5ghmC site and the cleavage site on the same strand as the modification. The 5′FAM-labeled substrate allows us to determine the cleavage site on the strand opposite the modification and whether the cleavage properties are altered at a fully hydroxymethylated site. Denaturing PAGE allowed for the single-base resolution of the small digested fragments, which were subsequently compared with size markers and with the cleavage pattern produced by AbaSI.

Figure 3.
Cleavage site determination of the PvuRts1I homologues. (A) The left side of the figure shows the sequence of the hemi-glucosylhydroxymethylated 3′FAM-labeled 54-bp oligonucleotide used to determine the cleavage site on the same strand of the ...

It has previously been shown that AbaSI exhibits a double-stranded cleavage to the 3′ side of a cytosine modification at N11–13/N9–10 (22). If the enzymes exhibit a similar cutting pattern to AbaSI, cleavage of the hemi-glucosylhydroxymethylated 3′FAM-labeled substrate would result in two labeled fragments of 15(+/−) and 39(+/−) nt. Digestion of the 3′FAM-labeled substrate by the homologues showed the same cutting pattern as AbaSI, indicating a cleavage site of 12–13 nt away on the same strand of the modified cytosine (Figure 3A).

For the fully glucosylhydroxymethylated 5′FAM-labeled substrate, if AbaSI recognizes only one 5ghmC modification, cleavage would result in two labeled fragments of 17(+/−) and 39(+/−) nt. If AbaSI recognizes both 5ghmC modifications, cleavage would occur on both sides of the 5ghmC sites, resulting in two FAM-labeled fragments of 17(+/−) nt. Digestion of the 5′FAM-labeled substrate by the homologues results in a mixture of products from cleavage on only one side [39(+/−), 17(+/−) nt, Figure 3B (1a, 1b)] and on both sides [17(+/−), Figure 3B (2)] of the modifications. From the accurate measurement of the 17-nt fragment, we can deduce that the cleavage on the opposite strand is 9–10 nt away from the modification. The 39(+/−) nt fragment in Figure 3B is labeled as an intermediate because if the reaction went to completion, it would have been completely digested into smaller fragments. If only one modification were being recognized, we would expect to see equal intensities of the 39(+/−) and 17(+/−) nt bands. Instead, the 17(+/−) nt band is more intense, indicating that both modifications are being recognized. Incomplete enzyme digestion is often seen when using synthetic oligonucleotides compared with genomic DNA. In addition, as the cleavage site for the homologues on the fully hydroxymethylated substrate is the same as that reported for AbaSI on a hemi-hydroxymethylated substrate (22), we can conclude that the presence of two modifications does not alter the cleavage properties of the enzymes. Overall, our results suggest that most of the homologues cleave at the same position as AbaSI, N11–13/N9–10 on the 3′ side of the modified cytosine (Table 4). However, there are some exceptions such as BmeDI, which cuts to a low degree at N11–13/N9–10 but predominantly at N2–3/N0–1, a cut site close to the 5hmC modification.
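The expected 5′FAM-labeled fragment sizes for the case in which only one modification is recognized can be checked with a short Python calculation. The sketch below uses the 54-nt top-strand sequence given in the Methods, in which the modified cytosine sits at position 27. It assumes that the cut-site distances count the nucleotides between the modified base pair and the cut on each strand (about 12 nt on the modified strand and 9–10 nt on the opposite strand) and that both cuts lie on the 3′ side of the top-strand modification; the function and variable names are illustrative.

```python
# Top strand of the 54-nt substrate from the Methods (spaces removed).
TOP_STRAND = "CCATACATATCCCTTACTTCTCCTAACGTGGATGATAAAGGTAGTTTATGTGGA"
MODIFIED_POSITION = 27  # 1-based position of the modified cytosine


def five_prime_fragments(length, modified_position, top_gap, bottom_gap):
    """Sizes of the two 5'-labeled fragments when one modification is used.

    top_gap:    nucleotides between the modified C and the top-strand cut.
    bottom_gap: nucleotides between the paired G and the bottom-strand cut.
    Both cuts are assumed to lie 3' of the modified C on the top strand.
    """
    # Top strand: 5' end through the last nucleotide before the cut.
    top_fragment = modified_position + top_gap
    # Bottom strand: its 5' end pairs with the 3' end of the top strand, so
    # the labeled piece spans from the duplex end back to the bottom cut.
    bottom_fragment = length - (modified_position + bottom_gap)
    return top_fragment, bottom_fragment


if __name__ == "__main__":
    for top_gap in (11, 12, 13):
        for bottom_gap in (9, 10):
            sizes = five_prime_fragments(
                len(TOP_STRAND), MODIFIED_POSITION, top_gap, bottom_gap
            )
            print(top_gap, bottom_gap, sizes)
```

With a top-strand distance of 12 nt and a bottom-strand distance of 10 nt, the calculation gives fragments of 39 and 17 nt, matching the sizes expected in the text for recognition of a single modification.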

Table 4.
Cut sites of the PvuRts1I homologues

Recognition site requirements

For efficient cleavage, PvuRts1I requires both a 5hmC modification and an additional cytosine on the opposite strand. According to Hua et al., 47% of the sequences cut by the PvuRts1I homologue AbaSI contain two cytosines 21 nt apart and 45% contain two cytosines 22 nt apart (22). In addition, the cleavage efficiency was found to depend on the modification status: when one of the 5hmCs in the recognition site is changed to 5mC or C, the efficiency decreases (22). To determine whether the PvuRts1I homologues also possess specific requirements for site recognition, synthetic oligonucleotides were designed to contain 5hmC on one strand and 5hmC, 5mC, C or no cytosine 22 nt away on the opposite strand (Figure 4A). The extent of digestion was determined by resolving the DNA on either a 10% Tris–borate–EDTA gel or a 5% agarose gel. Similar to PvuRts1I, while the homologues show modest activity with one 5hmC modification and an additional cytosine 22 nt away on the opposite strand, the highest activity is exhibited on substrates with two 5hmCs 22 nt apart on opposite strands (Figure 4B). In addition, the cutting efficiency decreases as the bottom-strand modification changes from 5hmC to 5mC to C, and there is no detectable cutting in the absence of a second cytosine. This indicates that all of the homologues have an absolute requirement for a second cytosine on the opposite strand 22 nt away.
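The spacing requirement can be illustrated by scanning a top-strand sequence for a cytosine that has a cytosine on the opposite strand (a G on the top strand) at the required distance on its 3′ side. The Python sketch below is a hypothetical helper, not a tool used in this study. It assumes that "21–22 nt apart" counts the nucleotides lying between the two cytosines, a convention the text does not spell out, and it reports candidate sites from sequence alone without any information on modification status.

```python
def candidate_sites(top_strand, spacings=(21, 22)):
    """Find top-strand C / bottom-strand C pairs at the given spacings.

    Returns a list of (c_position, g_position, spacing) tuples with 1-based
    positions on the top strand. `spacing` is the number of nucleotides
    between the C and the G (the G marks a cytosine on the bottom strand).
    """
    seq = top_strand.upper()
    hits = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        for spacing in spacings:
            j = i + spacing + 1  # index of the partner base on the top strand
            if j < len(seq) and seq[j] == "G":
                hits.append((i + 1, j + 1, spacing))
    return hits


if __name__ == "__main__":
    # 54-nt top strand from the Methods (spaces removed).
    oligo = "CCATACATATCCCTTACTTCTCCTAACGTGGATGATAAAGGTAGTTTATGTGGA"
    for hit in candidate_sites(oligo):
        print(hit)
```

Under this counting convention, the modified cytosine at position 27 of the 54-nt substrate described in the Methods pairs with a G at position 50, with 22 nt between them.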


Enzymes have been used successfully for decades to answer important questions in biology. Here we present the characterization of a special class of enzymes specific for modified cytosines in DNA. These appear to have evolved as a defense mechanism in the struggle between unicellular organisms and their viruses. PvuRts1I was among the first of these enzymes identified and was shown to restrict T-even bacteriophages that contain 5hmC or 5ghmC (6). Characterization of the PvuRts1I family enzymes shows that, like AbaSI, all of the enzymes exhibit DNA-modification–dependent endonuclease activity with similar cleavage properties. Specifically, most of the homologues generate a double-stranded cut on the 3′ side of the modified cytosine at a distance of CN11–13 on the top strand and N9–10G on the bottom strand (Figure 4). Additionally, for efficient cleavage, the enzymes require two cytosines separated by 21–22 nt where one cytosine must be modified (Figure 4). The observation that two cytosines are required for efficient cleavage agrees with our finding that the homologues form a dimer in solution.

There is one outlier, BmeDI, which generates a double-stranded cut on the 3′ side of the modified cytosine at a distance of CN2–3 on the top strand and N0–1G on the bottom strand. A sequence alignment with the other PvuRts1I homologues shows that the sequence of BmeDI is unique. Specifically, BmeDI has a long C-terminus of ~40 amino acids that extends beyond the end of the alignment of all the other homologues. This observation led us to create a phylogenetic tree based on the sequences of the homologues. The tree showed BmeDI on a branch of its own, indicating that it evolved differently from its other family members and supporting our hypothesis that BmeDI is a unique enzyme. Further studies are required to determine the exact reason or amino acids that are responsible for the difference in cut site of BmeDI.

To compare and contrast the specificity of the PvuRts1I family members, we generated DNA substrates that contain different cytosine modifications. The specificity of this class of enzymes is especially important because the amount of 5hmC in the genome is extremely low in comparison with 5mC or C (28). All of the enzymes assayed exhibit different relative selectivities toward the various DNA substrates with minimal to non-existent cutting on C. Notably, AbaAI, AbaCI, AbaUI and AbaTI exhibit the highest selectivity between 5βghmC and 5mC at 1000:1, while BbiDI exhibits the highest selectivity between 5αghmC and 5mC at 2000:1 (Figure 2). The observation of homologous enzymes exhibiting different relative selectivities toward their substrates is consistent with many examples in the literature where the difference of only a few amino acids can result in varied substrate specificity (29,30).

Interestingly, this comparison revealed an additional layer of complexity with the homologues exhibiting varied specificity toward α- and β-glucosylated 5hmC DNA. For example, BbiDI, BmeDI and PvuRts1I show high selectivity for 5αghmC but are inhibited by 5βghmC, while AbaCI, AbaUI, AbaAI and AbaSI (to name a few) show high selectivity for 5βghmC but are inhibited by 5αghmC (Figure 2). T4wt DNA contains a mixture of α- and β-glucosylated 5hmC (8). To survive infection by T4-like phages, bacteria must contain enzymes with specificity toward either α or β 5ghmC. Enzyme active sites can be specific, and even a small change in the substrate or cofactor structure can have an immense effect on specificity. We believe that the differing responses to the two sugar ring conformations are likely attributable to differences in binding site specificity among the PvuRts1I homologues. This is supported by Gruber et al. (31), who determined that UDP-galactopyranose mutase (UGM) has the ability to discriminate between two structurally similar substrates, UDP-galactopyranose (UDP-Galp) and UDP-glucose (UDP-Glc). Even though UDP-Galp and UDP-Glc differ only by the conformation of the sugar moiety, UGM discriminates against the latter during both binding and catalysis, which was attributed to the orientation of the sugar moieties in the active site. The crystal structure of AbaSI, once determined, will provide further insight into the mechanism of substrate specificity.

Lastly, the new observation that PvuRts1I homologues are either enhanced or inhibited by 5αghmC offers either an alternative or an additive approach to map the hydroxymethylome. When designing such an experiment, it is imperative to have a high level of confidence that only 5hmC sites are being identified. This can be difficult because 5mC is much more abundant in the genome than 5hmC. Nevertheless, the unique characteristics of the PvuRts1I enzymes can provide this confidence. For example, AbaSI is strongly inhibited by α-glucosylation and enhanced by β-glucosylation. The β-glucosylated DNA sample will capture all 5hmC sites, in addition to sites that contain 5mC from low-level digestion by AbaSI. The α-glucosylated DNA sample will only capture sites that contain 5mC, and can serve as an experimental control. These two sample preparations can be compared and the difference will identify only the 5hmC sites. Furthermore, as Aba-seq has already proved to be a successful method in mapping the hydroxymethylome (23), we envision a similar use for the homologues with high selectivity between 5βghmC and 5mC.
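The proposed comparison of the two sample preparations is, in its simplest form, a set difference between the sites recovered from the β-glucosylated sample and those recovered from the α-glucosylated control. The Python sketch below illustrates only that logic. The representation of a site as a (chromosome, position, strand) tuple, the function name and the example coordinates are hypothetical, and a real analysis would also need to account for read depth and positional tolerance, which are not addressed here.

```python
def call_5hmc_sites(beta_sites, alpha_sites):
    """Sites found in the beta-glucosylated sample but not in the alpha control.

    Each site is a hashable key such as (chromosome, position, strand).
    Sites present in the alpha-glucosylated control are treated as 5mC
    background and removed.
    """
    return sorted(set(beta_sites) - set(alpha_sites))


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    beta = [("chr1", 1050, "+"), ("chr1", 2200, "-"), ("chr2", 310, "+")]
    alpha = [("chr1", 2200, "-")]
    print(call_5hmc_sites(beta, alpha))
```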


National Institutes of Health (NIH), SBIR [GM096723]. Funding for open access charge: NIH.

Conflict of interest statement. None declared.


We would like to thank Drs Yu Zheng, Richard Morgan and Jurate Bitinaite for materials; Dr Zhiyi Sun for discussions and enzyme evolution analysis; Drs Geoffrey Wilson, William Jack and Richard Roberts for critical reviews and suggestions regarding the manuscript; and Jim Ellard and Donald Comb for support.


1. Warren RA. Modified bases in bacteriophage DNAs. Annu. Rev. Microbiol. 1980;34:137–158. [PubMed]
2. Kornberg SR, Zimmerman SB, Kornberg A. Glucosylation of deoxyribonucleic acid by enzymes from bacteriophage-infected Escherichia coli. J. Biol. Chem. 1961;236:1487–1493. [PubMed]
3. Mulligan EA, Dunn JJ. Cloning, purification and initial characterization of E. coli McrA, a putative 5-methylcytosine-specific nuclease. Protein Expr. Purif. 2008;62:98–103. [PMC free article] [PubMed]
4. Sutherland E, Coe L, Raleigh EA. McrBC: a multisubunit GTP-dependent restriction endonuclease. J. Mol. Biol. 1992;225:327–348. [PubMed]
5. Xu SY, Corvaglia AR, Chan SH, Zheng Y, Linder P. A type IV modification-dependent restriction enzyme SauUSI from Staphylococcus aureus subsp. aureus USA300. Nucleic Acids Res. 2011;39:5597–5610. [PMC free article] [PubMed]
6. Janosi L, Yonemitsu H, Hong H, Kaji A. Molecular cloning and expression of a novel hydroxymethylcytosine-specific restriction enzyme (PvuRts1I) modulated by glucosylation of DNA. J. Mol. Biol. 1994;242:45–61. [PubMed]
7. Rifat D, Wright NT, Varney KM, Weber DJ, Black LW. Restriction endonuclease inhibitor IPI* of bacteriophage T4: a novel structure for a dedicated target. J. Mol. Biol. 2008;375:720–734. [PMC free article] [PubMed]
8. Lehman IR, Pratt EA. On the structure of the glucosylated hydroxymethylcytosine nucleotides of coliphages T2, T4, and T6. J. Biol. Chem. 1960;235:3254–3259. [PubMed]
9. Rottach A, Leonhardt H, Spada F. DNA methylation-mediated epigenetic control. J. Cell Biochem. 2009;108:43–51. [PubMed]
10. Tahiliani M, Koh KP, Shen Y, Pastor WA, Bandukwala H, Brudno Y, Agarwal S, Iyer LM, Liu DR, Aravind L, et al. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science. 2009;324:930–935. [PMC free article] [PubMed]
11. Kriaucionis S, Heintz N. The nuclear DNA base 5-hydroxymethylcytosine is present in Purkinje neurons and the brain. Science. 2009;324:929–930. [PMC free article] [PubMed]
12. Ito S, D'Alessio AC, Taranova OV, Hong K, Sowers LC, Zhang Y. Role of Tet proteins in 5mC to 5hmC conversion, ES-cell self-renewal and inner cell mass specification. Nature. 2010;466:1129–1133. [PMC free article] [PubMed]
13. Meyer C, Kowarz E, Hofmann J, Renneville A, Zuna J, Trka J, Ben Abdelali R, Macintyre E, De Braekeleer E, De Braekeleer M, et al. New insights to the MLL recombinome of acute leukemias. Leukemia. 2009;23:1490–1499. [PubMed]
14. Cortellino S, Xu J, Sannai M, Moore R, Caretti E, Cigliano A, Le Coz M, Devarajan K, Wessels A, Soprano D, et al. Thymine DNA glycosylase is essential for active DNA demethylation by linked deamination-base excision repair. Cell. 2011;146:67–79. [PMC free article] [PubMed]
15. He YF, Li BZ, Li Z, Liu P, Wang Y, Tang Q, Ding J, Jia Y, Chen Z, Li L, et al. Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science. 2011;333:1303–1307. [PMC free article] [PubMed]
16. Maiti A, Drohat AC. Thymine DNA glycosylase can rapidly excise 5-formylcytosine and 5-carboxylcytosine: potential implications for active demethylation of CpG sites. J. Biol. Chem. 2011;286:35334–35338. [PubMed]
17. Valinluck V, Sowers LC. Endogenous cytosine damage products alter the site selectivity of human DNA maintenance methyltransferase DNMT1. Cancer Res. 2007;67:946–950. [PubMed]
18. Valinluck V, Tsai HH, Rogstad DK, Burdzy A, Bird A, Sowers LC. Oxidative damage to methyl-CpG sequences inhibits the binding of the methyl-CpG binding domain (MBD) of methyl-CpG binding protein 2 (MeCP2). Nucleic Acids Res. 2004;32:4100–4108. [PMC free article] [PubMed]
19. Booth MJ, Branco MR, Ficz G, Oxley D, Krueger F, Reik W, Balasubramanian S. Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. Science. 2012;336:934–937. [PubMed]
20. Yu M, Hon GC, Szulwach KE, Song CX, Zhang L, Kim A, Li X, Dai Q, Shen Y, Park B, et al. Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell. 2012;149:1368–1380. [PMC free article] [PubMed]
21. Huang Y, Pastor WA, Shen Y, Tahiliani M, Liu DR, Rao A. The behaviour of 5-hydroxymethylcytosine in bisulfite sequencing. PLoS One. 2010;5:e8888. [PMC free article] [PubMed]
22. Wang H, Guan S, Quimby A, Cohen-Karni D, Pradhan S, Wilson G, Roberts RJ, Zhu Z, Zheng Y. Comparative characterization of the PvuRts1I family of restriction enzymes and their application in mapping genomic 5-hydroxymethylcytosine. Nucleic Acids Res. 2011;39:9294–9305. [PMC free article] [PubMed]
23. Sun Z, Terragni J, Borgaro JG, Liu Y, Yu L, Guan S, Wang H, Sun D, Cheng X, Zhu Z, et al. High resolution enzymatic mapping of genomic 5-hydroxymethylcytosine in mouse embryonic stem cells. Cell Rep. 2012;3:1–10. [PubMed]
24. Puigbo P, Guzman E, Romeu A, Garcia-Vallve S. OPTIMIZER: a web server for optimizing the codon usage of DNA sequences. Nucleic Acids Res. 2007;35:W126–W131. [PMC free article] [PubMed]
25. Kuo TT, Huang TC, Teng MH. 5-Methylcytosine replacing cytosine in the deoxyribonucleic acid of a bacteriophage for Xanthomonas oryzae. J. Mol. Biol. 1968;34:373–375. [PubMed]
26. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. [PMC free article] [PubMed]
27. Georgopoulos CP, Revel HR. Studies with glucosyl transferase mutants of the T-even bacteriophages. Virology. 1971;44:271–285. [PubMed]
28. Song CX, Szulwach KE, Fu Y, Dai Q, Yi C, Li X, Li Y, Chen CH, Zhang W, Jian X, et al. Selective chemical labeling reveals the genome-wide distribution of 5-hydroxymethylcytosine. Nat. Biotechnol. 2011;29:68–72. [PMC free article] [PubMed]
29. Shi D, Yu X, Cabrera-Luque J, Chen TY, Roth L, Morizono H, Allewell NM, Tuchman M. A single mutation in the active site swaps the substrate specificity of N-acetyl-L-ornithine transcarbamylase and N-succinyl-L-ornithine transcarbamylase. Protein Sci. 2007;16:1689–1699. [PubMed]
30. Todd AE, Orengo CA, Thornton JM. Evolution of function in protein superfamilies, from a structural perspective. J. Mol. Biol. 2001;307:1113–1143. [PubMed]
31. Gruber TD, Borrok MJ, Westler WM, Forest KT, Kiessling LL. Ligand binding and substrate discrimination by UDP-galactopyranose mutase. J. Mol. Biol. 2009;391:327–340. [PMC free article] [PubMed]

Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press