The post-translational modification of nucleocytoplasmic proteins with O-linked 2-acetamido-2-deoxy-d-glucopyranose (O-GlcNAc) is a topic of considerable interest and attracts a great deal of research effort. O-GlcNAcylation is a dynamic process which can occur multiple times over the lifetime of a protein, sometimes in a reciprocal relationship with phosphorylation. Several hundred proteins, which are involved in a diverse range of cellular processes, have been identified as being modified with the monosaccharide. The control of the O-GlcNAc modification state on different protein targets appears to be important in the aetiology of a number of diseases, including type II diabetes, neurodegenerative diseases and cancer. Two enzymes are responsible for the addition and removal of the O-GlcNAc modification: uridine diphospho-N-acetylglucosamine:polypeptide β-N-acetylglucosaminyltransferase (OGT) and O-GlcNAcase (OGA), respectively. Over the past decade the volume of information known about these two enzymes has increased significantly. In particular, mechanistic studies of OGA, in conjunction with structural studies of bacterial homologues of OGA have stimulated the design of inhibitors and offered a rationale for the binding of certain potent and selective inhibitors. Mechanistic information about OGT lags a little way behind OGA, but the recent deduction of the structure of an OGT bacterial homologue should now drive these studies forward.
The post-translational modification of proteins with a single 2-acetamido-2-deoxy-d-glucopyranose (GlcNAc) residue was first detected 25 years ago. Despite considerable research efforts in the field during this time, the precise functional roles this simple monosaccharide plays in vivo remain a topic of considerable continuing interest. The GlcNAc moiety is attached through a β-glycosidic O-linkage (thus O-GlcNAc) to the serine or threonine residues of proteins. Unlike all other forms of eukaryotic protein glycosylation, which occur within the secretory pathway, the O-GlcNAc modification is found on proteins in the nucleus or cytoplasm. The modification is ubiquitous throughout higher eukaryotes. While disruption of the gene encoding the enzyme responsible for transfer of O-GlcNAc showed it was essential for embryonic stem cell viability in mice and more recently for development in Drosophila melanogaster [3, 4], loss of the genes encoding the enzymes for either addition or removal of the monosaccharide in Caenorhabditis elegans results in growth defects [5, 6].
The extent of the protein targets that get O-GlcNAc modified is most likely not yet fully established, but to date several hundred have been identified [7-9]. The protein targets have widespread functions in cells, and include classes of proteins such as transcription factors (such as Sp1, p53 and c-myc [10, 11]), cytoskeletal proteins (such as talin, α-tubulin and the neurofilament proteins [12-15]), nuclear pore proteins (such as p62 and p180 [16-18]), chaperones (such as HSP70 ), as well as enzymes such as RNA polymerase II  and the proteasome . The control of the O-GlcNAc modified state of different protein targets appears important in the aetiology of a number of diseases, including type II diabetes, neurodegenerative diseases, and cancer. Control of global O-GlcNAc levels is believed to play a role in mediating insulin resistance [21-23], a key feature of type II diabetes, although recently this widely held view has been questioned [24-26] suggesting that more research in this area may clarify the processes involved. The balance between O-GlcNAcylation and phosphorylation on tau is implicated in the development of Alzheimer's disease; tau normally stabilizes microtubules but it is thought that its hyperphosphorylation leads to its aggregation into paired helical filaments which, in turn, aggregate to form the neurofibrillary tangles that constitute one half of the pathology of this disease . Control of O-GlcNAc levels may also be aberrant in cancer, since, for example, the tumour suppressor protein p53 and protooncogene product c-myc are modified [11, 28].
The O-GlcNAc modification is dynamic, with cycling of the addition and removal of the monosaccharide occurring several times over the lifetime of a target protein . In addition, O-GlcNAcylation has some interplay with phosphorylation (Fig. 1) ; the modifications have been observed to occur independently on the same serine or threonine residue of a protein (such as c-myc ) or competitively at adjacent or nearby residues (such as p53 ), which may allow control of cellular signalling. Although to date only a limited number of O-GlcNAc sites have been mapped, there appears to be no consensus sequence, unlike N-linked glycosylation. It has been noted, however, that Pro-Val-Ser is present in a significant portion of the sites mapped, which is a sequence similar to that recognized by the proline-directed kinases [8, 32, 33].
The cycling of the O-GlcNAc modification is modulated by two enzymes: uridine diphospho-N-acetylglucosamine:polypeptide β-N-acetylglucosaminyltransferase (OGT) [34, 35] is responsible for addition of the monosaccharide, and a β-N-acetylglucosaminidase, known as O-GlcNAcase (OGA) [36, 37], catalyses removal of the moiety (Fig. 1). The presence of just one enzyme for the addition and one enzyme for the removal of O-GlcNAc in an organism differs greatly from the multitude of kinases and phosphatases that are responsible for phosphorylation [38, 39]. The question of how these two enzymes can recognise and modify such a large number of completely unrelated proteins, with no recognisable consensus sequence, has started to be addressed over the last decade or so. The deduction of the structure of homologues of each enzyme and a number of sophisticated recognition and mechanistic studies have started to elucidate some of these details, but there is still some way to go before the regulation and mode of action of OGT and OGA are fully understood.
OGA, or as it was first described hexosaminidase C [36, 40-43], was isolated and characterized at least a decade before the O-GlcNAc modification on proteins itself was elucidated . It was later purified from bovine brain, sequenced and subsequently the gene was cloned, protein over-expressed recombinantly and characterized [37, 44]; the cloned sequence was identical to the MGEA5 (meningioma expressed antigen 5) gene which had been identified in human meningiomas and preliminarily characterized as a hyaluronidase . The properties of the recombinant enzyme cloned by the Hart group were in agreement with previous studies of the purified OGA ; the enzyme is a monomer , and has an optimal activity at neutral pH, as expected for an enzyme located in the nucleocytoplasm. In addition, OGA showed an apparent absolute selectivity for GlcNAc-containing substrates (including O-GlcNAc modified synthetic peptides), and did not hydrolyse GalNAc-configured substrates . OGA activity was found to be present in both the nuclear and cytoplasmic fractions of cells, although when the gene encoding OGA was transfected into cells for over-expression, the recombinant tagged enzyme was observed to localize primarily to the cytoplasm. OGA is found in all tissue types, with highest expression levels in the brain, placenta, and pancreas .
The kinetic and physical properties of OGA contrast with the human lysosomal hexosaminidases which perform optimally at acidic pH, consistent with their localization to the lysosome, and are able to hydrolyse substrates containing either GlcNAc or GalNAc . There are two genes encoding the lysosomal hexosaminidases in humans, HEXA encodes the α subunit and HEXB encodes the β subunit. The protein products are dimeric isozymes in vivo: hexosaminidase A (HexA) is a heterodimer of an α and β subunit, hexosaminidase B (HexB) is a homodimer of two β subunits, and hexosaminidase S, which is unstable, is a homodimer of two α subunits . Mutations in these hexosaminidases are responsible for the GM2 gangliosidosis observed in Tay-Sachs and Sandhoff diseases . A fourth hexosaminidase, named hexosaminidase D (HexD), has also been purified from brain tissues [42, 48, 49], but is less well characterised and the function in vivo currently unknown. The gene encoding HexD (HEXDC) has recently been cloned from mouse and human and the protein expressed recombinantly . These studies demonstrate that this enzyme, like OGA, operates under neutral conditions and is localized to the nucleus and cytoplasm. HexD differs from OGA in that it prefers substrates containing GalNAc, although it appears to also process GlcNAc-configured substrates. HexD, however, shows no activity toward O-GlcNAc modified synthetic peptides and so is likely to fulfil a cellular role distinct from that of OGA .
The development of a coherent and reliable classification system for carbohydrate processing enzymes, by primary sequence similarity, into different ‘families’ has been a great boon to those studying these enzymes. The families of enzymes are listed in the Carbohydrate Active enZyme (CAZy) database  (available at http://afmb.cnrs-mrs.fr/CAZY/). A feature of virtually all CAZy families is that, because the primary sequence dictates structure, and structure ultimately determines function, the catalytic mechanism is conserved within a family . It is interesting to note that HexA, HexB, and HexD all belong to glycoside hydrolase family 20 (GH20), whereas HexC (OGA) is a member of family 84 (GH84). Family GH20 has been well characterized, both structurally and mechanistically for a number of bacterial enzymes [53, 54] and the human hexosaminidases HexA and HexB [55, 56]. The GH20 enzymes, which catalyse hydrolysis with retention of anomeric configuration, were shown to use a substrate participation mechanism using the acetamido group for nucleophilic attack at the anomeric carbon to create an oxazoline intermediate (Fig. 2). The intermediate is hydrolysed by a water molecule, activated by an enzymic general base, which attacks the anomeric carbon. This mechanism differs from the ‘classical’ glycoside hydrolase mechanism first postulated by Koshland, where the nucleophile is an enzymic residue , most commonly a carboxylate group .
Full length OGA is a large 103 kDa (916 amino acid) protein , which has been reported to stably interact with several proteins . A 75 kDa splice variant also exists that arises from failure to splice out intron 10; this intron contains a stop codon so the protein encoded by the resulting transcript lacks the C-terminus of the enzyme, but does contain an additional 15 residues encoded by the intron . This shorter isoform localizes predominantly to the nucleus , but when cloned recombinantly initially appeared to be inactive . Interestingly, although there was originally debate, primarily based on bioinformatics studies and sequence alignments, about whether the active site (for OGA) was likely to reside at the N-terminal region [60, 61] or the C-terminus , OGA activity was shown conclusively to occur in a recombinant protein comprising residues 1-350, whereas a protein consisting of residues 351-916 was inactive. The rate of catalysis for this N-terminal recombinant enzyme, however, was around 1000-fold lower than for the full length protein, which is consistent with a functionally important role for the C-terminus . A different study of the mouse OGA also confirmed the activity is encoded by a polypeptide comprising the N-terminal region, between residues 63-283 . More recently, the short nuclear splice variant has been shown to have catalytic activity, albeit at a much lower rate than the full length enzyme [65, 66].
Sequence alignments indicated the C-terminal region has amino acid sequence similarity to known acetyltransferases ; OGA, when over-expressed in mammalian cells, was later reported to possess acetyltransferase activity towards histone synthetic substrates and free core histones . More recently, however, efforts to independently verify this activity have been unsuccessful . OGA also contains a caspase-3 cleavage site , an executioner caspase in apoptosis , which might explain the early observation of a 65 kDa fragment in cells . This proteolytic processing by caspase-3 was shown to occur both in vitro and in vivo under conditions of Fas-induced apoptosis . Surprisingly, after caspase cleavage, the OGA activity is retained, at least in vitro . It is interesting to note, however, that if the N-terminal region (before the caspase cleavage site) is over-expressed recombinantly in cells alone there appears to be no OGA activity, but if co-over-expressed with the C-terminal region there is abundant activity present; this observation suggests an interaction between the two domains may be required for OGA to be optimally active . The precise roles played by this C-terminal domain, however, remain to be uncovered.
Consistent with the N-terminal domain of OGA harbouring the catalytic machinery required for O-GlcNAc processing, this domain bears similarity to around 70 open reading frames, predominantly found within bacteria, that place this enzyme into family GH84 of the CAZy system. The sequence identity among the N-terminal domains of OGA found in eukaryotes, and in particular mammals, is extremely high (for example 97% between human and mouse; 53% between human and Drosophila melanogaster), but lower between eukaryotes and prokaryotes (the highest sequence identity is between human and Streptococcus pyogenes at 36%). Notably, GH84 enzymes bear no obvious sequence similarity to family GH20 enzymes, which include the lysosomal β-hexosaminidases.
A significant advance in the analysis of OGA and other GH84 members came from a study which deduced the mechanism of the enzyme and paved the way for the design of potent and selective inhibitors . OGA hydrolyses β-N-acetylglucosamine substrates with retention of anomeric configuration , a proposal verified in a study of the Streptococcus pyogenes GH84 homologue . Measurement of the dependence of catalytic rates on pH produced a bell-shaped curve, reflecting the ionization of important catalytic residues, with a pH optimum of 6.5 and kinetic pKa values for the two enzymic residues of approximately 5.0 and 7.5 [63, 64, 72]. To determine if OGA used a ‘classical’ hydrolytic mechanism, as found for some enzymes processing substrates with a 2-acetamido group, such as the GH22 lysozymes  and the GH3 β-N-acetylhexosaminidases [74, 75], or the less common substrate participation mechanism which involves the acetamido group acting as a nucleophile (Fig. 2), Macauley et al. conducted Taft-like linear free energy relationships (for a brief overview on this approach see ref. ). This involved using a series of substrates bearing varying numbers of fluorine atoms substituted at the methyl group of the N-acetyl substituent in order to determine the effect on the rate of catalysis . The electronegative fluorine atoms lower the basicity of the carbonyl group and, if OGA used a substrate participation mechanism as described for the GH20 enzymes, a decrease in rate of the enzyme catalyzed reaction would be expected. Such a trend was observed for both the human β-hexosaminidase and OGA; the slopes of these trends, however, were not equivalent, suggesting differences between these human enzymes either in the steric constraints in the active site or the position of the transition state along the reaction coordinate .
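The bell-shaped pH dependence described above can be illustrated with the standard two-ionization model, in which only the enzyme form with one catalytic group deprotonated and the other protonated is active. The sketch below is illustrative and does not reproduce the published fits: the function names and the normalised k_max are assumptions introduced here, and the default pKa values are the approximate ones quoted in the text.

```python
def bell_rate(ph, k_max=1.0, pka1=5.0, pka2=7.5):
    """Relative rate at a given pH for a two-ionization (bell-shaped) model.

    pka1 governs the acidic limb and pka2 the basic limb.
    k_max is a hypothetical pH-independent limiting rate.
    """
    return k_max / (1.0 + 10.0 ** (pka1 - ph) + 10.0 ** (ph - pka2))


def ph_optimum(pka1=5.0, pka2=7.5):
    """pH at which this model reaches its maximum (midpoint of the two pKa values)."""
    return (pka1 + pka2) / 2.0


if __name__ == "__main__":
    # Print a coarse profile from pH 4 to pH 9 in steps of 0.5.
    for tenth in range(40, 95, 5):
        ph = tenth / 10.0
        print(f"pH {ph:4.1f}  relative rate {bell_rate(ph):.3f}")
    print("model optimum:", ph_optimum())
```

With the approximate pKa values of 5.0 and 7.5, this simple model places the maximum at the midpoint, pH 6.25, which is close to the reported optimum of 6.5; the small difference is unsurprising given that the pKa values are quoted only approximately.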
OGA was demonstrated to be able to hydrolyse both O- and S-linked synthetic substrates, using the same substrate participation mechanism, with similar efficiencies  (unlike most other glycosidases where this comparison has been made). Interestingly, Taft-like linear free energy analysis on the S-linked substrates showed a steeper slope than with the O-linked substrates suggesting a greater involvement of the 2-acetamido group. This is in agreement with secondary deuterium kinetic isotope effect (KIE) measurements (for a review on kinetic isotope effects see ref. ) on the second order rate constant (kcat/KM) where the kH/kD ratios for substrates containing either an O- or an S-glycoside are both large, suggesting dissociative transition states for both substrates. The KIE is slightly larger for the S-glycoside, suggesting the transition state may be later, with accordingly greater involvement of the 2-acetamido group . In addition, Brønsted linear free energy analyses (for a brief overview on this approach see ref. ) were conducted to provide insight into the development of relative charge at either the glycosidic oxygen or sulphur atom at these transition states. This study was carried out by measuring the values of the second order rate constant for OGA processing series of both O- and S-linked substrates bearing phenol or thiophenol leaving groups, respectively, with a range of pKa values. Interestingly, the slopes of the plotted lines (equal to βLG) differed between the two series. The slope was shallower (corresponding to a smaller βLG value) for the O-linked series suggesting there was little charge accumulation at the transition state, which is consistent with the kinetic isotope effect data that also suggest an advanced transition state where proton donation and glycosidic bond cleavage are significantly advanced. 
In contrast, the slope for the S-linked substrates was steep (with a large βLG close to −1) which is consistent with extensive relative negative charge accumulation occurring on the leaving group thiol at the transition state, and hence either no, or extremely little, general acid catalysis. The lack of general acid catalysis for OGA catalyzed cleavage of S-linked substrates was also suggested in the plot of the second order rate constant against pH, which showed the presence of only one ionizable group as opposed to two for O-linked substrates .
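The Brønsted analysis discussed above amounts to fitting a straight line to log10(kcat/KM) plotted against the pKa of the leaving group, with the slope taken as βLG. A minimal sketch of that fit follows. The data points are synthetic and chosen only to contrast a steep and a shallow series; they are not measured values from the studies cited, and the function name is an assumption introduced here.

```python
def bronsted_slope(pkas, log_rates):
    """Least-squares fit of log10(kcat/KM) = beta_lg * pKa + c.

    Returns (beta_lg, c). Inputs are equal-length sequences of
    leaving-group pKa values and log10 second-order rate constants.
    """
    n = len(pkas)
    if n < 2 or n != len(log_rates):
        raise ValueError("need at least two paired data points")
    mean_x = sum(pkas) / n
    mean_y = sum(log_rates) / n
    sxx = sum((x - mean_x) ** 2 for x in pkas)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pkas, log_rates))
    if sxx == 0:
        raise ValueError("pKa values must not all be identical")
    beta_lg = sxy / sxx
    return beta_lg, mean_y - beta_lg * mean_x


# Synthetic, illustrative series (not experimental data).
pkas = [4.0, 5.0, 6.0, 7.0]
steep_series = [3.0 - 1.0 * p for p in pkas]    # strong leaving-group dependence
shallow_series = [1.0 - 0.1 * p for p in pkas]  # weak leaving-group dependence

if __name__ == "__main__":
    print("steep series beta_LG:  ", bronsted_slope(pkas, steep_series)[0])
    print("shallow series beta_LG:", bronsted_slope(pkas, shallow_series)[0])
```

A slope near −1, as in the steep synthetic series, corresponds to the behaviour reported for the S-linked substrates, whereas a slope of small magnitude corresponds to the shallow dependence reported for the O-linked series.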
The deduction of the catalytic mechanism used by OGA was aided by data from other families that use the same, but less common, substrate-assisted mechanism for hydrolysis. In addition to GH84, substrate-assisted hydrolysis has been shown to occur in the GH18 family of chitinases [77-80], GH20 family of chitobiases and β-hexosaminidases [53, 54, 56, 79, 81], GH56 family of hyaluronidases , very recently in the GH85 family of endo-β-N-acetylglucosaminidases [83-85] (Fig. 3d) and is also likely in GH25 enzymes . Obviously a requirement for the enzymes using this mechanism is the presence of an acetamido group in the substrate which acts as the nucleophile to attack the anomeric position.
Two key enzymic groups (virtually always the carboxyl group of either an aspartate or glutamate residue) have been shown to be important for catalysis during the substrate-assisted mechanism. The mechanistic studies discussed above provide good support for the proposed mechanism with only some subtle variation between GH families. During the first step of the mechanism, one residue which is in its carboxylate form in the resting enzyme orients and polarizes the 2-acetamido group to increase its nucleophilicity, and hence facilitates attack at the anomeric centre to form an oxazoline intermediate. The other residue, a carboxylic acid in the resting enzyme, acts to aid fission of the glycosidic bond and leaving group departure by general acid catalysis (Fig. 2). During the second step, this second carboxyl residue now acts as a general base to facilitate the attack of a water molecule at the anomeric centre of the oxazoline intermediate, while the polarizing residue aids expulsion of the 2-acetamido group from the anomeric position. The net result of these processes is cleavage of the glycosidic bond to liberate the leaving group (the hydroxyl group of the protein in the case of OGA) and the hemiacetal of O-GlcNAc with retained stereochemistry. There is debate as to the absolute role the polarizing residue plays in catalysis and this remains unclear at present; some postulate for some families of glycoside hydrolases using substrate-assisted catalysis that the carboxylate residue electrostatically stabilizes the oxazolinium intermediate [78, 87], whereas others suggest it is more likely to function as a general acid/base to aid formation of the oxazoline intermediate [63, 70].
Structural and kinetic methods have been valuable for defining the important residues required for catalysis in these families using substrate participation. In the family GH18 [80, 88] and GH56 [82, 89] enzymes, there is a highly conserved DXE (where X can be any amino acid) motif; in the cases examined, the glutamate served as the acid/base residue and the aspartate acted as the polarizing residue of the 2-acetamido group. The GH85 enzymes appear to have a similar motif to the GH18 and GH56 enzymes, but instead of possessing an aspartate residue, have a highly conserved asparagine residue in the equivalent position (and thus have the motif NXE) [60, 84, 85, 90]. The GH85 crystal structure suggests, however, that a hydrogen bond is formed with the nitrogen atom of the amide moiety of the asparagine residue, and the authors of one report propose that the residue may exist in its imidic acid form, which by acting as a general base is able to aid formation of the oxazoline intermediate . Further studies, however, will be needed to support or discard this proposal. The family GH20 and GH84 enzymes differ from GH18, GH56, and GH85 in that the key catalytic residues are adjacent to each other, in a DE motif; the glutamate acts as the acid/base residue and the aspartate as the polarizing residue of the 2-acetamido group [91-93]. It is interesting to note that families possessing a motif with an amino acid between the two catalytic residues hydrolyse endoglycosidic bonds, whereas GH20 and GH84, which have adjacent catalytic residues, hydrolyse exoglycosidic bonds; this difference is also reflected in the size of the active site pocket (Fig. 3d).
Given the conservation of the substrate-assisted mechanism with these other families, the key catalytic residues in OGA were likely to be found in an analogous motif. One bioinformatics study showed that a conserved DD motif in GH84 enzymes could be aligned with the catalytic DE motif found in GH20 enzymes, and postulated that these two aspartate residues would constitute the residues involved in catalysis . There is, however, also a highly conserved DXE motif as found in the GH18 and GH56 enzymes which had also been suggested may include the catalytic residues . The residues in the DD motif, which equate to Asp174 and Asp175 in the human OGA, were mutated to alanine and displayed significant decreases in activity . Perturbations were also observed in the pH dependence of the mutant proteins. The pH dependence for the wild type enzyme is bell-shaped, a shape governed by the two catalytic residues where in the resting enzyme the polarizing residue is likely to be deprotonated and the acid/base residue protonated. Mechanistic studies indicate the acidic limb of the curve is most reasonably assigned to ionization of the polarizing residue, and the basic limb assigned to ionization of the acid/base residue . The pH dependence of the D174A mutant showed only the basic limb, which is consistent with titration of the acid/base residue, and thus implicates Asp174 as the polarizing residue of the acetamido group in catalysis. The pH dependence of D175A was more ambiguous, but rates of hydrolysis were lower at higher pH, which is consistent with Asp175 acting as the acid/base residue. This proposal is in agreement with Brønsted analyses of the D175A mutant, which gave a βLG of −1, indicating the absence of acid catalysis . In addition, the activity of the D175A mutant toward substrates with a good leaving group (i.e. where general acid catalysis is not necessary) was increased upon addition of azide, an exogenous nucleophile, and also yielded a β-azide product. 
Previous studies have shown that in the absence of an acid/base residue, a small molecule such as azide can attack the intermediate with greater effect than a water molecule since the water molecule cannot be activated by the general base [94-96]. The rates of hydrolysis of O- and S-linked glycosides by the D175A mutant were also similar, which is consistent with acid catalysis providing no benefit for cleavage of S-linked substrates with leaving groups that have a high pKa value, but being of greater importance for corresponding O-linked substrates. The role for the polarizing residue, Asp174, acting as a general acid/base rather than as a stabilizer of the oxazolinium ion is consistent with the large change in pKa of the protonated nitrogen in the acetamido group upon changing from an amide to an oxazoline; this dramatic change in pKa, which crosses the pKa value of the polarizing residue, suggests complete proton transfer from the amide and thus formation of an oxazoline intermediate. The role of the polarizing residue acting as a general acid/base, but not as a stabilizer of an oxazolinium ion, would also agree with the GH85 enzymes possessing an asparagine residue in this position; in an imidic acid form this asparagine residue would be able to accept a hydrogen bond from the nitrogen atom in the acetamido group, but as it is neutral in its resting state it is not in a position to stabilize the charge present on an oxazolinium ion.
Ideally, a three-dimensional structure of the human OGA would serve to support or challenge a number of conclusions from the mechanistic studies, offer insight into the potency of selective inhibitors of OGA, inform the design of future inhibitors against the human enzyme, and facilitate studies directed at understanding the involvement of OGA in protein-protein interactions. Despite significant and continuing efforts, OGA has not proved amenable to analysis by X-ray diffraction to date. A number of prokaryotic homologues in family GH84, however, show significant homology to the N-terminal region of OGA, which has the catalytic activity. The structures of two GH84 bacterial homologues of OGA, one from Bacteroides thetaiotaomicron (BtGH84)  and the other from Clostridium perfringens (CpGH84) [98, 99] were published in 2006. The functional role of these bacterial enzymes in vivo remains unknown since prokaryotes do not have the O-GlcNAc post-translational modification, but both enzymes were shown to be capable of cleaving O-GlcNAc from proteins in cell lysates [97, 98].
BtGH84 and CpGH84 display an N-terminal α/β fold domain, followed by a canonical (β/α)8 fold domain, and ending in three (in BtGH84) or one (in CpGH84) further domain(s) at the C-terminus, which are specific to the prokaryotic enzymes (Fig. 3a) [97, 98]. The structures reveal a remarkable structural homology to the GH20 enzymes [53, 54, 56], despite a lack of sequence identity. The (β/α)8 fold domain houses the catalytic machinery in a shallow depression, and the key DD motif, which comprises the residues required for catalysis, is situated at the end of the β4 strand; strikingly, the positioning of these key residues is identical in the GH20 enzymes. The two carboxylate residues in the active site were judged to be appropriately positioned (using complexes with inhibitors, see below for more details) to act as the key catalytic residues, consistent with previous proposals made for the human enzyme [60, 63, 70, 72]; the first of these two aspartates acts as the polarizing residue of the 2-acetamido group and the latter as the acid/base residue. Site-directed mutagenesis of the bacterial enzymes also supports assignment of these catalytic residues as discussed above [97, 98]. The active site contains two conserved tryptophan residues, which are predicted to carefully position the acetamido group, and line a sizable pocket which is capped by a cysteine (in BtGH84 and OGA, but not CpGH84) and a methionine residue [97, 98] (Fig. 3b and 3c). It is postulated that a conserved tyrosine residue is in a suitable position to stabilize the transition state during catalysis, through hydrogen bonding with the glycosidic oxygen (where partial negative charge would develop at the transition state).
The structures also provided evidence, in addition to biochemical experiments , that these GH84 enzymes are not likely to act as hyaluronidases (as had been initially proposed ), as there is not sufficient room in the active site pocket to bind a substrate any larger than a monosaccharide (Fig. 3d). It is important to note that despite the relatively modest sequence similarity at the amino acid level in comparison to OGA, the important active site residues of these bacterial structures differ by only one (for BtGH84) or two (for CpGH84) residues; these structures therefore likely provide a highly accurate picture of the OGA catalytic machinery.
Given the number of biological roles proposed for O-GlcNAc and the relatively early stage in our understanding of the functional roles of this common post-translational modification, inhibition of OGA has been a topic of some interest since these inhibitors can be valuable research tools. As is the case with virtually all inhibitors, high potency is a necessity that facilitates avoiding off-target effects. More of a challenge with OGA, however, has been obtaining inhibitors selective for this enzyme over the functionally related human β-hexosaminidase, which as discussed above uses the same catalytic mechanism. The design of such selective inhibitors has, however, progressed over the past few years.
The vast majority of early studies in the field, carried out before design of selective inhibitors, involved increasing cellular O-GlcNAc levels using streptozotocin (N-methyl-N-nitrosoureido d-glucosamine; STZ) and O-(2-acetamido-2-deoxy-d-glucopyranosylidene)-amino-N-phenylcarbamate (PUGNAc) (Fig. 4). STZ was shown to be a modest inhibitor of OGA in vitro [70, 102, 103] and has long served as a way to induce experimental type I diabetes through its ability to induce apoptosis of pancreatic β-cells [105, 106]. These observations, coupled with the abundance of O-GlcNAc in β-cells, prompted the reasonable hypothesis that STZ induces apoptosis of pancreatic β-cells due to increased O-GlcNAc modification of proteins following inhibition of OGA [102, 103, 107, 108]. The mode of inhibition of OGA has been elusive, with suggestions that STZ is a suicide substrate via S-nitrosylation of a cysteine residue in the active site of the enzyme, or that OGA can assist in the conversion of STZ into a more potent tight binding transition state analogue. A greater body of data, however, supports the ability of STZ to cause apoptosis through other means, such as free radical production (for example nitric oxide from breakdown of the nitroso moiety) leading to DNA damage [110, 111] and methylation of DNA [112, 113]. The attribution of the apoptosis-inducing effect of STZ to inhibition of OGA was drawn into question when two independent groups compared it with PUGNAc; both compounds caused increases in cellular O-GlcNAc levels, but only STZ caused cell death [114, 115]. Furthermore, STZ is a low millimolar inhibitor of OGA, and inhibition has been shown to be reversible; hence it is not a suicide substrate as once proposed [70, 116].
The structures of CpGH84 and BtGH84 were later solved in complex with STZ, and showed the absence of any covalent attachment with an active site cysteine residue, and also a lack of conversion into a different compound as once proposed [116, 117]. Further, a recent compelling study showed the galacto-epimer of STZ, which did not inhibit OGA or increase cellular O-GlcNAc levels, induced apoptosis of pancreatic β-cells suggesting the ability of STZ to induce apoptosis is independent of OGA and attributable to the reactive nitrosourea group . The use of STZ in studies of the functional role of O-GlcNAc therefore presents obvious and serious problems and should be discouraged.
PUGNAc is a potent inhibitor of OGA, with a Ki value of around 50 nM [46, 70, 118], but is also a potent inhibitor of the GH20 human β-hexosaminidases. PUGNAc is able to increase cellular O-GlcNAc levels by inhibiting OGA, and as such has been used to implicate aberrant O-GlcNAc cycling as a cause of insulin resistance [23, 119]. Interpretation of the results with this inhibitor is, however, somewhat hampered by its poor selectivity for OGA. The structures of both BtGH84 and CpGH84 have been solved in complex with PUGNAc [24, 98]; in each case the N-phenylcarbamate moiety forms hydrogen bond interactions with the active site residues, including one between the exocyclic oxime nitrogen atom and the catalytic acid/base residue, and the phenyl moiety, which protrudes from the active site pocket, takes advantage of hydrophobic interactions with aromatic side chains. It is interesting to note that 2-acetamido-2-deoxy-d-gluconhydroximo-1,5-lactone (LOGNAc), which is similar to PUGNAc but lacks the phenyl carbamate group, is around 30 times less potent, demonstrating that this moiety makes an important contribution to the binding. The use of PUGNAc in studies has very recently been questioned, however, as it has been found to have a number of off-target effects; not only can it inhibit β-hexosaminidases, which leads to an increase in ganglioside levels that could impact insulin resistance, but it has also been shown to inhibit enzymes from other enzyme families and impact cell growth. The results of a study by Macauley et al. show that a highly selective inhibitor of OGA (see below) did not induce insulin resistance in 3T3-L1 adipocytes as PUGNAc did. In addition, PUGNAc appears unable to cross the blood-brain barrier.
Despite the prolonged use of STZ and PUGNAc as inhibitors of OGA both in vitro and in cells to induce a diabetic or insulin resistant state, it is apparent that both compounds have off-target effects and interpreting the results obtained using either compound should be done with caution. The deduction of the substrate-assisted mechanism led to the use of 1,2-dideoxy-2′-methyl-α-d-glucopyranoso-[2,1-d]-Δ2′-thiazoline (NAG-thiazoline)  as an inhibitor of OGA . This is a mimic of the oxazoline intermediate (or a closely related transition state), with a sulphur atom in place of the oxygen in the oxazoline ring, and had been shown to be a good inhibitor of GH20 enzymes which use the same mechanism [54, 124]. NAG-thiazoline inhibits OGA with a Ki value of around 70 nM, which is similar to the value measured against the human β-hexosaminidase and so, as seen with PUGNAc, this inhibitor shows little selectivity . The structure of BtGH84 in complex with NAG-thiazoline, where the pyranose ring adopts a slightly distorted chair (4C1) conformation, shows a hydrogen bond interaction between Asp242 (predicted to be the polarizing residue) and the nitrogen of the thiazoline ring . The issue of finding a selective inhibitor for OGA over the β-hexosaminidase was addressed in the design of a series of inhibitors where an aliphatic chain of increasing length was added to the thiazoline ring of NAG-thiazoline . At the time this showed considerable foresight as the bacterial homologue structures had not yet been elucidated. The idea was based on the knowledge that the Taft-like analysis suggested the bulk of the N-acyl group may have been important for the substrate recognition in OGA, that GH20 enzymes possessed a small pocket which fitted the acetamido group snugly, and that OGA could process a chromogenic substrate bearing an azidoacetyl group . 
Indeed, as the aliphatic chain on the thiazoline ring was lengthened, the inhibitors showed a dramatic increase in selectivity for OGA over the β-hexosaminidase, with only a small penalty in potency, and were also able to increase cellular O-GlcNAc levels with no effect on cell growth or morphology. The structural solution of two bacterial homologues of OGA later corroborated these biochemical findings, and revealed that indeed they possessed a deeper pocket into which the 2-N-acyl group protrudes [97, 98] (Fig. 3c). The observation of the thiazoline ring bearing an N-butyl chain (1,2-dideoxy-2′-propyl-α-d-glucopyranoso-[2,1-d]-Δ2′-thiazoline; NButGT) in the active site of BtGH84 showed it bound in an identical position to NAG-thiazoline, with the acyl chain nestled in the active site pocket; this compound has a Ki value against OGA of 230 nM and displays a 1500-fold selectivity over β-hexosaminidase. Efforts were made to exploit PUGNAc in a similar way with elongation of the aliphatic chain from the N-acyl moiety; there were some modest gains in selectivity for OGA over the lysosomal β-hexosaminidase, but these were accompanied by a sacrifice in potency, and these compounds were significantly less potent than the thiazoline series [127, 128]. Others have also functionalized thiazoline inhibitors; a number of these also show apparent high potency coupled with good selectivity for OGA. Arguably the most successful attempt to generate more potent and selective inhibitors for OGA came with the advent of 1,2-dideoxy-2′-ethylamino-α-d-glucopyranoso-[2,1-d]-Δ2′-thiazoline (Thiamet-G).
It was speculated that a compound in which the thiazoline ring had a higher pKa would be protonated at physiological pH (NAG-thiazoline itself has a pKa of approximately 5) and make more favourable electrostatic interactions in the active site with the carboxylate of the polarizing residue; thus a compound similar to NButGT with the substituent next to the thiazoline ring linked via a nitrogen atom was designed. It was proposed this would increase the basicity of the endocyclic nitrogen atom, as well as retain the steric considerations required for selective binding. A highly potent inhibitor resulted, with a Ki value of 21 nM and a dramatic 37,000-fold selectivity over the lysosomal β-hexosaminidase . The structure of Thiamet-G in complex with BtGH84 shows that Asp242 (the key residue important for polarizing the acetamido group during catalysis) engages in favourable electrostatic interactions with both the endocyclic and exocyclic nitrogen atoms in the inhibitor (Fig. 3b); a consequence of these interactions is the subtle rearrangement of some of the active site residues which contracts the active site and draws the enzyme and inhibitor closer together . Thiamet-G is capable of increasing O-GlcNAc levels in cells and in vivo, including in the brain as it is able to cross the blood-brain barrier, and in addition shows greater chemical stability than the other thiazoline analogues. As such it is able to increase the level of O-GlcNAcylation on the protein tau, and thus cause a decrease in phosphorylation; an effect that may enable a better understanding of the hyperphosphorylation of this protein in various neurodegenerative diseases .
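The pKa argument above can be made concrete with the Henderson-Hasselbalch relation. The following is a minimal sketch, not taken from the original studies: it assumes a single ionisable site, a physiological pH of 7.4, and a purely hypothetical pKa of 8 for a more basic aminothiazoline (the text gives only the pKa of approximately 5 for NAG-thiazoline). The function name is likewise illustrative.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of a monoprotic base present as its conjugate acid at a given pH.

    Follows from Henderson-Hasselbalch: [BH+]/[B] = 10 ** (pKa - pH).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


PHYSIOLOGICAL_PH = 7.4  # assumed value, not stated in the text

# pKa of approximately 5 is the figure quoted in the text for NAG-thiazoline.
nag_thiazoline = fraction_protonated(5.0, PHYSIOLOGICAL_PH)

# Hypothetical pKa for a more basic aminothiazoline; for illustration only.
hypothetical_aminothiazoline = fraction_protonated(8.0, PHYSIOLOGICAL_PH)

print(f"NAG-thiazoline (pKa 5): {nag_thiazoline:.2%} protonated")
print(f"Hypothetical aminothiazoline (pKa 8): {hypothetical_aminothiazoline:.2%} protonated")
```

Under these assumptions well under 1% of a pKa 5 species would be protonated at pH 7.4, whereas a species with a pKa a few units higher would be predominantly protonated, which is the rationale for raising the basicity of the endocyclic nitrogen.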
A different approach to the development of highly potent OGA inhibitors has been based on the naturally occurring β-hexosaminidase inhibitor nagstatin. Following its discovery, nagstatin was synthesized and the scaffold manipulated to make potent β-glucosidase inhibitors [133, 134]. These tetrahydroimidazopyridine compounds were substituted with a number of different functional groups in order to promote favourable interactions in the active site of enzymes of interest [135-138], and these have proven to be among the most potent β-glycosidase inhibitors studied. The potency of the tetrahydroimidazopyridine compounds is attributed to the protonated nitrogen atom in the imidazole forming a strong hydrogen bond interaction with the general acid/base residue, and the pseudo-glycoside ring adopting a half chair/envelope conformation which mimics the proposed oxocarbenium transition state. The gluco-epimer of nagstatin inhibited OGA with a Ki value of 420 nM, but was more potent against β-hexosaminidase. Subsequent development of a PUGNAc-imidazole hybrid produced a low micromolar inhibitor of human OGA. The PUGNAc-imidazole hybrid was observed by crystallography in complex with BtGH84 and, when overlapped with the CpGH84 complex with PUGNAc, showed that the phenyl ring was forced into a position where it was more solvent exposed and made fewer interactions with the enzyme, and that the amide carbonyl was in the opposite orientation, which prevented it from forming a hydrogen bond with the general acid/base residue. Along similar lines, the van Aalten group designed GlcNAc-configured nagstatin derivatives (GlcNAcstatins) such as GlcNAcstatin C, which bears the tetrahydroimidazopyridine scaffold with a phenethyl functional group attached, and an isobutanamido moiety at C2. This was shown to be a low picomolar inhibitor against CpGH84, reported to be a 4 nM inhibitor of human OGA, and found to display 160-fold selectivity over the lysosomal β-hexosaminidase.
In complex with CpGH84, GlcNAcstatin C was observed in an envelope conformation and the imidazole made a hydrogen bond interaction with the general acid/base residue as predicted. The phenethyl group extended out of the active site and formed hydrophobic interactions with a tryptophan residue . GlcNAcstatin D, which bears an N-propionyl group at C2, has a Ki value of 0.74 nM but only 4-fold selectivity over β-hexosaminidase; it is shown to bind in the same manner as GlcNAcstatin C . Both compounds were shown to be able to increase cellular O-GlcNAc levels [142, 143].
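To compare the potency and selectivity figures quoted above side by side, the short sketch below back-calculates the Ki against β-hexosaminidase that each fold-selectivity would imply. It assumes that selectivity is defined as Ki(β-hexosaminidase) divided by Ki(OGA); the implied values are derived arithmetic only and are not measurements reported in the text.

```python
# (Ki against OGA in nM, fold selectivity over β-hexosaminidase), as quoted in the text.
reported = {
    "NButGT": (230.0, 1500.0),
    "Thiamet-G": (21.0, 37000.0),
    "GlcNAcstatin C": (4.0, 160.0),
    "GlcNAcstatin D": (0.74, 4.0),
}


def implied_hex_ki_nm(ki_oga_nm: float, fold_selectivity: float) -> float:
    """Ki against β-hexosaminidase implied by the assumed definition of selectivity."""
    return ki_oga_nm * fold_selectivity


# Derived values in nM; assumption-dependent, not reported data.
implied = {name: implied_hex_ki_nm(ki, fold) for name, (ki, fold) in reported.items()}

for name, (ki, fold) in reported.items():
    print(f"{name}: Ki(OGA) = {ki} nM, {fold:.0f}-fold, implied Ki(hex) = {implied[name]:.3g} nM")
```

Read this way, the thiazoline-derived compounds gain their selectivity mainly by binding β-hexosaminidase weakly, whereas GlcNAcstatin D is very potent against OGA but would remain a low nanomolar inhibitor of β-hexosaminidase.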
Several other unrelated inhibitors have been reported in the literature. These include alloxan, an analogue of uracil which is also capable of inhibiting OGT (see below) , but has some toxic effects and should therefore be used with caution , α-linked GlcNAc thiosulfonates, which appear to be better inhibitors of the short isoform of OGA than the full length enzyme , 7-membered ring azepanes , and an analogue of 6-epi-valienamine bearing an acetamido group . The observation of two of the azepane compounds and the 6-epi-valienamine analogue in the active site of BtGH84 showed they all bound in an unusual distorted conformation which differs from their predicted conformation in solution [147, 148].
The question of which glycosidase inhibitors are true mimics of the transition state has started to be addressed in recent years (for some recent examples see refs. [139, 149, 150]). Pauling first proposed in the 1940s that the highest affinity inhibitors for an enzyme would be those containing features of the transition state [151, 152]. More recently, Wolfenden demonstrated that glycoside-degrading enzymes can enhance the rate of catalysis 10¹⁷-fold over the uncatalysed reaction, making them a remarkably proficient class of enzyme. If binding of a glycosidase to a transition state analogue could capture all of the binding energy proposed to be available by Wolfenden, this would translate to an estimated dissociation constant for the transition state of approximately 10⁻²² M, an obviously extraordinary affinity [153, 154]. Therefore if inhibitors were designed which truly mimicked the transition state, in theory a significant portion of this binding potential could be harnessed for potent inhibition. In the case of glycosidases, the widely proposed oxocarbenium ion-like transition state possesses double bond character along the C1-O5 bond causing the C1, C2, O5 and C5 atoms of the pyranose ring to lie in a plane. Accompanying this flattening of the pyranose ring would be significant positive charge delocalization along the C1-O5 bond [155, 156]. A study by Whitworth et al. addressed this issue of transition state mimicry for the OGA inhibitors PUGNAc and NAG-thiazoline using linear free energy relationships. Such linear free energy relationships predict that, for a genuine transition state analogue, alterations made in the chemical structure of the substrate should perturb the transition state for the enzyme catalyzed reaction to an equivalent extent as parallel and analogous changes to a transition state analogue affect its binding to the enzyme.
This can be assessed by introducing systematic changes at one position in an inhibitor and measuring the resultant Ki value, and also making the equivalent change in the substrate and determining the kcat/KM value, which reflects on the transition state structure in the first step of catalysis. Assuming these modifications do not change the mode of substrate binding or reactivity, or the rate-determining step during hydrolysis, the plot of the inverse logarithm of Ki against the inverse logarithm of kcat/KM should give a correlation with a slope equal to one, if the inhibitor is a transition state mimic . The correlation for OGA was assessed using series of inhibitors and substrates which varied in the volume of the alkyl group in the N-acyl moiety at the C2 position. The resulting plots of inverse logarithm Ki against inverse logarithm kcat/KM for the NAG-thiazoline series revealed a strong correlation, with a slope of one within error, indicating that it is a true transition state mimic. A different case, however, was observed for the PUGNAc series, which showed a weaker correlation suggesting this inhibitor may be a worse transition state analogue . These observations were surprising given the sp2 anomeric centre in PUGNAc would be predicted to be more ‘transition state-like’ than the sp3 centre in NAG-thiazoline, which may be considered more closely resembling the oxazoline intermediate, at least from a geometric perspective. It has been postulated this apparent anomaly may be attributable to the longer C-S bond length in the thiazoline ring of the inhibitor as compared to that of the C-O bond in the oxazoline ring of the intermediate. This increased length results in some distortion of the pyranose ring and might mimic the partial, but significant, bond order between the nucleophilic carbonyl oxygen atom and anomeric carbon of a very late transition state in which the acetamido group is heavily involved .
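The linear free energy analysis described above amounts to a straight-line fit on a log-log plot. The sketch below shows one way to compute the slope and correlation coefficient; it plots log(1/Ki) against log(kcat/KM), which is one reading of the axes described in the text. All numerical values are hypothetical and chosen only to contrast an ideal series (slope of one) with a more weakly correlated one; they are not data from the study, and the function name is illustrative.

```python
import math


def lfer_fit(ki_values, kcat_over_km_values):
    """Least-squares slope and Pearson r for log10(1/Ki) against log10(kcat/KM)."""
    x = [math.log10(v) for v in kcat_over_km_values]
    y = [math.log10(1.0 / k) for k in ki_values]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Hypothetical kcat/KM values (M^-1 s^-1) for substrates with increasing N-acyl bulk.
kcat_over_km = [1.0e5, 3.0e4, 8.0e3, 1.5e3, 2.0e2]

# Hypothetical ideal analogue series: Ki (M) falls in exact proportion to kcat/KM.
ki_ideal = [7.0e-3 / v for v in kcat_over_km]

# Hypothetical series that tracks the substrates less faithfully.
ki_weaker = [5.0e-8, 6.0e-8, 2.0e-7, 1.5e-7, 9.0e-7]

slope_ideal, r_ideal = lfer_fit(ki_ideal, kcat_over_km)
slope_weak, r_weak = lfer_fit(ki_weaker, kcat_over_km)

print(f"ideal series:  slope = {slope_ideal:.2f}, r = {r_ideal:.3f}")
print(f"weaker series: slope = {slope_weak:.2f}, r = {r_weak:.3f}")
```

In practice the uncertainty on the fitted slope matters as much as its value, since the criterion in the text is a slope of one within error; a full analysis would therefore also report confidence intervals, which this sketch omits.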
Theoretical studies have also been devoted to understanding the binding of PUGNAc and NAG-thiazoline to OGA. In one such study, molecular dynamics methods were used to investigate the conformational similarities between inhibitors and a calculated transition state for the uncatalyzed formation of the oxazoline intermediate from N-acetylglucosaminides . These simulations suggested that NAG-thiazoline was a poor conformational mimic of the transition state, but that GlcNAcstatin C was a better analogue of the modelled solution phase reaction. PUGNAc showed a high degree of conformational flexibility, and appeared to be only planar at C1 rather than C1 and O5 as would be likely at the transition state . Electrostatic potentials have also been calculated for PUGNAc and NAG-thiazoline, and predictions made about the likely interactions with active site residues that drive binding of inhibitors suggest these may influence the design of inhibitors in the future .
OGT was first purified from rat liver, where it was suggested to form a heterotrimer consisting of two 110 kDa subunits and one 78 kDa subunit . Subsequently, the genes encoding a 110 kDa polypeptide (1046 amino acids) in rat, human, and C. elegans were cloned and the proteins over-expressed recombinantly [35, 161]. OGT is found in all tissue types, although there are at least four transcripts or isoforms in rat and human, some of which appear to be tissue-specific [35, 161, 162]; the 110 kDa polypeptide itself, however, is observed in virtually every tissue examined , with particularly high levels in pancreatic tissue . A splice variant encoding a smaller OGT isoform has also been identified which is localized to the mitochondria [163, 164]. OGT was shown to be present, both endogenously and when transfected into cells, in the nucleus and cytoplasm; this observation is consistent with the presence of O-GlcNAc on nucleoplasmic proteins and the localization of OGA, as well as with the fact that OGT possesses a nuclear localization sequence [35, 161]. It was shown, by recombinantly expressing the 110 kDa subunit of OGT (which exists as a homotrimer ) in Escherichia coli, that this protein product alone was capable of modifying targets in vitro [165, 166]. OGT itself can be modified with O-GlcNAc, which may be one way in which cycling is regulated in vivo, and can also be tyrosine phosphorylated . In addition it has been proposed that OGT possesses a phosphoinositide-binding region . The gene encoding OGT has been shown in mammals to be vital for embryonic stem cell viability and embryonic development .
Sequence homology has shown OGT comprises two domains; an N-terminal domain consisting of 13.5 tetratricopeptide repeat (TPR) motifs (residues 1-464) [35, 161, 167, 168] and a C-terminal domain where the glycosyltransferase activity resides [165, 166]. The TPR domains likely modulate binding to a number of different protein substrates or accessory proteins , and have been demonstrated to interact with a variety of proteins including, for example, a histone deacetylation complex by binding to the corepressor mSin3A . The TPR motif is found as a domain in many proteins, and appears to be ubiquitous through many organisms and cellular locations . These repeats play an important role in modulating protein-protein interactions in a range of cellular processes including protein transport, transcription and cell cycle control [167, 168, 170]. The TPR motif itself consists of a 34 amino acid repeat, which contains eight residues throughout the sequence that are highly conserved, and the rest tend to be highly consistent in residue ‘type’ [167, 168]. The TPR motifs are often found in serial arrays having a different number of repeats (usually between 3 and 16), which structurally form a superhelix that is responsible for mediating the interactions with other proteins. The helix has a concave face which presents the precise binding residues defined in the overall structure to the target in a specific fashion . It is likely to be the number of repeats, and hence the size of the TPR groove(s) generated, that determines the number of different protein targets with which the TPR domains can interact . Consistent with the presence of so many TPR domains on OGT, numerous binding partners of OGT have been proposed [169, 171-175]. 
Interestingly, OGT does not appear to recognize a consensus sequence directing modification of target proteins, although a preference for Pro-Val-Ser-Thr and Thr-Thr-Ala has been proposed based on proteomics studies in which several O-GlcNAc sites were identified on a number of different proteins. Systematic amino acid variation of residues surrounding an O-GlcNAc site on an α-crystallin peptide has also suggested OGT has some modest preferences for particular residues at certain positions. The TPR domains have also been found to play a role in the activity of OGT. The deletion of a number of the TPR domains from the N-terminal end of OGT demonstrated that some, but not all, were required for enzymatic transfer of O-GlcNAc onto peptides; those missing the first three or six TPR domains had full activity but this region aided trimerization of OGT, whereas those missing the first nine or eleven TPR domains were catalytically inactive. Lubas and Hanover report a difference in the activity with TPR deletion mutants between peptide and protein substrates, although they appear to have used the mitochondrial OGT which has only nine TPR domains. They found that while deletion of the three most N-terminal TPR domains resulted in an enzyme having unchanged OGT activity on peptide substrates, OGT activity of the same construct measured using nucleoporin p62 (np62) as a substrate was impaired, suggesting these TPR domains are involved in substrate recognition. In a different study, the interaction of OGT (110 kDa polypeptide) with the interaction domain (OID) of OIP106 (OGT-interacting protein, 106 kDa) was investigated. If OGT was missing the most N-terminal 5.5 TPR domains it was unable to interact with OID (and was catalytically inactive), whereas if it was missing only the first 2.5 TPR domains it was still able to bind, yet remained catalytically impaired.
Recombinantly expressed and purified TPR domains of OGT alone were capable of inhibiting OID glycosylation by competing for the same binding site on OGT; the same is not true, however, for a peptide substrate, which reaffirms the role of the TPR domains as a docking or binding domain. In addition, OID and np62 also compete with each other in in vitro glycosylation assays using OGT. More recently, an appealing hypothesis has emerged that OGT may also be positively regulated through its interaction with other protein partners such as PGC1-α, mSin3A, myosin phosphatase targeting subunit 1, co-activator-associated arginine methyltransferase 1, OIP106 and p38 MAP kinase.
OGT is classified into family GT41 of the CAZy system , which it currently shares with around 270 prokaryotic sequences and 70 other eukaryotic sequences. OGT catalyzes glycosyl transfer with inversion of stereochemistry to generate the β-O-GlcNAc linkage to serine and threonine residues of substrate proteins. The amino acid sequence identity of human OGT with OGT proteins from other higher eukaryotes is very high (>95%), but much lower when compared with prokaryotic sequences encoding related proteins (around 35-40% at most). Mechanistic studies of OGT lag behind those of OGA and related homologues, which reflects the nature of the carbohydrate enzyme field in general [179, 180]. Primarily, the relative lack of knowledge about glycosyltransferases stems from the fact that many of these enzymes are larger in size than hydrolases and/or membrane bound, which makes them harder to express recombinantly, characterize, and crystallize for structural studies. In addition, the availability of small molecule tools or probes, such as synthetic substrates and inhibitors, is more limited for the glycosyltransferases.
The Leloir donor substrate of OGT is uridine diphospho-N-acetylglucosamine (UDP-GlcNAc) , which is the end product of the hexosamine biosynthetic pathway. It is estimated that 2-3% of the total incoming glucose to the cell is diverted from catabolism into this pathway to form UDP-GlcNAc . OGT shows an optimum catalytic activity at pH 6, but below pH 6 and above pH 7.5 enzyme activity decreases significantly . Unlike many other glycosyltransferases, OGT does not require the presence of a divalent metal ion for efficient catalysis [34, 160]. Early kinetic studies of OGT revealed unusual Michaelis-Menten kinetics for this enzyme. When using UDP-GlcNAc and a peptide substrate containing an endogenous O-GlcNAc-modifiable site, no saturation was observed and three KM values for UDP-GlcNAc were reported (6, 35, and 217 μM). In addition, it was observed that OGT had a different affinity for peptide substrates at different UDP-GlcNAc concentrations, suggesting that UDP-GlcNAc levels can differentially modulate the affinity OGT has for different peptide acceptor sites . Others, however, have observed that when a protein such as np62 is used as the acceptor, only one KM for UDP-GlcNAc is observed (KM for UDP-GlcNAc of 0.5 μM; KM for np62 of 1.2 μM); these kinetics appear to have reported on the mitochondrial OGT however, and so a direct comparison may not be entirely appropriate . The use of a whole protein substrate rather than a peptide may, however, be more physiologically relevant since the natural substrates of OGT in vivo are intact proteins. Dissection of the kinetic parameters on OGT using a peptide substrate suggested the enzyme used a random bi-bi mechanism . Further studies are needed to provide greater insight into the mechanisms by which OGT recognizes and acts to modify its protein substrates.
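For readers less familiar with bisubstrate kinetics, the sketch below evaluates the textbook rate equation for a random bi-bi mechanism under the rapid-equilibrium assumption. That assumption, the interaction factor alpha, and the function name are not taken from the studies cited above. The constants are placeholders in μM, borrowed numerically from the KM values quoted in the text for UDP-GlcNAc and np62; they are not fitted dissociation constants for OGT.

```python
def random_bibi_rate(a: float, b: float, vmax: float,
                     ka: float, kb: float, alpha: float = 1.0) -> float:
    """Initial rate for a rapid-equilibrium random bi-bi mechanism.

    a, b   : concentrations of donor (A) and acceptor (B)
    ka, kb : dissociation constants of A and B from the free enzyme
    alpha  : factor by which binding of one substrate changes the affinity
             for the other (alpha = 1 means the two sites are independent)
    """
    denominator = alpha * ka * kb + alpha * kb * a + alpha * ka * b + a * b
    return vmax * a * b / denominator


# Placeholder constants (μM), used for illustration only.
KA_UDP_GLCNAC = 0.5
KB_ACCEPTOR = 1.2

# Both substrates at their dissociation constants, independent binding:
# each site is half occupied, so the rate is a quarter of Vmax.
v_quarter = random_bibi_rate(KA_UDP_GLCNAC, KB_ACCEPTOR, 1.0, KA_UDP_GLCNAC, KB_ACCEPTOR)

# Saturating acceptor: the expression collapses to simple Michaelis-Menten in A.
v_half = random_bibi_rate(KA_UDP_GLCNAC, 1.0e9, 1.0, KA_UDP_GLCNAC, KB_ACCEPTOR)

print(f"v/Vmax with both substrates at K: {v_quarter:.3f}")
print(f"v/Vmax with saturating acceptor:  {v_half:.3f}")
```

With alpha different from one, the apparent affinity for the acceptor depends on the donor concentration, which is qualitatively the kind of behaviour described above, where UDP-GlcNAc levels modulate the affinity of OGT for peptide substrates; this single-site model would not, however, reproduce the three KM values reported for UDP-GlcNAc.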
Full length human OGT has, to date, eluded crystallization and structure solution. Efforts to realize the structure of OGT are a topic of intense current interest and a structure would significantly aid attempts at dissecting the mechanistic details and perhaps aid inhibitor design. Jínek and coworkers, however, were able to solve the structure of the TPR domains from OGT , which has offered some insight into the molecular basis for substrate binding and recognition (Fig. 5c). The crystallized protein lacked the catalytic domain and consisted of 11.5 out of the 13.5 TPR domains (residues 16-400) of OGT. This protein construct was shown to contain the region responsible for recognition of np62, as demonstrated by competition assays with the full length OGT. The TPR domains form a homodimer (which differs from the trimeric state proposed for full length OGT ) of right-handed superhelices. Each TPR domain consists of two antiparallel helices, which possess the hydrophobic residues in conserved positions as seen in other proteins containing TPR motifs . The convex face of the superhelices is responsible for dimerization and this process is mediated primarily by hydrophobic interactions; mutation of a tryptophan and isoleucine which appear to be important for this dimerization causes the TPR domains to appear monomeric by size exclusion chromatography (whereas the wild type TPR domains run at a mass consistent with being dimeric) . The majority of the conserved surface exposed residues line the inner concave surface of the superhelix; in particular the central part of the groove is mostly lined with asparagine residues, which form a continuous ‘ladder’ through the superhelix . 
These conserved asparagine residues on the inner surface of the superhelix bear a marked similarity to the ARM-repeat proteins importin-α and β-catenin; in these proteins the asparagine residues are important for recognition of their target peptides and form bidentate hydrogen bonds with the protein main chain of the substrate [183-185] (Fig. 5d). It may be possible that different pockets of the TPR domains of OGT recognise and bind to different protein substrates; if this is the case the asparagine residues may be important in forming the interaction, but it is likely to be the neighbouring residues that confer specificity.
Despite the lack of a full length OGT structure, there are prokaryotic homologues in GT41 which show significant homology to OGT and, in particular, the C-terminal domain. The structure of a GT41 bacterial homologue of OGT from the plant pathogen Xanthomonas campestris (XcGT41), which has 36% sequence identity to the human enzyme, was published last year by two independent groups [186, 187]. The substrates of the bacterial enzymes are as yet unknown, but it was shown that, like OGT, XcGT41 was able to catalyze the transfer of GlcNAc from UDP-GlcNAc to water in vitro. No activity of XcGT41 was detected towards human or bacterial cell lysates or synthetic peptides, but the enzyme may be able to modify a protein in Arabidopsis thaliana cell lysates with O-GlcNAc. The structure reveals that XcGT41 comprises an N-terminal domain consisting of 5.5 TPR domains, and a C-terminal catalytic domain which displays the GT-B topology of glycosyltransferases (Fig. 5a). The GT-B topology comprises two (α/β)-fold domains, between which the UDP-GlcNAc binds; these observations are consistent with previous predictions for this enzyme [188, 189]. The bacterial enzymes, however, lack a 120 amino acid insertion between the two domains at the N- and C-terminus of the GT-B fold, which appears in all mammalian sequences [186, 187]. The structure of XcGT41 in complex with UDP revealed the likely residues which are important for binding of the nucleotide-phosphate donor, and these were highly conserved in the eukaryotic sequences. Systematic mutation of these residues in human OGT supports the structural model and revealed residues important for binding UDP-GlcNAc as well as those that are necessary for catalytic activity [186, 187] (Fig. 5b).
In addition, a histidine residue (His558 in human OGT, which is equivalent to His218 in XcGT41) was identified as the likely general base for catalysis , based on both the fact that when deleted there is no activity, and by structural homology to other glycosyltransferases having the GT-B fold for which this role has been assigned ; Clarke et al. suggest the base may be a tyrosine residue, however, and a histidine residue may be important for stabilization of the leaving group . Glycosyltransferases that catalyze transfer with inversion of stereochemistry are believed to use a dissociative SN2-like mechanism; an enzymic residue acts as a general base to deprotonate the nucleophile of the acceptor, which facilitates displacement of the activated nucleotide-phosphate leaving group. In a number of glycosyltransferases for which the three-dimensional structure has been elucidated, the general base is predicted to be a histidine, which interacts with an adjacent aspartate or glutamate residue . Further studies will be required to clarify the identities of the catalytic residues and their roles in OGT. Notably, the structure of XcGT41 also reveals an unusual and intimate relationship between the TPR domains and the C-terminal glycosyltransferase domain. The last TPR motif is atypical, which allows a large contact area with the enzymatic domain and orients the catalytic domain to enable the superhelical groove of the TPR domains to be continuous with the active site cleft [186, 187], suggesting that substrates binding to the TPR domains may be positioned within the groove to be exposed to the catalytic machinery.
While the design of potent and selective inhibitors for OGA has flourished in recent years, unfortunately the same cannot be said for OGT. However, the advent of the structure of a bacterial homologue of OGT may aid efforts. It has long been established that UDP, UDP-GlcNAc, and UTP are potent (around 200 nM) inhibitors of OGT  (Fig. 6). In addition, alloxan, a uracil analogue, was also demonstrated to be an OGT inhibitor, with complete inhibition at 1 mM and half of the maximum inhibition at 0.1 mM; the mechanism for inhibition is unclear, however, and alloxan may disrupt cysteine residues rather than bind in place of UDP-GlcNAc . Sophisticated high throughput studies have been performed with the smaller splice variant of OGT, which can be over-expressed recombinantly in higher yield than the full length protein, in order to find potent inhibitors. Compounds were screened using a fluorescent UDP-GlcNAc analogue displacement assay, which led to the exciting discovery of three structurally unrelated compounds that inhibit OGT . It is, as yet, unknown if these inhibitors act in vivo, but if so, they could prove to be a useful tool to decrease O-GlcNAc levels and study the physiological role of O-GlcNAc. These molecules are poorly water soluble but generation of derivatives having improved physical properties should be feasible. Others have explored a C1 phosphonate analogue of UDP-GlcNAc as an inhibitor of OGT, but this proved to be a poor inhibitor with an IC50 of more than 5 mM, suggesting the presence of the glycosidic oxygen in UDP-GlcNAc is important . Despite the poor affinity, this compound was successfully soaked into crystals of XcGT41; the solution of the structure in complex with the inhibitor compared to the apo-structure showed that a loop was positioned differently in each suggesting this loop may act as an ‘active site lid’ .
This review has demonstrated that work involving the O-GlcNAc processing enzymes is highly active at present and reflects the research efforts of a number of groups. There are, however, still a number of unanswered questions. Despite the apparent simplicity of having only one enzyme for the addition and one enzyme for the removal of O-GlcNAc, the system of regulation and substrate specificity may be highly complex. These two enzymes are responsible for the O-GlcNAc cycling on hundreds of nucleocytoplasmic proteins, in the absence of a clear consensus sequence. Some regulatory mechanism is presumably in place to prevent futile O-GlcNAc cycling in vivo, but this is not yet understood, nor are the precise roles of the addition of this small sugar molecule to proteins. In addition it would be interesting to address why prokaryotes possess enzymes homologous to OGA and OGT, yet do not appear to have the O-GlcNAc modification of proteins; presumably they have gained (and retained) the genes encoding these proteins from their eukaryotic hosts for some, as yet unexplained, reasons.
The OGA field has flourished in recent years, with the deduction of the catalytic mechanism, the generation of highly potent and selective inhibitors, and the solution of the structures of two bacterial homologues of this enzyme. The question remains, however, of the role played by the C-terminal region of OGA which is absent in the bacterial sequences. Although it has been suggested this region has the ability to acetylate histones, this function has not been related to its OGA activity, and one cannot help but think it might also play another role; particularly since confirmation of this activity has proven elusive. In addition, the structures have not really aided in the knowledge of how the protein substrate is recognised or binds to OGA, which may be useful for understanding how this enzyme is capable of modifying such a wide array of substrates. Ultimately the structure of the human OGA may address both of these questions.
Studies of OGT do lag behind those of OGA, which unfortunately reflects the general situation of glycosyltransferases vs. glycoside hydrolases. The solution of the structure of a bacterial homologue very recently, however, should now drive the field forward in the quest for a more detailed understanding of the catalytic mechanism and could aid design of potent inhibitors. Potent and selective inhibitors, such as those that have now been developed for OGA, will be an invaluable tool to use in cells to alter O-GlcNAc levels, which will ultimately aid elucidation of the way the enzyme works and is regulated.
T. M. G. is a Sir Henry Wellcome postdoctoral fellow and a Michael Smith Foundation for Health Research (MSFHR) trainee award holder. D. J. V. is a scholar of the MSFHR and holds a Canada Research Chair in Chemical Glycobiology.