The ability to sense and respond to nutrient levels is critical for the growth of all living systems. In eukaryotes, a major mechanism for nutrient sensing involves the essential [11] protein glycosyltransferase OGT, which senses cellular glucose levels via UDP-GlcNAc concentrations and responds by dynamically O-GlcNAcylating a wide range of nuclear and cytoplasmic proteins [1,12]. These include proteins involved in insulin-like signaling pathways [7] and transcriptional activators that regulate glucose levels by controlling gluconeogenesis [13]. Since many known O-GlcNAcylation sites are also phosphorylation sites, OGT is proposed to play a major role in modulating cellular kinase signaling cascades [14]. OGT is also involved in widespread transcriptional regulation [15–17]. Prolonged hyperglycemia, such as occurs in diabetes, or excessive glucose uptake, such as occurs in cancer cells, results in hyper-O-GlcNAcylation of cellular proteins by OGT, and this increased O-GlcNAcylation has been linked to harmful cellular effects [18]. Thus, strategies to modulate OGT activity may have therapeutic value for treating diabetic complications, cancer, and other diseases [13].
The lack of a crystal structure has been a major impediment to investigating OGT’s molecular mechanisms, understanding substrate recognition, and developing inhibitors. OGT comprises two distinct regions: an N-terminal region consisting of a series of tetratricopeptide repeat (TPR) units [19,20] and a multidomain catalytic region. The TPR domain is proposed to scaffold interactions with other proteins, which may play a role in determining substrate selectivity [21]. A crystal structure comprising 11.5 TPR units of human OGT was reported in 2004 [21], but there have been no structures of the catalytic region. From sequence analysis and structures of bacterial glycosyltransferases [22–26], including a bacterial homolog of unknown function [25,26], OGT was predicted to be a member of the GT-B superfamily of glycosyltransferases (Gtfs) [27]. However, OGT is unusual because it is the only known member to glycosylate polypeptides and it contains a long uncharacterized intervening sequence (~120 amino acids) in the middle of the catalytic region. It is also proposed to contain a phosphatidylinositol (3,4,5)-trisphosphate (PIP3) binding domain involved in membrane recruitment in response to insulin signaling [7].
We report two crystal structures of a human OGT construct (hOGT4.5) containing 4.5 TPR units and the catalytic domain. The catalytic properties of this construct are similar to those of the full-length enzyme (Supplementary Fig. 1). One structure (2.8 Å, referred to as OGT-UDP) is a complex with UDP; the other structure (1.95 Å, referred to as OGT-UDP-peptide) is a complex containing UDP and a well-characterized 14-residue CKII peptide substrate [28]. Based on currently available experimental data, we also present a model for the full-length enzyme (see Supplementary Information). Details of structure determination are presented in Methods and Supplementary Tables 1 and 2.
The OGT-UDP complex is shown in . The catalytic region contains three domains: the N-terminal domain (N-Cat), the C-terminal domain (C-Cat), and the intervening domain (Int-D) (). The N-Cat and C-Cat domains have Rossmann-like folds typical of GT-B superfamily members; however, the N-Cat domain is distinctive in containing two additional helices, H1 and H2, which form an essential part of the active site (). The Int-D domain, which has a novel fold, packs exclusively against the C-Cat domain (). The UDP moiety binds in a pocket in the C-Cat domain near the interface with the N-Cat domain [27]. This pocket is lined with conserved residues shown to be important for catalytic activity (Supplementary Table 3). A transitional helix (H3) links the catalytic region to the TPR repeats, which spiral along the upper surface of the catalytic region from the C-Cat domain to the N-Cat domain. The TPRs and the catalytic region are demarcated by a narrow horizontal cleft.
Overall structure of human OGT complexed to UDP
The OGT-UDP-peptide complex (), which crystallized in a different space group from the OGT-UDP complex, has a wider cleft between the TPR domain and the catalytic region than the OGT-UDP complex ( and ), and the CKII peptide binds in this cleft. This peptide, YPGGSTPVS*SANMM, contains three serines and a threonine, but only one serine (underlined; referred to as Ser*) is glycosylated by OGT [28]. The hydroxyl of Ser* points into the nucleotide-sugar binding site (). The two residues N-terminal to Ser* lie over the UDP moiety; the residues C-terminal to Ser* traverse towards the back of the cleft along the H2 helix of the N-Cat lobe. Although OGT glycosylates a wide range of target peptides, it prefers sequences in which the residues flanking the glycosylated amino acid enforce an extended conformation (e.g., prolines and β-branched amino acids; see Supplementary Fig. 2 and Supplementary Table 4). Consistent with these preferences, the peptide is anchored mainly by contacts from OGT side chains to the amide backbone, with an additional contact from the UDP moiety to the backbone amide of Ser*. The cleft is also filled with ordered water molecules, enabling it to serve as an adaptable interface to bind a range of polypeptides containing side chains of different sizes and hydrogen bond properties. Since the peptide substrate is anchored by contacts to its backbone, it is reasonable to infer that protein substrates are glycosylated on flexible regions such as loops or termini that can bind in an extended conformation, exposing the amide backbone.
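The residue count for the CKII peptide can be checked with a short sketch. It lists the hydroxyl-bearing residues (Ser and Thr) in the sequence as quoted in the text. The function name `hydroxyl_sites` is an illustrative choice, and the sketch only enumerates candidate positions; it makes no claim about which one OGT modifies.

```python
# CKII peptide sequence as quoted in the text (asterisk marker omitted).
CKII_PEPTIDE = "YPGGSTPVSSANMM"


def hydroxyl_sites(sequence):
    """Return (1-based position, residue) for every Ser or Thr in the sequence."""
    return [(i, aa) for i, aa in enumerate(sequence, start=1) if aa in "ST"]


sites = hydroxyl_sites(CKII_PEPTIDE)
n_ser = sum(1 for _, aa in sites if aa == "S")
n_thr = sum(1 for _, aa in sites if aa == "T")

print(sites)
print(f"{len(CKII_PEPTIDE)} residues, {n_ser} Ser, {n_thr} Thr")
```

The four candidate hydroxyls sit close together in sequence, which is why the structural observation that only one of them points into the nucleotide-sugar binding site is informative.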
Structure of the OGT-UDP-peptide complex
The closed conformation of the substrate-binding cleft in the OGT-UDP structure is stabilized by a ‘latch’ comprising contacts between TPRs 10/11 and the H2 helix of the catalytic domain ( and Supplementary Fig. 3). Opening of the cleft in the OGT-UDP-peptide complex occurs due to a hinge-like motion around a pivot point between TPRs 12 and 13. The two structures suggest that glycosylation substrates enter the active site from the face of the enzyme shown in , with the TPR domain restricting or allowing access depending on its conformation and its interactions with the catalytic domain. Molecular dynamics simulations indicate that the ‘hinge’ between the catalytic domain and the TPR domain is capable of large motions that fully expose the active site, which would allow protein substrates to approach closely enough for surface loops to enter (Supplemental Movie 1). The molecular mechanisms that facilitate or stabilize opening of the cleft to allow access of protein substrates remain to be determined, but may involve interactions between protein substrates or adapter proteins and the other regions of OGT.
The OGT-UDP-peptide complex, in addition to revealing how peptide substrates bind, provides unexpected insights into the kinetic mechanism. OGT was previously proposed to have a random sequential bi-bi mechanism [28]. The structure, however, indicates that the peptide substrate binds over the nucleotide-sugar binding pocket, blocking access to it. Moreover, the α-phosphate of the UDP moiety contacts the backbone amide of Ser* (), which helps orient the peptide. The peptide complex suggests an ordered mechanism in which UDP-GlcNAc binds prior to the polypeptide substrate. To assess the order of substrate binding, we analyzed the product inhibition patterns for UDP. At saturating peptide concentrations, a competitive inhibition pattern was obtained for UDP with respect to UDP-GlcNAc, which is inconsistent with a random mechanism, but supports the ordered sequential bi-bi mechanism implied by the crystal structure (Supplementary Fig. 4).
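The signature of a competitive inhibition pattern can be illustrated numerically. The sketch below assumes the textbook single-substrate rate law for competitive inhibition, v = Vmax·[S] / (Km·(1 + [I]/Ki) + [S]), with UDP-GlcNAc as the varied substrate S and UDP as the inhibitor I at fixed, saturating peptide. All parameter values are arbitrary placeholders, not measurements from this work, and the function names are illustrative.

```python
def competitive_rate(s, i, vmax, km, ki):
    """Initial rate with a competitive inhibitor (textbook rate law)."""
    return vmax * s / (km * (1.0 + i / ki) + s)


def double_reciprocal_line(i, vmax, km, ki, s_low=1.0, s_high=10.0):
    """Slope and intercept of 1/v versus 1/[S], from rates at two [S] values."""
    x1, x2 = 1.0 / s_low, 1.0 / s_high
    y1 = 1.0 / competitive_rate(s_low, i, vmax, km, ki)
    y2 = 1.0 / competitive_rate(s_high, i, vmax, km, ki)
    slope = (y1 - y2) / (x1 - x2)
    intercept = y1 - slope * x1
    return slope, intercept


# Placeholder constants in arbitrary units, chosen only for illustration.
VMAX, KM, KI = 1.0, 2.0, 0.5

lines = {i: double_reciprocal_line(i, VMAX, KM, KI) for i in (0.0, 0.5, 1.0)}
for i, (slope, intercept) in lines.items():
    print(f"[I] = {i}: slope = {slope:.3f}, intercept = {intercept:.3f}")
```

In a double-reciprocal plot of this kind, the lines for different inhibitor concentrations share the intercept 1/Vmax while the slope grows with [I]; a shared intercept is what identifies the pattern as competitive.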
Another insight from the crystal structure is the identity of the catalytic base. Based on analyses of other GT-B family members, including the bacterial OGT homolog, it was proposed that His558 is the catalytic base. Although we have verified that this residue is critical for catalytic activity, the peptide complex shows that it is more than 5 Å away from the reactive serine hydroxyl and makes an apparent hydrogen bond with the backbone carbonyl of the preceding residue. In contrast, His498, which is invariant in metazoan OGTs but absent in the homologous bacterial enzyme, protrudes from helix H1 into the active site within 3.5 Å of the Ser* hydroxyl. Since His498 is critical for activity and is located between the reactive serine hydroxyl and the GlcNAc binding pocket, it is the probable catalytic base in OGT.
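The distance argument can be made concrete with a small geometric check. The sketch below computes inter-atomic distances and compares them with a cutoff. The coordinates are invented placeholders, not values from the reported structures, and the 3.5 Å cutoff simply mirrors the distance quoted in the text. It relies on `math.dist`, available in Python 3.8 and later.

```python
import math

# Invented placeholder coordinates in angstroms (NOT taken from the structures).
ser_oxygen = (0.0, 0.0, 0.0)    # hypothetical reactive serine hydroxyl oxygen
candidate_a = (3.0, 1.0, 1.0)   # hypothetical imidazole nitrogen, candidate A
candidate_b = (4.0, 3.0, 2.0)   # hypothetical imidazole nitrogen, candidate B


def within_reach(atom1, atom2, cutoff=3.5):
    """Return (distance, True if distance <= cutoff) for two 3D points."""
    d = math.dist(atom1, atom2)
    return d, d <= cutoff


for name, atom in (("candidate A", candidate_a), ("candidate B", candidate_b)):
    d, ok = within_reach(ser_oxygen, atom)
    print(f"{name}: {d:.2f} Å, within cutoff: {ok}")
```

A residue that fails this kind of proximity test can still be essential for activity, as reported for His558, so the check only narrows down which side chain is positioned to accept the proton.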
We were unable to obtain a crystal of the OGT-UDP-GlcNAc complex due to hydrolysis of the substrate, but according to computational docking experiments the GlcNAc is oriented in a manner that exposes its β face to the overlying peptide (Supplementary Fig. 5) and places the anomeric carbon near the reactive serine. This conformation is similar to the UDP-GlcNAc conformation observed in a complex of another GT-B family member [23], and its relevance is supported by evidence that the C2 N-acetyl moiety projects up from the OGT sugar binding pocket [29]. Furthermore, it is consistent with the enzymatic reaction, which involves displacement of the α-UDP group to yield an inverted product. Based on the accumulated biochemical and structural data, we propose a general mechanism for the reaction ().
The most unusual feature of OGT is the intervening domain between the catalytic lobes, which is only found in metazoans (Supplementary Figs. 6 and 7). This polypeptide adopts a topologically novel fold with a seven-stranded beta sheet core stabilized by flanking alpha helices (). There are two long unstructured loops for which electron density is missing. An electrostatic surface rendering shows that the intervening domain and an adjacent helix of the C-Cat domain form a large basic surface comprising ten lysine residues (), including K981 and K982, which were previously reported to constitute part of a PIP3 binding motif that recruits OGT to membranes [7]. We mutated eight of these ten lysines in various combinations (Supplementary Table 3). All mutants were catalytically active (Supplementary Fig. 8), but we were unable to identify a role for the Int-D domain in PIP binding (Supplementary Table 5). We suggest this domain is involved in other functions in vivo. These functions may include substrate selection, cellular localization, or interactions with regulatory factors or receptors. The reported structures and mutant data provide a crucial starting point for investigating the possible roles of the intervening domain.
Structure of the intervening domain and full-length models of human OGT
The structures reported here show how OGT recognizes peptide sequences and provide new information on the enzymatic mechanism as well as a view of the intervening domain. Models of full-length human OGT in its open and closed states, constructed based on crystal structures and MD simulations, highlight the conformational changes that may regulate access of substrates to the active site (). Our structures may assist in the development of inhibitors with possible therapeutic value for treating diseases associated with excessive O-GlcNAcylation.