O-glycan biosynthesis is initiated by the transfer of N-acetylgalactosamine (GalNAc) from a nucleotide-sugar donor (UDP-GalNAc) to Ser/Thr residues of an acceptor substrate. The detailed transfer mechanism, catalyzed by the UDP-GalNAc polypeptide:N-acetyl-α-galactosaminyltransferases (ppGalNAcTs), remains unclear despite structural information available for several isoforms in complex with substrates, at various stages along the catalytic pathway. We used all-atom molecular dynamics (MD) simulations with explicit solvent and counterions to study the conformational dynamics of ppGalNAcT-2 in several enzymatic states along the catalytic pathway. ppGalNAcT-2 is simulated both in the presence and absence of substrates and reaction products to examine the role of conformational changes in ligand binding. In multiple 40 ns–long simulations of more than 600 ns total run time, we studied systems ranging from 45,000 to 95,000 atoms. Our simulations accurately identified dynamically active regions of the protein, as previously revealed by the x-ray structures, and permitted a detailed, atomistic description of the conformational changes of loops near the active site and the characterization of the ensemble of structures adopted by the transferase complex on the transition pathway between the ligand-bound and ligand-free states. In particular, the conformational transition of a functional loop adjacent to the active site from closed (active) to open (inactive) is correlated with the rotameric state of the conserved residue W331. Analysis of water dynamics in the active site revealed that internal water molecules have an important role in enhancing the enzyme flexibility. We also found evidence that charged side chains in the active site rearrange during site opening to facilitate ligand binding. 
Our results are consistent with the single-displacement transfer mechanism previously proposed for ppGalNAcTs based on x-ray structures and mutagenesis data and provide new evidence for possible functional roles of certain amino acids conserved across several isoforms.
Glycosyltransferases (GTs; EC 2.4.x.y) are a large family of enzymes involved in the biosynthesis of oligosaccharides, polysaccharides, and glycoconjugates.1 GTs catalyze the transfer of the glycosyl moiety from a sugar donor to an acceptor. In most cases, the donor is a nucleoside phosphosugar and the acceptor is a hydroxyl group of another sugar, an amino acid, or a lipid. GTs are classified as either retaining or inverting based upon the stereochemistry of the anomeric carbon atom of the transferred sugar relative to its configuration in the donor substrate.
UDP-N-acetylgalactosamine:polypeptide N-acetylgalactosaminyltransferases (ppGalNAcTs, EC 2.4.1.41, family 27 in CAZy2) are retaining GTs. They initiate synthesis of mucin-type glycoproteins by transferring the GalNAc moiety from the sugar donor UDP-GalNAc to Ser or Thr residues, to form the Tn antigen3 (GalNAcα1-O-Ser/Thr). This family is large, with up to 20 members in humans, and evolutionarily conserved, but the polypeptide substrate preferences of individual isoforms have not been elucidated, even though structures have been determined for several isoforms.
The recently solved crystal structures of three isoforms of the ppGalNAcT family capture different stages of the reaction pathway: murine T1 (mT1)4 (PDB code 1XHB, apo form); human T2 (hT2)5 (PDB code 2FFV, binary complex with UDP, and PDB code 2FFU, ternary complex with UDP and the acceptor peptide EA2); and human T10 (hT10)6 (PDB code 2D7I, complex with hydrolyzed UDP-GalNAc, and PDB code 2D7R, complex with GalNAcα1-O-Ser). These structures indicate the presence of two domains connected by a flexible linker that permits significant changes in their relative orientations. The catalytic domain belongs to the GT-A fold, whereas the carbohydrate-binding lectin domain (LD) adopts a β-trefoil fold, as classified in the Conserved Domain Database.7 The enzymatic activity of the catalytic domain depends on the presence of a Mn2+ ion coordinated by His and Asp residues, which form the so-called DXH motif of the active site, and by an additional His residue, all conserved throughout the ppGalNAcT family.3
The enzymatic mechanism of retaining GTs is still under debate.3,8,9 However, in the case of ppGalNAcTs a general outline of the process can be proposed, emerging from the available crystal structures as a series of transfer reaction stages listed below.
The enzyme in the apo form, before any ligand is bound, is described by the mT1 structure. Here, the conformation of the loop bridging opposite edges of the active site (shown as loop B in Figure 1, residues R347-T358 in mT1 and R361-S377 in hT2) is not resolved due to crystallographic disorder.
The enzyme binds the donor substrate UDP-GalNAc, before peptide binding, according to a proposed ordered-sequential mechanism for retaining GTs.10–12 This is illustrated by the hT10 structure (PDB code 2D7I) containing hydrolyzed UDP-GalNAc. Here, loop R361-S377 (loop B in Figure 1) is stabilized in a “closed” conformation at the edge of the active site, forming a lid over the UDP moiety and leaving the active site groove exposed and able to bind the acceptor peptide.
The enzyme, bound with donor and acceptor peptide, is illustrated by the ternary complex of hT2 (PDB code 2FFU). Loop B (Figure 1) maintains its closed conformation, but does not make significant contact with the peptide, as shown by the x-ray structure. Peptide binding is dominated by hydrogen bonds towards the N-terminus and by hydrophobic interactions towards the C-terminus, but the residues mediating these interactions are less conserved, accounting for differences in isoform affinity for different peptides.5 In contrast, UDP-GalNAc interacts with the protein residues mainly via hydrogen bonds, and residues forming its binding pocket are highly conserved among isoforms, accounting for their common preference for UDP-GalNAc as a sugar donor.5
The GalNAc moiety is transferred from the sugar-nucleotide donor to the Ser/Thr residue on the acceptor peptide. A structure containing UDP and a glycosylated peptide in the active site is not available and the residues involved in catalysis have not been clearly identified.
The glycosylated peptide is detached, and the reaction product UDP leaves the active site. This step may be illustrated by the binary complex of hT2 and UDP (PDB code 2FFV), since the ribose and uridine moieties of UDP are flipped by ~180° relative to their orientation in the ternary complex. Loop B (R361-S377) is folded back over the active site, exposing UDP to the bulk water molecules. Comparative structural analysis also indicates that in addition to loop B, the N-terminal segment of the catalytic domain, up to residue D97 in hT2, is highly flexible and undergoes significant conformational changes. However, the relevance of these changes for catalysis is unclear. This stretch is not resolved in mT1, and it forms an α-helix in hT10 whose position relative to the catalytic domain differs markedly from that in hT2. Consequently, the conformation of loop A (G89-P98, Figure 1) is changed. Although loop A is not in direct contact with the substrates, it interacts with loop B via hydrogen bonds between residues in the hinge regions (N102-R362 and K103-Q364) and possibly influences its conformation.
The transfer reaction steps described above suggest that by undergoing significant conformational changes, loop A and especially loop B may be involved in positioning the substrates into the active site. Similar mechanisms with concerted conformational changes of loops near the active site upon ligand binding have also been reported for members of other GT families, such as β-(1,4)-galactosyltransferase T1 (β-(1,4)-Gal-T1)13,14 in family 7 of inverting GTs, α-(1,3)-galactosyltransferase15,16 and human blood group A and B glycosyltransferases17,18 in family 6 of retaining GTs, and α-(1,4)-galactosyltransferase (LgtC)19 in family 8 of retaining GTs. Loop mobility around the active site may thus be a general feature of the GT-A superfamily, but further testing of this hypothesis for other members of the superfamily is needed.
The catalytic mechanism of retaining GTs has not been clearly elucidated yet3,8,9 but several scenarios have been proposed. The first is a double-displacement mechanism, by analogy to the well characterized glycosidases. In the double-displacement mechanism, catalysis requires two nucleophilic active site carboxyl groups and proceeds in two steps that involve the formation and subsequent breakdown of a covalent glycosyl–enzyme intermediate: 1) glycosylation: attack of the anomeric center of the donor sugar by one nucleophile, to form a glycosyl–enzyme species, followed by 2) deglycosylation (hydrolysis): deprotonation of the reactive hydroxyl of the acceptor, performed by the other carboxyl, which serves a dual role as acid-base catalyst.
This mechanism was initially proposed for family 6 of retaining GTs based on a low resolution x-ray structure of α-(1,3)-galactosyltransferase20 showing a carboxylic amino acid residue in a suitable position to act as an enzyme nucleophile. However, it was not confirmed by a subsequent high-resolution structure15 and mutagenesis experiments targeting the proposed nucleophile.21 Also, the x-ray structure of family 8 α-(1, 4)-galactosyltransferase LgtC11 showed no carboxylic amino acids (Asp or Glu) close to the anomeric carbon of the donor substrate and raised further doubts about such a mechanism. A Gln residue was thought to be the enzyme nucleophile and an attempt to trap an enzyme-sugar intermediate did result in such a complex, but surprisingly with the sugar substrate covalently attached to an adjacent Asp residue positioned far, at ~8 Å distance from the reaction center, as shown by the x-ray structure.22
In the case of ppGalNAcTs, a covalent enzyme-sugar intermediate was not present in the x-ray structure of isoform T10, where the sugar-phosphate linkage is hydrolyzed. Moreover, the nearest acidic residues that might function as nucleophiles are D224 of the DxH motif binding Mn2+, and E334 of the WGGEN motif. However, the distances from these residues to the β-phosphate oxygen are ~7 Å in the x-ray structures. As a consequence, a double-displacement mechanism would require a large conformational change to bring these residues into proximity with the β-phosphate oxygen.
Failure to provide clear evidence for a double-displacement mechanism has led to the proposal of an alternative mechanism in which the reaction proceeds via a front-side single displacement, also known as SNi mechanism.8,11 In this “concerted one-step” mechanism the nucleophilic hydroxyl group of the acceptor attacks the anomeric carbon at the same side from which the UDP leaving group departs. This mechanistic view has also been extended to other retaining GTs: GT Family 20 trehalose-6-phosphate synthase OtsA,23 GT Family 4 trehalose phosphorylase24 and GT Family 35 glycogen phosphorylase.25
In this study, we probe the mechanistic aspects related to conformational changes of ppGalNAcT-2 during ligand binding and processing by using all-atom molecular dynamics (MD) simulations in conjunction with the recently determined experimental structures.4–6 Previous MD studies have analyzed the flexibility of the above-mentioned families 7 and 8 of GTs.13,14,19 Here, we perform the first MD simulations for a member of the ppGalNAcT family (human T2), both in the presence and absence of donor and acceptor substrates. Our MD simulations permit a detailed characterization of the ensemble of structures adopted by the transferase complex on the transition pathway between several experimentally determined ligand-bound and ligand-free states. These simulations reveal large and possibly functional conformational changes of the protein loops near the active site during ligand binding. With regard to possible catalytic mechanisms, we find that throughout our simulations, distances between potential nucleophilic residues and the glycosidic oxygen of the sugar-nucleotide ligand remain large (~7 Å), suggesting that the double-displacement mechanism is less likely than alternative mechanisms such as SNi.
We also investigate the influence of the UDP-GalNAc and peptide substrates on loop dynamics and the effect of these conformational changes on the mechanism of ligand binding and detachment. Based on our analysis, we discuss a possible binding mechanism that suggests a central role of the evolutionarily conserved residue W331. That residue appears to act as a gatekeeper, controlling the conformational transition of the functional loop B adjacent to the active site, from closed (unable to accept ligand) to open (able to accept ligand). We also analyze the dynamics of water molecules in the active site region, and discuss the possible role of water in the catalytic process. We explore the conformational preferences of the bound acceptor peptide and the stability of its specific interactions with the catalytic domain. This information should prove useful in future sequence-scanning and flexible docking experiments to identify preferences of individual isoforms for peptide substrates and for the design of alternative peptide substrates.
Figure 1 depicts the x-ray structures of ppGalNAcT-2 with PDB codes 2FFU and 2FFV. A comparison of these two x-ray structures indicates that the conformational changes of the catalytic domain associated with ligand binding and processing are concentrated in two long loops: loop A (residues 89–98), located relatively far from the active site (modeled in 2FFV as in 2FFU); and loop B (residues 361–377), located at one end of the active site. Loop B undergoes a large movement, shifting by as much as 24 Å from one side of the peptide binding groove to the other. In addition, smaller conformational changes of the protein backbone occur in three other short loops: loop C (residues 127–134), an exposed loop on the side opposite the active site; loop D (residues 287–297), forming a short α-helix located more than 9 Å away from the peptide binding groove, that moves by approximately 5 Å in the absence of the peptide; and loop E (residues 330–333), showing a 4 Å shift. These loops are shown in Figure 1, colored blue in 2FFU and orange in 2FFV. In analyzing our simulations, we are particularly interested in the flexibility of loops A, B and E and their possible role in ligand binding.
To assess the dynamical properties of ppGalNAcT-2, we performed extensive MD simulations of a series of complexes between the enzyme and its different ligands based upon the x-ray structures described above. With solvent molecules treated explicitly, including Na+ and Cl− counterions, the simulation systems contained up to 95,000 atoms. Simulations of up to 40 ns per trajectory resulted in a total simulation time of more than 600 ns (Table 1). To study the relative dynamics of the two domains, our initial simulations included both the catalytic and lectin domains, using the 2FFU crystal structure from which UDP and EA2 had been removed. We also simulated the same system in the presence of the substrates, but with the lectin domain positioned near the catalytic domain, as guided by the mT1 structure. This simulation allowed us to test whether the binding of substrates triggered movement of the lectin domain away from the catalytic domain. A comparison of the results of these simulations indicated that the average flexibility pattern of each domain, as measured by RMSD per residue, is independent of the presence of the other domain or the distance between the domains. Thus, to reduce computational requirements, subsequent simulations focused on the catalytic domain to test the influence of substrates on the dynamics of the enzyme at different steps in the catalytic process. In this second series of MD trajectories, we simulated the catalytic domain as in the ternary complex in the absence of substrates, in the presence of UDP-GalNAc, in the presence of UDP-GalNAc and EA2 (substrate complex), in the presence of UDP and glycosylated EA2 (product complex), and in the presence of only UDP or UDP and EA2. To evaluate the dynamics of the enzyme after the reaction (structure 2FFV), we also simulated the catalytic domain as in the binary complex, either alone or in the presence of the reaction product UDP.
We compared the simulations of the catalytic domain starting from the ternary complex 2FFU in the presence or absence of UDP-GalNAc donor and peptide acceptor and starting from the binary complex 2FFV in the presence and absence of the reaction product UDP.
As a measure of the average flexibility of various regions in the protein, we first calculated for each residue the root-mean-square distance (RMSD) of the alpha carbon (Cα) from the average position and compared it to the corresponding RMSD between the two crystal structures, 2FFU and 2FFV. The results of the simulations done in the presence of UDP-GalNAc and EA2 (C(UG)Px), in the presence of UDP-GalNAc only (C(UG)x) and in the absence of substrates (Cx) are represented in Figure 2. The theoretical and experimental RMSDs agree well and the MD simulations correctly identify the flexible regions in the protein. In particular, the highest peaks in calculated RMSD as a function of residue number correspond to the flexible loops described above.
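The per-residue measure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `per_residue_rmsd_from_average` and the toy trajectory are hypothetical, the frames are assumed to be already superimposed on a common reference, and the code is not the analysis pipeline used in the study.

```python
import numpy as np

def per_residue_rmsd_from_average(traj):
    """RMSD of each Cα atom from its time-averaged position.

    traj: array of shape (n_frames, n_residues, 3), in Å.
    Assumption: frames were superimposed beforehand.
    """
    traj = np.asarray(traj, dtype=float)
    mean_pos = traj.mean(axis=0)                    # average structure
    sq_dev = ((traj - mean_pos) ** 2).sum(axis=2)   # squared deviation per frame and residue
    return np.sqrt(sq_dev.mean(axis=0))             # average over frames, then root

# Toy trajectory: residue 0 is rigid, residue 1 oscillates by ±1 Å along x.
toy = np.zeros((4, 2, 3))
toy[:, 1, 0] = [1.0, -1.0, 1.0, -1.0]
rmsd = per_residue_rmsd_from_average(toy)
print(rmsd)
```

Plotting such a profile against residue number gives peaks at the flexible regions, which is the kind of comparison summarized in Figure 2.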
The highest RMSD (13 Å) is seen in loop A in the absence of ligands. However, when both UDP-GalNAc and the peptide are present, the RMSD of loop A relative to the average position is reduced significantly from 13 Å to 4 Å (Figure 2, peak A). Even though loop A (missing in the 2FFV “open” x-ray structure) was consistently modeled as in the “closed” 2FFU x-ray structure, its dynamics depends not only on the presence of ligands but also the state of loop B. In simulations starting from the structure with loop B in the closed conformation (Figure 3a), loop A appears more stable, because loops A and B interact through a hydrogen bond between the main chain oxygen atom of N102 and the R362 side chain. In the absence of UDP-GalNAc, loop B opens and this interaction is lost. Consequently, loop A becomes more flexible. In simulations starting from the structure with loop B in an open conformation (Figure 3b), this interaction is absent from the beginning and the conformational space sampled by loop A is much larger.
The flexibility of loop B is slightly lower than that of loop A in the absence of substrates but is significantly reduced, from 12 Å to less than 4 Å RMSD, upon UDP-GalNAc binding. However, the subsequent binding of the peptide does not provide additional stabilization. Since this loop interacts directly with the donor and acceptor and its dynamics greatly affects the binding of substrates, its motions will be discussed in more detail in the following section.
In the simulation starting from the closed conformation, but with both substrates removed (Cx), loop B undergoes a conformational transition towards the open state (Figure 4a). This is a slow process that starts about 1.5 ns after the simulation begins and lasts for about 25 ns. To identify the factors that trigger this large conformational change, we analyzed the residues in the vicinity of loop B. Interacting with its end, we identified W331 (part of the longer, conserved WGGEN motif within loop E), whose side chain flips by Δχ1 ≈ 180° around the Cα-Cβ bond after 1.5 ns of simulation time. This rearrangement (Figure 4c, d, e) creates an unfavorable steric clash with P366 and Y367, which are both well conserved, and a cluster of bulky residues (H365, F369, F377). The W331 χ1 rotation also appears to trigger the slow movement of loop B towards the opposite side of the peptide binding groove. W331 maintains the flipped conformation until loop B is stabilized in an intermediate conformation at about 8 Å RMSD relative to the completely open state. At this point W331 is no longer interacting with loop B. Thus, its side chain is free to switch back to the initial conformation (Figure 4e).
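To make the χ1 analysis concrete, the sketch below computes a dihedral angle from four atomic positions (for χ1 these would be N, Cα, Cβ and Cγ) and locates the first frame in which the angle has changed by more than a chosen threshold. The function names, the 120° threshold and the toy χ1 series are illustrative assumptions, not values or code from the study.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))

def wrapped_change(a_deg, b_deg):
    """Angular difference b - a wrapped into [-180, 180)."""
    return (b_deg - a_deg + 180.0) % 360.0 - 180.0

def first_flip_frame(chi1_series, threshold=120.0):
    """Index of the first frame whose chi1 differs from frame 0 by more
    than `threshold` degrees (assumed cutoff), or None if there is none."""
    chi1 = np.asarray(chi1_series, dtype=float)
    change = np.abs(wrapped_change(chi1[0], chi1))
    flipped = np.nonzero(change > threshold)[0]
    return int(flipped[0]) if flipped.size else None

# Four points in a trans arrangement, then a toy chi1 time series.
trans = dihedral_deg([1, 0, 0], [0, 0, 0], [0, 0, 1], [-1, 0, 1])
flip_at = first_flip_frame([-60.0, -62.0, -58.0, 120.0, 118.0])
print(abs(trans), flip_at)
```

Wrapping the difference matters because χ1 is periodic: a change from −60° to 120° is a 180° flip, whichever direction the rotation takes.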
To determine whether the opening of loop B observed in the absence of the donor is correlated with the side-chain rotation of W331 or is simply due to the lack of interaction with UDP-GalNAc, we simulated an identical system (i.e., in the absence of the nucleotide-sugar donor) but with the rotameric state of the W331 side chain restrained (Table 1, Cwx). In this simulation, we observe that loop B maintains its original conformation and does not open during the 40 ns-long simulation (Figure 4a), providing further evidence that the change in the W331 side chain orientation is indeed required for loop opening.
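Following a loop relative to a reference conformation, as in the RMSD to the open state quoted above, requires removing overall rotation and translation first. A minimal sketch using the Kabsch algorithm is given below. Note the simplifying assumption: here the same atoms are used for the fit and for the RMSD, whereas a loop analysis would typically fit on the rigid protein core and evaluate the RMSD on the loop atoms only. The function name and test coordinates are hypothetical.

```python
import numpy as np

def kabsch_rmsd(mobile, reference):
    """Minimum RMSD between two (n_atoms, 3) coordinate sets after
    optimal rigid-body superposition (Kabsch algorithm)."""
    P = np.asarray(mobile, dtype=float)
    Q = np.asarray(reference, dtype=float)
    P = P - P.mean(axis=0)            # remove translation
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                       # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])        # guard against improper rotations
    R = Vt.T @ D @ U.T                # optimal rotation taking P onto Q
    P_rot = P @ R.T
    return float(np.sqrt(((P_rot - Q) ** 2).sum(axis=1).mean()))

# A rotated and translated copy of a structure should give RMSD ~ 0.
ref = np.array([[0, 0, 0], [1, 0, 0], [0, 2, 0], [0, 0, 3]], dtype=float)
rot_z = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
moved = ref @ rot_z.T + np.array([5.0, -2.0, 1.0])
rmsd_same = kabsch_rmsd(moved, ref)
print(rmsd_same)
```

Applying such a function frame by frame against the closed and open crystal conformations would give time series of the kind shown in Figure 4a.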
Interestingly, the conformational change of the W331 side chain occurs only in the absence of donor UDP-GalNAc. The phosphate groups of either the substrate UDP-GalNAc or the product UDP stabilize the inward orientation of W331 through two hydrogen bonds between their oxygen atoms and nitrogen NE1. In the simulations starting with loop B in its open state and with UDP present (Figure 4b, CopenU), the W331 side chain does not change orientation. In contrast, with the loop initially in the open state but without UDP (Figure 4b, Copen), the side chain behaves similarly to the case with the loop B closed (Figure 4a, Cx). The correlated change in the orientation of the W331 side chain with loop B opening suggests that the W331 amino acid may be facilitating donor access into the active site and product release.
Similar mechanisms, in which access to the active site is controlled by the position of a long loop whose dynamics is in turn correlated with motions in a short and conserved loop, have been described for other families of transferases using different molecules as donors and acceptors. For example, β-(1,4)-Gal-T1 undergoes a transition from an open to a closed conformation upon glycosyl donor substrate binding.26,27 The corresponding Trp residue, W314, is again part of a WGGEN motif within a small flexible loop of that protein. W314 plays an important role in the conformational change of the adjacent long loop, by moving toward the catalytic pocket to interact with the complex of donor substrate and Mn2+ ion upon donor substrate binding.28 This conformational change seems to be essential for the subsequent productive binding of an acceptor sugar substrate, since a W314Y mutation drastically reduced both the binding affinity to the sugar nucleotide and the catalytic activity of the enzyme.29
In molecular dynamics simulations of β-(1,4)-Gal-T1 in both implicit and explicit solvent, Gunasekaran et al.13,14 reported tight coupling of motions in a long loop (which would correspond to our loop B) to the structure and motions of the corresponding 314-WGG-316 motif. Mutations of the conserved Trp and the flanking Gly residues to Ala also reduced the fluctuations in the long loop. Overall, the simulation results for β-(1,4)-Gal-T1 are consistent with our results for ppGalNAcT-2. However, in our specific case, the main chain of the WGGEN loop stays relatively rigid and the conformational change appears to be triggered by the side chain of W331 alone. Thus, our simulations suggest a mechanism in which W331 acts as a gatekeeper that controls the transition between the closed (unable to accept ligand) and open (able to accept ligand) conformations of loop B.
The accessible surface area of the UDP-GalNAc binding site doubles during loop opening (Figure 5). The newly exposed surface consists mainly of positively charged residues (R201, R362) and hydrophobic residues, while negatively charged residues do not become exposed. The two arginine residues reorient towards solvent, whereas two His residues (H145, H226) do not change orientation but become exposed due to loop repositioning. Both R201 and R362 are highly conserved in the ppGalNAcTs and interact with the uridine ring and phosphate oxygen atoms of UDP-GalNAc. In addition to their favorable electrostatic interactions with the bound UDP, they may also facilitate donor entry into the active site by making weak long-range contacts in the open, highly flexible state of the loop. The observed flexibility of the unstructured regions in the vicinity of the donor binding site may thus be a relevant factor for protein-substrate recognition.
A manganese ion is required for both the stability and activity of ppGalNAcTs and is bound by residues in the active site motif DxH and an additional conserved His.3,5 Mn2+ bridges the protein side chains with the diphosphate group of the donor. X-ray structures containing UDP-GalNAc or UDP show that Mn2+ bridges three residues in the active site (2 × His, 1 × Asp) and two oxygen atoms in the phosphate groups of UDP. The presence of one water molecule at a distance of ~2.2 Å results in a coordination number of 6 for the Mn2+ ion.
We analyzed the stability of Mn2+ interactions in the presence and absence of substrates and the hydration patterns in the active site, to explore possible implications on the enzymatic mechanism of ppGalNAcTs.
Simulations in the presence of UDP-GalNAc indicate that the distance between Mn2+ and the beta-phosphate-oxygen is maintained essentially constant at ~4 Å, in agreement with the x-ray structures. This is also true for simulations containing free UDP, although the oxygen atoms in the terminal phosphate group interchange positions due to rotation of the second phosphate group.
We find that in simulations in the presence of UDP-GalNAc alone or UDP-GalNAc and peptide, a single water molecule with high residence time remains in the first hydration shell of Mn2+ (Figure 6a, b). Even in the simulations performed in the absence of crystal water, a water molecule from the surrounding bulk solvent takes this role, suggesting the importance of hydration for the Mn2+ ion in the active site.
In the presence of UDP-GalNAc, the water molecule in the first hydration layer of Mn2+ is part of a larger water network formed with molecules in the vicinity of the glycosidic oxygen. In the presence of both UDP-GalNAc and the EA2 peptide, the glycosidic oxygen interacts with the OG1 atom of the accepting T7 peptide residue via three water molecules: the first bridges only the glycosidic oxygen and OG1, the second additionally bridges the main chain N atom of T7, and the third also bridges the O5* atom of the GalNAc moiety. Taken independently, each of these water molecules has a residence time of around 50% of the simulation time. However, taken together, at least one of these water bridges is present during more than 90% of the simulation time. Therefore, the oxygen of the anomeric carbon and the hydroxyl group of the acceptor Thr are nearly continuously bridged by water.
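The difference between individual and combined bridge occupancy can be illustrated with a short sketch. It assumes a purely geometric definition of a bridge (a water oxygen within a cutoff of both partner atoms); the 3.5 Å cutoff, the function names and the toy presence series are illustrative assumptions rather than the criteria used in the study.

```python
import numpy as np

def water_bridge_present(water_o, atom_a, atom_b, cutoff=3.5):
    """Per-frame flag: is any water oxygen within `cutoff` Å of both atoms?

    water_o: (n_frames, n_waters, 3); atom_a, atom_b: (n_frames, 3).
    The 3.5 Å cutoff is an assumed, commonly used heavy-atom distance.
    """
    water_o = np.asarray(water_o, dtype=float)
    a = np.asarray(atom_a, dtype=float)[:, None, :]
    b = np.asarray(atom_b, dtype=float)[:, None, :]
    d_a = np.linalg.norm(water_o - a, axis=2)
    d_b = np.linalg.norm(water_o - b, axis=2)
    return ((d_a < cutoff) & (d_b < cutoff)).any(axis=1)

def occupancy(present):
    """Fraction of frames in which a boolean series is True."""
    return float(np.mean(present))

# Geometric check on two frames: the water sits between the atoms, then leaves.
bridge = water_bridge_present(
    water_o=[[[2.5, 0, 0]], [[10.0, 0, 0]]],
    atom_a=[[0, 0, 0], [0, 0, 0]],
    atom_b=[[5.0, 0, 0], [5.0, 0, 0]],
)

# Toy presence series for three distinct bridges over 10 frames.
w1 = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0], dtype=bool)
w2 = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 0], dtype=bool)
w3 = np.array([1, 0, 1, 0, 1, 0, 1, 0, 1, 0], dtype=bool)
individual = [occupancy(w) for w in (w1, w2, w3)]
combined = occupancy(w1 | w2 | w3)   # at least one bridge present
print(individual, combined)
```

In the toy data each bridge is present for about half of the frames, yet the union covers 90% of them, mirroring how intermittent individual bridges can add up to a nearly continuous connection.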
In the simulations of the catalytic domain without substrates, Mn2+ is 7-coordinated, with 3 water molecules replacing the two oxygen atoms of the phosphate groups. The first hydration layer of Mn2+ contains 4 water molecules with high residence time (Figure 6c), as indicated also by the peak at 2.5 Å in the histogram in Figure 6d. The average coordination geometry of Mn2+ is a pentagonal bipyramid; the coordination partners are the water molecules, oxygen atoms of D224 stably coordinating during the entire trajectory, and nitrogen atoms of H226 and H359 competing with additional water molecules for Mn2+ coordination. In the absence of crystal water, molecules from bulk solvent take the role of solvating the Mn2+ ion. The same change of coordination number of the Mn2+ ion has been described also in the case of another retaining glycosyltransferase, α-(1,4)-galactosyltransferase LgtC from Neisseria meningitidis,19 where it was confirmed also by ab-initio quantum mechanics calculations. The increase in the coordination number is correlated with a higher accessibility of the active site in the absence of substrates.
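Coordination numbers of this kind are usually obtained by counting ligand atoms within a first-shell cutoff of the ion in every frame, and a histogram of ion-ligand distances shows where the first shell lies. The sketch below assumes a 2.8 Å cutoff and uses a made-up single-frame geometry; both, along with the function name, are illustrative assumptions.

```python
import numpy as np

def coordination_numbers(ion, ligand_atoms, cutoff=2.8):
    """Number of candidate ligand atoms within `cutoff` Å of the ion per frame.

    ion: (n_frames, 3); ligand_atoms: (n_frames, n_atoms, 3).
    The cutoff is an assumed first-shell boundary, not a value from the study.
    """
    ion = np.asarray(ion, dtype=float)[:, None, :]
    ligand_atoms = np.asarray(ligand_atoms, dtype=float)
    dist = np.linalg.norm(ligand_atoms - ion, axis=2)
    return (dist < cutoff).sum(axis=1), dist

# One toy frame: six atoms at 2.2 Å in an octahedral arrangement plus one at 5 Å.
ion = [[0.0, 0.0, 0.0]]
ligands = [[[2.2, 0, 0], [-2.2, 0, 0], [0, 2.2, 0],
            [0, -2.2, 0], [0, 0, 2.2], [0, 0, -2.2], [5.0, 0, 0]]]
cn, dist = coordination_numbers(ion, ligands)

# Distance histogram with 0.5 Å bins; the most populated bin marks the first shell.
counts, edges = np.histogram(dist.ravel(), bins=np.arange(1.5, 6.0, 0.5))
peak_bin = int(np.argmax(counts))
print(cn, edges[peak_bin], edges[peak_bin + 1])
```

Restricting `ligand_atoms` to water oxygens gives the hydration number alone, while including protein and phosphate donor atoms gives the total coordination number.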
These observations have implications on the enzymatic mechanism. Previously published data do not unequivocally support any of the mechanisms proposed for retaining glycosyltransferases. Our simulations are more consistent with the SNi mechanism than the double displacement mechanism since the large conformational changes of the active site, required by a double-displacement mechanism to bring putative nucleophiles closer to the beta-phosphate-oxygen, did not occur on the timescale of the simulations. However, a double displacement mechanism is still possible if the timescale of the conformational change needed for the enzyme-sugar intermediate is much longer than the time sampled by molecular dynamics. Alternatively, a single displacement mechanism would imply that the nucleophilic attack on C1-O bond is performed by the hydroxy group of the acceptor Ser/Thr. The distance between the glycosidic oxygen and the nucleophilic hydroxy group is about 3 Å and is maintained nearly constant during the simulation, which would at least structurally be consistent with a nucleophile role of the acceptor.
Hydration of the active site is important not only for catalysis, but also for the dynamics and flexibility of loop B controlling ligand access in the active site. Simulations of the system in the presence and absence of crystal water molecules for the catalytic domain alone (Table 1, Cx and C) showed that in the absence of water molecules hydrating deep cavities in the active site, the side chain of W331 is stabilized in the inward orientation, and consequently loop B does not open. Interestingly, loop B is the only region whose dynamics is affected in a major way by the removal of x-ray crystallographic water molecules in the initial structure. We found that in the cavity around W331 two internal water molecules (HOH1160 and 1162) do not get replaced by bulk water. These molecules form hydrogen bonds with W331 and E334 in the conserved loop WGGEN. Without these water molecules, W331 and E334 interact directly, preventing the conformational change of W331 side chain. Therefore, these internal water molecules may have a “lubricating” role in the vicinity of W331. Similar effects of internal water molecules have been reported by calculation of hydration free energies and entropies30 suggesting that the addition of a water molecule can increase protein flexibility through effectively weakened protein-protein hydrogen bonds.
In the presence of UDP (Table 1, CUx and CU) and in the presence of UDP and EA2 (Table 1, CUPx and CUP) the RMSD per residue profiles were not significantly different, explained partly by the fact that large conformational changes do not occur in the presence of ligands, and partly by the observation that bulk water molecules from solution occupy the crystallographic water sites close to the protein surface during the simulation.
One of the fundamental questions regarding the family of ppGalNAcTs is related to the in vivo peptide substrate preferences of individual isoforms. A detailed understanding of peptide conformational preferences is crucial for the evaluation of peptide affinity in sequence scanning experiments. Moreover, an accurate description of the conformational space sampled by the backbone of the peptide in complex with the enzyme and the identification of patterns of flexibility along peptide sequence is essential for understanding peptide specificity31 and the design of alternative peptide substrates. Previous studies showed that by introducing backbone perturbations in conformations of peptide ligands observed in crystal structures, it was possible to identify a larger number of interacting sequences for a given docking target.32
The analysis of the x-ray ternary complex (PDB code 2FFU)5 shows that EA2 binds in an extended conformation into a shallow groove on the surface of hT2. This conformation was maintained throughout our simulation, except for the terminal residues, where the backbone samples several conformers (Figure 7a). Hydrogen bonds formed between residues at both ends of the peptide (S5-H365 and K13-G265, Figure 7b, c) are less stable, while interactions near the glycosylation site (T6-R362 and T11-S267) are more stable, consistent with the lower flexibility in the center of the peptide. However, the terminal residues are less flexible when the GalNAc moiety is present, either covalently linked to UDP or to the peptide, compared with the trajectory containing UDP only.
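Hydrogen-bond stability along a trajectory is commonly expressed as an occupancy, the fraction of frames satisfying a geometric criterion. The sketch below uses a donor-acceptor distance below 3.5 Å and a donor-hydrogen-acceptor angle above 120°. These thresholds, the function name and the toy coordinates are assumptions for illustration, since the criteria used in the study are not stated here.

```python
import numpy as np

def hbond_present(donor, hydrogen, acceptor, max_dist=3.5, min_angle=120.0):
    """Per-frame hydrogen-bond flag from a distance and an angle criterion.

    donor, hydrogen, acceptor: arrays of shape (n_frames, 3), in Å.
    Thresholds are assumed, commonly used values.
    """
    d = np.asarray(donor, dtype=float)
    h = np.asarray(hydrogen, dtype=float)
    a = np.asarray(acceptor, dtype=float)
    dist_da = np.linalg.norm(a - d, axis=1)
    hd = d - h                         # vector from H to donor
    ha = a - h                         # vector from H to acceptor
    cos_ang = (hd * ha).sum(axis=1) / (
        np.linalg.norm(hd, axis=1) * np.linalg.norm(ha, axis=1))
    angle = np.degrees(np.arccos(np.clip(cos_ang, -1.0, 1.0)))
    return (dist_da < max_dist) & (angle > min_angle)

# Frame 0: linear D-H...A at 2.9 Å. Frame 1: same distance, but bent away.
donor = [[0, 0, 0], [0, 0, 0]]
hydrogen = [[1.0, 0, 0], [1.0, 0, 0]]
acceptor = [[2.9, 0, 0], [0, 2.9, 0]]
present = hbond_present(donor, hydrogen, acceptor)
hbond_occupancy = float(present.mean())
print(present, hbond_occupancy)
```

Comparing such occupancies for contacts at the peptide termini and near the glycosylation site is one way to quantify the stability difference described above.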
A general preference of transferases for peptides adopting extended or random coil conformations has been established both by NMR experiments33,34 and by database analysis,35 and these observations have been used successfully in predictions of mucin-type O-glycosylation sites. The higher backbone flexibility of the peptide termini seen here suggests a less pronounced sequence preference at the ends.
ppGalNAcTases, family 27 of GTs,2 play a crucial role in the first step of O-glycosylation by catalyzing the transfer of the GalNAc moiety from a sugar-nucleotide donor UDP-GalNAc to a Ser/Thr residue in an acceptor peptide. Despite a series of x-ray crystal structures,4–6 details of the catalytic transfer mechanism have remained an open question. Similarly, the functional relevance of large-scale loop rearrangements upon substrate binding, as observed in the crystal structures,5 has been unclear. In this work, we address these questions through multiple molecular dynamics simulations of ppGalNAcT-2, both in the presence and absence of its substrates, based on recently solved crystal structures4–6 of three isoforms of the ppGalNAcT family that capture different stages of the transfer reaction pathway.
Regarding the catalytic transfer mechanism, the results of our study show that potential nucleophilic residues from the catalytic domain do not reach the proximity of the beta-phosphate-oxygen. At the same time, distances between the hydroxy group of the acceptor Ser/Thr side chains and the glycosidic oxygen can be relatively short (~3 Å) during the simulations, which demonstrates the structural possibility for a nucleophilic role of the acceptor. In relation to previously proposed transfer mechanisms, our results are more consistent with a single-displacement mechanism,8,11 as suggested for ppGalNAcTs on the basis of x-ray structures and site-directed mutagenesis data, than a double-displacement transfer mechanism.20 Our simulations also suggest that the partially hydrated Mn2+ ion in the active site may play a role in the enzymatic reaction.
We also observed concerted conformational changes of the loops near the active site. These large-scale loop rearrangements are tied to substrate binding, and thus appear to be crucial for the enzymatic activity. In ppGalNAcT-2, in agreement with the x-ray data, our simulations identify two flexible loops in the vicinity of the active site (A and B) as important factors in the binding mechanism. While loop A exhibits a high flexibility both with and without ligands, the dynamics of the longer loop B are significantly reduced when UDP-GalNAc is bound. Analysis of the loop B dynamics revealed a critical interaction with a side chain (W331) located on another less flexible but highly conserved region, loop E. Rotation of the W331 side chain creates unfavorable interactions with residues in loop B. These interactions trigger a conformational change of loop B that exposes the active site and makes it accessible to ligands. Overall, our analysis suggests a mechanism in which W331 acts as a gatekeeper, controlling the transition of loop B from closed (unable to accept ligand) to open (able to accept ligand) conformations. This structural change is facilitated by internal water molecules, trapped in the structure of the catalytic domain, that prove to be essential for ensuring protein flexibility around the W331 site. Large-scale loop rearrangements have been shown to be important for members of families 6, 7 and 8 of GTs.13–19 In particular, our results, potentially testable through Cys-Cys crosslinking experiments,36,37 are consistent with observations from simulation studies13,14 of β-1,4-Gal-T1 regarding the functional role of a conserved Trp residue (here: W331), suggesting that the observed loop motions and their coupling to local side-chain conformations may also be relevant for other families of GTs.
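The rotameric state of a side chain such as W331 can be followed through its χ1 dihedral angle (atoms N, Cα, Cβ, Cγ). The sketch below shows one standard way to compute a dihedral from four positions and to assign it to the nearest staggered well. The well centers (−60°, 60°, 180°) are idealized values assumed for illustration, and both function names are hypothetical.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees, in (-180, 180]) defined by four points."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 /= np.linalg.norm(b1)              # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))

def nearest_well(angle_deg, wells=(-60.0, 60.0, 180.0)):
    """Assign an angle to the closest well center, using periodic distance."""
    diffs = [abs((angle_deg - w + 180.0) % 360.0 - 180.0) for w in wells]
    return wells[int(np.argmin(diffs))]
```

Applying `dihedral_deg` to the N, Cα, Cβ and Cγ coordinates of W331 in each saved frame, followed by `nearest_well`, would give a rotamer time series that can be compared with the open or closed state of loop B.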
Our simulations show that the opening of loop B leads to a significant increase in the amount of positively charged surface area in the active site, which can be traced to the exposure of two evolutionarily conserved active-site Arg side chains (R201 and R362). Thus, these residues may play a direct role in the binding of the negatively charged UDP-GalNAc.
Finally, the MD simulations containing the peptide acceptor indicate that its extended conformation is readily maintained when bound to the catalytic domain. The peptide ends exhibit a higher flexibility than the residues near its center, and hydrogen bonds formed by terminal residues are therefore more likely to break than those formed by middle residues. We find that the C-terminal end of the peptide ligand is more dynamic than the N-terminal end, consistent with the observation that the C-terminal region is stabilized only by non-directional hydrophobic interactions. These results are relevant for future studies focused on understanding the binding affinities of peptides to ppGalNAcTs and peptide preferences of individual isoforms.
Table 1 lists the components of the all-atom models that are studied in this paper. The various ppGalNAcT/ligand complexes determined by x-ray crystallography were used as starting conformations. The initial coordinates were taken from the x-ray structures of ppGalNAcT isoform 2, either in ternary complex with UDP and acceptor peptide EA2 (PDB code 2FFU) or in binary complex with UDP (PDB code 2FFV). The missing coordinates in the binary complex, residues G89-P98, were modeled in MODELLER 8v2,38,39 using the ternary complex as a template.
UDP-GalNAc was modeled in the active site of 2FFU using the hT10 structure (PDB code 2D7I) containing hydrolyzed UDP-GalNAc as a template. In one of the systems (CL-close), we positioned the hT2 lectin domain near its catalytic domain using the mT1 x-ray structure (PDB code 1XHB) as a template. The structure containing the EA2 peptide glycosylated at Thr7 was modeled using the Glycam Online Glycoprotein Builder tool (www.glycam.com).40
In all cases, starting structures for MD simulations were prepared as follows: (a) hydrogen atoms were added (with acidic residues ionized and His residues neutral, in the absence of experimental evidence for His ionization states) and the protein structure file was generated using the psfgen program in VMD;41 (b) the system was explicitly solvated with TIP3P water molecules,42 with the dimensions of the rectangular solvent box chosen so that the minimum distance from the box boundaries to the protein was 10 Å; and (c) Na+ and Cl− ions were added in order to maintain electro-neutrality.
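Steps (b) and (c) reduce to simple arithmetic, sketched below under stated assumptions. The actual setup was done with the VMD tools cited above; the helper names `solvent_box` and `neutralizing_ions` are hypothetical, and the ion helper returns only the minimal counterion count needed to cancel the net charge.

```python
import numpy as np

def solvent_box(coords, padding=10.0):
    """Rectangular box leaving at least `padding` Angstrom around the solute.

    coords : (n_atoms, 3) solute coordinates in Angstrom
    Returns (lower corner, upper corner, box edge lengths).
    """
    coords = np.asarray(coords, dtype=float)
    lo = coords.min(axis=0) - padding
    hi = coords.max(axis=0) + padding
    return lo, hi, hi - lo

def neutralizing_ions(net_charge):
    """Minimal number of monovalent counterions that cancels the net charge."""
    q = int(round(net_charge))
    return {"Na+": max(-q, 0), "Cl-": max(q, 0)}
```

For example, a solute spanning 30 × 20 × 10 Å would be placed in a 50 × 40 × 30 Å box with the 10 Å padding used here.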
Parameters for the sugar-nucleotide donor UDP-GalNAc were unavailable and were developed as described below. We used the CHARMM27 force field43 parameters and topology for the UDP moiety and the GLYCAM04 parameters and topology40 for GalNAc, converted to CHARMM format. Thus, the only missing parameters were for the sugar-phosphate linkage. We tested three types of parameter values: (a) modeled using the AMBER force field parameters for other sugar-nucleotides;44 (b) modeled according to CHARMM27 parameters43 for similar atom types; and (c) modeled using a “null model” (i.e., without any additional force constants for bending and torsion, such that the effective angle potentials arise indirectly from other interactions). In a comparison of the energy profiles as a function of the angle, no significant differences were found between these three cases. Therefore, we adopted the simplest “null model” in our simulations.
The energy profile around the sugar-phosphate bond showed a sharp peak at 160 degrees with a half-width of about 30 degrees (corresponding to a conformation in which the N-acetyl group clashes with the oxygen atoms in the second phosphate group), and local minima at 30, 90, 250 and 310 degrees. The conformations in these local minima are extended, with the angle in gauche + and − conformations, in agreement with previous findings from MD simulations and NMR data for UDP-Glucose.45 The UDP-GalNAc moiety modeled using the hT10 structure as a template corresponds to the local minimum at 30 degrees. Therefore, this orientation was used as the initial structure in our simulations.
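Local minima of a torsional energy profile such as the one described above can be located by comparing each grid point with its two neighbors, treating the angle as periodic. The following is a minimal sketch assuming a uniformly spaced scan that covers the full 360° range exactly once; it is not the procedure used to generate the profile, and `periodic_local_minima` is a hypothetical name.

```python
import numpy as np

def periodic_local_minima(angles_deg, energies):
    """Angles at which the energy is strictly lower than at both neighbors.

    angles_deg : uniformly spaced scan angles covering 0..360 degrees once
    energies   : energy at each angle (any consistent unit)
    The first and last grid points are treated as neighbors (periodicity).
    """
    a = np.asarray(angles_deg, dtype=float)
    e = np.asarray(energies, dtype=float)
    is_min = (e < np.roll(e, 1)) & (e < np.roll(e, -1))
    return a[is_min]
```

Applied to a scan of the sugar-phosphate torsion, a function of this kind would be expected to return the angles of the wells reported above, provided the grid is fine enough to resolve them.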
Our MD simulations were performed in the NPT ensemble using the NAMD2 package46,47 with the CHARMM27 force field43 parameters and topology for the protein and UDP, following simulation and analysis methods used in previous MD studies with explicit solvent of multi-component biomolecular systems.48 For the Mn2+ ion we used a van der Waals radius of 1.6 Å and ε = −0.014 kcal/mol, similar to the Mg2+ ion, and a net charge of +2. The simulation systems were subjected to a stepwise equilibration before the production phase, involving initial energy minimization, followed by 10 ps of MD with the protein backbone constrained to the initial coordinates, and 10 ps without any constraints. The systems were gradually heated to 298 K during 20 ps runs at a pressure of 1 atm, first with the protein Cα trace constrained to the crystal coordinates, and then equilibrated for 20 ps with restraints on the Cα atoms, followed by 10 ps without any restraints. After the initial minimization, heating and equilibration stages, the production phase of each MD simulation lasted 40 ns (using a timestep of 2 fs) and the coordinates were saved every 10 ps. The Langevin piston method49,50 was used to maintain a constant pressure of 1 atm. The temperature (298 K) was controlled by using Langevin dynamics with a coupling coefficient of 1 ps−1. We used periodic boundary conditions and the particle mesh Ewald (PME) method51 with a real-space cutoff distance of 10 Å and a grid width of 0.96–1.07 Å. The switching distance for non-bonded electrostatics and van der Waals interactions was 8.5 Å and the integration time-step was 1 fs.
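The production-run settings translate into step and frame counts as sketched below. This is bookkeeping arithmetic only, with a hypothetical helper name; the timestep is left as a parameter because this section quotes both 2 fs and 1 fs.

```python
def run_bookkeeping(length_ns=40, timestep_fs=2, save_every_ps=10):
    """Return (integration steps, steps between saved frames, saved frames).

    All times are converted to femtoseconds so the arithmetic stays integer.
    """
    total_fs = length_ns * 1_000_000      # 1 ns = 10^6 fs
    save_fs = save_every_ps * 1_000       # 1 ps = 10^3 fs
    n_steps = total_fs // timestep_fs
    steps_per_frame = save_fs // timestep_fs
    n_frames = total_fs // save_fs
    return n_steps, steps_per_frame, n_frames
```

With the defaults, a 40 ns run at 2 fs corresponds to 20 million integration steps and 4,000 saved frames; the frame count does not depend on the timestep.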
Data visualization and calculation of geometric parameters (RMSD, distances, angles, etc.) were carried out using VMD.41 For the analysis of water interactions we used in-house code.
This study utilized the high-performance computational capabilities of the Biowulf Linux cluster at the National Institutes of Health, Bethesda, Md. (http://biowulf.nih.gov). This research was supported by the Intramural Research Program of the NIH, NIDDK. The authors want to thank Dr. Stefano Costanzi, Laboratory of Biological Modeling, NIDDK, NIH for fruitful discussions.